Numeric Variable Cleaning
This step of the pipeline checks that the numeric variables in the CMF, across all years, contain only numbers, and corrects those that do not. The steps below are applied to every numeric variable (capital, material value, production value, etc.):
- Mark whether each observation contains non-numeric characters.
- Gather all observations with issues.
- Construct mappings to the correct numbers, using the original images if necessary.
- Apply the mappings back to the CMF data.
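
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline's actual code: the record layout, the field name `capital`, the sample values, the helper names (`is_clean`, `flag_issues`, `apply_mapping`), and the definition of a "clean" number (optional sign, digits, optional decimal part) are all hypothetical.

```python
import re

# Assumption: a clean value is an optional minus sign, digits,
# and an optional decimal part, with surrounding whitespace allowed.
NUMERIC = re.compile(r"^\s*-?\d+(\.\d+)?\s*$")


def is_clean(value):
    """Return True if the string holds only a number."""
    return bool(NUMERIC.match(value))


def flag_issues(rows, variable):
    """Mark and gather the observations whose value is not a clean number."""
    return [row for row in rows if not is_clean(row[variable])]


def apply_mapping(rows, variable, mapping):
    """Replace raw strings found in the mapping with their corrected values."""
    return [
        {**row, variable: mapping.get(row[variable], row[variable])}
        for row in rows
    ]


# Hypothetical records; field names and values are illustrative only.
rows = [
    {"id": 1, "capital": "1500"},
    {"id": 2, "capital": "2,0O0"},  # comma and letter O instead of zero
    {"id": 3, "capital": "75o"},    # letter o instead of zero
]

issues = flag_issues(rows, "capital")

# The mapping is built by hand, for example after checking the original images.
mapping = {"2,0O0": "2000", "75o": "750"}

cleaned = apply_mapping(rows, "capital", mapping)
remaining = flag_issues(cleaned, "capital")  # should be empty once all issues are mapped
```

Keeping the corrections in an explicit mapping from raw string to corrected value, rather than editing values in place, means the same mapping can be reviewed, stored, and reapplied whenever the raw data is reloaded.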