Data and model cross-validation to improve accuracy of microsimulation results: estimates for the Polish Household Budget Survey

We conduct a detailed analysis of the Polish Household Budget Survey data for the years 2006-2011, focusing on its representativeness for the purposes of microsimulation analysis. We find important discrepancies between the data weighted with the baseline grossing-up weights and official statistics from other sources. We examine a number of re-weighting exercises in terms of the accuracy of microsimulation results and show that combining demographic calibration targets with several economic status variables or tax identifiers from the microsimulation model substantially improves the correspondence between model results and administrative data. While demographic re-weighting is neutral with respect to the income distribution, calibrating the grossing-up weights to adjust for economic status and tax identifiers significantly increases measured income inequality. We argue that although data re-weighting can substantially improve the accuracy of microsimulation, it should be used with caution.
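To illustrate what calibrating grossing-up weights to external targets involves, the following is a minimal sketch of raking (iterative proportional fitting), one common calibration technique. It is not necessarily the procedure used in this paper. The function name `rake_weights`, the variable names, and all numbers are hypothetical and chosen only for illustration.

```python
def rake_weights(weights, groups, targets, max_iter=100, tol=1e-9):
    """Rescale survey weights so weighted category totals match targets.

    weights: baseline grossing-up weight per household (list of floats)
    groups:  {variable: [category label per household]}
    targets: {variable: {category: target population total}}
    All names and the data layout are illustrative assumptions.
    """
    w = list(weights)
    for _ in range(max_iter):
        max_gap = 0.0
        for var, labels in groups.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for wi, lab in zip(w, labels):
                totals[lab] = totals.get(lab, 0.0) + wi
            # Scale every weight by target / current total of its category.
            factors = {lab: targets[var][lab] / totals[lab] for lab in totals}
            w = [wi * factors[lab] for wi, lab in zip(w, labels)]
            max_gap = max(max_gap, max(abs(f - 1.0) for f in factors.values()))
        # Stop once a full pass leaves all adjustment factors close to 1.
        if max_gap < tol:
            break
    return w


# Hypothetical example: four households, one demographic variable and
# one economic status variable, with made-up target totals.
base = [100.0, 100.0, 100.0, 100.0]
groups = {
    "age": ["young", "young", "old", "old"],
    "status": ["employed", "not_employed", "employed", "not_employed"],
}
targets = {
    "age": {"young": 180.0, "old": 220.0},
    "status": {"employed": 250.0, "not_employed": 150.0},
}
new_w = rake_weights(base, groups, targets)
```

In this toy setting the adjusted weights reproduce both sets of margins at once. Adding economic status or tax identifier targets to the demographic ones changes the relative weight of households across the income distribution, which is why such calibration can affect measured inequality.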