Comparing Penalty Functions in Balancing and Dis-aggregating Social Accounting Matrices

Wolfgang Britz


Constructing a balanced and sufficiently detailed Social Accounting Matrix (SAM) is a necessary step for any work with Computable General Equilibrium (CGE) models. Even when starting with a given SAM, researchers might wish to develop their own, more detailed variants for a specific study by dis-aggregating sectors and products, a process termed splitting the SAM. We review three approaches for balancing and splitting a SAM: Cross-Entropy (CE), a Highest Posterior Density (HPD) estimator resulting in a quadratic loss penalty function, and a linear loss penalty function. To give the user flexibility in controlling outcomes, the exercise considers upper and lower bounds on the (new) SAM entries, different weights for penalizing deviations from a priori information, and unknown row or column totals. The approaches are first assessed in a systematic Monte-Carlo experiment which re-balances smaller SAMs after errors with known distributions have been added. Here we find quite limited numerical differences between the CE and quadratic loss approaches. The CE approach was, however, considerably slower than the other candidates. Second, we tested the three approaches for dis-aggregating the Global Trade Analysis Project (GTAP) Data Base to provide, as an example, further agri-food detail. In such empirical applications, the distribution of the errors of the new SAM entries is typically not known. As in the SAM balancing exercise, we use CONOPT4 as a multi-purpose (non)linear solver which can also be employed to solve the CGE model itself. For comparison, we add the specialized Linear and Quadratic Programming (QP) solvers CPLEX and GUROBI. As in the Monte-Carlo experiment, the differences in results between the three approaches were moderate. The specialized solvers require very little time to solve the linear and quadratic loss problems. However, they did not achieve the same very high accuracy as CONOPT4 for the quadratic loss problem. The CE problem could take longer by a factor of 100 or more compared to a linear or quadratic loss approach solved with the specialized solvers. We conclude that linear or quadratic loss approaches, especially when combined with a specialized solver, are the most suitable candidates for larger SAM splitting / balancing problems. Additionally, we present a fast and accurate data processing chain to yield a benchmark data set for a CGE model from the GTAP Data Base, which involves filtering out small cost, expenditure and revenue shares, and allows users to introduce further product and sectoral detail based on user-provided information.
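As an illustration of the quadratic loss idea only, the following minimal Python sketch balances a small SAM by minimizing the sum of squared deviations from the a priori entries, subject to each account's row total equalling its column total. It is not the implementation used in the paper: it assumes equal weights, ignores the upper and lower bounds discussed above (so entries may change sign), and uses a closed-form projection in NumPy instead of CONOPT4, CPLEX or GUROBI. The function name `balance_sam_quadratic` and the example matrix are hypothetical.

```python
import numpy as np


def balance_sam_quadratic(prior):
    """Unweighted quadratic-loss balancing of a square SAM (illustrative sketch).

    Minimizes sum((A - prior)**2) subject to row_sum_i(A) == col_sum_i(A)
    for every account i. No bounds or weights are applied (assumption).
    """
    n = prior.shape[0]
    x0 = prior.astype(float).ravel()  # row-major vectorization of the prior

    # Build the constraint matrix C so that C @ vec(A) = row sums - column sums.
    C = np.zeros((n, n * n))
    for i in range(n):
        for j in range(n):
            C[i, i * n + j] += 1.0  # entry (i, j) belongs to row i
            C[i, j * n + i] -= 1.0  # entry (j, i) belongs to column i

    # Orthogonal projection of x0 onto {x : C x = 0}. The constraints are
    # linearly dependent (they sum to zero), hence the pseudo-inverse.
    lam = np.linalg.pinv(C @ C.T) @ (C @ x0)
    return (x0 - C.T @ lam).reshape(n, n)


# Hypothetical unbalanced 3x3 prior SAM.
prior = np.array([
    [0.0, 12.0, 5.0],
    [10.0, 0.0, 8.0],
    [6.0, 7.0, 0.0],
])
balanced = balance_sam_quadratic(prior)
print(balanced.sum(axis=1) - balanced.sum(axis=0))  # imbalances after balancing
```

Adding bounds or entry-specific weights turns this into a general QP without a closed-form solution, which is where a dedicated solver becomes necessary.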


Data balancing; SAM balancing; Highest posterior density; Cross entropy


Copyright (c) 2021 Wolfgang Britz