An example of market-clearing assessment

This short tutorial shows how to statistically assess whether a market is in an equilibrium state. It assumes some familiarity with the concepts and functionality of the package; the basic_usage vignette is a good place to acquire it.

Set up the environment

Load the required libraries.

library(markets)

Prepare the data. Here, we simply simulate data using a data generating process for a market in equilibrium.

nobs <- 1000
tobs <- 5

alpha_d <- -3.9
beta_d0 <- 28.9
beta_d <- c(2.1, -0.7)
eta_d <- c(3.5, 6.25)

alpha_s <- 2.8
beta_s0 <- 26.2
beta_s <- c(2.65)
eta_s <- c(1.15, 4.2)

sigma_d <- 0.8
sigma_s <- 1.1
rho_ds <- 0.0

seed <- 42

eq_data <- simulate_data(
  "equilibrium_model", nobs, tobs,
  alpha_d, beta_d0, beta_d, eta_d,
  alpha_s, beta_s0, beta_s, eta_s,
  NA, NA, c(NA),
  sigma_d = sigma_d, sigma_s = sigma_s, rho_ds = rho_ds,
  seed = seed
)
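
For intuition about what a data generating process for a market in equilibrium looks like, the following base R sketch solves a linear demand and supply system for the market-clearing price and quantity. It is not the implementation of simulate_data. The helper name simulate_equilibrium_sketch, the standard normal regressors, and the reading of sigma_d and sigma_s as standard deviations are assumptions made only for illustration.

```r
# Illustrative only: a hand-rolled equilibrium DGP, not markets::simulate_data.
simulate_equilibrium_sketch <- function(n, seed = 42) {
  set.seed(seed)
  # Exogenous regressors (assumed to be standard normal).
  Xd1 <- rnorm(n); Xd2 <- rnorm(n); Xs1 <- rnorm(n)
  X1 <- rnorm(n); X2 <- rnorm(n)
  # Uncorrelated shocks (rho_ds = 0); sigmas are assumed to be standard deviations.
  u_d <- rnorm(n, sd = 0.8)
  u_s <- rnorm(n, sd = 1.1)
  # Price-independent parts of the demand and supply equations.
  d0 <- 28.9 + 2.1 * Xd1 - 0.7 * Xd2 + 3.5 * X1 + 6.25 * X2 + u_d
  s0 <- 26.2 + 2.65 * Xs1 + 1.15 * X1 + 4.2 * X2 + u_s
  alpha_d <- -3.9
  alpha_s <- 2.8
  # Market clearing: d0 + alpha_d * P = s0 + alpha_s * P, solved for P.
  P <- (d0 - s0) / (alpha_s - alpha_d)
  D <- d0 + alpha_d * P
  S <- s0 + alpha_s * P
  data.frame(Q = D, P, Xd1, Xd2, Xs1, X1, X2, D, S)
}

sketch_data <- simulate_equilibrium_sketch(1000)
# Demanded and supplied quantities coincide in every row.
all.equal(sketch_data$D, sketch_data$S)
```

Because the price is solved from the market-clearing condition, it depends on both shocks, which is why the price is treated as endogenous in the estimation that follows.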

Estimate the models

Prepare the basic parameters for model initialization.

verbose <- 2
correlated_shocks <- TRUE
formula <- Q | P | id | date ~ P + Xd1 + Xd2 + X1 + X2 | P + Xs1 + X1 + X2

Set the estimation parameters.

optimization_method <- "BFGS"
optimization_options <- list(maxit = 10000, reltol = 1e-8)
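
The names maxit and reltol match entries of the control list of stats::optim, and "BFGS" is one of its methods. As a rough illustration of what these settings govern, and assuming (this tutorial does not confirm it) that the package hands them to an optim-style optimizer, the sketch below maximizes a toy normal likelihood with the same settings. The toy sample y and the function neg_loglik are hypothetical.

```r
set.seed(42)
y <- rnorm(500, mean = 2, sd = 1.5)  # toy sample, unrelated to the market data

# Negative log-likelihood of a normal model. The log of the standard
# deviation is optimized so that the search is unconstrained.
neg_loglik <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

toy_fit <- optim(
  par = c(0, 0), fn = neg_loglik,
  method = "BFGS",
  control = list(maxit = 10000, reltol = 1e-8)
)

toy_fit$convergence                                  # 0 signals convergence
c(mean = toy_fit$par[1], sd = exp(toy_fit$par[2]))   # maximum likelihood estimates
```

A smaller reltol asks for a tighter stopping rule, and maxit caps the number of iterations if that rule is not met.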

Using the above parameterization, construct and estimate the model objects. Here we estimate two equilibrium models (one by two-stage least squares and one by maximum likelihood) and two disequilibrium models (the basic and the deterministic adjustment model). All models are estimated on the data simulated from a market in equilibrium.

eq_reg <- equilibrium_model(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(method = "2SLS")
)
#> Info:  This is Equilibrium model.
#> Warning:  Removing unobserved '1' level(s).
eq_fit <- equilibrium_model(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info:  This is Equilibrium model.
#> Warning:  Removing unobserved '1' level(s).
bs_fit <- diseq_basic(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info:  This is Basic model.
#> Warning:  Removing unobserved '1' level(s).
da_fit <- diseq_deterministic_adjustment(
  formula, eq_data,
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info:  This is Deterministic Adjustment model.
#> Info:  Dropping 1000 rows to generate 'LAGGED_P'.
#> Info:  Sample separated with 2000 rows in excess supply and 2000 rows
#> Info:   in excess demand states.
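
The 2SLS option used for eq_reg regresses the price on the exogenous variables and then uses the fitted price in the demand and supply equations, as the P_FITTED rows in the summary output further down suggest. A minimal base R sketch of this two-step idea on toy data follows. The data, the variable names xd and xs, and the coefficients are hypothetical, and the code is not the package's implementation.

```r
set.seed(42)
n <- 5000
xd <- rnorm(n)  # demand shifter, excluded from the supply equation
xs <- rnorm(n)  # supply shifter, excluded from the demand equation
u_d <- rnorm(n)
u_s <- rnorm(n)

# Toy market: D = 10 - 2 P + xd + u_d, S = 4 + 1 P + xs + u_s, with D = S.
P <- ((10 + xd + u_d) - (4 + xs + u_s)) / (1 - (-2))
Q <- 10 - 2 * P + xd + u_d
toy <- data.frame(Q, P, xd, xs)

# First stage: project the endogenous price on all exogenous variables.
first_stage <- lm(P ~ xd + xs, data = toy)
toy$P_FITTED <- fitted(first_stage)

# Second stage: use the fitted price in each structural equation.
demand_2sls <- lm(Q ~ P_FITTED + xd, data = toy)
supply_2sls <- lm(Q ~ P_FITTED + xs, data = toy)

coef(demand_2sls)["P_FITTED"]  # should be close to the true slope of -2
coef(supply_2sls)["P_FITTED"]  # should be close to the true slope of 1
```

The standard errors printed by the second-stage lm fits in this sketch do not account for the first-stage estimation, so only the point estimates should be read from it.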

Post estimation analysis

Summaries

All the models provide estimates for the simulated data. Even with simulated data, it is difficult to tell which model performs better by examining the summaries alone, whether separately or side by side.

summary(eq_reg@fit$first_stage_model)
#> 
#> Call:
#> lm(formula = first_stage_formula, data = object@data)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.88388 -0.14215  0.00305  0.14088  0.81967 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  0.402566   0.003302  121.91   <2e-16 ***
#> Xd1          0.313982   0.003288   95.50   <2e-16 ***
#> Xd2         -0.103262   0.003274  -31.54   <2e-16 ***
#> X1           0.348932   0.003262  106.98   <2e-16 ***
#> X2           0.306357   0.003222   95.07   <2e-16 ***
#> Xs1         -0.398531   0.003284 -121.36   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.2087 on 3994 degrees of freedom
#> Multiple R-squared:  0.9194, Adjusted R-squared:  0.9193 
#> F-statistic:  9110 on 5 and 3994 DF,  p-value: < 2.2e-16
summary(eq_reg)
#> Equilibrium Model for Markets in Equilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Least square estimation:
#>   Method              : 2SLS
#> 
#> First Stage:
#> 
#> Call:
#> lm(formula = first_stage_formula, data = object@data)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.88388 -0.14215  0.00305  0.14088  0.81967 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  0.402566   0.003302  121.91   <2e-16 ***
#> Xd1          0.313982   0.003288   95.50   <2e-16 ***
#> Xd2         -0.103262   0.003274  -31.54   <2e-16 ***
#> X1           0.348932   0.003262  106.98   <2e-16 ***
#> X2           0.306357   0.003222   95.07   <2e-16 ***
#> Xs1         -0.398531   0.003284 -121.36   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.2087 on 3994 degrees of freedom
#> Multiple R-squared:  0.9194, Adjusted R-squared:  0.9193 
#> F-statistic:  9110 on 5 and 3994 DF,  p-value: < 2.2e-16
#> 
#> 
#> Demand Equation:
#> 
#> Call:
#> lm(formula = demand_formula, data = object@data)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -2.41586 -0.50003  0.00523  0.49533  2.71617 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 28.89089    0.01657 1744.02   <2e-16 ***
#> P_FITTED    -3.87656    0.02886 -134.32   <2e-16 ***
#> Xd1          2.10510    0.01475  142.74   <2e-16 ***
#> Xd2         -0.70778    0.01190  -59.49   <2e-16 ***
#> X1           3.50747    0.01514  231.61   <2e-16 ***
#> X2           6.24655    0.01436  434.93   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.7311 on 3994 degrees of freedom
#> Multiple R-squared:  0.9849, Adjusted R-squared:  0.9849 
#> F-statistic: 5.214e+04 on 5 and 3994 DF,  p-value: < 2.2e-16
#> 
#> 
#> Supply Equation:
#> 
#> Call:
#> lm(formula = supply_formula, data = object@data)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -2.4183 -0.4975  0.0040  0.4951  2.7379 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 26.18624    0.01803 1452.13   <2e-16 ***
#> P_FITTED     2.84269    0.03483   81.62   <2e-16 ***
#> Xs1          2.67766    0.01818  147.27   <2e-16 ***
#> X1           1.16279    0.01658   70.13   <2e-16 ***
#> X2           4.18829    0.01569  266.97   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.7312 on 3995 degrees of freedom
#> Multiple R-squared:  0.9849, Adjusted R-squared:  0.9849 
#> F-statistic: 6.517e+04 on 4 and 3995 DF,  p-value: < 2.2e-16
summary(eq_fit)
#> Equilibrium Model for Markets in Equilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation:
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    -3.8766    28.8909     2.1051    -0.7078     3.5075     6.2466     2.8427 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>    26.1862     2.6777     1.1628     4.1883     0.5339     0.5341     0.9998 
#> 
#> Coefficients:
#>            Estimate Std. Error  z value      Pr(z)
#> D_P        -3.87656    0.03181 -121.859  0.000e+00
#> D_CONST    28.89089    0.01826 1582.331  0.000e+00
#> D_Xd1       2.10503    0.01624  129.612  0.000e+00
#> D_Xd2      -0.70797    0.01313  -53.920  0.000e+00
#> D_X1        3.50748    0.01669  210.122  0.000e+00
#> D_X2        6.24655    0.01583  394.610  0.000e+00
#> S_P         2.84299    0.05390   52.750  0.000e+00
#> S_CONST    26.18611    0.02791  938.358  0.000e+00
#> S_Xs1       2.67778    0.02814   95.168  0.000e+00
#> S_X1        1.16268    0.02566   45.312  0.000e+00
#> S_X2        4.18819    0.02428  172.512  0.000e+00
#> D_VARIANCE  0.64938    0.01583   41.012  0.000e+00
#> S_VARIANCE  1.28026    0.03540   36.169 1.865e-286
#> RHO        -0.01919    0.01807   -1.062  2.883e-01
#> 
#> -2 log L: 6722.675
summary(bs_fit)
#> Basic Model for Markets in Disequilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation:
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    -3.3899    28.6909     1.9497    -0.6543     3.3398     6.0968     1.5864 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>    26.6854     2.1697     1.5963     4.5814     0.6012     1.0378     0.0000 
#> 
#> Coefficients:
#>            Estimate Std. Error z value      Pr(z)
#> D_P         -3.5088    0.04544  -77.22  0.000e+00
#> D_CONST     29.1229    0.03412  853.59  0.000e+00
#> D_Xd1        2.0719    0.02544   81.44  0.000e+00
#> D_Xd2       -0.7088    0.01662  -42.65  0.000e+00
#> D_X1         3.3653    0.02546  132.20  0.000e+00
#> D_X2         6.1435    0.02472  248.49  0.000e+00
#> S_P          2.0706    0.10797   19.18  5.585e-82
#> S_CONST     27.8542    0.05857  475.60  0.000e+00
#> S_Xs1        2.5828    0.06431   40.16  0.000e+00
#> S_X1         1.4476    0.05402   26.80 3.340e-158
#> S_X2         4.3986    0.05373   81.86  0.000e+00
#> D_VARIANCE   0.6386    0.02336   27.34 1.359e-164
#> S_VARIANCE   1.1249    0.07243   15.53  2.131e-54
#> RHO         -0.4709    0.04036  -11.67  1.872e-31
#> 
#> -2 log L: 8338.912
summary(da_fit)
#> Deterministic Adjustment Model for Markets in Disequilibrium:
#>   Demand RHS        :   D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        :   S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Separation Rule   : P_DIFF analogous to (D_Q - S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Demand Obs = 2000, Supply Obs = 2000
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation:
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#> -3.3899420 28.6908902  1.9497497 -0.6543147  3.3398395  6.0967924  1.5864480 
#>    S_CONST      S_Xs1       S_X1       S_X2     P_DIFF D_VARIANCE S_VARIANCE 
#> 26.6853581  2.1696502  1.5962938  4.5814415 -0.0007971  0.6012371  1.0378385 
#>        RHO 
#>  0.0000000 
#> 
#> Coefficients:
#>             Estimate Std. Error    z value      Pr(z)
#> D_P        -3.875716    0.03396 -114.13957  0.000e+00
#> D_CONST    28.891196    0.01875 1540.93819  0.000e+00
#> D_Xd1       2.105016    0.01624  129.59543  0.000e+00
#> D_Xd2      -0.707967    0.01313  -53.92060  0.000e+00
#> D_X1        3.507454    0.01670  210.07130  0.000e+00
#> D_X2        6.246540    0.01583  394.60032  0.000e+00
#> S_P         2.842225    0.05490   51.76649  0.000e+00
#> S_CONST    26.187074    0.03095  846.01088  0.000e+00
#> S_Xs1       2.677787    0.02814   95.16506  0.000e+00
#> S_X1        1.162670    0.02566   45.30859  0.000e+00
#> S_X2        4.188175    0.02428  172.49531  0.000e+00
#> P_DIFF      0.001553    0.02176    0.07137  9.431e-01
#> D_VARIANCE  0.649373    0.01584   41.00788  0.000e+00
#> S_VARIANCE  1.280272    0.03540   36.16830 1.918e-286
#> RHO        -0.019191    0.01807   -1.06181  2.883e-01
#> 
#> -2 log L: 6722.67
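
The z value and Pr(z) columns in the summaries above can be read as two-sided tests of each coefficient against zero. The sketch below recomputes them for P_DIFF and RHO of the deterministic adjustment model from the rounded estimates and standard errors shown above. Treating the reported p-values as two-sided normal p-values is an assumption, although the recomputed figures agree with the printed ones up to rounding.

```r
# Estimates and standard errors copied (rounded) from summary(da_fit).
est <- c(P_DIFF = 0.001553, RHO = -0.019191)
se <- c(P_DIFF = 0.02176, RHO = 0.01807)

z_value <- est / se
p_value <- 2 * pnorm(-abs(z_value))  # two-sided p-value under normality

round(cbind(est, se, z_value, p_value), 4)
```

Neither coefficient is statistically distinguishable from zero at conventional levels. For RHO this agrees with the simulation, which sets rho_ds to zero.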

Model selection

The deterministic adjustment model has price dynamics that are analogous to excess demand and estimates one extra parameter. The directional model, which is not estimated in this example, estimates one parameter fewer because it does not have enough equations to identify prices in both the demand and supply equations. The estimated parameters are summarized as follows.

da_coef <- coef(da_fit)
coef_names <- names(da_coef)

sim_coef <- c(
  alpha_d, beta_d0, beta_d, eta_d,
  alpha_s, beta_s0, beta_s, eta_s,
  NA,
  sigma_d, sigma_s,
  rho_ds
)

coef_tbl <- function(fit) {
    # Name the estimate column after the object passed in (e.g. "da_fit").
    dt <- data.frame(names(coef(fit)), coef(fit))
    names(dt) <- c("coef", deparse(substitute(fit)))
    dt
}

comp <- coef_tbl(da_fit) |>
    dplyr::left_join(coef_tbl(bs_fit), by = "coef") |>
    dplyr::left_join(coef_tbl(eq_reg), by = "coef") |>
    dplyr::left_join(coef_tbl(eq_fit), by = "coef") |>
    dplyr::mutate(sim = sim_coef,
                  da_fit_err = abs(da_fit - sim),
                  bs_fit_err = abs(bs_fit - sim),
                  eq_fit_err = abs(eq_fit - sim))

comp
#>          coef       da_fit     bs_fit     eq_reg      eq_fit   sim  da_fit_err
#> 1         D_P -3.875716134 -3.5088088 -3.8765576 -3.87656066 -3.90 0.024283866
#> 2     D_CONST 28.891196214 29.1228978 28.8908936 28.89089040 28.90 0.008803786
#> 3       D_Xd1  2.105016163  2.0718884  2.1050973  2.10503395  2.10 0.005016163
#> 4       D_Xd2 -0.707967193 -0.7088276 -0.7077763 -0.70796640 -0.70 0.007967193
#> 5        D_X1  3.507453596  3.3652961  3.5074743  3.50747721  3.50 0.007453596
#> 6        D_X2  6.246539693  6.1435192  6.2465504  6.24654862  6.25 0.003460307
#> 7         S_P  2.842225125  2.0706215  2.8426868  2.84299194  2.80 0.042225125
#> 8     S_CONST 26.187074381 27.8541734 26.1862350 26.18611453 26.20 0.012925619
#> 9       S_Xs1  2.677786510  2.5828274  2.6776568  2.67778036  2.65 0.027786510
#> 10       S_X1  1.162670211  1.4475854  1.1627870  1.16268125  1.15 0.012670211
#> 11       S_X2  4.188174719  4.3986442  4.1882859  4.18819002  4.20 0.011825281
#> 12     P_DIFF  0.001552936         NA         NA          NA    NA          NA
#> 13 D_VARIANCE  0.649372950  0.6386261  0.5338621  0.64937985  0.80 0.150627050
#> 14 S_VARIANCE  1.280272325  1.1248997  0.5340809  1.28025622  1.10 0.180272325
#> 15        RHO -0.019190550 -0.4708618  0.9997951 -0.01919118  0.00 0.019190550
#>     bs_fit_err  eq_fit_err
#> 1  0.391191222 0.023439340
#> 2  0.222897771 0.009109601
#> 3  0.028111597 0.005033951
#> 4  0.008827565 0.007966401
#> 5  0.134703926 0.007477205
#> 6  0.106480801 0.003451378
#> 7  0.729378471 0.042991936
#> 8  1.654173423 0.013885467
#> 9  0.067172625 0.027780365
#> 10 0.297585420 0.012681255
#> 11 0.198644190 0.011809976
#> 12          NA          NA
#> 13 0.161373855 0.150620153
#> 14 0.024899730 0.180256224
#> 15 0.470861756 0.019191185

Since we have used simulated data, we can calculate the average absolute error of the parameter estimates for each of the models. In practice, the population values are unknown, so this calculation is not possible.
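
As a minimal sketch of this calculation, the block below averages the absolute errors over the eleven coefficients of the demand and supply equations. The numbers are copied from the comparison table and rounded to four decimals. The variance and correlation rows are left out to keep the example short, and the object names are illustrative.

```r
# Simulated values for D_P, D_CONST, D_Xd1, D_Xd2, D_X1, D_X2,
# S_P, S_CONST, S_Xs1, S_X1, S_X2 (in this order).
sim <- c(-3.9, 28.9, 2.1, -0.7, 3.5, 6.25, 2.8, 26.2, 2.65, 1.15, 4.2)

# Point estimates copied from the comparison table, rounded to four decimals.
estimates <- data.frame(
  da_fit = c(-3.8757, 28.8912, 2.1050, -0.7080, 3.5075, 6.2465,
             2.8422, 26.1871, 2.6778, 1.1627, 4.1882),
  bs_fit = c(-3.5088, 29.1229, 2.0719, -0.7088, 3.3653, 6.1435,
             2.0706, 27.8542, 2.5828, 1.4476, 4.3986),
  eq_fit = c(-3.8766, 28.8909, 2.1050, -0.7080, 3.5075, 6.2465,
             2.8430, 26.1861, 2.6778, 1.1627, 4.1882)
)

# Mean absolute deviation from the simulated values, one entry per model.
avg_abs_err <- sapply(estimates, function(x) mean(abs(x - sim)))
avg_abs_err
```

On these coefficients the basic disequilibrium model is visibly further from the simulated values than the other two models.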

Moreover, the average absolute error cannot serve as an overall assessment of the estimation because the market models have different parameter spaces. To assess the overall model performance, one can instead use an information criterion.
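
As a hedged illustration, the Akaike and Bayesian information criteria can be computed by hand from the -2 log L values reported in the summaries above, using the textbook formulas AIC = -2 log L + 2k and BIC = -2 log L + k log(n). The parameter counts are read off the coefficient tables and the number of observations off the Nobs lines. The 2SLS fit eq_reg is omitted because its summary reports no likelihood. The sketch does not rely on any helper the package may provide for this purpose.

```r
# -2 log L values as reported by summary() for the likelihood-based fits.
minus2_loglik <- c(eq_fit = 6722.675, bs_fit = 8338.912, da_fit = 6722.67)
# Number of estimated parameters (rows of each coefficient table).
n_par <- c(eq_fit = 14, bs_fit = 14, da_fit = 15)
# Number of observations used by each model.
n_obs <- 4000

aic <- minus2_loglik + 2 * n_par
bic <- minus2_loglik + log(n_obs) * n_par

rbind(aic, bic)
names(which.min(aic))  # model with the smallest AIC
names(which.min(bic))  # model with the smallest BIC
```

Lower values are preferred. The deterministic adjustment model attains almost the same likelihood as the equilibrium model but pays a penalty for its additional parameter, so both criteria favor the equilibrium model in this sketch.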