statspai.structural
Structural estimation methods.
BLPResult
Results from BLP demand estimation.
Attributes:
| Name | Type | Description |
|---|---|---|
| `linear_params` | `Series` | Linear parameter estimates (β, α). |
| `nonlinear_params` | `Series` | Nonlinear parameter estimates (σ, random coefficient std devs). |
| `se_linear` | `Series` | Standard errors for linear parameters. |
| `se_nonlinear` | `Series` | Standard errors for nonlinear parameters. |
| `mean_utility` | `Series` | Estimated mean utility δ for each product-market. |
| `own_elasticities` | `Series` | Own-price elasticities for each product-market. |
| `n_markets` | `int` | Number of markets. |
| `n_products` | `int` | Total number of product-market observations. |
| `gmm_objective` | `float` | Value of the GMM objective at the optimum. |
| `converged` | `bool` | Whether the outer-loop optimization converged. |
elasticity_matrix
Return the full own- and cross-price elasticity matrix for a market.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `market_id` | hashable | Market to return. If `None`, returns the first market. | `None` |

Returns:
| Type | Description |
|---|---|
| `DataFrame` | (J_m x J_m) elasticity matrix with product labels. |
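
To make the shape of an own- and cross-price elasticity matrix concrete, the following is a minimal sketch for a plain logit model without random coefficients. It is not the library's implementation: the function name `logit_elasticity_matrix`, the utility specification `u_j = delta_j - alpha * p_j`, and the row/column orientation are all assumptions made for illustration.

```python
def logit_elasticity_matrix(prices, shares, alpha):
    """Plain-logit elasticities, E[j][k] = d log s_j / d log p_k.

    Assumes utility u_j = delta_j - alpha * p_j + eps_j with alpha > 0
    (an illustrative specification, not necessarily the library's).
    """
    n = len(shares)
    E = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                # Own-price elasticity: -alpha * p_j * (1 - s_j)
                E[j][k] = -alpha * prices[j] * (1.0 - shares[j])
            else:
                # Cross-price elasticity: alpha * p_k * s_k
                E[j][k] = alpha * prices[k] * shares[k]
    return E


E = logit_elasticity_matrix(prices=[1.0, 2.0], shares=[0.2, 0.3], alpha=2.0)
```

In this special case every entry in column k off the diagonal is identical, which is the restrictive substitution pattern that random coefficients are meant to relax.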
diversion_ratios
Compute diversion ratios for a given market.
Diversion ratio D_{jk} = fraction of consumers leaving product j that switch to product k (rather than the outside option or other products). D_{jk} = (ds_k/dp_j) / (-ds_j/dp_j).
With logit-type models this simplifies to cross-elasticity ratios adjusted by shares.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `market_id` | hashable | Market to compute for. If `None`, uses the first market. | `None` |

Returns:
| Type | Description |
|---|---|
| `DataFrame` | |
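
As a worked instance of the definition D_{jk} = (ds_k/dp_j) / (-ds_j/dp_j), the sketch below evaluates it for a plain logit model, where ds_k/dp_j = alpha * s_j * s_k and ds_j/dp_j = -alpha * s_j * (1 - s_j). The helper name `logit_diversion_ratios` and the plain-logit share derivatives are assumptions for illustration only; the method above may compute the ratios differently when random coefficients are present.

```python
def logit_diversion_ratios(shares, alpha=1.0):
    """Diversion ratios D[j][k] under plain logit (illustrative sketch).

    Diagonal entries are left as NaN because diversion from a product
    to itself is not defined.
    """
    n = len(shares)
    D = [[float("nan")] * n for _ in range(n)]
    for j in range(n):
        own = -alpha * shares[j] * (1.0 - shares[j])   # ds_j/dp_j
        for k in range(n):
            if k != j:
                cross = alpha * shares[j] * shares[k]  # ds_k/dp_j
                D[j][k] = cross / (-own)               # equals s_k / (1 - s_j)
    return D


D = logit_diversion_ratios([0.2, 0.3, 0.1])
```

The price coefficient cancels, so under plain logit the ratio depends on shares alone; whatever part of a row does not sum to one is diverted to the outside option.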
to_econometric_results
to_econometric_results() -> EconometricResults
Convert to a standard EconometricResults object.
ProductionResult
Bases: EconometricResults
Result object for production function estimation.
Inherits `params`, `std_errors`, `summary`, and `to_dict` from `EconometricResults` and adds a production-function-specific payload:
- `coef`: input elasticities keyed by input name (e.g. `{"l": 0.62, "k": 0.31}`)
- `tfp`: firm-time TFP estimates `omega_it` (in logs); same length as the post-stage-2 working sample
- `residuals`: i.i.d. shock `eta_it` from stage 1
- `productivity_process`: `{"rho": float, "sigma": float}` from the AR fit on omega
- `markup`: placeholder; populated by `statspai.markup`
Use .summary() for a Stata-style table or .coef for the raw dict.
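
To illustrate what a `{"rho": ..., "sigma": ...}` pair from an AR fit on omega could look like, here is a minimal sketch of a pooled AR(1) regression of `omega_it` on its own lag. The helper `fit_ar1`, the inclusion of an intercept, and the `n - 2` degrees-of-freedom correction are assumptions; the library's actual fit may differ on each of these points.

```python
from math import sqrt


def fit_ar1(pairs):
    """OLS of omega_t on a constant and omega_{t-1}.

    `pairs` is a list of (omega_lag, omega) tuples pooled across firms.
    Returns the slope as "rho" and the residual std dev as "sigma".
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    rho = sxy / sxx
    const = mean_y - rho * mean_x
    resid = [y - const - rho * x for x, y in pairs]
    sigma = sqrt(sum(e * e for e in resid) / (n - 2))
    return {"rho": rho, "sigma": sigma}


# Toy data lying exactly on omega_t = 0.5 + 0.8 * omega_{t-1}
process = fit_ar1([(0.0, 0.5), (1.0, 1.3), (2.0, 2.1)])
```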
prod_fn
prod_fn(data: DataFrame, output: str = 'y', free: Sequence[str] | str | None = None, state: Sequence[str] | str | None = None, proxy: str | None = None, panel_id: str = 'id', time: str = 'year', method: str = 'acf', polynomial_degree: int = 3, productivity_degree: int = 1, functional_form: str = 'cobb-douglas', boot_reps: int = 0, seed: Optional[int] = None, **kwargs) -> ProductionResult
Production function estimation — unified interface.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | Long panel with one row per (firm, year). | required |
| `output` | `str` | Log output column. | `'y'` |
| `free` | `str` or `list` | Free inputs (e.g. labor). | `["l"]` |
| `state` | `str` or `list` | State / predetermined inputs (capital). | `["k"]` |
| `proxy` | `str` | Productivity proxy. If `None`, a default is used; the standalone estimators default to `"i"` for `olley_pakes` and `"m"` for the other three. | `None` |
| `panel_id` | `str` | Firm (panel) identifier column. | `'id'` |
| `time` | `str` | Time period column. | `'year'` |
| `method` | `('op', 'lp', 'acf', 'wrdg')` | Estimator. ACF is the default; it corrects the OP/LP identification problem. | `'acf'` |
| `polynomial_degree` | `int` | Stage-1 control function polynomial degree. | `3` |
| `productivity_degree` | `int` | Productivity AR polynomial degree. | `1` |
| `functional_form` | `('cobb-douglas', 'translog')` | Functional form. Translog adds `0.5 * x_j**2` own-quadratic terms and `x_j * x_k` cross terms, so output elasticities vary by firm-time. Translog identification caveat: stage-2 instruments are formed as polynomial transforms of the same raw set used for Cobb-Douglas. Wooldridge does not yet support translog and raises an error. | `'cobb-douglas'` |
| `boot_reps` | `int` | Firm-cluster bootstrap replications. | `0` |
| `seed` | `int` | Bootstrap RNG seed. | `None` |

Returns:
| Type | Description |
|---|---|
| `ProductionResult` | |
Examples:
>>> import statspai as sp
>>> res = sp.prod_fn(df, output="y", free="l", state="k", proxy="m",
... panel_id="id", time="year",
... method="acf", boot_reps=200, seed=0)
>>> res.coef
{"l": 0.62, "k": 0.31}
>>> mu = sp.markup(res, revenue="log_rev", input_cost="log_mat",
... flexible_input="m")
See Also
olley_pakes, levinsohn_petrin, ackerberg_caves_frazer, wooldridge_prod
markup : De Loecker-Warzynski (2012) firm-time markup.
References
Olley & Pakes (1996); Levinsohn & Petrin (2003); Ackerberg, Caves & Frazer (2015); Wooldridge (2009).
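
The translog option of `functional_form` adds own-quadratic terms of the form 0.5 * x_j^2 and cross terms x_j * x_k, which is why output elasticities vary by firm-time. The sketch below builds those terms for one observation and evaluates the implied elasticity. The helper names and the coefficient keys (`"l^2"`, `"k*l"`) are hypothetical naming choices for this illustration, not the library's, and the coefficient values are made up.

```python
def translog_terms(x):
    """Regressors for one firm-year given log inputs, e.g. {"l": 2.0, "k": 3.0}."""
    names = sorted(x)
    terms = {name: x[name] for name in names}
    for i, a in enumerate(names):
        terms[f"{a}^2"] = 0.5 * x[a] ** 2          # own-quadratic term
        for b in names[i + 1:]:
            terms[f"{a}*{b}"] = x[a] * x[b]        # cross term
    return terms


def translog_elasticity(coefs, x, wrt):
    """d y / d x_wrt = beta_j + beta_jj * x_j + sum over k != j of beta_jk * x_k."""
    e = coefs[wrt] + coefs.get(f"{wrt}^2", 0.0) * x[wrt]
    for other in x:
        if other != wrt:
            key = "*".join(sorted((wrt, other)))
            e += coefs.get(key, 0.0) * x[other]
    return e


x = {"l": 2.0, "k": 3.0}
coefs = {"l": 0.6, "k": 0.3, "l^2": 0.02, "k^2": 0.01, "k*l": -0.01}  # made-up values
labor_elasticity = translog_elasticity(coefs, x, "l")
```

With all second-order coefficients set to zero the elasticity collapses to the constant Cobb-Douglas coefficient.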
olley_pakes
olley_pakes(data: DataFrame, output: str = 'y', free: Sequence[str] | str | None = None, state: Sequence[str] | str | None = None, proxy: str = 'i', panel_id: str = 'id', time: str = 'year', polynomial_degree: int = 3, productivity_degree: int = 1, functional_form: str = 'cobb-douglas', boot_reps: int = 0, seed: Optional[int] = None, drop_zero_proxy: bool = True) -> ProductionResult
Olley-Pakes (1996) production function estimator.
Uses investment as the proxy for unobserved productivity. Firms with zero investment are dropped by default — the inversion of the investment policy requires a strictly positive proxy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | Long-form panel: one row per (firm, year). | required |
| `output` | `str` | Log output column. | `"y"` |
| `free` | `str` or `list` | Freely chosen inputs (e.g. labor). Multiple are allowed. | `["l"]` |
| `state` | `str` or `list` | State inputs (capital, predetermined). | `["k"]` |
| `proxy` | `str` | Investment column (must be > 0 to invert). | `"i"` |
| `panel_id` | `str` | Firm identifier column. | `'id'` |
| `time` | `str` | Year identifier column. | `'year'` |
| `polynomial_degree` | `int` | Degree of the stage-1 polynomial in (free, state, proxy). | `3` |
| `productivity_degree` | `int` | Degree of the polynomial g in the AR productivity process. | `1` |
| `functional_form` | `('cobb-douglas', 'translog')` | Production function form. Translog adds quadratic and cross terms. | `'cobb-douglas'` |
| `boot_reps` | `int` | Firm-cluster bootstrap replications. | `0` |
| `seed` | `int` | Bootstrap RNG seed. | `None` |
| `drop_zero_proxy` | `bool` | Drop rows with non-positive proxy (required by the OP inversion). Note that dropping period ... | `True` |

Returns:
| Type | Description |
|---|---|
| `ProductionResult` | |
Examples:
>>> import statspai as sp
>>> res = sp.olley_pakes(df, output="y", free="l", state="k",
... proxy="i", panel_id="id", time="year",
... boot_reps=200, seed=0)
>>> res.coef # {"l": 0.62, "k": 0.31}
>>> res.summary()
References
Olley, G.S. & Pakes, A. (1996). The dynamics of productivity in the telecommunications equipment industry. Econometrica, 64(6), 1263-1297.
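
The default treatment of zero investment described for `drop_zero_proxy` amounts to a row filter applied before estimation. The sketch below shows that filter on a list of row dicts and reports how many rows it removes, which is worth checking because a large count signals a selected sample. The helper `drop_nonpositive_proxy` and the row layout are hypothetical and only illustrate the idea.

```python
def drop_nonpositive_proxy(rows, proxy="i"):
    """Keep rows whose proxy value is strictly positive.

    Returns the kept rows and the number of rows dropped.
    """
    kept = [r for r in rows if r[proxy] is not None and r[proxy] > 0]
    return kept, len(rows) - len(kept)


panel = [
    {"id": 1, "year": 2000, "i": 5.0},
    {"id": 1, "year": 2001, "i": 0.0},   # zero investment: not invertible
    {"id": 2, "year": 2000, "i": 2.5},
    {"id": 2, "year": 2001, "i": None},  # missing investment
]
kept, n_dropped = drop_nonpositive_proxy(panel)
```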
levinsohn_petrin
levinsohn_petrin(data: DataFrame, output: str = 'y', free: Sequence[str] | str | None = None, state: Sequence[str] | str | None = None, proxy: str = 'm', panel_id: str = 'id', time: str = 'year', polynomial_degree: int = 3, productivity_degree: int = 1, functional_form: str = 'cobb-douglas', boot_reps: int = 0, seed: Optional[int] = None) -> ProductionResult
Levinsohn-Petrin (2003) production function estimator.
Uses intermediate input (materials / energy) as the productivity proxy. Avoids the OP zero-investment selection problem because most firms use materials in every period.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | Long-form panel. | required |
| `output` | `str` | Log output. | `'y'` |
| `free` | `str` or `list` | Free inputs. | `["l"]` |
| `state` | `str` or `list` | State inputs. | `["k"]` |
| `proxy` | `str` | Intermediate input (materials). | `"m"` |
| `panel_id` | `str` | | `'id'` |
| `time` | `str` | | `'year'` |
| `polynomial_degree` | `int` | | `3` |
| `productivity_degree` | `int` | | `1` |
| `functional_form` | `('cobb-douglas', 'translog')` | | `'cobb-douglas'` |
| `boot_reps` | `int` | | `0` |
| `seed` | `int` | | `None` |

Returns:
| Type | Description |
|---|---|
| `ProductionResult` | |
References
Levinsohn, J. & Petrin, A. (2003). Estimating production functions using inputs to control for unobservables. Review of Economic Studies, 70(2), 317-341.
ackerberg_caves_frazer
ackerberg_caves_frazer(data: DataFrame, output: str = 'y', free: Sequence[str] | str | None = None, state: Sequence[str] | str | None = None, proxy: str = 'm', panel_id: str = 'id', time: str = 'year', polynomial_degree: int = 3, productivity_degree: int = 1, functional_form: str = 'cobb-douglas', boot_reps: int = 0, seed: Optional[int] = None) -> ProductionResult
Ackerberg-Caves-Frazer (2015) production function estimator.
Corrects the OP / LP "functional dependence" identification problem: when free inputs (labor) are chosen at the same time as the proxy, the labor coefficient is not identified in the stage-1 polynomial. ACF moves all coefficient identification to stage 2, instrumenting free inputs with their lagged values and state inputs at the contemporaneous level.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | | required |
| `output` | `str` | | `'y'` |
| `free` | `str` or `list` | Free inputs, instrumented with their lag in stage 2. | `["l"]` |
| `state` | `str` or `list` | | `["k"]` |
| `proxy` | `str` | Intermediate input (materials). | `"m"` |
| `panel_id` | `str` | | `'id'` |
| `time` | `str` | | `'year'` |
| `polynomial_degree` | `int` | | `3` |
| `productivity_degree` | `int` | | `1` |
| `functional_form` | `('cobb-douglas', 'translog')` | | `'cobb-douglas'` |
| `boot_reps` | `int` | | `0` |
| `seed` | `int` | | `None` |

Returns:
| Type | Description |
|---|---|
| `ProductionResult` | |
Notes
Requires at least two consecutive time periods per firm so that lagged labor exists.
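
The requirement above can be checked before estimation by constructing the lagged free input and seeing which rows survive. The following sketch keeps a firm-year only when the same firm is also observed in the immediately preceding period. The helper `add_lag`, the list-of-dicts layout, and the assumption that consecutive periods differ by exactly one unit of `time` are illustrative choices, not the library's internals.

```python
def add_lag(rows, var="l", panel_id="id", time="year"):
    """Attach `<var>_lag` from the same firm's previous period.

    Rows without an observation at time - 1 are dropped, so a firm seen
    in only one period (or only in non-consecutive periods) contributes
    nothing.
    """
    by_key = {(r[panel_id], r[time]): r for r in rows}
    out = []
    for r in rows:
        prev = by_key.get((r[panel_id], r[time] - 1))
        if prev is not None:
            out.append({**r, f"{var}_lag": prev[var]})
    return out


panel = [
    {"id": 1, "year": 2000, "l": 1.0},
    {"id": 1, "year": 2001, "l": 1.2},
    {"id": 1, "year": 2003, "l": 1.5},  # gap: 2002 is missing
    {"id": 2, "year": 2000, "l": 0.7},  # single period
]
usable = add_lag(panel)
```

Only firm 1 in 2001 survives here, which shows how gaps and short panels shrink the stage-2 sample.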
References
Ackerberg, D.A., Caves, K. & Frazer, G. (2015). Identification properties of recent production function estimators. Econometrica, 83(6), 2411-2451.
wooldridge_prod
wooldridge_prod(data: DataFrame, output: str = 'y', free: Sequence[str] | str | None = None, state: Sequence[str] | str | None = None, proxy: str = 'm', panel_id: str = 'id', time: str = 'year', polynomial_degree: int = 2, productivity_degree: int = 2, functional_form: str = 'cobb-douglas', boot_reps: int = 0, seed: Optional[int] = None) -> ProductionResult
Wooldridge (2009) joint production function estimator (stacked NLS).
Estimates (beta_l, beta_k) jointly with the nonparametric
control function h(m, k) and productivity Markov polynomial
g(omega_{t-1}) by minimizing the sum of squared residuals over
a stacked level + productivity-substituted equation system. This
is equivalent to one-step GMM with identity weight matrix and
instruments equal to the regressors (NLS). A full GMM version
with optimal weighting is on the roadmap.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | Long panel with one row per (firm, year). | required |
| `output` | `str` | Log output column. | `"y"` |
| `free` | `str` or `list` | Free inputs (labor). | `["l"]` |
| `state` | `str` or `list` | State inputs (capital). | `["k"]` |
| `proxy` | `str` | Productivity proxy (typically intermediate input). | `"m"` |
| `panel_id` | `str` | Firm (panel) identifier column. | `'id'` |
| `time` | `str` | Time period column. | `'year'` |
| `polynomial_degree` | `int` | Degree of the control function polynomial. | `2` |
| `productivity_degree` | `int` | Degree of the AR polynomial. | `2` |
| `boot_reps` | `int` | Firm-cluster bootstrap replications. | `0` |
| `seed` | `int` | | `None` |

Returns:
| Type | Description |
|---|---|
| `ProductionResult` | |
References
Wooldridge, J.M. (2009). On estimating firm-level production functions using proxy variables to control for unobservables. Economics Letters, 104(3), 112-114.