statspai.core

core

Core package initialization

BaseModel

Bases: ABC

Abstract base class for all econometric models

fit abstractmethod

fit(**kwargs) -> EconometricResults

Fit the econometric model

Returns:

Type Description
EconometricResults

Fitted model results

predict abstractmethod

predict(data: Optional[DataFrame] = None) -> ndarray

Generate predictions from the fitted model

Parameters:

Name Type Description Default
data DataFrame

Data for prediction. If None, uses training data.

None

Returns:

Type Description
ndarray

Predicted values

summary

summary() -> str

Return a summary of the fitted model

Returns:

Type Description
str

Model summary
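Because fit and predict are abstract, a subclass has to implement both before it can be instantiated. The sketch below illustrates that contract with a stand-in ABC rather than the real BaseModel. The names ToyBaseModel, MeanModel and Incomplete are hypothetical, and plain lists replace DataFrame / ndarray to keep the example dependency-free.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class ToyBaseModel(ABC):
    """Stand-in for BaseModel (hypothetical; not the statspai class)."""

    @abstractmethod
    def fit(self, **kwargs):
        """Fit the model and return a results object."""

    @abstractmethod
    def predict(self, data: Optional[List[float]] = None) -> List[float]:
        """Return predictions; fall back to training data when data is None."""


class MeanModel(ToyBaseModel):
    """Hypothetical model that predicts the training mean for every row."""

    def __init__(self, y: List[float]):
        self.y = y
        self.mean_ = None

    def fit(self, **kwargs):
        self.mean_ = sum(self.y) / len(self.y)
        return self  # a real model would return EconometricResults

    def predict(self, data=None):
        rows = self.y if data is None else data
        return [self.mean_ for _ in rows]


class Incomplete(ToyBaseModel):
    """Implements fit only, so the class stays abstract."""

    def fit(self, **kwargs):
        return self


model = MeanModel([1.0, 2.0, 3.0])
model.fit()
in_sample = model.predict()                # one prediction per training row
out_of_sample = model.predict([0.0, 0.0])  # one prediction per new row

try:
    Incomplete()
    instantiated = True
except TypeError:
    instantiated = False  # the missing predict() blocks instantiation
```

The same mechanism applies to any ABC with abstract methods: Python raises TypeError at construction time, not at call time.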

BaseEstimator

Bases: ABC

Abstract base class for estimation algorithms

estimate abstractmethod

estimate(y: ndarray, X: ndarray, **kwargs) -> Dict[str, Any]

Estimate model parameters

Parameters:

Name Type Description Default
y ndarray

Dependent variable

required
X ndarray

Independent variables

required
**kwargs

Additional estimation options

{}

Returns:

Type Description
Dict[str, Any]

Estimation results including parameters, standard errors, etc.
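As an illustration of the estimate contract, here is a minimal sketch of an OLS estimator written against a stand-in ABC. ToyBaseEstimator and ToyOLS are hypothetical names, and the returned keys (params, std_errors, residuals, nobs, df_resid) are assumptions, since the reference only says the dict includes parameters and standard errors.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

import numpy as np


class ToyBaseEstimator(ABC):
    """Stand-in for BaseEstimator (hypothetical; not the statspai class)."""

    @abstractmethod
    def estimate(self, y: np.ndarray, X: np.ndarray, **kwargs) -> Dict[str, Any]:
        ...


class ToyOLS(ToyBaseEstimator):
    """Hypothetical OLS estimator with classical (homoskedastic) standard errors."""

    def estimate(self, y, X, **kwargs):
        n, k = X.shape
        # Least-squares coefficients; lstsq avoids inverting X'X for the fit itself.
        params, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ params
        sigma2 = resid @ resid / (n - k)        # residual variance
        cov = sigma2 * np.linalg.inv(X.T @ X)   # classical covariance matrix
        return {
            "params": params,
            "std_errors": np.sqrt(np.diag(cov)),
            "residuals": resid,
            "nobs": n,
            "df_resid": n - k,
        }


# Noise-free toy data: y = 1 + 2x, so the fit should recover (1, 2).
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
out = ToyOLS().estimate(y, X)
```

Note that X carries an explicit column of ones; whether the real estimators add an intercept themselves is not stated here.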

EconometricResults

Unified results class for econometric models

This class provides a consistent interface for accessing results from different econometric estimators, similar to R's broom package.

summary

summary(alpha: float = 0.05) -> str

Generate a summary table of results

Parameters:

Name Type Description Default
alpha float

Significance level for confidence intervals

0.05

Returns:

Type Description
str

Formatted summary table

conf_int

conf_int(alpha: float = 0.05) -> DataFrame

Return confidence intervals for parameters

Parameters:

Name Type Description Default
alpha float

Significance level

0.05

Returns:

Type Description
DataFrame

Confidence intervals
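conf_int is parameterised by a significance level (alpha), whereas tidy takes a confidence level (conf_level); the two are related by conf_level = 1 - alpha. The sketch below shows that relationship with a hypothetical helper, normal_conf_int, which uses a normal critical value from the standard library. That is an assumption made for illustration: the library may well use a t distribution instead.

```python
from statistics import NormalDist


def normal_conf_int(estimate: float, std_error: float, alpha: float = 0.05):
    """Two-sided (1 - alpha) interval using a normal critical value.

    Assumption: normal approximation; statspai may use a t distribution.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * std_error, estimate + z * std_error


conf_level = 0.95
alpha = 1 - conf_level  # conf_level=0.95 corresponds to alpha=0.05
low, high = normal_conf_int(2.0, 0.5, alpha=alpha)
```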

tidy

tidy(conf_level: float = 0.95) -> DataFrame

Return a long-format DataFrame of coefficients, broom-style.

Columns

term : str
    Variable / coefficient name.
estimate : float
std_error : float
statistic : float
    t-statistic.
p_value : float
conf_low, conf_high : float
    Two-sided conf_level CI.

Examples:

>>> result = sp.regress("y ~ x1 + x2", data=df)
>>> result.tidy()
   term  estimate  std_error  statistic  p_value  conf_low  conf_high
0  Intercept     ...

See Also

glance : 1-row model-level summary (R^2, F, N, AIC, BIC).

glance

glance() -> DataFrame

Return a 1-row DataFrame of model-level statistics, broom-style.

Columns (present subset depends on the model type)

nobs : int
r_squared : float
adj_r_squared : float
f_statistic : float
f_p_value : float
aic, bic : float
df_resid, df_model : int
method : str
    Estimation method label.

See Also

tidy : long-format coefficient table.

predict

predict(data: Optional[DataFrame] = None) -> ndarray

Generate predictions from the fitted model.

Parameters:

Name Type Description Default
data DataFrame

New data for out-of-sample prediction. If None, returns in-sample fitted values.

None

Returns:

Type Description
ndarray

Predicted values.

residuals

residuals() -> Optional[ndarray]

Return model residuals if available

Returns:

Type Description
ndarray or None

Residuals

fitted_values

fitted_values() -> Optional[ndarray]

Return fitted values if available

Returns:

Type Description
ndarray or None

Fitted values

next_steps

next_steps(print_result: bool = True) -> List[Dict[str, str]]

Agent-native workflow guidance: what to do after fitting this model.

Returns a list of recommended next steps — diagnostics, robustness checks, sensitivity analysis, and export options — tailored to the model type (OLS, IV, panel, etc.).

Parameters:

Name Type Description Default
print_result bool

Print formatted recommendations to stdout.

True

Returns:

Type Description
list of dict

Each dict has keys: action, reason, priority, category.

Examples:

>>> result = sp.regress("y ~ x1 + x2", data=df)
>>> result.next_steps()
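
Since the return value is a plain list of dicts, it can be post-processed without any library support. The following sketch groups recommendations by category. The records are made up for illustration; only the four keys come from the documentation, and group_by_category is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical records with the documented keys; every value is invented.
steps: List[Dict[str, str]] = [
    {"action": "check residual plot", "reason": "functional form",
     "priority": "high", "category": "diagnostics"},
    {"action": "try robust standard errors", "reason": "heteroskedasticity",
     "priority": "medium", "category": "robustness"},
    {"action": "export table", "reason": "reporting",
     "priority": "low", "category": "export"},
    {"action": "test for outliers", "reason": "influential points",
     "priority": "medium", "category": "diagnostics"},
]


def group_by_category(items: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each category to its actions, preserving the original order."""
    grouped = defaultdict(list)
    for step in items:
        grouped[step["category"]].append(step["action"])
    return dict(grouped)


by_category = group_by_category(steps)
```

When the list is consumed programmatically, passing print_result=False avoids the formatted output on stdout.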

violations

violations() -> List[Dict[str, Any]]

Agent-native structured list of assumption / diagnostic issues.

Inspects stored diagnostics (first-stage F, standard error finiteness, …) and returns any flagged concerns as dicts with keys kind, severity, test, value, threshold, message, recovery_hint, alternatives.

Returns:

Type Description
list of dict

Empty list if nothing flagged.

Examples:

>>> result = sp.iv("y ~ (x ~ z) + c", data=df)
>>> for v in result.violations():
...     if v['severity'] == 'error':
...         print(v['recovery_hint'])

to_agent_summary

to_agent_summary() -> Dict[str, Any]

JSON-ready nested summary for agent consumption.

Unlike summary() (prose for humans) and tidy() (long-form DataFrame), this returns a plain dict with coefficients, scalar diagnostics, violations, and recommended next steps — suitable for feeding into an LLM tool loop or logging.

Returns:

Type Description
dict

Keys: kind, model_type, robust, n_obs, df_resid, dependent_var, coefficients, diagnostics, violations, next_steps.

See Also

to_dict : Canonical flat agent payload — prefer to_dict(detail="agent") for new code. to_agent_summary is kept because it surfaces a richer kind / model_type / robust triplet that to_dict collapses into a single method field; two methods, two intentionally different shapes.

Examples:

>>> result = sp.regress("y ~ x", data=df)
>>> import json
>>> agent_payload = json.dumps(result.to_agent_summary())
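
A consumer of this payload may want to confirm that the documented top-level keys are present before passing it on. A minimal sketch follows, assuming a hand-written payload in place of a real result. missing_keys is a hypothetical helper and the sample values are invented.

```python
import json

# Top-level keys listed in the Returns section above.
EXPECTED_KEYS = {
    "kind", "model_type", "robust", "n_obs", "df_resid", "dependent_var",
    "coefficients", "diagnostics", "violations", "next_steps",
}


def missing_keys(payload: dict) -> set:
    """Return the documented top-level keys that are absent from payload."""
    return EXPECTED_KEYS - set(payload)


# Hypothetical payload; real values would come from result.to_agent_summary().
payload = {
    "kind": "regression", "model_type": "OLS", "robust": False,
    "n_obs": 100, "df_resid": 98, "dependent_var": "y",
    "coefficients": [], "diagnostics": {}, "violations": [], "next_steps": [],
}

# JSON-ready means the dict survives a dumps/loads round trip unchanged.
round_tripped = json.loads(json.dumps(payload))
```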

to_docx

to_docx(filename: str, title: Optional[str] = None)

Export results to a Word (.docx) document.

Parameters:

Name Type Description Default
filename str

Output path (.docx).

required
title str

Table title. Defaults to model type.

None

to_latex

to_latex(path: Optional[str] = None, *, caption: Optional[str] = None, label: Optional[str] = None, siunitx: bool = False, threeparttable: bool = False, siunitx_preamble: bool = False, **kwargs: Any) -> str

Render the regression result as a publication-quality LaTeX table.

A thin wrapper over statspai.output.regtable (single column). Produces a booktabs-style \begin{table} float with significance stars and standard errors in parentheses.

Parameters:

Name Type Description Default
path str

If given, the LaTeX source is also written to this file (UTF-8). The string is always returned.

None
caption str

\caption{...} text (maps to regtable(title=...)).

None
label str

\label{...} cross-reference id, injected after the caption.

None
siunitx bool

Decimal-align numeric columns with siunitx S columns (journal style; requires \usepackage{siunitx} v3).

False
threeparttable bool

Wrap the table in threeparttable with a tablenotes block (requires \usepackage{threeparttable}).

False
siunitx_preamble bool

Prepend a comment listing the required \usepackage lines.

False
**kwargs Any

Forwarded to statspai.output.regtable, e.g. coef_labels, keep, drop, order, stats, se_type, stars, star_levels, fmt, template, notes.

{}

Returns:

Type Description
str

LaTeX source.

Examples:

>>> import statspai as sp
>>> r = sp.regress("y ~ x + z", data=df)
>>> tex = r.to_latex(caption="Main results", label="tab:main",
...                  coef_labels={"x": "Treatment"}, template="aer")
>>> tex = r.to_latex(siunitx=True, threeparttable=True)  # journal style

to_html

to_html(path: Optional[str] = None, **kwargs: Any) -> str

Render the regression result as an HTML table.

A thin wrapper over statspai.output.regtable. Returns the HTML string; also writes it to path when given. See to_latex for the forwarded **kwargs.

to_markdown

to_markdown(path: Optional[str] = None, *, quarto: bool = False, **kwargs: Any) -> str

Render the regression result as a Markdown table.

A thin wrapper over statspai.output.regtable. Set quarto=True for Quarto-flavoured output. Returns the Markdown string; also writes it to path when given. See to_latex for the forwarded **kwargs.

to_excel

to_excel(path: str, **kwargs: Any) -> str

Write the regression result to a styled .xlsx workbook.

A thin wrapper over statspai.output.regtable with booktabs-style cell borders (requires openpyxl). Returns path. See to_latex for the forwarded **kwargs.

to_word

to_word(path: str, *, caption: Optional[str] = None, **kwargs: Any) -> str

Write the regression result to a publication-quality .docx.

A thin wrapper over statspai.output.regtable with AER/QJE booktabs rules and Times New Roman typography (requires python-docx). Returns path. Unlike to_docx (which renders the broom-style coefficient grid), this routes through the same publication table builder used for multi-model exports, so the single-model output matches a one-column regtable. See to_latex for the forwarded **kwargs.

to_dict

to_dict(*, detail: str = 'standard') -> Dict[str, Any]

Return a JSON-safe dict representation of the regression result.

Parameters:

Name Type Description Default
detail (minimal, standard, agent)

Payload depth, bounded by approximate token budget:

  • "minimal" (~ < 600 chars / < 150 tokens) — identity only: method, model_type, dependent_var, n_obs, plus fit_stats (R², F, AIC, BIC) when available. No coefficient table.
  • "standard" (variable, ~ 50 chars × n_terms) — full coefficient table + diagnostics + glance row. Matches the legacy to_dict() shape.
  • "agent": standard + violations + warnings + next_steps + suggested_functions. Equivalent to the legacy for_agent method and the form returned by sp.agent.execute_tool and the MCP server.

"standard"

Returns:

Type Description
dict

JSON-safe and bounded — round-trips through json.dumps.

Notes

Used by sp.agent.execute_tool to send results back to an LLM, and useful for caching / pickling-free persistence.
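
The pickling-free persistence mentioned above can be as simple as writing the dict to a JSON file and reading it back. A minimal sketch, assuming a hand-written payload that stands in for result.to_dict(detail="minimal"); the values and the file name result.json are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical payload standing in for result.to_dict(detail="minimal").
payload = {"method": "OLS", "model_type": "OLS", "dependent_var": "y", "n_obs": 100}

with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "result.json"
    # Write the payload as JSON, then read it back to confirm the round trip.
    cache_file.write_text(json.dumps(payload), encoding="utf-8")
    restored = json.loads(cache_file.read_text(encoding="utf-8"))

# Serialised length, comparable against the approximate "minimal" budget.
size_chars = len(json.dumps(payload))
```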

cite

cite(format: str = 'bibtex') -> Any

Return the canonical citation for this estimator, if registered.

Mirrors CausalResult.cite so an agent can call .cite() on any fitted result uniformly, and sp.bib_for(result) (which duck-types on .cite) now works for regression results too.

Parameters:

Name Type Description Default
format (bibtex, apa, json)

Same semantics as CausalResult.cite. "bibtex" / "apa" return str; "json" returns a structured dict.

"bibtex"

Returns:

Type Description
str | dict

Notes

Zero-hallucination (CLAUDE.md §10): BibTeX comes from the single source of truth CausalResult._CITATIONS (mirroring paper.bib); APA / JSON forms are derived from that string, never generated. Resolution is exact on model_info['citation_key'], then model_type, then method (normalised), so a textbook estimator with no canonical paper (OLS / logit / probit / poisson) honestly returns a placeholder rather than a fuzzy, and possibly wrong, match. Estimators that do have a canonical reference (e.g. tobit, heckman) should set model_info['citation_key'] or carry a matching model_type.
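
The exact-match resolution described in the note can be sketched as a short fallback chain, reading the lookup order as citation_key, then model_type, then method. Everything here is an assumption made for illustration: REGISTRY stands in for CausalResult._CITATIONS with a dummy entry, the lower-case/strip normalisation is a guess, and all three fields are read from one dict for simplicity.

```python
from typing import Dict, Optional

# Hypothetical registry with a dummy entry; not real citation data.
REGISTRY: Dict[str, str] = {"tobit": "<BibTeX entry for tobit>"}
PLACEHOLDER = "<no canonical citation registered>"


def normalise(name: Optional[str]) -> Optional[str]:
    """Assumed normalisation: trim whitespace and lower-case."""
    return name.strip().lower() if name else None


def resolve_citation(model_info: Dict[str, str]) -> str:
    """Try citation_key, then model_type, then method; exact matches only."""
    for field in ("citation_key", "model_type", "method"):
        key = normalise(model_info.get(field))
        if key in REGISTRY:
            return REGISTRY[key]
    # No fuzzy matching: an unknown estimator gets the placeholder.
    return PLACEHOLDER


tobit_hit = resolve_citation({"model_type": "Tobit"})
ols_miss = resolve_citation({"model_type": "OLS"})
key_wins = resolve_citation({"citation_key": "tobit", "model_type": "OLS"})
```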

for_agent

for_agent() -> Dict[str, Any]

Agent-ready payload — alias for to_dict(detail="agent").

Kept for backward compatibility with code written before the unified detail parameter. New code should prefer to_dict(detail="agent") for explicit semantics.
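
The alias relationship can be pictured with a stand-in class. ToyResult is hypothetical and its payload contents are invented; the sketch only shows for_agent delegating to to_dict(detail="agent") and each detail level adding to the one before it.

```python
from typing import Any, Dict

_DETAIL_LEVELS = ("minimal", "standard", "agent")


class ToyResult:
    """Hypothetical stand-in showing the alias pattern; not the statspai class."""

    def to_dict(self, *, detail: str = "standard") -> Dict[str, Any]:
        if detail not in _DETAIL_LEVELS:
            raise ValueError(f"detail must be one of {_DETAIL_LEVELS}, got {detail!r}")
        out: Dict[str, Any] = {"method": "OLS", "n_obs": 100}
        if detail in ("standard", "agent"):
            out["coefficients"] = []   # the coefficient table would go here
        if detail == "agent":
            out["violations"] = []
            out["next_steps"] = []
        return out

    def for_agent(self) -> Dict[str, Any]:
        # Backward-compatible alias: delegates instead of duplicating logic.
        return self.to_dict(detail="agent")


r = ToyResult()
```

Delegating keeps the two entry points from drifting apart, because the payload is built in one place only.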

brief

brief() -> str

One-line dashboard status string (≤ ~120 chars).

Surfaces the most-significant non-intercept coefficient so agents scanning a list of regressions can spot the active finding without paying a full to_dict round-trip.
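
One plausible reading of "most-significant" is the non-intercept term with the smallest p-value. The sketch below follows that reading over tidy-style rows; the selection rule, the output format, and the helpers most_significant and brief_line are assumptions, not the library's actual implementation.

```python
from typing import Dict, List, Optional

# Hypothetical tidy-style rows; the values are invented.
rows: List[Dict] = [
    {"term": "Intercept", "estimate": 1.2, "p_value": 0.0001},
    {"term": "x1", "estimate": 0.8, "p_value": 0.03},
    {"term": "x2", "estimate": -0.1, "p_value": 0.40},
]


def most_significant(rows: List[Dict]) -> Optional[Dict]:
    """Pick the non-intercept row with the smallest p-value (assumed rule)."""
    candidates = [r for r in rows if r["term"] != "Intercept"]
    return min(candidates, key=lambda r: r["p_value"]) if candidates else None


def brief_line(rows: List[Dict], max_len: int = 120) -> str:
    """Format a one-line status string, truncated to max_len characters."""
    top = most_significant(rows)
    if top is None:
        return "no non-intercept terms"
    line = f"{top['term']}: {top['estimate']:+.3f} (p={top['p_value']:.3g})"
    return line[:max_len]


status = brief_line(rows)
```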

to_json

to_json(indent: Optional[int] = None) -> str

Serialise the to_dict payload via json.dumps.

sensitivity

sensitivity(**kwargs)

Run the unified sensitivity dashboard on this result.

See statspai.robustness.unified_sensitivity.