`statspai.core`¶

core ¶

Core package initialization

BaseModel ¶

Bases: ABC

Abstract base class for all econometric models

fit `abstractmethod` ¶

fit(**kwargs: Any) -> EconometricResults

Fit the econometric model

Returns:

Type	Description
`EconometricResults`	Fitted model results

predict `abstractmethod` ¶

predict(data: Optional[DataFrame] = None) -> ndarray

Generate predictions from the fitted model

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	Data for prediction. If None, uses training data.	`None`

Returns:

Type	Description
`ndarray`	Predicted values

summary ¶

summary() -> str

Return a summary of the fitted model

Returns:

Type	Description
`str`	Model summary
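The interface above can be sketched with a toy subclass. This is a minimal illustration, not StatsPAI code: `SketchBaseModel` and `MeanModel` are hypothetical stand-ins defined here so the snippet runs without the library, plain lists replace `DataFrame` / `ndarray`, and a plain dict stands in for `EconometricResults`.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class SketchBaseModel(ABC):
    """Hypothetical stand-in mirroring the documented BaseModel methods."""

    @abstractmethod
    def fit(self, **kwargs: Any) -> Dict[str, Any]:
        ...

    @abstractmethod
    def predict(self, data: Optional[List[float]] = None) -> List[float]:
        ...


class MeanModel(SketchBaseModel):
    """Intercept-only model: every prediction is the sample mean of y."""

    def __init__(self, y: List[float]) -> None:
        self.y = list(y)
        self.mean_: Optional[float] = None

    def fit(self, **kwargs: Any) -> Dict[str, Any]:
        self.mean_ = sum(self.y) / len(self.y)
        # A plain dict stands in for EconometricResults in this sketch.
        return {"params": {"Intercept": self.mean_}, "n_obs": len(self.y)}

    def predict(self, data: Optional[List[float]] = None) -> List[float]:
        if self.mean_ is None:
            raise RuntimeError("call fit() before predict()")
        # With data=None, fall back to the training sample, as documented.
        n = len(self.y) if data is None else len(data)
        return [self.mean_] * n


model = MeanModel([1.0, 2.0, 3.0])
results = model.fit()
in_sample = model.predict()
out_of_sample = model.predict([10.0, 20.0])
```

Because `fit` and `predict` are abstract, instantiating the base class directly raises `TypeError`; only subclasses that implement both can be created.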

BaseEstimator ¶

Bases: ABC

Abstract base class for estimation algorithms

estimate `abstractmethod` ¶

estimate(y: ndarray, X: ndarray, **kwargs: Any) -> Dict[str, Any]

Estimate model parameters

Parameters:

Name	Type	Description	Default
`y`	`ndarray`	Dependent variable	required
`X`	`ndarray`	Independent variables	required
`**kwargs`	`Any`	Additional estimation options	`{}`

Returns:

Type	Description
`Dict[str, Any]`	Estimation results including parameters, standard errors, etc.
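As a rough sketch of what an `estimate` implementation could return, the function below runs closed-form OLS with one regressor in pure Python. The function name and the dictionary keys (`params`, `std_errors`, `df_resid`) are assumptions chosen for illustration; the documented contract only promises a `Dict[str, Any]`.

```python
import math
from typing import Any, Dict, List


def estimate_simple_ols(y: List[float], x: List[float], **kwargs: Any) -> Dict[str, Any]:
    """Closed-form OLS of y on an intercept and a single regressor."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance with n - 2 degrees of freedom (two parameters).
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e ** 2 for e in resid) / (n - 2)

    return {
        "params": {"Intercept": intercept, "x": slope},
        "std_errors": {
            "Intercept": math.sqrt(sigma2 * (1.0 / n + x_bar ** 2 / sxx)),
            "x": math.sqrt(sigma2 / sxx),
        },
        "df_resid": n - 2,
    }


out = estimate_simple_ols(y=[1.0, 3.0, 5.0, 7.5], x=[0.0, 1.0, 2.0, 3.0])
```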

EconometricResults ¶

Unified results class for econometric models

This class provides a consistent interface for accessing results from different econometric estimators, similar to R's broom package.

Examples:

>>> import statspai as sp
>>> import numpy as np
>>> import pandas as pd
>>> rng = np.random.default_rng(0)
>>> x = rng.normal(size=50)
>>> df = pd.DataFrame({"y": 2.0 * x + rng.normal(size=50), "x": x})
>>> res = sp.regress("y ~ x", data=df)
>>> type(res).__name__
'EconometricResults'
>>> round(float(res.params["x"]), 1)
1.9

summary ¶

summary(alpha: float = 0.05) -> str

Generate a summary table of results

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Significance level for confidence intervals	`0.05`

Returns:

Type	Description
`str`	Formatted summary table

conf_int ¶

conf_int(alpha: float = 0.05) -> DataFrame

Return confidence intervals for parameters

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Significance level	`0.05`

Returns:

Type	Description
`DataFrame`	Confidence intervals
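The role of `alpha` can be illustrated with a small worked example. This sketch assumes a normal approximation for the critical value; the library may well use Student-t critical values instead, so treat the numbers as illustrative. Note that `alpha` here and `conf_level` in `tidy` describe the same interval from opposite ends: `conf_level = 1 - alpha`.

```python
from statistics import NormalDist
from typing import Tuple


def normal_conf_int(estimate: float, std_error: float, alpha: float = 0.05) -> Tuple[float, float]:
    """Two-sided (1 - alpha) interval under a normal approximation."""
    # Critical value that leaves alpha / 2 in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * std_error, estimate + z * std_error


# alpha=0.05 gives a 95% interval; alpha=0.01 gives a wider 99% interval.
low95, high95 = normal_conf_int(2.0, 0.5, alpha=0.05)
low99, high99 = normal_conf_int(2.0, 0.5, alpha=0.01)
```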

tidy ¶

tidy(conf_level: float = 0.95) -> DataFrame

Return a long-format DataFrame of coefficients, broom-style.

Columns

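Name	Type	Description
`term`	`str`	Variable / coefficient name.
`estimate`	`float`	Point estimate.
`std_error`	`float`	Standard error of the estimate.
`statistic`	`float`	t-statistic.
`p_value`	`float`	p-value for `statistic`.
`conf_low`, `conf_high`	`float`	Bounds of the two-sided `conf_level` confidence interval.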

Examples:

>>> result = sp.regress("y ~ x1 + x2", data=df)
>>> result.tidy()
   term  estimate  std_error  statistic  p_value  conf_low  conf_high
0  Intercept     ...

glance ¶

glance() -> DataFrame

Return a 1-row DataFrame of model-level statistics, broom-style.

Columns (present subset depends on the model type)

Name	Type	Description
`nobs`	`int`	
`r_squared`	`float`	
`adj_r_squared`	`float`	
`f_statistic`	`float`	
`f_p_value`	`float`	
`aic`, `bic`	`float`	
`df_resid`, `df_model`	`int`	
`method`	`str`	Estimation method label.

predict ¶

predict(data: Optional[DataFrame] = None) -> ndarray

Generate predictions from the fitted model.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	New data for out-of-sample prediction. If None, returns in-sample fitted values.	`None`

Returns:

Type	Description
`ndarray`	Predicted values.

residuals ¶

residuals() -> Optional[ndarray]

Return model residuals if available

Returns:

Type	Description
`ndarray or None`	Residuals

fitted_values ¶

fitted_values() -> Optional[ndarray]

Return fitted values if available

Returns:

Type	Description
`ndarray or None`	Fitted values

next_steps ¶

next_steps(print_result: bool = True) -> List[Dict[str, str]]

Agent-native workflow guidance: what to do after fitting this model.

Returns a list of recommended next steps — diagnostics, robustness checks, sensitivity analysis, and export options — tailored to the model type (OLS, IV, panel, etc.).

Parameters:

Name	Type	Description	Default
`print_result`	`bool`	Print formatted recommendations to stdout.	`True`

Returns:

Type	Description
`list of dict`	Each dict has keys: `action`, `reason`, `priority`, `category`.

Examples:

>>> result = sp.regress("y ~ x1 + x2", data=df)
>>> result.next_steps()

violations ¶

violations() -> List[Dict[str, Any]]

Agent-native structured list of assumption / diagnostic issues.

Inspects stored diagnostics (first-stage F, standard error finiteness, …) and returns any flagged concerns as dicts with keys `kind`, `severity`, `test`, `value`, `threshold`, `message`, `recovery_hint`, `alternatives`.

Returns:

Type	Description
`list of dict`	Empty list if nothing flagged.

Examples:

>>> result = sp.iv("y ~ (x ~ z) + c", data=df)
>>> for v in result.violations():
...     if v['severity'] == 'error':
...         print(v['recovery_hint'])
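Downstream code usually only needs to filter or group these dicts. The sketch below works on a hand-written sample in the documented shape; every value in it (including the `"warning"` severity and the `kind` / `test` labels) is made up for illustration and is not output from StatsPAI. The helper names are hypothetical.

```python
from typing import Any, Dict, List

# Hand-written sample in the documented shape. All values are invented
# for this sketch; they are not produced by the library.
sample_violations: List[Dict[str, Any]] = [
    {
        "kind": "weak_instrument",
        "severity": "error",
        "test": "first_stage_F",
        "value": 4.2,
        "threshold": 10.0,
        "message": "First-stage F is below the threshold.",
        "recovery_hint": "Consider weak-instrument-robust inference.",
        "alternatives": ["hypothetical_alternative"],
    },
    {
        "kind": "se_not_finite",
        "severity": "warning",
        "test": "se_finiteness",
        "value": None,
        "threshold": None,
        "message": "A standard error is not finite.",
        "recovery_hint": "Check for collinear regressors.",
        "alternatives": [],
    },
]


def hints_for(violations: List[Dict[str, Any]], severity: str = "error") -> List[str]:
    """Collect recovery hints for violations at the given severity."""
    return [v["recovery_hint"] for v in violations if v.get("severity") == severity]


def kinds_by_severity(violations: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group violation kinds under their severity label."""
    grouped: Dict[str, List[str]] = {}
    for v in violations:
        grouped.setdefault(v["severity"], []).append(v["kind"])
    return grouped


error_hints = hints_for(sample_violations)
grouped = kinds_by_severity(sample_violations)
```

An empty list from `violations()` passes through both helpers unchanged, yielding `[]` and `{}` respectively.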

to_agent_summary ¶

to_agent_summary() -> Dict[str, Any]

JSON-ready nested summary for agent consumption.

Unlike summary() (prose for humans) and tidy() (long-form DataFrame), this returns a plain dict with coefficients, scalar diagnostics, violations, and recommended next steps — suitable for feeding into an LLM tool loop or logging.

Returns:

Type	Description
`dict`	Keys: `kind`, `model_type`, `robust`, `n_obs`, `df_resid`, `dependent_var`, `coefficients`, `diagnostics`, `violations`, `next_steps`.

to_docx ¶

to_docx(filename: str, title: Optional[str] = None) -> None

Export results to a Word (.docx) document.

Parameters:

Name	Type	Description	Default
`filename`	`str`	Output path (.docx).	required
`title`	`str`	Table title. Defaults to model type.	`None`

to_latex ¶

to_latex(path: Optional[str] = None, *, caption: Optional[str] = None, label: Optional[str] = None, siunitx: bool = False, threeparttable: bool = False, siunitx_preamble: bool = False, **kwargs: Any) -> str

Render the regression result as a publication-quality LaTeX table.

A thin wrapper over `statspai.output.regtable` (single column). Produces a booktabs-style `\begin{table}` float with significance stars and standard errors in parentheses.

Parameters:

Name	Type	Description	Default
`path`	`str`	If given, the LaTeX source is also written to this file (UTF-8). The string is always returned.	`None`
`caption`	`str`	`\caption{...}` text (maps to `regtable(title=...)`).	`None`
`label`	`str`	`\label{...}` cross-reference id, injected after the caption.	`None`
`siunitx`	`bool`	Decimal-align numeric columns with `siunitx` `S` columns (journal style; requires `\usepackage{siunitx}` v3).	`False`
`threeparttable`	`bool`	Wrap the table in `threeparttable` with a `tablenotes` block (requires `\usepackage{threeparttable}`).	`False`
`siunitx_preamble`	`bool`	Prepend a comment listing the required `\usepackage` lines.	`False`
`**kwargs`	`Any`	Forwarded to `statspai.output.regtable`, e.g. `coef_labels`, `keep`, `drop`, `order`, `stats`, `se_type`, `stars`, `star_levels`, `fmt`, `template`, `notes`.	`{}`

Returns:

Type	Description
`str`	LaTeX source.

Examples:

>>> import statspai as sp
>>> r = sp.regress("y ~ x + z", data=df)
>>> tex = r.to_latex(caption="Main results", label="tab:main",
...                  coef_labels={"x": "Treatment"}, template="aer")
>>> tex = r.to_latex(siunitx=True, threeparttable=True)  # journal style
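Since the returned string is a table float rather than a full document, it needs a preamble before it compiles. The helper below is a hypothetical convenience, not part of StatsPAI: it wraps any table source in a minimal `article` document and loads the packages that the options above are documented to require (`booktabs` is assumed from the booktabs-style rules).

```python
def wrap_for_compilation(
    table_tex: str, *, siunitx: bool = False, threeparttable: bool = False
) -> str:
    """Wrap a LaTeX table float in a minimal standalone document."""
    packages = ["booktabs"]  # assumed: needed for \toprule / \midrule / \bottomrule
    if siunitx:
        packages.append("siunitx")
    if threeparttable:
        packages.append("threeparttable")

    preamble = "\n".join(f"\\usepackage{{{name}}}" for name in packages)
    return (
        "\\documentclass{article}\n"
        f"{preamble}\n"
        "\\begin{document}\n"
        f"{table_tex}\n"
        "\\end{document}\n"
    )


# Placeholder table source; in practice this would be the string
# returned by to_latex().
placeholder_table = "\\begin{table}\n% table body goes here\n\\end{table}"
document = wrap_for_compilation(placeholder_table, siunitx=True, threeparttable=True)
```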

to_html ¶

to_html(path: Optional[str] = None, **kwargs: Any) -> str

Render the regression result as an HTML table.

A thin wrapper over `statspai.output.regtable`. Returns the HTML string; also writes it to `path` when given. See `to_latex` for the forwarded `**kwargs`.

to_markdown ¶

to_markdown(path: Optional[str] = None, *, quarto: bool = False, **kwargs: Any) -> str

Render the regression result as a Markdown table.

A thin wrapper over `statspai.output.regtable`. Set `quarto=True` for Quarto-flavoured output. Returns the Markdown string; also writes it to `path` when given. See `to_latex` for the forwarded `**kwargs`.

to_excel ¶

to_excel(path: str, **kwargs: Any) -> str

Write the regression result to a styled .xlsx workbook.

A thin wrapper over `statspai.output.regtable` with booktabs-style cell borders (requires openpyxl). Returns `path`. See `to_latex` for the forwarded `**kwargs`.

to_word ¶

to_word(path: str, *, caption: Optional[str] = None, **kwargs: Any) -> str

Write the regression result to a publication-quality .docx.

A thin wrapper over `statspai.output.regtable` with AER/QJE booktabs rules and Times New Roman typography (requires python-docx). Returns `path`. Unlike `to_docx` (which renders the broom-style coefficient grid), this routes through the same publication table builder used for multi-model exports, so the single-model output matches a one-column `regtable`. See `to_latex` for the forwarded `**kwargs`.

to_dict ¶

to_dict(*, detail: str = 'standard') -> Dict[str, Any]

Return a JSON-safe dict representation of the regression result.

Parameters:

Name	Type	Description	Default
`detail`	`(minimal, standard, agent)`	Payload depth, bounded by an approximate token budget. The three levels are listed below.	`"standard"`

Levels of `detail`:

- `"minimal"` (roughly under 600 characters / 150 tokens): identity only, i.e. `method`, `model_type`, `dependent_var`, `n_obs`, plus `fit_stats` (R², F, AIC, BIC) when available. No coefficient table.
- `"standard"` (size varies, roughly 50 characters × `n_terms`): full coefficient table + diagnostics + glance row. Matches the legacy `to_dict()` shape.
- `"agent"`: standard + `violations` + `warnings` + `next_steps` + `suggested_functions`. Equivalent to the legacy `for_agent` and to the form returned by `sp.agent.execute_tool` and the MCP server.

Returns:

Type	Description
`dict`	JSON-safe and bounded — round-trips through `json.dumps`.

Notes

Used by sp.agent.execute_tool to send results back to an LLM, and useful for caching / pickling-free persistence.
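The nesting of the three detail levels, and the JSON round-trip guarantee, can be sketched as follows. This is an illustration under assumptions: the key names `coefficients`, `diagnostics`, and `glance` for the standard level are guesses based on the descriptions above, the sample payload is invented, and `select_detail` is a hypothetical helper rather than library code.

```python
import json
from typing import Any, Dict, Tuple

# Each level is a superset of the one before it.
MINIMAL_KEYS: Tuple[str, ...] = ("method", "model_type", "dependent_var", "n_obs", "fit_stats")
STANDARD_KEYS = MINIMAL_KEYS + ("coefficients", "diagnostics", "glance")  # assumed key names
AGENT_KEYS = STANDARD_KEYS + ("violations", "warnings", "next_steps", "suggested_functions")
LEVELS = {"minimal": MINIMAL_KEYS, "standard": STANDARD_KEYS, "agent": AGENT_KEYS}


def select_detail(full: Dict[str, Any], detail: str = "standard") -> Dict[str, Any]:
    """Keep only the keys that belong to the requested detail level."""
    if detail not in LEVELS:
        raise ValueError(f"detail must be one of {sorted(LEVELS)}")
    return {key: full[key] for key in LEVELS[detail] if key in full}


# Invented payload using only JSON-native types (no arrays, no NaN).
full_payload: Dict[str, Any] = {
    "method": "OLS",
    "model_type": "ols",
    "dependent_var": "y",
    "n_obs": 50,
    "fit_stats": {"r_squared": 0.5},
    "coefficients": [{"term": "x", "estimate": 1.9}],
    "diagnostics": {},
    "violations": [],
    "next_steps": [],
}

minimal = select_detail(full_payload, "minimal")
agent = select_detail(full_payload, "agent")
round_tripped = json.loads(json.dumps(agent))
```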

cite ¶

cite(format: str = 'bibtex') -> Any

Return the canonical citation for this estimator, if registered.

Mirrors `CausalResult.cite` so an agent can call `.cite()` on any fitted result uniformly. As a result, `sp.bib_for(result)` (which duck-types on `.cite`) now works for regression results too.

Parameters:

Name	Type	Description	Default
`format`	`(bibtex, apa, json)`	Same semantics as `CausalResult.cite`. `"bibtex"` / `"apa"` return `str`; `"json"` returns a structured `dict`.	`"bibtex"`

Returns:

Type	Description
`str | dict`	The citation in the requested format: `str` for `"bibtex"` / `"apa"`, `dict` for `"json"`.

Notes

Zero-hallucination (CLAUDE.md §10): BibTeX comes from the single source of truth `CausalResult._CITATIONS` (mirroring `paper.bib`); APA / JSON forms are derived from that string, never generated.

Resolution is exact on `model_info['citation_key']` → `model_type` → `method` (normalised), so a textbook estimator with no canonical paper (OLS / logit / probit / poisson) honestly returns a placeholder rather than a fuzzy, and possibly wrong, match. Estimators that do have a canonical reference (e.g. tobit, heckman) should set `model_info['citation_key']` or carry a matching `model_type`.
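The exact-match resolution order described in the notes can be sketched as a simple lookup chain. Everything here is hypothetical: the registry contents are placeholder strings rather than real BibTeX, the normalisation rule (lowercase, spaces and hyphens to underscores) is an assumption, and the function is not the library's implementation.

```python
from typing import Any, Dict, Optional

# Stand-in registry; the values are placeholders, not real citations.
_REGISTRY: Dict[str, str] = {
    "heckman": "<BibTeX entry registered for heckman>",
    "tobit": "<BibTeX entry registered for tobit>",
}
PLACEHOLDER = "% no canonical citation registered for this estimator"


def _normalise(name: str) -> str:
    """Assumed normalisation: trim, lowercase, unify separators."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def resolve_citation(
    model_info: Dict[str, Any],
    model_type: Optional[str] = None,
    method: Optional[str] = None,
) -> str:
    """Try citation_key, then model_type, then method; exact matches only."""
    for candidate in (model_info.get("citation_key"), model_type, method):
        if candidate is None:
            continue
        key = _normalise(candidate)
        if key in _REGISTRY:  # exact lookup, no fuzzy matching
            return _REGISTRY[key]
    return PLACEHOLDER


by_key = resolve_citation({"citation_key": "heckman"}, model_type="ols")
by_type = resolve_citation({}, model_type="Tobit")
textbook = resolve_citation({}, model_type="OLS", method="ols")
near_miss = resolve_citation({}, model_type="heckit")
```

The last call shows the point of exact matching: a name that merely resembles a registered key falls through to the placeholder instead of borrowing another estimator's reference.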

to_appendix ¶

to_appendix(format: str = 'latex', *, include_assumptions: bool = True, include_diagnostics: bool = True, include_citation: bool = True, include_provenance: bool = True) -> str

Generate a Methods and Formulas appendix for this model.

Shares the curated, zero-hallucination methods table with `CausalResult` (CLAUDE.md §10). IV / 2SLS regressions resolve to the instrumental-variables entry. Other regression families that are not in the causal methods table degrade to an explicit placeholder plus the inference actually used and a provenance trace, never an invented formula.

Parameters:

Name	Type	Description	Default
`format`	`(latex, markdown, text)`	Output format.	`"latex"`
`include_assumptions`	`bool`	Include the identifying-assumptions list (when registered).	`True`
`include_diagnostics`	`bool`	Include the inference block read off `model_info`.	`True`
`include_citation`	`bool`	Append the APA-style reference from `cite`.	`True`
`include_provenance`	`bool`	Append a one-line provenance trace (StatsPAI version + estimator identity + methods-spec key).	`True`

Returns:

Type	Description
`str`	The assembled appendix.

Examples:

>>> res = sp.ivreg('y ~ x | z', data=df)
>>> print(res.to_appendix(format='markdown'))

for_agent ¶

for_agent() -> Dict[str, Any]

Agent-ready payload — alias for to_dict(detail="agent").

Kept for backward compatibility with code written before the unified detail parameter. New code should prefer to_dict(detail="agent") for explicit semantics.

brief ¶

brief() -> str

One-line dashboard status string (≤ ~120 chars).

Surfaces the most-significant non-intercept coefficient so agents scanning a list of regressions can spot the active finding without paying a full to_dict round-trip.
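One plausible reading of "most-significant non-intercept coefficient" is the row with the smallest p-value once the intercept is excluded. The sketch below applies that rule to rows in the `tidy` column layout; the selection rule, the intercept label `"Intercept"`, the output format, and the helper names are all assumptions, and the sample rows are invented.

```python
from typing import Any, Dict, List, Optional


def headline_term(rows: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the non-intercept row with the smallest p-value, if any."""
    candidates = [row for row in rows if row["term"] != "Intercept"]
    if not candidates:
        return None
    return min(candidates, key=lambda row: row["p_value"])


def brief_line(rows: List[Dict[str, Any]], max_chars: int = 120) -> str:
    """Format a one-line status string, truncated to max_chars."""
    top = headline_term(rows)
    if top is None:
        return "no non-intercept coefficients"
    line = f"{top['term']}: {top['estimate']:.3f} (p={top['p_value']:.3g})"
    return line[:max_chars]


# Invented rows in the tidy() column layout.
rows = [
    {"term": "Intercept", "estimate": 0.10, "p_value": 0.0001},
    {"term": "x1", "estimate": 1.25, "p_value": 0.03},
    {"term": "x2", "estimate": -0.40, "p_value": 0.4},
]
line = brief_line(rows)
```

The intercept has the smallest p-value in this sample but is skipped, so the line reports `x1`.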

to_json ¶

to_json(indent: Optional[int] = None) -> str

Serialise `to_dict` via `json.dumps`.

sensitivity ¶

sensitivity(**kwargs: Any) -> Any

Run the unified sensitivity dashboard on this result.

See `statspai.robustness.unified_sensitivity`.
