MCP workflow for empirical economists¶
This guide is for the common agent workflow: a researcher has a local .dta,
CSV, Parquet, or Arrow file and wants Claude Code, Codex, Cursor, or another
MCP client to run a defensible empirical design without hand-copying arrays
between tools.
StatsPAI's MCP server is deliberately local-first. It loads the file you point it at, fits StatsPAI estimators, returns strict JSON, and caches fitted result handles so follow-up diagnostics can reuse the same object.
Configure the server¶
After installing StatsPAI, the package exposes statspai-mcp and the module
entry point python -m statspai.agent.mcp_server.
Claude Desktop-style configuration:
{
"mcpServers": {
"statspai": {
"command": "python",
"args": ["-m", "statspai.agent.mcp_server"]
}
}
}
For clients that prefer a console script, use:
The server speaks stdio JSON-RPC, advertises the 2025-06-18 MCP protocol
revision, and returns structuredContent with a compact outputSchema. Older
clients still receive the text JSON payload.
Data handoff¶
Every data-bound tool accepts data_path.
Supported local and remote formats include:
| Format | Notes |
|---|---|
.dta |
Native Stata files through pandas.read_stata |
.csv, .tsv, .txt |
Delimited text |
.parquet, .pq, .feather, .arrow |
Column projection supported |
.xlsx, .xls |
Spreadsheet inputs |
.json, .jsonl |
JSON records |
file://, https://, http://, s3://, gs:// |
Remote URLs through pandas/fsspec |
For large files, pass:
data_columns narrows the read when the backend supports it. data_sample_n
uses deterministic random sampling for quick exploration. Raise or disable the
loader cap with STATSPAI_MCP_MAX_DATA_BYTES only when the host has enough
memory.
When the MCP server loads a local file, tool results include
data_provenance: source path, format, requested columns/sample, file size,
mtime, and SHA-256. statspai://result/<id> exposes the same provenance through
the cached result metadata. Remote URLs are recorded after dropping query tokens;
StatsPAI does not hash remote bytes unless the data are first saved locally.
The core loop¶
Use result handles. They keep the agent from copying fitted objects through chat text.
detect_design -> preflight/recommend -> fit(as_handle=true)
-> audit_result -> sensitivity_from_result / plot_from_result
-> bibtex
The fitted estimator returns result_id. Pass it to:
| Tool | Purpose |
|---|---|
audit_result |
Reviewer checklist of missing diagnostics |
brief_result |
One-line estimate summary |
interpret_result |
Grounded explanation, optionally using MCP sampling |
plot_from_result |
Inline PNG diagnostic plot |
sensitivity_from_result |
E-value / Oster / Cinelli-Hazlett style checks |
honest_did_from_result |
Rambachan-Roth sensitivity from DID/event-study results |
bibtex |
Verified BibTeX from StatsPAI's citation registry |
One-call empirical pipelines¶
StatsPAI includes high-level MCP pipeline tools for the most common designs.
DID¶
{
"name": "pipeline_did",
"arguments": {
"data_path": "/abs/cfps_panel.dta",
"y": "lwage",
"treat": "treated",
"time": "year",
"id": "pid",
"cohort": "first_treat",
"covariates": ["age", "age2", "edu", "industry"],
"as_handle": true
}
}
When id and cohort are supplied, the pipeline dispatches the
Callaway-Sant'Anna path. Otherwise it falls back to the 2x2 DID path. It then
adds the audit, honest-DID sensitivity where possible, Bacon diagnostics where
available, a narrative, and follow-up calls.
IV¶
{
"name": "pipeline_iv",
"arguments": {
"data_path": "/abs/card.dta",
"formula": "lwage ~ exper + expersq + black + south + smsa + (educ ~ nearc4)",
"as_handle": true
}
}
The IV pipeline reports the fitted estimate and weak-IV diagnostics, and prioritizes Anderson-Rubin-style inference when first-stage evidence is weak.
RD¶
{
"name": "pipeline_rd",
"arguments": {
"data_path": "/abs/lee_senate.dta",
"y": "voteshare_next",
"x": "margin",
"c": 0,
"as_handle": true
}
}
The RD pipeline fits rdrobust, attempts the canonical RD plot, checks density
manipulation, and returns bandwidth-sensitivity follow-ups.
Stata and R command migration¶
StatsPAI ships translator tools for one command at a time:
| Source | MCP tool | Examples |
|---|---|---|
| Stata | from_stata |
regress, xtreg, reghdfe, ivreg2, ivreghdfe, csdid, did_imputation, synth, rdrobust, psmatch2, count-panel commands |
| R | from_r |
feols, felm, lm, glm, plm, matchit, att_gt, did, synth |
Use the built-in prompts:
| Prompt | Use |
|---|---|
stata_command_workflow |
Translate a single Stata command, fit the translated StatsPAI tool, then audit |
r_command_workflow |
Translate a single R expression, fit, then audit |
cross_language_command_check |
Translate Stata and R snippets, compare estimand/covariance conventions, then fit comparable StatsPAI calls |
These prompts are conservative. If a translator returns ok=false, the agent
should report suggestions instead of guessing. If Stata and R snippets imply
different controls, fixed effects, cohorts, or covariance conventions, treat
that as a mismatch before fitting.
For psmatch2, from_stata maps the common nearest-neighbor, kernel, radius,
common, and ai() paths onto sp.psmatch2. Convention-changing options such
as Stata's probit propensity score or ATE-focused requests are surfaced as
notes rather than silently claimed as exact parity; use sp.match directly for
ATE-oriented matching.
For ivreghdfe, from_stata maps the IV-with-fixed-effects command to the
same StatsPAI/fixest shape produced by R feols(... | fe | endog ~ instr):
the formula contains the IV block and fe=[...] carries the absorbed fixed
effects. This is a migration contract for dispatching StatsPAI; it is not a
live Stata run.
Cross-software verification discipline¶
StatsPAI already stores committed R and Stata parity artifacts under
tests/r_parity/ and tests/stata_parity/. The Track A 3-way report compares
Python, R, and Stata outputs with pre-registered tolerances. Use it to decide
whether a StatsPAI result is:
| Status | Meaning |
|---|---|
| machine-level agreement | Point estimates match at tight tolerance |
| iterative/cross-fit agreement | Random folds or optimizers need a wider registered tolerance |
| convention gap | Backends target different estimands or covariance conventions |
| unavailable | No committed external artifact yet |
In MCP clients, read statspai://parity/track-a-summary for a compact JSON
summary of the committed Track A report. It includes strictness-tier counts,
module ids, Stata command labels, convention notes, and a tool_evidence index
keyed by common StatsPAI tool names, without loading the full markdown artifact
into the model context.
Do not call a live Stata/R comparison unless a separate Stata or R MCP server is configured and actually invoked. The StatsPAI translators check whether command semantics align; they are not themselves Stata or R runtimes.
External data MCP servers¶
For World Bank, OECD, FRED, IMF, or OpenEcon-style data MCP servers, keep the responsibilities separate:
- Use the data MCP server to search and retrieve indicators.
- Save the returned table to CSV, Parquet, Arrow, or
.dta. - Pass the saved file path to StatsPAI through
data_path. - Preserve the source metadata in your notebook, table notes, or paper appendix: provider, indicator id, query, retrieval date, and transformation code.
StatsPAI should analyze the bytes it receives. It should not invent missing indicator values, source names, or retrieval provenance.
Suggested first prompt¶
Use the statspai MCP server. Load /abs/cfps_panel.dta.
Run pipeline_did with y=lwage, treat=treated, time=year, id=pid,
cohort=first_treat, controls age age2 edu industry, as_handle=true.
Then audit the result, run the first feasible high-importance follow-up,
render the canonical plot, and return a short methods paragraph plus a
regression-table export suggestion.