Estimators

OLSObservational

class formative.OLSObservational(dag, treatment, outcome)

Observational Ordinary Least Squares (OLS) estimator with DAG-based confounder identification.

Given a DAG encoding your causal assumptions, this estimator:

  1. Identifies which variables must be controlled for (the adjustment set) using the backdoor criterion — any variable that is an ancestor of both treatment and outcome, but not a descendant of treatment.

  2. Raises an error if unobserved confounders make OLS invalid.

  3. Estimates the causal effect via OLS, controlling for the adjustment set.

  4. Also runs unadjusted OLS so you can see the confounding bias directly.
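The adjustment-set rule in step 1 can be sketched directly. This is an illustrative sketch only, assuming the DAG is represented as a plain adjacency mapping from each node to the set of nodes it directly causes — the library's DAG class may differ:

```python
# Edges matching the worked example below: ability confounds education -> income.
edges = {
    "ability": {"education", "income"},
    "education": {"income"},
}

def descendants(node):
    """All nodes reachable via directed edges from `node` (excluding itself)."""
    out, stack = set(), list(edges.get(node, ()))
    while stack:
        child = stack.pop()
        if child not in out:
            out.add(child)
            stack.extend(edges.get(child, ()))
    return out

def adjustment_set(treatment, outcome):
    """Ancestors of both treatment and outcome, minus descendants of treatment."""
    ancestors_of = lambda target: {v for v in edges if target in descendants(v)}
    common = ancestors_of(treatment) & ancestors_of(outcome)
    return common - descendants(treatment) - {treatment, outcome}

print(adjustment_set("education", "income"))  # {'ability'}
```

Here `ability` is an ancestor of both `education` and `income` but not a descendant of `education`, so it is the sole member of the adjustment set.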

Example:

dag = DAG()
dag.assume("ability").causes("education", "income")
dag.assume("education").causes("income")

# If 'ability' is in df, it is controlled for automatically.
# If 'ability' is not in df, an IdentificationError is raised.
result = OLSObservational(dag, treatment="education", outcome="income").fit(df)
print(result.summary())
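What steps 3 and 4 buy you can be seen with a self-contained numpy sketch on synthetic data (the data-generating process below is invented for illustration; the estimator itself uses statsmodels):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Synthetic data matching the DAG: ability -> education, ability -> income,
# education -> income, with a true causal effect of 2.0.
ability = rng.normal(size=n)
education = 1.5 * ability + rng.normal(size=n)
income = 2.0 * education + 3.0 * ability + rng.normal(size=n)

def ols_coef(y, *covariates):
    """OLS coefficient on the first covariate, with an intercept."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

adjusted = ols_coef(income, education, ability)  # controls for the confounder
unadjusted = ols_coef(income, education)         # omits it

print(f"adjusted:   {adjusted:.2f}")    # close to the true effect 2.0
print(f"unadjusted: {unadjusted:.2f}")  # biased upward via the ability backdoor path
```

The gap between the two coefficients is exactly the confounding bias that step 4 surfaces.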
Parameters:
  • dag (DAG)

  • treatment (str)

  • outcome (str)

fit(data)

Identify the adjustment set, then estimate the causal effect via OLS.

Runs both an adjusted model (controlling for confounders) and an unadjusted model (treatment only) so you can see the confounding bias.

Parameters:

data (pd.DataFrame) – Must contain columns for treatment, outcome, and any confounders that appear in the DAG. Column names must match the node names in the DAG. DAG nodes absent from the dataframe are treated as unobserved — if any of these are confounders, an IdentificationError is raised before estimation.

Raises:
  • IdentificationError – If confounders declared in the DAG are absent from the dataframe. Note: confounders not modelled in the DAG at all cannot be detected.

  • ValueError – If treatment or outcome columns are missing from the dataframe.

Return type:

OLSResult

OLSResult

class formative.OLSResult(adjusted_result, unadjusted_result, treatment, outcome, adjustment_set, dag)

The result of an OLS causal estimation.

Holds both the adjusted estimate (controlling for confounders) and the unadjusted estimate (outcome ~ treatment only), so you can see the effect of controlling for confounders directly.

Parameters:
  • treatment (str)

  • outcome (str)

  • adjustment_set (set[str])

property effect: float

Adjusted point estimate of the causal effect of treatment on outcome.

property unadjusted_effect: float

Unadjusted point estimate from the naive regression, without controlling for confounders.

property std_err: float

Standard error of the adjusted treatment effect estimate.

property conf_int: tuple[float, float]

95% confidence interval for the adjusted treatment effect.

property pvalue: float

p-value for the adjusted treatment effect (H0: effect = 0).

property adjustment_set: set[str]

Variables controlled for to satisfy the backdoor criterion.

property statsmodels_result

The underlying adjusted statsmodels result, for full diagnostics.

property statsmodels_unadjusted_result

The underlying unadjusted statsmodels result, for full diagnostics.

property assumptions: list[Assumption]

Modelling assumptions required for a causal interpretation.

executive_summary()

Narrative explanation of the method, DAG, assumptions, and result.

Return type:

str

summary()

Concise tabular summary of the ATE estimate, confidence interval, and assumptions.

Return type:

str

refute(data)

Run refutation checks against this OLS estimation.

Currently runs:

  • Random common cause: adds a random noise column as an extra control and checks that the estimate does not shift by more than one standard error.

Parameters:

data (pd.DataFrame) – The same dataframe passed to fit().
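The logic of the random-common-cause check can be sketched with plain numpy (synthetic data and a hand-rolled OLS, purely for illustration — the library's check operates on the fitted statsmodels result):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000

# Hypothetical fitted model: outcome ~ treatment + confounder.
confounder = rng.normal(size=n)
treatment = confounder + rng.normal(size=n)
outcome = 2.0 * treatment + confounder + rng.normal(size=n)

def ols(y, X):
    """Return (coefficients, standard errors) for OLS; X includes the intercept."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

X = np.column_stack([np.ones(n), treatment, confounder])
beta, se = ols(outcome, X)

# Refutation: append a pure-noise "common cause" and re-estimate.
noise = rng.normal(size=n)
beta_ref, _ = ols(outcome, np.column_stack([X, noise]))

shift = abs(beta_ref[1] - beta[1])
passed = shift < se[1]  # the estimate should move by less than one std err
print(passed)
```

An estimate that shifts materially when an irrelevant covariate is added suggests the original specification is fragile.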

IV2SLS

class formative.IV2SLS(dag, treatment, outcome, instrument)

Instrumental Variables estimator using Two-Stage Least Squares (2SLS).

Uses the DAG to validate the instrument structurally and identify observed confounders to include as controls in both stages:

  1. Relevance (structural): the instrument must have a directed path to the treatment in the DAG.

  2. Exclusion restriction (structural): no directed path from the instrument to the outcome that bypasses the treatment.

  3. Observed confounders (backdoor criterion): any variable that is a common cause of treatment and outcome and is present in the data is included as a control. Unobserved confounders are handled by the instrument and do not raise an IdentificationError — this is the primary use case for IV estimation.
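The two structural checks reduce to directed-path queries on the DAG. A minimal sketch, assuming the DAG is a plain adjacency mapping (the library's internal representation may differ):

```python
# Edges matching the example below; 'ability' is the unobserved confounder.
edges = {
    "proximity": {"education"},
    "ability": {"education", "income"},
    "education": {"income"},
}

def has_directed_path(src, dst, blocked=frozenset()):
    """DFS for a directed path src -> dst that avoids nodes in `blocked`."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen or node in blocked:
            continue
        seen.add(node)
        stack.extend(edges.get(node, ()))
    return False

# Relevance: the instrument must reach the treatment.
relevant = has_directed_path("proximity", "education")

# Exclusion: no path instrument -> outcome that bypasses the treatment.
exclusion_ok = not has_directed_path("proximity", "income", blocked={"education"})

print(relevant, exclusion_ok)  # True True
```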

Example:

dag = DAG()
dag.assume("proximity").causes("education")
dag.assume("ability").causes("education", "income")
dag.assume("education").causes("income")

# 'ability' is absent from df (unobserved) — the instrument "controls" for it.
result = IV2SLS(
    dag, treatment="education", outcome="income", instrument="proximity"
).fit(df)
print(result.summary())
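The mechanics of 2SLS can be made explicit with two hand-run OLS stages on synthetic data (an illustrative sketch — the actual estimator delegates to statsmodels):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Synthetic data matching the DAG; 'ability' is unobserved.
proximity = rng.normal(size=n)                        # instrument
ability = rng.normal(size=n)                          # unobserved confounder
education = proximity + ability + rng.normal(size=n)
income = 2.0 * education + 3.0 * ability + rng.normal(size=n)

def fit(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(n)

# Stage 1: project the treatment onto the instrument (plus any observed controls).
stage1 = fit(education, np.column_stack([ones, proximity]))
education_hat = stage1[0] + stage1[1] * proximity

# Stage 2: regress the outcome on the fitted treatment.
stage2 = fit(income, np.column_stack([ones, education_hat]))

naive = fit(income, np.column_stack([ones, education]))

print(f"2SLS:  {stage2[1]:.2f}")  # close to the true effect 2.0
print(f"naive: {naive[1]:.2f}")   # biased by unobserved ability
```

Note that naive second-stage standard errors are not valid 2SLS standard errors (the fitted treatment is itself estimated); the point estimate, however, matches.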
Parameters:
  • dag (DAG)

  • treatment (str)

  • outcome (str)

  • instrument (str)

fit(data)

Validate data and estimate the causal effect via 2SLS.

Observed confounders from the DAG are included as controls in both stages. Unobserved confounders are handled by the instrument.

Parameters:

data (pd.DataFrame) – Must contain columns for treatment, outcome, and instrument. Observed confounders present in the DAG are added as controls automatically if they appear in the dataframe.

Raises:

ValueError – If treatment, outcome, or instrument columns are missing from the dataframe.

Return type:

IVResult

IVResult

class formative.IVResult(result, unadjusted_result, treatment, outcome, instrument, adjustment_set, dag)

The result of an IV (2SLS) causal estimation.

Holds both the instrumented (2SLS) estimate and the unadjusted OLS estimate (outcome ~ treatment only), so you can see the confounding bias that IV corrects.

Parameters:
  • treatment (str)

  • outcome (str)

  • instrument (str)

  • adjustment_set (set[str])

property effect: float

2SLS point estimate of the causal effect of treatment on outcome.

property unadjusted_effect: float

Unadjusted OLS point estimate from the naive regression, without instrument or controls.

property std_err: float

Standard error of the 2SLS treatment effect estimate.

property conf_int: tuple[float, float]

95% confidence interval for the 2SLS treatment effect.

property pvalue: float

p-value for the 2SLS treatment effect (H0: effect = 0).

property adjustment_set: set[str]

Observed confounders included as controls in both OLS stages.

property statsmodels_result

The underlying statsmodels IV2SLS result, for full diagnostics.

property statsmodels_unadjusted_result

The underlying unadjusted OLS result, for full diagnostics.

property assumptions: list[Assumption]

Modelling assumptions required for a causal interpretation.

executive_summary()

Narrative explanation of the method, DAG, assumptions, and result.

Return type:

str

refute(data)

Run refutation checks against this IV estimation.

Re-uses the original data to run statistical tests that probe the assumptions underlying the IV estimate. Returns an IVRefutationReport with one RefutationCheck per test.

Currently runs:

  • First-stage F-statistic: tests instrument relevance. F < 10 indicates a weak instrument (Stock & Yogo, 2005).

Parameters:

data (pd.DataFrame) – The same dataframe passed to fit().
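With a single instrument, the first-stage F-statistic is simply the squared t-statistic on the instrument in the first-stage regression. A numpy sketch on invented data (the library computes this from the fitted model):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

instrument = rng.normal(size=n)
treatment = 0.3 * instrument + rng.normal(size=n)  # modestly relevant instrument

# First stage: treatment ~ instrument.
X = np.column_stack([np.ones(n), instrument])
beta, *_ = np.linalg.lstsq(X, treatment, rcond=None)
resid = treatment - X @ beta
sigma2 = resid @ resid / (n - 2)
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

# Single instrument => first-stage F equals the squared t-statistic.
f_stat = (beta[1] / se) ** 2
print(f"first-stage F = {f_stat:.1f}")  # F >= 10 suggests a usable instrument
```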

summary()

Concise tabular summary of the LATE estimate, confidence interval, and assumptions. (2SLS with a heterogeneous effect identifies a local average treatment effect.)

Return type:

str

PropensityScoreMatching

class formative.PropensityScoreMatching(dag, treatment, outcome)

Observational estimator using propensity score matching (1-to-1 nearest neighbour, with replacement).

Uses the DAG to identify observed confounders (backdoor criterion), then:

  1. Estimates propensity scores via logistic regression of treatment on the adjustment set.

  2. Matches each treated unit to its nearest control by propensity score.

  3. Estimates the ATT as the mean outcome difference across matched pairs.

  4. Computes standard errors via bootstrap over the full procedure.

Requires binary treatment (0/1). Raises IdentificationError if any DAG-declared confounders are absent from the data.

Example:

dag = DAG()
dag.assume("ability").causes("education", "income")
dag.assume("education").causes("income")

result = PropensityScoreMatching(
    dag, treatment="education", outcome="income"
).fit(df)
print(result.summary())
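Steps 1 to 3 can be sketched end to end in numpy on synthetic data (illustrative only: a few Newton/IRLS steps stand in for the library's logistic regression, and the bootstrap of step 4 is omitted):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000

# Synthetic data: 'ability' confounds a binary education treatment.
ability = rng.normal(size=n)
education = (ability + rng.normal(size=n) > 0).astype(float)
income = 2.0 * education + 3.0 * ability + rng.normal(size=n)

# 1. Propensity scores via logistic regression (Newton/IRLS iterations).
X = np.column_stack([np.ones(n), ability])
w = np.zeros(2)
for _ in range(20):
    p = 1 / (1 + np.exp(-X @ w))
    grad = X.T @ (education - p)
    hess = X.T @ (X * (p * (1 - p))[:, None])
    w += np.linalg.solve(hess, grad)
p = 1 / (1 + np.exp(-X @ w))

# 2. Match each treated unit to its nearest control by propensity score
#    (1-to-1, with replacement).
treated = np.flatnonzero(education == 1)
control = np.flatnonzero(education == 0)
matches = control[np.abs(p[treated][:, None] - p[control][None, :]).argmin(axis=1)]

# 3. ATT = mean outcome difference across matched pairs.
att = (income[treated] - income[matches]).mean()
naive = income[treated].mean() - income[control].mean()

print(f"ATT:   {att:.2f}")    # close to the true effect 2.0
print(f"naive: {naive:.2f}")  # inflated by the ability backdoor path
```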
Parameters:
  • dag (DAG)

  • treatment (str)

  • outcome (str)

fit(data)

Identify confounders, match on propensity scores, and estimate the ATT.

Parameters:

data (pd.DataFrame) – Must contain a binary (0/1) treatment column, an outcome column, and any confounders declared in the DAG.

Raises:
  • IdentificationError – If DAG-declared confounders are absent from the dataframe.

  • ValueError – If treatment or outcome columns are missing, or treatment is not binary with both classes present.

Return type:

MatchingResult

MatchingResult

class formative.MatchingResult(att, unadjusted_effect, bootstrap_atts, treatment, outcome, adjustment_set, dag)

The result of a propensity score matching estimation.

Holds the ATT point estimate alongside its unadjusted counterpart (naive mean difference), so you can see the confounding bias that matching corrects. Standard errors and CIs are computed via bootstrap over the full matching procedure.

Parameters:
  • att (float)

  • unadjusted_effect (float)

  • bootstrap_atts (ndarray)

  • treatment (str)

  • outcome (str)

  • adjustment_set (set[str])

property effect: float

ATT point estimate: the average treatment effect on the treated.

property unadjusted_effect: float

Naive mean difference, mean(Y | T=1) minus mean(Y | T=0), with no matching.

property std_err: float

Bootstrap standard error of the ATT.

property conf_int: tuple[float, float]

Bootstrap percentile 95% confidence interval.

property pvalue: float

Two-sided p-value for the ATT (H0: ATT = 0), via z-test.

property adjustment_set: set[str]

Observed confounders used in the propensity score model.

property bootstrap_atts: ndarray

Full array of per-bootstrap ATT values, for diagnostics.

property assumptions: list[Assumption]

Modelling assumptions required for a causal interpretation.

executive_summary()

Narrative explanation of the method, DAG, assumptions, and result.

Return type:

str

summary()

Concise tabular summary of the ATT estimate, confidence interval, and assumptions.

Return type:

str

refute(data)

Run refutation checks against this matching estimation.

Currently runs:

  • Placebo treatment: randomly permutes treatment labels and re-runs matching. The placebo ATT should be near zero.

  • Random common cause: adds a random noise covariate to the propensity score model and checks that the ATT is stable.

Parameters:

data (pd.DataFrame) – The same dataframe passed to fit().

RCT

class formative.RCT(dag, treatment, outcome)

Randomized Controlled Trial estimator.

Estimates the Average Treatment Effect (ATE) via OLS regression of the outcome on the treatment indicator. Because treatment is randomly assigned, no confounder adjustment is needed.

DAG validation enforces the RCT assumption: treatment must have no declared causes (parents) in the DAG. Declaring a cause of treatment would contradict random assignment and raises a ValueError.

Example:

dag = DAG()
dag.assume("treatment").causes("outcome")

result = RCT(dag, treatment="treatment", outcome="outcome").fit(df)
print(result.summary())
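The equivalence between the OLS regression and the difference in means (noted under RCTResult below) is exact for a binary treatment, which a short numpy check on invented data illustrates:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000

treatment = rng.integers(0, 2, size=n).astype(float)  # random assignment
outcome = 1.5 * treatment + rng.normal(size=n)

# OLS slope of outcome on a binary treatment indicator (with intercept)...
X = np.column_stack([np.ones(n), treatment])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)

# ...equals the difference in group means exactly.
diff_in_means = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

print(np.isclose(beta[1], diff_in_means))  # True
```

Running it through OLS rather than computing means directly gives standard errors, confidence intervals, and p-values for free.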
Parameters:
  • dag (DAG)

  • treatment (str)

  • outcome (str)

fit(data)

Estimate the ATE via OLS regression of outcome on treatment.

Parameters:

data (pd.DataFrame) – Must contain columns for treatment and outcome. Treatment may be binary (0/1) or continuous.

Raises:

ValueError – If treatment or outcome columns are missing from the dataframe.

Return type:

RCTResult

RCTResult

class formative.RCTResult(result, treatment, outcome, dag)

The result of an RCT causal estimation.

Estimates the Average Treatment Effect (ATE) via OLS. Because treatment is randomly assigned, no confounder adjustment is needed and the ATE equals the difference in mean outcomes between treatment and control.

Parameters:
  • treatment (str)

  • outcome (str)

property effect: float

ATE point estimate: the average treatment effect (difference in means).

property std_err: float

Standard error of the ATE estimate.

property conf_int: tuple[float, float]

95% confidence interval for the ATE.

property pvalue: float

p-value for the ATE (H0: ATE = 0).

property statsmodels_result

The underlying statsmodels OLS result, for full diagnostics.

property assumptions: list[Assumption]

Modelling assumptions required for a causal interpretation.

executive_summary()

Narrative explanation of the method, DAG, assumptions, and result.

Return type:

str

summary()

Concise tabular summary of the ATE estimate, confidence interval, and assumptions.

Return type:

str

refute(data)

Run refutation checks against this RCT estimation.

Currently runs:

  • Random common cause: adds a random noise column as an extra control and checks that the ATE does not shift by more than one standard error. Under randomisation the ATE should be robust to any additional covariate.

Parameters:

data (pd.DataFrame) – The same dataframe passed to fit().

DiD

class formative.DiD(dag, group, time, outcome)

Difference-in-Differences estimator.

Estimates the Average Treatment Effect on the Treated (ATT) by comparing how outcomes changed over time for the treated group versus the control group. The key insight is that any time trend common to both groups cancels out, isolating the treatment effect.

Implemented as OLS with group and time main effects plus their interaction:

outcome ~ group + time + group:time

The coefficient on group:time is the DiD estimate.

The DAG is used to validate that group, time, and outcome are declared nodes. It does not apply the backdoor criterion — identification in DiD comes from the panel design, not from controlling for observed confounders.

Requires binary group (0 = control, 1 = treated) and time (0 = pre-period, 1 = post-period) columns.

Example:

dag = DAG()
dag.assume("group").causes("outcome")
dag.assume("time").causes("outcome")

result = DiD(dag, group="group", time="time", outcome="outcome").fit(df)
print(result.summary())
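The interaction coefficient and the classic "difference of differences" of the four cell means are algebraically identical, which a numpy sketch on synthetic data makes concrete (the data-generating process is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4_000

group = rng.integers(0, 2, size=n).astype(float)  # 1 = treated group
time = rng.integers(0, 2, size=n).astype(float)   # 1 = post period

# Group-level gap (+2), common time trend (+1), true ATT = 0.5.
outcome = 2.0 * group + 1.0 * time + 0.5 * group * time + rng.normal(size=n)

def cell(g, t):
    return outcome[(group == g) & (time == t)].mean()

# DiD = (treated post - treated pre) - (control post - control pre)
did = (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))

# Equivalent: coefficient on group:time in outcome ~ group + time + group:time.
X = np.column_stack([np.ones(n), group, time, group * time])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)

print(f"DiD (cell means):  {did:.2f}")
print(f"DiD (interaction): {beta[3]:.2f}")  # identical up to floating point
```

Note how the common time trend (+1) and the fixed group gap (+2) both cancel, leaving only the ATT.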
Parameters:
  • dag (DAG)

  • group (str)

  • time (str)

  • outcome (str)

fit(data)

Estimate the ATT via OLS with group and time main effects plus their interaction.

Parameters:

data (pd.DataFrame) – Must contain binary (0/1) group and time columns, and a numeric outcome column.

Raises:

ValueError – If required columns are missing or group/time are not binary.

Return type:

DiDResult

DiDResult

class formative.DiDResult(result, group, time, outcome, naive_diff, dag)

The result of a Difference-in-Differences estimation.

The DiD estimate is the ATT: how much better (or worse) the treated group did relative to what would have been expected based on the control group’s trajectory.

Parameters:
  • group (str)

  • time (str)

  • outcome (str)

  • naive_diff (float)

property effect: float

DiD point estimate: the ATT under the parallel trends assumption.

property naive_diff: float

Naive post-period difference: treated mean minus control mean in the post period.

property std_err: float

Standard error of the DiD estimate.

property conf_int: tuple[float, float]

95% confidence interval for the DiD estimate.

property pvalue: float

p-value for the DiD estimate (H0: ATT = 0).

property statsmodels_result

The underlying statsmodels OLS result, for full diagnostics.

property assumptions: list[Assumption]

Modelling assumptions required for a causal interpretation.

executive_summary()

Narrative explanation of the method, DAG, assumptions, and result.

Return type:

str

summary()

Concise tabular summary of the ATT estimate, confidence interval, and assumptions.

Return type:

str

refute(data)

Run refutation checks against this DiD estimation.

Currently runs:

  • Placebo group: randomly permutes group labels and re-runs DiD. The placebo estimate should be near zero if the result is not spurious.

  • Placebo time: randomly permutes time labels and re-runs DiD. The placebo estimate should be near zero if the effect is genuinely concentrated in the post period.

  • Random common cause: adds a random noise column as an extra control and checks that the estimate does not shift by more than one standard error.

Parameters:

data (pd.DataFrame) – The same dataframe passed to fit().

RDD

class formative.RDD(dag, treatment, running_var, cutoff, outcome, bandwidth=None)

Regression Discontinuity Design estimator.

Estimates the Local Average Treatment Effect at the cutoff (LATE at the cutoff) by fitting a local linear regression on both sides of the cutoff. Treatment is always derived from whether the running variable is at or above the cutoff — any existing treatment column in the data is overwritten by this rule.

The model fitted is:

outcome ~ treatment + (running_var - cutoff) + treatment:(running_var - cutoff)

The coefficient on treatment is the LATE at the cutoff: the jump in outcome at the threshold, after allowing slopes to differ on each side.

The DAG is used to validate that the running variable is an ancestor of treatment (i.e. the threshold rule is part of the assumed causal structure). Identification does not rely on the backdoor criterion — it comes from the sharp discontinuity in treatment assignment at the cutoff.

Example:

dag = DAG()
dag.assume("score").causes("treatment", "outcome")
dag.assume("treatment").causes("outcome")

result = RDD(dag, treatment="treatment", running_var="score",
             cutoff=0.0, outcome="outcome").fit(df)
print(result.summary())
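The fitted model above can be reproduced in a few lines of numpy on synthetic data (an illustrative sketch with an invented data-generating process; the estimator itself uses statsmodels and supports bandwidth filtering):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000
cutoff = 0.0

score = rng.uniform(-1, 1, size=n)            # running variable
treatment = (score >= cutoff).astype(float)   # sharp assignment at the cutoff
outcome = 1.0 * score + 2.0 * treatment + rng.normal(size=n)  # true jump = 2.0

# outcome ~ treatment + (score - cutoff) + treatment:(score - cutoff)
centered = score - cutoff
X = np.column_stack([np.ones(n), treatment, centered, treatment * centered])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)

print(f"LATE at the cutoff: {beta[1]:.2f}")  # close to the true jump 2.0
```

The interaction term lets the slope differ on each side of the threshold, so the treatment coefficient isolates the discontinuity itself.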
Parameters:
  • dag (DAG)

  • treatment (str)

  • running_var (str)

  • cutoff (float)

  • outcome (str)

  • bandwidth (float | None)

fit(data)

Estimate the LATE at the cutoff via local linear regression.

Parameters:

data (pd.DataFrame) – Must contain the running variable and outcome columns. The treatment column is derived from the running variable and the cutoff — any existing column with the treatment name is overwritten.

Raises:

ValueError – If required columns are missing from the dataframe.

Return type:

RDDResult

RDDResult

class formative.RDDResult(result, treatment, running_var, cutoff, outcome, bandwidth, unadjusted_effect, dag)

The result of a Regression Discontinuity Design estimation.

The RDD estimate is the LATE at the cutoff: the jump in the outcome at the cutoff, estimated via local linear regression on both sides of the threshold.

Parameters:
  • treatment (str)

  • running_var (str)

  • cutoff (float)

  • outcome (str)

  • bandwidth (float | None)

  • unadjusted_effect (float)

property effect: float

LATE at the cutoff: the jump in outcome at the threshold (the coefficient on the treatment indicator).

property unadjusted_effect: float

Naive mean difference: above-cutoff mean minus below-cutoff mean.

property std_err: float

Standard error of the LATE at the cutoff estimate.

property conf_int: tuple[float, float]

95% confidence interval for the LATE at the cutoff estimate.

property pvalue: float

p-value for the LATE at the cutoff (H0: LATE at cutoff = 0).

property statsmodels_result

The underlying statsmodels OLS result, for full diagnostics.

property assumptions: list[Assumption]

Modelling assumptions required for a causal interpretation.

property cutoff: float

The threshold value that determines treatment assignment.

property running_var: str

Name of the running variable column.

property bandwidth: float | None

Bandwidth used to restrict observations around the cutoff, or None.

property n_obs: int

Number of observations used in the estimation (after bandwidth filtering).

executive_summary()

Narrative explanation of the method, DAG, assumptions, and result.

Return type:

str

summary()

Concise tabular summary of the LATE at the cutoff, confidence interval, and assumptions.

Return type:

str

refute(data)

Run refutation checks against this RDD estimation.

Currently runs:

  • Placebo cutoff: fits an RDD at a false cutoff in the all-control region. The placebo estimate should be near zero if the original result is not spurious.

  • Random common cause: adds a random noise column as an extra covariate and checks that the estimate does not shift by more than one standard error.

Parameters:

data (pd.DataFrame) – The same dataframe passed to fit().