Research-Methods & Statistics Catalog

The rigor behind every number — catalogued, cited, and browsable.

Every study design, statistical test, assumption check, remedy, and sampling method the toolbox knows how to run — each one grounded in the cited methods canon, never invented. This is the research-methods layer most analytics tools hide. We put it on the table.

34 methods · 16 study designs · 3 tests · 9 sampling methods — each with a cited source


Why this catalog exists

Most tools give you a number. We show our work.

A dashboard that prints “significant” without an effect size, or runs a test without an assumption check, is hiding the part that decides whether the answer is trustworthy. The toolbox treats research methods as a first-class, declarative catalog, the same way the visualization catalog treats charts. Every entry names the conditions it’s right for, the conditions it’s wrong for, and the canonical sources that ground it.

That’s the discipline behind the platform: p-values paired with effect sizes and confidence intervals; statistical power considered before data is collected; study designs that are honest about validity threats and what randomization can and cannot buy. When an assumption is violated, the remedy comes from the methods canon: cited, not invented.

Right tool, stated plainly
Every method names when to use it and — just as important — when not to.
Cited, never invented
Each entry is grounded in named sources from the methods canon, resolvable to full citations.
Wired to compute
Tests map to real statistical functions and the chart that renders the result — the catalog isn't just prose.

The catalog

34 methods, grouped by what they do.

Filter by kind or tag. Each card shows the method, when it fits, when it doesn’t, and the sources that ground it. The 16 study-design archetypes give the catalog its breadth; the test family is the rigor in motion.


16 methods

Study designs

How a study is structured to support — or undercut — a causal claim. Each archetype names its causal-identification strategy and the validity threat that dominates it.

Randomized pretest-posttest control

Study design

Random assignment, measured before and after, with a control arm.

Family: experimental
Identification: randomization
Time structure: longitudinal

Notation: R O X O / R O O

Dominant validity threat: pretest × treatment interaction (external)

When to use: When the question + constraints fit an experimental design with randomization.
When not to use: When the dominant threat (pretest × treatment interaction (external)) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Shadish-Cook-Campbell
Campbell & Stanley

Posttest-only randomized control

Study design

Random assignment, outcome measured once after — the cleanest two-arm design.

Family: experimental
Identification: randomization
Time structure: cross_sectional

Notation: R X O / R O

Dominant validity threat: none listed (clean internal validity)

When to use: When the question + constraints fit an experimental design with randomization.
When not to use: When you need a baseline measure, for example to check that the arms started out equivalent or to assess differential attrition; the design has no pretest.

Grounded in

Shadish-Cook-Campbell

Solomon four-group

Study design

Four randomized arms that isolate testing effects from the treatment.

Family: experimental
Identification: randomization
Time structure: longitudinal

Dominant validity threat: complexity / cost

When to use: When the question + constraints fit an experimental design with randomization.
When not to use: When the dominant threat (complexity / cost) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Shadish-Cook-Campbell

One-group pretest-posttest

Study design

A single group measured before and after — weak internal validity (no control).

Family: pre_experiment
Identification: none
Time structure: longitudinal

Notation: O X O

Dominant validity threat: history, maturation, regression to the mean

When to use: When the question + constraints fit a pre_experiment design with no formal causal identification.
When not to use: When the dominant threat (history, maturation, regression to the mean) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Campbell & Stanley

Nonequivalent control group (NEGD)

Study design

Treatment vs a non-randomized comparison group, adjusted for baseline differences.

Family: quasi_experimental
Identification: statistical_adjustment
Time structure: longitudinal

Notation: N O X O / N O O

Dominant validity threat: selection × maturation

When to use: When the question + constraints fit a quasi_experimental design with statistical adjustment.
When not to use: When the dominant threat (selection × maturation) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Shadish-Cook-Campbell
Morgan & Winship

Interrupted time-series

Study design

Many pre/post observations on one series, interrupted by the intervention.

Family: quasi_experimental
Identification: none
Time structure: time_series

Notation: O O O X O O O

Dominant validity threat: history at the intervention point

When to use: When the question + constraints fit a quasi_experimental design with no formal causal identification.
When not to use: When the dominant threat (history at the intervention point) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Shadish-Cook-Campbell
Mostly Harmless Econometrics

Regression discontinuity (RDD)

Study design

Exploit a cutoff on a running variable to identify a local causal effect.

Family: quasi_experimental
Identification: regression_discontinuity
Time structure: cross_sectional

Dominant validity threat: bandwidth / functional form

When to use: When the question + constraints fit a quasi_experimental design with regression discontinuity.
When not to use: When the dominant threat (bandwidth / functional form) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Causal Inference: The Mixtape
Mostly Harmless Econometrics

Instrumental variable (IV / 2SLS)

Study design

Use an instrument that affects treatment but not the outcome directly to recover a causal effect.

Family: observational
Identification: instrumental_variable
Time structure: varies

Dominant validity threat: instrument validity (exclusion)

When to use: When the question + constraints fit an observational design with instrumental variable.
When not to use: When the dominant threat (instrument validity (exclusion)) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Mostly Harmless Econometrics
Causal Inference: The Mixtape
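
The single-instrument case can be sketched in a few lines. This is a minimal illustration, not the toolbox's implementation: with one instrument z, one treatment x, and no covariates, the IV (2SLS) estimate reduces to cov(z, y) / cov(z, x). The function names and the toy data (a hidden confounder u that biases the naive slope) are assumptions made up for the example.

```python
from statistics import fmean

def cov_sum(u, v):
    """Sum of cross-products of deviations (a covariance without the 1/n factor)."""
    ubar, vbar = fmean(u), fmean(v)
    return sum((a - ubar) * (b - vbar) for a, b in zip(u, v))

def iv_estimate(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return cov_sum(z, y) / cov_sum(z, x)

def ols_slope(x, y):
    """Naive regression slope of y on x, for comparison."""
    return cov_sum(x, y) / cov_sum(x, x)

# Hypothetical toy data: true effect of x on y is 2; u is an unobserved confounder.
z = [0, 0, 1, 1]                            # instrument, unrelated to u in this sample
u = [0, 1, 0, 1]                            # confounder
x = [zi + ui for zi, ui in zip(z, u)]       # treatment depends on z and u
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]

naive = ols_slope(x, y)      # pulled away from 2 by the confounder
iv = iv_estimate(z, x, y)    # recovers the true effect in this toy setup
```

The estimate is only as good as the exclusion restriction: if z affects y through any path other than x, the ratio no longer identifies the effect.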

Prospective cohort

Study design

Follow exposed vs unexposed groups forward in time to incidence.

Family: observational
Identification: statistical_adjustment
Time structure: longitudinal_cohort

Dominant validity threat: confounding, attrition

When to use: When the question + constraints fit an observational design with statistical adjustment.
When not to use: When the dominant threat (confounding, attrition) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Rothman
epidemiology shelf

Case-control

Study design

Compare those with vs without the outcome, looking backward at exposure (odds ratios).

Family: observational
Identification: statistical_adjustment
Time structure: retrospective

Dominant validity threat: recall / selection bias

When to use: When the question + constraints fit an observational design with statistical adjustment.
When not to use: When the dominant threat (recall / selection bias) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Rothman
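
As a rough sketch of the odds-ratio arithmetic behind a case-control comparison: the function below takes the four cells of a 2x2 table and returns the odds ratio with a Woolf (log-scale) confidence interval. The function name, argument names, and counts are hypothetical, chosen only for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, level=0.95):
    """Odds ratio and Woolf (log-scale) confidence interval from a 2x2 table."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    or_hat = (a * d) / (b * c)
    # Standard error of log(OR); requires every cell to be non-zero.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return or_hat, exp(log(or_hat) - z * se_log), exp(log(or_hat) + z * se_log)

# Hypothetical counts: 30 of 100 cases exposed, 15 of 100 controls exposed.
or_hat, ci_low, ci_high = odds_ratio(30, 70, 15, 85)
```

The interval is built on the log scale because the sampling distribution of the odds ratio itself is skewed.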

Cross-sectional survey

Study design

Measure many variables at one time point — associations, not causation.

Family: survey
Identification: none
Time structure: cross_sectional

Dominant validity threat: reverse causation

When to use: When the question + constraints fit a survey design with no formal causal identification.
When not to use: When the dominant threat (reverse causation) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Babbie
Fowler

Panel survey

Study design

Re-measure the same respondents over waves — within-person change (fixed effects).

Family: survey
Identification: statistical_adjustment
Time structure: longitudinal_panel

Dominant validity threat: panel attrition

When to use: When the question + constraints fit a survey design with statistical adjustment.
When not to use: When the dominant threat (panel attrition) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Babbie
Mostly Harmless Econometrics

Factorial survey (vignette experiment)

Study design

Randomize vignette attributes within a survey — a conjoint-style embedded experiment.

Family: survey/experimental
Identification: randomization
Time structure: cross_sectional

Dominant validity threat: vignette realism (external)

When to use: When the question + constraints fit a survey/experimental design with randomization.
When not to use: When the dominant threat (vignette realism (external)) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Factorial Survey Experiments

Case study

Study design

Deep examination of one or a few bounded cases (qualitative or mixed).

Family: case_study
Identification: none
Time structure: varies

Dominant validity threat: generalizability

When to use: When the question + constraints fit a case_study design with no formal causal identification.
When not to use: When the dominant threat (generalizability) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Babbie
Coding Manual

Ethnography / grounded theory

Study design

Immersive field study; themes built from coded observation (interpretivist).

Family: ethnographic_field
Identification: none
Time structure: longitudinal

Dominant validity threat: reflexivity / transferability

When to use: When the question + constraints fit an ethnographic_field design with no formal causal identification.
When not to use: When the dominant threat (reflexivity / transferability) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Coding Manual
Crotty

Meta-analysis

Study design

Pool effect sizes across studies (fixed/random effects) — Principia's priors ARE this.

Family: meta_analytic_synthesis
Identification: n.a.
Time structure: n.a.

Dominant validity threat: publication bias

When to use: When the question + constraints fit a meta_analytic_synthesis design (causal identification: n.a.).
When not to use: When the dominant threat (publication bias) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

Hunter & Schmidt
Borenstein et al.
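
To make "fixed/random effects" concrete, here is a minimal sketch of inverse-variance pooling, with the DerSimonian-Laird estimator standing in for the between-study variance. That estimator is one common choice and is an assumption here, not a statement of what the catalog runs. The function names and the three study effects are made up for illustration.

```python
from math import sqrt

def fixed_effect(effects, variances):
    """Inverse-variance weighted mean and its standard error."""
    w = [1 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return pooled, sqrt(1 / sum(w))

def random_effects_dl(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimate."""
    w = [1 / v for v in variances]
    pooled_fe, _ = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - pooled_fe) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Add the between-study variance to each study's own variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, sqrt(1 / sum(w_star)), tau2

# Hypothetical study-level effect sizes and their sampling variances.
effects = [0.2, 0.5, 0.8]
variances = [0.04, 0.04, 0.04]
fe_mean, fe_se = fixed_effect(effects, variances)
re_mean, re_se, tau2 = random_effects_dl(effects, variances)
```

When the studies disagree more than sampling error alone would explain, tau^2 is positive and the random-effects standard error is wider than the fixed-effect one.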

3 methods

Statistical tests

The analysis that answers the question, with the effect size and confidence interval that say how much — not just whether — alongside the p-value.

Independent-samples t-test

Statistical test

Compare the means of two independent groups (equal-variance, pooled).

Effect size: cohens_d
Groups: 2
Outcome: interval

When to use: Two independent groups, a continuous outcome, comparable group variances and roughly normal data (or n large enough for the CLT).
When not to use: Unequal variances (use Welch's t), non-normal small samples (use Mann-Whitney), or paired/repeated measures (use a paired t).

Grounded in

Research Methods in Psychology
statistics shelf
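
A minimal sketch of the pooled computation, using only the Python standard library. It returns the t statistic, its degrees of freedom, and Cohen's d; the p-value and confidence interval additionally need the t distribution, which a statistics library would supply. The function name and sample data are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_test(a, b):
    """Return (t, df, cohens_d) for an equal-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    diff = mean(a) - mean(b)
    t = diff / sqrt(sp2 * (1 / n1 + 1 / n2))
    d = diff / sqrt(sp2)  # Cohen's d: mean difference in pooled-SD units
    return t, df, d

# Hypothetical data for two independent groups.
control = [1, 2, 3, 4, 5]
treated = [3, 4, 5, 6, 7]
t_stat, df, cohens_d = pooled_t_test(treated, control)
```

Reporting d next to t keeps the "how much" visible alongside the "whether".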

Welch's t-test (unequal variance)

Statistical test

Two-group mean comparison that does NOT assume equal variances.

Effect size: cohens_d
Groups: 2
Outcome: interval

When to use: Two independent groups with a continuous outcome whenever group variances may differ (i.e. almost always — Welch is a safe default over the pooled t).
When not to use: Paired/repeated measures, or ordinal/badly non-normal small samples (use Mann-Whitney).

Grounded in

Welch (1947)
Delacre, Lakens & Leys (2017)
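
For comparison with the pooled test, a sketch of Welch's statistic and the Welch-Satterthwaite degrees of freedom, standard library only. The function name and data are assumptions for illustration; the p-value again needs the t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t_test(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    # Squared standard error of each group mean, kept separate (not pooled).
    se1, se2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(se1 + se2)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical groups with unequal sizes and clearly unequal spread.
narrow = [1, 2, 3, 4, 5]
wide = [2, 4, 6, 8, 10, 12]
t_stat, df = welch_t_test(narrow, wide)

# With equal sizes and equal variances, df matches the pooled test (n1 + n2 - 2).
_, df_equal = welch_t_test([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```

The degrees of freedom shrink as the variances and group sizes diverge, which is what makes Welch a safe default.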

Mann-Whitney U (Wilcoxon rank-sum)

Statistical test

Non-parametric two-group comparison on ranks (no normality assumption).

Groups: 2
Outcome: ordinal

When to use: Two independent groups with an ordinal outcome, or a continuous outcome that is badly non-normal at small n.
When not to use: When means + their effect size are the target and data are roughly normal (a t-test is more powerful), or for paired data (Wilcoxon signed-rank).

Grounded in

Mann & Whitney (1947)
statistics shelf
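
A small sketch of the U statistic, counting pairwise wins directly (ties count one half, which matches the midrank convention). The p-value here uses the large-sample normal approximation without tie or continuity correction, a simplifying assumption; at small n an exact method is preferable. The function name and data are hypothetical.

```python
from math import erfc, sqrt

def mann_whitney_u(a, b):
    """Return (U for group a, two-sided normal-approximation p-value)."""
    n1, n2 = len(a), len(b)
    # U counts the pairs in which the value from a exceeds the value from b.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return u, p

# Hypothetical ordinal scores for two independent groups.
group_a = [1, 2, 3]
group_b = [4, 5, 6]
u_a, p_value = mann_whitney_u(group_a, group_b)
u_b, _ = mann_whitney_u(group_b, group_a)
```

The two U values always sum to n1 * n2, so either one carries the full result.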

2 methods

Assumptions

The conditions a test depends on. Checked severity-aware on effect size, not on a bare significance test, and linked to the remedies that fix a violation.

Normality

Assumption

Are the data (or residuals) approximately normally distributed?

Severity basis: skewness + excess kurtosis (effect-size view) + QQ correlation

When to use: Before a t-test / ANOVA / regression on a small-to-moderate sample.
When not to use: As a gate at large n (the CLT covers the mean) — read the QQ plot + effect instead.

Grounded in

Research Methods in Psychology
statistics shelf
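
The three severity ingredients named above can be sketched with the standard library. This is an illustrative reading, not the catalog's exact computation: moment-based skewness and excess kurtosis, plus the correlation between the sorted data and normal quantiles at plotting positions (i + 0.5) / n. The plotting positions, function name, and data are assumptions.

```python
from math import sqrt
from statistics import NormalDist, fmean

def normality_summary(data):
    """Return (skewness, excess_kurtosis, qq_correlation) for a sample."""
    n = len(data)
    m = fmean(data)
    m2 = fmean([(x - m) ** 2 for x in data])
    m3 = fmean([(x - m) ** 3 for x in data])
    m4 = fmean([(x - m) ** 4 for x in data])
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    # QQ correlation: sorted data against theoretical normal quantiles.
    obs = sorted(data)
    theo = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    tbar = fmean(theo)
    num = sum((o - m) * (t - tbar) for o, t in zip(obs, theo))
    den = sqrt(sum((o - m) ** 2 for o in obs) * sum((t - tbar) ** 2 for t in theo))
    return skew, excess_kurt, num / den

# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_skewed = [1, 1, 1, 2, 2, 3, 5, 9, 20]
sym_skew, sym_kurt, sym_qq = normality_summary(symmetric)
skw_skew, skw_kurt, skw_qq = normality_summary(right_skewed)
```

Reading the size of the departure, rather than a bare pass/fail test, is what keeps the check meaningful at large n.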

Homoscedasticity (equal variances)

Assumption

Do the groups have comparable variances?

Severity basis: variance ratio + Brown-Forsythe Levene W (F-distributed)

When to use: Before a pooled (equal-variance) t-test or a fixed-effects ANOVA.
When not to use: If you already default to Welch — heteroscedasticity is then moot.

Grounded in

Brown & Forsythe (1974)
Levene (1960)
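
A sketch of both severity inputs, standard library only: the ratio of the largest to the smallest group variance, and the Brown-Forsythe statistic W (Levene's test on absolute deviations from each group's median). W is referred to an F distribution with (k - 1, N - k) degrees of freedom; the p-value lookup is left to a statistics library. The function names and data are illustrative assumptions.

```python
from statistics import fmean, median, variance

def variance_ratio(*groups):
    """Largest sample variance divided by the smallest."""
    vs = [variance(g) for g in groups]
    return max(vs) / min(vs)

def brown_forsythe_w(*groups):
    """Brown-Forsythe W: one-way ANOVA F on absolute deviations from group medians."""
    z = []
    for g in groups:
        med = median(g)
        z.append([abs(x - med) for x in g])
    k = len(groups)
    n_total = sum(len(zg) for zg in z)
    grand = fmean([v for zg in z for v in zg])
    between = sum(len(zg) * (fmean(zg) - grand) ** 2 for zg in z)
    within = sum((v - fmean(zg)) ** 2 for zg in z for v in zg)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical groups; the second has twice the spread of the first.
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
ratio = variance_ratio(a, b)
w = brown_forsythe_w(a, b)
```

Centering on the median rather than the mean is what makes the Brown-Forsythe variant robust to skewed data.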

4 methods

Remedies

What to do when an assumption is violated — cited fixes drawn from the methods canon, never invented. Several are executable directly.

Welch correction

Remedy

Drop the equal-variance assumption — the default fix for heteroscedasticity.

Executable: Yes — runs directly

When to use: Two-group mean comparison where Levene / the variance ratio flags unequal variances.
When not to use: Already using Welch; or the issue is non-normality not variance (consider a rank test).

Grounded in

Welch (1947)
Delacre, Lakens & Leys (2017)

Switch to a rank test (Mann-Whitney)

Remedy

Non-parametric fallback when normality is badly violated at small n.

Executable: Yes — runs directly

When to use: Severe non-normality on a small two-group sample.
When not to use: Roughly normal data, or when the mean difference + Cohen's d are the target (a rank test changes the estimand).

Grounded in

Mann & Whitney (1947)

Bootstrap the sampling distribution

Remedy

Resample to get a distribution-free CI / p without the normality assumption.

Executable: Advisory

When to use: Non-normality where you still want the mean-difference estimand + a CI.
When not to use: Very small n (the bootstrap is unstable below ~20/group).

Grounded in

Efron & Tibshirani (1993)
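
One simple way to do this is the percentile bootstrap, sketched below for the difference in means. It is only one of several bootstrap interval methods and is used here as an assumption for illustration; the function name, defaults, and data are hypothetical.

```python
import random
from statistics import fmean

def bootstrap_mean_diff_ci(a, b, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b)."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(fmean(resampled_a) - fmean(resampled_b))
    diffs.sort()
    lower = diffs[int(alpha / 2 * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical data: 25 observations per group.
group_a = [10 + (i * 7) % 11 for i in range(25)]
group_b = [8 + (i * 5) % 11 for i in range(25)]
observed = fmean(group_a) - fmean(group_b)
ci_low, ci_high = bootstrap_mean_diff_ci(group_a, group_b)
```

The estimand stays the mean difference, which is the point of choosing the bootstrap over a rank test.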

Heteroscedasticity-robust SEs (HC3)

Remedy

Robust standard errors in a regression context (the regression analog of Welch).

Executable: Advisory

When to use: Regression with non-constant error variance.
When not to use: A simple two-group comparison — Welch is the direct fix.

Grounded in

MacKinnon & White (1985)
Long & Ervin (2000)
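
For a single-predictor regression with an intercept, the HC3 standard error of the slope has a closed form that fits in a short sketch. Each squared residual is inflated by 1 / (1 - h)^2, where h is the observation's leverage. The function name and data are assumptions for illustration; a multi-predictor model would need matrix algebra or a regression library.

```python
from math import sqrt
from statistics import fmean

def ols_slope_hc3(x, y):
    """Return (slope, HC3 standard error) for y = intercept + slope * x."""
    n = len(x)
    xbar, ybar = fmean(x), fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    meat = 0.0
    for xi, yi in zip(x, y):
        resid = yi - (intercept + slope * xi)
        # Leverage in simple regression; must stay below 1 for HC3 to be defined.
        leverage = 1 / n + (xi - xbar) ** 2 / sxx
        meat += (xi - xbar) ** 2 * (resid / (1 - leverage)) ** 2
    return slope, sqrt(meat) / sxx

# Hypothetical data whose scatter grows with x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 1.9, 3.2, 3.7, 5.6, 5.1, 8.0, 6.9]
slope, se_hc3 = ols_slope_hc3(x, y)
```

The slope estimate is the ordinary least-squares one; only the standard error changes.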

9 methods

Sampling methods

How participants are selected. Probability methods support generalizable inference; non-probability methods are honest about what they can and cannot claim.

Simple random sampling

Sampling method

Every unit has an equal, independent chance of selection.

Inference: Probability-based

When to use: When you need representative inference to a known frame.
When not to use: When no enumerable frame exists.

Grounded in

Babbie ch.7
Lavrakas

Systematic sampling

Sampling method

Select every k-th unit from an ordered frame after a random start.

Inference: Probability-based

When to use: When you need representative inference to a known frame.
When not to use: When no enumerable frame exists.

Grounded in

Babbie ch.7
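
The "every k-th unit after a random start" rule, as a minimal sketch. The function name and the 100-unit frame are hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Take every k-th unit from an ordered frame, starting at a random offset below k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start in 0 .. k-1
    return frame[start::k]

# Hypothetical ordered frame of 100 unit IDs, sampling interval of 10.
frame = list(range(100))
sample = systematic_sample(frame, k=10, seed=42)
```

If the frame's ordering has a cycle that lines up with k, the sample can be badly unrepresentative, so the ordering is worth checking first.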

Stratified sampling

Sampling method

Partition the frame into strata and sample within each (proportional or equal).

Inference: Probability-based

When to use: When you need representative inference to a known frame.
When not to use: When no enumerable frame exists.

Grounded in

Babbie ch.7
Fowler
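
A sketch of the proportional variant: group the frame by stratum, then draw the same fraction from each stratum without replacement. The function name, the `stratum_of` callback, and the team labels are assumptions for illustration.

```python
import random

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Proportional stratified sample: draw the same fraction within every stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        # At least one unit per stratum so small strata are never dropped.
        n = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical frame: 80 units in "eng", 20 in "ops".
frame = [("eng", i) for i in range(80)] + [("ops", i) for i in range(20)]
sample = stratified_sample(frame, stratum_of=lambda unit: unit[0], fraction=0.1, seed=1)
counts = {label: sum(1 for unit in sample if unit[0] == label) for label in ("eng", "ops")}
```

For the equal-allocation variant, replace the per-stratum `n` with a fixed count.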

Cluster sampling

Sampling method

Sample whole clusters (e.g. teams/sites), then units within.

Inference: Probability-based

When to use: When you need representative inference to a known frame.
When not to use: When no enumerable frame exists.

Grounded in

Babbie ch.7

Quota sampling

Sampling method

Non-probability: fill target quotas per subgroup (no random selection).

Inference: Non-probability

When to use: When a probability frame is infeasible and you will caveat generalizability.
When not to use: When you need unbiased inference to a population (use a probability method).

Grounded in

Babbie ch.7

Convenience sampling

Sampling method

Non-probability: whoever is available — high bias, low generalizability.

Inference: Non-probability

When to use: When a probability frame is infeasible and you will caveat generalizability.
When not to use: When you need unbiased inference to a population (use a probability method).

Grounded in

Babbie ch.7

Purposive sampling

Sampling method

Non-probability: deliberately chosen information-rich cases.

Inference: Non-probability

When to use: When a probability frame is infeasible and you will caveat generalizability.
When not to use: When you need unbiased inference to a population (use a probability method).

Grounded in

Babbie ch.7
Coding Manual

Snowball sampling

Sampling method

Non-probability: referrals chain from initial participants (hidden populations).

Inference: Non-probability

When to use: When a probability frame is infeasible and you will caveat generalizability.
When not to use: When you need unbiased inference to a population (use a probability method).

Grounded in

Babbie ch.7

Census

Sampling method

Enumerate the entire frame — no sampling error, but rarely feasible.

Inference: Probability-based

When to use: When you need representative inference to a known frame.
When not to use: When no enumerable frame exists.

Grounded in

Babbie ch.7

Put the rigor to work

These methods aren’t just catalogued — they’re shippable.

The statistical tests and decompositions in this catalog are exported as drop-in capabilities for the tools you already use. Browse the statistics shelf of the store, or describe your question and we’ll tell you honestly whether we have a drop-in for it today.


Rigor you can read, on numbers you can trust.

This catalog is the methods discipline the whole toolbox is built on. Bring your question; we’ll bring the right design, the right test, and the citation behind both.

