Research-Methods & Statistics Catalog

The rigor behind every number — catalogued, cited, and browsable.

Every study design, statistical test, assumption check, remedy, and sampling method the toolbox knows how to run — each one grounded in the cited methods canon, never invented. This is the research-methods layer most analytics tools hide. We put it on the table.

34 methods · 16 study designs · 3 tests · 9 sampling methods — each with a cited source

Why this catalog exists

Most tools give you a number. We show our work.

A dashboard that prints “significant” without an effect size, or runs a test without an assumption check, is hiding the part that decides whether the answer is trustworthy. The toolbox treats research methods as a first-class, declarative catalog, the same way the visualization catalog treats charts. Every entry names the conditions it’s right for, the conditions it’s wrong for, and the canonical sources that ground it.

That’s the discipline behind the platform: p-values paired with effect sizes and confidence intervals; statistical power considered before data is collected; study designs that are honest about their validity threats and what each identification strategy can and cannot buy. When an assumption is violated, the remedy comes from the methods canon: cited, not invented.

  • Right tool, stated plainly

    Every method names when to use it and — just as important — when not to.

  • Cited, never invented

    Each entry is grounded in named sources from the methods canon, resolvable to full citations.

  • Wired to compute

    Tests map to real statistical functions and the chart that renders the result — the catalog isn't just prose.

The catalog

34 methods, grouped by what they do.

Filter by kind or tag. Each card shows the method, when it fits, when it doesn’t, and the sources that ground it. The 16 study-design archetypes give the catalog its breadth; the test family is the rigor in motion.


16 methods

Study designs

How a study is structured to support — or undercut — a causal claim. Each archetype names its causal-identification strategy and the validity threat that dominates it.

Randomized pretest-posttest control

Study design

Random assignment, measured before and after, with a control arm.

Family
experimental
Identification
randomization
Time structure
longitudinal

notation R O X O / R O O

Dominant validity threat: pretest × treatment interaction (external)

When to use
When the question + constraints fit an experimental design with randomization.
When not to use
When the dominant threat (pretest × treatment interaction (external)) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Shadish-Cook-Campbell
  • Campbell & Stanley

Posttest-only randomized control

Study design

Random assignment, outcome measured once after — the cleanest two-arm design.

Family
experimental
Identification
randomization
Time structure
cross_sectional

notation R X O / R O

Dominant validity threat: — (clean internal validity)

When to use
When the question + constraints fit an experimental design with randomization.
When not to use
Rarely on internal-validity grounds, since no dominant threat is listed for this design; look elsewhere only when random assignment is infeasible for the decision at hand.

Grounded in

  • Shadish-Cook-Campbell

Solomon four-group

Study design

Four randomized arms that isolate testing effects from the treatment.

Family
experimental
Identification
randomization
Time structure
longitudinal

Dominant validity threat: complexity / cost

When to use
When the question + constraints fit an experimental design with randomization.
When not to use
When the dominant threat (complexity / cost) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Shadish-Cook-Campbell

One-group pretest-posttest

Study design

A single group measured before and after — weak internal validity (no control).

Family
pre_experiment
Identification
none
Time structure
longitudinal

notation O X O

Dominant validity threat: history, maturation, regression to the mean

When to use
When the question + constraints fit a pre_experiment design with no formal causal identification.
When not to use
When the dominant threat (history, maturation, regression to the mean) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Campbell & Stanley

Nonequivalent control group (NEGD)

Study design

Treatment vs a non-randomized comparison group, adjusted for baseline differences.

Family
quasi_experimental
Identification
statistical_adjustment
Time structure
longitudinal

notation N O X O / N O O

Dominant validity threat: selection × maturation

When to use
When the question + constraints fit a quasi_experimental design with statistical adjustment.
When not to use
When the dominant threat (selection × maturation) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Shadish-Cook-Campbell
  • Morgan & Winship

Interrupted time-series

Study design

Many pre/post observations on one series, interrupted by the intervention.

Family
quasi_experimental
Identification
none
Time structure
time_series

notation O O O X O O O

Dominant validity threat: history at the intervention point

When to use
When the question + constraints fit a quasi_experimental design with no formal causal identification.
When not to use
When the dominant threat (history at the intervention point) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Shadish-Cook-Campbell
  • Mostly Harmless Econometrics

Regression discontinuity (RDD)

Study design

Exploit a cutoff on a running variable to identify a local causal effect.

Family
quasi_experimental
Identification
regression_discontinuity
Time structure
cross_sectional

Dominant validity threat: bandwidth / functional form

When to use
When the question + constraints fit a quasi_experimental design with regression discontinuity.
When not to use
When the dominant threat (bandwidth / functional form) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Causal Inference: The Mixtape
  • Mostly Harmless Econometrics

Instrumental variable (IV / 2SLS)

Study design

Use an instrument that affects treatment but not the outcome directly to recover a causal effect.

Family
observational
Identification
instrumental_variable
Time structure
varies

Dominant validity threat: instrument validity (exclusion)

When to use
When the question + constraints fit an observational design with instrumental variable.
When not to use
When the dominant threat (instrument validity (exclusion)) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Mostly Harmless Econometrics
  • Causal Inference: The Mixtape
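
As a rough illustration of the idea on this card, the single-instrument case reduces to a ratio of covariances. The sketch below is a minimal stdlib-only example, not the toolbox's own implementation; the function name iv_wald_estimate and the weak-instrument tolerance are assumptions made for illustration.

```python
import statistics

def _cov(u, v):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def iv_wald_estimate(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x).

    z: instrument, x: treatment, y: outcome.
    With one instrument and one treatment this ratio equals the 2SLS slope.
    """
    cov_zx = _cov(z, x)
    if abs(cov_zx) < 1e-12:  # hypothetical tolerance: instrument does not move treatment
        raise ValueError("instrument is unrelated to treatment (no first stage)")
    return _cov(z, y) / cov_zx
```

The estimate is only as good as the exclusion restriction: the code cannot check that the instrument affects the outcome solely through the treatment.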

Prospective cohort

Study design

Follow exposed vs unexposed groups forward in time to incidence.

Family
observational
Identification
statistical_adjustment
Time structure
longitudinal_cohort

Dominant validity threat: confounding, attrition

When to use
When the question + constraints fit an observational design with statistical adjustment.
When not to use
When the dominant threat (confounding, attrition) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Rothman
  • epidemiology shelf

Case-control

Study design

Compare those with vs without the outcome, looking backward at exposure (odds ratios).

Family
observational
Identification
statistical_adjustment
Time structure
retrospective

Dominant validity threat: recall / selection bias

When to use
When the question + constraints fit an observational design with statistical adjustment.
When not to use
When the dominant threat (recall / selection bias) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Rothman
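
The odds ratio this card mentions can be sketched from a 2x2 exposure table. This is a minimal example under stated assumptions: the function name odds_ratio_ci is hypothetical, and the interval uses the standard log-based (Woolf) approximation, which the catalog does not specify.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, level=0.95):
    """Odds ratio and log-based (Woolf) confidence interval from a 2x2 table."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

The interval reflects sampling error only; recall and selection bias, the dominant threats named above, are not captured by it.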

Cross-sectional survey

Study design

Measure many variables at one time point — associations, not causation.

Family
survey
Identification
none
Time structure
cross_sectional

Dominant validity threat: reverse causation

When to use
When the question + constraints fit a survey design with no formal causal identification.
When not to use
When the dominant threat (reverse causation) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Babbie
  • Fowler

Panel survey

Study design

Re-measure the same respondents over waves — within-person change (fixed effects).

Family
survey
Identification
statistical_adjustment
Time structure
longitudinal_panel

Dominant validity threat: panel attrition

When to use
When the question + constraints fit a survey design with statistical adjustment.
When not to use
When the dominant threat (panel attrition) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Babbie
  • Mostly Harmless Econometrics
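
The "within-person change (fixed effects)" idea can be sketched by demeaning each respondent's observations before fitting a slope. This is an illustrative one-predictor version; the function name within_slope and the (person_id, x, y) row layout are assumptions, not the toolbox's API.

```python
from collections import defaultdict
from statistics import fmean

def within_slope(rows):
    """Fixed-effects (within) slope of y on x.

    rows: iterable of (person_id, x, y) tuples, one per respondent-wave.
    Each person's own means are subtracted first, so stable between-person
    differences cannot drive the estimate.
    """
    by_person = defaultdict(list)
    for person_id, x, y in rows:
        by_person[person_id].append((x, y))
    num = den = 0.0
    for obs in by_person.values():
        mean_x = fmean([x for x, _ in obs])
        mean_y = fmean([y for _, y in obs])
        for x, y in obs:
            num += (x - mean_x) * (y - mean_y)
            den += (x - mean_x) ** 2
    if den == 0:
        raise ValueError("x never varies within any respondent")
    return num / den

# Toy panel: both respondents share slope 2 but have very different baselines.
panel = [("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
         ("B", 4, -2), ("B", 5, 0), ("B", 6, 2)]
slope = within_slope(panel)
```

In this toy panel a pooled regression that ignores respondent identity would produce a negative slope, which is the kind of between-person confounding the within transformation removes.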

Factorial survey (vignette experiment)

Study design

Randomize vignette attributes within a survey — a conjoint-style embedded experiment.

Family
survey/experimental
Identification
randomization
Time structure
cross_sectional

Dominant validity threat: vignette realism (external)

When to use
When the question + constraints fit a survey/experimental design with randomization.
When not to use
When the dominant threat (vignette realism (external)) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Factorial Survey Experiments

Case study

Study design

Deep examination of one or a few bounded cases (qualitative or mixed).

Family
case_study
Identification
none
Time structure
varies

Dominant validity threat: generalizability

When to use
When the question + constraints fit a case_study design with no formal causal identification.
When not to use
When the dominant threat (generalizability) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Babbie
  • Coding Manual

Ethnography / grounded theory

Study design

Immersive field study; themes built from coded observation (interpretivist).

Family
ethnographic_field
Identification
none
Time structure
longitudinal

Dominant validity threat: reflexivity / transferability

When to use
When the question + constraints fit an ethnographic_field design with no formal causal identification.
When not to use
When the dominant threat (reflexivity / transferability) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Coding Manual
  • Crotty

Meta-analysis

Study design

Pool effect sizes across studies (fixed/random effects) — Principia's priors ARE this.

Family
meta_analytic_synthesis
Identification
n.a.
Time structure
n.a.

Dominant validity threat: publication bias

When to use
When the question + constraints call for a meta_analytic_synthesis design (causal identification: n.a.).
When not to use
When the dominant threat (publication bias) is unacceptable for the decision at hand — escalate to a stronger-identification archetype.

Grounded in

  • Hunter & Schmidt
  • Borenstein et al.
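
To make "pool effect sizes across studies (fixed/random effects)" concrete, here is a minimal sketch of inverse-variance pooling. It assumes the DerSimonian-Laird estimator for the between-study variance, which is one common choice and not necessarily the one the toolbox uses; the function name pool_effects is hypothetical.

```python
import math

def pool_effects(effects, variances):
    """Inverse-variance pooling of per-study effect sizes.

    Returns fixed-effect and random-effects estimates with standard errors,
    plus Cochran's Q and the DerSimonian-Laird tau^2 (an assumed estimator).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random_effects = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    return {
        "fixed": (fixed, math.sqrt(1.0 / sum_w)),
        "random": (random_effects, math.sqrt(1.0 / sum_w_re)),
        "Q": q,
        "tau2": tau2,
    }
```

When tau^2 is zero the two estimates coincide; as heterogeneity grows, the random-effects weights become more equal and its standard error widens. Neither model corrects for publication bias.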

3 methods

Statistical tests

The analysis that answers the question, with the effect size and confidence interval that say how much — not just whether — alongside the p-value.

Independent-samples t-test

Statistical test

Compare the means of two independent groups (equal-variance, pooled).

Effect size
cohens_d
Groups
2
Outcome
interval
When to use
Two independent groups, a continuous outcome, comparable group variances and roughly normal data (or n large enough for the CLT).
When not to use
Unequal variances (use Welch's t), non-normal small samples (use Mann-Whitney), or paired/repeated measures (use a paired t).

Grounded in

  • Research Methods in Psychology
  • statistics shelf
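
A minimal sketch of the pooled computation behind this card, using only the Python standard library. The function name pooled_t_and_d is an assumption for illustration; it returns the test statistic, its degrees of freedom, and Cohen's d, and leaves the p-value lookup to a statistics library.

```python
import math
import statistics

def pooled_t_and_d(a, b):
    """Equal-variance (pooled) t statistic, degrees of freedom, and Cohen's d."""
    n1, n2 = len(a), len(b)
    mean1, mean2 = statistics.fmean(a), statistics.fmean(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)  # n - 1 denominators
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_error
    cohens_d = (mean1 - mean2) / math.sqrt(pooled_var)
    return t_stat, df, cohens_d
```

Both the statistic and d divide by the same pooled variance, which is why unequal group variances distort them together.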

Welch's t-test (unequal variance)

Statistical test

Two-group mean comparison that does NOT assume equal variances.

Effect size
cohens_d
Groups
2
Outcome
interval
When to use
Two independent groups with a continuous outcome whenever group variances may differ (i.e. almost always — Welch is a safe default over the pooled t).
When not to use
Paired/repeated measures, or ordinal/badly non-normal small samples (use Mann-Whitney).

Grounded in

  • Welch (1947)
  • Delacre, Lakens & Leys (2017)
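
The difference from the pooled test comes down to the standard error and the degrees of freedom. The sketch below shows both, with the Welch-Satterthwaite approximation for the degrees of freedom; the function name welch_t is hypothetical and the p-value lookup is left out.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    se1_sq = statistics.variance(a) / n1  # squared standard error of mean 1
    se2_sq = statistics.variance(b) / n2  # squared standard error of mean 2
    t_stat = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se1_sq + se2_sq)
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t_stat, df
```

With equal group sizes and equal variances the result matches the pooled test (df = n1 + n2 - 2); the more the variances or sizes diverge, the further the degrees of freedom fall below that value.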

Mann-Whitney U (Wilcoxon rank-sum)

Statistical test

Non-parametric two-group comparison on ranks (no normality assumption).

Groups
2
Outcome
ordinal
When to use
Two independent groups with an ordinal outcome, or a continuous outcome that is badly non-normal at small n.
When not to use
When means + their effect size are the target and data are roughly normal (a t-test is more powerful), or for paired data (Wilcoxon signed-rank).

Grounded in

  • Mann & Whitney (1947)
  • statistics shelf
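
A small sketch of the rank-based comparison, counting pairwise wins directly (equivalent to the rank-sum form, with ties counted as half). The function name mann_whitney_u is an assumption; the p-value uses the plain normal approximation without tie or continuity correction, which is rough at very small n.

```python
import math
from statistics import NormalDist

def mann_whitney_u(a, b):
    """Mann-Whitney U by pair counting, with a normal-approximation p-value."""
    n1, n2 = len(a), len(b)
    u1 = 0.0
    for x in a:
        for y in b:
            if x > y:
                u1 += 1.0
            elif x == y:
                u1 += 0.5  # ties contribute half a win to each group
    u2 = n1 * n2 - u1
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction (assumption)
    z = (u1 - mean_u) / sd_u
    p_approx = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "U1": u1,
        "U2": u2,
        "prob_superiority": u1 / (n1 * n2),  # share of pairs where a beats b
        "z": z,
        "p_approx": p_approx,
    }
```

The probability of superiority is reported here as one possible effect size for a rank test; it answers "how often does a value from the first group exceed one from the second", which is a different estimand from a mean difference.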

2 methods

Assumptions

The conditions a test depends on. Each check is severity-aware: it grades a violation by effect size rather than by a bare significance test, and links to the remedies that fix it.

Normality

Assumption

Are the data (or residuals) approximately normally distributed?

Severity basis
skewness + excess kurtosis (effect-size view) + QQ correlation
When to use
Before a t-test / ANOVA / regression on a small-to-moderate sample.
When not to use
As a gate at large n (the CLT covers the mean) — read the QQ plot + effect instead.

Grounded in

  • Research Methods in Psychology
  • statistics shelf
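
The three quantities named in the severity basis can be sketched as follows. This is an illustration only: the moment-based (population) formulas for skewness and excess kurtosis and the (i + 0.5) / n plotting positions for the QQ correlation are assumptions, and the function name normality_diagnostics is hypothetical. The catalog does not state which variants or thresholds it uses.

```python
import math
from statistics import NormalDist, fmean

def normality_diagnostics(data):
    """Skewness, excess kurtosis, and QQ correlation against a normal.

    Requires at least two distinct values (variance must be non-zero).
    """
    n = len(data)
    mean = fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0

    # QQ correlation: sorted data vs. theoretical normal quantiles.
    ordered = sorted(data)
    std_normal = NormalDist()
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_q = fmean(quantiles)
    cov = sum((x - mean) * (q - mean_q) for x, q in zip(ordered, quantiles))
    var_x = sum((x - mean) ** 2 for x in ordered)
    var_q = sum((q - mean_q) ** 2 for q in quantiles)
    qq_correlation = cov / math.sqrt(var_x * var_q)
    return skewness, excess_kurtosis, qq_correlation
```

Values near 0, 0, and 1 respectively suggest little departure from normality; how far is "too far" is a judgment the catalog ties to severity rather than to a significance test.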

Homoscedasticity (equal variances)

Assumption

Do the groups have comparable variances?

Severity basis
variance ratio + Brown-Forsythe Levene W (F-distributed)
When to use
Before a pooled (equal-variance) t-test or a fixed-effects ANOVA.
When not to use
If you already default to Welch — heteroscedasticity is then moot.

Grounded in

  • Brown & Forsythe (1974)
  • Levene (1960)
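
Both parts of the severity basis can be computed directly. The sketch below uses the Brown-Forsythe variant of Levene's statistic (absolute deviations from each group's median) and a larger-over-smaller variance ratio; the function names are assumptions, and the reference to an F distribution with (k - 1, N - k) degrees of freedom is left to a statistics library.

```python
import statistics

def variance_ratio(a, b):
    """Larger sample variance divided by the smaller one (always >= 1)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)

def brown_forsythe_w(*groups):
    """Brown-Forsythe (median-centered Levene) W statistic for k groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations from each group's own median.
    deviations = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    group_means = [statistics.fmean(z) for z in deviations]
    grand_mean = statistics.fmean([v for z in deviations for v in z])
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)
    return (n_total - k) / (k - 1) * between / within
```

W is an ordinary one-way ANOVA F statistic computed on the absolute deviations, which is why it is compared against an F distribution.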

4 methods

Remedies

What to do when an assumption is violated — cited fixes drawn from the methods canon, never invented. Several are executable directly.

Welch correction

Remedy

Drop the equal-variance assumption — the default fix for heteroscedasticity.

Executable
Yes — runs directly
When to use
Two-group mean comparison where Levene / the variance ratio flags unequal variances.
When not to use
Already using Welch; or the issue is non-normality not variance (consider a rank test).

Grounded in

  • Welch (1947)
  • Delacre, Lakens & Leys (2017)

Switch to a rank test (Mann-Whitney)

Remedy

Non-parametric fallback when normality is badly violated at small n.

Executable
Yes — runs directly
When to use
Severe non-normality on a small two-group sample.
When not to use
Roughly normal data, or when the mean difference + Cohen's d are the target (a rank test changes the estimand).

Grounded in

  • Mann & Whitney (1947)

Bootstrap the sampling distribution

Remedy

Resample to get a distribution-free CI / p without the normality assumption.

Executable
Advisory
When to use
Non-normality where you still want the mean-difference estimand + a CI.
When not to use
Very small n (the bootstrap is unstable below ~20/group).

Grounded in

  • Efron & Tibshirani (1993)
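
One simple form of this remedy is the percentile bootstrap for a difference in means. The sketch below is a minimal version under stated assumptions: the percentile method (rather than BCa or a studentized interval), 5000 resamples, and a fixed seed are illustrative choices, and the function name is hypothetical.

```python
import random
import statistics

def bootstrap_mean_diff_ci(a, b, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b).

    Each group is resampled with replacement, independently of the other.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

The estimand stays the mean difference, which is the reason to prefer this over a rank test when the mean is what the decision depends on.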

Heteroscedasticity-robust SEs (HC3)

Remedy

Robust standard errors in a regression context (the regression analog of Welch).

Executable
Advisory
When to use
Regression with non-constant error variance.
When not to use
A simple two-group comparison — Welch is the direct fix.

Grounded in

  • MacKinnon & White (1985)
  • Long & Ervin (2000)
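
For a single-predictor regression, the HC3 standard error of the slope can be written out by hand, which makes the adjustment easy to see: each squared residual is inflated by 1 / (1 - leverage)^2. This is a sketch for the one-predictor case only, with a hypothetical function name; a real analysis would normally rely on a regression library.

```python
import math
from statistics import fmean

def ols_slope_hc3(x, y):
    """OLS slope of y on x (with intercept) and its HC3 robust standard error."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    meat = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (intercept + slope * xi)
        leverage = 1 / n + (xi - mean_x) ** 2 / sxx  # hat-matrix diagonal
        meat += (xi - mean_x) ** 2 * residual ** 2 / (1 - leverage) ** 2
    robust_se = math.sqrt(meat / sxx ** 2)
    return slope, robust_se
```

High-leverage points get the largest inflation, which is what makes HC3 more conservative than the unadjusted robust estimator in small samples.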

9 methods

Sampling methods

How participants are selected. Probability methods support generalizable inference; non-probability methods are honest about what they can and cannot claim.

Simple random sampling

Sampling method

Every unit has an equal, independent chance of selection.

Inference
Probability-based
When to use
When you need representative inference to a known frame.
When not to use
When no enumerable frame exists.

Grounded in

  • Babbie ch.7
  • Lavrakas

Systematic sampling

Sampling method

Select every k-th unit from an ordered frame after a random start.

Inference
Probability-based
When to use
When you need representative inference to a known frame.
When not to use
When no enumerable frame exists.

Grounded in

  • Babbie ch.7
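
The "every k-th unit after a random start" rule can be sketched in a few lines. The function name systematic_sample is an assumption, and the interval here is simply the frame size divided by the sample size, rounded down, so a few units at the end of the frame may never be reachable.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units: a random start, then every k-th unit of the ordered frame."""
    if n <= 0 or n > len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    rng = random.Random(seed)
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start within the first interval
    return [frame[start + i * k] for i in range(n)]
```

If the frame's ordering has a cycle whose period matches k, the sample can be badly unrepresentative, so it is worth checking how the frame was sorted before relying on this method.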

Stratified sampling

Sampling method

Partition the frame into strata and sample within each (proportional or equal).

Inference
Probability-based
When to use
When you need representative inference to a known frame.
When not to use
When no enumerable frame exists.

Grounded in

  • Babbie ch.7
  • Fowler
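
Proportional allocation, one of the two options this card names, can be sketched as below. The function name, the stratum_of callback, and the rule of taking at least one unit per stratum are illustrative assumptions.

```python
import random

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Proportional stratified sample.

    units: list of sampling units.
    stratum_of: function mapping a unit to its stratum label.
    fraction: sampling fraction applied within every stratum.
    """
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for label in sorted(strata):  # sorted for reproducible draw order
        members = strata[label]
        take = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, take))
    return sample
```

Equal allocation would replace the per-stratum count with a fixed number, at the cost of needing weights when estimating population totals.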

Cluster sampling

Sampling method

Sample whole clusters (e.g. teams/sites), then units within.

Inference
Probability-based
When to use
When you need representative inference to a known frame.
When not to use
When no enumerable frame exists.

Grounded in

  • Babbie ch.7
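
A two-stage version of this method, sketched with the standard library: draw clusters first, then units within each drawn cluster. The function name and the dictionary layout (cluster label mapped to its list of units) are assumptions for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Two-stage cluster sample.

    clusters: dict mapping a cluster label (e.g. a team or site) to its units.
    Stage 1 draws whole clusters; stage 2 draws units inside each one.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        label: rng.sample(clusters[label], min(n_per_cluster, len(clusters[label])))
        for label in chosen
    }
```

Units within a cluster tend to resemble each other, so analyses of such a sample usually need to account for the clustering rather than treat the units as independent.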

Quota sampling

Sampling method

Non-probability: fill target quotas per subgroup (no random selection).

Inference
Non-probability
When to use
When a probability frame is infeasible and you will caveat generalizability.
When not to use
When you need unbiased inference to a population (use a probability method).

Grounded in

  • Babbie ch.7

Convenience sampling

Sampling method

Non-probability: whoever is available — high bias, low generalizability.

Inference
Non-probability
When to use
When a probability frame is infeasible and you will caveat generalizability.
When not to use
When you need unbiased inference to a population (use a probability method).

Grounded in

  • Babbie ch.7

Purposive sampling

Sampling method

Non-probability: deliberately chosen information-rich cases.

Inference
Non-probability
When to use
When a probability frame is infeasible and you will caveat generalizability.
When not to use
When you need unbiased inference to a population (use a probability method).

Grounded in

  • Babbie ch.7
  • Coding Manual

Snowball sampling

Sampling method

Non-probability: referrals chain from initial participants (hidden populations).

Inference
Non-probability
When to use
When a probability frame is infeasible and you will caveat generalizability.
When not to use
When you need unbiased inference to a population (use a probability method).

Grounded in

  • Babbie ch.7

Census

Sampling method

Enumerate the entire frame — no sampling error, but rarely feasible.

Inference
Probability-based
When to use
When you need representative inference to a known frame.
When not to use
When no enumerable frame exists.

Grounded in

  • Babbie ch.7

Put the rigor to work

These methods aren’t just catalogued — they’re shippable.

The statistical tests and decompositions in this catalog are exported as drop-in capabilities for the tools you already use. Browse the statistics shelf of the store, or describe your question and we’ll tell you honestly whether we have a drop-in for it today.

Rigor you can read, on numbers you can trust.

This catalog is the methods discipline the whole toolbox is built on. Bring your question; we’ll bring the right design, the right test, and the citation behind both.