Research Methods & Rigor — plain-language explainer

The Research-Methods layer is the toolbox's catalog of study designs, statistical tests, assumption checks, remedies, and sampling methods — each one grounded in a cited source, naming when it fits and when it doesn't — so every number the platform ships arrives with its method visible instead of hidden behind a clean-looking figure.

A People Analytics Toolbox capability. Built to the portfolio Explainer Standard v1.0. Every claim below is grounded in the real code — the catalog + contract at src/lib/research-methods/ and the public surface at src/app/(marketing)/research-methods/ (the rigor-catalog browser). Anything not yet built is marked (TBD).

1. What good looks like

Good research methods means three things, and they are all checkable, not aspirational:

The right tool is named for the right job — and the wrong job. Every method in the catalog declares the conditions it is right for and, just as load-bearing, the conditions it is wrong for. A t-test with the wrong design behind it is a wrong answer wearing a confident number.
Every claim is cited, never invented. Each catalog entry resolves to a named source in the methods canon. When an assumption is violated, the remedy comes from that canon — not from a model's improvisation.
The catalog is wired to compute. Tests map to real statistical functions and to the chart that renders their result. This is not a glossary of methods you could in principle run; it is the methods the platform actually runs.

The plain version: a dashboard that prints "significant" without an effect size, or runs a test without an assumption check, is hiding the part that decides whether the answer is trustworthy. "Good" is the opposite posture — the method is on the table, beside the number.
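To make "the method is on the table, beside the number" concrete, here is a minimal sketch of a two-group comparison whose result cannot be returned without its method name and effect size. The names TwoGroupResult and compareTwoGroups are hypothetical and are not taken from src/lib/research-methods/. Welch's t and Cohen's d are used purely as an illustration, not as the platform's actual choice.

```typescript
// Hypothetical result shape: the method and the effect size travel with the statistic.
interface TwoGroupResult {
  method: string;
  tStatistic: number;
  degreesOfFreedom: number;
  cohensD: number;
}

function mean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Sample variance with the n - 1 denominator.
function sampleVariance(xs: number[]): number {
  const m = mean(xs);
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t statistic plus Cohen's d (pooled standard deviation).
// Note: groups with zero variance produce non-finite values in this sketch.
function compareTwoGroups(a: number[], b: number[]): TwoGroupResult {
  if (a.length < 2 || b.length < 2) {
    throw new Error("Each group needs at least two observations.");
  }
  const na = a.length;
  const nb = b.length;
  const ma = mean(a);
  const mb = mean(b);
  const va = sampleVariance(a);
  const vb = sampleVariance(b);

  // Squared standard errors of each group mean.
  const sa = va / na;
  const sb = vb / nb;
  const tStatistic = (ma - mb) / Math.sqrt(sa + sb);

  // Welch-Satterthwaite approximation for the degrees of freedom.
  const degreesOfFreedom =
    (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1));

  // Effect size: mean difference in pooled-standard-deviation units.
  const pooledSd = Math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2));
  const cohensD = (ma - mb) / pooledSd;

  return { method: "Welch's t-test", tStatistic, degreesOfFreedom, cohensD };
}

const result = compareTwoGroups([2, 4, 6, 8], [1, 3, 5, 7]);
console.log(result);
```

The design point is the return type: a caller cannot get the statistic without also getting the method name and the effect size, so a dashboard built on it has no clean-looking figure to print on its own.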

2. Why keep checking

The reasons rigor has to be a standing discipline, not a one-time setup:

A number's trustworthiness lives in its method, and the method is invisible by default. A p-value with no effect size, a test whose assumptions were never checked, a study design that can't actually support the causal claim laid on it — each produces a figure that looks identical to a trustworthy one. The only way to tell them apart is to keep the method exposed.
Power is a decision made before the data exists. Statistical power has to be considered before collection — afterward it is too late to fix an underpowered study. A catalog that surfaces sampling and power up front is a catalog that catches the error while it is still cheap.
Assumptions get violated in real workforce data constantly. Skew, non-independence, small cells, missingness — the assumption checks fail often, and each failure has a cited remedy. Keeping the checks running is how the violation surfaces instead of silently poisoning the result.
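The "power is decided before the data exists" point can be sketched as a sample-size calculation run before collection. This is a minimal illustration, assuming a two-sided, two-group comparison of means and the textbook normal approximation n = 2 * ((z_alpha + z_power) / d)^2 per group. The function name requiredNPerGroup and the small quantile tables are hypothetical, not part of the toolbox, and an exact t-based calculation would give slightly larger numbers.

```typescript
// Standard normal quantiles for a few common settings.
// Keys are the two-sided alpha and the target power, as strings.
const Z_ALPHA_TWO_SIDED = new Map<string, number>([
  ["0.05", 1.959964],
  ["0.01", 2.575829],
]);
const Z_POWER = new Map<string, number>([
  ["0.8", 0.841621],
  ["0.9", 1.281552],
]);

// Approximate participants needed per group to detect a standardized
// mean difference (Cohen's d) with the given alpha and power.
function requiredNPerGroup(effectSizeD: number, alpha = 0.05, power = 0.8): number {
  const zAlpha = Z_ALPHA_TWO_SIDED.get(String(alpha));
  const zPower = Z_POWER.get(String(power));
  if (zAlpha === undefined || zPower === undefined) {
    throw new Error("This sketch only supports alpha 0.05/0.01 and power 0.8/0.9.");
  }
  if (!(effectSizeD > 0)) {
    throw new Error("Effect size must be a positive number.");
  }
  return Math.ceil(2 * ((zAlpha + zPower) / effectSizeD) ** 2);
}

// Smaller expected effects need far larger samples.
for (const d of [0.2, 0.5, 0.8]) {
  console.log(`d = ${d}: about ${requiredNPerGroup(d)} per group`);
}
```

Running this before collection shows the cost of a small expected effect while the sample size can still be changed. Running it afterward only confirms that an underpowered study was underpowered.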

3. What the problem is — and why it matters

The pain it removes: most analytics tools give you a number and hide the work behind it. The study design, the test choice, the assumption checks, the effect size, the citation — the entire chain that determines whether the answer is trustworthy — is collapsed into a single confident figure on a dashboard. When that figure is wrong, nothing about it looks wrong.

Why it matters: people-analytics decisions move pay, promotions, headcount, and careers. A "significant" finding with no effect size can justify a program that does nothing; an underpowered study can declare "no difference" where a real one exists; the wrong study design can dress up correlation as cause. These are not abstract statistical sins — they are the difference between a defensible decision and an indefensible one.

The shift, stated plainly:

FROM a dashboard that prints a number and asks you to trust it.
TO a browsable catalog where every method names its fit, its anti-fit, and its cited source — and the tests are wired to the real functions that compute them and the charts that render them.

How it differs from the alternatives: a generic BI tool gives you the number with no method; a statistics textbook gives you the method with no wiring to your data. The Research-Methods layer takes the one posture that is both cited like a textbook and executable like a tool: the rigor flag flown on the surface, not buried in an appendix.

4. Where it fits in the toolbox

Data flow and dependencies:

Consumes — the methods canon (the cited sources that ground each entry) and the toolbox's own statistical primitives (the functions a test maps to) in calculus and the factor/validity spokes.
Emits — the MethodEntry catalog shape (kind: study design / test / assumption / remedy / sampling, plus when-to-use, when-not-to-use, tags, and cited sources), defined in src/lib/research-methods/contract.ts and populated in catalog.ts. The same pattern the visualization catalog uses for charts.
Surfaces — the public /research-methods rigor-catalog browser (filterable by kind and tag, URL-reflected so a filtered view is linkable) and the recommend-design helper that maps a question to the appropriate design.
Feeds — the rigor flag flown across the whole platform: every spoke that runs a test, every analytic that reports a finding, draws on this layer for the method-and-citation discipline.
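The catalog shape and the linkable, filtered browser view described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of contract.ts or catalog.ts: the field names (whenToUse, whenNotToUse, sources), the query parameter names (kind, tag), the helper names, and the two sample entries with placeholder citations are all hypothetical. Only the five kinds and the list of fields come from the description above.

```typescript
// The five kinds named in the catalog description; the string spellings are assumed.
const METHOD_KINDS = ["study-design", "test", "assumption", "remedy", "sampling"] as const;
type MethodKind = (typeof METHOD_KINDS)[number];

// Assumed field names for the MethodEntry shape.
interface MethodEntry {
  id: string;
  name: string;
  kind: MethodKind;
  whenToUse: string[];
  whenNotToUse: string[];
  tags: string[];
  sources: string[]; // cited sources from the methods canon
}

interface CatalogFilter {
  kind?: MethodKind;
  tag?: string;
}

function isMethodKind(value: string): value is MethodKind {
  return (METHOD_KINDS as readonly string[]).includes(value);
}

// Read a filter from a URL query string; unknown kinds are ignored.
function filterFromQuery(query: string): CatalogFilter {
  const params = new URLSearchParams(query);
  const filter: CatalogFilter = {};
  const kind = params.get("kind");
  const tag = params.get("tag");
  if (kind !== null && isMethodKind(kind)) filter.kind = kind;
  if (tag) filter.tag = tag;
  return filter;
}

// Write the filter back to a query string so the filtered view is linkable.
function filterToQuery(filter: CatalogFilter): string {
  const params = new URLSearchParams();
  if (filter.kind) params.set("kind", filter.kind);
  if (filter.tag) params.set("tag", filter.tag);
  return params.toString();
}

function applyFilter(entries: MethodEntry[], filter: CatalogFilter): MethodEntry[] {
  return entries.filter(
    (entry) =>
      (filter.kind === undefined || entry.kind === filter.kind) &&
      (filter.tag === undefined || entry.tags.includes(filter.tag)),
  );
}

// Illustrative entries only; the citations are placeholders, not real catalog data.
const sampleCatalog: MethodEntry[] = [
  {
    id: "independent-t-test",
    name: "Independent-samples t-test",
    kind: "test",
    whenToUse: ["Comparing the means of two independent groups"],
    whenNotToUse: ["Paired or repeated measurements on the same people"],
    tags: ["two-group"],
    sources: ["(placeholder citation)"],
  },
  {
    id: "simple-random-sampling",
    name: "Simple random sampling",
    kind: "sampling",
    whenToUse: ["A complete sampling frame is available"],
    whenNotToUse: ["Small subgroups must be guaranteed representation"],
    tags: ["probability-sampling"],
    sources: ["(placeholder citation)"],
  },
];

const filter = filterFromQuery("kind=test&tag=two-group");
console.log(applyFilter(sampleCatalog, filter).map((entry) => entry.name));
console.log(filterToQuery(filter));
```

Keeping the filter in the query string, rather than in component state, is what makes a filtered view something you can paste into a message and have the recipient see the same entries.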

The honest rail: the catalog states what each method is right and wrong for, and grounds each entry in a named source. But a cited method is only a sound method in general; it is not automatically the right one for your specific dataset. The catalog points you to the right tool; getting the answer right still depends on the design behind the data. We surface the method so you can judge that yourself, rather than asking you to trust the number blind.