{"id":981,"date":"2026-02-16T08:38:48","date_gmt":"2026-02-16T08:38:48","guid":{"rendered":"https:\/\/aiopsschool.com\/blog\/regression-discontinuity\/"},"modified":"2026-02-17T15:15:05","modified_gmt":"2026-02-17T15:15:05","slug":"regression-discontinuity","status":"publish","type":"post","link":"https:\/\/aiopsschool.com\/blog\/regression-discontinuity\/","title":{"rendered":"What is regression discontinuity? Meaning, Architecture, Examples, Use Cases, and How to Measure It (2026 Guide)"},"content":{"rendered":"\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Quick Definition (30\u201360 words)<\/h2>\n\n\n\n<p>Regression discontinuity is a quasi-experimental design that estimates causal effects by exploiting a cutoff or threshold in an assignment variable. Analogy: like comparing students just above and just below a test passing score to infer the effect of passing. Formal line: it estimates local treatment effects at the discontinuity using continuity assumptions on potential outcomes.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">What is regression discontinuity?<\/h2>\n\n\n\n<p>Regression discontinuity (RD) is a statistical design used to estimate causal effects when treatment assignment is determined by whether an observed running variable crosses a specific threshold. It is not a randomized experiment, though under certain assumptions it can yield estimates comparable to randomized controlled trials near the cutoff.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What it is:<\/li>\n<li>A quasi-experimental causal inference technique.<\/li>\n<li>Uses a running variable and a deterministic cutoff to define treated vs control.<\/li>\n<li>\n<p>Estimates the local average treatment effect at the discontinuity.<\/p>\n<\/li>\n<li>\n<p>What it is NOT:<\/p>\n<\/li>\n<li>Not a global causal estimator across the entire distribution of the running variable.<\/li>\n<li>Not valid if agents can precisely manipulate the running variable around the cutoff.<\/li>\n<li>\n<p>Not an automatic replacement for randomized trials; assumptions must be assessed.<\/p>\n<\/li>\n<li>\n<p>Key properties and constraints:<\/p>\n<\/li>\n<li>Requires a clear running variable and a known cutoff.<\/li>\n<li>Requires continuity in the potential outcomes with respect to the running variable in absence of treatment.<\/li>\n<li>Sensitive to bandwidth choice, functional form, and covariate balance near the cutoff.<\/li>\n<li>\n<p>Typically estimates local treatment effects at the threshold, not average treatment effects away from it.<\/p>\n<\/li>\n<li>\n<p>Where it fits in modern cloud\/SRE workflows:<\/p>\n<\/li>\n<li>A\/B testing replacement when randomization is infeasible but an allocation cutoff exists.<\/li>\n<li>Evaluating feature-flags, rollout policies, or policy thresholds enacted in production.<\/li>\n<li>Informing incident response policies by measuring effects of threshold-based interventions.<\/li>\n<li>\n<p>Used by data platforms, MLOps teams, and SREs to estimate causal effects from telemetry when rollout uses thresholds or gating.<\/p>\n<\/li>\n<li>\n<p>A text-only diagram description readers can visualize:<\/p>\n<\/li>\n<li>Imagine a scatterplot of outcome Y on vertical axis and running variable X on horizontal axis.<\/li>\n<li>At a vertical line X = c there is a treatment assignment switch.<\/li>\n<li>Two regression lines are fit on either side of X = c and the vertical gap at c is the RD estimate.<\/li>\n<li>Smooth horizontal continuity 
\n\n\n\n<h3 class=\"wp-block-heading\">regression discontinuity in one sentence<\/h3>\n\n\n\n<p>Regression discontinuity estimates causal effects by comparing outcomes immediately on either side of a deterministic cutoff in an assignment variable under an assumption of continuity in potential outcomes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">regression discontinuity vs related terms<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>ID<\/th>\n<th>Term<\/th>\n<th>How it differs from regression discontinuity<\/th>\n<th>Common confusion<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>T1<\/td>\n<td>Randomized Controlled Trial<\/td>\n<td>Uses random assignment not a cutoff<\/td>\n<td>Confused as equivalent in internal validity<\/td>\n<\/tr>\n<tr>\n<td>T2<\/td>\n<td>Difference-in-Differences<\/td>\n<td>Relies on parallel trends over time not a threshold<\/td>\n<td>Mistaken for time-based RD<\/td>\n<\/tr>\n<tr>\n<td>T3<\/td>\n<td>Instrumental Variables<\/td>\n<td>Uses external instruments not deterministic cutoffs<\/td>\n<td>Thinking any instrument is an RD<\/td>\n<\/tr>\n<tr>\n<td>T4<\/td>\n<td>Matching<\/td>\n<td>Matches units across covariates not using a running variable<\/td>\n<td>Believing matching fixes RD assumptions<\/td>\n<\/tr>\n<tr>\n<td>T5<\/td>\n<td>Threshold experiments<\/td>\n<td>Can be randomized or adaptive unlike deterministic RD<\/td>\n<td>Using term interchangeably with RD<\/td>\n<\/tr>\n<tr>\n<td>T6<\/td>\n<td>Local Average Treatment Effect<\/td>\n<td>LATE is broader; RD yields local causal estimates<\/td>\n<td>Assuming RD gives global ATE<\/td>\n<\/tr>\n<tr>\n<td>T7<\/td>\n<td>Propensity Score Methods<\/td>\n<td>Model propensity not deterministic cutoff<\/td>\n<td>Confusion on treatment assignment mechanism<\/td>\n<\/tr>\n<tr>\n<td>T8<\/td>\n<td>Interrupted Time Series<\/td>\n<td>Uses time discontinuities not cross-sectional cutoffs<\/td>\n<td>Mistaking time-based jump for RD<\/td>\n<\/tr>\n<tr>\n<td>T9<\/td>\n<td>Regression Kink Design<\/td>\n<td>Uses slope change not level change at cutoff<\/td>\n<td>Thinking slope and level discontinuities are same<\/td>\n<\/tr>\n<tr>\n<td>T10<\/td>\n<td>Bayesian Causal Models<\/td>\n<td>Different inference approach; RD is design not only inference<\/td>\n<td>Presuming inference framework equals design<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Row Details (only if any cell says \u201cSee details below\u201d)<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>None<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Why does regression discontinuity matter?<\/h2>\n\n\n\n<p>Regression discontinuity matters because it provides a credible way to infer causal effects when you cannot randomize and when a policy, rule, or system enforces assignment by threshold.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Business impact:<\/li>\n<li>Revenue: Understand whether price thresholds, discount cutoffs, or eligibility rules cause revenue jumps.<\/li>\n<li>Trust: Validate that gating policies (e.g., verification thresholds) actually improve outcomes without harming users.<\/li>\n<li>\n<p>Risk: Identify unintended consequences of binary thresholds that could create churn or fraud windows.<\/p>\n<\/li>\n<li>\n<p>Engineering impact:<\/p>\n<\/li>\n<li>Incident reduction: Quantify the effectiveness of threshold-based mitigations in reducing error rates or 
system load.<\/li>\n<li>Velocity: Use RD to validate feature gates and incremental rollouts that depend on thresholds, reducing rollout risk.<\/li>\n<li>\n<p>Cost: Determine whether resource caps produce desired savings without degrading performance.<\/p>\n<\/li>\n<li>\n<p>SRE framing:<\/p>\n<\/li>\n<li>SLIs\/SLOs: RD can evaluate the causal effect of an operational change triggered by a threshold on SLIs.<\/li>\n<li>Error budgets: Use RD to attribute SLI changes to threshold changes and allocate error budget burn.<\/li>\n<li>\n<p>Toil\/on-call: Measure whether threshold-based automation reduces manual interventions and paging.<\/p>\n<\/li>\n<li>\n<p>Realistic \u201cwhat breaks in production\u201d examples:\n  1. A rate limiter that switches behavior at 1000 requests per minute causes latency to spike for users just above the limit because load shedding behavior differs.\n  2. A pricing tier with a usage threshold causes users just above the threshold to churn more than those just below.\n  3. An automated rollback that triggers when CPU exceeds 80% leads to oscillations near the cutoff as autoscaling interacts with rollback.\n  4. A verification gate that allows accounts with score &gt;= 70 to access a feature increases fraud if the scoring is gamable near the cutoff.\n  5. A serverless cold-start optimization that toggles at a concurrency threshold creates different tail-latency behavior around the cutoff.<\/p>\n<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Where is regression discontinuity used?<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>ID<\/th>\n<th>Layer\/Area<\/th>\n<th>How regression discontinuity appears<\/th>\n<th>Typical telemetry<\/th>\n<th>Common tools<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>L1<\/td>\n<td>Edge and CDN<\/td>\n<td>Rate limit or geo rule with cutoff by header or IP score<\/td>\n<td>request rate latency 429 rate<\/td>\n<td>WAF metrics CDN logs<\/td>\n<\/tr>\n<tr>\n<td>L2<\/td>\n<td>Network and Load Balancer<\/td>\n<td>Health threshold routing uses cutoff on health score<\/td>\n<td>connection errors latency drops<\/td>\n<td>LB metrics network traces<\/td>\n<\/tr>\n<tr>\n<td>L3<\/td>\n<td>Service and Application<\/td>\n<td>Feature gate enables at score or ID cutoff<\/td>\n<td>request success latency user events<\/td>\n<td>Feature flag platform APM<\/td>\n<\/tr>\n<tr>\n<td>L4<\/td>\n<td>Data and ML<\/td>\n<td>Model score cutoff for classification or eligibility<\/td>\n<td>score distribution precision recall<\/td>\n<td>Model monitoring data pipelines<\/td>\n<\/tr>\n<tr>\n<td>L5<\/td>\n<td>Platform and Orchestration<\/td>\n<td>Autoscaler threshold or pod eviction policy cutoff<\/td>\n<td>cpu mem pod restarts<\/td>\n<td>Kubernetes metrics Helm charts<\/td>\n<\/tr>\n<tr>\n<td>L6<\/td>\n<td>Cloud Cost and Quota<\/td>\n<td>Budget thresholds trigger throttling or alerts<\/td>\n<td>spend rate quota hits<\/td>\n<td>Cloud billing metrics alerts<\/td>\n<\/tr>\n<tr>\n<td>L7<\/td>\n<td>CI\/CD and Deployment<\/td>\n<td>Promotion criteria based on test scores or canary metrics<\/td>\n<td>test pass rate canary stats<\/td>\n<td>CI metrics git events<\/td>\n<\/tr>\n<tr>\n<td>L8<\/td>\n<td>Security and IAM<\/td>\n<td>Risk score cutoff for MFA or access<\/td>\n<td>login success failures anomalies<\/td>\n<td>SIEM auth logs policy tools<\/td>\n<\/tr>\n<tr>\n<td>L9<\/td>\n<td>Observability and Alerts<\/td>\n<td>Alerting rules with thresholds define pages<\/td>\n<td>alert count 
latency error spikes<\/td>\n<td>Monitoring systems alerting tools<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Row Details (only if needed)<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>None<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">When should you use regression discontinuity?<\/h2>\n\n\n\n<p>When to use RD depends on whether the assignment mechanism naturally produces a cutoff and whether assumptions can be justified.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>When it\u2019s necessary:<\/li>\n<li>When treatment is assigned strictly by a known threshold and randomization is impossible.<\/li>\n<li>When you need causal estimates localized at the cutoff (e.g., policy evaluation at eligibility threshold).<\/li>\n<li>\n<p>When operational constraints enforce threshold-based rollouts or gating.<\/p>\n<\/li>\n<li>\n<p>When it\u2019s optional:<\/p>\n<\/li>\n<li>When you have randomization but prefer RD due to implementation simplicity.<\/li>\n<li>\n<p>When multiple quasi-experimental designs are possible and RD offers simpler diagnostics.<\/p>\n<\/li>\n<li>\n<p>When NOT to use \/ overuse it:<\/p>\n<\/li>\n<li>Do not use RD when treatment can be precisely manipulated by agents near the cutoff.<\/li>\n<li>Do not use RD when you need global ATE across the running variable.<\/li>\n<li>\n<p>Avoid RD when data density is sparse near the cutoff or when measurement error in running variable is high.<\/p>\n<\/li>\n<li>\n<p>Decision checklist:<\/p>\n<\/li>\n<li>If treatment is assigned by a clear cutoff AND running variable is not manipulable -&gt; Use RD.<\/li>\n<li>If you need global average effects OR assignment is probabilistic -&gt; Consider RCT or IV.<\/li>\n<li>\n<p>If data density near cutoff is low -&gt; Collect more data or consider alternative designs.<\/p>\n<\/li>\n<li>\n<p>Maturity ladder:<\/p>\n<\/li>\n<li>Beginner: Visual checks and simple local linear RD with fixed bandwidth.<\/li>\n<li>Intermediate: Data-driven bandwidth selection, covariate balance checks, robustness to polynomial order.<\/li>\n<li>Advanced: Fuzzy RD, RD with multiple cutoffs, heterogeneous effect estimation, automated pipelines for RD in production telemetry.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">How does regression discontinuity work?<\/h2>\n\n\n\n<p>RD works by comparing outcomes for units just below and just above a cutoff, assuming that without treatment units would be smoothly related to the running variable. The discontinuity at the cutoff is interpreted as the causal effect.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>\n<p>Components and workflow:\n  1. Running variable X and known cutoff c define treatment D = 1[X &gt;= c] or 1[X &gt; c].\n  2. Outcome Y measured for units across X near c.\n  3. Pre-estimation diagnostics: density test, covariate continuity, visualization.\n  4. Choose bandwidth h and fit local regressions on either side of c.\n  5. Estimate the jump at c. For fuzzy RD, estimate using ratio of jumps (Wald estimator).\n  6. Robustness checks: alternative bandwidths, polynomial orders, placebo cutoffs (see the sketch after this list).<\/p>\n<\/li>\n<li>\n<p>Data flow and lifecycle:<\/p>\n<\/li>\n<li>Collection: capture running variable and outcome from logs or databases.<\/li>\n<li>Preprocessing: clean, align timestamps, validate running variable precision, compute treatment indicator.<\/li>\n<li>Analysis: visualize scatterplot with polynomial fits, compute RD estimate with standard errors.<\/li>\n<li>Production integration: map RD analysis into dashboards and SLO evaluation if threshold-based policies are operational.<\/li>\n<li>\n<p>Monitoring: automate diagnostics to detect manipulation and distribution shifts.<\/p>\n<\/li>\n<li>\n<p>Edge cases and failure modes:<\/p>\n<\/li>\n<li>Manipulation at cutoff: agents gaming the running variable produces invalid estimates.<\/li>\n<li>Measurement error: noisy running variable blurs the cutoff and biases estimates.<\/li>\n<li>Sparse data near cutoff: high variance and weak inference.<\/li>\n<li>Nonlinear trends: polynomial mis-specification can bias results.<\/li>\n<li>Discontinuous covariates: if covariates also jump at cutoff, interpretation is complicated.<\/li>\n<\/ul>
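\n\n\n\n<p>A minimal sketch of steps 4 to 6 in plain NumPy, assuming a sharp design: a triangular-kernel local linear fit on each side of the cutoff, with the jump re-estimated across several bandwidths as a robustness pass. The helper name rd_estimate and the simulated inputs are illustrative assumptions; for production analyses, established RD packages with bias-corrected inference are preferable.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\n\ndef rd_estimate(x, y, c, h):\n    '''Local linear RD: triangular-kernel weighted line on each side of c;\n    returns the estimated jump in y at the cutoff.'''\n    limits = []\n    for side in (x &lt; c, x &gt;= c):\n        m = side &amp; (np.abs(x - c) &lt; h)\n        k = 1 - np.abs(x[m] - c) \/ h        # triangular kernel weights\n        coef = np.polyfit(x[m] - c, y[m], 1, w=np.sqrt(k))\n        limits.append(coef[-1])              # intercept = fitted value at c\n    return limits[1] - limits[0]             # right limit minus left limit\n\n# Step 6: recompute the jump across bandwidths; estimates should be stable.\nrng = np.random.default_rng(1)\nx = rng.uniform(-1, 1, 5000)\ny = 0.5 * x + 1.5 * (x &gt;= 0) + rng.normal(0, 1, x.size)  # true jump = 1.5\nfor h in (0.1, 0.2, 0.3, 0.4):\n    print(h, round(rd_estimate(x, y, 0.0, h), 3))<\/code><\/pre>\n\n\n\n<p>For a fuzzy design, the same local fits are applied to treatment receipt as well, and the outcome jump is divided by the take-up jump (a Wald ratio, sketched later in the FAQ section).<\/p>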
\n\n\n\n<h3 class=\"wp-block-heading\">Typical architecture patterns for regression discontinuity<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p>Offline Analysis Pipeline\n   &#8211; Batch ETL -&gt; RD notebook or statistical script -&gt; report.\n   &#8211; Use when experiments are ad hoc or post-hoc policy evaluations.<\/p>\n<\/li>\n<li>\n<p>Streaming Telemetry RD\n   &#8211; Real-time metrics ingestion -&gt; windowed RD diagnostics -&gt; alert on discontinuity changes.\n   &#8211; Use when thresholds drive live system behavior and near-real-time monitoring is needed.<\/p>\n<\/li>\n<li>\n<p>Integrated Feature-Flag RD\n   &#8211; Feature flag system records running variable and assignment -&gt; automated RD computation per rollout segment.\n   &#8211; Use for incremental rollout and safety gates.<\/p>\n<\/li>\n<li>\n<p>Fuzzy RD with Instrumentation\n   &#8211; Instrument both assignment encouragement and actual treatment receipt; estimate via two-stage approach.\n   &#8211; Use when compliance is imperfect, e.g., enrollment offers accepted by subset.<\/p>\n<\/li>\n<li>\n<p>Multi-cutoff RD\n   &#8211; Evaluate multiple thresholds across geographies or cohorts and pool estimates with hierarchical models.\n   &#8211; Use for platform-wide policy with many local cutoffs.<\/p>\n<\/li>\n<li>\n<p>Model-based RD in ML pipelines\n   &#8211; Combine RD identification with supervised models to estimate heterogeneous treatment effects local to cutoffs.\n   &#8211; Use for personalized policies where cutoff effects vary.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Failure modes &amp; mitigation<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>ID<\/th>\n<th>Failure mode<\/th>\n<th>Symptom<\/th>\n<th>Likely cause<\/th>\n<th>Mitigation<\/th>\n<th>Observability signal<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>F1<\/td>\n<td>Precise manipulation<\/td>\n<td>Jump in density at cutoff<\/td>\n<td>Agents adjusting running variable<\/td>\n<td>Use density tests exclude manipulated region<\/td>\n<td>density histogram spike<\/td>\n<\/tr>\n<tr>\n<td>F2<\/td>\n<td>Measurement error<\/td>\n<td>Blurred discontinuity<\/td>\n<td>Noisy running variable<\/td>\n<td>Improve instrumentation use fuzzy RD<\/td>\n<td>increased variance near cutoff<\/td>\n<\/tr>\n<tr>\n<td>F3<\/td>\n<td>Sparse 
data<\/td>\n<td>Wide CIs unstable estimates<\/td>\n<td>Low sample count near cutoff<\/td>\n<td>Aggregate more data widen window<\/td>\n<td>few samples per bin<\/td>\n<\/tr>\n<tr>\n<td>F4<\/td>\n<td>Mis-specified polynomial<\/td>\n<td>Contradictory estimates by order<\/td>\n<td>Wrong functional form choice<\/td>\n<td>Use local linear or data-driven bandwidth<\/td>\n<td>model residual patterns<\/td>\n<\/tr>\n<tr>\n<td>F5<\/td>\n<td>Covariate imbalance<\/td>\n<td>Covariate jumps at cutoff<\/td>\n<td>Confounded assignment or sorting<\/td>\n<td>Control for covariates check robustness<\/td>\n<td>covariate discontinuity plot<\/td>\n<\/tr>\n<tr>\n<td>F6<\/td>\n<td>Spillover effects<\/td>\n<td>Effect appears away from cutoff<\/td>\n<td>Treatment affects neighbors<\/td>\n<td>Model spatial spillovers exclude affected units<\/td>\n<td>outcome changes away from c<\/td>\n<\/tr>\n<tr>\n<td>F7<\/td>\n<td>Multiple cutoffs<\/td>\n<td>Confusion which cutoff matters<\/td>\n<td>Policy changes across cohorts<\/td>\n<td>Analyze per-cutoff pool with meta-analysis<\/td>\n<td>inconsistent jumps per group<\/td>\n<\/tr>\n<tr>\n<td>F8<\/td>\n<td>Fuzzy compliance<\/td>\n<td>Partial takeup reduces jump<\/td>\n<td>Imperfect treatment receipt<\/td>\n<td>Use IV\/Wald estimator instrumenting assignment<\/td>\n<td>jump in assignment but smaller in receipt<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Row Details (only if needed)<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>None<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Key Concepts, Keywords &amp; Terminology for regression discontinuity<\/h2>\n\n\n\n<p>Term \u2014 1\u20132 line definition \u2014 why it matters \u2014 common pitfall<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Running variable \u2014 Observed variable used to assign treatment via cutoff \u2014 Central to RD design \u2014 Measurement error biases results.<\/li>\n<li>Cutoff \/ Threshold \u2014 The numerical point separating treated and control \u2014 Defines local comparison \u2014 Ambiguous cutoff invalidates analysis.<\/li>\n<li>Treatment assignment \u2014 Rule mapping running variable to treatment \u2014 Determines causal contrast \u2014 Unobserved heterogeneity can confound.<\/li>\n<li>Local Average Treatment Effect \u2014 Effect estimated at the cutoff \u2014 Provides credible causal estimate \u2014 Not generalizable away from cutoff.<\/li>\n<li>Sharp RD \u2014 Perfect compliance with deterministic cutoff \u2014 Simpler inference \u2014 Rare in practice when compliance imperfect.<\/li>\n<li>Fuzzy RD \u2014 Assignment affects probability of treatment not perfect compliance \u2014 Requires IV-style estimation \u2014 Needs strong instrument assumptions.<\/li>\n<li>Bandwidth \u2014 Range around cutoff used for estimation \u2014 Balances bias and variance \u2014 Wrong bandwidth leads to bias or noisy estimates.<\/li>\n<li>Local linear regression \u2014 Linear fit on each side within bandwidth \u2014 Preferred for boundary problems \u2014 Higher polynomials can overfit.<\/li>\n<li>Polynomial RD \u2014 Higher-order polynomials for fitting \u2014 Can model curvature \u2014 Risk of spurious oscillation near boundary.<\/li>\n<li>Covariate continuity \u2014 Covariates should be smooth across cutoff absent treatment \u2014 Key validity check \u2014 Discontinuities suggest confounding.<\/li>\n<li>McCrary density test \u2014 Test for manipulation in running variable density at cutoff \u2014 Detects sorting \u2014 Not 
definitive proof.<\/li>\n<li>Placebo cutoff \u2014 Test at other cutoffs where no treatment should occur \u2014 Robustness check \u2014 Multiple testing concerns.<\/li>\n<li>Heterogeneous treatment effects \u2014 Effects vary across subgroups \u2014 Explains differential impacts \u2014 Requires enough data for subgroup analysis.<\/li>\n<li>Bandwidth selection rule \u2014 Data-driven method to choose h \u2014 Improves estimator properties \u2014 Different selectors may disagree.<\/li>\n<li>Robust standard errors \u2014 SEs adjusted for heteroskedasticity or clustering \u2014 Provides reliable inference \u2014 Ignoring clustering underestimates SEs.<\/li>\n<li>Clustering \u2014 Correlated observations within groups \u2014 Affects inference \u2014 Cluster at appropriate level for valid CIs.<\/li>\n<li>Kernel weighting \u2014 Weighting scheme across bandwidth (triangular, uniform) \u2014 Affects estimator efficiency \u2014 Mis-specified kernel can affect bias.<\/li>\n<li>Continuity assumption \u2014 Potential outcomes are continuous at cutoff absent treatment \u2014 Fundamental to identification \u2014 Unverifiable but testable via covariates.<\/li>\n<li>Donut RD \u2014 Excluding observations very near cutoff to avoid manipulation \u2014 Mitigates manipulation bias \u2014 Reduces precision.<\/li>\n<li>Falsification test \u2014 Tests that should hold if RD is valid (e.g., covariate continuity) \u2014 Increases credibility \u2014 Multiple tests inflate false positives.<\/li>\n<li>Wald estimator \u2014 Ratio estimator for fuzzy RD \u2014 Provides complier average effect \u2014 Sensitive to weak first stage.<\/li>\n<li>First stage \u2014 Effect of assignment on treatment receipt in fuzzy RD \u2014 Strong first stage required \u2014 Weak first stage leads to weak instrument issues.<\/li>\n<li>Compliance \u2014 Whether assigned units comply with treatment \u2014 Determines sharp vs fuzzy RD \u2014 Noncompliance complicates estimation.<\/li>\n<li>Local randomization approach \u2014 Treats units close to cutoff as randomized \u2014 Alternative inference method \u2014 Requires small window assumption.<\/li>\n<li>External validity \u2014 Extent to which RD estimate generalizes away from cutoff \u2014 Often limited \u2014 Beware over-extrapolation.<\/li>\n<li>Manipulation \/ Sorting \u2014 Strategic movement across cutoff \u2014 Threatens identification \u2014 Use density and balance checks.<\/li>\n<li>Measurement precision \u2014 Granularity of running variable measurement \u2014 Coarse measurement can create bunching \u2014 Can mask continuous assignment.<\/li>\n<li>Multiple testing \u2014 Repeated hypothesis tests across places or subgroups \u2014 Can produce false positives \u2014 Adjust p-values or present confidence intervals.<\/li>\n<li>Meta-analysis of RD \u2014 Pooling RD estimates across many cutoffs \u2014 Provides broader picture \u2014 Requires consistency in design.<\/li>\n<li>Covariate adjustment \u2014 Including covariates in RD regression \u2014 Can improve precision \u2014 Must be pre-specified to avoid p-hacking.<\/li>\n<li>Cross-validation \u2014 Data-driven selection of model\/hyperparameters \u2014 Helpful for bandwidth\/order \u2014 Risk of overfitting if misused.<\/li>\n<li>Pre-trend \u2014 Lack of pre-treatment trend in time-based designs \u2014 Not necessarily relevant to cross-sectional RD \u2014 Misapplied from DiD thinking.<\/li>\n<li>Power calculation \u2014 Estimating sample needed to detect effect \u2014 Important for planning \u2014 Local effects require many observations near cutoff.<\/li>\n<li>Placebo outcomes \u2014 Outcomes that should not be affected by treatment \u2014 Used for falsification \u2014 Negative results strengthen claims.<\/li>\n<li>RD estimator \u2014 Statistical estimator used to compute discontinuity \u2014 Choice affects bias\/variance \u2014 Robust methods recommended.<\/li>\n<li>Heteroskedasticity \u2014 Non-constant variance across observations \u2014 Affects SEs \u2014 Use robust SEs.<\/li>\n<li>Bandwidth sensitivity check \u2014 Running RD with different h to assess robustness \u2014 Standard robustness procedure \u2014 Conflicting results indicate fragility.<\/li>\n<li>Local randomization inference \u2014 Permutation-based p-values within small window \u2014 Nonparametric alternative \u2014 Requires treating window as randomized.<\/li>\n<li>Regression Kink Design \u2014 Uses discontinuities in slope of policy at cutoff \u2014 Similar idea but different estimand \u2014 Not interchangeable with level RD.<\/li>\n<li>Implementation diagnostics \u2014 Suite of tests to verify RD assumptions \u2014 Makes results credible \u2014 Common pitfall is selective reporting.<\/li>\n<\/ol>
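\n\n\n\n<p>As a companion to the McCrary entry above, the sketch below is a crude first-pass manipulation screen: it compares raw observation counts in narrow windows on each side of the cutoff with a binomial test. It assumes SciPy is available and is not the McCrary estimator itself, which fits local densities rather than comparing counts.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import numpy as np\nfrom scipy import stats\n\ndef density_jump_check(x, c, h):\n    '''Count observations just below and just above the cutoff c within\n    a window h; under a smooth density the split should be near 50\/50.'''\n    below = int(np.sum((x &gt;= c - h) &amp; (x &lt; c)))\n    above = int(np.sum((x &gt;= c) &amp; (x &lt; c + h)))\n    p = stats.binomtest(above, above + below, 0.5).pvalue\n    return below, above, p\n\nx = np.random.default_rng(3).normal(0, 1, 10000)   # un-manipulated scores\nprint(density_jump_check(x, c=0.5, h=0.1))         # large p: no bunching<\/code><\/pre>\n\n\n\n<p>A small p-value here is a reason to plot the density and run the real McCrary test, not proof of manipulation on its own.<\/p>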
\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">How to Measure regression discontinuity (Metrics, SLIs, SLOs)<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>ID<\/th>\n<th>Metric\/SLI<\/th>\n<th>What it tells you<\/th>\n<th>How to measure<\/th>\n<th>Starting target<\/th>\n<th>Gotchas<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>M1<\/td>\n<td>Local jump in outcome<\/td>\n<td>Estimated causal effect at cutoff<\/td>\n<td>Local regression difference at cutoff<\/td>\n<td>Varies by context<\/td>\n<td>Sensitive to bandwidth<\/td>\n<\/tr>\n<tr>\n<td>M2<\/td>\n<td>Density discontinuity<\/td>\n<td>Evidence of manipulation in running var<\/td>\n<td>McCrary test or density plot<\/td>\n<td>No significant jump<\/td>\n<td>Low power with sparse data<\/td>\n<\/tr>\n<tr>\n<td>M3<\/td>\n<td>Covariate continuity<\/td>\n<td>Balance of covariates around cutoff<\/td>\n<td>Compare means on each side of cutoff<\/td>\n<td>No significant differences<\/td>\n<td>Multiple covariates need correction<\/td>\n<\/tr>\n<tr>\n<td>M4<\/td>\n<td>First-stage strength<\/td>\n<td>Assignment effect on treatment receipt<\/td>\n<td>Difference in takeup at cutoff<\/td>\n<td>Strong and significant<\/td>\n<td>Weak instruments invalidate fuzzy RD<\/td>\n<\/tr>\n<tr>\n<td>M5<\/td>\n<td>Bandwidth sensitivity<\/td>\n<td>Robustness of estimate to h<\/td>\n<td>Recompute estimates over range of h<\/td>\n<td>Stable within range<\/td>\n<td>Divergent results show fragility<\/td>\n<\/tr>\n<tr>\n<td>M6<\/td>\n<td>CI width at cutoff<\/td>\n<td>Precision of estimate<\/td>\n<td>Bootstrap or robust SEs<\/td>\n<td>Narrow enough for decision<\/td>\n<td>Too wide if sparse data<\/td>\n<\/tr>\n<tr>\n<td>M7<\/td>\n<td>Placebo cutoff checks<\/td>\n<td>False positive detection<\/td>\n<td>Apply RD at non-policy cutoffs<\/td>\n<td>No significant effects<\/td>\n<td>Data snooping raises alarms<\/td>\n<\/tr>\n<tr>\n<td>M8<\/td>\n<td>Heterogeneity by subgroup<\/td>\n<td>Variation of effect<\/td>\n<td>RD within strata or interaction<\/td>\n<td>Pre-specified subgroup effects<\/td>\n<td>Multiple comparisons risk<\/td>\n<\/tr>\n<tr>\n<td>M9<\/td>\n<td>Spillover indicator<\/td>\n<td>Treatment impact beyond cutoff<\/td>\n<td>Time or space outcome trends<\/td>\n<td>Minimal spillovers<\/td>\n<td>Hard to measure for networked 
systems<\/td>\n<\/tr>\n<tr>\n<td>M10<\/td>\n<td>Operational telemetry alignment<\/td>\n<td>Match RD events to system metrics<\/td>\n<td>Correlate discontinuity times with telemetry<\/td>\n<td>Aligned for causal story<\/td>\n<td>Misaligned times weaken inference<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Row Details (only if needed)<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>None<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Best tools to measure regression discontinuity<\/h3>\n\n\n\n<p>Choose tools that support statistical modeling, visualization, and integration with telemetry.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Tool \u2014 Jupyter \/ Notebook environment<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What it measures for regression discontinuity: Flexible RD estimation and visualization.<\/li>\n<li>Best-fit environment: Data science teams and analysis pipelines.<\/li>\n<li>Setup outline:<\/li>\n<li>Ingest telemetry with secure credentials.<\/li>\n<li>Preprocess and bin running variable.<\/li>\n<li>Run regression fits and diagnostic tests.<\/li>\n<li>Export figures and tables to reports.<\/li>\n<li>Strengths:<\/li>\n<li>Highly flexible and reproducible.<\/li>\n<li>Good for ad hoc analysis and exploration.<\/li>\n<li>Limitations:<\/li>\n<li>Not a production monitoring tool.<\/li>\n<li>Requires manual orchestration for automation.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Tool \u2014 Statistical libraries (R, Python causal packages)<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What it measures for regression discontinuity: RD estimators, robust SEs, bandwidth selection.<\/li>\n<li>Best-fit environment: Data science and analytics platforms.<\/li>\n<li>Setup outline:<\/li>\n<li>Install RD packages.<\/li>\n<li>Implement local linear\/ polynomial fits.<\/li>\n<li>Run McCrary and placebo tests.<\/li>\n<li>Package results into CI-friendly outputs.<\/li>\n<li>Strengths:<\/li>\n<li>Rigorous statistical methods.<\/li>\n<li>Established inference routines.<\/li>\n<li>Limitations:<\/li>\n<li>Needs careful parameter tuning.<\/li>\n<li>Integration with observability requires ETL.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Tool \u2014 Observability platforms (APM, metrics systems)<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What it measures for regression discontinuity: Operational signals aligned with RD events.<\/li>\n<li>Best-fit environment: SRE and production monitoring.<\/li>\n<li>Setup outline:<\/li>\n<li>Instrument running variable and outcome metrics.<\/li>\n<li>Create dashboards showing pre\/post cutoff signals.<\/li>\n<li>Automate alerts for discontinuity diagnostics.<\/li>\n<li>Strengths:<\/li>\n<li>Real-time monitoring and alerting.<\/li>\n<li>Operational context for RD findings.<\/li>\n<li>Limitations:<\/li>\n<li>Limited statistical features for inference.<\/li>\n<li>May miss nuanced RD diagnostics.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Tool \u2014 Feature-flag platforms<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What it measures for regression discontinuity: Assignment and exposure logs for cutoff-based flags.<\/li>\n<li>Best-fit environment: Product rollouts and canary releases.<\/li>\n<li>Setup outline:<\/li>\n<li>Record assignment reason and running variable.<\/li>\n<li>Capture downstream outcomes.<\/li>\n<li>Generate per-cutoff RD reports.<\/li>\n<li>Strengths:<\/li>\n<li>Tie assignment metadata to outcomes.<\/li>\n<li>Enables rolling experiments with 
thresholds.<\/li>\n<li>Limitations:<\/li>\n<li>Flag platforms might not provide full RD tooling.<\/li>\n<\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Tool \u2014 Data warehouses \/ OLAP systems<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What it measures for regression discontinuity: Large-scale aggregation and cohorting by running variable.<\/li>\n<li>Best-fit environment: Analytical pipelines and reports.<\/li>\n<li>Setup outline:<\/li>\n<li>Create derived tables for running variable bins.<\/li>\n<li>Aggregate outcomes around cutoff.<\/li>\n<li>Schedule RD report generation.<\/li>\n<li>Strengths:<\/li>\n<li>Scales to large datasets.<\/li>\n<li>Integrates with BI dashboards.<\/li>\n<li>Limitations:<\/li>\n<li>Latency for near-real-time needs.<\/li>\n<li>Statistical nuance requires external libraries.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Recommended dashboards &amp; alerts for regression discontinuity<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Executive dashboard:<\/li>\n<li>Panel: Local RD estimate and CI \u2014 quick view of causal effect magnitude.<\/li>\n<li>Panel: Business KPI near cutoff \u2014 shows practical implications.<\/li>\n<li>Panel: Density test result and covariate balance summary \u2014 high-level validity signals.<\/li>\n<li>\n<p>Why: Stakeholders need magnitude, credibility, and business relevance.<\/p>\n<\/li>\n<li>\n<p>On-call dashboard:<\/p>\n<\/li>\n<li>Panel: Telemetry traces aligned to cutoff events \u2014 latency, error rate, 429 counts.<\/li>\n<li>Panel: Bandwidth sensitivity table \u2014 quick robustness check.<\/li>\n<li>Panel: Alert count and burn-rate for SLOs affected by policy.<\/li>\n<li>\n<p>Why: SREs need operational signals to act quickly if threshold-driven behavior breaks.<\/p>\n<\/li>\n<li>\n<p>Debug dashboard:<\/p>\n<\/li>\n<li>Panel: Scatterplot and fitted lines around cutoff with residuals.<\/li>\n<li>Panel: Covariate continuity plots and McCrary density plot.<\/li>\n<li>Panel: Raw logs and sample traces for units near cutoff.<\/li>\n<li>Why: Facilitates root cause analysis and validation of RD assumptions.<\/li>\n<\/ul>\n\n\n\n<p>Alerting guidance:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What should page vs ticket:<\/li>\n<li>Page for system incidents where threshold change causes SLO breaches or cascading failures.<\/li>\n<li>Ticket for statistical robustness issues (e.g., suspicious density test) that require investigation.<\/li>\n<li>Burn-rate guidance:<\/li>\n<li>If RD shows treatment causes SLI degradation, compute projected error budget burn and page at sustained burn &gt; 2x baseline rate.<\/li>\n<li>Noise reduction tactics:<\/li>\n<li>Dedupe alerts by cutoff ID and region.<\/li>\n<li>Group alerts by impacted SLO and service.<\/li>\n<li>Suppress transient alerts by requiring sustained metric change over short window and cross-checking RD diagnostics.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Implementation Guide (Step-by-step)<\/h2>\n\n\n\n<p>1) Prerequisites\n   &#8211; Clear definition of running variable and known cutoff.\n   &#8211; Sufficient data density near cutoff.\n   &#8211; Instrumented collection of outcomes and covariates.\n   &#8211; Access-controlled analytics environment.<\/p>\n\n\n\n<p>2) Instrumentation plan\n   &#8211; Log running variable and timestamp for every relevant request or unit.\n   &#8211; Record treatment receipt indicator separate from assignment.\n   &#8211; Capture outcome metrics and relevant covariates.\n   &#8211; Retain raw events for at least the retention window needed for power.<\/p>
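\n\n\n\n<p>A minimal sketch of what one logged event from this plan might look like, in plain Python. Every field name here is an illustrative assumption, not a standard schema; the point is that assignment and actual treatment receipt are recorded separately, alongside the running variable, the cutoff in force, outcomes, and covariates.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import json, time\n\n# Hypothetical RD instrumentation event; adapt names to your pipeline.\nevent = {\n    'unit_id': 'user-123',            # unit of analysis\n    'ts': time.time(),                # event timestamp for alignment\n    'running_var': 73.4,              # e.g., risk score or CPU percent\n    'cutoff': 70.0,                   # threshold in force at event time\n    'assigned': True,                 # 1[running_var &gt;= cutoff]\n    'treated': True,                  # actual receipt; may differ (fuzzy RD)\n    'outcome': {'latency_ms': 212, 'error': False},\n    'covariates': {'region': 'eu-west', 'plan': 'pro'},\n}\nprint(json.dumps(event))<\/code><\/pre>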
\n\n\n\n<p>3) Data collection\n   &#8211; Ingest events into analytics store with consistent schema.\n   &#8211; Validate measurement precision and deduplicate.\n   &#8211; Build cohort tables centered on cutoff.<\/p>\n\n\n\n<p>4) SLO design\n   &#8211; Map RD outcome to SLI relevant to business or reliability.\n   &#8211; Define SLO window and error budget implications for threshold-driven policies.\n   &#8211; Document alert thresholds and paging rules informed by RD estimates.<\/p>\n\n\n\n<p>5) Dashboards\n   &#8211; Build visualizations: scatter with local fits, density test, covariate plots.\n   &#8211; Provide drill-down to raw events and traces for units near cutoff.\n   &#8211; Expose bandwidth sensitivity and placebo checks panels.<\/p>\n\n\n\n<p>6) Alerts &amp; routing\n   &#8211; Alert on SLO breaches caused by threshold change.\n   &#8211; Alert on manipulations indicated by density discontinuity.\n   &#8211; Route to data scientists for statistical anomalies and to SRE for operational impacts.<\/p>\n\n\n\n<p>7) Runbooks &amp; automation\n   &#8211; Create runbook: steps to validate cutoff, run diagnostics, and rollback policy.\n   &#8211; Automate routine RD checks nightly or on policy change.\n   &#8211; Automate report generation for stakeholders after each policy update.<\/p>\n\n\n\n<p>8) Validation (load\/chaos\/game days)\n   &#8211; Run load tests stressing thresholds to observe behavior near cutoff.\n   &#8211; Chaos scenarios toggling policies around cutoff values.\n   &#8211; Game days simulating manipulation, sparse data, or measurement drift.<\/p>\n\n\n\n<p>9) Continuous improvement\n   &#8211; Periodically reassess bandwidth, choice of estimator, and covariate sets.\n   &#8211; Log decisions and tests in reproducible notebooks.\n   &#8211; Incorporate RD checks into pre-deploy and post-deploy pipelines.<\/p>\n\n\n\n<p>Checklists<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Pre-production checklist<\/li>\n<li>Running variable defined and instrumented.<\/li>\n<li>Outcome metrics validated and stable.<\/li>\n<li>Minimum sample size estimated.<\/li>\n<li>Logging and trace context enabled.<\/li>\n<li>\n<p>Initial RD script and dashboard in place.<\/p>\n<\/li>\n<li>\n<p>Production readiness checklist<\/p>\n<\/li>\n<li>Automated RD diagnostics scheduled.<\/li>\n<li>Alerting rules validated for paging vs tickets.<\/li>\n<li>Runbooks posted with owner and escalation.<\/li>\n<li>\n<p>Security and access control for analytics pipelines configured.<\/p>\n<\/li>\n<li>\n<p>Incident checklist specific to regression discontinuity<\/p>\n<\/li>\n<li>Verify whether cutoff changed recently.<\/li>\n<li>Run density and covariate continuity checks immediately.<\/li>\n<li>Inspect telemetry for SLI changes and traces for affected units.<\/li>\n<li>If manipulation suspected, quarantine data and escalate compliance review.<\/li>\n<li>Revert policy if there is an immediate SLO violation and roll back in a safe manner.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Use Cases of regression discontinuity<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>\n<p>Feature eligibility\n   &#8211; Context: New feature unlocked for users with score &gt;= 75.\n   &#8211; Problem: Does the feature improve retention or increase fraud?\n   &#8211; Why RD helps: Compares users just above and below score 75.\n   &#8211; What to measure: Retention, conversion, fraud rate.\n   &#8211; Typical tools: Feature flag logs, 
analytics DB, RD scripts.<\/p>\n<\/li>\n<li>\n<p>Pricing tier evaluation\n   &#8211; Context: Usage-based billing escalates at 1000 units.\n   &#8211; Problem: Do customers just above threshold churn or reduce usage?\n   &#8211; Why RD helps: Local effect of crossing pricing tier.\n   &#8211; What to measure: Churn, usage, revenue per user.\n   &#8211; Typical tools: Billing telemetry, transaction logs, BI.<\/p>\n<\/li>\n<li>\n<p>Autoscaler policy change\n   &#8211; Context: Autoscaler adds instances if CPU &gt;= 75%.\n   &#8211; Problem: Does the policy reduce latency or cause oscillation?\n   &#8211; Why RD helps: Effects at CPU threshold can be estimated.\n   &#8211; What to measure: Latency, instance churn, CPU variance.\n   &#8211; Typical tools: Kubernetes metrics, APM, RD diagnostics.<\/p>\n<\/li>\n<li>\n<p>Fraud detection threshold\n   &#8211; Context: Accounts scoring &gt;= 0.8 blocked.\n   &#8211; Problem: Does blocking reduce fraud without harming customers?\n   &#8211; Why RD helps: Estimate causal reduction in fraud near cutoff.\n   &#8211; What to measure: Fraud events, false positive rate.\n   &#8211; Typical tools: Model monitoring, security logs.<\/p>\n<\/li>\n<li>\n<p>Rate limiting impact\n   &#8211; Context: Rate limit enforcement at 500 req\/min.\n   &#8211; Problem: Does limit mitigate overload without harming legit users?\n   &#8211; Why RD helps: Compare performance and errors around limit.\n   &#8211; What to measure: 429 rate, latency, error budget burn.\n   &#8211; Typical tools: CDN logs, APM, monitoring.<\/p>\n<\/li>\n<li>\n<p>Education policy evaluation\n   &#8211; Context: Scholarship awarded for test scores &gt;= pass mark.\n   &#8211; Problem: Does scholarship improve graduation rates?\n   &#8211; Why RD helps: Natural cutoff provides quasi-experimental setting.\n   &#8211; What to measure: Graduation, dropout rates.\n   &#8211; Typical tools: Student records, analytics.<\/p>\n<\/li>\n<li>\n<p>ML model thresholding\n   &#8211; Context: Classification uses score threshold for label.\n   &#8211; Problem: How does threshold affect downstream system load?\n   &#8211; Why RD helps: Assess operational impact of score cutoff.\n   &#8211; What to measure: Throughput, false positives, decision latency.\n   &#8211; Typical tools: Model serving logs, RD tools.<\/p>\n<\/li>\n<li>\n<p>Security gating in IAM\n   &#8211; Context: MFA requirement enabled if risk score &gt;= X.\n   &#8211; Problem: Does MFA reduce account takeover?\n   &#8211; Why RD helps: Evaluate causal effect of enforcing MFA at threshold.\n   &#8211; What to measure: Account takeover incidents, login failures.\n   &#8211; Typical tools: SIEM, auth logs.<\/p>\n<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Scenario Examples (Realistic, End-to-End)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Scenario #1 \u2014 Kubernetes autoscaling threshold evaluation<\/h3>\n\n\n\n<p><strong>Context:<\/strong> Cluster autoscaler scales pods when CPU utilization &gt;= 70%.<br\/>\n<strong>Goal:<\/strong> Determine if the autoscaler threshold reduces tail latency without excessive pod churn.<br\/>\n<strong>Why regression discontinuity matters here:<\/strong> Autoscaling is applied deterministically at a CPU cutoff; RD isolates the local effect on latency and churn.<br\/>\n<strong>Architecture \/ workflow:<\/strong> Metrics pipeline collects per-pod CPU and request latency; feature flag records autoscaler config; RD pipeline ingests data around CPU 
cutoff.<br\/>\n<strong>Step-by-step implementation:<\/strong> <\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Instrument pod CPU and latency at fine granularity.<\/li>\n<li>Define running variable X = pod CPU utilization and cutoff c = 70%.<\/li>\n<li>Aggregate observations into fine bins around c.<\/li>\n<li>Run local linear RD estimating jump in 95th percentile latency at c.<\/li>\n<li>Run McCrary test on pod CPU distribution.<\/li>\n<li>Run bandwidth sensitivity and covariate balance tests (traffic mix, request type).\n<strong>What to measure:<\/strong> 95th percentile latency, pod restart count, scale-up events, error rates.<br\/>\n<strong>Tools to use and why:<\/strong> Kubernetes metrics server for CPU, Prometheus for telemetry and histograms, Jupyter notebooks for RD estimation.<br\/>\n<strong>Common pitfalls:<\/strong> Measurement lag between CPU reported and scaling action; insufficient observations near cutoff.<br\/>\n<strong>Validation:<\/strong> Load tests to generate data near cutoff, sensitivity checks across thresholds.<br\/>\n<strong>Outcome:<\/strong> Estimate shows 12% drop in tail latency at cost of 8% more pod churn; informs adjusting cooldown settings.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Scenario #2 \u2014 Serverless concurrency threshold for cold-start mitigation<\/h3>\n\n\n\n<p><strong>Context:<\/strong> Serverless platform uses a pre-warm pool when concurrent invocations &gt;= 50.<br\/>\n<strong>Goal:<\/strong> Evaluate causal impact of pre-warming on tail latency and cost.<br\/>\n<strong>Why regression discontinuity matters here:<\/strong> Pre-warming is triggered by a concurrency threshold; RD evaluates near-threshold behavior.<br\/>\n<strong>Architecture \/ workflow:<\/strong> Invocation metrics and cold-start indicators are logged; billing cost aggregated; RD analysis performed on invocation concurrency near cutoff.<br\/>\n<strong>Step-by-step implementation:<\/strong> <\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Ensure each invocation records current concurrency and cold-start flag.<\/li>\n<li>Define running variable X = measured concurrency; cutoff c = 50.<\/li>\n<li>Estimate local jump in cold-start rate and tail latency.<\/li>\n<li>Compute cost per invocation change at cutoff.<\/li>\n<li>Perform bandwidth sensitivity and placebos at other concurrency values.\n<strong>What to measure:<\/strong> Cold-start incidence, 99th percentile latency, cost per 1k invocations.<br\/>\n<strong>Tools to use and why:<\/strong> Serverless platform logs, metrics store for latency distributions, RD scripts.<br\/>\n<strong>Common pitfalls:<\/strong> Concurrency measurement delayed or sampled; billing aggregation frequency misaligned.<br\/>\n<strong>Validation:<\/strong> Simulate traffic patterns to produce sustained concurrency near threshold.<br\/>\n<strong>Outcome:<\/strong> Pre-warming reduces cold-starts by 40% near cutoff but increases cost by 6%, guiding dynamic pre-warm policies.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Scenario #3 \u2014 Incident-response policy triggered by error-rate threshold<\/h3>\n\n\n\n<p><strong>Context:<\/strong> Automation triggers circuit breaker when service error-rate &gt; 5% for 1 minute.<br\/>\n<strong>Goal:<\/strong> Quantify whether circuit breaker reduces downstream system degradation and time-to-recover.<br\/>\n<strong>Why regression discontinuity matters here:<\/strong> Circuit breaker is a threshold-driven policy; RD measures immediate effect on outcomes.<br\/>\n<strong>Architecture \/ workflow:<\/strong> 
Error rates and recovery times logged; circuit breaker assignment timestamped; procedural playbooks executed.<br\/>\n<strong>Step-by-step implementation:<\/strong> <\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Collect time-series of error rates and dependent service latencies.<\/li>\n<li>Run RD with the error rate as the running variable and the cutoff at 5%.<\/li>\n<li>Estimate discontinuity in downstream latencies and time-to-recover metrics.<\/li>\n<li>Check for manipulation or pre-emptive mitigations.\n<strong>What to measure:<\/strong> Downstream latency, incident duration, manual interventions.<br\/>\n<strong>Tools to use and why:<\/strong> Monitoring system, incident management tool logs, RD analysis scripts.<br\/>\n<strong>Common pitfalls:<\/strong> Time synchronization errors and multiple overlapping mitigations.<br\/>\n<strong>Validation:<\/strong> Controlled fire drills where error injection crosses threshold.<br\/>\n<strong>Outcome:<\/strong> Circuit breaker shortens incident duration by 30% but increases false positives near 5%; adjust hysteresis.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Scenario #4 \u2014 Pricing tier switch causing churn (cost\/performance trade-off)<\/h3>\n\n\n\n<p><strong>Context:<\/strong> Billing moves users from tier A to B when usage &gt;= 1000 units.<br\/>\n<strong>Goal:<\/strong> Measure churn effect and revenue impact of the tier cutoff.<br\/>\n<strong>Why regression discontinuity matters here:<\/strong> The price change is deterministic at usage cutoff; RD isolates effect on churn.<br\/>\n<strong>Architecture \/ workflow:<\/strong> Billing events, user sessions, and cancellation events are captured; RD pipeline analyzes users around 1000 units.<br\/>\n<strong>Step-by-step implementation:<\/strong> <\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Capture monthly usage for each customer and cancellation events.<\/li>\n<li>Define running variable X = monthly usage; cutoff c = 1000.<\/li>\n<li>Estimate jump in churn and change in revenue per user.<\/li>\n<li>Run subgroup RD for different cohorts to detect heterogeneity.\n<strong>What to measure:<\/strong> Churn rate, ARPU, average usage post-cutoff.<br\/>\n<strong>Tools to use and why:<\/strong> Billing DB, analytics tools, RD estimation scripts.<br\/>\n<strong>Common pitfalls:<\/strong> Bunching just below cutoff due to strategic throttling; time window alignment.<br\/>\n<strong>Validation:<\/strong> Trial changes on smaller customer segments and compare RD estimates.<br\/>\n<strong>Outcome:<\/strong> Crossing cutoff increases churn by 7% but increases average revenue by 10%; informs smoothing of pricing boundary.<\/li>\n<\/ol>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Common Mistakes, Anti-patterns, and Troubleshooting<\/h2>\n\n\n\n<p>List of mistakes with Symptom -&gt; Root cause -&gt; Fix<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Symptom: Large density spike at cutoff -&gt; Root cause: Manipulation or rounding of running variable -&gt; Fix: Run McCrary, exclude manipulated region, use donut RD.<\/li>\n<li>Symptom: Wide confidence intervals -&gt; Root cause: Sparse observations near cutoff -&gt; Fix: Collect more data, widen bandwidth cautiously.<\/li>\n<li>Symptom: Estimates change drastically with polynomial order -&gt; Root cause: Overfitting with high-order polynomials -&gt; Fix: Use local linear or triangular kernel and check sensitivity.<\/li>\n<li>Symptom: Covariates jump at cutoff -&gt; Root cause: Confounding or 
sorting -&gt; Fix: Investigate mechanism, control covariates, consider alternative designs.<\/li>\n<li>Symptom: No first stage in fuzzy RD -&gt; Root cause: Assignment not affecting treatment receipt -&gt; Fix: Reassess instrument or use alternative identification.<\/li>\n<li>Symptom: Operational SLOs degrade unexpectedly near cutoff -&gt; Root cause: Threshold-triggered automation misconfigured -&gt; Fix: Revisit automation policy and runbook.<\/li>\n<li>Symptom: Placebo cutoffs show significant effects -&gt; Root cause: Data snooping or underlying nonlocal trend -&gt; Fix: Pre-specify tests and correct multiple testing.<\/li>\n<li>Symptom: Jump appears across many variables -&gt; Root cause: Systemic change at cutoff not related to treatment -&gt; Fix: Check deployment logs and policy changes coincident with cutoff.<\/li>\n<li>Symptom: Conflicting results across cohorts -&gt; Root cause: Heterogeneous effects or multiple cutoffs -&gt; Fix: Conduct subgroup analysis and hierarchical pooling.<\/li>\n<li>Symptom: Estimates sensitive to kernel type -&gt; Root cause: Weighting choice matters with uneven density -&gt; Fix: Compare kernels and report robustness.<\/li>\n<li>Symptom: Incorrect SEs understate uncertainty -&gt; Root cause: Ignoring clustering or heteroskedasticity -&gt; Fix: Use robust and cluster-robust SEs.<\/li>\n<li>Symptom: RD script contradicts operational dashboard -&gt; Root cause: Mismatched definitions of running variable or time window -&gt; Fix: Align definitions and timestamps.<\/li>\n<li>Symptom: High false positives in alerts -&gt; Root cause: Bad grouping or lack of suppression -&gt; Fix: Improve dedupe, add aggregation windows.<\/li>\n<li>Symptom: Post-hoc selection of bandwidth -&gt; Root cause: P-hacking to find significant result -&gt; Fix: Use pre-registered selection or report full sensitivity.<\/li>\n<li>Symptom: Misinterpreting local effect as general policy guidance -&gt; Root cause: Over-extrapolation from local estimate -&gt; Fix: Communicate local validity and run further studies for generalization.<\/li>\n<li>Symptom: Model fails in presence of spillovers -&gt; Root cause: Treatment affecting neighbors or network effects -&gt; Fix: Model spillovers explicitly or remove affected units.<\/li>\n<li>Symptom: Running variable recorded at coarse granularity -&gt; Root cause: Low measurement precision causing bunching -&gt; Fix: Improve instrumentation or aggregate differently.<\/li>\n<li>Symptom: Re-run analyses produce different results -&gt; Root cause: Non-deterministic sampling or data pipeline changes -&gt; Fix: Version data and analytic code, ensure reproducibility.<\/li>\n<li>Symptom: Observability dashboards missing context -&gt; Root cause: Poor telemetry linking assignment and outcome -&gt; Fix: Add correlation panels and raw logs.<\/li>\n<li>Symptom: Statistical team and SRE disagree on impact -&gt; Root cause: Different metrics and windows used -&gt; Fix: Align stakeholder definitions and run joint analysis.<\/li>\n<li>Symptom: Over-alerting from RD diagnostics -&gt; Root cause: Running RD continuously with noisy inputs -&gt; Fix: Smooth or aggregate alerts and require corroboration.<\/li>\n<li>Symptom: Using RD when manipulation obvious -&gt; Root cause: Ignoring McCrary or balance tests -&gt; Fix: Move to alternative causal designs or randomized pilots.<\/li>\n<li>Symptom: Forgetting to account for seasonality in time-based RD -&gt; Root cause: Time trends confounding results -&gt; Fix: Remove seasonality or use time-fixed 
effects.<\/li>\n<li>Symptom: Confusing regression kink with RD -&gt; Root cause: Misreading policy as slope change rather than level change -&gt; Fix: Diagnose slope vs level and choose correct design.<\/li>\n<li>Symptom: Not securing analytics pipelines -&gt; Root cause: Data access issues or leaks -&gt; Fix: Apply RBAC and audit logging for analysis platform.<\/li>\n<\/ol>\n\n\n\n<p>Observability pitfalls (recapped from the list above):<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Missing linkage between assignment and telemetry.<\/li>\n<li>Inaccurate timestamp alignment.<\/li>\n<li>Sampling causing bias near cutoff.<\/li>\n<li>Insufficient retention of raw events.<\/li>\n<li>Over-aggregation hiding local discontinuities.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Best Practices &amp; Operating Model<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Ownership and on-call:<\/li>\n<li>Data team owns RD pipelines and diagnostics.<\/li>\n<li>SRE owns operational telemetry and runbooks for threshold-driven systems.<\/li>\n<li>\n<p>Rotate on-call for RD alerts that indicate manipulation or operational SLO impacts.<\/p>\n<\/li>\n<li>\n<p>Runbooks vs playbooks:<\/p>\n<\/li>\n<li>Runbooks: Step-by-step procedures for validating cutoffs, running diagnostics, and emergency rollback.<\/li>\n<li>\n<p>Playbooks: Strategic guidelines for when to redesign thresholds, run experiments, or pursue randomized trials.<\/p>\n<\/li>\n<li>\n<p>Safe deployments (canary\/rollback):<\/p>\n<\/li>\n<li>Use canary windows to observe behavior near cutoff before global enforcement.<\/li>\n<li>\n<p>Maintain automated rollback tied to SLO breach or RD diagnostics signaling adverse jumps.<\/p>\n<\/li>\n<li>\n<p>Toil reduction and automation:<\/p>\n<\/li>\n<li>Automate routine RD checks, dashboards, and reports.<\/li>\n<li>Use templates for covariate balance and placebo tests.<\/li>\n<li>\n<p>Automate alerts for first-stage weakening or density jumps.<\/p>\n<\/li>\n<li>\n<p>Security basics:<\/p>\n<\/li>\n<li>Protect running variable and assignment logs as they may be sensitive.<\/li>\n<li>Monitor for adversarial manipulation of inputs that determine cutoffs.<\/li>\n<li>\n<p>Ensure RBAC on analytics pipelines and results; treat RD outputs as decision-critical.<\/p>\n<\/li>\n<li>\n<p>Weekly\/monthly routines:<\/p>\n<\/li>\n<li>Weekly: Run automated RD diagnostics for active thresholds and check SLO impact.<\/li>\n<li>Monthly: Reassess bandwidth selection, update dashboards, baseline drift checks.<\/li>\n<li>\n<p>Quarterly: Review thresholds as part of policy audits and model governance.<\/p>\n<\/li>\n<li>\n<p>What to review in postmortems related to regression discontinuity:<\/p>\n<\/li>\n<li>Whether a threshold change preceded the incident.<\/li>\n<li>RD diagnostics run during incident and their results.<\/li>\n<li>Any evidence of manipulation or mismeasurement.<\/li>\n<li>Adjustments to thresholds, runbooks, and instrumentation.<\/li>\n<li>Actions to improve data collection and monitoring.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Tooling &amp; Integration Map for regression discontinuity<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table>\n<thead>\n<tr>\n<th>ID<\/th>\n<th>Category<\/th>\n<th>What it does<\/th>\n<th>Key integrations<\/th>\n<th>Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>I1<\/td>\n<td>Metrics store<\/td>\n<td>Stores time series and histograms for RD inputs<\/td>\n<td>Observability 
APM dashboards alerts<\/td>\n<td>High cardinality can be costly<\/td>\n<\/tr>\n<tr>\n<td>I2<\/td>\n<td>Feature flagging<\/td>\n<td>Records assignment reasons and rollout thresholds<\/td>\n<td>CI\/CD analytics feature logs<\/td>\n<td>Useful for provenance<\/td>\n<\/tr>\n<tr>\n<td>I3<\/td>\n<td>Data warehouse<\/td>\n<td>Aggregates and cohorts for RD estimation<\/td>\n<td>ETL pipelines BI tools<\/td>\n<td>Batch oriented not real-time<\/td>\n<\/tr>\n<tr>\n<td>I4<\/td>\n<td>Notebook environment<\/td>\n<td>Implements RD analysis and plots<\/td>\n<td>Version control auth logs<\/td>\n<td>Good for reproducibility<\/td>\n<\/tr>\n<tr>\n<td>I5<\/td>\n<td>Statistical libraries<\/td>\n<td>Provides RD estimators and tests<\/td>\n<td>Notebook and ETL systems<\/td>\n<td>Requires statistical expertise<\/td>\n<\/tr>\n<tr>\n<td>I6<\/td>\n<td>Alerting system<\/td>\n<td>Pages on SLO breaches\/density anomalies<\/td>\n<td>Incident management on-call roster<\/td>\n<td>Must avoid alert fatigue<\/td>\n<\/tr>\n<tr>\n<td>I7<\/td>\n<td>Model monitoring<\/td>\n<td>Tracks model score distributions and drift<\/td>\n<td>ML pipeline model registry<\/td>\n<td>Important for model-based cutoffs<\/td>\n<\/tr>\n<tr>\n<td>I8<\/td>\n<td>Log aggregation<\/td>\n<td>Stores raw events including running var<\/td>\n<td>Tracing APM dashboards<\/td>\n<td>Helpful for debugging edge cases<\/td>\n<\/tr>\n<tr>\n<td>I9<\/td>\n<td>CI\/CD<\/td>\n<td>Automates RD checks in pre-deploy job<\/td>\n<td>Feature flagging repos metrics store<\/td>\n<td>Ensures gating before rollout<\/td>\n<\/tr>\n<tr>\n<td>I10<\/td>\n<td>Governance<\/td>\n<td>Records decisions and policy cutoffs<\/td>\n<td>Audit logs SLO reviews<\/td>\n<td>Compliance tracking for thresholds<\/td>\n<\/tr>\n<\/tbody>\n<\/table><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Row Details (only if needed)<\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>None<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Frequently Asked Questions (FAQs)<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">What is the main assumption of RD?<\/h3>\n\n\n\n<p>The main assumption is continuity of potential outcomes at the cutoff absent treatment; units just above and below are comparable.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Can RD establish causality without randomization?<\/h3>\n\n\n\n<p>Yes, locally at the cutoff, if assumptions hold and manipulation is ruled out, RD yields credible causal estimates.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What is the difference between sharp and fuzzy RD?<\/h3>\n\n\n\n<p>Sharp RD has perfect compliance with the cutoff, while fuzzy RD has assignment affecting treatment probability imperfectly.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How do I choose bandwidth?<\/h3>\n\n\n\n<p>Use data-driven selectors or cross-validation, and report sensitivity across a reasonable range.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What tests should I run to validate RD?<\/h3>\n\n\n\n<p>Run density (McCrary) tests, covariate continuity checks, placebo cutoffs, and bandwidth sensitivity analyses.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Is RD appropriate for time-series cutoffs?<\/h3>\n\n\n\n<p>Yes, but additional time-series considerations like seasonality and autocorrelation must be addressed.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How many observations do I need near the cutoff?<\/h3>\n\n\n\n<p>Varies by effect size and variance; perform power calculations focused on local sample size near cutoff.<\/p>\n\n\n\n<h3 
class=\"wp-block-heading\">Can RD handle multiple cutoffs?<\/h3>\n\n\n\n<p>Yes, analyze per cutoff and consider hierarchical pooling or meta-analysis for aggregation.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What if agents manipulate the running variable?<\/h3>\n\n\n\n<p>Manipulation undermines RD identification; consider excluding manipulated observations, using donut RD, or different designs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How do I interpret local effects?<\/h3>\n\n\n\n<p>RD estimates the effect for units infinitesimally close to cutoff; avoid generalizing to populations far from the threshold.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Can I automate RD monitoring in production?<\/h3>\n\n\n\n<p>Yes, automate diagnostics, dashboards, and alerts but require human review for statistical anomalies.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How to handle covariate imbalance at cutoff?<\/h3>\n\n\n\n<p>Investigate mechanism, include covariates to improve precision, or reconsider identification strategy if imbalance implies confounding.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Are polynomial regressions recommended?<\/h3>\n\n\n\n<p>Local linear regressions with triangular kernels are typically preferred; higher polynomials risk overfitting.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How to measure fuzzy RD?<\/h3>\n\n\n\n<p>Use instrumental variables approach where assignment is instrument for treatment receipt and compute Wald estimator.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Should RD be used for pricing policy decisions?<\/h3>\n\n\n\n<p>RD can quantify local impacts of pricing thresholds but combine with business judgment and broader experiments for global decisions.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What are common pitfalls in RD inference?<\/h3>\n\n\n\n<p>Manipulation, sparse data, bandwidth overfitting, ignored clustering, and misaligned telemetry are common pitfalls.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Can RD be used with machine learning models?<\/h3>\n\n\n\n<p>Yes, to evaluate score thresholds and their operational impacts; ensure model score measurement is precise.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">How to report RD results to stakeholders?<\/h3>\n\n\n\n<p>Report point estimate, confidence intervals, robustness checks, and clear statement that effects are local to cutoff.<\/p>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Regression discontinuity is a powerful quasi-experimental tool for causal inference when treatment assignment is determined by thresholds. In 2026 cloud-native systems, RD bridges data science and SRE by enabling causal analysis of threshold-driven policies, feature rollouts, and automation decisions. 
Its validity rests on testable diagnostics and careful operational integration.<\/p>\n\n\n\n<p>Plan for the next 7 days:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Day 1: Instrument running-variable and treatment-receipt logging end-to-end.<\/li>\n<li>Day 2: Build an initial RD notebook and visualize the scatter and density near the cutoff.<\/li>\n<li>Day 3: Implement automated McCrary and covariate continuity checks.<\/li>\n<li>Day 4: Create dashboards for executive and on-call use with RD panels.<\/li>\n<li>Day 5: Run bandwidth-sensitivity and placebo-cutoff analyses and document the results.<\/li>\n<li>Day 6: Wire density-jump and first-stage alerts into the on-call rotation.<\/li>\n<li>Day 7: Review findings with stakeholders and fold decisions into runbooks and threshold governance.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator\" \/>\n\n\n\n<h2 class=\"wp-block-heading\">Appendix \u2014 regression discontinuity Keyword Cluster (SEO)<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Primary keywords<\/li>\n<li>regression discontinuity<\/li>\n<li>regression discontinuity design<\/li>\n<li>RD design<\/li>\n<li>RD estimator<\/li>\n<li>sharp regression discontinuity<\/li>\n<li>fuzzy regression discontinuity<\/li>\n<li>local average treatment effect<\/li>\n<li>\n<p>cutoff causal inference<\/p>\n<\/li>\n<li>\n<p>Secondary keywords<\/p>\n<\/li>\n<li>running variable<\/li>\n<li>threshold analysis<\/li>\n<li>McCrary density test<\/li>\n<li>local linear regression RD<\/li>\n<li>bandwidth selection RD<\/li>\n<li>RD robustness checks<\/li>\n<li>donut RD<\/li>\n<li>\n<p>placebo cutoff tests<\/p>\n<\/li>\n<li>\n<p>Long-tail questions<\/p>\n<\/li>\n<li>how does regression discontinuity work in production<\/li>\n<li>regression discontinuity vs randomized controlled trial<\/li>\n<li>fuzzy regression discontinuity explained for engineers<\/li>\n<li>best practices for RD in cloud systems<\/li>\n<li>how to test manipulation in RD<\/li>\n<li>RD bandwidth selection guide 2026<\/li>\n<li>RD for feature flags and canary releases<\/li>\n<li>regression discontinuity in Kubernetes autoscaling<\/li>\n<li>RD for serverless concurrency thresholds<\/li>\n<li>how to monitor RD diagnostics in observability<\/li>\n<li>RD pipelines for analytics teams<\/li>\n<li>what is local average treatment effect in RD<\/li>\n<li>interpreting RD results for product decisions<\/li>\n<li>regression discontinuity code examples for data teams<\/li>\n<li>RD sensitivity analysis checklist<\/li>\n<li>how to handle sparse data in RD<\/li>\n<li>regression discontinuity pitfalls for SREs<\/li>\n<li>using RD to measure pricing threshold effects<\/li>\n<li>RD vs difference-in-differences practical guide<\/li>\n<li>\n<p>regression discontinuity for ML model thresholds<\/p>\n<\/li>\n<li>\n<p>Related terminology<\/p>\n<\/li>\n<li>treatment effect<\/li>\n<li>local effect<\/li>\n<li>covariate balance<\/li>\n<li>triangular kernel<\/li>\n<li>robust standard errors<\/li>\n<li>clustering in RD<\/li>\n<li>first stage in fuzzy RD<\/li>\n<li>Wald estimator<\/li>\n<li>regression kink design<\/li>\n<li>local randomization<\/li>\n<li>spillover effects<\/li>\n<li>heterogeneity in RD<\/li>\n<li>power calculation for RD<\/li>\n<li>placebo outcomes<\/li>\n<li>RD meta-analysis<\/li>\n<li>instrumentation precision<\/li>\n<li>observability telemetry<\/li>\n<li>SLO impact analysis<\/li>\n<li>automated RD monitoring<\/li>\n<li>runbook for threshold 
incidents<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>&#8212;<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[239],"tags":[],"class_list":["post-981","post","type-post","status-publish","format-standard","hentry","category-what-is-series"],"_links":{"self":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/981","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/comments?post=981"}],"version-history":[{"count":1,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/981\/revisions"}],"predecessor-version":[{"id":2580,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/981\/revisions\/2580"}],"wp:attachment":[{"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/media?parent=981"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/categories?post=981"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aiopsschool.com\/blog\/wp-json\/wp\/v2\/tags?post=981"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}