Feature flags are easy to add and easy to forget. A flag goes in to gate a risky change, the change ships to 100%, everyone moves on -- and the flag, plus the dead if/else around it, stays in the codebase forever. Everyone knows this happens. We wanted to measure how often, and how badly, with real data instead of estimates.
So we pointed the open-source FlagShark scanner at more than 1,900 public GitHub repositories that import a feature-flag SDK, and counted the stale flags left behind.
Looking for the wider industry picture and trends? See our 2026 industry report. This post is different: it's primary data from an actual scan of public code, not estimates.
By the numbers
| Metric | Value |
|---|---|
| Repos matched by a flag-SDK code search | 1,900+ |
| ...that actually evaluate flags in code | 252 (16%) |
| ...of which popular (25★+), our sample | 34 |
| ...carrying at least one stale flag | 71% (24) |
| Stale flags found in the sample | 56 |
| Average age of a stale flag | ~14 months |
| Stale flags older than 1 year | 46% |
| Stale flags older than 2 years | 13% |
| Oldest flag still wired into code | 4.1 years |
The headline
Among popular open-source projects where we could statically detect feature flags, roughly seven in ten were carrying at least one stale flag -- a flag that has shipped and gone quiet but still branches live code. The oldest we found has been sitting untouched for more than four years.
To be precise, and honest about the denominator (the method is below): of 34 popular repos (25+ stars) where our static analysis could resolve flag keys, 24 (71%) had one or more stale flags, for 56 stale flags in total. Every number here is a conservative floor -- static analysis only sees flags whose keys are written literally in the code, not the ones built dynamically from config or a database.
How a 1,900-repo search becomes a 34-repo sample
That jump deserves an explanation, because the narrowing is itself a finding. GitHub code search casts a wide net: a flag-SDK reference turns up in ~1,900 repos, but a reference is not the same as flag usage. When we actually scanned them, three things stripped the number down (figures from a sample we cloned and inspected by hand):
- About half are noise. The SDK name appears in a lockfile, a docs page, a security wordlist, or even a local file that happens to be named after a provider -- the SDK is never actually used.
- About a third import a dual-purpose SDK for something other than flags. PostHog, Statsig, and GrowthBook are analytics and experimentation tools first. Plenty of repos import them for `capture()` events and never call the flag API.
- The rest key flags dynamically -- `posthog.isFeatureEnabled(flagName)`, where the key is a variable that no static analyzer can resolve.
Strip those away and 252 repos (16%) genuinely evaluate flags with resolvable keys -- 34 of them popular enough (25★+) to form a clean sample. Two things are worth saying plainly. First, when the scanner does see real flag usage, it catches it: there were zero repos where it found an SDK call but failed to extract the key. Second, a meaningful slice of real flag usage is invisible to any code scanner, which is precisely why platform-side data (the flag list from LaunchDarkly or PostHog itself) matters.
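To make the literal-versus-dynamic distinction concrete, here is a minimal sketch in Python. It is not the FlagShark implementation: the regexes and the `classify_flag_calls` helper are hypothetical, and a real scanner would use per-SDK AST rules rather than one pattern. It only illustrates why a key written as a string literal can be extracted while a key passed as a variable cannot.

```python
import re

# Hypothetical pattern: finds isFeatureEnabled(...) calls and captures the
# first argument. Illustrative only, not how any particular scanner works.
CALL = re.compile(r"\bisFeatureEnabled\(\s*([^,)]+?)\s*[,)]")
# A first argument counts as resolvable only if it is a quoted string literal.
LITERAL = re.compile(r"""^(['"])([\w.\-]+)\1$""")


def classify_flag_calls(source: str) -> tuple[list[str], list[str]]:
    """Split flag calls into literal keys and unresolved key expressions."""
    resolved, unresolved = [], []
    for match in CALL.finditer(source):
        arg = match.group(1)
        literal = LITERAL.match(arg)
        if literal:
            resolved.append(literal.group(2))  # key is visible in the code
        else:
            unresolved.append(arg)  # key only exists at runtime
    return resolved, unresolved


sample = """
if (posthog.isFeatureEnabled('new-checkout')) { renderNewCheckout(); }
if (posthog.isFeatureEnabled(flagName)) { renderExperiment(); }
"""
resolved, unresolved = classify_flag_calls(sample)
print(resolved)    # ['new-checkout']
print(unresolved)  # ['flagName']
```

The second call is real flag usage, but the only thing a static pass can report is the variable name, which is why counts based on literal keys are a floor.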
Stale flags don't get cleaned up -- they accumulate
The striking part isn't that stale flags exist -- it's how long they linger. The average stale flag we found is about 14 months old (median 10 months). 84% are more than six months past their rollout, 46% are over a year old, and 13% have been dead for more than two years. The distribution has a long tail that stretches past four years.
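For readers who want to run the same kind of summary on their own flags, the sketch below computes the mean, the median, and the share of flags past each age cutoff. The `age_summary` helper and the ages in `ages` are made up for illustration; they are not the study's data.

```python
from statistics import mean, median


def age_summary(ages_months: list[float]) -> dict[str, float]:
    """Summarize stale-flag ages (in months): center plus share past each cutoff."""
    total = len(ages_months)

    def share_over(limit_months: float) -> float:
        return sum(age > limit_months for age in ages_months) / total

    return {
        "mean": mean(ages_months),
        "median": median(ages_months),
        "over_6_months": share_over(6),
        "over_1_year": share_over(12),
        "over_2_years": share_over(24),
    }


# Illustrative ages only, chosen to show a long right tail.
ages = [3, 7, 8, 9, 11, 13, 16, 25, 34]
summary = age_summary(ages)
print(summary["mean"], summary["median"])  # 14 11
print(round(summary["over_1_year"], 2))    # 0.44
```

A mean that sits above the median, as in this toy list, is the usual sign of a long tail: a few very old flags pull the average up.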
A flag that's been at 100% for over a year isn't really a flag anymore. It's a permanent, untested branch in your code that every future change has to read, test around, and reason about -- forever, until someone deletes it.
The oldest flags we found
These are real flags in real, recognizable projects. We're listing them as neutral examples, not a wall of shame -- nearly every team has a few of these:
| Flag | Repository | Provider | Dormant for |
|---|---|---|---|
| chaos.monkey.repository | codecentric/chaos-monkey-spring-boot | Unleash | ~4.1 years |
| dashboard_metrics_enabled | gitpod-io/gitpod | ConfigCat | ~2.8 years |
| live_schedule | mouredev/python-web | ConfigCat | ~2.3 years |
| isTheGraphEnabled | cowprotocol/cowswap | LaunchDarkly | ~2.1 years |
| beta_gmailusers_mainpagetitle | VladislavAntonyuk/MauiSamples | ConfigCat | ~1.7 years |
Wondering what's lurking in your codebase? Scan any public repo in ~30 seconds → No signup, no install.
Where the debt lives
By provider, the public-repo world skews toward the tools open source reaches for -- PostHog, Unleash, and GrowthBook show up most, with ConfigCat and LaunchDarkly close behind.
And by language, it's overwhelmingly a TypeScript/JavaScript story, with Python a distant second.
Which tools' flags rot the longest
Stale is stale, but some ecosystems let flags sit longer than others. Averaged across the flags we found, ConfigCat flags were by far the most dormant -- nearly 28 months old on average -- while GrowthBook, PostHog, and Unleash clustered closer to the one-year mark.
We wouldn't read too much into the ranking itself (this averages only providers with at least five stale flags), but the takeaway is consistent across all of them: once a flag goes stale, it tends to stay that way for years, not weeks.
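The minimum-sample rule is easy to reproduce. The sketch below groups flag ages by provider and drops any provider with fewer than five stale flags before averaging. The function name and the input rows are hypothetical, not the study's data.

```python
from collections import defaultdict
from statistics import mean


def average_age_by_provider(flags, min_flags=5):
    """Average age per provider, skipping providers with too few flags to average."""
    ages_by_provider = defaultdict(list)
    for provider, age_months in flags:
        ages_by_provider[provider].append(age_months)
    return {
        provider: mean(ages)
        for provider, ages in ages_by_provider.items()
        if len(ages) >= min_flags
    }


# Illustrative rows: "ProviderB" has only two flags, so it is left out.
rows = [("ProviderA", a) for a in (8, 10, 12, 14, 16)] + [
    ("ProviderB", 40),
    ("ProviderB", 50),
]
print(average_age_by_provider(rows))  # {'ProviderA': 12}
```

Without the cutoff, two unusually old flags would put "ProviderB" at the top of the ranking, which is the kind of small-sample artifact the threshold is meant to avoid.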
The honest part: "old" is not the same as "stale"
Here's the hard problem, and the reason we were careful with the numbers. Static analysis can tell you a flag is old and referenced in only one place (the signature of a finished rollout). It cannot tell you whether a flag is genuinely abandoned or intentionally permanent -- a kill switch, an entitlement gate, a config toggle meant to live forever.
We saw this directly. One project's homegrown flag system had toggles like Puppet and DHCP that are over a decade old -- but those are permanent capability switches, not debt. So we excluded homegrown/custom flag frameworks from the headline (19 such flags across 9 repos, reported separately), required a completed-rollout signal, and limited the headline to named commercial SDKs.
That ambiguity is exactly the gap a code scanner alone can't close -- and exactly why FlagShark reads your flag platform's own metadata (for example, LaunchDarkly's temporary: false marker) before it ever proposes a removal. The code tells you where to look; the platform signal tells you what's safe to delete.
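As a rough illustration of combining the two signals, the sketch below checks platform metadata before treating an old, single-reference flag as removable. It is an assumption-laden sketch, not FlagShark's actual logic: the `CodeFinding` type, the `removal_decision` function, the shape of `platform_flags`, and the 180-day default are all hypothetical. Only the `temporary` field name comes from the LaunchDarkly example above.

```python
from dataclasses import dataclass


@dataclass
class CodeFinding:
    key: str              # flag key as written in the code
    age_days: int         # days since the flag's line last changed
    reference_count: int  # number of places the key appears


def removal_decision(finding, platform_flags, min_age_days=180):
    """Decide what to do with a flag, letting platform metadata override code age."""
    metadata = platform_flags.get(finding.key)
    if metadata is None:
        # The code alone cannot separate abandoned from intentionally permanent.
        return "review: no platform metadata"
    if metadata.get("temporary") is False:
        return "keep: marked permanent on the platform"
    if finding.reference_count == 1 and finding.age_days >= min_age_days:
        return "candidate: propose removal"
    return "keep: rollout may still be in progress"


# Hypothetical platform export keyed by flag key.
platform_flags = {
    "billing-kill-switch": {"temporary": False},
    "new-onboarding": {"temporary": True},
}
old_switch = CodeFinding("billing-kill-switch", age_days=1500, reference_count=1)
old_rollout = CodeFinding("new-onboarding", age_days=400, reference_count=1)
print(removal_decision(old_switch, platform_flags))
print(removal_decision(old_rollout, platform_flags))
```

The kill switch is far older than the rollout flag, yet only the rollout flag becomes a removal candidate, because age decides nothing until the platform says the flag was meant to be temporary.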
Method (and how to reproduce it)
- Corpus. We searched GitHub for repositories importing a feature-flag SDK across many providers and languages, then excluded SDK vendors and example/tutorial/demo repos (their "flags" are fixtures, not production debt) and focused on repos with 25+ stars.
- Detection. Each repo was scanned with the open-source `flagshark` CLI -- the same `npx flagshark scan` anyone can run. Detection is deterministic AST/regex parsing, not AI.
- "Stale." A flag counts only if it comes from a named SDK, carries a completed-rollout (single-file) signal, and its line has gone untouched past the staleness threshold (measured precisely via `git blame`).
- Limits. Static analysis can't see dynamically-keyed flags (`isFeatureEnabled(flagName)`), so totals are a floor. The sample narrows because "imports a flag SDK" overcounts real usage: a code search returns ~1,900 repos, but most are dependency listings, docs, or analytics-only usage of dual-purpose tools -- only 252 genuinely evaluate flags in code, 34 of them popular. We report that exact denominator rather than inflate it.
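If you want to approximate the age measurement yourself, the sketch below parses the output of `git blame --line-porcelain <file>` and reports how long each line that mentions a flag key has gone untouched. It is one possible reading of the method, not the CLI's code: the helper names, the canned blame text, and the 180-day threshold are assumptions, since the post does not state the threshold value.

```python
from datetime import datetime, timezone


def line_ages_days(porcelain: str, now: datetime) -> list[tuple[str, int]]:
    """Return (line_text, age_in_days) pairs from `git blame --line-porcelain` output."""
    ages = []
    committed = None
    for raw in porcelain.splitlines():
        if raw.startswith("committer-time "):
            # Porcelain reports commit time as seconds since the Unix epoch.
            committed = datetime.fromtimestamp(int(raw.split()[1]), tz=timezone.utc)
        elif raw.startswith("\t") and committed is not None:
            # Source lines are prefixed with a tab in porcelain output.
            ages.append((raw[1:], (now - committed).days))
    return ages


def stale_flag_lines(porcelain, flag_key, now, threshold_days=180):
    """Lines that mention flag_key and are older than the (assumed) threshold."""
    return [
        (text.strip(), age)
        for text, age in line_ages_days(porcelain, now)
        if flag_key in text and age > threshold_days
    ]


# Canned, abbreviated blame output for two lines (hashes are made up).
SAMPLE_BLAME = (
    "3f1c2d4e5a6b7c8d9e0f1a2b3c4d5e6f7a8b9c0d 12 12 1\n"
    "committer-time 1640995200\n"
    "\tif (client.isEnabled('old-banner')) {\n"
    "9a8b7c6d5e4f3a2b1c0d9e8f7a6b5c4d3e2f1a0b 13 13 1\n"
    "committer-time 1714521600\n"
    "\t  showBanner();\n"
)
now = datetime(2024, 7, 1, tzinfo=timezone.utc)
print(stale_flag_lines(SAMPLE_BLAME, "old-banner", now))
```

In a real run you would feed in the output of `git blame --line-porcelain` for each file that references the flag, and combine the age with the named-SDK and single-file checks described above.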
See your own flag debt
The interesting question isn't the average across open source -- it's what's hiding in your repo. You can find out in about thirty seconds, free:
- Scan any public repo → Paste a GitHub URL and see its stale flags, live, right now.
- Install the free GitHub Action → Get stale-flag findings as a comment on every pull request -- and, for LaunchDarkly and PostHog, an automatically-opened cleanup PR. No credit card.
- Browse the live scoreboard → This post is a snapshot; the scoreboard is the running version, updated as we rescan.
We'll refresh this study annually. If the pattern holds, most of the flags we found this year will still be there next year -- unless someone goes and removes them.