Feature Flags · August 28, 2025 · 9 min read

Joseph McGrath · Founder of FlagShark

Feature Flag Debt: What CTOs Need to Know

A strategic overview for CTOs on how feature flag debt silently erodes engineering velocity, increases incident risk, and what executive-level actions can reverse the trend.

Feature Flags · Engineering Management · Technical Debt

Every CTO has a mental model of where technical debt lives in their organization. It is in the legacy monolith that should have been decomposed three years ago. It is in the database schema that predates half the current team. It is in the test suite that everyone knows is insufficient but nobody has time to fix.

What almost never appears in that mental model is feature flags. And that blind spot is costing your organization more than you realize.

The invisible tax on your engineering organization

Feature flags have become essential to modern software delivery. They enable safe deployments, gradual rollouts, and rapid experimentation. Your teams probably create dozens of them every month. The problem is not the creation -- it is what happens afterward.

Based on what we have seen across engineering organizations, the pattern is consistent:

The vast majority of feature flags are never properly removed. Most teams clean up fewer than half of the flags they create.
Release flags frequently outlive their purpose by months. Flags intended to last a few weeks often persist for 6+ months.
Engineers lose meaningful time every week navigating flag complexity. Code reviews take longer, debugging is harder, and new hires ramp up more slowly.
Flag-related production incidents are more common than most teams realize. They are frequently misclassified as deployment or code quality issues.
Many flags have no documented owner. When no one feels responsible, cleanup never happens.

These observations come from working with engineering teams and analyzing codebases firsthand. They translate directly into velocity loss, incident risk, and talent attrition.

For any engineering organization of meaningful size, flag debt represents a significant and ongoing cost in lost productivity -- before you account for incident costs, opportunity costs, or the compounding effect on future development.

Why this problem is invisible to leadership

Feature flag debt has a unique property that makes it dangerous at the executive level: it is almost perfectly invisible until it causes a catastrophe.

No line item in any budget

Unlike infrastructure costs (which appear in your AWS bill) or tooling costs (which appear in procurement), flag debt hides inside developer time. It manifests as slightly longer sprint cycles, marginally more complex code reviews, and incrementally slower onboarding. No single data point triggers an alarm. The degradation is gradual, distributed, and difficult to attribute.

The "working fine" illusion

Stale flags that are permanently enabled or disabled create no visible symptoms. The code works. The feature is live. From the outside, everything appears functional. But every engineer who touches that code must reason about both branches of a conditional that will never actually toggle. Every test must account for a state that will never exist in production. The cost is real but silent.
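
A minimal sketch of what this looks like in code. The flag name `new_checkout_flow`, the in-memory `FLAGS` dictionary, and the checkout functions are all hypothetical, standing in for whatever flag service and business logic a team actually uses:

```python
# Hypothetical flag store; a real system would query a flag service instead.
FLAGS = {"new_checkout_flow": True}  # rolled out to 100% and never removed


def legacy_checkout(cart_total: float) -> str:
    # Never runs in production, but is still read, reviewed, and tested.
    return f"legacy:{cart_total:.2f}"


def new_checkout(cart_total: float) -> str:
    return f"new:{cart_total:.2f}"


def checkout(cart_total: float) -> str:
    # The flag never toggles, yet every reader must reason about both branches.
    if FLAGS.get("new_checkout_flow", False):
        return new_checkout(cart_total)
    return legacy_checkout(cart_total)


def checkout_after_cleanup(cart_total: float) -> str:
    # Same production behavior with the flag and the dead branch removed.
    return new_checkout(cart_total)
```

Both `checkout` and `checkout_after_cleanup` behave identically in production; the difference is how much code a reviewer or new hire has to understand to be sure of that.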

Survivorship bias in incident analysis

When flag-related incidents do occur, they are frequently misclassified. A deployment failure caused by an unexpected flag interaction gets labeled as a "deployment process issue." A production bug caused by stale flag logic gets categorized as a "code quality problem." The flag debt contributing factor rarely surfaces in post-mortems because teams focus on the proximate cause, not the underlying complexity.

The board-level risk you cannot ignore

If the productivity argument does not compel action, the risk argument should. Feature flag debt creates organizational risk that belongs on the executive radar.

The Knight Capital precedent

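On August 1, 2012, Knight Capital Group deployed new trading code that repurposed a flag once used to activate an obsolete order-handling function. The deployment missed one of eight servers, and on that server the repurposed flag triggered the old code, which had never been removed. In roughly 45 minutes, the firm lost more than $460 million; it needed emergency financing to survive and was acquired the following year. The root cause was not a sophisticated attack or an unforeseen market event. It was dead, flag-controlled code that nobody removed.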

Knight Capital is the extreme case, but the pattern repeats at smaller scales across the industry:

Configuration drift incidents where stale flags interact with new features in untested ways
Rollback failures where engineers cannot cleanly revert because flag states have diverged from expectations
Security exposure where deprecated authentication flags leave bypasses in production code
Compliance violations where flag-controlled data handling logic no longer matches documented procedures

Quantifying the risk

The following framework is useful for communicating flag debt risk to boards and executive teams. These are illustrative ranges, not precise predictions -- actual impact depends on your organization's scale and industry:

Risk Category	Likelihood (unmanaged)	Potential Impact Range	Risk Score
Major production incident	Medium-High	Moderate to severe	High
Data exposure via stale flags	Medium	Significant (regulatory)	High
Deployment failure cascade	High	Moderate per incident	Medium-High
Talent attrition from code quality	High	High (cost of replacing senior engineers)	Medium-High
Compliance audit finding	Medium	Moderate to high (remediation)	Medium

The cumulative risk of unmanaged flag debt -- across all categories -- is significant enough to warrant executive attention and strategic investment.

Assessing your organization's flag debt level

Before you can address the problem, you need to understand its scope. Here is a diagnostic framework for CTOs to assess flag debt across their engineering organization.

The Flag Debt Assessment

Ask your engineering leaders these five questions:

1. How many active feature flags exist across all services?

If the answer is "we don't know," you have a problem. If the answer is a number, benchmark it:

Org Size	Healthy Flag Count	Warning Zone	Critical
10-25 engineers	< 75	75-200	> 200
25-75 engineers	< 200	200-500	> 500
75-200 engineers	< 400	400-1,000	> 1,000
200+ engineers	< 800	800-2,000	> 2,000
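
As a rough sketch, the table can be turned into a lookup. The function name and the handling of boundary sizes are assumptions: the table's ranges overlap at 25, 75, and 200 engineers, and this version places a boundary value in the larger bracket.

```python
# Thresholds copied from the benchmark table above:
# (minimum engineers, healthy below this count, critical above this count)
BENCHMARKS = [
    (200, 800, 2000),
    (75, 400, 1000),
    (25, 200, 500),
    (10, 75, 200),
]


def classify_flag_count(engineers: int, active_flags: int) -> str:
    """Return 'healthy', 'warning', or 'critical' for an org of the given size."""
    if engineers < 10:
        raise ValueError("the table does not cover teams under 10 engineers")
    for min_engineers, healthy_below, critical_above in BENCHMARKS:
        if engineers >= min_engineers:
            if active_flags < healthy_below:
                return "healthy"
            if active_flags > critical_above:
                return "critical"
            return "warning"
    raise AssertionError("unreachable: every size >= 10 matches a bracket")
```

For example, `classify_flag_count(50, 300)` falls in the warning zone for a 25-75 engineer organization.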

2. What percentage of flags are older than 90 days?

For release and experiment flags, 90 days is the threshold between "in active use" and "probably stale." Healthy organizations keep this under 15%. In our experience, most organizations are above 40%.
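
One way to compute this percentage, assuming you can export a creation date for each release and experiment flag. The function name and input shape are illustrative, not tied to any particular flag platform:

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=90)  # threshold taken from the text above


def stale_share(created_dates: list[date], today: date) -> float:
    """Fraction of flags created more than 90 days before `today`.

    Assumes the list has already been filtered to release and experiment
    flags; long-lived operational flags would distort the result.
    """
    if not created_dates:
        return 0.0
    stale = sum(1 for created in created_dates if today - created > STALE_AFTER)
    return stale / len(created_dates)
```

Multiply the result by 100 to compare it against the 15% and 40% reference points.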

3. Do flags have assigned owners?

Ownerless flags are the highest-risk category. They represent code that nobody feels responsible for maintaining, testing, or removing. If more than 20% of your flags lack clear ownership, your cleanup processes have a structural gap.
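
A short sketch of the ownership check, assuming a hypothetical export that maps each flag name to an owner (or to nothing when no owner is recorded):

```python
from typing import Optional


def ownerless_share(flag_owners: dict[str, Optional[str]]) -> float:
    """Fraction of flags whose owner is missing or blank."""
    if not flag_owners:
        return 0.0
    ownerless = sum(
        1 for owner in flag_owners.values() if not (owner or "").strip()
    )
    return ownerless / len(flag_owners)


def has_ownership_gap(
    flag_owners: dict[str, Optional[str]], threshold: float = 0.20
) -> bool:
    # 20% is the threshold named in the text above.
    return ownerless_share(flag_owners) > threshold
```

Treating blank strings as "no owner" is a design choice here; exports often contain empty fields rather than true nulls.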

4. What is your flag removal-to-creation ratio?

Divide the number of flags removed per quarter by the number created in the same quarter. Any ratio below 1.0 means the total flag count is growing. A ratio below 0.8 means debt is accumulating. A ratio below 0.5 means debt is accumulating rapidly.
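
The arithmetic is simple enough to sketch directly. The function names and the label for ratios of 0.8 or higher are assumptions; only the 0.8 and 0.5 thresholds come from the text:

```python
def removal_ratio(flags_removed: int, flags_created: int) -> float:
    """Flags removed divided by flags created over the same period."""
    if flags_created == 0:
        raise ValueError("no flags were created in this period; ratio is undefined")
    return flags_removed / flags_created


def debt_trend(ratio: float) -> str:
    # Thresholds from the text: below 0.8 accumulating, below 0.5 rapidly.
    if ratio < 0.5:
        return "accumulating rapidly"
    if ratio < 0.8:
        return "accumulating"
    return "within threshold"
```

A team that created 30 flags and removed 12 in a quarter has a ratio of 0.4, which lands in the "accumulating rapidly" band.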

5. When was the last flag-related production incident?

If the answer is "never," either your team is exceptionally disciplined or (more likely) incidents are being misclassified. Ask your incident response team to re-examine the last 10 production incidents for flag-related contributing factors.

Scoring your assessment

Score	Level	Action Required
5/5 healthy answers	Low debt	Maintain current practices
3-4 healthy answers	Moderate debt	Process improvements needed within 1 quarter
1-2 healthy answers	High debt	Dedicated initiative needed within 1 month
0 healthy answers	Critical debt	Executive intervention required immediately
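
For teams that want to aggregate results across several engineering leaders, the scoring table can be encoded as a small function. This is a sketch; the function name is hypothetical and the strings are copied from the table above:

```python
def debt_level(healthy_answers: int) -> tuple[str, str]:
    """Map a count of healthy answers (0-5) to (level, action required)."""
    if not 0 <= healthy_answers <= 5:
        raise ValueError("healthy_answers must be between 0 and 5")
    if healthy_answers == 5:
        return ("Low debt", "Maintain current practices")
    if healthy_answers >= 3:
        return ("Moderate debt", "Process improvements needed within 1 quarter")
    if healthy_answers >= 1:
        return ("High debt", "Dedicated initiative needed within 1 month")
    return ("Critical debt", "Executive intervention required immediately")
```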

Strategic approaches to flag debt reduction

Addressing flag debt at the organizational level requires a combination of tooling, process, and cultural change. The balance between these three depends on your organization's size, maturity, and the severity of the problem.

Tooling investment vs. headcount

The most common executive response to technical debt is "let's hire more engineers." For flag debt specifically, this is the wrong answer. More engineers creating more flags without better processes and tooling will accelerate debt accumulation, not reduce it.

The effective investment hierarchy:

Automated detection and lifecycle tracking (highest ROI) -- Tools that automatically identify flag creation, track flag age, assign ownership, and surface stale flags. This eliminates the visibility problem that allows debt to accumulate silently. Solutions like FlagShark integrate directly with your GitHub workflow to provide this capability without requiring engineers to change their daily habits.
Process and policy enforcement (medium ROI) -- Naming conventions, expiration rules, ownership requirements, and code review gates. These are low-cost to implement but require management discipline to maintain.
Dedicated cleanup headcount (lowest ROI) -- Hiring engineers specifically for cleanup work. This is sometimes necessary for severe debt, but it treats the symptom rather than the cause. Use it as a short-term bridge while implementing the first two categories.
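
To make "automated detection" concrete, here is a deliberately simplified sketch of the underlying idea: scan source text for flag checks and count the call sites per flag. It assumes flags are checked through a hypothetical `is_enabled("flag-name")` call; real SDKs use different call shapes, and this is not a description of how any particular product works.

```python
import re
from collections import Counter

# Assumed call shape: is_enabled("some-flag"). Adjust the pattern to your SDK.
FLAG_CALL = re.compile(r"""is_enabled\(\s*["']([\w.-]+)["']\s*\)""")


def count_flag_references(sources: dict[str, str]) -> Counter:
    """Map each flag name to the number of call sites found across `sources`.

    `sources` maps a file path to that file's text.
    """
    counts: Counter = Counter()
    for text in sources.values():
        counts.update(FLAG_CALL.findall(text))
    return counts
```

Joining a count like this with flag creation dates and owners yields the inventory that the rest of the assessment depends on. A regex scan misses flags referenced through variables or wrappers, which is one reason dedicated tooling tends to do better than ad hoc scripts.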

Cultural change: The hardest lever

Technology and process changes are necessary but insufficient. Lasting flag hygiene requires a cultural shift where engineers view cleanup as professional excellence rather than janitorial work.

Executive actions that drive cultural change:

Make flag health metrics visible at the engineering all-hands. What gets measured at the leadership level gets prioritized at the team level.
Include flag hygiene in engineering ladder criteria. If promotions reward feature velocity without considering code stewardship, you are incentivizing debt creation.
Celebrate cleanup work. When a team reduces their stale flag count by 50%, that achievement deserves the same visibility as a major feature launch.
Lead by example. When CTOs and VPs ask about flag health in their staff meetings with the same regularity they ask about sprint velocity, the organization follows.

The compounding nature of flag debt

Unlike many forms of technical debt, flag debt exhibits compounding behavior that makes delayed action increasingly expensive. Understanding this compounding effect is critical for executive decision-making about when to intervene.

The interaction complexity curve

Each new flag added to a system does not exist in isolation. It can interact with every other flag in its vicinity, so complexity grows exponentially. With n independent boolean feature flags in a single service, the theoretical number of system states is 2^n:

Active Flags	Possible States	Testing Reality
10	1,024	Manageable with strategic coverage
20	1,048,576	Impossible to test comprehensively
30	1,073,741,824	Complete testing is mathematically infeasible
50	1.13 quadrillion	Each new flag doubles the untested state space

In practice, not all flag combinations occur. But the point stands: every stale flag that remains in the codebase multiplies the complexity burden for every engineer, every test, and every deployment.
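
The figures in the table follow directly from the 2^n formula, assuming every flag is a simple on/off boolean. A quick check:

```python
def flag_state_count(boolean_flags: int) -> int:
    """Number of on/off combinations for the given count of boolean flags."""
    return 2 ** boolean_flags


for n in (10, 20, 30, 50):
    print(f"{n:>2} flags -> {flag_state_count(n):,} states")
```

Removing a single stale flag halves the theoretical state space, which is why cleanup pays off disproportionately in services with many flags.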

The knowledge decay problem

Flag debt compounds through knowledge decay. When a flag is created, at least one engineer understands its purpose, its dependencies, and its safe removal path. Over time, that knowledge degrades:

Months 1-3: Creator remembers everything. Removal is straightforward.
Months 4-6: Creator remembers the purpose but may have forgotten edge cases. Removal requires some investigation.
Months 7-12: Creator has moved to other projects. Removal requires significant archeology.
Year 1+: Creator may have left the company. Removal requires reverse-engineering the flag's behavior from code, tests, and (if you are lucky) documentation.

The cost of removing a flag at month 12 is dramatically higher than removing it at month 1. Every month you delay, the per-flag removal cost increases while the total number of flags also grows. This is the compounding effect that makes delayed action so expensive.

The velocity spiral

Flag debt creates a self-reinforcing cycle that accelerates organizational slowdown:

Stale flags increase code complexity
Increased complexity slows development velocity
Slower velocity increases pressure to ship faster
Shipping pressure deprioritizes cleanup work
Deprioritized cleanup leads to more stale flags
Return to step 1, with higher baseline complexity

Each iteration through this cycle degrades velocity further. Organizations that do not break the cycle eventually reach a state where the majority of engineering effort goes toward navigating existing complexity rather than creating new value. By the time this becomes visible in metrics that reach the executive level, the problem has been compounding for years.

ROI calculations for flag management investment

When presenting the business case to your board or CEO, frame flag management as an infrastructure investment with quantifiable returns.

The cost model

The exact dollar cost of flag debt varies by organization, but the cost categories are consistent:

Lost productivity from flag complexity: Engineers spend time navigating dead code paths, reviewing stale conditionals, and maintaining tests for unreachable branches. This is the largest cost category by far.
Incident costs: Flag-related production incidents are often among the most expensive to diagnose because they span configuration and code.
Onboarding overhead: New engineers ramp up more slowly when the codebase is cluttered with flags whose purpose is unclear.
Talent attrition: Senior engineers who care about code quality are the most likely to leave organizations where technical debt goes unaddressed.

Investment in flag management tooling and process typically pays for itself quickly. The tooling cost is modest compared to engineering salaries, and even a small improvement in developer productivity across a team produces significant returns.

The compounding benefit of a cleaner codebase enabling faster future development is the most valuable return, though it is also the hardest to quantify upfront.

Benchmarking against industry standards

How does your organization compare? Use these benchmarks from high-performing engineering organizations to calibrate your expectations.

Flag hygiene maturity model

Level	Characteristics
Level 1: Chaotic	No flag tracking, no ownership, no cleanup process
Level 2: Reactive	Manual tracking, cleanup happens after incidents
Level 3: Proactive	Policies exist, regular cleanup sprints, basic metrics
Level 4: Systematic	Automated tracking, enforced policies, integrated into SDLC
Level 5: Optimized	Fully automated lifecycle, real-time metrics, continuous cleanup

In our experience, most organizations fall into Level 1 or 2. Moving to Level 3 delivers the majority of the value. Moving from Level 3 to Level 4 requires tooling investment but yields the most sustainable results.

Velocity improvements

Organizations that invest in flag hygiene consistently report meaningful improvements across key engineering metrics: faster PR cycle times, higher deployment frequency, fewer flag-related incidents, faster onboarding for new engineers, and better retention of senior engineers who value code quality. The exact numbers vary by organization, but the direction is consistent -- less flag debt means faster, more reliable development.

These velocity gains compound over time. A team that ships faster this quarter builds on that advantage next quarter, while a team drowning in flag debt falls further behind.

Your executive action plan

This month

Commission a flag debt assessment. Ask each engineering leader to answer the five diagnostic questions above. Aggregate the results into an organizational view.
Quantify the cost. Use the cost model framework to estimate what flag debt is costing your organization annually. Even rough estimates will be eye-opening.
Add flag health to your engineering metrics dashboard. If it is not measured, it will not be managed.

This quarter

Approve tooling investment. Automated flag lifecycle management delivers the highest ROI with the least organizational friction. Evaluate options and commit to a solution. FlagShark and similar tools can have you operational within days, not months.
Establish organizational standards. Naming conventions, expiration policies, and ownership requirements should be consistent across teams.
Set targets. Define what "healthy" looks like for your organization and hold engineering leaders accountable to those benchmarks.

This year

Integrate flag hygiene into engineering performance frameworks. What gets rewarded gets done.
Benchmark quarterly. Track your position on the maturity model and your velocity metrics against baseline.
Report to the board. Flag debt risk belongs in your technology risk reporting alongside security, compliance, and infrastructure resilience.

Feature flag debt is one of the few technical debt categories where the ROI on remediation is unambiguously positive, the risk of inaction is quantifiably severe, and the solution is well-understood. The organizations that address it strategically -- with tooling, process, and cultural investment -- will outperform their competitors in velocity, reliability, and talent retention.

The organizations that ignore it will continue to wonder why everything takes so long, why incidents keep recurring, and why their best engineers keep leaving.

The data is clear. The playbook exists. The only remaining question is whether you act before the next flag-related incident forces your hand.
