Somewhere in your codebase right now, there are feature flags that nobody remembers creating. Flags controlling code paths that haven't been touched in months. Flags referencing experiments that ended two quarters ago. Flags whose owners left the company a year before you joined.
You know they're there. You've stumbled across them during late-night debugging sessions, muttered something about cleaning them up "next sprint," and moved on. The backlog is full, the roadmap is packed, and flag cleanup never quite makes it to the top.
Here is the uncomfortable truth: most enterprise codebases contain far more feature flags than anyone expects, and the majority of them are stale. Every one of those stale flags adds cognitive load, increases testing complexity, and creates potential failure modes. But most teams never audit their flags because they assume it will take days of painstaking work.
It does not have to. You can perform a meaningful feature flag audit in 30 minutes flat. Not a comprehensive deep-clean -- that comes later -- but a rapid triage that identifies your biggest risks, builds a removal backlog, and gives you the data to justify dedicated cleanup time to your engineering lead.
Set a timer. Let's go.
Minute 0-5: Build your flag inventory with grep
Before you can assess anything, you need a list. The fastest way to build a flag inventory is to search your codebase for calls to your feature flag SDK.
Finding flags by SDK method calls
Start with your primary flag provider. If you use LaunchDarkly, Unleash, Split, or any other provider, you already know the method names. Search for those directly.
LaunchDarkly (Go):
grep -rn "BoolVariation\|StringVariation\|IntVariation\|Float64Variation\|JSONVariation" \
--include="*.go" \
--exclude-dir=vendor \
--exclude-dir=node_modules \
. | sort -t: -k1,1 > /tmp/flag-audit.txt
LaunchDarkly (TypeScript/JavaScript):
grep -rn "variation\|boolVariation\|stringVariation\|intVariation\|jsonVariation" \
--include="*.ts" --include="*.tsx" --include="*.js" --include="*.jsx" \
--exclude-dir=node_modules \
--exclude-dir=dist \
. | sort -t: -k1,1 >> /tmp/flag-audit.txt
LaunchDarkly (Python):
grep -rn "variation\|variation_detail\|bool_variation" \
--include="*.py" \
--exclude-dir=venv \
--exclude-dir=.venv \
. | sort -t: -k1,1 >> /tmp/flag-audit.txt
Unleash (any language):
grep -rn "isEnabled\|is_enabled\|IsEnabled\|getVariant\|get_variant\|GetVariant" \
--include="*.go" --include="*.ts" --include="*.tsx" --include="*.js" --include="*.py" \
--exclude-dir=vendor --exclude-dir=node_modules --exclude-dir=venv \
. | sort -t: -k1,1 >> /tmp/flag-audit.txt
Extracting unique flag keys
Now extract just the flag key strings from those results:
# -i lets one pattern cover BoolVariation, variation, isEnabled, is_enabled, etc.
grep -ohiP '(?:variation|is_?enabled|get_?variant)\w*\s*\(\s*"[^"]+"' \
  /tmp/flag-audit.txt | \
  grep -oP '"[^"]+"' | \
  sort -u > /tmp/flag-keys.txt
wc -l /tmp/flag-keys.txt
That wc -l output is your total flag count. Write it down -- this is your baseline metric.
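To sanity-check the extraction before a real run, the same approach works on a fabricated sample (the file paths and flag names below are invented):

```shell
# Simulated grep output in file:line:code form, as the inventory step produces
cat > /tmp/demo-audit.txt <<'EOF'
checkout/handler.go:42:  enabled := client.BoolVariation("new-checkout", ctx, false)
auth/middleware.py:17:  if client.variation("legacy-auth-bypass", user, False):
search/index.ts:88:  const on = unleash.isEnabled("search-reindex-v2");
checkout/handler.go:97:  enabled := client.BoolVariation("new-checkout", ctx, false)
EOF

# Case-insensitive match covers the different SDK spellings; keep the quotes
grep -ohiP '(?:variation|is_?enabled|get_?variant)\w*\s*\(\s*"[^"]+"' /tmp/demo-audit.txt \
  | grep -oP '"[^"]+"' \
  | sort -u > /tmp/demo-keys.txt

cat /tmp/demo-keys.txt   # three unique keys despite four call sites
```

Note that duplicates collapse: the two `new-checkout` call sites become one inventory entry.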
What you should have after 5 minutes:
- A file listing every line in your codebase that evaluates a flag
- A deduplicated list of unique flag keys
- A total flag count
If your total is under 20, you are in decent shape. Between 20 and 50, you have work to do. Between 50 and 100, flag debt is actively slowing you down. Over 100, you have a serious flag debt problem. Over 200, you are not alone -- but you need to act soon.
Minute 5-10: Cross-reference with your flag management platform
Open your flag management platform (LaunchDarkly, Unleash, Split, ConfigCat, or whatever you use) in another browser tab. You are looking for two things: flags that exist in code but not in the platform, and flags that exist in the platform but not in code.
Flags in code but not in the platform
These are orphaned flags -- code references to flags that have been deleted from your management platform. They are evaluating against default values, which means they are dead code paths that still add complexity.
Export your platform's flag list (most platforms have a CSV or API export) and compare:
# Export your platform flags to a file (method varies by provider)
# For LaunchDarkly CLI:
# ld flags list --project default --environment production > /tmp/platform-flags.txt
# Compare: flags in code but NOT in platform
# (comm requires both inputs sorted; strip quotes so the key formats match)
comm -23 <(tr -d '"' < /tmp/flag-keys.txt | sort) <(sort /tmp/platform-flags.txt) > /tmp/orphaned-flags.txt
echo "Orphaned flags (in code, not in platform):"
cat /tmp/orphaned-flags.txt
Flags in the platform but not in code
These are phantom flags -- configured in your management platform but never referenced in code. They might be remnants of deleted features, or they might indicate that flag cleanup happened in code but nobody removed the platform configuration.
# Compare: flags in platform but NOT in code
comm -13 <(tr -d '"' < /tmp/flag-keys.txt | sort) <(sort /tmp/platform-flags.txt) > /tmp/phantom-flags.txt
echo "Phantom flags (in platform, not in code):"
cat /tmp/phantom-flags.txt
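On a toy pair of lists, the two comparisons look like this (keys are invented; `comm` needs both inputs sorted and in the same bare-key format):

```shell
# Toy key lists standing in for the code and platform exports
printf '%s\n' checkout-v2 legacy-auth search-boost | sort > /tmp/code-keys.txt
printf '%s\n' checkout-v2 pricing-exp search-boost | sort > /tmp/platform-keys.txt

# -23: lines only in file 1 (code)     -> orphaned
# -13: lines only in file 2 (platform) -> phantom
comm -23 /tmp/code-keys.txt /tmp/platform-keys.txt   # prints: legacy-auth
comm -13 /tmp/code-keys.txt /tmp/platform-keys.txt   # prints: pricing-exp
```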
Quick platform checks
While you are in your platform, note these details for each flag:
| Check | What to Look For |
|---|---|
| Targeting rules | Flags with no targeting rules (serving default to everyone) |
| Percentage rollout | Flags at 100% or 0% for more than 30 days |
| Last evaluated | Flags with zero evaluations in the past 30 days |
| Environments | Flags with different states across environments |
| Tags/descriptions | Flags with no tags, descriptions, or ownership metadata |
What you should have after 10 minutes:
- A list of orphaned flags (in code, not in platform)
- A list of phantom flags (in platform, not in code)
- Notes on flags with suspicious platform configurations
Minute 10-15: Check git blame for age and ownership
Now it gets interesting. For each flag in your inventory, you want to know two things: how old is it, and who created it?
Finding flag creation dates
Use git log to find when each flag key first appeared in the codebase:
while IFS= read -r flag; do
# Remove quotes from flag key
clean_flag=$(echo "$flag" | tr -d '"')
  # Find the first commit that introduced this flag (pickaxe search, oldest first;
  # --diff-filter=A and -p would miss flags added to existing files and pollute output)
  first_commit=$(git log --all --reverse -S "$clean_flag" \
    --format="%H %ai %an" -- '*.go' '*.ts' '*.tsx' '*.js' '*.py' | head -1)
if [ -n "$first_commit" ]; then
echo "$clean_flag | $first_commit"
else
echo "$clean_flag | UNKNOWN ORIGIN"
fi
done < /tmp/flag-keys.txt > /tmp/flag-ages.txt
This command searches the entire git history for the first commit that added each flag key. It outputs the flag name, commit hash, date, and author.
Quick age analysis
# Count flags older than a given cutoff (GNU date; skips UNKNOWN ORIGIN lines,
# which would otherwise break the numeric comparison)
count_older_than() {
  cutoff=$(date -d "$1 days ago" +%s)
  awk -F'|' '{print $2}' /tmp/flag-ages.txt | \
  awk '{print $2}' | \
  while read -r d; do
    ts=$(date -d "$d" +%s 2>/dev/null) || continue
    [ "$ts" -lt "$cutoff" ] && echo "$d"
  done | wc -l
}

echo "=== Flag Age Distribution ==="
echo "Flags older than 90 days:  $(count_older_than 90)"
echo "Flags older than 180 days: $(count_older_than 180)"
Ownership check
For each flag, identify whether the original author is still on the team:
# List unique flag authors (the author name starts at field 5: hash, date, time, tz, name...)
echo "=== Flag Authors ==="
awk -F'|' '{print $2}' /tmp/flag-ages.txt | awk '{for(i=5;i<=NF;i++) printf "%s ",$i; print ""}' | sort | uniq -c | sort -rn
Flags whose authors have left the company are higher risk -- there is no institutional knowledge about why they were created or what edge cases they handle.
What you should have after 15 minutes:
- The creation date for each flag
- The original author for each flag
- A count of flags by age bucket (30/60/90/180+ days)
- A list of flags with no identifiable owner
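The same age data can also be bucketed in a single pass. This sketch runs against fabricated inventory lines in the `flag | hash date ...` format (GNU date assumed; for a real run, read /tmp/flag-ages.txt instead of the demo file):

```shell
# Demo inventory in the "flag | hash date time tz author" format
printf '%s\n' \
  'ancient-flag | abc123 2020-01-15 10:00:00 +0000 Jane Doe' \
  'legacy-auth | def456 2019-06-01 09:00:00 +0000 Bob Smith' \
  'orphan-flag | UNKNOWN ORIGIN' > /tmp/demo-ages.txt

now=$(date +%s)
over180=0; newer=0; unknown=0
while IFS='|' read -r flag meta; do
  created=$(echo "$meta" | awk '{print $2}')   # ISO date, or junk on UNKNOWN lines
  if ! ts=$(date -d "$created" +%s 2>/dev/null); then
    unknown=$((unknown + 1))
  elif [ $(( (now - ts) / 86400 )) -ge 180 ]; then
    over180=$((over180 + 1))
  else
    newer=$((newer + 1))
  fi
done < /tmp/demo-ages.txt
echo "180d+: $over180  newer: $newer  unknown: $unknown"   # -> 180d+: 2  newer: 0  unknown: 1
```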
Minute 15-20: Categorize your flags
Now categorize every flag into one of five buckets. This is the most important step because it determines your action plan.
The five flag categories
| Category | Definition | Criteria | Action |
|---|---|---|---|
| Active | Currently in rollout or experiment | Created < 30 days ago, targeting rules active, not at 100% | Leave alone |
| Completed | Rollout finished, flag at 100% | At 100% for 30+ days, no targeting rules | Remove (safe) |
| Stale | No recent activity, unclear purpose | 90+ days old, no recent evaluations, no documentation | Investigate, then remove |
| Orphaned | In code but not in platform | Flag key not found in management platform | Remove (safe) |
| Risky | Complex dependencies or interactions | Nested with other flags, shared across services, kill switch | Careful removal with testing |
Scoring rubric
Assign each flag a risk score from 1-10 based on these factors:
| Factor | Low Risk (1-3) | Medium Risk (4-6) | High Risk (7-10) |
|---|---|---|---|
| Age | < 30 days | 30-90 days | 90+ days |
| References | 1-2 files | 3-5 files | 6+ files |
| Nesting | No nesting | Inside one conditional | Nested with other flags |
| Owner | Active team member | Team member, different team | Left company / unknown |
| Test coverage | Flag-specific tests exist | General tests cover path | No test coverage |
| Documentation | Documented with purpose | Partial documentation | No documentation |
| Service scope | Single service | 2-3 services | Cross-service dependency |
Total the scores. Flags scoring 25+ out of 70 should be prioritized for immediate investigation. Flags scoring 40+ are ticking time bombs.
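To keep the arithmetic honest, the rubric total can be scripted. This helper is a sketch: the seven positional arguments are the factor scores you assign by hand, and the thresholds follow the rubric above.

```shell
# Sum seven factor scores (each 1-10) and classify the total:
# 40+ = critical, 25+ = investigate, below that = ok
score_flag() {
  total=0
  for s in "$@"; do
    total=$((total + s))
  done
  if [ "$total" -ge 40 ]; then verdict="critical"
  elif [ "$total" -ge 25 ]; then verdict="investigate"
  else verdict="ok"
  fi
  echo "$total $verdict"
}

# age refs nesting owner tests docs scope
score_flag 8 7 9 10 6 8 7   # -> 55 critical: stale, unowned, cross-service
score_flag 2 1 1 1 2 1 1    # -> 9 ok: fresh, scoped, tested
```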
Building the audit spreadsheet
Create a spreadsheet (or CSV) with this template:
Flag Key,Category,Age (Days),Owner,Files Referenced,Risk Score,Platform Status,Last Evaluated,Action,Priority,Ticket
enable-new-checkout,Completed,127,jane.doe,3,18,100% ON,2025-08-15,Remove,High,
legacy-auth-bypass,Stale,340,UNKNOWN,7,42,Not Found,Never,Investigate,Critical,
experiment-pricing-v2,Active,12,bob.smith,2,8,50% rollout,2025-09-10,Monitor,Low,
temp-fix-api-timeout,Orphaned,95,sarah.jones,1,22,Not Found,N/A,Remove,High,
Fill this out as fast as you can -- you do not need perfect data for every column. Estimates are fine. The goal is a complete picture, not a perfect one.
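One shortcut: generate the skeleton rows from your key list instead of typing them. This sketch uses a stand-in keys file; point it at /tmp/flag-keys.txt for the real run.

```shell
# Stand-in for /tmp/flag-keys.txt (keys keep their quotes in that file)
printf '%s\n' '"checkout-v2"' '"legacy-auth"' > /tmp/sample-keys.txt

# Header plus one stub row per flag key; fill the remaining columns by hand
echo 'Flag Key,Category,Age (Days),Owner,Files Referenced,Risk Score,Platform Status,Last Evaluated,Action,Priority,Ticket' > /tmp/flag-audit.csv
tr -d '"' < /tmp/sample-keys.txt | while IFS= read -r key; do
  printf '%s,,,,,,,,,,\n' "$key"
done >> /tmp/flag-audit.csv

cat /tmp/flag-audit.csv
```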
What you should have after 20 minutes:
- Every flag categorized into one of five buckets
- A risk score for each flag
- A spreadsheet tracking all flags with key metadata
Minute 20-25: Prioritize removal candidates
With your categorized inventory, sort by priority. The goal is to identify the flags you can remove with the least risk and the most impact.
The removal priority matrix
| Priority | Category | Criteria | Estimated Effort |
|---|---|---|---|
| P0 - Immediate | Orphaned | In code, not in platform, evaluating defaults | 15-30 min per flag |
| P1 - This Sprint | Completed | 100% on/off for 90+ days, clear purpose | 30-60 min per flag |
| P2 - Next Sprint | Stale | 90+ days, low reference count, owner available | 1-2 hours per flag |
| P3 - Scheduled | Stale | High reference count, complex dependencies | 2-4 hours per flag |
| P4 - Investigate | Risky | Nested flags, cross-service, unknown purpose | 4+ hours per flag |
Quick wins: flags you can remove today
Orphaned flags are almost always safe to remove immediately. They reference flag keys that no longer exist in your management platform, which means they are already evaluating to their default values in production. Removing them changes nothing about runtime behavior -- it just removes dead code.
Completed flags at 100% are the next easiest. If a flag has been serving 100% of traffic for 90+ days with no issues, the feature is stable. The flag is no longer providing value; it is only adding complexity.
Estimating cleanup impact
Calculate the total hours needed to clear your backlog:
P0 flags: ___ flags x 0.5 hours = ___ hours
P1 flags: ___ flags x 0.75 hours = ___ hours
P2 flags: ___ flags x 1.5 hours = ___ hours
P3 flags: ___ flags x 3 hours = ___ hours
P4 flags: ___ flags x 5 hours = ___ hours
---
Total estimated cleanup: ___ hours
This number is critical for step 6. It tells your engineering lead exactly how much investment is needed and lets you plan cleanup across multiple sprints rather than trying to do everything at once.
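The worksheet arithmetic is easy to script; the per-tier counts below are placeholders for your own numbers.

```shell
# Placeholder per-tier flag counts; replace with your audit results
p0=4; p1=6; p2=5; p3=2; p4=1

# awk handles the fractional hour multipliers from the worksheet
total=$(awk -v a="$p0" -v b="$p1" -v c="$p2" -v d="$p3" -v e="$p4" \
  'BEGIN { print a*0.5 + b*0.75 + c*1.5 + d*3 + e*5 }')
echo "Estimated cleanup: $total hours"   # with 4,6,5,2,1 -> 25 hours
```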
What you should have after 25 minutes:
- A prioritized removal list
- Estimated effort for each priority tier
- A total cleanup hours estimate
- A shortlist of quick wins you can tackle immediately
Minute 25-30: Create cleanup tickets
The audit is worthless if it does not lead to action. In the final five minutes, convert your findings into trackable work items.
Creating effective cleanup tickets
For each P0 and P1 flag, create a ticket with this template:
## Remove feature flag: [FLAG_KEY]
**Category:** [Orphaned / Completed / Stale]
**Risk Score:** [X/70]
**Age:** [X days]
**Owner:** [Original author]
**Files affected:** [List files]
### Context
[One sentence about what this flag controlled]
### Acceptance Criteria
- [ ] Flag evaluation code removed from all files
- [ ] Default/winning code path preserved
- [ ] Dead code path removed
- [ ] Flag removed from management platform (if applicable)
- [ ] Tests updated to remove flag-specific branches
- [ ] No regressions in affected test suites
### Estimated Effort
[X hours]
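If your audit CSV follows the spreadsheet template from the previous section, the ticket stubs can be generated rather than hand-written. A sketch, shown against a two-row demo CSV (swap in your real file; the comma parsing is naive and assumes no field contains a comma):

```shell
# Demo CSV in the audit-spreadsheet column order
cat > /tmp/demo-audit.csv <<'EOF'
Flag Key,Category,Age (Days),Owner,Files Referenced,Risk Score,Platform Status,Last Evaluated,Action,Priority,Ticket
enable-new-checkout,Completed,127,jane.doe,3,18,100% ON,2025-08-15,Remove,High,
experiment-pricing-v2,Active,12,bob.smith,2,8,50% rollout,2025-09-10,Monitor,Low,
EOF

# Emit a ticket stub for every High/Critical-priority row (field 10 = Priority)
awk -F',' 'NR > 1 && ($10 == "High" || $10 == "Critical") {
  printf "## Remove feature flag: %s\n", $1
  printf "**Category:** %s | **Risk Score:** %s/70 | **Age:** %s days | **Owner:** %s\n\n", $2, $6, $3, $4
}' /tmp/demo-audit.csv > /tmp/tickets.md

cat /tmp/tickets.md
```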
Batch tickets for efficiency
For P2 and P3 flags, group them into batch tickets by service or module:
## Flag cleanup batch: [Service/Module Name]
**Flags to remove:** [count]
**Total estimated effort:** [X hours]
### Flags
| Flag Key | Category | Risk Score | Files |
|----------|----------|------------|-------|
| flag-1 | Stale | 28 | 3 |
| flag-2 | Stale | 31 | 4 |
| flag-3 | Completed| 15 | 2 |
### Notes
[Any dependencies or ordering requirements]
The summary ticket
Create one summary ticket that captures the audit results:
## Feature Flag Audit Results - [Date]
### Summary
- **Total flags found:** [X]
- **Active (keep):** [X]
- **Removal candidates:** [X]
- **Estimated total cleanup:** [X hours]
### Breakdown
| Category | Count | Est. Hours |
|----------|-------|-----------|
| Orphaned (P0) | X | X |
| Completed (P1) | X | X |
| Stale (P2) | X | X |
| Complex (P3) | X | X |
| Risky (P4) | X | X |
### Recommendation
[Your recommendation for cleanup cadence and timeline]
What you should have after 30 minutes:
- Individual tickets for P0 and P1 flags
- Batch tickets for P2 and P3 flags
- A summary ticket with full audit results
- A recommended cleanup timeline
After the audit: Making it stick
A one-time audit solves today's problem. Preventing the next flag graveyard requires building auditing into your regular workflow.
Establish a recurring audit cadence
| Team Size | Recommended Cadence | Time Investment |
|---|---|---|
| 1-10 engineers | Monthly | 30 minutes |
| 10-30 engineers | Bi-weekly | 30-45 minutes |
| 30-100 engineers | Weekly | 45-60 minutes |
| 100+ engineers | Continuous (automated) | Tool-driven |
Automate what you can
The manual grep-and-git-blame approach works for a quick audit, but it does not scale. As your team and codebase grow, you need automated flag detection and lifecycle tracking.
Tools like FlagShark automate the entire inventory and age-tracking process by analyzing every pull request for flag additions and removals. Instead of running grep commands monthly, you get a continuously updated view of every flag in your codebase, when it was added, who added it, and whether it has been removed.
Set flag hygiene metrics
Track these metrics over time to measure improvement:
| Metric | Target | How to Measure |
|---|---|---|
| Total flag count | Stable or declining | Monthly audit |
| Stale flag percentage | < 20% | Flags > 90 days with no changes |
| Average flag age | < 60 days | Mean age of all active flags |
| Orphaned flag count | 0 | Cross-reference code vs. platform |
| Time to remove | < 14 days after 100% rollout | Track from rollout completion to removal |
| Audit completion rate | 100% | Tickets created vs. tickets completed |
The flag hygiene dashboard
If you track nothing else, track the stale flag ratio: the percentage of flags older than 90 days with no recent modifications. This single metric captures the health of your flag management practices better than any other.
A stale flag ratio under 20% means your team is actively managing flag lifecycle. Between 20-40% suggests flag debt is accumulating but manageable. Over 40% means you are heading toward flag hell and need to prioritize cleanup.
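Computing the stale flag ratio from your age inventory takes only a few lines. This sketch runs against demo data in the same `flag | hash date ...` format (GNU date assumed; point it at /tmp/flag-ages.txt for the real number):

```shell
# Demo inventory: two old flags plus one created ten days ago
recent=$(date -d '10 days ago' +%F)   # GNU date
printf '%s\n' \
  "checkout-v2 | abc 2020-01-10 10:00:00 +0000 Jane" \
  "legacy-auth | def 2019-06-01 09:00:00 +0000 Bob" \
  "fresh-flag | ghi $recent 09:00:00 +0000 Ann" > /tmp/metric-ages.txt

now=$(date +%s)
stale=0; total=0
while IFS='|' read -r flag meta; do
  created=$(echo "$meta" | awk '{print $2}')
  ts=$(date -d "$created" +%s 2>/dev/null) || continue   # skip unknown-origin lines
  total=$((total + 1))
  if [ $(( (now - ts) / 86400 )) -ge 90 ]; then
    stale=$((stale + 1))
  fi
done < /tmp/metric-ages.txt
echo "Stale flag ratio: $((100 * stale / total))%"   # 2 of 3 here -> 66%
```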
Common audit pitfalls to avoid
Pitfall 1: Only searching for known patterns
Your grep commands only find flags you know about. Teams often have flags hidden in configuration files, environment variables, or database-driven feature toggles that do not appear in code searches. Check your config files, YAML, JSON, and environment templates too.
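A quick way to catch config-only flags is a second grep pass over configuration files. The pattern below is a naive starting point you will want to adapt to your own naming conventions; the demo directory and flag name are invented.

```shell
# A flag defined only in config, invisible to the SDK-call grep
mkdir -p /tmp/flag-demo
cat > /tmp/flag-demo/app.yaml <<'EOF'
features:
  enable_beta_dashboard: true
timeout_ms: 3000
EOF

# Search config formats for flag-ish names; tune the pattern to your conventions
grep -rn --include='*.yaml' --include='*.yml' --include='*.json' \
  -iE 'feature|toggle|enable_' /tmp/flag-demo > /tmp/config-flag-hits.txt

cat /tmp/config-flag-hits.txt
```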
Pitfall 2: Ignoring test files
Test files contain flag references that need cleanup too. When you remove a flag from production code, the corresponding test mocks, fixtures, and assertions also need updating. Include test files in your audit scope.
Pitfall 3: Treating all flags equally
A kill switch that protects against a known failure mode should not be treated the same as a stale release flag. Your scoring rubric accounts for this, but resist the urge to set a blanket "remove everything older than X days" policy without nuance.
Pitfall 4: Auditing without authority to act
An audit that produces tickets nobody works on is worse than no audit at all -- it creates the illusion of progress while debt continues to accumulate. Before you start, confirm that your team lead or engineering manager supports allocating time for the resulting cleanup work.
The 30-minute audit in practice
Here is what a real audit looks like for a mid-size codebase:
| Metric | Example Result |
|---|---|
| Total flags found | 67 |
| Active (keep) | 12 |
| Completed (P1 removal) | 18 |
| Orphaned (P0 removal) | 8 |
| Stale (P2/P3) | 23 |
| Risky (P4 investigate) | 6 |
| Estimated cleanup hours | 47 hours |
| Stale flag ratio | 46% |
In this example, 26 flags (the P0 and P1 items) can be removed with relatively low risk, taking roughly 18 hours of work (8 x 0.5 + 18 x 0.75). That is just over two days of focused cleanup that eliminates nearly 40% of the flag inventory. The remaining 29 flags need more investigation, but they are now tracked and prioritized instead of invisible.
Forty-seven hours might sound like a lot, but spread across a team of 10 engineers over two sprints, that is less than 3 hours per person per sprint. The productivity gains from removing 55 flags will pay that investment back within weeks.
Thirty minutes. That is all it takes to go from "we probably have some stale flags" to a prioritized, ticketed cleanup plan backed by real data. The flags hiding in your codebase are not going to clean themselves up. But now you know exactly where they are, how old they are, who owns them, and what it will take to remove them.
The only question left is when you start. Set a timer. Open a terminal. The audit begins now.