October 23, 2025·11 min read

Feature Flags in CI/CD: Automate Cleanup

How to integrate feature flag cleanup into your CI/CD pipeline with automated detection, age warnings, and PR-based removal workflows.

Feature Flags DevOps Automation

Your team has a thorough CI/CD pipeline. Every pull request is linted, tested, type-checked, and scanned for security vulnerabilities before it can merge. But feature flags? They enter the codebase unchecked and pile up without ever being cleaned up, completely invisible to the pipeline that guards everything else.

This is the blind spot in modern CI/CD. Organizations invest heavily in automated quality gates for code style, test coverage, dependency vulnerabilities, and even accessibility compliance. Yet the single largest source of long-lived technical debt -- abandoned feature flags -- passes through every gate without triggering a single warning.

Based on what we have seen, most engineering teams accumulate a steady stream of stale flags every quarter. Over a year, that can easily mean dozens of abandoned flags adding conditional complexity, inflating test matrices, and slowing down every developer who encounters them. The fix is not more discipline or more spreadsheets. The fix is treating flag hygiene the same way you treat code quality: as an automated, enforceable part of your CI/CD pipeline.

Why flags need CI/CD integration

Feature flag management platforms like LaunchDarkly, Split, and Unleash excel at controlling flag evaluation at runtime. They tell you which flags exist, who they target, and whether they are on or off. What they cannot tell you is where a flag lives in your code, how deeply it is embedded, or whether removing it is safe.

That gap between the management platform and the codebase is where technical debt breeds. A flag can be archived in LaunchDarkly while still controlling critical code paths in production. A flag can be 100% enabled for six months with no one realizing it should have been hardcoded and cleaned up weeks after rollout.

CI/CD is the natural enforcement point for flag hygiene because it already sits at the intersection of code changes and deployment decisions. Every relevant event -- flag creation, flag aging, flag removal -- manifests as a code change that passes through your pipeline.

The flag lifecycle mapped to CI/CD events

Flag Lifecycle Stage	CI/CD Event	Automated Action
Flag created	New flag reference detected in PR	Log flag creation, start age tracking
Flag aging	Time passes, flag remains in code	Warn at 30/60/90 days in PR checks
Flag fully rolled out	Flag at 100% for N days	Generate cleanup PR automatically
Flag being removed	Removal PR opened	Validate complete removal, run tests
Flag removed	Removal PR merged	Confirm no remaining references
Flag re-introduced	Old flag key reappears	Block merge, require justification

Building the flag-aware CI pipeline

Integrating flag cleanup into CI/CD involves four layers, each building on the previous one. Teams can adopt these incrementally, starting with detection and progressing to full enforcement.

Layer 1: Flag detection on every PR

The foundation is knowing when flags enter or leave your codebase. A PR check that scans for flag SDK method calls gives you visibility into every flag change.

GitHub Actions example -- flag detection check:

name: Flag Detection
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  detect-flags:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Detect flag changes in PR
        run: |
          # Get changed files
          CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD)

          # Patterns for common flag SDK methods
          FLAG_PATTERNS=(
            "variation\("
            "BoolVariation\("
            "StringVariation\("
            "isEnabled\("
            "is_enabled\("
            "useFeatureFlag\("
            "getFlag\("
          )

          NEW_FLAGS=()
          REMOVED_FLAGS=()

          for file in $CHANGED_FILES; do
            if [ -f "$file" ]; then
              for pattern in "${FLAG_PATTERNS[@]}"; do
                # Check for new flag references (added lines).
                # The second grep drops the "+++ b/<file>" diff header, which also starts with "+".
                ADDED=$(git diff origin/${{ github.base_ref }}...HEAD -- "$file" \
                  | grep "^+" | grep -v "^+++ " | grep -E "$pattern" || true)
                if [ -n "$ADDED" ]; then
                  NEW_FLAGS+=("$file: $ADDED")
                fi

                # Check for removed flag references (the "--- a/<file>" header is dropped the same way)
                REMOVED=$(git diff origin/${{ github.base_ref }}...HEAD -- "$file" \
                  | grep "^-" | grep -v "^--- " | grep -E "$pattern" || true)
                if [ -n "$REMOVED" ]; then
                  REMOVED_FLAGS+=("$file: $REMOVED")
                fi
              done
            fi
          done

          # Report findings
          if [ ${#NEW_FLAGS[@]} -gt 0 ]; then
            echo "::notice::New flag references detected in this PR"
            printf '%s\n' "${NEW_FLAGS[@]}"
          fi

          if [ ${#REMOVED_FLAGS[@]} -gt 0 ]; then
            echo "::notice::Flag references removed in this PR"
            printf '%s\n' "${REMOVED_FLAGS[@]}"
          fi

This basic approach uses regex matching and works as a starting point. For production use, AST-based parsing (using tools like tree-sitter) provides significantly more accurate detection. Regex will miss dynamic flag key construction and generate false positives on comments and strings that happen to match the pattern.
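To make the difference concrete, here is a minimal sketch of syntax-tree-based detection. It uses Python's built-in `ast` module as a stand-in for tree-sitter, so it only understands Python source, and the method names in `FLAG_METHODS` are assumptions for illustration rather than any particular SDK's API.

```python
import ast

# Hypothetical SDK method names to treat as flag evaluations (an assumption).
FLAG_METHODS = {"variation", "is_enabled", "get_flag"}


def find_flag_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, flag_key) for each flag call with a literal string key."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both client.is_enabled(...) and a bare is_enabled(...).
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name not in FLAG_METHODS or not node.args:
            continue
        first = node.args[0]
        # Only literal keys are reported; dynamically built keys are skipped.
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            found.append((node.lineno, first.value))
    return sorted(found)


sample = '''
# is_enabled("commented-out-flag") should be ignored
doc = "call is_enabled('string-flag') to check"
if client.is_enabled("new-checkout"):
    pass
'''
print(find_flag_calls(sample))  # [(4, 'new-checkout')]
```

The comment and the string literal both contain text that a regex would match, but neither is a call node in the syntax tree, so only the real evaluation on line 4 is reported. Dynamically constructed keys still go unreported here; a fuller tool would at least flag them for manual review.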

What this gives you: Visibility. Every PR that adds or removes a flag reference is annotated, creating a searchable history of flag changes in your repository.

Layer 2: Age-based warnings and enforcement

Once you can detect flags, the next step is tracking their age and surfacing warnings when flags exceed their expected lifespan.

GitHub Actions example -- flag age warnings:

name: Flag Age Check
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  check-flag-age:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Check for stale flag references
        run: |
          # Define age thresholds (in days)
          WARN_THRESHOLD=60
          ERROR_THRESHOLD=90
          BLOCK_THRESHOLD=180

          WARNINGS=0
          ERRORS=0

          # Get all flag references in changed files
          CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD)

          for file in $CHANGED_FILES; do
            if [ -f "$file" ]; then
              # Find flag key strings in the file
              FLAG_KEYS=$(grep -oP '["'"'"']([a-z][a-z0-9_-]+\.?[a-z0-9_-]+)["'"'"']' "$file" \
                | sort -u || true)

              for key in $FLAG_KEYS; do
                # Oldest commit that changed the number of occurrences of this key.
                # (--diff-filter=A would only match commits that created the file,
                # so keys added later to an existing file would never be found.)
                # %at prints the author date as a Unix timestamp, so no date parsing is needed.
                FIRST_EPOCH=$(git log --all -S "$key" --format="%at" -- "$file" | tail -1)

                if [ -n "$FIRST_EPOCH" ]; then
                  NOW_EPOCH=$(date +%s)
                  AGE_DAYS=$(( (NOW_EPOCH - FIRST_EPOCH) / 86400 ))

                  if [ "$AGE_DAYS" -gt "$BLOCK_THRESHOLD" ]; then
                    echo "::error file=$file::Flag '$key' is $AGE_DAYS days old (threshold: $BLOCK_THRESHOLD days). This flag must be cleaned up before adding new references."
                    ERRORS=$((ERRORS + 1))
                  elif [ "$AGE_DAYS" -gt "$ERROR_THRESHOLD" ]; then
                    echo "::error file=$file::Flag '$key' is $AGE_DAYS days old (threshold: $ERROR_THRESHOLD days). Schedule cleanup immediately."
                    ERRORS=$((ERRORS + 1))
                  elif [ "$AGE_DAYS" -gt "$WARN_THRESHOLD" ]; then
                    echo "::warning file=$file::Flag '$key' is $AGE_DAYS days old (threshold: $WARN_THRESHOLD days). Consider scheduling cleanup."
                    WARNINGS=$((WARNINGS + 1))
                  fi
                fi
              done
            fi
          done

          echo "Flag age check complete: $WARNINGS warnings, $ERRORS errors"

          # Optionally fail the check on errors
          if [ "$ERRORS" -gt 0 ]; then
            echo "::error::$ERRORS flag(s) exceed the maximum age threshold. Clean up stale flags before modifying them."
            exit 1
          fi

Recommended age thresholds:

Flag Age	CI Action	Rationale
0-30 days	No action	Normal development lifecycle
30-60 days	Info annotation	Gentle reminder to plan cleanup
60-90 days	Warning	Flag likely stale, should be scheduled
90-120 days	Error (non-blocking)	Flag is overdue for cleanup
120-180 days	Error (blocking on new references)	No new code should touch this flag
180+ days	Error (blocking all changes)	Flag must be removed before any related changes

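The key principle is progressive enforcement. Early warnings are informational. Later stages become blocking. This gives teams time to schedule cleanup without creating a sudden enforcement cliff that breaks existing workflows. Note that the example workflow above is a simplified version of this table: it uses three thresholds (60, 90, and 180 days) and fails the check whenever any error is recorded, so reproducing the non-blocking 90-120 day tier requires counting blocking and non-blocking errors separately.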
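As a rough illustration, the recommended tiers can be expressed as a small lookup. This is a sketch, not part of any workflow shown here: the function name and action labels are made up for the example, and it assumes each range's upper bound is exclusive (a flag that is exactly 30 days old lands in the 30-60 tier).

```python
def ci_action_for_age(age_days: int) -> str:
    """Map a flag's age in days to the CI action from the recommended thresholds."""
    # (exclusive upper bound in days, action) pairs, in ascending order.
    tiers = [
        (30, "none"),
        (60, "info"),
        (90, "warning"),
        (120, "error"),                  # reported, but does not block the merge
        (180, "block-new-references"),
    ]
    for upper_bound, action in tiers:
        if age_days < upper_bound:
            return action
    return "block-all-changes"


for age in (10, 45, 75, 100, 150, 200):
    print(age, ci_action_for_age(age))
```

Keeping the tiers in one ordered list makes it easy to see that each stage is strictly harsher than the one before it, which is the point of progressive enforcement.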

Layer 3: Flag management platform integration

Connecting your CI pipeline to your flag management platform (LaunchDarkly, Split, Unleash, etc.) enables much richer checks. You can verify whether a flag is still active, what percentage it is rolled out to, and whether it has been archived.

GitHub Actions example -- LaunchDarkly integration:

name: Flag Platform Sync
on:
  pull_request:
    types: [opened, synchronize]
  schedule:
    - cron: '0 9 * * 1'  # Weekly Monday check

jobs:
  sync-flag-status:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Check flag status against LaunchDarkly
        env:
          LD_API_KEY: ${{ secrets.LAUNCHDARKLY_API_KEY }}
          LD_PROJECT: ${{ vars.LD_PROJECT_KEY }}
        run: |
          # Fetch all flags from LaunchDarkly
          FLAGS_RESPONSE=$(curl -s -H "Authorization: $LD_API_KEY" \
            "https://app.launchdarkly.com/api/v2/flags/$LD_PROJECT?summary=true")

          # Parse flag statuses
          echo "$FLAGS_RESPONSE" | jq -r '.items[] | "\(.key) \(.archived) \(.environments.production.on)"' \
            > /tmp/ld_flags.txt

          STALE_IN_CODE=()

          # Find flag references in code that are archived in LaunchDarkly
          while IFS=' ' read -r flag_key archived prod_on; do
            if [ "$archived" = "true" ]; then
              # Flag is archived in LD but might still be in code.
              # -F matches the key literally (keys may contain "." or other regex characters).
              REFERENCES=$(grep -rlF --include="*.go" --include="*.ts" \
                --include="*.py" --include="*.java" -e "$flag_key" . || true)
              if [ -n "$REFERENCES" ]; then
                STALE_IN_CODE+=("$flag_key (archived in LD, still in: $REFERENCES)")
              fi
            fi
          done < /tmp/ld_flags.txt

          if [ ${#STALE_IN_CODE[@]} -gt 0 ]; then
            echo "## Stale Flags Found" >> $GITHUB_STEP_SUMMARY
            echo "The following flags are archived in LaunchDarkly but still referenced in code:" >> $GITHUB_STEP_SUMMARY
            echo "" >> $GITHUB_STEP_SUMMARY
            for flag in "${STALE_IN_CODE[@]}"; do
              echo "- $flag" >> $GITHUB_STEP_SUMMARY
            done
          fi

Platform integration checks to implement:

Check	What It Catches	Severity
Archived flag in code	Flag removed from platform but code still branches on it	Error
100% rollout for 30+ days	Flag fully enabled but not cleaned up	Warning
Code reference with no platform flag	Typo in flag key or flag deleted from platform	Error
Platform flag with no code references	Flag exists in platform but is not used anywhere	Info
Flag targeting "everyone"	Flag that should be hardcoded	Warning
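The three checks in this table that only need flag keys and archive status can be sketched as a simple cross-reference. The `PlatformFlag` type, the function name, and the message strings below are hypothetical; the rollout-based checks are left out because they need percentage and timing data from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformFlag:
    """Minimal, hypothetical view of a flag as reported by the management platform."""
    key: str
    archived: bool


def cross_reference(
    platform_flags: list[PlatformFlag], code_keys: set[str]
) -> list[tuple[str, str, str]]:
    """Return (severity, flag_key, message) findings for platform/code mismatches."""
    findings = []
    known_keys = {flag.key for flag in platform_flags}
    for flag in platform_flags:
        if flag.archived and flag.key in code_keys:
            findings.append(("error", flag.key, "archived in platform, still in code"))
        elif not flag.archived and flag.key not in code_keys:
            findings.append(("info", flag.key, "exists in platform, no code references"))
    # Keys the code evaluates that the platform has never heard of.
    for key in sorted(code_keys - known_keys):
        findings.append(("error", key, "referenced in code, unknown to platform"))
    return findings


platform = [
    PlatformFlag("new-checkout", archived=True),
    PlatformFlag("dark-mode", archived=False),
    PlatformFlag("beta-search", archived=False),
]
in_code = {"new-checkout", "dark-mode", "new-chekout"}  # last key is a typo
for finding in cross_reference(platform, in_code):
    print(finding)
```

In this sample, `new-checkout` is reported as archived but still referenced, `beta-search` as unused, and the misspelled `new-chekout` as unknown to the platform.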

Layer 4: Automated cleanup PR generation

The most impactful CI/CD integration is automated generation of cleanup pull requests. When a flag has been fully rolled out for a defined period, the pipeline automatically creates a PR that removes the flag from code, replacing conditional logic with the winning code path.

The automated cleanup workflow:

Flag reaches 100% rollout
         ↓
Grace period elapses (e.g., 14 days at 100%)
         ↓
CI detects flag is a cleanup candidate
         ↓
Automated PR is generated:
  ├── Removes flag evaluation calls
  ├── Removes dead code branches (the "off" path)
  ├── Removes flag imports if no longer needed
  └── Adds PR description with context and verification steps
         ↓
PR is assigned to flag owner for review
         ↓
Standard review and merge process
         ↓
Follow-up check confirms all references removed

GitHub Actions example -- scheduled cleanup PR generation:

name: Flag Cleanup Generator
on:
  schedule:
    - cron: '0 10 * * 1'  # Every Monday at 10 AM
  workflow_dispatch:        # Allow manual trigger

jobs:
  generate-cleanup-prs:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4

      - name: Identify cleanup candidates
        id: candidates
        env:
          LD_API_KEY: ${{ secrets.LAUNCHDARKLY_API_KEY }}
          LD_PROJECT: ${{ vars.LD_PROJECT_KEY }}
        run: |
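          # Approximate "100% ON for 14+ days": targeting is on in production and
          # the environment config has not changed for 14 days (1209600 seconds).
          # A stricter check would also inspect the flag's rollout rules.
          FLAGS=$(curl -s -H "Authorization: $LD_API_KEY" \
            "https://app.launchdarkly.com/api/v2/flags/$LD_PROJECT")

          # LaunchDarkly reports lastModified as Unix epoch milliseconds
          CLEANUP_CANDIDATES=$(echo "$FLAGS" | jq -r '
            .items[]
            | select(.environments.production.on == true)
            | select((.environments.production.lastModified / 1000) < (now - 1209600))
            | .key
          ')

          # Step outputs written this way must be a single line, so join the keys with spaces
          echo "candidates=$(printf '%s' "$CLEANUP_CANDIDATES" | tr '\n' ' ')" >> "$GITHUB_OUTPUT"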

      - name: Generate removal PRs
        if: steps.candidates.outputs.candidates != ''
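        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # Pass the keys through the environment instead of interpolating them into the script
          CANDIDATES: ${{ steps.candidates.outputs.candidates }}
        run: |
          # Commits made by the workflow need an author identity
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

          for flag_key in $CANDIDATES; do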
            BRANCH="cleanup/remove-flag-${flag_key}"

            # Check if cleanup PR already exists
            EXISTING=$(gh pr list --search "head:$BRANCH" --json number --jq length)
            if [ "$EXISTING" -gt 0 ]; then
              echo "Cleanup PR already exists for $flag_key, skipping"
              continue
            fi

            # Create cleanup branch
            git checkout -b "$BRANCH"

            # Remove flag references (simplified -- production would use AST parsing)
            grep -rl "$flag_key" --include="*.go" --include="*.ts" --include="*.py" . \
              | while IFS= read -r file; do
                echo "Would clean $flag_key from $file"
                # AST-based removal would happen here
              done

            # Create PR if there are changes
            if [ -n "$(git status --porcelain)" ]; then
              git add -A
              git commit -m "chore: remove stale flag '$flag_key'"
              git push origin "$BRANCH"

              gh pr create \
                --title "chore: remove stale flag '$flag_key'" \
                --body "## Automated Flag Cleanup

          This PR removes the feature flag \`$flag_key\` which has been 100% enabled for 14+ days.

          ### Verification
          - [ ] Flag is confirmed 100% ON in all environments
          - [ ] No active experiments depend on this flag
          - [ ] Tests pass with flag code removed
          - [ ] No other flags depend on this flag's value

          *Generated automatically by the Flag Cleanup pipeline.*" \
                --label "flag-cleanup,automated"
            fi

            git checkout main
          done

This example demonstrates the pattern, but production-grade flag removal requires AST-based code transformation to safely remove conditional branches. Regex-based removal is fragile and dangerous for anything beyond trivial flag usage. Tools like FlagShark handle this complexity by using tree-sitter parsing to understand the syntax tree and generate safe, accurate removal PRs across multiple programming languages.
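To show what a syntax-tree transformation looks like in the simplest case, here is a sketch using Python's built-in `ast` module rather than tree-sitter. It is not how FlagShark or any other tool is implemented. It assumes flags are checked with a hypothetical `is_enabled("key")` call used directly as an `if` condition, and it handles only that shape: negated checks, flags stored in variables, and non-boolean variations are left untouched.

```python
import ast


class RemoveFlag(ast.NodeTransformer):
    """Replace `if <client>.is_enabled("<flag_key>"): A else: B` with A."""

    def __init__(self, flag_key: str, method: str = "is_enabled"):
        self.flag_key = flag_key
        self.method = method  # hypothetical SDK method name

    def _is_flag_check(self, test: ast.expr) -> bool:
        if not isinstance(test, ast.Call) or not test.args:
            return False
        func = test.func
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        first = test.args[0]
        return (
            name == self.method
            and isinstance(first, ast.Constant)
            and first.value == self.flag_key
        )

    def visit_If(self, node: ast.If):
        self.generic_visit(node)  # clean nested blocks first
        if self._is_flag_check(node.test):
            return node.body  # keep the "on" path, drop the "off" path
        return node


before = '''
if client.is_enabled("new-checkout"):
    total = new_total(cart)
else:
    total = old_total(cart)
if client.is_enabled("other-flag"):
    log("still gated")
'''
tree = RemoveFlag("new-checkout").visit(ast.parse(before))
after = ast.unparse(ast.fix_missing_locations(tree))
print(after)
```

The first conditional collapses to its "on" branch while the check on `other-flag` is left alone. Note that `ast.unparse` discards comments and original formatting, which is one reason production tools work on a concrete syntax tree instead.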

The flag cleanup gate pattern

The most effective CI/CD integration pattern is the "flag cleanup gate," a structured approach that combines all four layers into a unified policy enforcement system.

How the flag cleanup gate works

Developer opens PR
         ↓
┌─────────────────────────────────────────────┐
│            FLAG CLEANUP GATE                 │
│                                              │
│  1. DETECT: Scan for flag references         │
│     → New flags? Log creation event          │
│     → Removed flags? Verify complete removal │
│                                              │
│  2. AGE CHECK: Evaluate flag ages            │
│     → Under 60 days? Pass                    │
│     → 60-90 days? Warning annotation         │
│     → 90+ days? Require cleanup plan         │
│                                              │
│  3. PLATFORM SYNC: Check flag status         │
│     → Archived in platform? Block merge      │
│     → 100% enabled 30+ days? Warn            │
│     → Unknown flag key? Error                │
│                                              │
│  4. POLICY: Enforce team standards           │
│     → Max flags per service? Check           │
│     → Required flag documentation? Check     │
│     → Expiration date set? Check             │
│                                              │
│  RESULT: Pass / Warn / Block                 │
└─────────────────────────────────────────────┘
         ↓
Standard review and merge flow

Policy configuration example

Define your flag cleanup policies in a configuration file that lives in your repository:

# .flag-policy.yaml
flag_cleanup:
  detection:
    enabled: true
    languages: [go, typescript, python, java]
    scan_config_files: true

  age_thresholds:
    info: 30      # days
    warning: 60
    error: 90
    block: 180

  platform_integration:
    provider: launchdarkly
    project_key: my-project
    check_archived: true
    check_fully_rolled_out: true
    fully_rolled_out_grace_days: 14

  enforcement:
    block_new_references_to_stale_flags: true
    require_expiration_date: true
    max_flags_per_file: 5
    require_flag_documentation: false

  cleanup_prs:
    enabled: true
    schedule: weekly
    auto_assign_to: flag_owner
    labels: [flag-cleanup, automated]
    require_approval: true

Enforcement levels for different teams

Not every team is ready for full enforcement on day one. Adopt the flag cleanup gate progressively:

Level	Detection	Age Warnings	Platform Sync	Blocking	Cleanup PRs
Level 1: Observability	On	Info only	Off	None	Off
Level 2: Awareness	On	Warnings	On	None	Off
Level 3: Guidance	On	Warnings + Errors	On	Stale flags only	Manual trigger
Level 4: Enforcement	On	All thresholds	On	Full policy	Automated weekly
Level 5: Zero-debt	On	All thresholds	On	Full policy + max flag limits	Automated daily

Most teams should start at Level 2 and progress to Level 4 over one quarter. Level 5 is aspirational and appropriate for teams with mature flag management practices.

Measuring CI/CD flag integration effectiveness

Track these metrics to evaluate whether your CI/CD flag integration is working:

Leading indicators (process health)

Metric	Target	How to Measure
Flag detection coverage	100% of PRs scanned	CI job success rate
Age warning response rate	80%+ warnings addressed within 2 weeks	Time from warning to flag removal
Cleanup PR merge time	< 1 week from generation	PR open duration
False positive rate	< 5% of detections	Manual review of flagged items

Lagging indicators (outcomes)

The outcomes you should expect to see after integrating flag cleanup into CI/CD include:

Average flag age drops significantly. Flags that used to linger for months get cleaned up within weeks.
Stale flag percentage declines. The proportion of flags older than 90 days should decrease steadily.
Cleanup velocity increases. Automated detection and cleanup PR generation means more flags get removed per month with less manual effort.
Flag-related incidents decrease. Fewer stale flags means fewer unexpected interactions and dead code paths in production.
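The first two outcomes are straightforward to compute once you have an age in days for every flag still in the codebase. A minimal sketch, with a hypothetical function name and the 90-day staleness cutoff taken from the list above:

```python
def flag_age_metrics(ages_in_days: list[int], stale_after_days: int = 90) -> dict[str, float]:
    """Compute average flag age and the percentage of flags older than the cutoff."""
    if not ages_in_days:
        return {"average_age_days": 0.0, "stale_percent": 0.0}
    stale_count = sum(1 for age in ages_in_days if age > stale_after_days)
    return {
        "average_age_days": sum(ages_in_days) / len(ages_in_days),
        "stale_percent": 100.0 * stale_count / len(ages_in_days),
    }


print(flag_age_metrics([10, 50, 100, 200]))
# {'average_age_days': 90.0, 'stale_percent': 50.0}
```

Recording these two numbers after each weekly run gives you a trend line to compare against the period before the pipeline checks were introduced.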

In our experience, the key driver is not the blocking enforcement -- it is the visibility. When developers see age warnings on every PR, they internalize flag lifecycle management as a natural part of development, not a separate chore.

Common pitfalls and how to avoid them

Pitfall 1: Starting with blocking enforcement

Teams that immediately block merges for stale flags face developer revolt. Engineers with deadlines will work around blocking checks rather than pause to clean up flags they did not create. Start with visibility and warnings. Build the cultural expectation before adding enforcement.

Pitfall 2: Regex-based detection in production

Simple grep patterns catch obvious cases but miss dynamic flag evaluation, generate false positives on comments and documentation, and cannot distinguish a genuinely new or removed flag from an existing reference that was merely moved or reformatted. AST-based parsing is essential for accurate detection at scale.

Pitfall 3: Ignoring configuration files

Flags are not only referenced in application code. They appear in configuration files, environment variables, Terraform modules, Kubernetes manifests, and CI/CD pipeline definitions. Your scanning must cover these non-code locations.

Pitfall 4: No grace period for newly created flags

A flag created yesterday should not trigger a 90-day warning. Ensure your age checks account for flag creation date, not just the date the file was last modified. This is a common bug in naive implementations that generates noisy false positives.

Pitfall 5: Treating all flags the same

Kill switches, experiment flags, and release flags have different expected lifespans. Your policy should differentiate between flag types:

Flag Type	Expected Lifespan	Warning Threshold	Block Threshold
Release flag	2-4 weeks	60 days	120 days
Experiment flag	1-2 weeks	30 days	60 days
Ops/kill switch	Indefinite	Annual review	Never
Permission flag	Varies	90 days	180 days
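One way to encode this table is a per-type threshold lookup. The sketch below is illustrative only: the type names and function are hypothetical, "annual review" is interpreted as a warning at 365 days, and `None` stands for "never block".

```python
from typing import Optional

# flag type -> (warning threshold in days, block threshold in days or None)
THRESHOLDS: dict[str, tuple[int, Optional[int]]] = {
    "release": (60, 120),
    "experiment": (30, 60),
    "ops": (365, None),        # assumption: annual review means warn after a year
    "permission": (90, 180),
}


def check_flag(flag_type: str, age_days: int) -> str:
    """Return 'ok', 'warn', or 'block' for a flag of the given type and age."""
    warn_after, block_after = THRESHOLDS[flag_type]
    if block_after is not None and age_days >= block_after:
        return "block"
    if age_days >= warn_after:
        return "warn"
    return "ok"


print(check_flag("experiment", 45))  # warn
print(check_flag("release", 45))     # ok
print(check_flag("ops", 1000))       # warn
```

The same 45-day age produces a warning for an experiment flag but passes for a release flag, and a kill switch never escalates beyond a review reminder.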

Getting started: A 30-day adoption plan

Week 1: Detection only

Add a flag detection job to your CI pipeline (Layer 1)
Configure it to annotate PRs with flag changes -- no blocking
Run it for a week to calibrate detection accuracy
Fix false positives by tuning patterns or switching to AST-based detection

Week 2: Age tracking and warnings

Implement age-based warnings (Layer 2)
Start with generous thresholds (90-day warning, 180-day error)
Review the first week's warnings with the team
Adjust thresholds based on team feedback

Week 3: Platform integration

Connect your CI pipeline to your flag management platform (Layer 3)
Implement the "archived in platform, still in code" check
Run the weekly platform sync report
Review results with engineering leads

Week 4: Cleanup automation

Enable automated cleanup PR generation (Layer 4)
Start with manual trigger only (no scheduled generation)
Generate cleanup PRs for the 5 oldest stale flags
Refine the PR template and review process based on feedback

The CI/CD pipeline is the most under-utilized tool in the fight against feature flag debt. Every organization already has the infrastructure to enforce flag hygiene -- they just have not connected the pieces. By treating flag cleanup as a first-class CI/CD concern, you transform flag management from an afterthought into an automated, measurable, and enforceable part of your development workflow. The flags that used to haunt your codebase for years will now have a clear path from creation to cleanup, tracked and enforced at every step.

October 23, 2025·11 min read

Feature Flags in CI/CD: Automate Cleanup

How to integrate feature flag cleanup into your CI/CD pipeline with automated detection, age warnings, and PR-based removal workflows.

Feature Flags DevOps Automation

Why flags need CI/CD integration

The flag lifecycle mapped to CI/CD events

Flag Lifecycle Stage	CI/CD Event	Automated Action
Flag created	New flag reference detected in PR	Log flag creation, start age tracking
Flag aging	Time passes, flag remains in code	Warn at 30/60/90 days in PR checks
Flag fully rolled out	Flag at 100% for N days	Generate cleanup PR automatically
Flag being removed	Removal PR opened	Validate complete removal, run tests
Flag removed	Removal PR merged	Confirm no remaining references
Flag re-introduced	Old flag key reappears	Block merge, require justification

Building the flag-aware CI pipeline

Integrating flag cleanup into CI/CD involves four layers, each building on the previous one. Teams can adopt these incrementally, starting with detection and progressing to full enforcement.

Layer 1: Flag detection on every PR

The foundation is knowing when flags enter or leave your codebase. A PR check that scans for flag SDK method calls gives you visibility into every flag change.

GitHub Actions example -- flag detection check:

name: Flag Detection
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  detect-flags:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Detect flag changes in PR
        run: |
          # Get changed files
          CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD)

          # Patterns for common flag SDK methods
          FLAG_PATTERNS=(
            "variation\("
            "BoolVariation\("
            "StringVariation\("
            "isEnabled\("
            "is_enabled\("
            "useFeatureFlag\("
            "getFlag\("
          )

          NEW_FLAGS=()
          REMOVED_FLAGS=()

          for file in $CHANGED_FILES; do
            if [ -f "$file" ]; then
              for pattern in "${FLAG_PATTERNS[@]}"; do
                # Check for new flag references (added lines)
                ADDED=$(git diff origin/${{ github.base_ref }}...HEAD -- "$file" \
                  | grep "^+" | grep -E "$pattern" || true)
                if [ -n "$ADDED" ]; then
                  NEW_FLAGS+=("$file: $ADDED")
                fi

                # Check for removed flag references
                REMOVED=$(git diff origin/${{ github.base_ref }}...HEAD -- "$file" \
                  | grep "^-" | grep -E "$pattern" || true)
                if [ -n "$REMOVED" ]; then
                  REMOVED_FLAGS+=("$file: $REMOVED")
                fi
              done
            fi
          done

          # Report findings
          if [ ${#NEW_FLAGS[@]} -gt 0 ]; then
            echo "::notice::New flag references detected in this PR"
            printf '%s\n' "${NEW_FLAGS[@]}"
          fi

          if [ ${#REMOVED_FLAGS[@]} -gt 0 ]; then
            echo "::notice::Flag references removed in this PR"
            printf '%s\n' "${REMOVED_FLAGS[@]}"
          fi

What this gives you: Visibility. Every PR that adds or removes a flag reference is annotated, creating a searchable history of flag changes in your repository.

Layer 2: Age-based warnings and enforcement

Once you can detect flags, the next step is tracking their age and surfacing warnings when flags exceed their expected lifespan.

GitHub Actions example -- flag age warnings:

name: Flag Age Check
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  check-flag-age:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Check for stale flag references
        run: |
          # Define age thresholds (in days)
          WARN_THRESHOLD=60
          ERROR_THRESHOLD=90
          BLOCK_THRESHOLD=180

          WARNINGS=0
          ERRORS=0

          # Get all flag references in changed files
          CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD)

          for file in $CHANGED_FILES; do
            if [ -f "$file" ]; then
              # Find flag key strings in the file
              FLAG_KEYS=$(grep -oP '["'"'"']([a-z][a-z0-9_-]+\.?[a-z0-9_-]+)["'"'"']' "$file" \
                | sort -u || true)

              for key in $FLAG_KEYS; do
                # Check when this flag key first appeared in git history
                FIRST_COMMIT_DATE=$(git log --all --diff-filter=A \
                  -S "$key" --format="%ai" -- "$file" | tail -1)

                if [ -n "$FIRST_COMMIT_DATE" ]; then
                  FIRST_EPOCH=$(date -d "$FIRST_COMMIT_DATE" +%s 2>/dev/null || echo "0")
                  NOW_EPOCH=$(date +%s)
                  AGE_DAYS=$(( (NOW_EPOCH - FIRST_EPOCH) / 86400 ))

                  if [ "$AGE_DAYS" -gt "$BLOCK_THRESHOLD" ]; then
                    echo "::error file=$file::Flag '$key' is $AGE_DAYS days old (threshold: $BLOCK_THRESHOLD days). This flag must be cleaned up before adding new references."
                    ERRORS=$((ERRORS + 1))
                  elif [ "$AGE_DAYS" -gt "$ERROR_THRESHOLD" ]; then
                    echo "::error file=$file::Flag '$key' is $AGE_DAYS days old (threshold: $ERROR_THRESHOLD days). Schedule cleanup immediately."
                    ERRORS=$((ERRORS + 1))
                  elif [ "$AGE_DAYS" -gt "$WARN_THRESHOLD" ]; then
                    echo "::warning file=$file::Flag '$key' is $AGE_DAYS days old (threshold: $WARN_THRESHOLD days). Consider scheduling cleanup."
                    WARNINGS=$((WARNINGS + 1))
                  fi
                fi
              done
            fi
          done

          echo "Flag age check complete: $WARNINGS warnings, $ERRORS errors"

          # Optionally fail the check on errors
          if [ "$ERRORS" -gt 0 ]; then
            echo "::error::$ERRORS flag(s) exceed the maximum age threshold. Clean up stale flags before modifying them."
            exit 1
          fi

Recommended age thresholds:

Flag Age	CI Action	Rationale
0-30 days	No action	Normal development lifecycle
30-60 days	Info annotation	Gentle reminder to plan cleanup
60-90 days	Warning	Flag likely stale, should be scheduled
90-120 days	Error (non-blocking)	Flag is overdue for cleanup
120-180 days	Error (blocking on new references)	No new code should touch this flag
180+ days	Error (blocking all changes)	Flag must be removed before any related changes

Layer 3: Flag management platform integration

GitHub Actions example -- LaunchDarkly integration:

name: Flag Platform Sync
on:
  pull_request:
    types: [opened, synchronize]
  schedule:
    - cron: '0 9 * * 1'  # Weekly Monday check

jobs:
  sync-flag-status:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Check flag status against LaunchDarkly
        env:
          LD_API_KEY: ${{ secrets.LAUNCHDARKLY_API_KEY }}
          LD_PROJECT: ${{ vars.LD_PROJECT_KEY }}
        run: |
          # Fetch all flags from LaunchDarkly
          FLAGS_RESPONSE=$(curl -s -H "Authorization: $LD_API_KEY" \
            "https://app.launchdarkly.com/api/v2/flags/$LD_PROJECT?summary=true")

          # Parse flag statuses
          echo "$FLAGS_RESPONSE" | jq -r '.items[] | "\(.key) \(.archived) \(.environments.production.on)"' \
            > /tmp/ld_flags.txt

          STALE_IN_CODE=()

          # Find flag references in code that are archived in LaunchDarkly
          while IFS=' ' read -r flag_key archived prod_on; do
            if [ "$archived" = "true" ]; then
              # Flag is archived in LD but might still be in code
              # -F treats the key as a literal string, not a regex
              REFERENCES=$(grep -rlF --include="*.go" --include="*.ts" \
                --include="*.py" --include="*.java" -- "$flag_key" . || true)
              if [ -n "$REFERENCES" ]; then
                STALE_IN_CODE+=("$flag_key (archived in LD, still in: $REFERENCES)")
              fi
            fi
          done < /tmp/ld_flags.txt

          if [ ${#STALE_IN_CODE[@]} -gt 0 ]; then
            echo "## Stale Flags Found" >> $GITHUB_STEP_SUMMARY
            echo "The following flags are archived in LaunchDarkly but still referenced in code:" >> $GITHUB_STEP_SUMMARY
            echo "" >> $GITHUB_STEP_SUMMARY
            for flag in "${STALE_IN_CODE[@]}"; do
              echo "- $flag" >> $GITHUB_STEP_SUMMARY
            done
          fi

Platform integration checks to implement:

Check	What It Catches	Severity
Archived flag in code	Flag removed from platform but code still branches on it	Error
100% rollout for 30+ days	Flag fully enabled but not cleaned up	Warning
Code reference with no platform flag	Typo in flag key or flag deleted from platform	Error
Platform flag with no code references	Flag exists in platform but is not used anywhere	Info
Flag targeting "everyone"	Flag that should be hardcoded	Warning
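
Two of these checks ("code reference with no platform flag" and "platform flag with no code references") reduce to a set difference between two lists of flag keys. The sketch below assumes you have already written both lists to files, one key per line; the function names and file layout are hypothetical, and how you extract keys from code is left to your detection layer.

```shell
# Keys referenced in code but unknown to the platform (typo or deleted flag).
# $1: file of keys found in code, $2: file of keys known to the platform.
keys_only_in_code() {
  # -F literal strings, -x whole-line match, -v invert, -f read patterns from file
  grep -Fxv -f "$2" "$1" | sort -u
}

# Keys defined in the platform but never referenced in code.
keys_only_in_platform() {
  grep -Fxv -f "$1" "$2" | sort -u
}
```

The first list maps to the "Error" row in the table and the second to the "Info" row, so a CI step would typically fail on any output from `keys_only_in_code` and only report the output of `keys_only_in_platform`.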

Layer 4: Automated cleanup PR generation

The automated cleanup workflow:

Flag reaches 100% rollout
         ↓
Grace period elapses (e.g., 14 days at 100%)
         ↓
CI detects flag is a cleanup candidate
         ↓
Automated PR is generated:
  ├── Removes flag evaluation calls
  ├── Removes dead code branches (the "off" path)
  ├── Removes flag imports if no longer needed
  └── Adds PR description with context and verification steps
         ↓
PR is assigned to flag owner for review
         ↓
Standard review and merge process
         ↓
Follow-up check confirms all references removed

GitHub Actions example -- scheduled cleanup PR generation:

name: Flag Cleanup Generator
on:
  schedule:
    - cron: '0 10 * * 1'  # Every Monday at 10 AM
  workflow_dispatch:        # Allow manual trigger

jobs:
  generate-cleanup-prs:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4

      - name: Identify cleanup candidates
        id: candidates
        env:
          LD_API_KEY: ${{ secrets.LAUNCHDARKLY_API_KEY }}
          LD_PROJECT: ${{ vars.LD_PROJECT_KEY }}
        run: |
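          # Fetch flags with their production environment data.
          # "100% ON for 14+ days" is approximated here as: targeting is on and
          # the production configuration has not changed for 14+ days.
          FLAGS=$(curl -sf -H "Authorization: $LD_API_KEY" \
            "https://app.launchdarkly.com/api/v2/flags/$LD_PROJECT?env=production")

          # lastModified is a Unix timestamp in milliseconds; 1209600 s = 14 days.
          # Keys are joined with spaces so the step output stays on one line.
          CLEANUP_CANDIDATES=$(echo "$FLAGS" | jq -r '
            [ .items[]
              | select(.environments.production.on == true)
              | select((.environments.production.lastModified / 1000) < (now - 1209600))
              | .key
            ] | join(" ")
          ')

          echo "candidates=$CLEANUP_CANDIDATES" >> "$GITHUB_OUTPUT"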

      - name: Generate removal PRs
        if: steps.candidates.outputs.candidates != ''
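        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # Pass candidates through the environment rather than interpolating
          # them into the script, which avoids shell injection via flag keys.
          CANDIDATES: ${{ steps.candidates.outputs.candidates }}
        run: |
          # Set a committer identity so git commit works on the runner
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

          for flag_key in $CANDIDATES; do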
            BRANCH="cleanup/remove-flag-${flag_key}"

            # Check if cleanup PR already exists
            EXISTING=$(gh pr list --search "head:$BRANCH" --json number --jq length)
            if [ "$EXISTING" -gt 0 ]; then
              echo "Cleanup PR already exists for $flag_key, skipping"
              continue
            fi

            # Create cleanup branch
            git checkout -b "$BRANCH"

            # Remove flag references (simplified -- production would use AST parsing)
            grep -rlF --include="*.go" --include="*.ts" --include="*.py" -- "$flag_key" . \
              | while IFS= read -r file; do
                echo "Would clean $flag_key from $file"
                # AST-based removal would happen here
              done

            # Create PR if there are changes
            if [ -n "$(git status --porcelain)" ]; then
              git add -A
              git commit -m "chore: remove stale flag '$flag_key'"
              git push origin "$BRANCH"

              gh pr create \
                --title "chore: remove stale flag '$flag_key'" \
                --body "## Automated Flag Cleanup

          This PR removes the feature flag \`$flag_key\` which has been 100% enabled for 14+ days.

          ### Verification
          - [ ] Flag is confirmed 100% ON in all environments
          - [ ] No active experiments depend on this flag
          - [ ] Tests pass with flag code removed
          - [ ] No other flags depend on this flag's value

          *Generated automatically by the Flag Cleanup pipeline.*" \
                --label "flag-cleanup,automated"
            fi

            git checkout main
          done
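
The final step of the cleanup workflow, the follow-up check that confirms all references are gone, is not covered by the workflow above. A minimal sketch of that check is shown here, assuming a literal text search is good enough for your codebase; the function name is hypothetical and the file extensions mirror the ones used in the earlier examples.

```shell
# Follow-up check: succeed only if no source file still mentions the flag key.
# $1: flag key, $2: directory to search (defaults to the current directory).
flag_fully_removed() {
  local flag_key="$1" search_dir="${2:-.}" remaining
  remaining=$(grep -rlF --include='*.go' --include='*.ts' \
    --include='*.py' --include='*.java' -- "$flag_key" "$search_dir" || true)
  if [ -n "$remaining" ]; then
    echo "Flag '$flag_key' is still referenced in:"
    echo "$remaining"
    return 1
  fi
  echo "Flag '$flag_key' has no remaining references"
}
```

Running this against the default branch after a cleanup PR merges gives a simple pass/fail signal; a non-zero exit status fails the CI step.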

The flag cleanup gate pattern

The most effective CI/CD integration pattern is the "flag cleanup gate," a structured approach that combines all four layers into a unified policy enforcement system.

How the flag cleanup gate works

Developer opens PR
         ↓
┌─────────────────────────────────────────────┐
│            FLAG CLEANUP GATE                 │
│                                              │
│  1. DETECT: Scan for flag references         │
│     → New flags? Log creation event          │
│     → Removed flags? Verify complete removal │
│                                              │
│  2. AGE CHECK: Evaluate flag ages            │
│     → Under 60 days? Pass                    │
│     → 60-90 days? Warning annotation         │
│     → 90+ days? Require cleanup plan         │
│                                              │
│  3. PLATFORM SYNC: Check flag status         │
│     → Archived in platform? Block merge      │
│     → 100% enabled 30+ days? Warn            │
│     → Unknown flag key? Error                │
│                                              │
│  4. POLICY: Enforce team standards           │
│     → Max flags per service? Check           │
│     → Required flag documentation? Check     │
│     → Expiration date set? Check             │
│                                              │
│  RESULT: Pass / Warn / Block                 │
└─────────────────────────────────────────────┘
         ↓
Standard review and merge flow
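
The gate's final RESULT line implies that the four stages each produce an outcome and the most severe one wins. One way to express that aggregation is sketched below; the function name and the `pass`/`warn`/`block` labels are assumptions chosen to match the diagram, not an existing tool's interface.

```shell
# Combine individual check outcomes into a single gate result.
# Each argument is "pass", "warn", or "block"; the most severe outcome wins.
gate_result() {
  local result="pass" outcome
  for outcome in "$@"; do
    case "$outcome" in
      block) result="block" ;;
      warn)  [ "$result" = "block" ] || result="warn" ;;
      pass)  ;;
      *)     echo "unknown outcome: $outcome" >&2; return 2 ;;
    esac
  done
  echo "$result"
}
```

For example, `gate_result pass warn pass pass` prints `warn`, and any single `block` among the arguments makes the overall result `block`.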

Policy configuration example

Define your flag cleanup policies in a configuration file that lives in your repository:

# .flag-policy.yaml
flag_cleanup:
  detection:
    enabled: true
    languages: [go, typescript, python, java]
    scan_config_files: true

  age_thresholds:
    info: 30      # days
    warning: 60
    error: 90
    block: 180

  platform_integration:
    provider: launchdarkly
    project_key: my-project
    check_archived: true
    check_fully_rolled_out: true
    fully_rolled_out_grace_days: 14

  enforcement:
    block_new_references_to_stale_flags: true
    require_expiration_date: true
    max_flags_per_file: 5
    require_flag_documentation: false

  cleanup_prs:
    enabled: true
    schedule: weekly
    auto_assign_to: flag_owner
    labels: [flag-cleanup, automated]
    require_approval: true

Enforcement levels for different teams

Not every team is ready for full enforcement on day one. Adopt the flag cleanup gate progressively:

Level	Detection	Age Warnings	Platform Sync	Blocking	Cleanup PRs
Level 1: Observability	On	Info only	Off	None	Off
Level 2: Awareness	On	Warnings	On	None	Off
Level 3: Guidance	On	Warnings + Errors	On	Stale flags only	Manual trigger
Level 4: Enforcement	On	All thresholds	On	Full policy	Automated weekly
Level 5: Zero-debt	On	All thresholds	On	Full policy + max flag limits	Automated daily

Most teams should start at Level 2 and progress to Level 4 over one quarter. Level 5 is aspirational and appropriate for teams with mature flag management practices.

Measuring CI/CD flag integration effectiveness

Track these metrics to evaluate whether your CI/CD flag integration is working:

Leading indicators (process health)

Metric	Target	How to Measure
Flag detection coverage	100% of PRs scanned	CI job success rate
Age warning response rate	80%+ warnings addressed within 2 weeks	Time from warning to flag removal
Cleanup PR merge time	< 1 week from generation	PR open duration
False positive rate	< 5% of detections	Manual review of flagged items
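
As an illustration of how one of these metrics could be computed, the sketch below calculates the age warning response rate. It assumes you can export one line per warning containing either the number of days between the warning and the flag's removal, or the word `open` if the flag is still in the code; that input format and the function name are hypothetical.

```shell
# Print the percentage of age warnings addressed within 14 days.
# Input (stdin): one line per warning, either a day count or "open".
warning_response_rate() {
  awk '
    { total++ }
    $1 != "open" && $1 <= 14 { addressed++ }
    END {
      if (total == 0) { print "0"; exit }
      printf "%d\n", (addressed * 100) / total
    }
  '
}
```

With the input `3`, `10`, `20`, `open`, `7` (one per line), three of the five warnings were addressed within two weeks, so the function prints `60`, which is below the 80% target.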

Lagging indicators (outcomes)

The outcomes you should expect to see after integrating flag cleanup into CI/CD include:

Average flag age drops significantly. Flags that used to linger for months get cleaned up within weeks.
Stale flag percentage declines. The proportion of flags older than 90 days should decrease steadily.
Cleanup velocity increases. Automated detection and cleanup PR generation means more flags get removed per month with less manual effort.
Flag-related incidents decrease. Fewer stale flags means fewer unexpected interactions and dead code paths in production.

Common pitfalls and how to avoid them

Pitfall 1: Starting with blocking enforcement

Pitfall 2: Regex-based detection in production

Pitfall 3: Ignoring configuration files

Pitfall 4: No grace period for newly created flags

Pitfall 5: Treating all flags the same

Kill switches, experiment flags, and release flags have different expected lifespans. Your policy should differentiate between flag types:

Flag Type	Expected Lifespan	Warning Threshold	Block Threshold
Release flag	2-4 weeks	60 days	120 days
Experiment flag	1-2 weeks	30 days	60 days
Ops/kill switch	Indefinite	Annual review	Never
Permission flag	Varies	90 days	180 days
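
A sketch of how type-specific thresholds could feed the age check follows. It assumes each flag carries a type label from some source you control (a naming convention, a platform tag, or a policy file); the type names, function names, and the `none`/`warn`/`block` labels are hypothetical. The kill switch row is modeled as `never` because the table calls for an annual review rather than an age-based check.

```shell
# Print "<warning_days> <block_days>" for a flag type, following the table.
thresholds_for_type() {
  case "$1" in
    release)    echo "60 120" ;;
    experiment) echo "30 60" ;;
    ops)        echo "never never" ;;  # reviewed annually, not checked by age
    permission) echo "90 180" ;;
    *)          echo "unknown flag type: $1" >&2; return 1 ;;
  esac
}

# Decide the action for one flag. $1: flag type, $2: age in days.
action_for_flag() {
  local thresholds warn block
  thresholds=$(thresholds_for_type "$1") || return 1
  warn=${thresholds% *}    # text before the space
  block=${thresholds#* }   # text after the space
  if [ "$block" != "never" ] && [ "$2" -gt "$block" ]; then echo "block"
  elif [ "$warn" != "never" ] && [ "$2" -gt "$warn" ]; then echo "warn"
  else echo "none"
  fi
}
```

The same 45-day age yields `warn` for an experiment flag and `none` for a release flag, which is the differentiation this pitfall asks for.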

Getting started: A 30-day adoption plan

Week 1: Detection only

Add a flag detection job to your CI pipeline (Layer 1)
Configure it to annotate PRs with flag changes -- no blocking
Run it for a week to calibrate detection accuracy
Fix false positives by tuning patterns or switching to AST-based detection

Week 2: Age tracking and warnings

Implement age-based warnings (Layer 2)
Start with generous thresholds (90-day warning, 180-day error)
Review the first week's warnings with the team
Adjust thresholds based on team feedback

Week 3: Platform integration

Connect your CI pipeline to your flag management platform (Layer 3)
Implement the "archived in platform, still in code" check
Run the weekly platform sync report
Review results with engineering leads

Week 4: Cleanup automation

Enable automated cleanup PR generation (Layer 4)
Start with manual trigger only (no scheduled generation)
Generate cleanup PRs for the 5 oldest stale flags
Refine the PR template and review process based on feedback

