Your engineering team just agreed that stale feature flags are a problem. The codebase has 200+ flags, half of them controlling features that shipped months ago, and debugging sessions routinely turn into archaeology expeditions through nested conditionals. Everyone agrees: something needs to change.
So you start researching automated cleanup tools and quickly land on two names: Piranha, Uber's open-source refactoring engine that removed roughly 2,000 stale flags from its mobile apps, and FlagShark, a SaaS platform that monitors your repositories continuously and generates cleanup PRs automatically.
Both tools solve the same fundamental problem -- getting dead flag code out of your codebase -- but they approach it from radically different directions. Choosing the wrong one for your team's needs means either over-investing in infrastructure you don't need or under-investing in a solution that won't scale.
This comparison covers design philosophy, architecture, setup, language support, workflow integration, and maintenance cost so you can make an informed decision.
The core philosophy difference
Before diving into features, it helps to understand the fundamental design philosophy behind each tool. This shapes every downstream decision about architecture, setup, and maintenance.
Piranha was built inside Uber to solve Uber's specific problem: massive mobile codebases with thousands of flags managed through a known internal SDK. It's an open-source refactoring engine that you run against your code, configured with rules that describe your flag patterns. Think of it as a programmable code transformation tool specialized for flag removal.
FlagShark was built as an external service that integrates with your development workflow. It installs as a GitHub App, monitors pull requests in real time, and maintains a complete lifecycle record of every flag it detects. Think of it as a continuous monitoring and cleanup system that operates alongside your existing development process.
This philosophical difference -- batch refactoring tool vs. continuous monitoring service -- cascades into every aspect of how the tools work.
| Dimension | Piranha | FlagShark |
|---|---|---|
| Core model | Batch refactoring engine | Continuous monitoring service |
| Deployment | Self-hosted, runs locally or in CI | SaaS, GitHub App integration |
| Trigger | On-demand or scheduled runs | Every PR event, plus scheduled scans |
| Detection approach | Rule-based pattern matching + AST | Tree-sitter AST parsing, multi-provider |
| Configuration | Extensive rule files required | Zero-config; optional .flagshark.yaml |
| Origin | Uber engineering (open-source) | Purpose-built SaaS product |
Architecture and how each tool works
Piranha's architecture
Piranha operates as a standalone code transformation engine. The original version (now deprecated) supported Java, Swift, and Objective-C. The current version, Polyglot Piranha, was rewritten in Rust and supports a broader set of languages through a unified rule engine.
The workflow looks like this:
- You define rules that describe your flag SDK patterns (method names, argument positions, flag key locations)
- You specify the stale flags you want to remove (flag names and their resolved values)
- You run Piranha locally, in CI, or as a scheduled job
- Piranha parses your codebase using tree-sitter, matches the rules against your code, and generates transformations
- Piranha produces diffs or modified files with the flag code removed
    # Example Piranha rule configuration
    [[rules]]
    name = "replace_isTreated_with_true"
    # TOML literal string ('''): backslashes reach tree-sitter unchanged
    query = '''
    (
      (call_expression
        function: (member_expression
          object: (_) @obj
          property: (property_identifier) @prop)
        arguments: (arguments
          (string) @flag_name)) @call_expression
      (#eq? @prop "isTreated")
      (#eq? @flag_name "\"stale_flag_name\"")
    )
    '''
    # replace_node names the capture (@call_expression) to rewrite
    replace_node = "call_expression"
    replace = "true"
The power here is flexibility. Piranha's rule engine can handle virtually any code pattern because you're writing tree-sitter queries directly. If your flag SDK has an unusual API or your codebase uses flags in non-standard ways, you can write rules to match.
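To make the transformation concrete, here is a minimal sketch of the same idea (swap a stale flag check for its resolved value, then drop the dead branch) using Python's standard `ast` module rather than Piranha or tree-sitter. The `is_treated` method, the `checkout` function, and the flag name are hypothetical, and this is not how Piranha is implemented; it only illustrates why working on a syntax tree makes the cleanup safe to automate.

```python
import ast


class StaleFlagRemover(ast.NodeTransformer):
    """Replace <obj>.is_treated("<stale flag>") with its resolved value,
    then collapse `if` statements whose test became a constant."""

    def __init__(self, stale_flags):
        # stale_flags: flag name -> resolved boolean (hypothetical input)
        self.stale_flags = stale_flags

    def visit_Call(self, node):
        self.generic_visit(node)
        is_flag_check = (
            isinstance(node.func, ast.Attribute)
            and node.func.attr == "is_treated"
            and len(node.args) == 1
            and isinstance(node.args[0], ast.Constant)
            and node.args[0].value in self.stale_flags
        )
        if is_flag_check:
            resolved = self.stale_flags[node.args[0].value]
            return ast.copy_location(ast.Constant(value=resolved), node)
        return node

    def visit_If(self, node):
        self.generic_visit(node)
        if isinstance(node.test, ast.Constant) and isinstance(node.test.value, bool):
            # Keep only the branch that can still execute.
            kept = node.body if node.test.value else node.orelse
            return kept or [ast.Pass()]
        return node


def remove_stale_flags(source, stale_flags):
    tree = StaleFlagRemover(stale_flags).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


BEFORE = '''
def checkout(client, cart):
    if client.is_treated("stale_flag_name"):
        return new_checkout(cart)
    else:
        return legacy_checkout(cart)
'''

print(remove_stale_flags(BEFORE, {"stale_flag_name": True}))
```

With the flag resolved to `True`, only the `return new_checkout(cart)` branch survives. Note that `ast.unparse` discards comments and original formatting, which is one reason production tools rewrite the source in place instead of regenerating it.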
FlagShark's architecture
FlagShark operates as a GitHub App that integrates into your pull request workflow. The architecture is event-driven:
- Install the GitHub App on your repository (one-click)
- FlagShark monitors every PR for flag additions and removals
- Tree-sitter parsing detects flags across 11 languages using built-in provider configurations
- Lifecycle tracking records when flags are added, which PRs reference them, and when they become stale
- Automated cleanup PRs are generated when flags meet staleness criteria
- GitHub comments on PRs provide immediate feedback about flag changes
The detection engine ships with pre-configured support for major flag providers (LaunchDarkly, Unleash, Split, custom SDKs) and can be extended through a .flagshark.yaml configuration file. The key architectural difference is that FlagShark maintains state -- it knows the full history of every flag, when it was introduced, how it has been modified, and which files reference it.
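The notion of "maintaining state" can be pictured with a small sketch. The `FlagRecord` and `FlagRegistry` names, their fields, and the shape of the PR event are all assumptions made for illustration; FlagShark's actual data model is not described in this article.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Set


@dataclass
class FlagRecord:
    """Hypothetical lifecycle record for one flag."""
    key: str
    introduced_at: datetime
    introduced_in_pr: int
    files: Set[str] = field(default_factory=set)   # files still referencing the flag
    prs: List[int] = field(default_factory=list)   # PRs that touched the flag
    removed_at: Optional[datetime] = None


class FlagRegistry:
    def __init__(self) -> None:
        self.flags: Dict[str, FlagRecord] = {}

    def apply_pr(self, pr, merged_at, added, removed):
        """added / removed map a flag key to the files where it appeared or disappeared."""
        for key, files in added.items():
            record = self.flags.setdefault(key, FlagRecord(key, merged_at, pr))
            record.files |= set(files)
            record.removed_at = None  # a re-added flag is live again
            self._note_pr(record, pr)
        for key, files in removed.items():
            record = self.flags.get(key)
            if record is None:
                continue  # flag was never tracked
            record.files -= set(files)
            self._note_pr(record, pr)
            if not record.files:
                record.removed_at = merged_at

    @staticmethod
    def _note_pr(record, pr):
        if pr not in record.prs:
            record.prs.append(pr)


registry = FlagRegistry()
registry.apply_pr(101, datetime(2024, 1, 10), added={"new-checkout": {"cart.py"}}, removed={})
registry.apply_pr(140, datetime(2024, 3, 2), added={"new-checkout": {"api.py"}}, removed={})
registry.apply_pr(205, datetime(2024, 6, 1), added={}, removed={"new-checkout": {"cart.py", "api.py"}})
print(registry.flags["new-checkout"])
```

Because every PR event updates the record, questions such as "when was this flag introduced?" or "which files still reference it?" become lookups rather than repository searches.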
Setup complexity: Minutes vs. hours (or days)
This is where the tools diverge most sharply, and for many teams, this is the deciding factor.
Setting up Piranha
Piranha requires meaningful upfront investment before it delivers value:
- Install the Piranha CLI or integrate the Rust library
- Write rule files for every flag SDK pattern in your codebase
- Identify stale flags manually (Piranha doesn't detect staleness -- you tell it which flags to remove)
- Test the rules against your codebase to verify correct transformations
- Integrate into CI/CD or set up scheduled runs
- Maintain rules as your flag SDK usage evolves
For a team using LaunchDarkly's Go SDK, you would need rules for BoolVariation, StringVariation, IntVariation, Float64Variation, JSONVariation, and potentially wrapper functions your team has created. Multiply this across every language in your stack, and the rule configuration can become substantial.
Estimated setup time: 2-8 hours for a single language and SDK; 1-3 days for a polyglot codebase with multiple flag providers.
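To illustrate how the rule count grows in the LaunchDarkly Go SDK case, the sketch below expands one tree-sitter query template across the five variation methods named above. The template is an illustrative query written against the tree-sitter Go grammar, not a rule taken from Piranha's documentation, and the helper name `build_go_queries` is hypothetical.

```python
# Variation methods named in the text for LaunchDarkly's Go SDK.
GO_VARIATION_METHODS = [
    "BoolVariation",
    "StringVariation",
    "IntVariation",
    "Float64Variation",
    "JSONVariation",
]

# Illustrative query: a method call whose first argument is a given string literal.
# The "." anchors the match to the first argument, so a default value that
# happens to equal the flag key is not mistaken for the key itself.
GO_QUERY_TEMPLATE = r'''(
  (call_expression
    function: (selector_expression
      field: (field_identifier) @method)
    arguments: (argument_list
      . (interpreted_string_literal) @flag_key)) @call
  (#eq? @method "{method}")
  (#eq? @flag_key "\"{flag_key}\"")
)'''


def build_go_queries(flag_key, methods=GO_VARIATION_METHODS):
    """Return one query per SDK method for a single stale flag."""
    return {
        f"match_{method}": GO_QUERY_TEMPLATE.format(method=method, flag_key=flag_key)
        for method in methods
    }


queries = build_go_queries("stale_flag_name")
print(f"{len(queries)} queries for one language and one SDK")
print(queries["match_BoolVariation"])
```

Five queries cover only direct SDK calls with double-quoted keys in Go. Wrapper functions, backtick-quoted strings, and every additional language each need their own variants, which is where the setup estimate comes from.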
Setting up FlagShark
FlagShark's setup is designed to be minimal:
- Install the GitHub App on your repository
- That's it -- FlagShark begins monitoring PRs immediately
Optionally, you can add a .flagshark.yaml file to configure custom flag providers or adjust detection behavior, but out-of-the-box support covers the most common flag SDKs across all 11 supported languages.
Estimated setup time: Under 5 minutes.
| Setup Task | Piranha | FlagShark |
|---|---|---|
| Installation | CLI install or Rust library integration | One-click GitHub App |
| Configuration | Extensive rule files per language/SDK | Zero-config (optional YAML) |
| Stale flag identification | Manual (you provide the list) | Automatic (lifecycle tracking) |
| CI/CD integration | Manual pipeline configuration | Built-in (PR event-driven) |
| Maintenance | Ongoing rule updates | Managed service |
| Time to first value | Hours to days | Minutes |
Language support and detection accuracy
Both tools parse code with tree-sitter, which gives them a significant accuracy advantage over regex or other text-only matching. But the implementations differ in scope and depth.
Piranha's language support
Between the legacy implementation and Polyglot Piranha, the following languages are covered:
- Java
- Swift
- Objective-C (legacy version)
- Python
- TypeScript/JavaScript
- Kotlin
- Go
- Rust (partial)
The quality of support depends entirely on the rules you write. Piranha gives you the tree-sitter query language as a primitive, and you build detection logic on top. This means Piranha can theoretically support any language that tree-sitter supports, but you bear the cost of writing and maintaining the rules.
FlagShark's language support
FlagShark ships with built-in detection for 11 languages, organized into tiers based on the depth of testing and detection coverage:
Tier 1 (comprehensive detection and testing):
- Go
- Python
- TypeScript / JavaScript
- Rust
- C#
Tier 2 (standard detection):
- Java
- Kotlin
- C++
- PHP
- Ruby
- Swift
Each language detector is purpose-built with knowledge of common flag SDK patterns, meaning FlagShark understands not just the syntax but the semantics of flag usage. For example, it knows that the first argument to ldclient.BoolVariation() is a flag key, while the second is a context object and the third is a default value.
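A position-aware detector of this kind can be sketched in a few lines. The example below uses Python's standard `ast` module on Python source, with a small table recording which argument holds the flag key for each method. The table and the `find_flag_keys` helper are assumptions for illustration; FlagShark's own detectors are built on tree-sitter and are not shown in this article.

```python
import ast

# Method name -> index of the argument that holds the flag key (assumed table).
FLAG_KEY_POSITION = {
    "variation": 0,   # LaunchDarkly Python SDK: variation(key, context, default)
    "is_enabled": 0,  # Unleash Python client: is_enabled(feature_name, ...)
}


def find_flag_keys(source):
    """Return (flag_key, line_number) pairs for recognized SDK calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        position = FLAG_KEY_POSITION.get(node.func.attr)
        if position is None or len(node.args) <= position:
            continue
        arg = node.args[position]
        # Only literal string keys are reported; dynamic keys need other handling.
        if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
            found.append((arg.value, node.lineno))
    return sorted(found, key=lambda item: item[1])


SAMPLE = '''
show_banner = client.variation("new-banner", context, False)
label = client.variation("button-label", context, "new-banner")
log("variation")
'''

print(find_flag_keys(SAMPLE))
```

The second call uses `"new-banner"` as a default value, and the third merely mentions the word `variation` in a string. A text search would report both; the position-aware walk reports neither, which is the practical meaning of understanding the semantics rather than just the syntax.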
Detection accuracy comparison
Because both tools work on tree-sitter ASTs rather than raw text, the remaining accuracy differences come from how their detection rules are written and maintained:
| Accuracy Factor | Piranha | FlagShark |
|---|---|---|
| AST-based parsing | Yes (tree-sitter) | Yes (tree-sitter) |
| False positive rate | Depends on rule quality | Low (pre-tuned detectors) |
| Custom SDK support | Excellent (write any rule) | Good (via .flagshark.yaml) |
| Nested flag detection | Manual rule composition | Built-in |
| Cross-file tracking | Limited | Full lifecycle tracking |
Piranha's accuracy advantage: If you have highly unusual flag patterns or internal SDKs with non-standard APIs, Piranha's raw tree-sitter query access lets you match virtually anything. The accuracy ceiling is higher for edge cases, provided you invest the time in rule development.
FlagShark's accuracy advantage: For standard flag SDK usage (which covers the vast majority of real-world codebases), FlagShark's pre-built detectors have been refined across many repositories and flag providers. You get production-grade detection without writing a single rule.
Workflow integration and developer experience
How a cleanup tool fits into your team's daily workflow determines whether it actually gets used or sits forgotten in a CI config file.
Piranha's workflow
Piranha is fundamentally a batch tool. The typical workflow:
- Someone identifies that flags need cleanup (manual process)
- They compile a list of stale flags and their resolved values
- They run Piranha against the codebase
- They review the generated diffs
- They create a PR with the changes
- The team reviews and merges
This can be automated to some degree by scheduling Piranha runs in CI, but the identification of which flags are stale remains a separate problem that Piranha doesn't solve. You need another system (your flag management platform, a spreadsheet, manual audits) to determine which flags should be removed.
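As a rough sketch of what that separate system might do, the function below turns an export from a flag management platform into the "flag name plus resolved value" list a batch tool needs. The export format (`key`, `rollout_percent`, `last_changed`) and the 30-day threshold are hypothetical; real platforms expose this information differently.

```python
from datetime import datetime, timedelta


def select_stale_flags(export, now, min_age_days=30):
    """Return {flag_key: resolved_value} for flags that look safe to remove.

    A flag counts as stale here when it is fully on (100%) or fully off (0%)
    and has not changed for at least `min_age_days`.
    """
    cutoff = now - timedelta(days=min_age_days)
    stale = {}
    for flag in export:
        fully_resolved = flag["rollout_percent"] in (0, 100)
        if fully_resolved and flag["last_changed"] <= cutoff:
            stale[flag["key"]] = flag["rollout_percent"] == 100
    return stale


# Hypothetical export from a flag management platform.
EXPORT = [
    {"key": "new-checkout", "rollout_percent": 100, "last_changed": datetime(2024, 1, 5)},
    {"key": "legacy-search", "rollout_percent": 0, "last_changed": datetime(2024, 2, 1)},
    {"key": "dark-mode", "rollout_percent": 50, "last_changed": datetime(2024, 1, 1)},
    {"key": "fresh-rollout", "rollout_percent": 100, "last_changed": datetime(2024, 5, 28)},
]

stale = select_stale_flags(EXPORT, now=datetime(2024, 6, 1))
print(stale)
```

Here `new-checkout` resolves to `True` and `legacy-search` to `False`, while the partially rolled-out and recently changed flags are left alone. The resulting mapping is the input the batch run needs.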
FlagShark's workflow
FlagShark integrates into the development workflow at the PR level:
- A developer opens a PR that adds or modifies flags
- FlagShark comments on the PR identifying the flag changes
- The flag lifecycle begins tracking automatically
- When flags become stale (based on configurable criteria), FlagShark generates cleanup PRs
- The team reviews and merges the cleanup PR like any other code change
The critical difference is continuous awareness. Developers see flag information on every PR, building a culture of flag hygiene without requiring dedicated cleanup sprints or manual audits.
Developer experience comparison
| Workflow Aspect | Piranha | FlagShark |
|---|---|---|
| Flag identification | Manual | Automatic |
| Cleanup trigger | On-demand / scheduled | Continuous monitoring |
| PR integration | Manual PR creation | Automatic PR generation |
| Developer visibility | During cleanup runs only | Every PR, real-time |
| Team awareness | Requires process discipline | Built into workflow |
| Feedback loop | Delayed (batch) | Immediate (per PR) |
Maintenance burden and total cost of ownership
The upfront setup cost is only part of the picture. The ongoing maintenance burden is what determines the real total cost of ownership.
Maintaining Piranha
Piranha is open-source software that you host and maintain:
- Rule maintenance: As your team adopts new flag SDKs, changes wrapper functions, or introduces new languages, rules must be updated
- Infrastructure: CI/CD pipeline configuration, compute resources for running Piranha, artifact storage
- Upgrades: Tracking Piranha releases, testing compatibility with your rules, managing breaking changes
- Knowledge: Someone on the team must understand tree-sitter queries and Piranha's rule engine
- Flag identification: A separate process must exist to determine which flags are stale
For large organizations with dedicated platform teams, this maintenance is manageable. For smaller teams, it can become a meaningful drag on engineering bandwidth -- ironically adding to the very technical debt the tool is supposed to reduce.
Maintaining FlagShark
As a managed SaaS product, FlagShark's maintenance burden falls primarily on the service provider:
- Updates: Automatic, no action required
- Infrastructure: Fully managed
- New language support: Shipped as updates
- Provider support: New flag SDK patterns added continuously
- Flag identification: Built-in lifecycle tracking
The team's maintenance responsibility is limited to occasionally updating .flagshark.yaml if they add custom flag providers.
Total cost of ownership
| Cost Factor | Piranha | FlagShark |
|---|---|---|
| Software license | Free (open-source) | Subscription |
| Infrastructure | Self-hosted compute | Included |
| Setup labor | 2 hours to 3 days | Under 5 minutes |
| Ongoing maintenance | 2-4 hours/month | Near zero |
| Rule updates | As needed (manual) | Automatic |
| Stale flag identification | Separate tool/process needed | Included |
| Specialization required | Tree-sitter query expertise | None |
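The labor figures can be turned into a rough first-year estimate. The sketch below is back-of-the-envelope arithmetic only: the 8-hour setup is an assumed value that sits inside the setup estimates quoted in this article, the 2-4 hours/month comes from the table, and the hourly cost is entirely hypothetical.

```python
def first_year_hours(setup_hours, monthly_hours, months=12):
    """Total engineering hours in the first year: setup plus ongoing maintenance."""
    return setup_hours + monthly_hours * months


def break_even_monthly_price(hours, hourly_cost, months=12):
    """Monthly subscription price at which self-hosting labor costs the same."""
    return hours * hourly_cost / months


SETUP_HOURS = 8      # assumption: a single-language setup
HOURLY_COST = 100.0  # hypothetical fully loaded engineering cost per hour

low = first_year_hours(SETUP_HOURS, 2)   # 8 + 2 * 12 = 32
high = first_year_hours(SETUP_HOURS, 4)  # 8 + 4 * 12 = 56

print(f"First-year effort: {low}-{high} hours")
print(f"Break-even subscription: {break_even_monthly_price(low, HOURLY_COST):.2f} "
      f"to {break_even_monthly_price(high, HOURLY_COST):.2f} per month")
```

Substituting your own setup estimate and hourly cost gives a quick sense of where a subscription becomes cheaper than self-hosting. The estimate ignores infrastructure and the separate stale-flag identification process, both of which add to the self-hosted side.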
When to choose Piranha
Piranha is the stronger choice in specific scenarios. Consider it if:
- You have a dedicated platform/tooling team that can own the rule configuration and maintenance
- Your flag patterns are highly custom with internal SDKs that don't match standard provider patterns
- You need maximum control over the transformation logic and want to write custom tree-sitter queries
- You're already invested in a batch cleanup workflow and want to automate the code transformation step
- Budget constraints make SaaS tools infeasible and you have engineering bandwidth to invest instead
- You operate at Uber-scale with thousands of flags and need the flexibility of a programmable refactoring engine
Piranha has been battle-tested at Uber, where it successfully removed approximately 2,000 stale flags from their mobile applications. That track record at scale is meaningful. The tool is also backed by active research (published in IEEE/ACM conferences) and continues to evolve.
The key Piranha strength: unmatched flexibility for teams willing to invest in configuration. If your cleanup needs are complex and non-standard, Piranha gives you the primitives to handle them.
When to choose FlagShark
FlagShark is the stronger choice in a different set of scenarios. Consider it if:
- You want immediate value without investing days in setup and configuration
- Your team uses standard flag providers (LaunchDarkly, Unleash, Split, or similar)
- You need continuous monitoring, not just periodic cleanup runs
- Lifecycle tracking matters -- you want to know when every flag was introduced, by whom, and in which PR
- You lack tree-sitter expertise and don't want to learn a query language to remove stale flags
- Your team is small to mid-size and can't dedicate engineering bandwidth to tooling maintenance
- You use GitHub as your primary development platform and want native integration
The key FlagShark strength: zero-config continuous monitoring that integrates into the workflow developers already use. The tool works from day one without requiring anyone to become a tree-sitter expert.
The hybrid approach
Piranha and FlagShark aren't mutually exclusive. Some teams use both:
- FlagShark for continuous monitoring and lifecycle tracking -- catching flags as they're introduced, tracking their age, and generating cleanup PRs for standard patterns
- Piranha for complex one-off cleanups -- when a major refactoring initiative requires removing deeply embedded flags with custom transformation logic
This hybrid approach gives you the best of both worlds: day-to-day automation through FlagShark and the power of a programmable refactoring engine through Piranha when you need it.
Comparison summary
| Feature | Piranha | FlagShark |
|---|---|---|
| Type | Open-source tool | SaaS platform |
| Cost | Free (+ engineering time) | Subscription |
| Setup time | Hours to days | Minutes |
| Configuration | Extensive rules required | Zero-config |
| Detection | Tree-sitter + custom rules | Tree-sitter + built-in providers |
| Languages | 8 (rule-dependent) | 11 (built-in) |
| Stale flag identification | External process required | Built-in lifecycle tracking |
| Workflow integration | Batch / CI pipeline | GitHub App, per-PR |
| Cleanup PRs | Manual creation | Automatic generation |
| Monitoring | Point-in-time | Continuous |
| Maintenance | Self-managed | Fully managed |
| Best for | Large teams, custom patterns, maximum control | Small-to-mid teams, standard providers, fast setup |
| Proven scale | Uber (approximately 2,000 flags removed) | Multi-language, multi-provider |
Making the decision
The right tool depends on your team's specific constraints. Ask yourself these questions:
- Do you have dedicated tooling engineers? If yes, Piranha's flexibility may be worth the investment. If no, FlagShark's managed approach saves engineering bandwidth.
- How custom are your flag patterns? Standard SDK usage points toward FlagShark. Heavily customized internal abstractions point toward Piranha.
- Do you need continuous monitoring or periodic cleanup? If flags are accumulating faster than you can clean them, continuous monitoring (FlagShark) prevents the problem from compounding. If you have a known backlog to clear, Piranha's batch approach may be more efficient for the initial sweep.
- What's your tolerance for setup time? If the team needs results this week, FlagShark delivers. If you can invest a sprint in tooling setup, Piranha's long-term flexibility may pay off.
- What's your budget model? If you have more engineering time than budget, Piranha's open-source model works. If you have more budget than engineering time, FlagShark's subscription model trades money for time.
Neither tool is universally superior. The best choice is the one that fits your team's constraints, workflow, and scale. What matters most is that you choose something -- because the cost of doing nothing about stale flags compounds every day, and manual cleanup doesn't scale.
Stale feature flags are a universal problem, but the solutions don't have to be one-size-fits-all. Whether you choose the programmable power of Piranha, the zero-config simplicity of FlagShark, or a combination of both, the critical step is moving from manual flag management to automated cleanup. Your codebase -- and your team's productivity -- will thank you.