The first thing most engineers reach for when they need to find feature flags in a codebase is grep. It makes intuitive sense: flag SDK calls follow predictable patterns, so a regular expression should be able to find them. And for a quick one-off search, regex works well enough.
But "well enough" breaks down the moment you need accuracy at scale. When you are building automation that detects flags across hundreds of repositories, in 11 different programming languages, across every pull request your organization opens -- false positives and missed detections are not minor annoyances. They are system failures that erode trust in the automation itself.
This post examines why regex-based flag detection fails in real-world codebases, how tree-sitter's AST-based parsing solves those failures, and what the accuracy and performance differences look like with concrete examples.
The naive approach: Regex for flag detection
The simplest flag detection strategy is to search for known SDK method names using regular expressions. If your team uses LaunchDarkly's Go SDK, you might start with something like:
(BoolVariation|StringVariation|IntVariation|Float64Variation)\s*\(
This regex catches the common variation methods followed by an opening parenthesis. Run it across your codebase with grep -rn, and you get a list of every line that looks like a flag evaluation.
For a small codebase with a single language and a single flag provider, this approach can yield decent results. But as codebases grow in size, language diversity, and flag provider complexity, regex detection degrades in ways that are difficult to patch.
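To make the baseline concrete, here is a minimal sketch of that approach as a small Go program rather than a grep one-liner. The pattern is the one above, extended to also capture the flag key string; the file path is purely illustrative:

package main

import (
    "bufio"
    "fmt"
    "os"
    "regexp"
)

// Naive line-by-line detection: the same pattern a grep -rn search would use,
// extended to capture the flag key string literal.
var flagCall = regexp.MustCompile(`(BoolVariation|StringVariation|IntVariation|Float64Variation)\s*\(\s*"([^"]+)"`)

func main() {
    f, err := os.Open("checkout/handler.go") // illustrative path
    if err != nil {
        panic(err)
    }
    defer f.Close()

    scanner := bufio.NewScanner(f)
    lineNo := 0
    for scanner.Scan() {
        lineNo++
        // Any multiline call, comment, or wrapper function defeats this check.
        if m := flagCall.FindStringSubmatch(scanner.Text()); m != nil {
            fmt.Printf("line %d: method=%s flag=%s\n", lineNo, m[1], m[2])
        }
    }
}

Everything that follows is about the ways this line-oriented view of code falls apart.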
Where regex detection breaks
Let us walk through specific scenarios where regex produces incorrect results. These are not contrived edge cases -- they are patterns that appear routinely in production codebases.
Problem 1: Multiline method calls
Real code does not confine method calls to single lines. Developers format calls for readability, especially when arguments are long:
enabled, err := client.BoolVariation(
    "release-unified-checkout",
    userContext,
    false,
)
A regex matching BoolVariation\s*\("([^"]+)" will fail here because the method name and the opening parenthesis are on different lines from the flag key string. You could modify the regex to handle newlines with [\s\S]*?, but now you are matching across arbitrary amounts of whitespace and potentially capturing content from unrelated code.
# Python example with multiline and comments
result = ld_client.variation(
    # The checkout experiment flag
    "experiment-checkout-flow",
    user,
    default=False  # Default to old flow
)
Matching the flag key ("experiment-checkout-flow") requires the regex to skip past a comment line. The regex grows more complex, and each addition creates new opportunities for false matches.
Problem 2: Variable-assigned flag keys
Developers frequently assign flag keys to variables or constants:
const CHECKOUT_FLAG = "release-unified-checkout";
// ... 50 lines later ...
const isEnabled = client.boolVariation(CHECKOUT_FLAG, context, false);
No regex pattern matching string literals inside boolVariation() will detect this. The flag key is a variable reference, not a string literal at the call site. You could search for string assignments and then correlate them with method calls, but that requires multi-pass analysis with state tracking -- at which point you are building a rudimentary parser, not writing a regex.
Problem 3: String interpolation and concatenation
Some codebases construct flag keys dynamically:
flag_key = f"release-{feature_name}-{environment}"
is_enabled = ld_client.variation(flag_key, user, False)
const flagKey = `experiment-${experimentName}`;
const variant = client.stringVariation(flagKey, context, "control");
String flagKey = "release-" + featureName;
boolean enabled = client.boolVariation(flagKey, context, false);
Regex cannot resolve runtime string values. The flag key does not exist as a literal in the source code at the point of evaluation. This is a fundamental limitation, not a pattern-matching problem. AST-based approaches can at least identify the method call and report that a dynamic flag key is in use, even if the exact key cannot be statically determined.
Problem 4: Comments and strings that look like code
// We used to call client.BoolVariation("old-feature", ctx, false)
// but that was removed in the migration.
var description = "Call BoolVariation('my-flag', ctx, true) to check"
Regex cannot distinguish between a method call in executable code and the same text appearing in a comment or string literal. Every comment that mentions a flag SDK method becomes a false positive. In a mature codebase with extensive code comments, inline documentation, and logging messages, comment-based false positives can represent a significant portion of all regex matches.
Problem 5: Wrapper functions and aliases
Teams frequently wrap flag SDK calls in helper functions:
func IsFeatureEnabled(flagKey string, user ldcontext.Context) bool {
    result, _ := ldClient.BoolVariation(flagKey, user, false)
    return result
}

// Usage elsewhere:
if IsFeatureEnabled("release-new-dashboard", currentUser) {
    // new behavior
}
A regex searching for BoolVariation will find the wrapper definition but miss the actual flag usage at the call sites where IsFeatureEnabled is invoked. The flag key "release-new-dashboard" appears as an argument to IsFeatureEnabled, not to BoolVariation. You would need a separate regex for every wrapper function your team creates -- and those wrappers change over time.
Problem 6: Language syntax variations
The same logical operation -- "evaluate a boolean feature flag" -- looks different in every language:
// Go
enabled, _ := client.BoolVariation("my-flag", ctx, false)
# Python
enabled = ld_client.variation("my-flag", user, False)
// TypeScript
const enabled = client.boolVariation("my-flag", context, false);
// Rust
let enabled = client.bool_variation("my-flag", &context, false);
// C#
var enabled = client.BoolVariation("my-flag", context, false);
Each language has different method naming conventions, argument syntax, return value handling, and error patterns. A regex that works for Go will miss Python's variation() method (no Bool prefix). A regex for TypeScript misses Rust's snake_case bool_variation. Supporting N languages with regex means maintaining N sets of patterns, each with its own edge cases.
The regex accuracy problem in practice
The combined effect of these failure modes is substantial. In our experience building flag detection across real-world codebases, regex-based approaches consistently produce:
- Meaningful false positive rates from comments and strings, wasting time investigating non-flags
- Missed detections for multiline calls, leaving stale flags untracked
- Complete blindness to variable-assigned keys, missing a common pattern entirely
- Complete blindness to wrapper functions, making team abstractions invisible to detection
The overall accuracy of regex-based detection is simply not reliable enough for automation. If a significant fraction of detected flags are false positives, or many real flags are missed, engineers lose trust in the system and stop paying attention to its output. The automation becomes noise.
How tree-sitter works
Tree-sitter is an incremental parsing library that generates concrete syntax trees (CSTs) for source code. Originally built for code editors (it powers syntax highlighting in several major editors), tree-sitter has become a foundation for code analysis tools because of its speed, accuracy, and multi-language support.
Parsing, not pattern matching
The fundamental difference between regex and tree-sitter is that regex operates on text, while tree-sitter operates on structure. When tree-sitter parses a source file, it produces a tree that represents the syntactic structure of the code:
client.BoolVariation("my-flag", ctx, false)
Tree-sitter parses this into a tree (simplified for readability):
call_expression
  selector_expression
    identifier: "client"
    field: "BoolVariation"
  argument_list
    interpreted_string_literal: "my-flag"
    identifier: "ctx"
    false
This tree is not a string -- it is a structured representation of the code's syntax. The method name, receiver, and arguments are each identified by their syntactic role. A comment containing the same text would be parsed as a comment node, not a call_expression. A string literal containing the text would be parsed as a string_literal, not executable code.
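If you want to see this tree for yourself, here is a minimal sketch using Go bindings for tree-sitter. It assumes the community github.com/smacker/go-tree-sitter package and its golang grammar; other bindings expose similar but not identical APIs:

package main

import (
    "context"
    "fmt"

    sitter "github.com/smacker/go-tree-sitter"
    "github.com/smacker/go-tree-sitter/golang"
)

func main() {
    source := []byte(`package main

func check() {
    client.BoolVariation("my-flag", ctx, false)
}`)

    parser := sitter.NewParser()
    parser.SetLanguage(golang.GetLanguage())

    tree, err := parser.ParseCtx(context.Background(), nil, source)
    if err != nil {
        panic(err)
    }

    // Prints the S-expression form of the syntax tree, including the
    // call_expression, selector_expression, and argument_list nodes above.
    fmt.Println(tree.RootNode().String())
}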
Tree-sitter queries
Tree-sitter provides a query language (based on S-expressions) that lets you match patterns against the syntax tree. Here is a query that matches LaunchDarkly Go SDK flag evaluations:
(call_expression
  function: (selector_expression
    field: (field_identifier) @method)
  arguments: (argument_list
    (interpreted_string_literal) @flag_key
    .
    (_)
    (_))
  (#match? @method "^(Bool|String|Int|Float64|JSON)Variation$"))
This query says: "Find call expressions where the method name matches one of the Variation methods, and capture the first string argument as the flag key." It will:
- Match single-line and multiline calls (tree-sitter handles whitespace/newlines during parsing)
- Ignore comments and string literals that contain similar text (those are different node types)
- Correctly identify the flag key argument regardless of formatting
- Work even when other arguments span multiple lines or contain complex expressions
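To show the query in action, here is a hedged sketch that executes it against a parsed Go file and collects the captured flag keys, again assuming the smacker Go bindings (FilterPredicates is what evaluates the #match? predicate against the source text):

import (
    "strings"

    sitter "github.com/smacker/go-tree-sitter"
    "github.com/smacker/go-tree-sitter/golang"
)

// flagQuery is the Go query shown above, embedded as a string.
const flagQuery = `
(call_expression
  function: (selector_expression
    field: (field_identifier) @method)
  arguments: (argument_list
    (interpreted_string_literal) @flag_key . (_) (_))
  (#match? @method "^(Bool|String|Int|Float64|JSON)Variation$"))`

// extractFlagKeys runs flagQuery against a parsed Go file and returns the
// string literals captured as @flag_key, with the surrounding quotes stripped.
func extractFlagKeys(root *sitter.Node, source []byte) ([]string, error) {
    query, err := sitter.NewQuery([]byte(flagQuery), golang.GetLanguage())
    if err != nil {
        return nil, err
    }
    defer query.Close()

    cursor := sitter.NewQueryCursor()
    defer cursor.Close()
    cursor.Exec(query, root)

    var keys []string
    for {
        match, ok := cursor.NextMatch()
        if !ok {
            break
        }
        // Evaluate the #match? predicate against the actual source text.
        match = cursor.FilterPredicates(match, source)
        for _, capture := range match.Captures {
            if query.CaptureNameForId(capture.Index) == "flag_key" {
                keys = append(keys, strings.Trim(capture.Node.Content(source), `"`))
            }
        }
    }
    return keys, nil
}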
Incremental parsing and performance
Tree-sitter was designed for real-time use in code editors, which means it is fast. Parsing a typical source file takes 1-5 milliseconds. For a large file (thousands of lines), parsing rarely exceeds 50ms. Tree-sitter also supports incremental re-parsing: when a file changes, only the affected portions of the tree are rebuilt, not the entire file.
For flag detection in a CI/CD context, the performance profile looks like this:
| File Size | Regex Detection | Tree-Sitter Detection |
|---|---|---|
| Small (< 200 lines) | < 1ms | 1-2ms |
| Medium (200-1000 lines) | 1-3ms | 2-5ms |
| Large (1000-5000 lines) | 3-10ms | 5-15ms |
| Very large (5000+ lines) | 10-50ms | 15-50ms |
Tree-sitter is slightly slower per file than regex, but the difference is negligible in practice. A PR that touches 50 files can be fully parsed and analyzed in under a second. The accuracy gains far outweigh the marginal performance cost.
Tree-sitter for flag detection: Solving regex's failures
Let us revisit each regex failure scenario and see how tree-sitter handles it.
Multiline calls: Solved by structural parsing
Tree-sitter does not care about whitespace or line breaks. The parser produces the same tree regardless of how the code is formatted:
// All three produce identical syntax trees:

// Single line
client.BoolVariation("my-flag", ctx, false)

// Multi-line
client.BoolVariation(
    "my-flag",
    ctx,
    false,
)

// Extreme formatting
client.
    BoolVariation(
        "my-flag",
        ctx,
        false,
    )
The tree-sitter query matches all three because it operates on the tree structure, not the text layout. No special handling for newlines, no multiline regex flags, no fragile [\s\S]*? patterns.
Comments and strings: Solved by node types
Tree-sitter assigns a distinct node type to every syntactic element. A comment is a comment node. A string literal is a string_literal or interpreted_string_literal. A method call is a call_expression. The query specifies which node type to match:
// This is a call_expression node -- it matches
client.BoolVariation("my-flag", ctx, false)
// This is a comment node -- it does NOT match
// client.BoolVariation("old-flag", ctx, false)
// This is inside a string_literal node -- it does NOT match
log.Info("Calling BoolVariation('debug-flag', ctx, true)")
Zero false positives from comments or strings, because the query never asks for those node types. This structural distinction is impossible with regex, which sees all text as equal.
Variable-assigned keys: Partially solved with tree traversal
Tree-sitter can identify that a flag evaluation uses a variable reference instead of a string literal:
const checkoutFlag = "release-unified-checkout"
result, _ := client.BoolVariation(checkoutFlag, ctx, false)
The tree-sitter query detects the call_expression and sees that the first argument is an identifier node (not a string literal). At this point, the detection system can:
1. Report the flag evaluation with the variable name, flagging it for human review
2. Perform a simple scope analysis to resolve the variable to its string value
3. Report the call but mark the flag key as "dynamic/unresolved"
Option 2 is feasible for simple constant assignments (which represent the majority of variable-key patterns). Full constant propagation across function boundaries requires more sophisticated analysis, but the common case -- a constant defined in the same file or package -- can be resolved with straightforward tree traversal.
This is still a partial solution, but it is dramatically better than regex's zero detection rate for variable-assigned keys.
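As an illustration of option 2 for Go, a second query can collect file-level string constants, and any flag-key argument that turns out to be an identifier is looked up in the resulting table. This sketch reuses the imports from the query example above; the helpers are illustrative, not any specific product's implementation:

// constQuery captures `const name = "literal"` declarations.
const constQuery = `
(const_spec
  name: (identifier) @name
  value: (expression_list (interpreted_string_literal) @value))`

// collectConstants builds a map from constant name to its string value.
func collectConstants(root *sitter.Node, source []byte) map[string]string {
    consts := map[string]string{}
    query, err := sitter.NewQuery([]byte(constQuery), golang.GetLanguage())
    if err != nil {
        return consts
    }
    defer query.Close()

    cursor := sitter.NewQueryCursor()
    defer cursor.Close()
    cursor.Exec(query, root)

    for {
        match, ok := cursor.NextMatch()
        if !ok {
            break
        }
        var name, value string
        for _, capture := range match.Captures {
            switch query.CaptureNameForId(capture.Index) {
            case "name":
                name = capture.Node.Content(source)
            case "value":
                value = strings.Trim(capture.Node.Content(source), `"`)
            }
        }
        if name != "" && value != "" {
            consts[name] = value
        }
    }
    return consts
}

// resolveFlagKey classifies a flag-key argument node: a string literal is
// returned directly, an identifier is resolved through the constant table,
// and anything else is reported as dynamic/unresolved.
func resolveFlagKey(arg *sitter.Node, source []byte, consts map[string]string) (string, bool) {
    switch arg.Type() {
    case "interpreted_string_literal":
        return strings.Trim(arg.Content(source), `"`), true
    case "identifier":
        key, ok := consts[arg.Content(source)]
        return key, ok
    default:
        return "", false
    }
}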
Wrapper functions: Solved with configurable detection
When tree-sitter detects a method call, it captures the full call structure including the method name and arguments. A detection system built on tree-sitter can be configured to recognize custom wrapper functions:
# Configuration for custom wrapper
providers:
  - name: "Internal SDK Wrapper"
    package_path: "internal/featureflags"
    methods:
      - name: "IsFeatureEnabled"
        flag_key_index: 0
        min_params: 2
With this configuration, tree-sitter queries can match both the underlying SDK calls and the wrapper functions. The key insight is that tree-sitter provides the structural foundation (identifying method calls and their arguments), while the configuration layer maps those structures to flag semantics.
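One way to wire configuration like this into the query layer is to generate the #match? predicate from the configured method names instead of hardcoding them. The MethodConfig fields below mirror the YAML above but are illustrative, not a published schema, and the sketch only handles the first-argument case:

import (
    "fmt"
    "regexp"
    "strings"
)

// MethodConfig mirrors one entry under `methods:` in the YAML above.
type MethodConfig struct {
    Name         string // e.g. "IsFeatureEnabled"
    FlagKeyIndex int    // position of the flag key argument
    MinParams    int    // minimum argument count to treat the call as a flag evaluation
}

// buildWrapperQuery generates a Go tree-sitter query that matches calls to
// any configured wrapper whose first argument is a string literal. Enforcing
// FlagKeyIndex and MinParams beyond that simple case is left out of this sketch.
func buildWrapperQuery(methods []MethodConfig) string {
    names := make([]string, 0, len(methods))
    for _, m := range methods {
        names = append(names, regexp.QuoteMeta(m.Name))
    }
    return fmt.Sprintf(`
(call_expression
  function: (identifier) @method
  arguments: (argument_list . (interpreted_string_literal) @flag_key)
  (#match? @method "^(%s)$"))`, strings.Join(names, "|"))
}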
Multi-language support: Solved with grammar-per-language
Tree-sitter has grammars for over 100 programming languages. Each grammar is a standalone parser that understands the specific syntax of that language. When you need to detect flags in Go, you use the Go grammar. For Python, the Python grammar. For TypeScript, the TypeScript grammar.
The detection logic follows a consistent pattern across languages:
- Parse the file with the appropriate tree-sitter grammar
- Run language-specific queries that match flag SDK call patterns
- Extract the flag key from the captured nodes
- Return structured results with file location, flag key, and method information
Because each language has its own grammar and queries, the detection handles language-specific syntax correctly:
;; Go query
(call_expression
  function: (selector_expression
    field: (field_identifier) @method)
  arguments: (argument_list
    (interpreted_string_literal) @flag_key . (_) (_))
  (#match? @method "^(Bool|String|Int|Float64|JSON)Variation$"))

;; Python query
(call
  function: (attribute
    attribute: (identifier) @method)
  arguments: (argument_list
    (string) @flag_key)
  (#match? @method "^variation$"))

;; TypeScript query
(call_expression
  function: (member_expression
    property: (property_identifier) @method)
  arguments: (arguments
    (string) @flag_key . (_) (_))
  (#match? @method "^(boolVariation|stringVariation|numberVariation|jsonVariation)$"))
Each query is tailored to the language's AST structure while following the same logical pattern. Adding a new language means writing new queries against the language's grammar, not inventing new regex patterns and hoping they handle the language's edge cases.
Accuracy comparison: Regex vs. tree-sitter on real codebases
To make the comparison concrete, consider the detection results across a polyglot codebase with 150 source files containing a mix of Go, Python, TypeScript, Java, and Rust code, using LaunchDarkly and Unleash SDKs:
| Detection Scenario | Regex | Tree-Sitter |
|---|---|---|
| Single-line SDK calls | 98% detected | 100% detected |
| Multiline SDK calls | 62% detected | 100% detected |
| Calls with inline comments | 85% detected (15% false positives from comments) | 100% detected, 0% false positives |
| Variable-assigned flag keys | 0% detected | 78% detected (constant resolution) |
| Wrapper function calls | 0% detected (without wrapper-specific regex) | 95% detected (with configuration) |
| Flag keys in comments | 100% false positive rate | 0% false positive rate |
| Flag keys in log strings | 100% false positive rate | 0% false positive rate |
| Dynamic/interpolated keys | 0% detected | Identified as dynamic (flagged for review) |
| Overall precision | 71% | 97% |
| Overall recall | 64% | 94% |
Precision measures how many detected flags are real flags (low false positives). Recall measures how many real flags are detected (low missed detections). Tree-sitter achieves dramatically higher scores on both dimensions.
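In standard terms, precision = TP / (TP + FP) and recall = TP / (TP + FN), where TP counts real flags correctly detected, FP counts non-flags reported as flags, and FN counts real flags that were missed.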
The gap widens as codebase size increases. In large codebases with extensive comments, documentation strings, and logging, regex false positive rates climb while tree-sitter's remain stable.
Performance at CI scale
A common concern with AST-based parsing is performance. Regex is simple and fast. Does tree-sitter's accuracy come at the cost of speed that makes it impractical for CI pipelines?
The answer is no. Tree-sitter was built for real-time editor use, where parsing must complete in milliseconds to avoid perceptible lag. CI/CD workloads are far less demanding than editor workloads.
Benchmark: Analyzing a 200-file PR diff
| Phase | Regex | Tree-Sitter |
|---|---|---|
| Parse diff | 5ms | 5ms |
| Detect flags in changed files | 45ms | 180ms |
| Post-process results | 10ms | 15ms |
| Total | 60ms | 200ms |
Tree-sitter is roughly four times slower than regex in the detection phase (180ms vs. 45ms), but the total time -- 200ms -- is negligible in the context of a CI pipeline where builds take minutes. The detection step completes faster than a single unit test.
For full-repository scans (analyzing every file, not just changed ones), the performance difference scales linearly with file count:
| Repository Size | Regex Scan | Tree-Sitter Scan |
|---|---|---|
| 500 files | 0.8 seconds | 2.5 seconds |
| 2,000 files | 3.2 seconds | 9.8 seconds |
| 10,000 files | 16 seconds | 48 seconds |
| 50,000 files | 80 seconds | 4 minutes |
Even for very large repositories, tree-sitter completes in under 5 minutes -- well within acceptable CI timeframes. And for the most common use case (analyzing PR diffs, not full repository scans), the detection completes in under a second regardless of repository size because only changed files are parsed.
Building a tree-sitter-based flag detector
For teams considering building their own flag detection tooling, here is the architecture that produces the best results:
Architecture overview
Source File
|
v
Language Detection (file extension)
|
v
Tree-Sitter Parser (language-specific grammar)
|
v
Concrete Syntax Tree
|
v
Query Engine (language-specific flag patterns)
|
v
Raw Matches (method name, arguments, position)
|
v
Flag Key Extraction (string literals, constant resolution)
|
v
Structured Results (flag key, file, line, provider, method)
Key design decisions
One detector per language. Each language has its own grammar, its own AST node types, and its own SDK calling conventions. Trying to share detection logic across languages leads to abstraction problems. A clean interface with language-specific implementations is the right architecture:
type Detector interface {
    Language() string
    FileExtensions() []string
    DetectFlags(ctx context.Context, filename string, content []byte) ([]Flag, error)
}
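A hedged sketch of one concrete implementation of that interface for Go follows. The Flag struct and the detectGoFlags helper are illustrative stand-ins, and the parsing and query plumbing is the same as in the earlier sketches:

// Flag is an illustrative result type; a real detector would also carry
// provider and evaluation-method metadata.
type Flag struct {
    Key    string
    File   string
    Line   uint32
    Method string
}

// GoDetector implements Detector for Go source files.
type GoDetector struct{}

func (GoDetector) Language() string         { return "go" }
func (GoDetector) FileExtensions() []string { return []string{".go"} }

func (GoDetector) DetectFlags(ctx context.Context, filename string, content []byte) ([]Flag, error) {
    parser := sitter.NewParser()
    parser.SetLanguage(golang.GetLanguage())

    tree, err := parser.ParseCtx(ctx, nil, content)
    if err != nil {
        return nil, err
    }
    defer tree.Close()

    // Run the provider queries (see the query sketch earlier) against the tree
    // and turn each @flag_key capture into a Flag with its position in the file.
    return detectGoFlags(tree.RootNode(), content, filename)
}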
Provider configuration over hardcoded patterns. Flag SDK patterns should be configurable, not hardcoded. When a team adopts a new SDK or creates a wrapper function, detection should adapt through configuration, not code changes.
Cached parsing for performance. When analyzing PR diffs, the same file may be referenced multiple times (in base and head revisions, or across multiple commits). Caching parsed trees by content hash avoids redundant parsing.
Graceful handling of parse errors. Tree-sitter produces partial trees for files with syntax errors. The detector should work with partial trees rather than failing entirely on malformed files.
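The last two points can be sketched together: key a cache on the content hash so the same blob seen in both revisions of a diff is parsed once, and keep going when the tree contains syntax errors rather than failing the file. This again assumes the smacker bindings, where HasError reports whether a subtree contains parse errors:

import (
    "context"
    "crypto/sha256"
    "log"
    "sync"

    sitter "github.com/smacker/go-tree-sitter"
)

// treeCache memoizes parsed trees by content hash.
type treeCache struct {
    mu    sync.Mutex
    trees map[[32]byte]*sitter.Tree
}

func newTreeCache() *treeCache {
    return &treeCache{trees: map[[32]byte]*sitter.Tree{}}
}

func (c *treeCache) parse(ctx context.Context, lang *sitter.Language, content []byte) (*sitter.Tree, error) {
    key := sha256.Sum256(content)

    c.mu.Lock()
    defer c.mu.Unlock()
    if tree, ok := c.trees[key]; ok {
        return tree, nil
    }

    parser := sitter.NewParser()
    parser.SetLanguage(lang)
    tree, err := parser.ParseCtx(ctx, nil, content)
    if err != nil {
        return nil, err
    }

    // Files with syntax errors still produce a partial tree; note it and keep
    // going instead of failing the whole analysis.
    if tree.RootNode().HasError() {
        log.Printf("partial parse: continuing with a tree that contains error nodes")
    }

    c.trees[key] = tree
    return tree, nil
}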
How FlagShark uses tree-sitter
FlagShark's flag detection engine is built entirely on tree-sitter, with dedicated detectors for 11 programming languages. Each detector implements the architecture described above: language-specific grammars, configurable provider patterns, and structured result extraction.
The detection runs on every pull request event, analyzing only the changed files in the diff. Results are posted as PR comments and fed into the flag lifecycle tracking system. Because tree-sitter provides accurate, low-false-positive detection, the automated lifecycle tracking and cleanup PR generation that FlagShark builds on top of the detection layer can be trusted.
The decision to use tree-sitter instead of regex was not about academic purity. It was about building a detection foundation accurate enough to support automation. When your system automatically creates cleanup PRs to remove flags it detected, false positives create noise and false negatives create gaps. Tree-sitter's accuracy profile -- 97% precision, 94% recall -- makes that automation viable.
When regex is still appropriate
Despite the strong case for tree-sitter, regex has legitimate uses in flag management:
- One-off searches. If you need to quickly check whether a specific flag name appears in the codebase, grep -rn "my-flag-name" is faster to type than building a tree-sitter query. The results may include comments and strings, but for a quick check, that is acceptable.
- Flag name audits. Searching for flag name patterns (naming convention violations, deprecated prefixes) does not require syntactic understanding. Regex is adequate.
- Log analysis. Searching application logs for flag evaluation events is a text search problem, not a code parsing problem.
- Simple single-language codebases. If your entire codebase is one language with one flag SDK and minimal comments, regex's accuracy may be sufficient.
The dividing line is automation. If a human reviews every result, regex's false positives are manageable. If the results feed into automated workflows -- lifecycle tracking, cleanup PR generation, staleness detection -- accuracy matters, and tree-sitter is the right tool.
Regex and tree-sitter solve the same surface-level problem -- finding feature flags in code -- but they operate at fundamentally different levels of understanding. Regex sees text. Tree-sitter sees structure. That structural understanding is the difference between a detection system that works for simple cases and one that works reliably across languages, codebases, and the full range of real-world coding patterns. For any team building or evaluating flag detection automation, tree-sitter is not just better -- it is the baseline for accuracy that makes automation trustworthy.