2025-12-15 | PreviewProof Team
Feature Flags for Preview Environments: Scoping, Defaults, and Cleanup
Feature flags and preview environments interact in ways that don’t show up in any vendor’s docs and break a lot of demos. When previews inherit production’s flag state during development, every preview shows a production-like view, so the feature the PR is supposed to demonstrate stays hidden behind a flag that is still off. A flag scoped to specific user IDs won’t behave correctly when the preview uses synthetic users. Flag toggles during review pollute the next reviewer’s experience. And flags created during a PR’s life often outlive the PR, accumulating into a graveyard nobody trusts.
Most teams treat flags as a global system. Previews are not global. They need their own scope.
What Goes Wrong
Three failure modes recur:
The invisible feature. A PR adds a new dashboard widget gated behind dashboard_v2_widgets, off in production. The preview inherits production’s flag state, so the widget is off there too. The reviewer sees the old dashboard, approves, the code ships. A week later when the flag is flipped, the widget is broken in a way that was visible in the preview the whole time — except nobody saw it.
The phantom user. Flags scoped to user IDs (user_id in [12, 47, 89]) work fine in production. The preview’s seeded data uses synthetic user IDs that don’t match. The evaluator falls back to the default, and the reviewer sees behavior that has no relationship to what production users see.
The pollution. A reviewer toggles a flag to test “what if this is on.” Three minutes later, a different reviewer opens the same preview and sees the toggled state. Their review is now invalid, and they don’t know it.
The Pattern: Per-Preview Flag Scopes
Treat each preview as its own flag scope. Flags can be set independently per preview, defaults are computed at provision time, and teardown removes the scope.
LaunchDarkly has environments as a first-class concept, but the cleaner pattern is a single shared preview environment plus contexts. Every flag evaluation includes a preview context keyed by the preview ID, and targeting rules match on it.
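    // Multi-context: the user plus a "preview" context keyed by the preview ID.
    const ldContext = {
      kind: 'multi',
      user: { key: currentUser.id },
      preview: { key: process.env.PREVIEW_ID, branch: process.env.GIT_BRANCH },
    };
    // The three-argument form is the server-side SDK, where variation() returns a Promise.
    const value = await ldClient.variation('dashboard_v2_widgets', ldContext, false);

A per-preview targeting rule, set at provision time via the LaunchDarkly API, overrides the flag for that preview_id.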
Unleash supports environments and constraints; the open-source edition can handle per-preview scoping via custom strategies. ConfigCat uses user attributes the same way LaunchDarkly uses contexts. Flipt is self-hosted; use a namespace per preview rather than a separate instance per preview.
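Whatever platform you pick, the shape of a per-preview scope is the same. Here is a minimal, vendor-neutral sketch of that shape, assuming boolean flags and an in-memory store. The class and method names are hypothetical and do not correspond to any platform’s API.

```typescript
// Hypothetical in-memory model of per-preview flag scopes; not a vendor API.
type FlagValue = boolean;

class PreviewFlagScopes {
  // previewId -> (flag name -> override value)
  private scopes = new Map<string, Map<string, FlagValue>>();

  // Record an override that applies to one preview only.
  setOverride(previewId: string, flag: string, value: FlagValue): void {
    let scope = this.scopes.get(previewId);
    if (!scope) {
      scope = new Map<string, FlagValue>();
      this.scopes.set(previewId, scope);
    }
    scope.set(flag, value);
  }

  // Resolve a flag: the preview's own override wins, otherwise the fallback.
  evaluate(previewId: string, flag: string, fallback: FlagValue): FlagValue {
    return this.scopes.get(previewId)?.get(flag) ?? fallback;
  }

  // Drop every override for a preview; returns true if a scope existed.
  deleteScope(previewId: string): boolean {
    return this.scopes.delete(previewId);
  }
}

const scopes = new PreviewFlagScopes();
scopes.setOverride("pr-101", "dashboard_v2_widgets", true);
```

An override set for one preview is invisible to every other preview, and deleting the scope returns that preview to the fallback value.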
Default-On for the Feature Under Review
Once you have per-preview scoping, make the feature being previewed default-on automatically. The signal is in the PR: branch name, title, or — better — an explicit block in the description.
    ## Flags
    - dashboard_v2_widgets: on
    - legacy_billing: off

A small bot reads it and configures the flag scope at provision time:
    const flagOverrides = parseFromPR(pr.body);
    for (const [flag, value] of Object.entries(flagOverrides)) {
      await flagClient.setOverride({
        flag,
        context: { preview: pr.previewId },
        value,
      });
    }

This single change fixes the most common reason previews “don’t work”: the reviewer sees the feature on first load, not after digging through a flag console nobody set up access to.
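The parseFromPR helper is not defined above. One way it might work, as a minimal sketch: the `## Flags` block format comes from the example PR description, while the parsing rules (any other heading ends the block, values are only `on` or `off`) are assumptions.

```typescript
// Hypothetical implementation of parseFromPR. The "## Flags" block format is
// taken from the example PR description; the parsing rules are assumptions.
function parseFromPR(body: string): Record<string, boolean> {
  const overrides: Record<string, boolean> = {};
  let inFlagsBlock = false;

  for (const rawLine of body.split(/\r?\n/)) {
    const line = rawLine.trim();

    // A "Flags" heading opens the block; any other markdown heading closes it.
    if (line.startsWith("#")) {
      inFlagsBlock = /^#+\s*flags\s*$/i.test(line);
      continue;
    }
    if (!inFlagsBlock) continue;

    // Match list items such as "- dashboard_v2_widgets: on".
    const match = /^[-*]\s*([A-Za-z0-9_.-]+)\s*:\s*(on|off)$/i.exec(line);
    if (match) {
      overrides[match[1]] = match[2].toLowerCase() === "on";
    }
  }
  return overrides;
}
```

Lines that do not match the `- name: on|off` shape are ignored rather than rejected, so a typo silently drops an override. A stricter bot could comment on the PR instead.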
A weaker variant: convention. If the branch starts with feat/dashboard-v2-, the provisioner enables every flag matching dashboard_v2_*. Brittle, but better than nothing.
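A sketch of the convention-based variant, assuming the provisioner keeps an explicit table that maps branch prefixes to flag-name prefixes. The table, the function name, and the flag list are all hypothetical.

```typescript
// Hypothetical convention table: branch prefix -> flag-name prefix.
const BRANCH_CONVENTIONS: Array<{ branchPrefix: string; flagPrefix: string }> = [
  { branchPrefix: "feat/dashboard-v2-", flagPrefix: "dashboard_v2_" },
];

// Return the flags a branch should get enabled by naming convention alone.
function flagsForBranch(branch: string, allFlags: string[]): string[] {
  const enabled = new Set<string>();
  for (const { branchPrefix, flagPrefix } of BRANCH_CONVENTIONS) {
    if (!branch.startsWith(branchPrefix)) continue;
    for (const flag of allFlags) {
      if (flag.startsWith(flagPrefix)) enabled.add(flag);
    }
  }
  return Array.from(enabled);
}
```

Keeping the mapping in a table rather than deriving it from the branch name makes the brittleness visible: a branch that does not match any entry simply gets no flags enabled.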
Default-Off for Unrelated Flags
Unrelated flags should default to off in previews, not “whatever production is.” Counterintuitive — most teams want previews to look like production — but consider what you actually want to verify.
The point of a preview is to see whether this PR’s change works. If unrelated flags are on and the new feature interacts buggily with them, you don’t know whether the bug is in the new feature or in the interaction. Defaulting unrelated flags off gives a clean baseline. Reviewers flip flags on intentionally to test interactions; the default is “everything off except what this PR touches.” Inverse of how most teams set up flag environments. Feels wrong for ten minutes, then feels obviously correct.
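In code, the rule amounts to starting every known flag at off and then layering the PR’s own overrides on top. A minimal sketch, assuming boolean flags; the function name and inputs are hypothetical.

```typescript
// Compute a preview's initial flag state: everything off except what the PR asks for.
function previewDefaults(
  allFlags: string[],
  prOverrides: Record<string, boolean>,
): Record<string, boolean> {
  const state: Record<string, boolean> = {};

  // Clean baseline: unrelated flags start off regardless of production state.
  for (const flag of allFlags) {
    state[flag] = false;
  }

  // The PR's explicit overrides win over the baseline.
  for (const flag of Object.keys(prOverrides)) {
    state[flag] = prOverrides[flag];
  }
  return state;
}
```

Production’s flag state is deliberately not an input here, which is the whole point of the pattern.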
Build-Time vs Runtime Flags
Like environment variables in previews, flags split between build-time and runtime evaluation, and the build-time variety causes preview-specific problems.
Some flag systems bake values into the bundle at build time. Vercel’s flags SDK does this; LaunchDarkly’s edge SDK does too. Convenient — until you realize previews need preview-specific values baked in, meaning rebuilds per preview rather than reusing a single image.
Cleanest workaround: don’t evaluate at build time in previews. Use the runtime SDK, accept the small performance cost. Production can still use build-time eval for hot paths.
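One way to express that split is a single resolver that only trusts baked-in values outside previews. This is a sketch under assumptions: the interface and names are hypothetical, and the runtime lookup is shown as synchronous for brevity even though real SDK calls are often asynchronous.

```typescript
// Hypothetical sources a flag value can come from.
interface FlagSources {
  isPreview: boolean;
  // Values baked into the bundle at build time.
  buildTimeValues: Record<string, boolean>;
  // Stand-in for a runtime SDK evaluation.
  evaluateAtRuntime: (flag: string) => boolean;
}

// Previews always evaluate at runtime; production may use baked values.
function resolveFlag(flag: string, sources: FlagSources): boolean {
  if (!sources.isPreview && flag in sources.buildTimeValues) {
    return sources.buildTimeValues[flag];
  }
  return sources.evaluateAtRuntime(flag);
}
```

With this shape, one built image can serve every preview, because nothing preview-specific is read from the bundle.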
Cleanup
Flags created for a feature usually live longer than the feature. In preview-aware setups the problem multiplies: every PR adds flags to your scope, and unless you clean up, the system grows monotonically.
Mark preview-only flags at creation. A naming convention (preview_dashboard_v2_widgets) or a tag distinguishes preview flags from permanent ones. When PRs merge or close, run a cleanup that removes preview-only flags.
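A sketch of the selection step in that cleanup, assuming each flag record carries its tags and the PR that created it. The record shape, the `preview-only` tag name, and the `createdByPr` field are hypothetical; the `preview_` prefix follows the naming convention above.

```typescript
// Hypothetical flag record as the cleanup job might see it.
interface FlagRecord {
  key: string;
  tags: string[];
  createdByPr?: number;
}

// A flag is preview-only if it follows the naming convention or carries the tag.
function isPreviewOnly(flag: FlagRecord): boolean {
  return flag.key.startsWith("preview_") || flag.tags.includes("preview-only");
}

// Select the preview-only flags that belong to a merged or closed PR.
function flagsToDelete(flags: FlagRecord[], closedPr: number): string[] {
  return flags
    .filter((flag) => isPreviewOnly(flag) && flag.createdByPr === closedPr)
    .map((flag) => flag.key);
}
```

Permanent flags never match, even when the same PR created them, so the job is safe to run on every merge or close event.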
Tear down the scope when the preview tears down. Whatever you used to scope flags (context attributes, environments, namespaces) should be deleted with the preview. Two lines in your teardown hook:
    await flagClient.deleteContext({ preview: pr.previewId });
    await flagClient.deleteOverridesFor(pr.previewId);

Reviewer-Facing Flag UI
Put a small UI in the preview itself that shows flag state and lets reviewers toggle flags scoped to that preview only. Reviewers don’t need access to your production console; they get a small admin panel exposing only the preview’s overrides.
Implementation is straightforward — an admin route gated behind preview-only auth that wraps the flag SDK’s override API. The benefit: reviewer experimentation no longer pollutes other previews. They flip a flag, the override is scoped to this preview, and the change disappears with it. It also helps with audit if you record each toggle alongside the reviewer’s identity.
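The core of such a panel can be sketched as a thin wrapper that holds one preview’s overrides and records who changed what. This assumes an in-memory store in place of a real flag SDK; the class, its methods, and the audit entry shape are hypothetical, and the HTTP route and auth are left out.

```typescript
// Hypothetical audit entry recorded for every reviewer toggle.
interface ToggleAuditEntry {
  previewId: string;
  flag: string;
  value: boolean;
  reviewer: string;
  at: string; // ISO 8601 timestamp
}

// Holds the overrides for exactly one preview, plus an audit trail.
class PreviewFlagPanel {
  readonly auditLog: ToggleAuditEntry[] = [];
  private readonly previewId: string;
  private readonly overrides = new Map<string, boolean>();

  constructor(previewId: string) {
    this.previewId = previewId;
  }

  // Apply a reviewer's toggle to this preview only, and record it.
  toggle(flag: string, value: boolean, reviewer: string): void {
    this.overrides.set(flag, value);
    this.auditLog.push({
      previewId: this.previewId,
      flag,
      value,
      reviewer,
      at: new Date().toISOString(),
    });
  }

  // Snapshot of the current overrides, suitable for rendering in the panel.
  state(): Record<string, boolean> {
    const snapshot: Record<string, boolean> = {};
    this.overrides.forEach((value, flag) => {
      snapshot[flag] = value;
    });
    return snapshot;
  }
}
```

Because each panel instance is bound to one preview ID, a toggle made while reviewing one PR cannot leak into another reviewer’s preview.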
If you’d rather not wire up per-preview flag scoping, default management, and teardown cleanup yourself, PreviewProof integrates with the major flag platforms and handles per-preview scope provisioning and reviewer-facing toggle UI as part of the preview lifecycle. If you’re building it yourself, the patterns above are the ones we’d recommend — and the cleanup discipline is the part most teams skip until the flag graveyard makes itself visible.