feat(run-ops): webapp db topology, flags, and split-mode resolver wiring #4117

Draft: d-cs wants to merge 9 commits into runops/pr04-store-engine from runops/pr05-webapp-foundation

Conversation

@d-cs

@d-cs d-cs commented Jul 2, 2026

Collaborator

What

Wires the run-ops split into the webapp: database topology, environment flags, split-mode gating, and the control-plane resolver/cache layer that the run-store and run-engine seams from the previous PR plug into.

  • DB topology & env (apps/webapp/app/db.server.ts, env.server.ts, entry.server.tsx): adds the run-ops database clients/topology and the environment variables that configure and gate the split.
  • runOpsMigration module (new apps/webapp/app/v3/runOpsMigration/): the webapp-side machinery — splitMode.server.ts, controlPlaneResolver.server.ts + controlPlaneCache.server.ts, readThrough.server.ts, crossSeamGuard.server.ts, distinctDbSentinel.server.ts, id-minting helpers (mintBatchFriendlyId, runOpsMintKind, resolveInheritedMintKind), runOpsCascadeCleanup.server.ts, the split read gate, and route/unblock catalogs.
  • Store/engine wiring (app/v3/runStore.server.ts, runEngine.server.ts, runEngineHandlers.server.ts + new runEngineHandlersShared.server.ts): points the webapp's store/engine construction at the resolver, and factors shared handler logic out so both seams use one path.
  • Read-path touch-ups: runtimeEnvironment.server.ts, eventRepository/index.server.ts, taskRunHeartbeatFailed.server.ts, and engineVersion.server.ts now route their run/environment lookups through the resolver's read-through path.
  • 413a94511 — interlocks split mode against the native realtime backend so the two aren't enabled in an incompatible combination (see .server-changes/run-ops-split-realtime-interlock.md).
  • dc74c57fd — drops the earlier "known-migrated" read layer; residency is determined by id-shape only.

Why

PR5 of the run-ops split stack. This is the webapp foundation layer: it stands up the DB topology, flags, and resolver/cache the rest of the stack depends on, and repoints webapp read paths through the resolver. Additive when the split is not enabled (existing single-DB behavior preserved behind flags); behavior-changing on the read-through paths and the realtime interlock.

Tests

New vitest coverage across apps/webapp/test/ and colocated *.server.test.ts files: db topology, split mode, split read gate, cross-seam guard, mint cutover / flip latency, control-plane cache, control-plane resolver, distinct-db sentinel, read-through loaders (route loaders, run-detail loaders, findEnvironmentFromRun), and the run-engine handlers. Testcontainers-backed; no mocks. pnpm-lock.yaml synced for the two new webapp deps.

Notes

Draft, stacked on #4116 (runops/pr04-store-engine). Review that first; this diff is against it.

Server-change / changeset note to be added at stack-assembly time.

🤖 Generated with Claude Code

@changeset-bot

changeset-bot Bot commented Jul 2, 2026

⚠️ No Changeset found

Latest commit: 8024e36

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 728cb38b-0e8a-4f00-bba5-f98ecad0c214

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

Devin Review found 1 potential issue.

Comment thread apps/webapp/app/v3/runEngineHandlersShared.server.ts

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 10

🧹 Nitpick comments (10)
apps/webapp/test/v3/runOpsMigration/runEngineControlPlaneResolver.server.test.ts (1)

121-200: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Missing test coverage for resolveAuthenticatedEnv.

This suite covers resolveEnv, resolveWorkerVersion, and assertEnvExists, but not resolveAuthenticatedEnv — the one method in the adapter with custom (uncached) query logic. Adding a test asserting it resolves the authenticated shape (including git) and returns null for a missing env would close the gap and would likely have caught the cache-bypass behavior flagged in runEngineControlPlaneResolver.server.ts.

apps/webapp/app/v3/taskRunHeartbeatFailed.server.ts (1)

56-64: 🚀 Performance & Scalability | 🔵 Trivial | 💤 Low value

Optional: defer lockedWorker resolution to where it's used.

env is needed for the early-return guard on every path, but lockedWorker is only consumed in the completed-status branch (line 158). Resolving it here adds a resolver call (a DB read under split-OFF) for PENDING/DEQUEUED/EXECUTING/etc. that never use it. Moving the resolveRunLockedWorker call into that branch avoids the unnecessary lookup.

apps/webapp/app/v3/runOpsMigration/runOpsCascadeCleanup.server.ts (2)

118-275: 🩺 Stability & Availability | 🔵 Trivial

No logging/audit trail for a destructive cross-DB cleanup.

RunOpsCascadeCleanupService deletes rows across multiple writers with no logger calls anywhere (compare to other .server.ts files in this codebase that log around risky operations). Given deletes are non-transactional and rely on "recovered by re-running" on crash, structured logging of the per-table counts returned by cleanupEnvironment/cleanupProject (or at least on entry/exit) would materially help operators diagnose partial-cleanup states and stray rows after a crash.


1-275: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

No colocated test coverage for this destructive service.

This file isn't accompanied by a test in this cohort. Given it performs unconditional deleteMany cascades across control-plane and run-ops writers (and is deliberately not gated behind isSplitEnabled()), unit/integration coverage (e.g., verifying ordering doesn't throw, per-table counts, dedup-by-reference in single-DB mode) seems warranted before this ships. Want me to draft a Testcontainers-based test using the existing @internal/testcontainers helpers?

apps/webapp/app/v3/runOpsMigration/runOpsMintKind.flipLatency.test.ts (1)

13-25: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Test reimplements the cache wiring instead of exercising resolveRunIdMintKind directly.

makeCachedFlag duplicates resolveRunIdMintKind's internal mintCache.get/set logic rather than importing and calling the real function. If the production caching wiring (key derivation, TTL source, override-forwarding) changes, this suite would keep passing against its own copy without catching the drift. Given the comment explicitly frames this as a documented "current-behavior lock," this is acceptable as-is, but consider testing resolveRunIdMintKind itself (with env vars and $replica mocked) if higher-fidelity coverage is later desired.

apps/webapp/app/v3/runOpsMigration/types.ts (1)

11-25: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Use type instead of interface for these data shapes.

CrossSeamGuardInput and CrossSeamGuardDecision are plain data shapes, not behavioral contracts implemented by collaborators, so per project convention they should be type aliases.

♻️ Suggested fix
-export interface CrossSeamGuardInput {
+export type CrossSeamGuardInput = {
   waitpointId: string;
   routeKind: UnblockRouteKind;
   treeOwnerResidency?: RunOpsResidency;
   isCrossTreeIdempotency?: boolean;
   hasLegacyParent?: boolean;
-}
+};

-export interface CrossSeamGuardDecision {
+export type CrossSeamGuardDecision = {
   store: StoreTarget;
   /** Always the waitpoint's OWN classification, even when pinned to legacy. */
   residency: RunOpsResidency;
   routeKind: UnblockRouteKind;
   pinnedReason?: "non-tree-owned" | "cross-tree-idempotency" | "legacy-parent-descendant";
-}
+};

As per coding guidelines: "Use types over interfaces for TypeScript".

Source: Coding guidelines

apps/webapp/app/v3/runOpsMigration/unblockRouteCatalog.ts (1)

10-17: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Use type instead of interface for UnblockRoute.

UnblockRoute is a plain data shape (route metadata record), not a behavioral contract.

♻️ Suggested fix
-export interface UnblockRoute {
+export type UnblockRoute = {
   id: string;
   kind: UnblockRouteKind;
   /** The relative source path, e.g. "internal-packages/run-engine/src/engine/index.ts". */
   site: string;
   /** Enclosing method/symbol name — NEVER a line number. */
   symbol: string;
-}
+};

As per coding guidelines: "Use types over interfaces for TypeScript".

Source: Coding guidelines

apps/webapp/app/v3/runOpsMigration/crossSeamGuard.server.ts (1)

11-17: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

KNOWN_ROUTE_KINDS duplicates the UnblockRouteKind union.

This set must be manually kept in sync with UnblockRouteKind in types.ts; adding a new kind there without updating this set silently makes assertKnownRouteKind reject a valid kind (or vice versa if forgotten here). Consider deriving this from the type via a satisfies-checked array, or centralizing the literal list in types.ts and building both the union and the set from it.
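One way to centralize the literal list, sketched under assumptions: the kind names below are placeholders (the real UnblockRouteKind members are not listed in this thread), and assertKnownRouteKind's actual signature may differ. The union and the runtime set are both derived from one `as const` array, so they cannot drift apart.

```typescript
// Placeholder kind names: the real UnblockRouteKind members are not shown here.
const UNBLOCK_ROUTE_KINDS = ["kind-a", "kind-b", "kind-c"] as const;

// The union is derived from the array, so adding a kind updates both at once.
type UnblockRouteKind = (typeof UNBLOCK_ROUTE_KINDS)[number];

// Runtime set built from the same array; no second hand-maintained list.
const KNOWN_ROUTE_KINDS: ReadonlySet<string> = new Set<string>(UNBLOCK_ROUTE_KINDS);

// Type guard over the single source of truth.
function isKnownRouteKind(value: string): value is UnblockRouteKind {
  return KNOWN_ROUTE_KINDS.has(value);
}

// Hypothetical stand-in for the guard's assertion helper.
function assertKnownRouteKind(value: string): asserts value is UnblockRouteKind {
  if (!isKnownRouteKind(value)) {
    throw new Error(`Unknown unblock route kind: ${value}`);
  }
}
```

With this shape, adding a new kind is a one-line change to the array, and the compiler flags any switch that no longer covers the union.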

apps/webapp/app/v3/runOpsMigration/readThrough.server.test.ts (1)

41-153: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add coverage for the unclassifiable-id → LEGACY fallback.

No test exercises the UnclassifiableRunId catch branch in readThrough.server.ts (defaults to "LEGACY" + logger.warn), which is a materially different behavior from crossSeamGuard.server.ts's loud rethrow for the same error type. A regression here would go undetected.

✅ Suggested additional test
heteroPostgresTest(
  "ambiguous-length id falls back to LEGACY residency (new-then-legacy probe)",
  async ({ prisma14, prisma17 }) => {
    const AMBIGUOUS_RUN_ID = "run_" + "a".repeat(10); // neither cuid nor ksuid length
    const warn = vi.fn();

    const result = await readThroughRun({
      runId: AMBIGUOUS_RUN_ID,
      environmentId: "env_1",
      readNew: (c) => realRead(c, false),
      readLegacy: (c) => realRead(c, true),
      deps: {
        splitEnabled: true,
        newClient: prisma17 as unknown as PrismaReplicaClient,
        legacyReplica: prisma14 as unknown as PrismaReplicaClient,
        logger: { warn },
      },
    });

    expect(result.source).toBe("legacy-replica");
    expect(warn).toHaveBeenCalled();
  }
);

apps/webapp/app/v3/runEngineHandlers.server.ts (1)

211-211: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use a named sentinel constant instead of scattered "" literals.

readRunForEventOrThrow/readRunForEvent are called with a bare "" placeholder for environmentId in three places (runAttemptFailed, and twice in cachedRunCompleted). As per coding guidelines: "Use named constants for sentinel/placeholder values (for example, const UNSET_VALUE = "__unset__") instead of scattering raw string literals across comparisons."

♻️ Suggested fix
// near eventReadDeps definition
const NO_ENVIRONMENT_ID = ""; // residency is keyed on runId; environmentId is informational only

Then replace the three "" call-site literals with NO_ENVIRONMENT_ID.

Also applies to: 299-299, 330-330

Source: Coding guidelines


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 548c81c5-37b0-432b-b866-589a9d484ed6

📥 Commits

Reviewing files that changed from the base of the PR and between 88d1290 and 413a945.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (51)
  • .server-changes/run-ops-split-realtime-interlock.md
  • apps/webapp/CLAUDE.md
  • apps/webapp/app/db.server.ts
  • apps/webapp/app/entry.server.tsx
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/models/runtimeEnvironment.server.ts
  • apps/webapp/app/v3/engineVersion.server.ts
  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/v3/featureFlags.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/v3/runEngineHandlersShared.server.ts
  • apps/webapp/app/v3/runOpsMigration/controlPlaneCache.server.test.ts
  • apps/webapp/app/v3/runOpsMigration/controlPlaneCache.server.ts
  • apps/webapp/app/v3/runOpsMigration/controlPlaneResolver.server.ts
  • apps/webapp/app/v3/runOpsMigration/crossSeamGuard.server.ts
  • apps/webapp/app/v3/runOpsMigration/distinctDbSentinel.server.ts
  • apps/webapp/app/v3/runOpsMigration/mintBatchFriendlyId.server.test.ts
  • apps/webapp/app/v3/runOpsMigration/mintBatchFriendlyId.server.ts
  • apps/webapp/app/v3/runOpsMigration/readThrough.server.test.ts
  • apps/webapp/app/v3/runOpsMigration/readThrough.server.ts
  • apps/webapp/app/v3/runOpsMigration/resolveInheritedMintKind.server.test.ts
  • apps/webapp/app/v3/runOpsMigration/resolveInheritedMintKind.server.ts
  • apps/webapp/app/v3/runOpsMigration/runEngineControlPlaneResolver.server.ts
  • apps/webapp/app/v3/runOpsMigration/runOpsCascadeCleanup.server.ts
  • apps/webapp/app/v3/runOpsMigration/runOpsMintKind.flipLatency.test.ts
  • apps/webapp/app/v3/runOpsMigration/runOpsMintKind.server.test.ts
  • apps/webapp/app/v3/runOpsMigration/runOpsMintKind.server.ts
  • apps/webapp/app/v3/runOpsMigration/runOpsSplitReadGate.ts
  • apps/webapp/app/v3/runOpsMigration/splitMode.server.ts
  • apps/webapp/app/v3/runOpsMigration/types.ts
  • apps/webapp/app/v3/runOpsMigration/unblockRouteCatalog.ts
  • apps/webapp/app/v3/runStore.server.test.ts
  • apps/webapp/app/v3/runStore.server.ts
  • apps/webapp/app/v3/taskRunHeartbeatFailed.server.ts
  • apps/webapp/package.json
  • apps/webapp/test/findEnvironmentFromRun.readthrough.test.ts
  • apps/webapp/test/routeLoaders.controlPlane.readthrough.test.ts
  • apps/webapp/test/runDetailLoaders.controlPlane.readthrough.test.ts
  • apps/webapp/test/runEngineHandlers.test.ts
  • apps/webapp/test/runOpsCrossSeamGuard.test.ts
  • apps/webapp/test/runOpsDbTopology.test.ts
  • apps/webapp/test/runOpsMintCutover.test.ts
  • apps/webapp/test/runOpsSplitMode.test.ts
  • apps/webapp/test/runOpsSplitReadGate.test.ts
  • apps/webapp/test/services.controlPlane.readthrough.test.ts
  • apps/webapp/test/v3/runOpsMigration/controlPlaneRepoint.server.test.ts
  • apps/webapp/test/v3/runOpsMigration/controlPlaneResolver.server.test.ts
  • apps/webapp/test/v3/runOpsMigration/distinctDbSentinel.server.test.ts
  • apps/webapp/test/v3/runOpsMigration/runEngineControlPlaneResolver.server.test.ts
  • apps/webapp/vitest.config.ts

Comment thread apps/webapp/app/db.server.ts
Comment thread apps/webapp/app/entry.server.tsx
Comment thread apps/webapp/app/env.server.ts
Comment thread apps/webapp/app/v3/eventRepository/index.server.ts
Comment thread apps/webapp/app/v3/runOpsMigration/controlPlaneResolver.server.ts
Comment thread apps/webapp/app/v3/runOpsMigration/controlPlaneResolver.server.ts
Comment thread apps/webapp/app/v3/runOpsMigration/crossSeamGuard.server.ts Outdated
Comment thread apps/webapp/app/v3/runOpsMigration/distinctDbSentinel.server.ts
Comment thread apps/webapp/test/runDetailLoaders.controlPlane.readthrough.test.ts
d-cs force-pushed the runops/pr04-store-engine branch from 88d1290 to d5610a9 on July 2, 2026 18:02
d-cs force-pushed the runops/pr05-webapp-foundation branch from 413a945 to 99643f8 on July 2, 2026 18:02
@pkg-pr-new

pkg-pr-new Bot commented Jul 2, 2026

@trigger.dev/build

npm i https://pkg.pr.new/@trigger.dev/build@e0b35d5

trigger.dev

npm i https://pkg.pr.new/trigger.dev@e0b35d5

@trigger.dev/core

npm i https://pkg.pr.new/@trigger.dev/core@e0b35d5

@trigger.dev/python

npm i https://pkg.pr.new/@trigger.dev/python@e0b35d5

@trigger.dev/react-hooks

npm i https://pkg.pr.new/@trigger.dev/react-hooks@e0b35d5

@trigger.dev/redis-worker

npm i https://pkg.pr.new/@trigger.dev/redis-worker@e0b35d5

@trigger.dev/rsc

npm i https://pkg.pr.new/@trigger.dev/rsc@e0b35d5

@trigger.dev/schema-to-json

npm i https://pkg.pr.new/@trigger.dev/schema-to-json@e0b35d5

@trigger.dev/sdk

npm i https://pkg.pr.new/@trigger.dev/sdk@e0b35d5

commit: e0b35d5

d-cs force-pushed the runops/pr04-store-engine branch from 460477f to 1af2bab on July 2, 2026 19:25
d-cs force-pushed the runops/pr05-webapp-foundation branch from 26871d5 to cdc4eb9 on July 2, 2026 19:25
d-cs and others added 7 commits July 2, 2026 21:21
…ngine wiring

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ape only

Migration/drain is deferred, so residency is decided purely by id-shape
(ownerEngine): 25-char cuid -> LEGACY, 27-char ksuid -> NEW, unclassifiable
-> LEGACY. This is behavior-preserving in production, which never injected a
custom isKnownMigrated and, with no migration, always saw the default false.

- delete knownMigratedFilter.server.ts + its test
- readThrough: drop the isKnownMigrated dep + migrated short-circuit; KEEP the
  unclassifiable->LEGACY new-then-legacy fallback
- resolveInheritedMintKind: collapse to pure ownerEngine id-shape (no deps)
- mintBatchFriendlyId: drop isKnownMigrated/isSplitEnabled from ResolveDeps
- runEngineHandlersShared: drop isKnownMigrated from EventReadDeps/readRunForEvent
  (batch-write residency probe via newReplica.batchTaskRun.findFirst is untouched)
- tests: delete injected-marker cases, keep pure id-shape assertions

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
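
A minimal sketch of the id-shape rule in the commit message above (25-char cuid -> LEGACY, 27-char ksuid -> NEW, unclassifiable -> LEGACY). This is not the real ownerEngine helper, whose implementation is not shown in this thread. It assumes ids may carry a `prefix_` (as in `run_...`) and that only the body after the last underscore is measured; the function and type names are hypothetical.

```typescript
type RunOpsResidency = "LEGACY" | "NEW";

const CUID_LENGTH = 25; // legacy-shaped ids
const KSUID_LENGTH = 27; // run-ops-shaped ids

// Assumption: an id looks like "run_<body>"; only <body> is measured.
function idBody(id: string): string {
  const i = id.lastIndexOf("_");
  return i === -1 ? id : id.slice(i + 1);
}

// Returns the residency plus whether the id had a recognized shape,
// so callers can decide to warn or probe on the unclassifiable fallback.
function residencyFromIdShape(id: string): {
  residency: RunOpsResidency;
  classifiable: boolean;
} {
  const length = idBody(id).length;
  if (length === KSUID_LENGTH) return { residency: "NEW", classifiable: true };
  if (length === CUID_LENGTH) return { residency: "LEGACY", classifiable: true };
  // Neither shape: default to LEGACY, matching the commit's stated fallback.
  return { residency: "LEGACY", classifiable: false };
}
```

Returning `classifiable` alongside the residency keeps the fallback visible to callers such as a read-through path that logs a warning before defaulting to LEGACY.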
…eration labels

Add a pure unit test for ControlPlaneCache covering per-slot round-trips,
null-vs-miss distinction, epoch-based invalidation, per-slot key isolation,
bounded eviction, and TTL expiry. Add a testcontainer test for
probeDistinctDatabases covering distinct clusters, same physical database
(with reason), same-cluster-different-database, and fail-closed probe failure.

Strip developer-enumeration labels from three existing test files (readThrough
step numbers, runEngineHandlers Test-X comments) and rename the run-detail
loader read-through test to drop the non-domain "shape 1" name.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
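
To illustrate the properties the new cache test is said to cover (null-vs-miss distinction, epoch-based invalidation, per-slot key isolation, bounded eviction, TTL expiry), here is a small stand-in cache. It is a sketch only: the class name, method names, and eviction policy (oldest inserted first) are assumptions, not the actual ControlPlaneCache API.

```typescript
// A cached null ("known absent") is distinct from a miss ("not cached").
type CacheLookup<V> = { hit: true; value: V | null } | { hit: false };

type Entry<V> = { value: V | null; epoch: number; expiresAt: number };

class SketchControlPlaneCache<V> {
  private readonly entries = new Map<string, Entry<V>>();
  private epoch = 0;
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Slot-qualified key, so two slots never collide on the same id.
  private key(slot: string, id: string): string {
    return `${slot}:${id}`;
  }

  get(slot: string, id: string): CacheLookup<V> {
    const k = this.key(slot, id);
    const entry = this.entries.get(k);
    if (!entry) return { hit: false };
    // Entries from an older epoch or past their TTL count as misses.
    if (entry.epoch !== this.epoch || entry.expiresAt <= this.now()) {
      this.entries.delete(k);
      return { hit: false };
    }
    return { hit: true, value: entry.value };
  }

  set(slot: string, id: string, value: V | null): void {
    const k = this.key(slot, id);
    this.entries.delete(k); // re-insert so the key becomes the newest
    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(k, { value, epoch: this.epoch, expiresAt: this.now() + this.ttlMs });
  }

  // Bumping the epoch invalidates every entry without walking the map.
  invalidateAll(): void {
    this.epoch += 1;
  }
}
```

Injecting the clock through the constructor lets a pure unit test drive TTL expiry without real timers.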
… deps

apps/webapp/package.json declares @internal/run-ops-database (workspace) and
@testcontainers/postgresql but the lockfile importer entry was never regenerated,
so pnpm install --frozen-lockfile fails for the webapp. Regenerate the importer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Enabling RUN_OPS_SPLIT_ENABLED without REALTIME_BACKEND_NATIVE_ENABLED
silently breaks realtime: Electric replicates only from the control-plane
DB, so NEW-resident (ksuid) runs on the dedicated run-ops DB are invisible
and every realtime subscription hangs.

Add a boot-time interlock that refuses split mode in that misconfiguration,
mirroring the existing distinct-DB data-loss sentinel. The check is a pure
predicate (assertSplitRealtimeInterlock) run synchronously inside
assertRunOpsSplitSentinel on the same eager-boot path, failing fast before
the async DB probe and before any run-ops routing is wired.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
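
The interlock described above can be sketched as a pure predicate. The function name and the two flag names come from the commit message; the input shape and error text are assumptions, since the real signature is not shown in this thread.

```typescript
// Hypothetical input shape: already-parsed boolean flags.
type SplitRealtimeFlags = {
  runOpsSplitEnabled: boolean; // RUN_OPS_SPLIT_ENABLED
  realtimeBackendNativeEnabled: boolean; // REALTIME_BACKEND_NATIVE_ENABLED
};

// Pure and synchronous, so it can run before any async DB probe at boot.
function assertSplitRealtimeInterlock(flags: SplitRealtimeFlags): void {
  if (flags.runOpsSplitEnabled && !flags.realtimeBackendNativeEnabled) {
    throw new Error(
      "RUN_OPS_SPLIT_ENABLED requires REALTIME_BACKEND_NATIVE_ENABLED: " +
        "runs on the dedicated run-ops DB would be invisible to realtime subscriptions."
    );
  }
}
```

Only the one incompatible combination throws; split-off configurations pass regardless of the realtime backend.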
…n diagnostics

- gate runOpsTopology splitEnabled on RUN_OPS_SPLIT_ENABLED so provisioning
  both DSNs before flipping the flag cannot open a second pool or route writes
  ahead of the distinct-DB sentinel
- rethrow the original UnclassifiableRunId in the cross-seam guard so its
  value/valueLength keep reflecting the real waitpoint id
- log run-found-but-environment-unresolved distinctly from missing-run
- correct the RUN_OPS_DATABASE_URL doc comment (Prisma datasource, not the
  webapp runtime pool)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
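
A rough sketch of the first fix in the commit above, gating splitEnabled on the flag rather than on the mere presence of both DSNs. The type and field names are hypothetical (the real runOpsTopology shape and the environment variables carrying the DSNs are not shown in this thread), and treating "flag on but no run-ops DSN" as split-off is a choice of this sketch, not something the source states.

```typescript
// Hypothetical input: a parsed RUN_OPS_SPLIT_ENABLED flag plus the two DSNs.
type TopologyInput = {
  splitFlagEnabled: boolean;
  controlPlaneDsn: string;
  runOpsDsn?: string;
};

type RunOpsTopology =
  | { splitEnabled: false; controlPlaneDsn: string }
  | { splitEnabled: true; controlPlaneDsn: string; runOpsDsn: string };

function resolveRunOpsTopology(input: TopologyInput): RunOpsTopology {
  const { splitFlagEnabled, controlPlaneDsn, runOpsDsn } = input;
  // The flag gates everything: with it off, a provisioned run-ops DSN is
  // ignored, so no second pool is opened and no writes are routed to it.
  if (!splitFlagEnabled || !runOpsDsn) {
    return { splitEnabled: false, controlPlaneDsn };
  }
  return { splitEnabled: true, controlPlaneDsn, runOpsDsn };
}
```

Modeling the result as a discriminated union means code that wants the run-ops DSN must first check `splitEnabled`, which keeps the single-DB path from touching it by accident.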
d-cs force-pushed the runops/pr05-webapp-foundation branch from cdc4eb9 to e0b35d5 on July 2, 2026 20:21