feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle by d-cs · Pull Request #4118 · triggerdotdev/trigger.dev

d-cs · 2026-07-02T17:39:05Z

What

Routes the webapp write path through the run-ops split seam: trigger/batch minting, idempotency-key resolution, and the run-lifecycle services now determine residency and dispatch writes to the correct store.

Trigger & batch (runEngine/services/triggerTask.server.ts, batchTrigger.server.ts, createBatch.server.ts, streamBatchItems.server.ts, v3/services/batchTriggerV3.server.ts): mint ids with the run-ops-aware minting and route creation/streaming through the store; batch children inherit the parent's residency.
Idempotency (runEngine/concerns/idempotencyKeys.server.ts + new idempotencyResidency.server.ts): idempotency-key lookup/dedup is residency-aware so a keyed retrigger resolves against the store that owns the original run.
Run lifecycle services (createCheckpoint, createTaskRunAttempt, enqueueDelayedRun, expireEnqueuedRun, finalizeTaskRun, resumeBatchRun, cancelDevSessionRuns, executeTasksWaitingForDeploy, triggerFailedTask): resolve their target run through the store rather than a fixed client.
Reads that fan out from writes (runsRepository + clickhouseRunsRepository, BulkActionV2 + batch read-through, realtime sessions/runReader, alerts deliverAlert/performTaskRunAlerts): route through the read-through resolver.
9535ae63d — resolves the parent run through an injectable run store in TriggerFailedTaskService.
bf8f7c881 — drops the "known-migrated" concept from write-path and read repos; residency is id-shape only.
515b897ea — self-defaults resolveWaitpointThroughReadThrough to the safe run-ops clients.

Why

PR6 of the run-ops split stack. This is the write-path counterpart to the read foundation in the previous PRs: with it in place, both reads and writes route through the seam. Additive when the split is disabled (id-shape resolution collapses to the control-plane client); behavior-changing on the minting, idempotency, and lifecycle paths when enabled.

Tests

Large new/expanded vitest suite under apps/webapp/test/ and colocated service tests: trigger-task and batch-trigger store routing, residency inheritance, idempotency dedup residency + legacy-authority, bulk-action read routing, cancel-dev-session routing, alerts store routing, runs-repository read-through, realtime session/run-reader read-through and stream-registration routing, and the waitpoint read-through default. Testcontainers-backed; no mocks.

Notes

Draft, stacked on #4117 (runops/pr05-webapp-foundation). Review that first; this diff is against it.

Server-change / changeset note to be added at stack-assembly time.

🤖 Generated with Claude Code

changeset-bot · 2026-07-02T17:39:09Z

⚠️ No Changeset found

Latest commit: aa55b6b

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-07-02T17:39:15Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 20c6d1a3-4024-44ce-843c-9b22910baac1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

Walkthrough

This PR migrates numerous webapp services from direct Prisma queries to a RunStore/controlPlaneResolver abstraction, enabling split reads/writes between legacy and new ("run-ops") Postgres databases. It introduces residency-aware idempotency dedup client resolution, waitpoint read-through resolution, and mint-kind-based friendly ID generation (ksuid vs cuid) for run, batch, and failed-task creation that preserves parent/child residency lineage. Trigger, batch trigger, alert, session serialization, run listing, and cancellation services are updated to resolve environments and locked workers via controlPlaneResolver instead of eager Prisma includes, and to read/write batch and run records through RunStore methods rather than direct Prisma calls. Extensive new integration tests validate store routing behavior across heterogeneous Postgres containers (legacy vs new), residency inheritance, and passthrough single-database scenarios.

Changes

Area	Change
Idempotency	New `resolveIdempotencyDedupClient` resolves dedup client by split mode/mint kind/parent residency; wired into `IdempotencyKeyConcern`
Waitpoints	New `resolveWaitpointThroughReadThrough` helper for read-through waitpoint lookups
Trigger services	`RunEngineTriggerTaskService` and `TriggerFailedTaskService` mint friendly IDs (ksuid/cuid) based on residency inheritance; `RunStore` injectable
Batch services	`RunEngineBatchTriggerService`, `CreateBatchService`, `StreamBatchItemsService`, `BatchTriggerV3Service` migrated to `runStore`/`controlPlaneResolver` for batch/item persistence and friendly-id minting
Lifecycle services	`createTaskRunAttempt`, `enqueueDelayedRun`, `expireEnqueuedRun`, `executeTasksWaitingForDeploy`, `finalizeTaskRun`, `deliverAlert`, `performTaskRunAlerts`, `resumeBatchRun`, `createCheckpoint`, `cancelDevSessionRuns` resolve environment/lock via `controlPlaneResolver`/`runStore`
Reads/repositories	`runsRepository`, `clickhouseRunsRepository`, `sessions.server`, `dashboardAgent`, `runReader`, `BulkActionV2` use runStore-based hydration/read-through seams
Project deletion	`deleteProject.server.ts` invokes `RunOpsCascadeCleanupService.cleanupProject`
Tests	Extensive new unit and heterogeneous-database integration tests validating store routing, residency inheritance, and passthrough behavior

Related PRs: None identified.

Suggested labels: run-engine, database, testing

Suggested reviewers: matt-aitken, ericallam

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description is useful but does not follow the required template and is missing Closes, checklist, Changelog, and Screenshots sections.	Add the required template sections (Closes #..., checklist, Changelog, Screenshots) and keep the current What/Why/Tests summary.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title is concise and matches the main write-path routing changes in the PR.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch runops/pr06-write-path

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

devin-ai-integration

Devin Review found 1 potential issue.

pkg-pr-new · 2026-07-02T18:06:10Z

Open in StackBlitz

@trigger.dev/build

npm i https://pkg.pr.new/@trigger.dev/build@a8068e9

trigger.dev

npm i https://pkg.pr.new/trigger.dev@a8068e9

@trigger.dev/core

npm i https://pkg.pr.new/@trigger.dev/core@a8068e9

@trigger.dev/python

npm i https://pkg.pr.new/@trigger.dev/python@a8068e9

@trigger.dev/react-hooks

npm i https://pkg.pr.new/@trigger.dev/react-hooks@a8068e9

@trigger.dev/redis-worker

npm i https://pkg.pr.new/@trigger.dev/redis-worker@a8068e9

@trigger.dev/rsc

npm i https://pkg.pr.new/@trigger.dev/rsc@a8068e9

@trigger.dev/schema-to-json

npm i https://pkg.pr.new/@trigger.dev/schema-to-json@a8068e9

@trigger.dev/sdk

npm i https://pkg.pr.new/@trigger.dev/sdk@a8068e9

commit: a8068e9

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts (1)
245-294: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

blockRunWithWaitpoint still writes via this.prisma, not the resolved dedupClient.

dedupClient (computed above at line 155-174) is derived using parentRunFriendlyId: request.body.options?.parentRunId — the exact same parentRunId used here to block the parent run's waitpoint (line 246, 279). dedupClient is precisely the client that owns this parent run's residency, yet the transaction at line 290 still passes tx: this.prisma, the (possibly wrong) fallback client.

If split mode is enabled and the parent run resides on the "new" store while this.prisma targets the legacy store (or vice versa), this write would target the wrong database, failing to find the parent run row or silently writing state to a store that doesn't own it — contradicting the PR's core objective of routing writes to the store that owns the target run.
🐛 Proposed fix
             await this.engine.blockRunWithWaitpoint({
               runId: RunId.fromFriendlyId(parentRunId),
               waitpoints: associatedWaitpoint!.id,
               spanIdToComplete: spanId,
               batch: request.options?.batchId
                 ? {
                     id: request.options.batchId,
                     index: request.options.batchIndex ?? 0,
                   }
                 : undefined,
               projectId: request.environment.projectId,
               organizationId: request.environment.organizationId,
-              tx: this.prisma,
+              tx: dedupClient,
             });

🧹 Nitpick comments (8)

apps/webapp/app/runEngine/concerns/resolveWaitpointThroughReadThrough.server.ts (1)

44-49: 🚀 Performance & Scalability | 🔵 Trivial | 💤 Low value

Consider forwarding logger/onLegacyReplicaRead for parity with other read-through consumers.

ReadThroughDeps supports logger and onLegacyReplicaRead (saturation-signal hook), but ResolveWaitpointDeps/this wrapper drop both, so legacy-replica reads for waitpoints won't emit the saturation signal that other read-through call sites presumably rely on for monitoring split-read health.
apps/webapp/app/runEngine/services/batchTrigger.server.ts (2)
92-99: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Discards the minted id and re-derives it via BatchId.fromFriendlyId.

mintBatchFriendlyId returns { id, friendlyId }, but only friendlyId is kept here; id is recomputed later via BatchId.fromFriendlyId(batchId) (lines 175, 266). createBatch.server.ts uses the returned id directly instead of re-deriving it. Functionally likely equivalent if BatchId.fromFriendlyId is a lossless decode, but it's redundant work and an inconsistency between two services from the same PR doing the same job.
♻️ Proposed consistency fix
-          const { friendlyId } = await mintBatchFriendlyId({
+          const { id, friendlyId } = await mintBatchFriendlyId({
             environment: {
               organizationId: environment.organizationId,
               id: environment.id,
               orgFeatureFlags: environment.organization.featureFlags,
             },
             parentRunFriendlyId: body.parentRunId,
           });
Then thread id through to #createAndProcessBatchTaskRun and use it directly instead of BatchId.fromFriendlyId(batchId).
Please confirm BatchId.fromFriendlyId reliably reconstructs the same id for both ksuid- and cuid-shaped friendly ids before treating this purely as a style nit.

Also applies to: 169-184, 265-275

359-374: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Missing batch is silently ignored — no log emitted.

When findBatchTaskRunById returns nothing, the function returns silently, unlike the environment miss two lines below which logs an error. Given store-routing bugs could make a batch invisible from the wrong store, this failure mode deserves the same observability.
🔍 Proposed fix
     const batch = await this._engine.runStore.findBatchTaskRunById(options.batchId);

     if (!batch) {
+      logger.error("[RunEngineBatchTrigger][processBatchTaskRun] Batch not found", {
+        options,
+      });
       return;
     }
apps/webapp/test/engine/streamBatchItems.test.ts (1)

655-662: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Repeated PostgresRunStore wiring across 7 test cases.

The same 4-line block assigning engine.runStore = new PostgresRunStore({ prisma: racingPrisma, readOnlyPrisma: racingPrisma }) appears 7 times. A small helper (e.g. attachRacingRunStore(engine, racingPrisma)) would reduce duplication and centralize any future changes to how the racing store is wired.

Also applies to: 787-794, 919-926, 1052-1059, 1272-1279, 1411-1418, 1600-1604
apps/webapp/app/v3/services/createCheckpoint.server.ts (1)
149-154: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Batch lookup correctly routed through RunStore.

Both WAIT_FOR_BATCH lookups now call runStore.findBatchTaskRunByFriendlyId(friendlyId, environmentId), matching the upstream contract that probes NEW then LEGACY sub-stores scoped by (friendlyId, environmentId). This correctly fixes the pre-existing gap where a raw single-DB Prisma query would miss a NEW-resident (ksuid) batch.

The identical 5-line lookup + comment block is duplicated at both call sites (149-154 and 364-369). Extracting a small private helper would remove the duplication and centralize any future changes to the routing logic.
♻️ Proposed extraction
+  // Routed by friendlyId so a ksuid (NEW-resident) batch is found on the owning DB;
+  // env-scoped to the dependent attempt's run (a batch shares its dependent's env).
+  private async findWaitForBatchRun(batchFriendlyId: string, environmentId: string) {
+    return this.runStore.findBatchTaskRunByFriendlyId(batchFriendlyId, environmentId);
+  }
Then replace both call sites with:
-        // Routed by friendlyId so a ksuid (NEW-resident) batch is found on the owning DB;
-        // env-scoped to the dependent attempt's run (a batch shares its dependent's env).
-        const batchRun = await this.runStore.findBatchTaskRunByFriendlyId(
-          reason.batchFriendlyId,
-          attempt.taskRun.runtimeEnvironmentId
-        );
+        const batchRun = await this.findWaitForBatchRun(
+          reason.batchFriendlyId,
+          attempt.taskRun.runtimeEnvironmentId
+        );
Also applies to: 364-369
apps/webapp/app/v3/services/executeTasksWaitingForDeploy.ts (1)

74-111: 🩺 Stability & Availability | 🔵 Trivial

Solid defense-in-depth split; consider a quarantine path for stuck NEW-resident runs.

The NEW/legacy split correctly prevents a control-plane updateMany/enqueue from touching runs it can't actually own. One residual risk: if a NEW-resident run keeps getting selected by findRuns (e.g. from a real misconfiguration), it will never transition out of WAITING_FOR_DEPLOY, so this job will re-log the same error and potentially keep rescheduling itself (via the runsWaitingForDeploy.length > maxCount reschedule) on every poll, indefinitely.

Consider adding a metric/alert-worthy signal or a way to skip re-selecting known-stuck NEW-resident runs.
apps/webapp/test/engine/triggerTask.test.ts (1)
2393-2402: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Redundant dynamic import() — generateKsuidId is already statically imported.

generateKsuidId is imported statically at the top of the file (line 33) and used directly a few lines later (line 2432: generateKsuidId()). The dynamic await import("@trigger.dev/core/v3/isomorphic") here is unnecessary and inconsistent with the static-import usage elsewhere in the same file.
♻️ Proposed fix
       const parentFriendlyId = RunId.toFriendlyId(
-        // 27-char ksuid → classifies NEW
-        (await import("`@trigger.dev/core/v3/isomorphic`")).generateKsuidId()
+        // 27-char ksuid → classifies NEW
+        generateKsuidId()
       );
As per coding guidelines: "Prefer static imports over dynamic import(), and only use dynamic imports for unresolved circular dependencies, genuine code-splitting needs, or conditional runtime loading."

Source: Coding guidelines
apps/webapp/test/idempotencyDedupResidency.test.ts (1)

45-103: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Duplicate seeding helpers across split-seam test files.

seedOrgProjectEnv/seedRun here are re-implemented nearly verbatim in apps/webapp/test/idempotencyKeyConcernLegacyAuthority.test.ts and apps/webapp/test/resetIdempotencyKeyLegacyAuthority.test.ts. Consider extracting a shared heteroPostgresTest fixture-seeding helper module as this residency test suite keeps growing.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d490b21c-676d-49ba-83a2-67771105b181

📥 Commits

Reviewing files that changed from the base of the PR and between 413a945 and 515b897.

📒 Files selected for processing (50)

apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts
apps/webapp/app/runEngine/concerns/idempotencyResidency.server.test.ts
apps/webapp/app/runEngine/concerns/idempotencyResidency.server.ts
apps/webapp/app/runEngine/concerns/resolveWaitpointThroughReadThrough.server.ts
apps/webapp/app/runEngine/services/batchTrigger.server.ts
apps/webapp/app/runEngine/services/createBatch.server.ts
apps/webapp/app/runEngine/services/streamBatchItems.server.ts
apps/webapp/app/runEngine/services/triggerFailedTask.server.ts
apps/webapp/app/runEngine/services/triggerTask.server.test.ts
apps/webapp/app/runEngine/services/triggerTask.server.ts
apps/webapp/app/services/archiveBranch.server.ts
apps/webapp/app/services/dashboardAgent.server.ts
apps/webapp/app/services/deleteProject.server.ts
apps/webapp/app/services/realtime/runReader.server.ts
apps/webapp/app/services/realtime/sessions.server.ts
apps/webapp/app/services/runsRepository/clickhouseRunsRepository.server.ts
apps/webapp/app/services/runsRepository/runsRepository.server.ts
apps/webapp/app/v3/services/alerts/deliverAlert.server.ts
apps/webapp/app/v3/services/alerts/performTaskRunAlerts.server.ts
apps/webapp/app/v3/services/batchTriggerV3.server.ts
apps/webapp/app/v3/services/bulk/BulkActionV2.batchReadThrough.server.test.ts
apps/webapp/app/v3/services/bulk/BulkActionV2.batchReadThrough.server.ts
apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
apps/webapp/app/v3/services/cancelDevSessionRuns.server.ts
apps/webapp/app/v3/services/createCheckpoint.server.ts
apps/webapp/app/v3/services/createTaskRunAttempt.server.ts
apps/webapp/app/v3/services/enqueueDelayedRun.server.ts
apps/webapp/app/v3/services/executeTasksWaitingForDeploy.ts
apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
apps/webapp/app/v3/services/finalizeTaskRun.server.ts
apps/webapp/app/v3/services/resumeBatchRun.server.ts
apps/webapp/test/batchTriggerV3ResidencyInheritance.test.ts
apps/webapp/test/batchTriggerV3StoreRouting.test.ts
apps/webapp/test/bulkActionV2ReadRouting.test.ts
apps/webapp/test/cancelDevSessionRunsStoreRouting.test.ts
apps/webapp/test/engine/streamBatchItems.test.ts
apps/webapp/test/engine/triggerFailedTask.test.ts
apps/webapp/test/engine/triggerTask.test.ts
apps/webapp/test/idempotencyDedupResidency.test.ts
apps/webapp/test/idempotencyKeyConcernLegacyAuthority.test.ts
apps/webapp/test/performTaskRunAlertsStoreRouting.test.ts
apps/webapp/test/realtime/runReaderReadThrough.test.ts
apps/webapp/test/realtime/streamRegistrationRouting.test.ts
apps/webapp/test/resetIdempotencyKeyLegacyAuthority.test.ts
apps/webapp/test/resolveWaitpointThroughReadThrough.readthrough.test.ts
apps/webapp/test/runEngineBatchTriggerStoreRouting.test.ts
apps/webapp/test/runsRepository.readthrough.test.ts
apps/webapp/test/runsRepositoryCpres.test.ts
apps/webapp/test/sessions.readthrough.test.ts
apps/webapp/test/streamLoader.controlPlane.test.ts

d-cs · 2026-07-02T19:10:40Z

Addressed the outside-diff note on idempotencyKeys.server.ts (blockRunWithWaitpoint writing via this.prisma): the block now passes the residency-resolved dedupClient as tx, so the idempotent parent run's waitpoint write lands on the store that owns that parent run rather than the fallback client.

… routing, run lifecycle Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…e in TriggerFailedTaskService TriggerFailedTaskService read the parent run via the ambient module-singleton store while the engine wrote the run through its own store, so a ksuid parent's row was not found and parentTaskRunId came back null. Add an optional injected runStore (defaults to the shared singleton, preserving production behaviour) and resolve the parent through it at both call sites, mirroring triggerTask.server.ts. Align the three affected webapp tests to read through the same store the engine wrote to: triggerFailedTask.test.ts passes engine.runStore; performTaskRunAlerts routing passes a passthrough store over the seeded container; triggerTask.test.ts stubs the run-ops db handles and pins split mode off so the idempotency dedup uses the container client. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…id-shape only Migration is deferred, so child/batch residency is a pure id-shape check. Remove the isKnownMigrated (and mint-only isSplitEnabled) deps from the mint sites (triggerTask, triggerFailedTask, batchTriggerV3) and call the now- synchronous resolveInheritedMintKind(parentFriendlyId) with no deps arg. Read paths: drop the isKnownMigrated re-probe-avoidance from the ClickHouse runs hydrate (probe all missing on legacy), the runsRepository readThrough options type, resolveWaitpointThroughReadThrough deps, and the BulkActionV2 batch seam adapter — keeping the genuine cross-seam fallback that reads NEW first for unclassifiable/legacy-candidate ids. Delete the injected-marker test cases; the remaining residency tests assert pure id-shape inheritance. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…s and test names Review hygiene only: remove the NEW-1 label, Test X: name prefixes, and [TEST-NEWSEED] comment label. No product logic or test behavior changed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…o safe run-ops clients The read-through concern defaulted both newClient and legacyReplica to $replica (control-plane), so a bare caller that omits `deps` — the waitpoints wait route — never queried the dedicated run-ops replica. A co-located, NEW-resident waitpoint minted by streams.input().wait() lives on the run-ops-new DB, so the read missed, returned null, and the route 404'd (re-serialized to 500). Match the deps the complete/callback routes pass: default newClient to runOpsNewReplica, legacyReplica to $replica, and splitEnabled to runOpsSplitReadEnabled — mirroring readThroughRun's own self-defaulting. This immunizes any bare caller (present or future) against the control-plane pin, without touching the wait route. The wait/complete/callback call sites live on a higher branch and are unchanged; complete/callback keep their explicit deps (now redundant but harmless). Adds a heteroRunOps regression case driving the concern with no `deps` via the `defaults` DI seam: proves the old $replica default misses a NEW-resident waitpoint (null) while the safe run-ops default finds it. No mocks; the fallback is exercised against real PG14/PG17 containers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…rvice prisma to the resolved store - Block the idempotent parent run's waitpoint via the residency-resolved dedup client instead of the fallback prisma, so the write lands on the store that owns the parent run. - Pass the caller-provided _prisma into WithRunEngine so a custom store isn't silently overridden by the module singleton. - Throw when a run-backed alert's environment can't be resolved instead of marking it SENT, so a transient replica miss doesn't permanently suppress the alert. - Pin splitEnabled:false in the waitpoint passthrough test so it exercises single-DB behaviour rather than relying on ksuid residency. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The write-path split added static `runOpsLegacyPrisma`/`runOpsNewPrisma` imports to idempotencyKeys.server.ts, which this test loads. vitest validates every named import against the `~/db.server` mock, so the mock now errored on the missing run-ops singletons. Add the four run-ops exports (empty stubs, same boundary pattern as the batchTriggerV3 residency test) and pin isSplitEnabled() to false so the dedup routing deterministically returns the injected fake prisma regardless of the ambient RUN_OPS_SPLIT_ENABLED. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…setup Worker/engine/marqs/pubsub/socket singletons each construct an ioredis client at import time (singleton() + no lazyConnect), so any test importing the service graph opened real Redis connections on import. In CI there is no Redis, so these accumulate infinite-retry clients across a shard and take the suite down (locally they pass only because dev Redis is up). Globally mock the eager-Redis modules to no-op stubs in test/setup.ts: commonWorker, batchTriggerWorker, legacyRunEngineWorker, alertsWorker, the RunEngine and MarQS singletons, devPubSub and the socket.io server. Only these singletons are mocked — never the run store (~/v3/runStore.server, ~/db.server), which store-routing/residency tests need real against testcontainer Postgres. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…yConnect + stub runtime Redis singletons The setup-file mocks of the six eager worker/engine singletons were not enough: CI shards still flooded ECONNREFUSED/maxRetries. Two further classes of env-Redis usage survived them, reproduced locally by running the failing shards with REDIS_PORT pointed at a dead port: 1. Import-time construction: ~15 more singletons (platform cache, billing-limit reconcile queue, alerts rate limiter, DevPresence, auto-increment counter, s2 token cache, v1 streams cache, ...) build ioredis clients at module import, and ioredis dials on construction. A global ioredis mock now forces lazyConnect: true so clients only dial on first command — testcontainer-backed tests are unaffected (their first command connects as before). 2. Runtime commands inside code under test: tracePubSub.publish() (eventRepository writes), alertsRateLimiter.check() (deliverAlert) and the task metadata cache each issue commands against env-configured Redis mid-test; every command burns ~20 reconnect cycles before its error surfaces, which times the tests out. These three modules are now stubbed (metadata cache pinned to its Noop implementation, which is what CI's unset env resolves to anyway). Verified: webapp shards 2/5/6/8 (the ones failing on the pr06+ stack) run green with Redis pointed at a dead port, and shards 2/8 stay green against live Redis (store-routing suites still exercise the real run store). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…in CI CI runners have no .env, no REDIS_HOST/REDIS_PORT, and no Postgres at localhost:5432, which surfaced two failure layers that local runs mask (the dev stack answers on both): - suites transitively importing triggerTaskV1.server failed to collect because autoIncrementCounter.server.ts throws at import when REDIS_HOST/REDIS_PORT are unset (shards 2/5/6). Default the pair in test/setup.ts — the global ioredis lazyConnect mock means nothing dials. - TriggerFailedTaskService.call() resolved its event repository via getEventRepository → global prisma (feature-flag read + Prisma event repo), so in CI the swallowed connect error returned null friendlyIds (shard 8). Allow injecting the repository/store pair and bind the test to an EventRepository over the testcontainer DB. - once the cancelDevSessionRuns suite could collect, findLatestSession's hardwired global $replica was the next masked layer; give it an injectable client (defaulting to $replica) and pass the service's _replica through. Verified by replaying the exact CI env locally (.env hidden, workflow env vars, dead localhost DB, GITHUB_ACTIONS set): all four failing suites and full shards 2/5/6/8 reproduce the CI failures before and pass after. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

d-cs mentioned this pull request Jul 2, 2026

feat(run-ops): ClickHouse multi-source replication fan-in + admin ops #4119

Draft

d-cs self-assigned this Jul 2, 2026

devin-ai-integration Bot reviewed Jul 2, 2026

View reviewed changes

Comment thread apps/webapp/app/runEngine/concerns/idempotencyKeys.server.ts

d-cs force-pushed the runops/pr05-webapp-foundation branch from 413a945 to 99643f8 Compare July 2, 2026 18:02

d-cs force-pushed the runops/pr06-write-path branch from 515b897 to cb97148 Compare July 2, 2026 18:02

coderabbitai Bot reviewed Jul 2, 2026

View reviewed changes

d-cs force-pushed the runops/pr05-webapp-foundation branch from 26871d5 to cdc4eb9 Compare July 2, 2026 19:25

d-cs force-pushed the runops/pr06-write-path branch from c59d9c5 to d5d7fa1 Compare July 2, 2026 19:25

d-cs force-pushed the runops/pr05-webapp-foundation branch from cdc4eb9 to e0b35d5 Compare July 2, 2026 20:21

d-cs force-pushed the runops/pr06-write-path branch 2 times, most recently from a8068e9 to 0ef3a6b Compare July 2, 2026 20:38

d-cs and others added 9 commits July 2, 2026 22:43

feat(run-ops): webapp write path — trigger/batch minting, idempotency…

87f29a2

… routing, run lifecycle Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

style(run-ops): apply oxfmt

9510f47

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

d-cs force-pushed the runops/pr06-write-path branch from 0db90f0 to d5415e8 Compare July 2, 2026 21:44

d-cs and others added 2 commits July 2, 2026 23:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle#4118

feat(run-ops): webapp write path — trigger/batch minting, idempotency routing, run lifecycle#4118
d-cs wants to merge 11 commits into
runops/pr05-webapp-foundationfrom
runops/pr06-write-path

d-cs commented Jul 2, 2026

Uh oh!

changeset-bot Bot commented Jul 2, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jul 2, 2026 •

edited

Loading

Review skipped

❌ Failed checks (1 warning)

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

Uh oh!

pkg-pr-new Bot commented Jul 2, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

d-cs commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Uh oh!

Conversation

d-cs commented Jul 2, 2026

What

Why

Tests

Notes

Uh oh!

changeset-bot Bot commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai Bot commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

pkg-pr-new Bot commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

d-cs commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

changeset-bot Bot commented Jul 2, 2026 •

edited

Loading

coderabbitai Bot commented Jul 2, 2026 •

edited

Loading

pkg-pr-new Bot commented Jul 2, 2026 •

edited

Loading