The last 10% of preparation is where you discover the preparation never ends — it just changes shape.


Where We Left Off

The previous post covered the grind of building agent-ready documentation: 30+ template files, the behavioral specs that required actual product decisions, the quality gate stack, and the shell environment problems that kept consuming time between sessions. We finished at 8 of 10 gaps done, with two remaining: acceptance criteria per component (Gap #9) and Docker/CI/deployment templates (Gap #10).

This post covers closing Gap #10, discovering a Gap #11 we hadn’t anticipated, and reaching the point where generating code is next on the agenda.


Gap #10: Deployment and Testing Infrastructure

Gap #10 was about making sure agents know how to package, deploy, and test a RepCheck component — not just how to write the Scala code inside it.

We added seven new template files:

Annotated guides:

  • deployment-architecture.md — Four deployment archetypes (library, config, pipeline, API service), Workload Identity Federation setup, dev → staging → prod promotion pipeline, Artifact Registry image naming
  • testing-infrastructure.md — The full testing strategy: Docker Compose for local emulators, ephemeral namespace isolation per test run, MockitoScala + WireMock + Testcontainers mocking stack, E2E test tagging, auto-bug filing on CI failure

Skeleton files:

  • dockerfile-pipeline.txt — Multi-stage Dockerfile (Temurin JDK build → Google Distroless Java 21 runtime)
  • docker-compose-local-dev.yml — Local dev stack with Postgres, Firestore emulator, Pub/Sub emulator
  • cloud-run-job.yaml — Cloud Run Job definition with resource limits and retry config
  • github-actions-deploy.yml — Deployment workflow with three conditional paths (library publish, pipeline deploy, config sync)
  • github-actions-bug-on-failure.yml — Auto-bug filing: creates a GitHub Issue on CI test failure, closes it when tests pass

We also extended the doc compressor to handle .txt, .yml, and .yaml files — previously it only processed .md and .scala. And we updated CLAUDE.md with the full set of universal rules around deployment and testing decisions that agents should follow without asking:

  • Google Distroless Java 21 as the container base image — always
  • Workload Identity Federation for GCP auth in CI — never JSON keys
  • Ephemeral namespace prefix for every test run — cleanup in afterAll()
  • MockitoScala + WireMock + Testcontainers as the mocking stack — no exceptions
  • Dev → Staging → Prod with a manual gate before prod
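To make the ephemeral-namespace rule concrete, here is a minimal sketch of the kind of helper a test suite could use. The object and method names are ours, not RepCheck’s actual code; the real suites pair this with cleanup in afterAll():

```scala
// Sketch: a unique namespace prefix per test run, so parallel CI runs never
// collide on shared emulator state. Anything created under the prefix gets
// deleted in afterAll(). Names here are illustrative, not RepCheck's code.
import java.util.UUID

object EphemeralNamespace {
  def fresh(component: String): String =
    s"test-$component-${UUID.randomUUID().toString.take(8)}"
}
```

A suite would create every collection, topic, and table under something like fresh("bill-ingestion") and delete everything carrying that prefix in afterAll(), so a crashed run leaves identifiable debris rather than corrupted shared state.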

With this, Gap #10 is closed. Score: 9 of 10 done.


Gap #11: The Problem We Hadn’t Seen Coming

Here’s the thing about building agent-ready documentation for one repository: it works great for that repository. But RepCheck isn’t one repository. It’s eleven.

When we mapped out the remaining work, we realized every new repository — repcheck-data-ingestion, repcheck-llm-analysis, repcheck-scoring-engine, and the rest — would need the same scaffolding we’d just spent weeks building:

  • The same build.sbt with WartRemover, tpolecat, Scalafix, Scalafmt, and scoverage
  • The same CLAUDE.md routing table and universal coding rules
  • The same codecov.yml with 90% patch coverage enforcement
  • The same CI pipeline
  • The same documentation set
  • The same doc compressor

Setting that up manually for each repository would take hours per repo and immediately create drift — by the time we set up the fifth repository, the rules in the first one would have evolved and the copies wouldn’t match.

The solution: repcheck-g8 — a Giter8 template that generates a correctly-configured RepCheck repository from a single sbt new command.


What the Giter8 Template Does

Giter8 is the standard Scala templating tool. You run sbt new owner/template.g8 and it generates a project from the template, substituting parameters throughout.
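Conceptually, that substitution step is simple; here is a toy Scala model of the core idea (real Giter8 uses StringTemplate, which also supports formatting directives and conditionals — this sketch shows only the token replacement):

```scala
// Toy model of Giter8's parameter substitution: replace $key$ tokens with
// their values across a template file. Real Giter8 does considerably more
// (formats, conditionals, file-name substitution); this is just the core idea.
def substitute(template: String, params: Map[String, String]): String =
  params.foldLeft(template) { case (acc, (key, value)) =>
    acc.replace(s"$$$key$$", value)
  }
```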

The repcheck-g8 template is parameterized by:

archetype     = library | pipeline | api-service
has_firestore = true | false
has_cloudsql  = true | false
has_pubsub    = true | false
has_gcs       = true | false
has_llm       = true | false

A generated repository contains:

  • The full SBT build with all five plugins (WartRemover, tpolecat, Scalafix, Scalafmt, scoverage)
  • Dependencies.scala with all dependency groups defined — Doobie, Pub/Sub, GCS, Firebase Admin, Anthropic SDK — build.sbt conditionally pulls in only what the repo needs
  • CLAUDE.md with the complete routing table and all 20 universal rules
  • codecov.yml with 90% patch coverage on PRs
  • All 37 documentation files — architecture specs, annotated examples, skeleton templates
  • The doc compressor, copied as a self-contained SBT subproject
  • GitHub Actions CI with coverage reporting
  • A Dockerfile (multi-stage: Temurin JDK build → Distroless runtime)
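The conditional dependency wiring described above can be sketched like this. Coordinates are modeled as plain strings so the pattern is runnable outside the sbt DSL, and all names and flags are illustrative, not the template’s actual identifiers:

```scala
// Sketch of the conditional-dependency pattern in a generated build.sbt.
// Giter8 substitutes the has_* flags; the build then pulls in only the
// dependency groups the repo needs. Coordinates are illustrative strings,
// not the real sbt "org" % "artifact" % "version" DSL.
val hasFirestore = true  // from has_firestore
val hasPubsub    = true  // from has_pubsub
val hasLlm       = false // from has_llm

val firestoreDeps = Seq("com.google.firebase:firebase-admin")
val pubsubDeps    = Seq("com.google.cloud:google-cloud-pubsub")
val llmDeps       = Seq("anthropic-sdk (illustrative)")

val libraryDeps =
  (if (hasFirestore) firestoreDeps else Nil) ++
  (if (hasPubsub) pubsubDeps else Nil) ++
  (if (hasLlm) llmDeps else Nil)
```

The design choice here is that Dependencies.scala always defines every group, and only the inclusion is conditional — so adding a capability to an existing repo later is a one-flag change, not a dependency hunt.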

To generate repcheck-data-ingestion:

sbt new Eligio-Taveras/repcheck-g8.g8 \
  --name=repcheck-data-ingestion \
  --archetype=pipeline \
  --has_firestore=true \
  --has_pubsub=true \
  --description="Congress.gov data ingestion pipelines"

That command produces a repository that an agent can immediately start implementing. The build works. The CI is configured. The quality gates are enforced. The documentation is there.


The Template Portability Question, Revisited

In the last post, we raised the question: can this documentation structure transfer to other repositories and other projects? We designed for portability but acknowledged we wouldn’t know until we tried.

The Giter8 template is our first attempt to answer that question structurally rather than rhetorically. Each generated repository is independent — it has its own copy of the docs, its own doc compressor, its own CLAUDE.md. If the universal rules evolve in one repository, the template can be updated and new repositories get the updated version. Existing repositories drift, but at least the drift is visible (the template is the reference).

The honest assessment: we still don’t know if the patterns generalize across projects. We know they’re consistent within RepCheck. The real test is when we try to apply the same approach to a completely different domain — different tech stack, different business logic, different team. That test hasn’t happened yet.


The Shell Environment Problem Persists

The shell debugging from the last post hasn’t been fully resolved. Claude Code’s bash environment still needs explicit setup before it can run the Scala build. The current workaround is sourcing ~/.bashrc explicitly before running sbt, and using the full paths to binaries.

What we’ve learned: the API key for the doc compressor needs to be in ~/.bashrc as an export. If it’s only in PowerShell’s $PROFILE, the bash shell doesn’t see it. We’ve set it up correctly now, but it required discovering this the hard way multiple times.

We’re treating this as an ongoing experiment rather than a solved problem.


Skills: Still on the Roadmap

We haven’t converted the routing table to skills yet. The plan remains: move each task routing entry into its own .claude/skills/ file, reducing the base context from ~3,000 tokens per message to ~500 tokens plus whatever skills are loaded for the current task.

The reason we haven’t done it yet is prioritization — closing the documentation gaps and building the template came first. Skills are an optimization; the gaps were blockers.


Gap Tracker: 10 of 11 Done

| # | Gap | Status |
| --- | --- | --- |
| 1 | Scala code patterns/signatures | ✅ Done |
| 2 | References to existing code as templates | ✅ Done |
| 3 | Congress.gov API specs | ✅ Done |
| 4 | build.sbt / project scaffolding | ✅ Done |
| 5 | Error handling & retry strategy | ✅ Done |
| 6 | Testing guidance | ✅ Done |
| 7 | Behavioral ambiguity | ✅ Done |
| 8 | GCP integration patterns | ✅ Done |
| 9 | Acceptance criteria per component | Not started |
| 10 | Docker / CI / GitHub Actions templates | ✅ Done |
| 11 | Giter8 repo template | ✅ Done |

10 done. 1 remaining.

Gap #9 — acceptance criteria per component — is the last piece. This means per-component success conditions: “Bill Ingestion is done when N bills are fetched, M are persisted, K failures are retried and either recovered or dead-lettered.” These criteria become the acceptance tests that verify an agent’s implementation actually works end-to-end.
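As a sketch of what a machine-checkable criterion could look like — the types, names, and the specific “done” condition here are ours for illustration, not the actual Gap #9 criteria:

```scala
// Hypothetical shape of an acceptance criterion for Bill Ingestion.
// "Done" here means: at least the expected number of bills was fetched, and
// every fetched bill ended up either persisted or dead-lettered (after
// retries). The real Gap #9 criteria may differ.
final case class IngestionRun(fetched: Int, persisted: Int, deadLettered: Int)

def billIngestionAccepted(run: IngestionRun, expectedBills: Int): Boolean =
  run.fetched >= expectedBills &&
  run.persisted + run.deadLettered == run.fetched
```

An acceptance test would run the pipeline against a fixture dataset, observe the counts, and assert billIngestionAccepted(observedRun, expectedBills) — turning the prose success condition into a pass/fail gate an agent’s implementation must clear.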


What’s Next

Gap #9, then we generate code.

That’s it. Once we have acceptance criteria, every piece is in place: the architecture, the patterns, the behavioral rules, the templates, the API specs, the quality gates, the compressed context, and now the template for generating new repos. An agent will have everything it needs to implement a RepCheck repository from scratch — write the code, write the tests, pass the coverage gate, and have the behavior verified against defined criteria.

The entire purpose of this past week of work has been to reach this point. We’re almost there.


Project Repositories

All code for RepCheck is on GitHub:

  • votr: Main monorepo. Pipelines, migrations, infrastructure code, acceptance criteria, and this blog
  • repcheck-shared-models: Shared models library. DTOs, domain objects, Circe codecs, Doobie codecs
  • repcheck-pipeline-models: Pipeline models library. Events, workflow schemas, error handling, configuration
  • repcheck-ingestion-common: Ingestion common library. API client, XML parsing, change detection, event publishing, repository base, placeholders, execution helpers, structured logging
  • repcheck-g8: Giter8 template for scaffolding new RepCheck Scala repositories
  • tf-repcheck-infra: Terraform infrastructure-as-code for GCP (dev/staging/prod)

This is part of an ongoing series documenting RepCheck’s development. Previous posts: Introducing RepCheck | Building Agent-Ready Context | Token Costs and Template Architecture | Closing the Gaps