Can AI coding agents produce production-quality software if you give them enough context? We don’t know yet. But we’re building the structure to find out.
What Is RepCheck?
RepCheck is a platform designed to help U.S. citizens understand how their elected legislators vote relative to their personal political interests.
The idea is straightforward: most people have political opinions but don’t follow the hundreds of bills that move through Congress each session. Meanwhile, every representative and senator casts votes on those bills — creating a public record that almost nobody reads.
RepCheck bridges that gap. It takes a user’s stated political preferences, analyzes Congressional bills using large language models, tracks how every legislator votes, and produces an alignment score: “Based on what you care about, here’s how well this legislator actually represents you.”
How It Works
- Data Ingestion — Bills, votes, member records, and amendments are pulled from the Congress.gov API and stored in Google Cloud Firestore
- Bill Analysis — Each bill’s full text is analyzed by LLMs using configurable prompt chains. The output is structured: summaries, topic tags, stance classifications, pork/rider detection, projected impact, and fiscal estimates
- User Profiling — Users answer a dynamically generated questionnaire about political topics they care about. Their responses build a structured political preference profile stored in Cloud SQL
- Alignment Scoring — LLMs compare a user’s political profile against each legislator’s voting record to produce per-topic and overall alignment scores
- Pre-computed Delivery — Scores are batch-computed and cached so users get instant results
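To make the data flowing through that pipeline concrete, here is a minimal sketch of the structured analysis and scoring types, assuming illustrative names and fields (BillAnalysis, AlignmentScore, fiscalEstimate) rather than the actual RepCheck schema. The codecs follow the Circe semi-auto pattern described later in this post.

```scala
// Illustrative only: field names and types are assumptions, not the actual RepCheck schema.
import io.circe.Codec
import io.circe.generic.semiauto.deriveCodec

// Structured output of the LLM bill-analysis step
final case class BillAnalysis(
  billId: String,                  // natural key from Congress.gov
  summary: String,
  topics: List[String],
  stance: String,
  porkRiders: List[String],
  projectedImpact: String,
  fiscalEstimate: Option[BigDecimal]
)

object BillAnalysis:
  given Codec[BillAnalysis] = deriveCodec[BillAnalysis]

// Per-topic and overall alignment between a user and a legislator
final case class AlignmentScore(
  userId: String,
  legislatorId: String,
  perTopic: Map[String, Double],   // 0.0 to 1.0 per topic
  overall: Double
)

object AlignmentScore:
  given Codec[AlignmentScore] = deriveCodec[AlignmentScore]
```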
Why This Matters As a Software Project
RepCheck is a real product idea. But it’s also an experiment.
The experiment is: can we build a multi-repository, production-grade system by having AI coding agents implement the components independently — if we invest heavily enough in design, patterns, and enforcement up front?
We genuinely don’t know the answer. What we do know is that the naive approach — “build me a system that does X” — doesn’t work for anything beyond trivial projects. Agents hallucinate API surfaces, invent inconsistent patterns across files, and produce code that compiles but doesn’t cohere.
So we’re trying something different. We’re treating the design phase as a first-class engineering effort — the same rigor you’d put into the code, applied to the documentation and context that agents will consume.
The Approach
The Hypothesis
If we create sufficiently detailed, sufficiently specific context documents — system architecture, code patterns, API contracts, compile-time enforcement, acceptance criteria — then coding agents should be able to implement individual components that are consistent with each other, even when built in isolation across separate repositories.
The Risk
This might not work. Some likely failure modes:
- Context documents might still be ambiguous. Even with 21 explicit architectural decisions documented, an agent might interpret “flat error types” differently than we intended
- Cross-repo integration might break. Each repo’s agent might produce internally consistent code that fails when combined with other repos
- The design docs might be wrong. Decisions made at the design phase may not survive contact with implementation realities — API limitations, library incompatibilities, performance bottlenecks
- The iteration cost might be too high. If agents produce code that requires heavy human revision, the time saved by using agents may be consumed by review and correction
The Commitment
If the first attempt doesn’t produce high-quality code, we iterate. We’ll identify exactly where the context was insufficient, where the patterns were ambiguous, and where the enforcement was too loose. Then we’ll refine the documents and try again.
The goal is not to succeed on the first try. The goal is to find a repeatable structure that reliably produces good results. If that takes three iterations, that’s three iterations of better documentation — which has value regardless.
What We’ve Built So Far
No application code has been written for the full system yet. What exists is the context layer:
System Design Document
A comprehensive architecture spec with Mermaid diagrams covering:
- 9 independent repositories and their dependency graph
- Event-driven pipeline orchestration via Google Cloud Pub/Sub (only 4 events, each with documented downstream consumers)
- Prompt engine architecture — all prompt content lives in GCS, zero hardcoded prompts
- Vendor-neutral LLM abstraction with pluggable adapters and multi-provider fan-out (see the sketch after this list)
- Storage strategy split by concern: Firestore for legislative data, Cloud SQL for user data
- Separation of pipeline-operational models from domain models
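Of those decisions, the LLM abstraction is the one an agent is most likely to misread, so here is a minimal sketch of its intended shape. LlmRequest is the name used in the patterns document; LlmResponse, LlmAdapter, and ClaudeAdapter are assumed names for illustration only.

```scala
// Sketch of the vendor-neutral LLM layer. LlmRequest comes from the patterns doc;
// LlmResponse, LlmAdapter, and ClaudeAdapter are assumed names for illustration.
import cats.effect.Sync

final case class LlmRequest(
  systemPrompt: String,
  userPrompt: String,
  model: String,
  maxTokens: Int
)

final case class LlmResponse(text: String, inputTokens: Long, outputTokens: Long)

// One adapter per provider; calling code only ever sees this trait
trait LlmAdapter[F[_]]:
  def complete(request: LlmRequest): F[LlmResponse]

// A provider adapter owns the vendor SDK call and maps it back to the neutral types
final class ClaudeAdapter[F[_]: Sync] extends LlmAdapter[F]:
  def complete(request: LlmRequest): F[LlmResponse] =
    Sync[F].blocking {
      // the vendor SDK call would go here; stubbed for the sketch
      LlmResponse(text = s"stubbed response from ${request.model}", inputTokens = 0L, outputTokens = 0L)
    }
```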
Scala Code Patterns Document
19 sections covering every technical convention:
- Effect system: tagless final F[_] everywhere (see the sketch after this list)
- DTO/DO layering: each repo publishes its own models sub-project
- Error handling: flat exception types, fail-fast per item, stream-and-forget with external result storage
- Serialization: Circe with semi-auto derivation, organized import ordering
- Streaming: FS2 with per-item emission, no in-memory accumulation
- Configuration: PureConfig with auto-derivation
- Database: Doobie for Cloud SQL, Firebase Admin SDK wrapped in Sync[F] for Firestore
- LLM integration: vendor-neutral LlmRequest → pluggable adapter → vendor SDK
- ID strategy: natural keys for legislative data, generated UUIDs for RepCheck entities
- Project structure: models/ + app/ split only where appropriate
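As referenced above, here is a compressed sketch of how several of these patterns compose: tagless final algebras over an abstract F[_], FS2 per-item streaming with no accumulation, and fail-fast-per-item error handling with results written to external storage. The algebra names (BillSource, AnalysisStore) are hypothetical; the structure is the point.

```scala
// Sketch only: BillSource, AnalysisStore, and analyze are hypothetical names
// illustrating the documented patterns, not the real RepCheck algebras.
import cats.effect.Sync
import cats.syntax.all.*
import fs2.Stream

final case class Bill(id: String, text: String)
// Simplified version of the analysis output sketched earlier in the post
final case class BillAnalysis(billId: String, summary: String)

// Tagless final algebras: every capability is expressed against an abstract F[_]
trait BillSource[F[_]]:
  def bills: Stream[F, Bill]

trait AnalysisStore[F[_]]:
  def save(analysis: BillAnalysis): F[Unit]
  def saveFailure(billId: String, error: Throwable): F[Unit]

// Per-item emission: each bill is analyzed and persisted as it arrives.
// Failures are recorded externally and the stream moves on (fail-fast per item).
def analyzeAll[F[_]: Sync](
  source: BillSource[F],
  store: AnalysisStore[F],
  analyze: Bill => F[BillAnalysis]
): Stream[F, Unit] =
  source.bills.evalMap { bill =>
    analyze(bill).attempt.flatMap {
      case Right(result) => store.save(result)
      case Left(error)   => store.saveFailure(bill.id, error)
    }
  }
```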
Compile-Time Enforcement
Tooling that errors on violations:
- WartRemover — 11 error rules (no null, var, .get, .head, mutable collections, unsafe casts, return, Try.get)
- Scalafix — organized imports (java → scala → cats → circe → http4s → fs2 → google → project)
- tpolecat — strict compiler flags with -Xfatal-warnings (example sbt settings below)
- GitHub Actions CI — compile + scalafix check + test on every PR
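For reference, here is a build.sbt fragment in the spirit of that enforcement, assuming the standard sbt-wartremover, sbt-tpolecat, and sbt-scalafix plugins; the exact rule set in the real repos may differ.

```scala
// build.sbt fragment (sketch). Assumes sbt-wartremover, sbt-tpolecat, and sbt-scalafix
// are added in project/plugins.sbt; the real repos may enable a different rule set.
ThisBuild / scalaVersion := "3.4.1"
ThisBuild / semanticdbEnabled := true   // required for Scalafix's OrganizeImports check

lazy val app = project
  .in(file("app"))
  .settings(
    // WartRemover: these become compile errors, not warnings
    wartremoverErrors ++= Seq(
      Wart.Null,
      Wart.Var,
      Wart.OptionPartial,        // .get on Option
      Wart.IterableOps,          // .head and friends
      Wart.MutableDataStructures,
      Wart.AsInstanceOf,
      Wart.Return,
      Wart.TryPartial            // .get on Try
    ),
    // sbt-tpolecat manages most flags; fatal warnings kept explicit for the sketch
    scalacOptions += "-Xfatal-warnings"
  )
```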
A Working Prototype
The original bill ingestion pipeline — a functional Scala 3 application that streams Congressional bills from Congress.gov into Firestore. This serves as the reference implementation that all patterns were derived from.
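One pattern the prototype established, wrapping the Firebase Admin SDK in Sync[F], looks roughly like the sketch below; the collection name and BillDocument fields are illustrative, not the real schema.

```scala
// Sketch of wrapping the Firebase Admin SDK in Sync[F].
// "bills" and the field names are illustrative, not the real collection schema.
import cats.effect.Sync
import cats.syntax.functor.*
import com.google.cloud.firestore.Firestore
import scala.jdk.CollectionConverters.*

final case class BillDocument(billId: String, title: String, congress: Int)

final class FirestoreBillStore[F[_]: Sync](firestore: Firestore):
  def upsert(doc: BillDocument): F[Unit] =
    Sync[F].blocking {
      val data: java.util.Map[String, AnyRef] = Map[String, AnyRef](
        "billId"   -> doc.billId,
        "title"    -> doc.title,
        "congress" -> Int.box(doc.congress)
      ).asJava
      // ApiFuture.get() blocks, so the whole call lives inside Sync[F].blocking
      firestore.collection("bills").document(doc.billId).set(data).get()
    }.void
```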
The Technology Stack
| Layer | Technology | Why |
|---|---|---|
| Language | Scala 3.4.1 | Strong type system, FP-native, tagless final support |
| Effect System | Cats Effect | Pure FP effect management, resource safety |
| HTTP | Http4s Ember | FP-native HTTP client, integrates with Cats Effect |
| JSON | Circe | Type-safe JSON with semi-auto derivation |
| Streaming | FS2 | Functional streaming, memory-safe batch processing |
| Relational DB | Doobie | FP-native JDBC, composable transactions |
| Document DB | Firebase Admin SDK | Firestore access, wrapped in effect types |
| Object Storage | GCS Java SDK | Prompt fragment storage, wrapped in Sync[F] |
| Event Bus | Google Cloud Pub/Sub | Async pipeline orchestration |
| Compute | Cloud Run Jobs | Serverless pipeline execution |
| CI | GitHub Actions | Compile + lint + test on every PR |
| Build | SBT | Multi-project Scala builds |
| Publishing | GitHub Packages (Maven) | Cross-repo dependency management |
The Repository Structure
The system is designed as independent repositories, each with a focused responsibility:
- repcheck-shared-models: Pure domain types (bills, members, votes, users, LLM schemas)
- repcheck-pipeline-models: Pipeline operational types (events, job metadata, collection constants; event type sketched after this list)
- repcheck-llm-client: Vendor-neutral LLM types + pluggable adapters (Claude, GPT, Gemini)
- repcheck-data-ingestion: Congress.gov pipelines (4 sub-projects: bills, votes, members, amendments)
- repcheck-prompt-engine-bills: Bill analysis prompt composition (loads fragments from GCS)
- repcheck-prompt-engine-users: User scoring prompt composition (loads fragments from GCS)
- repcheck-llm-analysis: Bill analysis pipeline (orchestrates prompts + LLM calls)
- repcheck-scoring-engine: Alignment scoring pipeline (user profiles vs voting records)
- repcheck-api-server: Http4s REST API for frontend (future phase)
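To show how a pipeline event travels between these repos, here is a hedged sketch: a shared event type (the name BillsIngested and its fields are assumptions; the real system defines only the four documented events) plus a publisher that wraps the Google Cloud Pub/Sub client in an effect.

```scala
// Sketch of a pipeline event and its publisher. BillsIngested and its fields
// are assumed for illustration; the real system defines only four documented events.
import cats.effect.Sync
import com.google.cloud.pubsub.v1.Publisher
import com.google.protobuf.ByteString
import com.google.pubsub.v1.PubsubMessage
import io.circe.Codec
import io.circe.generic.semiauto.deriveCodec
import io.circe.syntax.*

// Lives in repcheck-pipeline-models so producers and consumers share one definition
final case class BillsIngested(congress: Int, billIds: List[String], runId: String)

object BillsIngested:
  given Codec[BillsIngested] = deriveCodec[BillsIngested]

final class EventPublisher[F[_]: Sync](publisher: Publisher):
  def publish(event: BillsIngested): F[String] =
    Sync[F].blocking {
      val message = PubsubMessage
        .newBuilder()
        .setData(ByteString.copyFromUtf8(event.asJson.noSpaces))
        .build()
      publisher.publish(message).get() // returns the Pub/Sub message id
    }
```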
Each repo that has both publishable types and application code uses a models/ + app/ sub-project structure. The models/ project is published as a Maven artifact via GitHub Packages so other repos can depend on the types without pulling in application code.
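A build.sbt sketch of that split is below; the organization, repository path, and credential environment variables are placeholders rather than the real project settings.

```scala
// build.sbt sketch of the models/ + app/ split. The organization, repo path, and
// credential environment variables are placeholders, not the real project settings.
ThisBuild / scalaVersion := "3.4.1"
ThisBuild / organization := "com.example.repcheck"

// Published as a Maven artifact to GitHub Packages so other repos can depend on it
lazy val models = project
  .in(file("models"))
  .settings(
    name := "repcheck-shared-models",
    publishTo := Some(
      "GitHub Packages" at "https://maven.pkg.github.com/OWNER/repcheck-shared-models"
    ),
    credentials += Credentials(
      "GitHub Package Registry",
      "maven.pkg.github.com",
      sys.env.getOrElse("GITHUB_ACTOR", ""),
      sys.env.getOrElse("GITHUB_TOKEN", "")
    )
  )

// Application code depends on models but is never published
lazy val app = project
  .in(file("app"))
  .dependsOn(models)
  .settings(
    publish / skip := true
  )
```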
What Happens Next
The immediate next step is filling the gaps in the design documents. The system design covers what each component does, but not yet how: concrete Scala signatures, API endpoint specifications, build scaffolding, error retry strategies, testing patterns, and acceptance criteria still need to be specified.
Once those gaps are filled, we begin the actual experiment: handing individual repository specs to coding agents and evaluating what they produce.
We’ll document the results honestly. If agents produce clean, consistent, passing implementations on the first try, that’s a data point. If they produce code that needs significant revision, that’s also a data point — and we’ll document exactly what context was missing and what we changed.
The outcome we’re optimizing for isn’t “AI wrote our code.” It’s: “We found a repeatable process that produces reliable results.” Whether that process involves agents writing 90% of the code or 30%, the value is in knowing what works and being able to do it again.
Follow Along
This is a multi-stage project. The next post documents the detailed process of building the context layer — the structured Q&A sessions, the decisions, the corrections, and the effort involved.
Future posts will cover:
- Gap analysis and implementation spec creation
- First agent implementation attempts and results
- What worked, what didn’t, and what we changed
- The evolving process as we iterate
All design documents, enforcement configs, and CI pipelines are in the RepCheck repository.
Project Repositories
All code for RepCheck is on GitHub:
- votr: Main monorepo. Pipelines, migrations, infrastructure code, acceptance criteria, and this blog
- repcheck-shared-models: Shared models library. DTOs, domain objects, Circe codecs, Doobie codecs
- repcheck-pipeline-models: Pipeline models library. Events, workflow schemas, error handling, configuration
- repcheck-ingestion-common: Ingestion common library. API client, XML parsing, change detection, event publishing, repository base, placeholders, execution helpers, structured logging
- repcheck-g8: Giter8 template for scaffolding new RepCheck Scala repositories
- tf-repcheck-infra: Terraform infrastructure-as-code for GCP (dev/staging/prod)