Can AI coding agents produce production-quality software if you give them enough context? We don’t know yet. But we’re building the structure to find out.
What Is RepCheck?
RepCheck is a platform designed to help U.S. citizens understand how their elected legislators vote relative to their personal political interests.
The idea is straightforward: most people have political opinions but don’t follow the hundreds of bills that move through Congress each session. Meanwhile, every representative and senator casts votes on those bills — creating a public record that almost nobody reads.
RepCheck bridges that gap. It takes a user’s stated political preferences, analyzes Congressional bills using large language models, tracks how every legislator votes, and produces an alignment score: “Based on what you care about, here’s how well this legislator actually represents you.”
How It Works
- Data Ingestion — Bills, votes, member records, and amendments are pulled from the Congress.gov API and stored in Google Cloud Firestore
- Bill Analysis — Each bill’s full text is analyzed by LLMs using configurable prompt chains. The output is structured: summaries, topic tags, stance classifications, pork/rider detection, projected impact, and fiscal estimates
- User Profiling — Users answer a dynamically generated questionnaire about political topics they care about. Their responses build a structured political preference profile stored in Cloud SQL
- Alignment Scoring — LLMs compare a user’s political profile against each legislator’s voting record to produce per-topic and overall alignment scores
- Pre-computed Delivery — Scores are batch-computed and cached so users get instant results
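To make the data flowing through that pipeline concrete, here is a minimal sketch of the structured analysis and scoring types, assuming illustrative names and fields (BillAnalysis, AlignmentScore, fiscalEstimate) rather than the actual RepCheck schema. The codecs follow the Circe semi-auto pattern described later in this post.

```scala
// Illustrative only: field names and types are assumptions, not the actual RepCheck schema.
import io.circe.Codec
import io.circe.generic.semiauto.deriveCodec

// Structured output of the LLM bill-analysis step
final case class BillAnalysis(
  billId: String,                  // natural key from Congress.gov
  summary: String,
  topics: List[String],
  stance: String,
  porkRiders: List[String],
  projectedImpact: String,
  fiscalEstimate: Option[BigDecimal]
)

object BillAnalysis:
  given Codec[BillAnalysis] = deriveCodec[BillAnalysis]

// Per-topic and overall alignment between a user and a legislator
final case class AlignmentScore(
  userId: String,
  legislatorId: String,
  perTopic: Map[String, Double],   // 0.0 to 1.0 per topic
  overall: Double
)

object AlignmentScore:
  given Codec[AlignmentScore] = deriveCodec[AlignmentScore]
```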
Why This Matters As a Software Project
RepCheck is a real product idea. But it’s also an experiment.
The experiment is: can we build a multi-repository, production-grade system by having AI coding agents implement the components independently — if we invest heavily enough in design, patterns, and enforcement up front?
We genuinely don’t know the answer. What we do know is that the naive approach — “build me a system that does X” — doesn’t work for anything beyond trivial projects. Agents hallucinate API surfaces, invent inconsistent patterns across files, and produce code that compiles but doesn’t cohere.
So we’re trying something different. We’re treating the design phase as a first-class engineering effort — the same rigor you’d put into the code, applied to the documentation and context that agents will consume.
The Approach
The Hypothesis
If we create sufficiently detailed, sufficiently specific context documents — system architecture, code patterns, API contracts, compile-time enforcement, acceptance criteria — then coding agents should be able to implement individual components that are consistent with each other, even when built in isolation across separate repositories.
The Risk
This might not work. Some likely failure modes:
- Context documents might still be ambiguous. Even with 21 explicit architectural decisions documented, an agent might interpret “flat error types” differently than we intended
- Cross-repo integration might break. Each repo’s agent might produce internally consistent code that fails when combined with other repos
- The design docs might be wrong. Decisions made at the design phase may not survive contact with implementation realities — API limitations, library incompatibilities, performance bottlenecks
- The iteration cost might be too high. If agents produce code that requires heavy human revision, the time saved by using agents may be consumed by review and correction
The Commitment
If the first attempt doesn’t produce high-quality code, we iterate. We’ll identify exactly where the context was insufficient, where the patterns were ambiguous, and where the enforcement was too loose. Then we’ll refine the documents and try again.
The goal is not to succeed on the first try. The goal is to find a repeatable structure that reliably produces good results. If that takes three iterations, that’s three iterations of better documentation — which has value regardless.
What We’ve Built So Far
No application code has been written for the full system yet. What exists is the context layer:
System Design Document
A comprehensive architecture spec with Mermaid diagrams covering:
- 9 independent repositories and their dependency graph
- Event-driven pipeline orchestration via Google Cloud Pub/Sub (only 4 events, each with documented downstream consumers)
- Prompt engine architecture — all prompt content lives in GCS, zero hardcoded prompts
- Vendor-neutral LLM abstraction with pluggable adapters and multi-provider fan-out (see the sketch after this list)
- Storage strategy split by concern: Firestore for legislative data, Cloud SQL for user data
- Separation of pipeline-operational models from domain models
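Of those decisions, the LLM abstraction is the one an agent is most likely to misread, so here is a minimal sketch of its intended shape. LlmRequest is the name used in the patterns document; LlmResponse, LlmAdapter, and ClaudeAdapter are assumed names for illustration only.

```scala
// Sketch of the vendor-neutral LLM layer. LlmRequest comes from the patterns doc;
// LlmResponse, LlmAdapter, and ClaudeAdapter are assumed names for illustration.
import cats.effect.Sync

final case class LlmRequest(
  systemPrompt: String,
  userPrompt: String,
  model: String,
  maxTokens: Int
)

final case class LlmResponse(text: String, inputTokens: Long, outputTokens: Long)

// One adapter per provider; calling code only ever sees this trait
trait LlmAdapter[F[_]]:
  def complete(request: LlmRequest): F[LlmResponse]

// A provider adapter owns the vendor SDK call and maps it back to the neutral types
final class ClaudeAdapter[F[_]: Sync] extends LlmAdapter[F]:
  def complete(request: LlmRequest): F[LlmResponse] =
    Sync[F].blocking {
      // the vendor SDK call would go here; stubbed for the sketch
      LlmResponse(text = s"stubbed response from ${request.model}", inputTokens = 0L, outputTokens = 0L)
    }
```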
Scala Code Patterns Document
19 sections covering every technical convention:
- Effect system: tagless final F[_] everywhere (see the sketch after this list)
- DTO/DO layering: each repo publishes its own models sub-project
- Error handling: flat exception types, fail-fast per item, stream-and-forget with external result storage
- Serialization: Circe with semi-auto derivation, organized import ordering
- Streaming: FS2 with per-item emission, no in-memory accumulation
- Configuration: PureConfig with auto-derivation
- Database: Doobie for Cloud SQL, Firebase Admin SDK wrapped in Sync[F] for Firestore
- LLM integration: vendor-neutral LlmRequest → pluggable adapter → vendor SDK
- ID strategy: natural keys for legislative data, generated UUIDs for RepCheck entities
- Project structure: models/ + app/ split only where appropriate
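As referenced above, here is a compressed sketch of how several of these patterns compose: tagless final algebras over an abstract F[_], FS2 per-item streaming with no accumulation, and fail-fast-per-item error handling with results written to external storage. The algebra names (BillSource, AnalysisStore) are hypothetical; the structure is the point.

```scala
// Sketch only: BillSource, AnalysisStore, and analyze are hypothetical names
// illustrating the documented patterns, not the real RepCheck algebras.
import cats.effect.Sync
import cats.syntax.all.*
import fs2.Stream

final case class Bill(id: String, text: String)
// Simplified version of the analysis output sketched earlier in the post
final case class BillAnalysis(billId: String, summary: String)

// Tagless final algebras: every capability is expressed against an abstract F[_]
trait BillSource[F[_]]:
  def bills: Stream[F, Bill]

trait AnalysisStore[F[_]]:
  def save(analysis: BillAnalysis): F[Unit]
  def saveFailure(billId: String, error: Throwable): F[Unit]

// Per-item emission: each bill is analyzed and persisted as it arrives.
// Failures are recorded externally and the stream moves on (fail-fast per item).
def analyzeAll[F[_]: Sync](
  source: BillSource[F],
  store: AnalysisStore[F],
  analyze: Bill => F[BillAnalysis]
): Stream[F, Unit] =
  source.bills.evalMap { bill =>
    analyze(bill).attempt.flatMap {
      case Right(result) => store.save(result)
      case Left(error)   => store.saveFailure(bill.id, error)
    }
  }
```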
Compile-Time Enforcement
Tooling that errors on violations:
- WartRemover — 11 error rules (no null, var, .get, .head, mutable collections, unsafe casts, return, Try.get)
- Scalafix — organized imports (java → scala → cats → circe → http4s → fs2 → google → project)
- tpolecat — strict compiler flags with -Xfatal-warnings (example sbt settings below)
- GitHub Actions CI — compile + scalafix check + test on every PR
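For reference, here is a build.sbt fragment in the spirit of that enforcement, assuming the standard sbt-wartremover, sbt-tpolecat, and sbt-scalafix plugins; the exact rule set in the real repos may differ.

```scala
// build.sbt fragment (sketch). Assumes sbt-wartremover, sbt-tpolecat, and sbt-scalafix
// are added in project/plugins.sbt; the real repos may enable a different rule set.
ThisBuild / scalaVersion := "3.4.1"
ThisBuild / semanticdbEnabled := true   // required for Scalafix's OrganizeImports check

lazy val app = project
  .in(file("app"))
  .settings(
    // WartRemover: these become compile errors, not warnings
    wartremoverErrors ++= Seq(
      Wart.Null,
      Wart.Var,
      Wart.OptionPartial,        // .get on Option
      Wart.IterableOps,          // .head and friends
      Wart.MutableDataStructures,
      Wart.AsInstanceOf,
      Wart.Return,
      Wart.TryPartial            // .get on Try
    ),
    // sbt-tpolecat manages most flags; fatal warnings kept explicit for the sketch
    scalacOptions += "-Xfatal-warnings"
  )
```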
A Working Prototype
The original bill ingestion pipeline — a functional Scala 3 application that streams Congressional bills from Congress.gov into Firestore. This serves as the reference implementation that all patterns were derived from.
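One pattern the prototype established, wrapping the Firebase Admin SDK in Sync[F], looks roughly like the sketch below; the collection name and BillDocument fields are illustrative, not the real schema.

```scala
// Sketch of wrapping the Firebase Admin SDK in Sync[F].
// "bills" and the field names are illustrative, not the real collection schema.
import cats.effect.Sync
import cats.syntax.functor.*
import com.google.cloud.firestore.Firestore
import scala.jdk.CollectionConverters.*

final case class BillDocument(billId: String, title: String, congress: Int)

final class FirestoreBillStore[F[_]: Sync](firestore: Firestore):
  def upsert(doc: BillDocument): F[Unit] =
    Sync[F].blocking {
      val data: java.util.Map[String, AnyRef] = Map[String, AnyRef](
        "billId"   -> doc.billId,
        "title"    -> doc.title,
        "congress" -> Int.box(doc.congress)
      ).asJava
      // ApiFuture.get() blocks, so the whole call lives inside Sync[F].blocking
      firestore.collection("bills").document(doc.billId).set(data).get()
    }.void
```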
The Technology Stack
| Layer | Technology | Why |
|---|---|---|
| Language | Scala 3.4.1 | Strong type system, FP-native, tagless final support |
| Effect System | Cats Effect | Pure FP effect management, resource safety |
| HTTP | Http4s Ember | FP-native HTTP client, integrates with Cats Effect |
| JSON | Circe | Type-safe JSON with semi-auto derivation |
| Streaming | FS2 | Functional streaming, memory-safe batch processing |
| Relational DB | Doobie | FP-native JDBC, composable transactions |
| Document DB | Firebase Admin SDK | Firestore access, wrapped in effect types |
| Object Storage | GCS Java SDK | Prompt fragment storage, wrapped in Sync[F] |
| Event Bus | Google Cloud Pub/Sub | Async pipeline orchestration |
| Compute | Cloud Run Jobs | Serverless pipeline execution |
| CI | GitHub Actions | Compile + lint + test on every PR |
| Build | SBT | Multi-project Scala builds |
| Publishing | GitHub Packages (Maven) | Cross-repo dependency management |
The Repository Structure
The system is designed as independent repositories, each with a focused responsibility:
- repcheck-shared-models: Pure domain types (bills, members, votes, users, LLM schemas)
- repcheck-pipeline-models: Pipeline operational types (events, job metadata, collection constants; event type sketched after this list)
- repcheck-llm-client: Vendor-neutral LLM types + pluggable adapters (Claude, GPT, Gemini)
- repcheck-data-ingestion: Congress.gov pipelines (4 sub-projects: bills, votes, members, amendments)
- repcheck-prompt-engine-bills: Bill analysis prompt composition (loads fragments from GCS)
- repcheck-prompt-engine-users: User scoring prompt composition (loads fragments from GCS)
- repcheck-llm-analysis: Bill analysis pipeline (orchestrates prompts + LLM calls)
- repcheck-scoring-engine: Alignment scoring pipeline (user profiles vs voting records)
- repcheck-api-server: Http4s REST API for frontend (future phase)
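To show how a pipeline event travels between these repos, here is a hedged sketch: a shared event type (the name BillsIngested and its fields are assumptions; the real system defines only the four documented events) plus a publisher that wraps the Google Cloud Pub/Sub client in an effect.

```scala
// Sketch of a pipeline event and its publisher. BillsIngested and its fields
// are assumed for illustration; the real system defines only four documented events.
import cats.effect.Sync
import com.google.cloud.pubsub.v1.Publisher
import com.google.protobuf.ByteString
import com.google.pubsub.v1.PubsubMessage
import io.circe.Codec
import io.circe.generic.semiauto.deriveCodec
import io.circe.syntax.*

// Lives in repcheck-pipeline-models so producers and consumers share one definition
final case class BillsIngested(congress: Int, billIds: List[String], runId: String)

object BillsIngested:
  given Codec[BillsIngested] = deriveCodec[BillsIngested]

final class EventPublisher[F[_]: Sync](publisher: Publisher):
  def publish(event: BillsIngested): F[String] =
    Sync[F].blocking {
      val message = PubsubMessage
        .newBuilder()
        .setData(ByteString.copyFromUtf8(event.asJson.noSpaces))
        .build()
      publisher.publish(message).get() // returns the Pub/Sub message id
    }
```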
Each repo that has both publishable types and application code uses a models/ + app/ sub-project structure. The models/ project is published as a Maven artifact via GitHub Packages so other repos can depend on the types without pulling in application code.
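A build.sbt sketch of that split is below; the organization, repository path, and credential environment variables are placeholders rather than the real project settings.

```scala
// build.sbt sketch of the models/ + app/ split. The organization, repo path, and
// credential environment variables are placeholders, not the real project settings.
ThisBuild / scalaVersion := "3.4.1"
ThisBuild / organization := "com.example.repcheck"

// Published as a Maven artifact to GitHub Packages so other repos can depend on it
lazy val models = project
  .in(file("models"))
  .settings(
    name := "repcheck-shared-models",
    publishTo := Some(
      "GitHub Packages" at "https://maven.pkg.github.com/OWNER/repcheck-shared-models"
    ),
    credentials += Credentials(
      "GitHub Package Registry",
      "maven.pkg.github.com",
      sys.env.getOrElse("GITHUB_ACTOR", ""),
      sys.env.getOrElse("GITHUB_TOKEN", "")
    )
  )

// Application code depends on models but is never published
lazy val app = project
  .in(file("app"))
  .dependsOn(models)
  .settings(
    publish / skip := true
  )
```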
What Happens Next
The immediate next step is filling the gaps in the design documents. The system design covers what each component does, but not yet how: concrete Scala signatures, API endpoint specifications, build scaffolding, error retry strategies, testing patterns, and acceptance criteria still need to be specified.
Once those gaps are filled, we begin the actual experiment: handing individual repository specs to coding agents and evaluating what they produce.
We’ll document the results honestly. If agents produce clean, consistent, passing implementations on the first try, that’s a data point. If they produce code that needs significant revision, that’s also a data point — and we’ll document exactly what context was missing and what we changed.
The outcome we’re optimizing for isn’t “AI wrote our code.” It’s: “We found a repeatable process that produces reliable results.” Whether that process involves agents writing 90% of the code or 30%, the value is in knowing what works and being able to do it again.
Follow Along
This is a multi-stage project. The next post documents the detailed process of building the context layer — the structured Q&A sessions, the decisions, the corrections, and the effort involved.
Future posts will cover:
- Gap analysis and implementation spec creation
- First agent implementation attempts and results
- What worked, what didn’t, and what we changed
- The evolving process as we iterate
All design documents, enforcement configs, and CI pipelines are in the RepCheck repository.
Project Repositories
All code for RepCheck is on GitHub:
- votr: Main monorepo. Pipelines, migrations, infrastructure code, acceptance criteria, and this blog
- repcheck-shared-models: Shared models library. DTOs, domain objects, Circe codecs, Doobie codecs
- repcheck-pipeline-models: Pipeline models library. Events, workflow schemas, error handling, configuration
- repcheck-ingestion-common: Ingestion common library. API client, XML parsing, change detection, event publishing, repository base, placeholders, execution helpers, structured logging
- repcheck-g8: Giter8 template for scaffolding new RepCheck Scala repositories
- tf-repcheck-infra: Terraform infrastructure-as-code for GCP (dev/staging/prod)