Case Study 05 — Code Review

Code Review Pipeline

How a software team cut PR wait times from 4 hours to 20 minutes without sending their codebase to OpenAI.

The company

Atlas Freight Systems builds logistics and fleet management software. 35 developers, two engineering teams (platform and product), and a codebase that's been growing for eight years. They ship fast — two or three deployments per day. Or they did, until code review became the bottleneck.

The problem

Every pull request needs review before it merges. The team's code review guidelines require at least one senior engineer to review logic, architecture, and edge cases, security checks, test coverage verification, and style and consistency checks.

What actually happens: A developer finishes a feature at 2 PM. They open a PR. It sits in the queue. The senior engineer is in back-to-back meetings until 4:30. She picks up the PR at 5, reviews it, leaves three comments, and goes home. The developer sees the comments at 9 AM the next day, makes changes, re-requests review. The senior engineer reviews again at 2 PM. A one-day feature takes two days to ship because of review latency.

And then there's the quality problem. When the senior engineer finally gets to the PR at 5 PM, she's tired, context-switching from three meetings, and rushing because she knows the developer is waiting. She misses things — a SQL injection vector, a missing test case, an inconsistent error handling pattern.

PR wait time: 3-5 hours. Senior engineer spends 3-4 hours/day on review. 2-3 production bugs/month that a thorough review would have caught.

What they tried: GitHub Copilot (doesn't review PRs), ChatGPT for code review (sent their entire codebase to OpenAI — their biggest client's contract prohibits third-party AI processing), hiring a dedicated reviewer (he quit after four months).

What Foundry does

Foundry runs on a Mac Studio in the engineering team's office. It's connected to their GitHub via a webhook — when a PR is opened or updated, Foundry gets notified.

It does a first-pass code review. Not a rubber stamp. A real review.

When a PR opens, Foundry:

Reads the changes. It understands the diff — not just the lines changed, but the context around them.
Checks against the team's review guidelines: logic errors, security concerns, test coverage, consistency with existing codebase patterns, potential performance issues.
Posts a structured review as a comment on the PR with must-fix issues, should-consider suggestions, and looks-good confirmations — including specific line references and suggested fixes.
Flags the PR for human review with a priority level. No must-fix issues? Quick scan. Three must-fix issues? Needs careful human review.

The senior engineer still reviews every PR. But she's reviewing a PR that's already been through a thorough first pass. She's confirming, not discovering. And she's doing it in 5 minutes instead of 30.

What it looks like day to day

2:15 PM — Developer opens a PR. Sarah pushes a feature: a new endpoint that calculates delivery route optimisation based on traffic data. 340 lines across 4 files.

2:17 PM — Foundry posts review. One SQL injection risk flagged with suggested fix, one missing test case noted, auth and validation confirmed good.

2:20 PM — Developer fixes the SQL injection, adds the empty-data test, pushes update.

2:22 PM — Foundry re-reviews, confirms both issues resolved, flags ready for human review.

2:35 PM — Senior engineer reviews, confirms fix, approves. Total time from PR to merge: 20 minutes.

The transformation

Before

PR wait time: 3-5 hours

Senior review time: 3-4 hrs/day

Merge-to-deploy: 1-2 days

Bugs caught in review: 60%

API cost: £800-1,200/month

After

PR wait time: 15-25 minutes

Senior review time: 45-60 mins/day

Merge-to-deploy: same day

Bugs caught in review: 92%

API cost: £0 (local)

Annual impact: 600-700 hours of senior engineer time recovered + £10,000+ in API costs + fewer production incidents (each P1 incident costs £5,000-15,000 in response, fix, and client impact).

Foundry cost: £999 setup + £99/month = £2,187 first year. Existing Mac Studio.

What the team says

"The first week, Foundry caught a SQL injection in a PR that I would have missed at 5 PM on a Friday. I've been reviewing code for twelve years. That stung — but it proved the point." David, CTO

"I used to wait half a day for someone to look at my code. Now it's reviewed before I've finished my coffee. The feedback is specific — line numbers, suggested fixes, not just 'looks fine.'" Sarah, developer

"The national retailer contract clause about third-party AI was the blocker for us using ChatGPT for review. Foundry runs on our hardware. Our code never leaves the building. Procurement is happy, legal is happy, and we're shipping faster." David, CTO

Technical details

Hardware: Mac Studio M3 Ultra, 512GB unified memory

Model: Qwen3-Coder-30B (Q5_K_M) via llama.cpp — specifically tuned for code understanding

Pipeline: GitHub webhook to fetch diff, analyse changes, post structured review, developer addresses, re-review on update, flag for human approval

Review categories: Security (injection, auth, data exposure), logic (edge cases, error paths, null handling), tests (coverage, edge cases, assertions), consistency (codebase patterns, style, naming), performance (N+1 queries, allocations, blocking calls)

Throughput: 10-30 seconds per PR depending on diff size

False positive rate: ~5-8% on suggestions; near-zero on must-fix items

Is this right for your team?

Works for software teams of 10-100 developers doing regular PRs, companies with proprietary codebases that can't go through third-party AI APIs, teams where senior engineer review time is the deployment bottleneck, organisations with security requirements.

Book a Foundry Fit Review →

← Back to all case studies