I Built a Working Beta in 5 Days. Here's What I Actually Learned.
In my last post I talked in broad strokes about how AI changed things for me. This time I want to get into the weeds.
I built the working beta of FinEL Analytica in about 5 days of actual keyboard time. Auth service in Rust. API gateway. Data pipelines that parse SEC filings. Stripe billing. React frontend. The whole thing. Not 5 consecutive days (I have a day job), but about 5 days' worth of evenings and weekends where I was heads-down building.
I know how that sounds. I wouldn’t have believed it either. But it happened and I’ve been thinking a lot about why it worked, because it wasn’t just “type stuff into Claude and magic comes out.” There’s a real workflow here and most of the important parts are things I had to figure out the hard way.
Write MCP servers for yourself
Okay, this is the big one. If you take one thing away from this post, make it this.
Claude Code has this thing called MCP servers (Model Context Protocol, if you want the full name): basically plugins that let Claude directly interact with your stuff. Your databases, your APIs, your infrastructure. Instead of you copy-pasting a SQL result into the chat window like some kind of animal, Claude just… runs the query.
I built two of these for FinEL. Took about two hours total because they’re honestly not that complicated — just TypeScript wrappers around things that already exist.
The first one gives Claude read access to all my databases. Redis, MariaDB, DynamoDB. It can also call any of my API endpoints and trigger builds and deploys for all 12 of my services. The second one wraps the actual FinEL product API — all 22 endpoints. So Claude can search for companies, pull financial statements, read SEC filings, the same way an actual user would.
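If you've never written one, here's a stripped-down sketch of the shape using the MCP TypeScript SDK. This isn't my actual server, and the names are placeholders (`finel-dev`, `FINEL_API_URL`, the tool names), but the pattern really is this simple: declare a tool, give it a schema, return text.

```typescript
// A minimal read-only MCP server: one Redis reader plus one API wrapper.
// Assumes @modelcontextprotocol/sdk, zod, and ioredis are installed.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");
const server = new McpServer({ name: "finel-dev", version: "0.1.0" });

// Read-only Redis access, so Claude can inspect session keys itself
// instead of asking me to run redis-cli and paste the output back.
server.tool(
  "redis_get",
  "Fetch the value stored at a Redis key",
  { key: z.string() },
  async ({ key }) => {
    const value = await redis.get(key);
    return { content: [{ type: "text", text: value ?? "(nil)" }] };
  },
);

// Same idea for the product API: one tool per endpoint, so Claude can
// use the product the way a real user would.
server.tool(
  "search_companies",
  "Search FinEL for companies by name or ticker",
  { query: z.string() },
  async ({ query }) => {
    const res = await fetch(
      `${process.env.FINEL_API_URL}/companies?q=${encodeURIComponent(query)}`,
    );
    return { content: [{ type: "text", text: await res.text() }] };
  },
);

await server.connect(new StdioServerTransport());
```

Register it in Claude Code's MCP config and the tools show up in every session from then on. The product API server is just this pattern repeated for all 22 endpoints.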
The difference this makes is stupid. Before the MCPs, debugging looked like this: I describe a bug to Claude. Claude asks me to check something. I go check it. I paste the result back. Claude asks me to check something else. I go check it. Back and forth, back and forth, burning 20 minutes on something that should take 2.
Now I just say “auth is returning 401 for beta users, check the Redis session and the DynamoDB record for this email” and Claude goes and does it. Looks at the actual data. Finds the actual problem. Done.
Two hours of work building those servers, and every single session after that got dramatically better. It’s the best trade I’ve ever made.
Make it think before it types
I had to learn this one by getting burned. More than once.
Claude’s default energy is “let me help! let me write code! right now!” Which is great when you’re scaffolding something new, but terrible when you’re working in an existing codebase. I watched it confidently rewrite a function that was working fine because it didn’t bother reading the file that called it first. The new version was clean, well-structured, and completely broke the auth flow.
So now I have a rule. Before anything non-trivial, Claude has to do its homework. “Read the auth middleware. Look at how the session tokens work. Check the Redis key format. Tell me what you’re going to do before you do it.”
It grumbles a little (not literally, but you can feel it wanting to just start coding), but the output quality when it actually understands the codebase first is so much better. It writes code that belongs instead of code that merely works. There’s a plan mode built into Claude Code for exactly this — use it. I use it on basically anything that touches more than a couple files.
Your context window is not infinite
This one cost me the most time before I figured it out.
I used to have these epic three-hour sessions. Claude's crushing it, writing code across a dozen files, everything's flowing. Then around the two-hour mark, stuff starts getting weird. It forgets a function signature it wrote 40 minutes ago. It introduces a bug in a file it already edited. It starts suggesting changes that contradict things it said an hour earlier.
The context window is filling up. It’s not that Claude gets tired — it’s that there’s a finite amount of stuff it can pay attention to, and when you fill it up with three hours of back-and-forth, the early stuff starts getting fuzzy.
My rule now is to keep sessions tight. One feature. One bug. One service. When I finish something, I start fresh. Yeah, you spend 30 seconds at the top of a new session saying “here’s the project, here’s what we’re doing.” That’s a way better deal than the last hour of a bloated session where Claude is basically sundowning.
Same thing with context — don’t shove giant files in there for no reason. Point Claude at specific files. Let the MCP servers handle data queries instead of pasting results into chat. Every token you waste on context you don’t need is a token that isn’t paying attention to the code you actually care about.
Document as you go, not after
I never wrote documentation. Let’s just be honest about that. I always meant to, and then I didn’t, and the deploy commands lived in my bash history and the architecture decisions lived in my head.
Claude actually fixed this for me, not because it nags me about docs but because there’s a really practical reason to do it now. Claude doesn’t remember anything between sessions. So if you don’t write things down, the next session starts from zero.
What I do now: after every significant piece of work, Claude updates the project notes. Architecture decisions, deploy commands, gotchas, patterns. It all goes into CLAUDE.md and topic-specific files that get loaded at the start of every session.
Claude is weirdly good at this. I would never voluntarily write down something like “MariaDB 11.4 requires explicit ClientPasswordAuthType: MYSQL_NATIVE_PASSWORD in the RDS Proxy auth config.” But Claude does, without being asked twice. And two weeks later when I’m setting up a second proxy and that exact problem bites me again, the note is sitting right there.
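If you've never seen one of these files, there's nothing fancy about it: just markdown sections that get loaded at session start. The entries below are illustrative stand-ins rather than lifted from the real file, except the MariaDB note, which is the genuine article:

```markdown
## Deploy
- `./scripts/deploy.sh <service>` builds the image and pushes it. Run
  from the repo root or the path resolution breaks.

## Gotchas
- MariaDB 11.4 requires explicit `ClientPasswordAuthType:
  MYSQL_NATIVE_PASSWORD` in the RDS Proxy auth config.

## Patterns
- Session tokens live in Redis; always go through the auth middleware
  rather than reading the keys directly.
```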
It’s like having a coworker with perfect memory who actually bothers to update the wiki.
Use a fresh session to review
This feels weird at first but it works really well: have Claude review the code that Claude wrote.
The trick is you can’t do it in the same session. That’s like asking someone to proofread their own essay right after they wrote it. They read what they intended to write, not what’s actually on the page. The session that wrote the code has all the context about why it made certain choices, and that context creates blind spots.
What you want is a cold read. New session. Point it at the changed files. “Review this for bugs, security issues, and anything that doesn’t fit the existing patterns.” That’s it.
I did a full security audit on FinEL this way. Fresh session, told it to be adversarial, go through every service. It found 10 legitimate issues. Some were small (inconsistent input validation), some were not (a couple of places where the rate limiting could be bypassed). I deployed fixes for 5 of them. The other 5 were accepted risk or not applicable, but they were all real findings.
You basically get a free code reviewer who never gets tired and doesn’t have any ego about the code. It’s not perfect — it misses things a human security expert would catch — but it’s infinitely better than the nothing you’d have otherwise.
What still sucks
I should be honest about what doesn’t work well, because the hype around AI coding tools is a little out of control and I don’t want to add to it.
Cross-service debugging is still mostly on you. Claude is great within one service. But when the bug is “something between the auth service and the API gateway and the frontend is dropping the session,” you’re still the one who has to figure out which service to look at first. The MCPs help because Claude can at least poke around in all the databases, but connecting the dots across service boundaries is still a human job.
CSS is painful. Claude writes CSS fine but it can’t see what it looks like. So you get into this loop where you describe what you want, Claude writes it, you check the browser, it’s not right, you describe the fix, Claude writes it, check browser, still not right. My workaround is screenshots — I take a screenshot, drop it into the chat, and Claude can actually look at what’s wrong. It’s way better than trying to describe “the padding on the left side of the third card is off” in words. I have over 100 screenshots sitting on my desktop right now from this workflow. It’s not pretty but it works. You still end up doing more rounds than you’d like though, because Claude’s fixes sometimes nail one thing and break something else visually.
Domain-specific stuff needs constant hand-holding. Claude does not know that a company's fiscal Q4 might end in September. It doesn't know that the SEC has like 47 different XBRL tags for what is essentially "revenue" (something my partner spent 5 years building a classification model to normalize). Every time I work on the financial data pipeline, I have to front-load a bunch of domain context or I get code that looks perfect but handles the data wrong.
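To give you a flavor of the fiscal-year thing: a company's quarters hang off whenever its fiscal year ends, not the calendar. Here's a toy version of that mapping (a hypothetical helper, nowhere near what the real pipeline does). This is exactly the kind of logic Claude will quietly replace with calendar quarters unless you front-load the context:

```typescript
// Toy example: a fiscal quarter depends on when the fiscal year ends.
// For a fiscal year ending in September, October-December is Q1 and
// July-September is Q4.
function fiscalQuarter(periodEnd: Date, fyEndMonth: number): number {
  // fyEndMonth is 0-indexed (September = 8), matching Date.getMonth().
  const monthsIntoFiscalYear = (periodEnd.getMonth() - fyEndMonth + 11) % 12;
  return Math.floor(monthsIntoFiscalYear / 3) + 1; // 1 through 4
}

fiscalQuarter(new Date(2024, 11, 31), 8);  // 1: Q1 for a September year-end
fiscalQuarter(new Date(2024, 11, 31), 11); // 4: Q4 for a calendar-year company
```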
Big refactors are sketchy. 3-4 files, no problem. 20 files? Things start falling apart. Stale imports, missed references, changes that conflict with other changes from earlier in the same session. For anything big, I break it into chunks across separate sessions.
Going fast makes you build dumb things. This is the sneaky one. When the cost of building something drops to an evening, you stop asking “should we build this?” and start asking “can we build this?” and the answer is always yes. I built an entire survey engine for FinEL. A full survey engine. With question types and response aggregation and a UI and everything. Did my financial analytics platform need a survey engine? No. Absolutely not. Not even a little. But I had an idea, Claude was right there, and two hours later it existed. That’s two hours I could’ve spent on the screener feature that users were actually asking for. The speed is genuinely dangerous because it removes the natural friction that used to force you to prioritize. When everything took a week, you had to think hard about whether it was worth a week. When everything takes an evening, you just… build it. And then you have 15 features and none of them are finished. I’m still learning to say no to myself on this.
The context switching between sessions will melt your brain. When you’re this productive, you end up with 4 Claude sessions open at once — one on the auth service, one on the frontend, one debugging a data pipeline, one doing a code review. And each one is moving fast. Really fast. So you’re bouncing between them and trying to remember which session has which context and what you told each one to do, and suddenly you approve a change in the wrong session and now the auth service has a React component in it or whatever. I’m being slightly hyperbolic but only slightly. The cognitive overhead of managing multiple fast-moving AI sessions is a real thing that nobody warned me about. I don’t have a great solution for this yet. I try to limit myself to two sessions max, but when things are flowing it’s hard not to open a third.
And honestly — it makes you lazy if you’re not careful. When Claude is writing most of the code, it’s really easy to just skim the output and approve it. I’ve caught myself doing this and then finding bugs days later that were obvious in retrospect. You have to actually read the code. Treat it like a PR review, not a rubber stamp. I’m still working on being disciplined about this.
This is not vibe coding
I want to be really clear about something because I think there’s a misconception about what AI-assisted development actually looks like in practice.
I’m not sitting on the couch half-watching Netflix while Claude writes my app. That’s vibe coding. That’s how you get a demo that falls apart the second a real user touches it.
What I’m actually doing is PM’ing, tech leading, and QA’ing a team of 4 highly productive junior engineers who happen to live inside my terminal. I’m writing requirements. I’m reviewing every pull request. I’m making architecture decisions. I’m catching the stuff they miss. I’m deciding what gets built and in what order. I’m saying “no, that’s not how we handle auth in this codebase, go look at how the existing middleware does it.” I’m running the feature in my browser and filing bugs against their work.
It’s management. It’s just really fast management of really fast engineers who never get tired and never have opinions about the sprint planning.
The 20 years of experience aren’t optional here. They’re the whole point. If I didn’t know what good architecture looks like, if I couldn’t spot a subtle bug in a code review, if I didn’t understand the domain well enough to catch when the AI gets the business logic wrong — the output would be garbage. Fast garbage, but garbage.
The skill isn’t writing prompts. The skill is everything you already know about building software, applied to a new workflow where you’re directing instead of typing.
The actual takeaway
None of these lessons are complicated. Write tools that give the AI access to your stuff. Make it plan before it codes. Keep sessions focused. Write things down. Use fresh eyes for review. Stay engaged. Remember that you’re the tech lead, not a passenger.
The hard part isn’t knowing this — it’s actually doing it consistently. Every time I skip one of these steps because I’m in a hurry, the quality drops. Every time I follow them, I’m shocked at how much I can get done in an evening.
It’s a different way of working and it took me a while to internalize it. But once it clicks, going back to the old way of doing things feels like trying to dig a swimming pool with a spoon.
I’m going to write up the MCP server setup in detail next — the actual code, what to expose, how to structure it. Especially if you’re building on AWS, it’s a massive unlock.