
Vibe Coding in the Enterprise: What Actually Works

Andrej Karpathy coined "vibe coding" — describe what you want in English, let AI write the code, iterate by talking. It's real, it's here, and I use it every day at work.

But most of the conversation around it is about solo devs shipping weekend projects. I work at IBM. My systems serve tens of thousands of users across Salesforce, SAP, Azure, and a bunch of custom middleware. The stakes are different.

This is what I've actually learned using AI coding tools on enterprise projects — what works, what breaks, and how I decide when to vibe and when to think.


The Two Modes

There's a useful mental model here. I think about every coding task as falling into one of two buckets:

Mode 1: Structure

Vibe freely

The shape of the code matters more than any individual line. Test scaffolding, boilerplate, CRUD wrappers, migration scripts, CI/CD config. If a mistake here costs you an afternoon, AI can own it.

Mode 2: Judgment

Think first

The decision encoded in the code matters more than the code itself. Security, compliance, integration error handling, platform constraints. If a mistake here pages someone at 3am, you own it.

This distinction — blast radius, not complexity — is the thing I keep coming back to. Some incredibly simple code has enormous blast radius (a misconfigured auth middleware). Some complex code has almost none (a test data factory). Match your approach to the risk, not the difficulty.


Where I Vibe

Prototyping

This is where AI changed my job the most. I used to sketch integration patterns on a whiteboard and ask stakeholders to imagine how it would work. Now I build a working prototype and show them.

Last month, I needed to prove that Salesforce Platform Events could drive a near-real-time sync with an external warehouse system. With Claude Code, I had the Apex trigger, the middleware listener, retry logic, and a basic error dashboard in about three hours. It wasn't production code — but it answered the question, and it killed two weeks of debate.
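The retry logic in that prototype was the kind of thing AI drafts well. Here's a minimal sketch of the pattern, with the transport faked out since the real listener's endpoint details aren't shown here; `flaky_send` and the event shape are illustrative, not the actual middleware:

```python
import time

def sync_with_retry(send, event, max_attempts=3, base_delay=0.01):
    """Push one platform event downstream, backing off exponentially
    between failures. `send` is the transport call -- in the real
    prototype it was an HTTP POST to the warehouse system."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(event)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller decide
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulated flaky transport: fails twice, then succeeds.
calls = {"n": 0}
def flaky_send(event):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("warehouse unavailable")
    return {"status": "synced", "id": event["id"]}

result = sync_with_retry(flaky_send, {"id": "evt-1"})
```

Prototype-grade on purpose: no jitter, no dead-letter handling, no idempotency keys. That's exactly the kind of gap the prototype surfaces for the real design conversation.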

The principle: prototyping speed changes which conversations you have. Instead of "should we build this?" you're discussing "here's how it works — what should we change?"

Tests

AI is shockingly good at writing test code. Describe your class, your method signatures, your edge cases — and it generates test classes that actually cover the right scenarios. I've started writing tests before implementation using AI, then writing the real code myself. It's a better workflow than the other way around.
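The tests-first flow looks something like this. The function and its edge cases are a made-up example, but the shape is the point: the AI-drafted assertions pin down the behavior before a line of implementation exists:

```python
# Hypothetical example: the assertions at the bottom were "written"
# first (AI-drafted from a description of the edge cases); the
# implementation came second, by hand, to satisfy them.

def normalize_phone(raw: str) -> str:
    """Strip formatting and return E.164-ish digits.
    Assumes a US default country code for this sketch."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if len(digits) == 10:
        digits = "1" + digits
    return "+" + digits

# The AI-drafted tests that came first:
assert normalize_phone("(555) 867-5309") == "+15558675309"
assert normalize_phone("+1 555 867 5309") == "+15558675309"
assert normalize_phone("1-555-867-5309") == "+15558675309"
```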

Boilerplate

Enterprise software is ~40% boilerplate. Apex trigger handlers. REST API wrappers. Terraform modules. Data migration scripts. These follow patterns. The patterns are well-documented. AI eats this alive.

I describe the object model and API contract, and AI generates the scaffolding. I review it once, adjust naming to match our conventions, and move on to the interesting problems.
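A sketch of what that scaffolding looks like. Everything here is illustrative: the `Account` fields come from a hypothetical object model, and the repository is in-memory rather than calling a real platform API, so only the shape of the pattern is shown:

```python
from dataclasses import dataclass

@dataclass
class Account:
    # Field list taken from a (hypothetical) object model description.
    name: str
    industry: str

class InMemoryRepository:
    """Generic CRUD scaffolding of the kind AI drafts in seconds.
    A real wrapper would call the platform's REST API; a dict
    stands in for it here."""
    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def create(self, record):
        rid = self._next_id
        self._next_id += 1
        self._rows[rid] = record
        return rid

    def read(self, rid):
        return self._rows[rid]

    def delete(self, rid):
        del self._rows[rid]

repo = InMemoryRepository()
rid = repo.create(Account(name="Acme", industry="Manufacturing"))
```

The review pass is mostly renaming and convention alignment, which is far cheaper than typing it all out.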

Learning

I work across maybe six platforms in any given month — Apex, Python, TypeScript, Terraform, Azure services, Kubernetes configs. Nobody is an expert in all of those simultaneously.

When I need to write something in a stack I haven't touched recently, I describe the architecture and constraints to the AI. It generates idiomatic code. I read it, understand the patterns, and learn while producing working output. It's the best learning accelerator I've ever used.


Where I Don't

Security

I've seen AI-generated code:

  • Log PII in debug statements
  • Skip WITH SECURITY_ENFORCED on SOQL queries
  • Store API keys in config files instead of named credentials
  • Create REST endpoints with no auth middleware

None of these are exotic mistakes. They're exactly the kind of thing a junior dev might do. That's the core problem with vibe coding security-sensitive paths: you get junior-dev risk at senior-dev speed. That combination is dangerous.

Every line that touches sensitive data gets human review. No exceptions.
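To make the first failure mode concrete: the fix for PII in debug logs is usually a redaction pass before anything reaches the log pipeline. A minimal sketch, assuming email addresses are the PII in question; a real implementation would cover more identifier types:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_email(text: str) -> str:
    """Mask email addresses before the string hits any log sink.
    Illustrative fix for the 'logs PII in debug statements' failure;
    production code would also handle phones, IDs, tokens, etc."""
    return EMAIL.sub("<redacted>", text)

record = "sync failed for jane.doe@example.com, retrying"
safe = redact_email(record)
```

This is exactly the kind of line an AI won't volunteer unless asked, which is why the human review pass exists.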

Platform Constraints

This one is specific to Salesforce, but the principle applies everywhere.

Salesforce has hard governor limits — 100 SOQL queries per transaction, 150 DML statements, 6MB heap. AI-generated code routinely violates these because AI optimizes for readability and correctness, not for platform-specific resource budgets.

I've watched AI generate a perfectly logical trigger handler with a SOQL query inside a loop. Beautiful code. Completely broken at scale.
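The fix is bulkification: collect the keys, make one query for the whole batch. Sketched here in Python with an in-memory dict standing in for the database, so the query counting is simulated; on Salesforce each `fetch_one` call would be a SOQL query counted against the 100-per-transaction limit:

```python
ACCOUNTS = {1: "Acme", 2: "Globex", 3: "Initech"}
query_count = 0

def fetch_one(account_id):
    """The anti-pattern: one query per record in the loop."""
    global query_count
    query_count += 1
    return ACCOUNTS[account_id]

def fetch_many(account_ids):
    """Bulkified: one query with an IN clause for the whole batch."""
    global query_count
    query_count += 1
    return {i: ACCOUNTS[i] for i in account_ids}

records = [1, 2, 3]

# Query-in-loop: cost scales with batch size.
names_bad = [fetch_one(i) for i in records]
queries_bad = query_count

# Bulkified: constant cost regardless of batch size.
query_count = 0
names_good = fetch_many(records)
queries_good = query_count
```

Both versions return the same data. Only one survives a 200-record trigger batch.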

The pattern

Every platform has constraints that aren't visible in the language syntax: API rate limits, connection pool sizes, message queue throughput, database lock contention. AI doesn't know your production traffic patterns. It doesn't know your batch job at 2am contends with the European team's morning sync. These are the problems that make enterprise software hard — and they're exactly the problems AI can't vibe through.
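Respecting an API rate limit is a small example of encoding a constraint the AI can't see. A minimal client-side token bucket, with capacity and refill rate as illustrative numbers rather than any vendor's real limit:

```python
import time

class TokenBucket:
    """Client-side rate limiter -- the kind of platform budget that
    AI-generated integration code tends to omit. Numbers here are
    illustrative, not a real vendor limit."""
    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.tokens = capacity
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self):
        # Top up tokens based on elapsed time, then spend one if available.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Zero refill makes the cutoff visible immediately.
bucket = TokenBucket(capacity=2, refill_per_sec=0)
results = [bucket.allow() for _ in range(4)]
```

The calls that return False are the ones that would have burned your API quota or triggered a 429.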

Integration Error Handling

Enterprise architecture is mostly integration — making systems that were never designed to work together communicate reliably. AI can generate the happy path beautifully. The happy path is maybe 30% of the work.

The other 70% is: What happens when SAP is down for maintenance? How do we handle records that succeed in Salesforce but fail downstream? What's the reconciliation strategy? What do we monitor? Who gets paged?

This requires understanding specific systems, specific failure modes, and specific business impact. It's judgment, not code generation.
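The mechanical part of that 70% still has a recognizable shape: failed records go somewhere recoverable instead of vanishing. A sketch of batch processing with a dead-letter list, where `push_downstream` is a hypothetical transport and the failure is simulated:

```python
def process_batch(records, push_downstream):
    """Happy path plus the unhappy one: records that fail downstream
    land in a dead-letter list for reconciliation instead of being
    silently lost. `push_downstream` is a hypothetical transport."""
    synced, dead_letter = [], []
    for rec in records:
        try:
            push_downstream(rec)
            synced.append(rec["id"])
        except ConnectionError as exc:
            dead_letter.append({"id": rec["id"], "error": str(exc)})
    return synced, dead_letter

# Simulated downstream that rejects one record.
def flaky_push(rec):
    if rec["id"] == "r2":
        raise ConnectionError("SAP maintenance window")

synced, dlq = process_batch(
    [{"id": "r1"}, {"id": "r2"}, {"id": "r3"}], flaky_push
)
```

The code is the easy half. Deciding who drains the dead-letter queue, how often, and what "reconciled" means for the business is the judgment call AI can't make.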


The Traffic Light

Here's the framework I actually use day-to-day. I keep it simple on purpose.

Green — Vibe freely

Prototypes, test classes, boilerplate, documentation, one-off scripts, learning new stacks. If you break it, you fix it in minutes.

Yellow — Vibe, then review hard

Business logic, UI components, integration middleware (happy path), DevOps config. Let AI draft it, but read every line before it ships.

Red — You drive, AI assists

Auth, encryption, compliance code, governor-limited paths, integration error handling, schema migrations. You write the logic. AI helps with syntax and suggestions.

The color isn't about how hard the code is. It's about what happens when it's wrong.
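If you wanted to encode the framework, it fits in a lookup. The task categories and colors below come straight from the lists above; any real taxonomy would be team-specific, and the safe default for an unknown category is red:

```python
# Categories and colors from the framework above; a real team's
# taxonomy would be its own. Unknown task types default to red.
TRAFFIC_LIGHT = {
    "prototype": "green", "test_class": "green", "boilerplate": "green",
    "business_logic": "yellow", "ui_component": "yellow",
    "devops_config": "yellow",
    "auth": "red", "encryption": "red", "schema_migration": "red",
}

POLICY = {
    "green": "vibe freely",
    "yellow": "vibe, then review every line",
    "red": "you drive, AI assists",
}

def review_policy(task_type: str) -> str:
    return POLICY[TRAFFIC_LIGHT.get(task_type, "red")]
```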


What This Changes About Architecture

Here's the thing nobody talks about: vibe coding doesn't replace architects. It changes what we spend our time on.

Before AI, I spent maybe 40% of my time on implementation details — writing code, reviewing PRs for syntax issues, creating boilerplate, writing docs. The architecture work — system design, tradeoff analysis, stakeholder alignment — competed with the mechanical work for calendar space.

With AI handling the mechanical work, I spend more time on the parts that actually require experience:

  • What should this system look like? Not what code to write, but which systems should own which data, how they should communicate, and what happens when things fail.
  • Is this the right tradeoff? Build or buy? Event-driven or batch? Real-time or eventual? These decisions have enormous downstream cost, and AI can't make them for you.
  • Does everyone understand what we're building? Half of architecture is communication — making sure the CTO, the product owner, and the engineering team are looking at the same picture.

Vibe coding makes architects more like actual architects — more time designing, less time laying bricks.


The Junior Dev Question

This is the thing that worries me.

If junior developers use AI to skip past the fundamentals — if they never debug a null pointer by hand, never trace a failing integration through three systems, never feel the pain of a governor limit violation in production — do they develop the judgment to become senior engineers?

I don't have a confident answer. But I know the trajectory I want for people I mentor: use AI to go faster, but make sure you understand what it's generating. Read the code. Break the code on purpose. Understand why the AI made the choices it did.

The people who will be great engineers in five years are the ones who use AI as a learning accelerator, not a thinking replacement.


Bottom Line

I vibe code every day. It makes me measurably faster at my job.

But speed without understanding is just technical debt with extra steps. The skill isn't in using the AI — everyone can do that. The skill is in knowing when to trust it and when to think.

That judgment is the actual job.