AI App Performance Optimization
Fix slow AI-built apps: bundle size, database queries, caching, Core Web Vitals. Before/after metrics included. Get optimized.
Your AI-built application loads in 6 seconds. Your users expect 2. That 4-second gap is costing you conversions, retention, and revenue — and the performance problems in AI-generated code are specific, predictable, and fixable.
AI code generation tools optimize for functionality, not performance. They produce code that works correctly on a fast development machine with a local database and a single user. In production, with concurrent users on mobile networks hitting a remote database, that same code performs terribly.
After auditing 50 vibe coded apps, the average application scored 34 on Lighthouse Performance (out of 100). After optimization, that average rose to 89. The fixes are not complex — they are systematic. Here is what they look like and what results to expect.
Bundle Size: The First Performance Killer
AI tools import entire libraries to use a single function. The result: JavaScript bundles that are 3-5x larger than necessary.
The Problem
A typical AI-generated Next.js application ships a 1.2-2.5 MB JavaScript bundle to the browser. A well-optimized equivalent ships 200-400 KB. The difference is not application logic — it is unnecessary code.
Pattern 1: Full library imports. AI tools generate `import moment from 'moment'` (330 KB) when the application uses exactly one function: date formatting. Replacing it with `date-fns/format` (2 KB) or native `Intl.DateTimeFormat` (0 KB) achieves the same result at 0.6% of the bundle cost.
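The replacement is usually mechanical. A minimal sketch of swapping moment for the built-in Intl API (the format options shown are one hypothetical example):

```typescript
// Before (typical AI-generated code): adds ~330 KB to the bundle
// import moment from 'moment';
// const label = moment(date).format('MMM D, YYYY');

// After: zero bundle cost, built into every modern browser and Node.js
const formatDate = (date: Date): string =>
  new Intl.DateTimeFormat('en-US', {
    month: 'short',
    day: 'numeric',
    year: 'numeric',
    timeZone: 'UTC', // pin the zone so output is deterministic
  }).format(date);

console.log(formatDate(new Date(Date.UTC(2026, 0, 15)))); // "Jan 15, 2026"
```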
Pattern 2: Duplicate functionality. AI-generated apps frequently include both axios and fetch wrappers, both lodash and native array methods, both a CSS framework and inline styles. Each duplicate adds bundle weight with no functional benefit.
Pattern 3: Missing code splitting. AI tools generate applications as a single bundle. Every page loads every component, every utility, and every third-party library — even the ones that page does not use. A user visiting your landing page downloads the code for your admin dashboard, your settings page, and your payment flow.
The Fix
- Analyze the bundle. Run `npx @next/bundle-analyzer` on a Next.js app. The visual treemap shows exactly which packages consume the most space. In every AI-generated app we have analyzed, 60-80% of the bundle consists of packages that could be replaced, removed, or lazy-loaded.
- Replace heavy packages. `moment` becomes `date-fns` or native APIs. `lodash` becomes targeted imports (`lodash/debounce` instead of `lodash`). `chart.js` with all chart types becomes a focused charting library or dynamic import.
- Implement code splitting. Next.js supports dynamic imports with `next/dynamic`. Components below the fold, modals, and admin features should be dynamically imported so they load only when needed. This alone reduces initial bundle size by 40-60% in typical AI-generated apps.
- Tree-shake effectively. Ensure your build configuration supports tree shaking. Some AI-generated code patterns (re-exporting from barrel files, CommonJS imports) prevent tree shaking from eliminating dead code.
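The effect of code splitting can be sketched framework-agnostically: a module is fetched once, on first use, instead of shipping in the initial bundle. In Next.js this is what `next/dynamic` does for you; the `lazy()` helper below is a hypothetical illustration of the mechanism, not its API:

```typescript
// Hypothetical lazy() helper: defers a load until first use and caches it.
// In a real app the factory would be () => import('./HeavyChart') and the
// bundler would emit that module as a separate chunk.
function lazy<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => (cached ??= factory()); // factory runs at most once
}

let loads = 0;
const loadWidget = lazy(async () => {
  loads += 1; // stands in for fetching a split chunk over the network
  return { render: () => '<chart/>' };
});

console.log(loads); // 0 — nothing is fetched until the component is needed

loadWidget()
  .then(() => loadWidget()) // second use hits the cache, no re-fetch
  .then(() => console.log(loads)); // 1
```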
Before/after: Average initial bundle size drops from 1.8 MB to 380 KB. First Contentful Paint improves from 3.2 seconds to 1.1 seconds on 4G connections.
Image Optimization: The Overlooked Bottleneck
Images are the largest single asset type on most web pages. AI tools handle them poorly.
The Problem
AI-generated code uses `<img>` tags with full-resolution images. No responsive sizing. No format optimization. No lazy loading. A 4000x3000 pixel hero image is served to a mobile phone that renders it at 375 pixels wide. The user downloads 2 MB of image data for a view that a properly sized 200 KB image would serve.
The Fix
- Use the Next.js Image component. `next/image` automatically serves images in WebP/AVIF format, resizes for the device, and lazy-loads below-the-fold images. Replacing `<img>` with `<Image>` across an AI-generated app reduces total image transfer by 60-80%.
- Specify the `sizes` attribute. Tell the browser how wide the image will be at each breakpoint. Without `sizes`, the browser downloads the largest image variant regardless of viewport.
- Set `priority` on LCP images. The hero image or above-the-fold product image should have `priority` set to disable lazy loading. This is the image that determines your Largest Contentful Paint score.
- Compress source images. Before uploading, compress images to web-appropriate resolution. No source image needs to be larger than 2x the maximum display size. Use tools like Sharp or Squoosh to batch-compress.
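Put together, the first three fixes look like this in a Next.js component (an illustrative fragment; the file path, dimensions, and breakpoints are placeholders):

```tsx
import Image from 'next/image';

// Above-the-fold hero: `priority` disables lazy loading so it paints early.
// `sizes` lets the browser pick the smallest adequate variant per breakpoint
// instead of defaulting to the largest one.
export function Hero() {
  return (
    <Image
      src="/hero.jpg" // placeholder path
      alt="Product hero"
      width={1200}    // intrinsic dimensions reserve layout space (helps CLS)
      height={600}
      priority
      sizes="(max-width: 768px) 100vw, 50vw"
    />
  );
}
```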
Before/after: Total image transfer drops from 4.2 MB to 680 KB. Largest Contentful Paint improves from 4.8 seconds to 1.6 seconds.
Database Query Optimization
AI-generated database code is correct but catastrophically slow under load. The ORM abstraction hides the actual queries being executed, and those queries are often 10-100x slower than necessary.
The Problem
N+1 queries. The most common performance killer. A page displaying 20 items with their related data executes 21 queries: one to fetch the list, then one per item to fetch related data. At 100 concurrent users viewing that page, the database processes 2,100 queries instead of 100. This pattern is the primary cause of database locks under load in AI-built apps.
Missing indexes. AI tools create database schemas without performance-tuned indexes. A query that filters by email on a table without an index on email performs a full table scan — acceptable at 1,000 rows, unusable at 100,000 rows.
Unoptimized joins. AI-generated Prisma code uses include to eagerly load related data. A single API endpoint might load a user with their orders, each order with its items, each item with its product — pulling hundreds of rows when the frontend displays 5 fields.
The Fix
- Enable query logging. In Prisma, set `log: ['query']` in the client configuration. Observe the actual SQL being generated. In most AI-generated apps, the first developer to read the query log is shocked by the volume and complexity of queries.
- Eliminate N+1 patterns. Replace individual lookups with batch queries. In Prisma, this means using `include` at the top query level rather than fetching related data in a loop. In raw SQL, this means JOINs or subqueries.
- Add targeted indexes. Identify the 5-10 slowest queries from your slow query log. Add indexes on the columns used in WHERE, ORDER BY, and JOIN conditions. Do not index everything — each index slows write operations.
- Select only needed fields. Replace `findMany()` with `findMany({ select: { id: true, name: true, email: true } })`. Loading 20 fields when you display 3 wastes database I/O, network bandwidth, and memory.
- Implement connection pooling. Use PgBouncer, Supabase's built-in pooler, or Neon's connection pooler. Without pooling, serverless functions exhaust database connections within minutes under load.
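The shape of the N+1 fix is the same regardless of ORM. A sketch with a mocked, hypothetical data layer that counts round trips (the `db` object stands in for Prisma or raw SQL):

```typescript
// Hypothetical in-memory data layer that counts round trips to the database.
type Order = { id: number; userId: number };
const orders: Order[] = Array.from({ length: 60 }, (_, i) => ({
  id: i,
  userId: i % 20, // 3 orders per user across 20 users
}));

let queries = 0;
const db = {
  // one round trip per call — what an ORM lookup inside a loop does
  ordersForUser(userId: number): Order[] {
    queries += 1;
    return orders.filter((o) => o.userId === userId);
  },
  // one round trip for everything — what a JOIN or top-level `include` does
  allOrdersForUsers(userIds: number[]): Order[] {
    queries += 1;
    return orders.filter((o) => userIds.includes(o.userId));
  },
};

const userIds = Array.from({ length: 20 }, (_, i) => i);

// N+1 pattern: one query per user
queries = 0;
const perUser = userIds.map((id) => db.ordersForUser(id));
console.log(queries); // 20

// Batched: one query, then group in memory
queries = 0;
const grouped = new Map<number, Order[]>();
for (const o of db.allOrdersForUsers(userIds)) {
  const bucket = grouped.get(o.userId) ?? [];
  bucket.push(o);
  grouped.set(o.userId, bucket);
}
console.log(queries); // 1
```

Both versions produce identical data; only the number of round trips changes, which is why the fix is invisible to users except as speed.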
Before/after: p95 API response time drops from 2.4 seconds to 180 milliseconds. Database CPU utilization drops from 78% to 12% at equivalent traffic.
Caching Strategies
AI-generated code fetches everything from the database on every request. No caching layer. No stale-while-revalidate patterns. Every page load triggers the full query chain, regardless of whether the data has changed.
The Fix
Static content caching. Pages that do not change per-user (landing pages, blog posts, documentation) should be statically generated at build time or cached at the CDN edge. Next.js supports this with `generateStaticParams` and Incremental Static Regeneration. A blog page that takes 800ms to render from the database takes 20ms to serve from cache.
API response caching. Endpoints that return data which changes infrequently (product catalogs, configuration, user profiles) should include cache headers. `Cache-Control: public, s-maxage=60, stale-while-revalidate=600` serves cached responses for 60 seconds and refreshes in the background.
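A sketch of setting that header from a route handler (the endpoint payload is a placeholder; `cacheControl` is a hypothetical helper, not a library API):

```typescript
// Build a Cache-Control value: serve from the CDN cache for `fresh` seconds,
// then keep serving the stale copy for `stale` more seconds while the cache
// revalidates in the background.
function cacheControl(fresh: number, stale: number): string {
  return `public, s-maxage=${fresh}, stale-while-revalidate=${stale}`;
}

// In a Next.js route handler this Response would be the return value
// (assumes a runtime with the standard Response global, e.g. Node 18+):
const res = new Response(JSON.stringify({ products: [] }), {
  headers: { 'Cache-Control': cacheControl(60, 600) },
});

console.log(res.headers.get('Cache-Control'));
// "public, s-maxage=60, stale-while-revalidate=600"
```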
Database query caching. For expensive queries that are called frequently, cache results in Redis or in-memory. A dashboard query that aggregates 100,000 rows takes 2 seconds. Caching the result for 5 minutes serves it in 1 millisecond.
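A minimal in-memory version of this pattern (in production, Redis would replace the closure state so the cache survives restarts and is shared across instances; the dashboard query here is mocked):

```typescript
// Minimal TTL cache: expensive results are recomputed at most once per TTL.
function cached<T>(ttlMs: number, compute: () => T): () => T {
  let value: T | undefined;
  let expires = 0;
  return () => {
    const now = Date.now();
    if (value === undefined || now >= expires) {
      value = compute(); // e.g. the 2-second aggregation query
      expires = now + ttlMs;
    }
    return value;
  };
}

let runs = 0;
const dashboardStats = cached(5 * 60 * 1000, () => {
  runs += 1; // stands in for aggregating 100,000 rows
  return { totalOrders: 12345 };
});

dashboardStats();
dashboardStats();
dashboardStats();
console.log(runs); // 1 — two of the three calls were served from cache
```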
CDN configuration. Static assets (JavaScript, CSS, images, fonts) should be served from a CDN with long cache durations. Vercel handles this automatically for Next.js deployments. Other platforms require explicit CDN configuration.
Before/after: Server-rendered page load time drops from 1.2 seconds to 80 milliseconds for cached content. Database load drops by 70% as repeated queries hit cache.
Core Web Vitals: What Google Measures
Google uses three Core Web Vitals to evaluate page experience. AI-generated apps typically fail all three.
Largest Contentful Paint (LCP). Target: under 2.5 seconds. Measures how long it takes for the largest visible element to render. AI-generated apps average 4-6 seconds because of unoptimized images, render-blocking JavaScript, and server-side rendering delays.
Cumulative Layout Shift (CLS). Target: under 0.1. Measures visual stability — how much elements jump around during load. AI-generated apps score 0.3-0.8 because images load without reserved dimensions, fonts swap without size matching, and dynamic content inserts above the viewport.
Interaction to Next Paint (INP). Target: under 200 milliseconds. Measures responsiveness to user interactions. AI-generated apps score 300-800ms because event handlers trigger expensive re-renders, state updates are not batched, and heavy computations block the main thread.
Fixing Each Metric
LCP fixes: Preload the LCP image. Reduce server response time with caching. Eliminate render-blocking resources. Use `priority` on above-the-fold images.
CLS fixes: Set explicit width and height on all images and iframes. Use `font-display: swap` with size-adjusted fallback fonts. Reserve space for dynamic content with CSS `min-height`.
INP fixes: Debounce expensive event handlers. Use `startTransition` for non-urgent state updates. Move heavy computations to Web Workers. Virtualize long lists instead of rendering all items.
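The core of list virtualization is a small window calculation; a sketch assuming fixed-height rows (`overscan` is a hypothetical tuning knob, and libraries like react-window handle the variable-height cases):

```typescript
// Compute which rows of a long list are actually visible, so only those
// get rendered instead of all of them. Assumes fixed-height rows.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan = 2, // render a few extra rows above/below to hide scroll gaps
): { start: number; end: number } {
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(totalRows, last + overscan),
  };
}

// 10,000-row list, 600px viewport, 40px rows: render 17 rows, not 10,000.
console.log(visibleRange(0, 600, 40, 10_000)); // { start: 0, end: 17 }
```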
Before/after: Lighthouse Performance score improves from 34 to 89. All three Core Web Vitals move from "poor" (red) to "good" (green) in Google Search Console.
Lazy Loading and Code Splitting
AI tools render everything immediately. In production, users see one screen at a time. Loading everything upfront wastes bandwidth and delays the content users actually want.
Implementation
- Route-based code splitting. Next.js does this automatically for pages. Verify with bundle analysis that page-specific code is not leaking into shared bundles through barrel file imports.
- Component-level lazy loading. Modals, dropdowns, tooltips, and below-the-fold sections should load on demand. Use `next/dynamic` with a loading placeholder.
- Third-party script deferral. Analytics, chat widgets, and social embeds should load after the main content. Use `next/script` with `strategy="lazyOnload"` for non-critical third-party code.
- Data prefetching. While the user reads the current page, prefetch data for likely next actions. Next.js `<Link>` components prefetch linked pages by default. For API data, implement prefetch on hover.
The Performance Optimization Process
Our optimization follows a measurement-driven process, not a checklist. Every change is validated with before/after metrics.
- Baseline measurement. Lighthouse audit, WebPageTest from multiple regions, Core Web Vitals from real user data (Chrome UX Report). This establishes the starting point and identifies the highest-impact opportunities.
- Bundle analysis. Identify oversized packages, duplicate code, and missing code splitting. Prioritize by size reduction potential.
- Database profiling. Enable slow query logging. Identify N+1 patterns, missing indexes, and unnecessary data loading. Prioritize by query frequency and latency impact.
- Incremental optimization. Fix the highest-impact issue first. Measure the result. Fix the next highest-impact issue. Each optimization is a separate commit with documented before/after metrics.
- Validation. Final Lighthouse audit, load testing at target traffic levels, and Core Web Vitals verification. The application must score 85+ on Lighthouse Performance and pass all three Core Web Vitals thresholds.
This is the performance component of our production engineering process. Performance optimization works alongside architecture improvements and security hardening to deliver a complete production-grade application. The underlying patterns that cause these performance problems are the same ones documented in our audit of 50 vibe coded apps.
Frequently Asked Questions
How much faster will my AI-built app get after optimization?
Based on our data across 50 applications: Lighthouse Performance scores improve from an average of 34 to 89. Page load times improve by 60-80%. API response times improve by 70-90% after database optimization. These are not theoretical improvements — they are measured results from production deployments.
Which optimization has the biggest impact?
Database query optimization. In most AI-generated applications, the server spends 70-90% of response time executing database queries. Fixing N+1 patterns and adding missing indexes typically reduces API response times from seconds to milliseconds. Bundle size optimization has the second-largest impact for client-side performance.
Can I optimize performance without changing functionality?
Yes. Performance optimization does not add or remove features. It makes existing features faster. Database indexes do not change query results. Code splitting does not change what loads — it changes when it loads. Image optimization does not change what users see — it changes how quickly they see it.
How do I measure performance before and after?
Use Lighthouse (built into Chrome DevTools) for synthetic measurement. Use Google Search Console's Core Web Vitals report for real-user data. Use WebPageTest for detailed load waterfall analysis from multiple locations. For API performance, use your application monitoring tool (Sentry, Datadog) to track p50 and p95 response times by endpoint.
Does performance affect SEO?
Yes. Google uses Core Web Vitals as a ranking signal. Pages that pass all three Core Web Vitals thresholds receive a ranking boost over pages that do not. More importantly, slow pages have higher bounce rates. A page that loads in 1 second has a 9% bounce rate. A page that loads in 5 seconds has a 38% bounce rate. Each second of improvement directly impacts user engagement.
Should I optimize before or after launch?
Before, if possible. If you are weighing whether to tackle this yourself, our comparison of DIY vs hiring experts explains why performance optimization at the database level is one area where expert help saves months of learning. Users form performance expectations in their first visit. If your application is slow on day one, users associate it with poor quality. That perception is difficult to reverse even after optimization. If you have already launched, optimize now — performance improvements have immediate impact on user engagement and conversion metrics.
How much does performance optimization cost?
As part of a production engineering engagement: the performance optimization component typically represents $3,000-$8,000 of the total engagement cost, depending on the number of performance issues and their complexity. The ROI is measurable: faster pages convert better, rank higher, and retain more users.
Speed Is Not a Feature — It Is a Requirement
Every 100 milliseconds of latency costs 1% in conversion. Your AI-built app is likely 2-4 seconds slower than it needs to be. That is 20-40% of conversions you are leaving on the table.
- Apply — Tell us about your application and its performance targets.
- Measure — We establish baselines and identify the highest-impact optimizations.
- Optimize — Your app scores 85+ on Lighthouse with all Core Web Vitals passing.
Apply for a performance optimization engagement and find out exactly how much faster your application can be.