Serverless & Edge Computing

Lambda Cold Start Optimization

Cold starts are the biggest complaint about AWS Lambda. A Java function can take 5+ seconds to initialize, and even Node.js functions hit 500ms+ with heavy dependencies. We optimize your Lambda cold starts using provisioned concurrency, SnapStart, dependency tree-shaking, runtime selection, and initialization code restructuring — cutting p99 latency by 60–90% without changing your application logic.

Need this done for your project?

We implement, you ship. Async, documented, done in days.

Start a Brief

Diagnosing Cold Start Impact

Before optimizing, we measure your actual cold start footprint. Most teams overestimate cold start frequency — typically only 1–5% of invocations are cold starts. But for latency-sensitive APIs, even 1% matters.

# CloudWatch Insights query to measure cold start frequency and duration
fields @timestamp, @duration, @initDuration, @billedDuration
| filter ispresent(@initDuration)
| stats 
    count() as coldStarts,
    avg(@initDuration) as avgInit,
    pct(@initDuration, 50) as p50Init,
    pct(@initDuration, 95) as p95Init,
    pct(@initDuration, 99) as p99Init
  by bin(1h)

# Cold start percentage — filter to REPORT lines so count() counts
# invocations, not every log line the function emits
fields @timestamp
| filter @type = "REPORT"
| stats 
    sum(ispresent(@initDuration)) as coldStarts,
    count() as totalInvocations,
    (sum(ispresent(@initDuration)) / count()) * 100 as coldStartPct
  by bin(1h)

We analyze your function's initialization phase — what happens during INIT — using X-Ray subsegments. Common culprits include: loading large SDKs (AWS SDK v2 vs v3), establishing database connections, parsing configuration files, and importing heavyweight libraries.
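When X-Ray isn't wired up yet, a minimal stand-in is to timestamp the module scope directly: everything at module scope runs once per cold start, before the handler. This sketch is illustrative; the `config` parsing stands in for whatever heavy init work a real function does.

```typescript
// Sketch: measure the INIT phase by timestamping module-scope work.
// Module scope runs exactly once per cold start, before the first invocation.
const initStart = process.hrtime.bigint();

// Stand-in for heavy init work (real functions: SDK clients, config parsing)
const config = JSON.parse(JSON.stringify({ table: 'orders', region: 'us-east-1' }));

const initMs = Number(process.hrtime.bigint() - initStart) / 1e6;

export const handler = async (): Promise<{ initMs: number; table: string }> => {
  // initMs is fixed after the cold start; warm invocations skip module scope
  return { initMs, table: config.table };
};
```

Logging `initMs` from the handler lets you correlate self-measured init time against the `@initDuration` CloudWatch reports.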

Dependency Optimization & Bundling

The single most effective cold start optimization is reducing your deployment package size. We restructure your code to minimize what loads during initialization.

// esbuild.config.ts — Aggressive tree-shaking
import { build } from 'esbuild';

await build({
  entryPoints: ['src/handlers/create-order.ts'],
  bundle: true,
  minify: true,
  platform: 'node',
  target: 'node20',
  outfile: 'dist/create-order/index.mjs',
  format: 'esm',
  treeShaking: true,
  // Exclude AWS SDK v3 clients from the bundle: the nodejs20.x
  // runtime ships SDK v3, so externalizing them cuts package size.
  // Only @aws-sdk/* is provided by the runtime; bundle everything else.
  external: [
    '@aws-sdk/client-dynamodb',
    '@aws-sdk/lib-dynamodb',
  ],
});
  • AWS SDK v3 modular imports — Import only @aws-sdk/client-dynamodb instead of the entire SDK. This alone cuts 30–50MB from your bundle.
  • ESM format — Use .mjs extension with ESM output for faster module resolution in Node.js 20+.
  • Lazy imports — Move rarely-used dependencies inside the handler function instead of top-level imports.
  • Lambda Layers — Move stable dependencies to a Layer so they are cached across deployments and shared across functions.
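The lazy-import pattern in particular is easy to apply. A minimal sketch, using node:zlib as a stand-in for a heavy third-party library (the event shape here is hypothetical):

```typescript
// Lazy-import pattern: the heavy dependency loads only on the rare code
// path, not during INIT. node:zlib stands in for a heavy library.
export const handler = async (event: { compress?: boolean; body: string }) => {
  if (event.compress) {
    // Paid for only when this branch first runs; the module stays cached
    // for subsequent warm invocations
    const { gzipSync } = await import('node:zlib');
    return { body: gzipSync(event.body).toString('base64'), compressed: true };
  }
  return { body: event.body, compressed: false };
};
```

The hot path never pays the import cost, and warm invocations that do hit the branch reuse Node's module cache.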

After optimization, a typical Node.js function goes from 15MB to under 1MB, cutting cold starts from 800ms to 150ms.

Provisioned Concurrency & SnapStart

For latency-critical functions, we configure provisioned concurrency to keep warm instances ready. For Java functions, we enable SnapStart to snapshot the initialized JVM.

# Node.js API function: kept warm with provisioned concurrency
resource "aws_lambda_function" "api" {
  function_name = "api-handler"
  runtime       = "nodejs20.x"
  handler       = "index.handler"
  memory_size   = 1024  # More memory = more CPU = faster init
  timeout       = 30
  publish       = true  # Provisioned concurrency requires a published version
}

# Java function: SnapStart restores a snapshot of the initialized JVM
resource "aws_lambda_function" "worker" {
  function_name = "order-worker"
  runtime       = "java21"
  handler       = "com.example.Handler::handleRequest"
  memory_size   = 1024
  publish       = true  # SnapStart applies to published versions only

  snap_start {
    apply_on = "PublishedVersions"  # Java/Python/.NET only, not Node.js
  }
}

resource "aws_lambda_provisioned_concurrency_config" "api" {
  function_name                     = aws_lambda_function.api.function_name
  qualifier                         = aws_lambda_function.api.version
  provisioned_concurrent_executions = 5
}

# Auto-scaling provisioned concurrency based on utilization
resource "aws_appautoscaling_target" "lambda" {
  max_capacity       = 50
  min_capacity       = 5
  resource_id        = "function:${aws_lambda_function.api.function_name}:${aws_lambda_function.api.version}"
  scalable_dimension = "lambda:function:ProvisionedConcurrency"
  service_namespace  = "lambda"
}

resource "aws_appautoscaling_policy" "lambda" {
  name               = "lambda-pc-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.lambda.resource_id
  scalable_dimension = aws_appautoscaling_target.lambda.scalable_dimension
  service_namespace  = aws_appautoscaling_target.lambda.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value = 0.7  # Scale when 70% of PC is utilized
    predefined_metric_specification {
      predefined_metric_type = "LambdaProvisionedConcurrencyUtilization"
    }
  }
}

Provisioned concurrency costs money, so we right-size it based on your traffic patterns. For predictable traffic, fixed PC works. For variable traffic, we use Application Auto Scaling with target tracking at 70% utilization.
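Right-sizing follows from Little's law: required concurrency is roughly peak requests per second times average duration in seconds, divided by the utilization target so auto scaling has headroom before cold starts appear. A sketch with illustrative numbers (not figures from this document):

```typescript
// Back-of-envelope provisioned concurrency sizing via Little's law:
// concurrency = peak RPS x average duration (seconds), then divide by the
// target utilization (0.7 here, matching the auto-scaling policy above).
function provisionedConcurrency(
  peakRps: number,
  avgDurationMs: number,
  targetUtilization = 0.7,
): number {
  const inFlight = peakRps * (avgDurationMs / 1000);
  return Math.ceil(inFlight / targetUtilization);
}
```

For example, 100 RPS at 200ms average duration is 20 concurrent executions in flight, so about 29 provisioned instances at a 70% utilization target.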

Memory Tuning & Runtime Selection

Lambda allocates CPU proportionally to memory. A 128MB function gets a fraction of a vCPU; a 1769MB function gets a full vCPU. More memory often means faster execution and lower total cost because you are billed per GB-second.
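The GB-second billing math is worth making concrete. A minimal sketch, using the published x86 on-demand rate at time of writing (check current pricing before relying on it):

```typescript
// Lambda bills per GB-second, so faster execution can fully offset higher
// memory. Rate below is the published x86 on-demand price at time of writing.
function invocationCost(
  memoryMb: number,
  durationMs: number,
  pricePerGbSec = 0.0000166667,
): number {
  return (memoryMb / 1024) * (durationMs / 1000) * pricePerGbSec;
}

// 128 MB for 1000 ms and 512 MB for 250 ms both consume 0.125 GB-seconds:
// 4x the memory at a quarter of the duration is cost-neutral, and any
// speedup beyond 4x makes the higher memory setting strictly cheaper.
```

This is why memory tuning is rarely a cost/latency trade-off: below the point where the function stops being CPU-bound, more memory tends to be faster at the same or lower cost.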

# AWS Lambda Power Tuning — find optimal memory
# Deploy the Step Function from:
# github.com/alexcasalboni/aws-lambda-power-tuning

# Input configuration
{
  "lambdaARN": "arn:aws:lambda:us-east-1:123456789:function:api-handler",
  "powerValues": [128, 256, 512, 768, 1024, 1536, 2048, 3008],
  "num": 50,
  "payload": { "path": "/api/orders", "method": "GET" },
  "strategy": "cost",
  "autoOptimize": true
}

We run AWS Lambda Power Tuning on every function to find the memory sweet spot. A common result: moving from 128MB to 512MB cuts execution time by roughly 70% at about the same cost, since billing is per GB-second and 4x the memory at a quarter of the duration nets out even. CPU-bound functions that speed up by more than 4x actually get cheaper at the higher setting.

Runtime selection matters too: Node.js 20 and Python 3.12 have the fastest cold starts (100–200ms). Java with SnapStart achieves 200–400ms. Go compiles to a single binary with sub-100ms cold starts. We recommend the runtime that matches your team's expertise, with cold start optimization applied on top.

Why Anubiz Engineering

100% async — no calls, no meetings
Delivered in days, not weeks
Full documentation included
Production-grade from day one
Security-first approach
Post-delivery support included

Ready to get started?

Skip the research. Tell us what you need, and we'll scope it, implement it, and hand it back — fully documented and production-ready.