
Lambda’s pricing looks deceptively simple: pay per request, pay per millisecond of compute. But after running serverless workloads in production for several years, I can tell you the real cost is hiding in places most optimization guides don’t mention.
This isn’t another generic “right-size your memory” listicle. These are 10 tips ordered by impact, starting with the ones most teams miss entirely. Each includes the specific cost impact and concrete steps to implement it.
TL;DR: Reduce AWS Lambda costs by setting CloudWatch log retention, right-sizing memory (CPU scales with it), switching to Graviton/arm64, batching event sources, removing sync Lambda-to-Lambda calls, and eliminating unnecessary NAT Gateway traffic. For steady high-volume workloads, compare containers/EC2 and Lambda Managed Instances.
1. Tame Your CloudWatch Logs First
This is the tip most optimization guides skip, and it’s often the biggest win.
In one production workload we reviewed, CloudWatch Logs were 2.3x the Lambda compute cost. The function wasn’t expensive. The logging was. This isn’t unusual: CloudWatch Logs can account for a surprisingly large share of a serverless bill.
AWS charges for log ingestion and storage (pricing varies by region and log class). By default, Lambda log groups have no retention policy, so every log line is kept forever until you set one. Since May 2025, Lambda logs support tiered pricing for the Standard and Infrequent Access log classes, which can reduce costs at scale, but the default “keep everything forever” behavior still catches most teams.
What to do:
- Set log retention to 30 days (or less) on every Lambda log group. The default “Never expire” is a cost leak. Set it in your IaC to prevent drift:
```yaml
# CloudFormation / SAM
RetentionInDays: 30
```

```bash
# Or via CLI for existing log groups
aws logs put-retention-policy \
  --log-group-name /aws/lambda/your-function \
  --retention-in-days 30
```
- Drop your log level to WARN or ERROR in production. DEBUG logging in prod is burning money
- Consider Lambda Extensions to route logs directly to S3 or a third-party service, bypassing CloudWatch ingestion entirely
- Use CloudWatch Infrequent Access log class for Lambda logs you rarely query, as it’s significantly cheaper
- Sample INFO logs in high-throughput functions (e.g., log 1% of requests) and keep full logging only on errors and timeouts
- Watch CloudWatch Logs Insights usage, since queries cost money too. Log less and query less is the combo
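A minimal sketch of the sampling idea, assuming Python with the stdlib logging module (the function names and the 1% rate are illustrative, not a prescribed API):

```python
import logging
import random

logger = logging.getLogger()
logger.setLevel(logging.INFO)

SAMPLE_RATE = 0.01  # keep roughly 1% of INFO logs

def log_info_sampled(message, rate=SAMPLE_RATE):
    """Emit an INFO log for roughly `rate` of all calls."""
    if random.random() < rate:
        logger.info(message)

def handler(event, context):
    # High-volume INFO logging is sampled...
    log_info_sampled(f"processing event {event.get('id')}")
    try:
        ...  # business logic
    except Exception:
        # ...but errors are always logged in full, never sampled
        logger.exception("handler failed")
        raise
```

Errors stay fully observable while the high-volume happy path stops paying full ingestion price.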
Cost impact: Can reduce your total serverless bill by 30-50% if logging is your dominant cost driver.
2. Right-Size Memory (But Understand the CPU Trade-off)
Here’s what most guides get wrong about Lambda memory: memory controls CPU. Lambda allocates CPU linearly with memory, so you get one full vCPU at 1,769 MB. Below that, you’re running on a fraction of a core.
This means more memory can actually be cheaper for CPU-bound functions. A function at 256 MB might take 800ms to execute. The same function at 1024 MB might finish in 200ms. Because Lambda bills on GB-seconds, 4x memory at 1/4 duration often results in similar or even lower total cost, with significantly better latency.
For I/O-bound functions (waiting on API calls, database queries), the opposite is true. Extra memory is wasted money because your function is just sitting idle waiting on the network.
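The GB-seconds math from the example above can be checked in a few lines (assuming the x86 us-east-1 duration price of roughly $0.0000166667 per GB-second; verify current pricing for your region):

```python
GB_SECOND_PRICE = 0.0000166667  # x86, us-east-1 -- an assumption, check current rates

def invocation_cost(memory_mb, duration_ms):
    """Duration cost of a single invocation, excluding the request charge."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * GB_SECOND_PRICE

# The CPU-bound example: same work, two memory settings
slow = invocation_cost(256, 800)   # 0.25 GB * 0.8 s = 0.2 GB-seconds
fast = invocation_cost(1024, 200)  # 1 GB * 0.2 s = 0.2 GB-seconds, 4x faster
```

Both configurations cost the same per invocation here, but the 1024 MB version returns in a quarter of the time.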
What to do:
- Run AWS Lambda Power Tuning on your top 10 most-invoked functions. It tests multiple memory configurations and shows you the cost-performance curve
- Look for CPU-bound functions running below 1,769 MB, as they’re likely under-provisioned
- Look for I/O-bound functions running above 512 MB, as they’re likely over-provisioned
Cost impact: Right-sizing over-provisioned functions (e.g., 2048 MB → 512 MB for I/O-bound work) can save up to 75% on duration cost per function.
3. Switch to Graviton2 (ARM)
This is the closest thing to free money in Lambda optimization: 20% lower price per GB-second, plus up to 19% better performance. That’s up to 34% better price-performance from a config change.
For most runtimes (Python, Node.js, Java, .NET), switching is a one-line change in your function configuration. No code modifications needed.
What to do:
- Change your function’s architecture from `x86_64` to `arm64` in the Lambda console or your IaC template
- Test thoroughly if you use compiled native dependencies (C extensions, shared libraries), as these need ARM-compatible builds
- Python packages with C extensions (numpy, pandas, Pillow) work fine on ARM via Lambda layers or container images
- If you build container images, ensure your Docker base image supports `arm64`. Most official AWS base images do, but third-party images may not, so check before deploying
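If you deploy with AWS SAM, the switch is a single property (the function name, runtime, and handler here are illustrative):

```yaml
# template.yaml -- resource and handler names are illustrative
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.12
      Architectures:
        - arm64   # was x86_64; this one property moves the function to Graviton
```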
Cost impact: 20% savings on duration cost, immediately. Combined with performance improvements, real-world savings of 20-34%.
4. Optimize Function Duration Aggressively
Lambda bills in 1ms increments, so every millisecond you shave off execution time directly reduces cost. Small optimizations compound fast at scale.
What to do:
- Reuse connections. Initialize SDK clients, database connections, and HTTP clients outside the handler function. They persist across warm invocations:
```python
import boto3
import requests

# Initialized ONCE, reused across warm invocations
s3_client = boto3.client("s3")
session = requests.Session()  # reuses TCP connections via keep-alive

def handler(event, context):
    # s3_client and session are already warm
    ...
```
- Minimize cold starts. Keep deployment packages small, use layers for large dependencies, and consider SnapStart for Java 11+, Python 3.12+, and .NET 8+ functions
- Lazy-load dependencies. If your function has multiple code paths, only import heavy libraries when the specific path is triggered
- Profile before optimizing. Use AWS X-Ray to identify where time is actually spent. Don’t guess
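A minimal sketch of the lazy-loading pattern in Python. The `statistics` module stands in for a genuinely heavy dependency like pandas, and the event shape is hypothetical:

```python
import json

_stats = None  # slot for the heavy dependency, filled on first use

def handler(event, context):
    # Cheap path: most invocations never touch the heavy import
    if event.get("action") != "aggregate":
        return {"statusCode": 200, "body": json.dumps({"ok": True})}

    # Expensive path: import only when this branch actually runs.
    # The import cost is paid once per container, then cached.
    global _stats
    if _stats is None:
        import statistics
        _stats = statistics
    mean = _stats.mean(event["values"])
    return {"statusCode": 200, "body": json.dumps({"mean": mean})}
```

Cold starts on the common path stay fast because the heavy module is never imported there.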
Cost impact: Varies widely, but a 30% duration reduction at scale (say, 10M invocations/month at 512 MB) saves roughly $25/month on duration alone (assuming ~1s average duration). Multiplied across dozens of functions, it adds up.
5. Kill Synchronous Lambda-to-Lambda Calls
This is one of the most expensive anti-patterns in serverless, and it’s surprisingly common. When Lambda A synchronously invokes Lambda B, you’re paying for both functions simultaneously. Lambda A sits idle (and billed) while waiting for Lambda B to respond.
What to do:
- Replace synchronous `invoke()` calls with asynchronous patterns: SQS queues, SNS topics, or Step Functions
- Use `InvocationType: 'Event'` for fire-and-forget invocations where you don’t need the response
- If you need orchestration, Step Functions can be cheaper than keeping a Lambda waiting. Standard workflows are priced per state transition; Express workflows add duration and request components but are designed for high-volume, short-duration work
Cost impact: Eliminates double-billing. If Lambda A waits 500ms for Lambda B, you’re paying for 500ms of idle compute on every invocation. The dollar amount per function pair is modest, but the real cost is architectural. This pattern compounds across dozens of functions in a typical microservices setup and increases tail latency throughout your system.
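To put a number on that idle compute, here is a back-of-envelope sketch (assuming the x86 us-east-1 price of ~$0.0000166667 per GB-second; adjust for your region and architecture):

```python
GB_SECOND_PRICE = 0.0000166667  # x86, us-east-1 -- an assumption, verify current rates

def idle_cost(invocations, wait_ms, memory_mb):
    """Monthly cost of Lambda A sitting idle while waiting on Lambda B."""
    gb_seconds = invocations * (wait_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_PRICE

# 10M invocations/month, 500ms of waiting, 512 MB caller
wasted = idle_cost(10_000_000, 500, 512)  # roughly $42/month of pure waiting
```

Modest for one function pair, as the text says, but multiply it by every sync hop in a microservices call chain.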
6. Batch Your Event Sources
Fewer invocations means fewer request charges ($0.20/million) and fewer cold starts. Most event source mappings support batching, but many teams leave the defaults.
What to do:
- SQS: Increase `BatchSize` from the default 10 to up to 10,000 for standard queues (FIFO queues max out at 10), subject to payload limits (large messages reduce practical batch size). Set `MaximumBatchingWindowInSeconds` to collect messages before invoking
- Kinesis/DynamoDB Streams: Use `BatchSize` and `BisectBatchOnFunctionError` for efficient processing
- S3: Use S3 Event Notifications with SQS to batch object events instead of invoking Lambda per-object
- API Gateway: For non-real-time endpoints, consider buffering requests through SQS and processing in batches
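As a sketch, an SQS-triggered function in AWS SAM with batching tuned (the queue and event names are illustrative):

```yaml
# SAM event source mapping -- queue and event names are illustrative
Events:
  OrdersQueue:
    Type: SQS
    Properties:
      Queue: !GetAtt OrdersQueue.Arn
      BatchSize: 100                      # default is 10
      MaximumBatchingWindowInSeconds: 30  # wait up to 30s to fill a batch
```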
Cost impact: Increasing SQS batch size from 10 to 100 reduces invocations by ~90%, saving proportionally on request charges and amortizing cold start overhead.
7. Optimize Data Transfer and Networking
Data transfer costs are silent budget killers, especially when Lambda functions live in a VPC.
NAT Gateway is the biggest offender. Pricing varies by region, but in US regions it’s typically $0.045/hour (~$32/month just to exist) plus $0.045/GB of data processed. In one system with high-volume event ingestion, we found NAT Gateway charges exceeded Lambda compute by 2x. The functions were cheap, but the networking wasn’t.
What to do:
- Use VPC endpoints for AWS services instead of routing through NAT Gateway. Important distinction: Gateway endpoints (S3, DynamoDB) are free. Interface endpoints (PrivateLink for SQS, Secrets Manager, etc.) have hourly and data processing charges per AZ. They’re still often cheaper than NAT Gateway for steady traffic, but not free
- Keep Lambda functions and their data sources in the same region and Availability Zone
- Only put Lambda in a VPC if it actually needs to access VPC resources. Functions that only call public APIs don’t need VPC access
- If you need a NAT Gateway, share one across functions rather than deploying per-subnet
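For example, a free Gateway endpoint for S3 in CloudFormation (resource names and references are illustrative):

```yaml
# CloudFormation -- a Gateway endpoint routes S3 traffic around the NAT Gateway
S3GatewayEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    VpcId: !Ref MyVpc
    ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
    VpcEndpointType: Gateway
    RouteTableIds:
      - !Ref PrivateRouteTable
```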
Cost impact: Replacing a NAT Gateway with VPC endpoints can save $32+/month in fixed costs plus $0.045/GB on data processing.
8. Set Timeouts and Concurrency Guardrails
These are free guardrails that prevent runaway costs.
Lambda’s default timeout is 3 seconds, but many teams increase it to the maximum (15 minutes) “just in case.” This means a buggy function (say, an infinite loop or a hanging external call) burns money for the full 15 minutes before Lambda kills it.
What to do:
- Set timeouts to 2-3x your function’s actual average execution time. If your function normally runs in 2 seconds, a 10-second timeout is plenty
- Use reserved concurrency to cap maximum parallel executions per function. This puts a hard ceiling on per-function spend
- Set up AWS Budgets alerts for Lambda cost anomalies to catch runaway functions before the bill arrives
- If you use Provisioned Concurrency for latency-sensitive functions, track its cost separately. It’s an explicit capacity charge, not a “free” guardrail
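In SAM, both guardrails are one property each (the function name and values are illustrative; size the timeout to your own observed average):

```yaml
# SAM -- guardrail settings; the function name and values are illustrative
Resources:
  CheckoutFunction:
    Type: AWS::Serverless::Function
    Properties:
      Timeout: 10                        # function averages ~2s; 10s gives headroom, not 900
      ReservedConcurrentExecutions: 50   # hard ceiling on parallel executions
```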
Cost impact: Primarily defensive. A single buggy function running at max timeout with uncapped concurrency can generate thousands in unexpected charges in hours.
9. Use Compute Savings Plans
If you have steady Lambda usage, Compute Savings Plans offer up to 17% off duration charges in exchange for a 1-year or 3-year usage commitment.
What to do:
- Check your Lambda usage in AWS Cost Explorer and look for consistent baseline usage over the past 3+ months
- Start with a 1-year plan covering 50-70% of your baseline (leave headroom for variability)
- Compute Savings Plans also cover EC2 and Fargate, so they’re flexible if your architecture changes
When it’s NOT worth it: If your Lambda usage is spiky, seasonal, or still growing unpredictably, you’ll over-commit and pay for unused capacity. Get stable first, then commit.
Cost impact: Up to 17% savings on Lambda duration charges. On a $1,000/month Lambda duration bill, that’s $170/month.
10. Know When Lambda Is the Wrong Tool
This is the tip no vendor-sponsored guide will give you: sometimes the cheapest Lambda optimization is moving off Lambda entirely.
Lambda’s per-invocation pricing is brilliant for spiky, unpredictable workloads. But for sustained, high-volume, predictable traffic, the math stops working in your favor.
The evidence:
- A real-world case study showed migrating 40 Lambda functions to containers reduced the monthly bill from $9,400 to $2,530, a 73% savings
- For sustained image processing workloads, EC2 spot instances were 33-59% cheaper than Lambda
- Lambda Managed Instances (launched November 2025) offer a middle ground: essentially Lambda’s control plane and developer experience with EC2’s pricing model. You pay EC2 instance cost + 15% management fee instead of per-invocation pricing. Reserved Instances and Savings Plans apply to the EC2 portion (not the management fee). Currently available in select regions (us-east-1, us-east-2, us-west-2, ap-northeast-1, eu-west-1)
Rule of thumb: If your Lambda functions consistently run at high concurrency with predictable traffic, run the numbers on containers, EC2, or Lambda Managed Instances. The break-even point varies by workload, but if you’re spending $5,000+/month on Lambda compute, it’s worth the analysis.
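A rough way to run those numbers. Every price here is an assumption to replace with your own region’s rates, and the instance type and workload shape are made up:

```python
# All prices are assumptions (us-east-1, on-demand) -- verify current rates
GB_SECOND_PRICE = 0.0000166667   # Lambda x86 duration
REQUEST_PRICE = 0.20 / 1_000_000 # Lambda per-request charge
EC2_HOURLY = 0.0336              # e.g. a t4g.medium-class instance
HOURS_PER_MONTH = 730

def lambda_monthly(invocations, duration_ms, memory_mb):
    duration = invocations * (duration_ms / 1000) * (memory_mb / 1024) * GB_SECOND_PRICE
    return duration + invocations * REQUEST_PRICE

def ec2_monthly(instance_count):
    return instance_count * EC2_HOURLY * HOURS_PER_MONTH

# Sustained workload: 100M invocations/month, 200ms at 512 MB,
# versus three always-on instances that could absorb the same traffic
serverless = lambda_monthly(100_000_000, 200, 512)
containers = ec2_monthly(3)
```

The comparison ignores operational overhead, autoscaling headroom, and idle capacity, so treat it as a first filter, not a decision.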
For a deeper dive on when to choose what, see my Serverless vs Containers: A Decision Framework.
Where to Start
Don’t try to implement all 10 at once. Here’s the priority order:
- Tips 1-3 (CloudWatch Logs, memory right-sizing, Graviton2): these are the highest impact with the least effort. Start here.
- Tips 4-7 (duration, async patterns, batching, networking): require code or architecture changes, but offer significant savings at scale.
- Tips 8-9 (guardrails, Savings Plans): defensive and commitment-based. Implement once the above are stable.
- Tip 10 (moving off Lambda): only relevant for high-volume, predictable workloads. Evaluate quarterly.
Start by sorting Cost Explorer by service spend. Verify whether Logs or NAT Gateway is actually your top line item before touching code.
Quick-start checklist:
- Set CloudWatch log retention (30 days) on all Lambda log groups
- Switch top functions to arm64 and validate dependencies
- Run Power Tuning on your top 10 most-invoked functions
- Remove sync Lambda-to-Lambda calls in hot paths
- Replace NAT Gateway with VPC endpoints where possible
Every time we’ve done a serverless cost review, the biggest win came from fixing the adjacent services (logs and networking), not the Lambda code itself.
Have a CPU-bound Lambda function that’s slow and expensive? See how we solved that with parallel processing in AWS Lambda with Python, including the memory-CPU relationship that most guides get wrong.
Key Takeaways
- CloudWatch Logs and NAT Gateway often cost more than Lambda compute
- Memory controls CPU, so more memory can reduce cost for CPU-bound workloads
- arm64 (Graviton2) is the easiest immediate cost win
- Architectural patterns matter more than micro-optimizations
- For sustained workloads, run the math on containers or Lambda Managed Instances
FAQ: AWS Lambda Cost Optimization
What is the biggest hidden cost in AWS Lambda? CloudWatch log ingestion and retention are often the largest hidden cost. In many production workloads, logging costs exceed Lambda compute costs, sometimes by 2-3x. Set retention policies and reduce log verbosity in production.
Is increasing Lambda memory always more expensive? No. Lambda allocates CPU proportionally to memory. For CPU-bound functions, more memory means more CPU, faster execution, and sometimes lower total cost. Use AWS Lambda Power Tuning to find the optimal memory setting.
How much can Graviton (ARM) reduce Lambda cost? ARM-based Lambda functions are priced 20% lower per GB-second and often perform up to 19% better, resulting in up to 34% better price-performance. For most runtimes, switching is a configuration change.
When should I move off Lambda? When your workloads are sustained, predictable, and high-concurrency. Case studies show 33-73% savings by migrating to containers or EC2 for steady-state workloads. Lambda Managed Instances (Nov 2025) offer a hybrid option.