BIPI

Cloud Cost vs Security: Where to Spend, Where to Cut

Cloud Security

Security and cost optimization pull in opposite directions. KMS keys, VPC endpoints, CloudTrail replication, and multi-region logs all cost money. Knowing where to spend and where to cut separates a mature posture from compliance theatre.

By Arjun Raghavan, Security & Systems Lead, BIPI · March 4, 2024 · 7 min read

#cloud-security #cost-optimization #architecture

Every cost optimization review we do ends up cutting security spend by 10-20% and then reallocating most of that saving within the same security budget, toward higher-leverage controls. The pattern is consistent: teams over-spend on logging, encryption, and network controls in low-risk environments while under-investing in identity, runtime detection, and incident response in high-risk ones. Cost discipline forces priority discipline.

Where the security bill comes from

Typical security-attributable AWS spend in a 50-account organization, in rough order:

  • 35%: CloudTrail + CloudWatch Logs + S3 log storage
  • 25%: VPC endpoints, NAT gateways, Network Firewall
  • 15%: GuardDuty, Security Hub, Config
  • 10%: KMS keys and key usage
  • 15%: Everything else (Macie, Inspector, third-party SaaS)

Logging and networking dominate. That is where most over-spend hides.

CloudWatch Logs is the silent killer

CloudWatch Logs charges $0.50 per GB ingested in ap-south-1. A noisy estate of 30 services, each logging 500 GB per month, pays $7,500/month for log ingestion alone. Storage at $0.03/GB/month then accumulates on top for as long as the logs are retained.

The fix is rarely to cut log volume (logs are useful). It is to change the storage path:

  • Subscription filter from CloudWatch Logs to Kinesis Firehose to S3. S3 standard storage is $0.025/GB/month, Glacier is $0.004/GB/month.
  • CloudWatch retention set to 7-30 days for hot debugging, S3 retention for long-term.
  • Query against S3 with Athena instead of CloudWatch Logs Insights for historical investigations.
  • Drop debug-level logs at the application layer in production. Most application logs at level DEBUG never get read.
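The last bullet can be sketched with Python's standard `logging` module. This is a minimal illustration, not a prescribed setup: the logger name `app` and the `environment` argument are hypothetical, and a real service would likely read the environment from its own configuration.

```python
import logging


def configure_logging(environment: str) -> logging.Logger:
    """Return the application logger with DEBUG disabled in production.

    `environment` is a hypothetical deployment label such as
    "production" or "staging"; adapt it to your own config source.
    """
    logger = logging.getLogger("app")  # hypothetical logger name
    level = logging.INFO if environment == "production" else logging.DEBUG
    logger.setLevel(level)
    return logger


logger = configure_logging("production")

# DEBUG records are discarded in-process, so they are never shipped
# to CloudWatch Logs and never billed for ingestion.
logger.debug("cache miss for key %s", "user:42")

# Guard expensive debug payloads so they are not even built.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("full request dump: %r", {"headers": "..."})
```

Filtering at the source is the only step in the list that reduces the ingestion charge itself; the other bullets change where logs are stored and queried afterwards.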

We routinely cut CloudWatch Logs spend by 60-70% this way, with no loss of investigation capability.
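A rough way to see where the saving comes from is to model the steady-state monthly bill for both storage paths. The sketch below uses only the prices quoted in this post. It deliberately ignores Firehose delivery, S3 request, and Athena query charges, which are not quoted here, and the `debug_fraction` parameter (the share of volume that is DEBUG noise) is a hypothetical input you would have to measure for your own services.

```python
# Prices quoted in this post (USD); verify against the current price list.
CW_INGEST_PER_GB = 0.50
CW_STORAGE_PER_GB_MONTH = 0.03
S3_STANDARD_PER_GB_MONTH = 0.025
GLACIER_PER_GB_MONTH = 0.004


def cloudwatch_only(gb_per_month: float, retention_months: int) -> float:
    """Steady-state monthly cost when all logs stay in CloudWatch Logs."""
    ingest = gb_per_month * CW_INGEST_PER_GB
    storage = gb_per_month * retention_months * CW_STORAGE_PER_GB_MONTH
    return ingest + storage


def tiered(
    gb_per_month: float,
    retention_months: int,
    hot_months: int = 1,
    debug_fraction: float = 0.0,  # hypothetical share of DEBUG volume dropped
    archive_rate: float = GLACIER_PER_GB_MONTH,
) -> float:
    """Steady-state monthly cost with short CloudWatch retention plus an archive.

    Logs still pass through CloudWatch Logs, so ingestion is charged on
    whatever volume remains after DEBUG records are dropped at the source.
    """
    kept = gb_per_month * (1.0 - debug_fraction)
    ingest = kept * CW_INGEST_PER_GB
    hot = kept * hot_months * CW_STORAGE_PER_GB_MONTH
    cold = kept * max(retention_months - hot_months, 0) * archive_rate
    return ingest + hot + cold


if __name__ == "__main__":
    volume = 500 * 30  # GB/month, the example estate from this post
    before = cloudwatch_only(volume, retention_months=12)
    after = tiered(volume, retention_months=12, hot_months=1, debug_fraction=0.4)
    print(f"before: ${before:,.0f}/month, after: ${after:,.0f}/month")
```

Passing `archive_rate=S3_STANDARD_PER_GB_MONTH` models keeping the archive in S3 Standard instead of Glacier. The model also makes one point explicit: tiering alone leaves the ingestion charge untouched, so the volume dropped at the application layer matters as much as the storage path.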

CloudTrail is mostly free, except when it is not

Management events in CloudTrail are free for the first trail. Data events (S3 object-level, Lambda invocations, DynamoDB item events) cost $0.10 per 100,000 events. A busy data lake with S3 data events on every bucket generates billions of events monthly and the bill becomes meaningful.

Targeted data event logging is the right pattern: enable data events only on buckets containing sensitive data, not on intermediate processing buckets. The compliance question 'do we log S3 access' becomes 'do we log access to S3 buckets that contain customer data', which is both more accurate and much cheaper.
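One way to express targeted logging is a CloudTrail advanced event selector that matches S3 object events only under specific bucket ARNs. The sketch below builds that selector as a plain dictionary and estimates the bill at the $0.10 per 100,000 events quoted above. The bucket names are hypothetical, and the selector is only an illustration of the shape; check it against your trail before applying it.

```python
def s3_data_event_selector(bucket_names):
    """Build an advanced event selector limited to the given buckets."""
    # A trailing slash scopes the match to objects inside that bucket.
    arns = [f"arn:aws:s3:::{name}/" for name in bucket_names]
    return {
        "Name": "S3 data events on sensitive buckets only",
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
            {"Field": "resources.ARN", "StartsWith": arns},
        ],
    }


def data_event_cost(events_per_month: float, usd_per_100k: float = 0.10) -> float:
    """Monthly data event charge at the rate quoted in this post."""
    return events_per_month / 100_000 * usd_per_100k


# Hypothetical bucket names, for illustration only.
selector = s3_data_event_selector(["customer-pii-prod", "billing-exports-prod"])

# With boto3 the selector would be applied roughly like this (not run here,
# and the trail name is a placeholder):
#   boto3.client("cloudtrail").put_event_selectors(
#       TrailName="org-trail", AdvancedEventSelectors=[selector])

if __name__ == "__main__":
    print(f"2 billion events: ${data_event_cost(2_000_000_000):,.0f}/month")
    print(f"50 million events: ${data_event_cost(50_000_000):,.0f}/month")
```

The two event counts in the usage lines are made-up volumes meant to contrast logging every bucket with logging only the sensitive ones.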

VPC endpoints vs NAT gateway math

Covered in detail in our VPC endpoints post, but the short version: interface endpoints make economic sense above roughly 200 GB/month of traffic per service per AZ. Below that, NAT is cheaper. Audit actual traffic patterns via Flow Logs before provisioning endpoints. We have seen organizations with 50 interface endpoints across 10 VPCs where only 8 of those endpoints carry meaningful traffic.
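The break-even point can be sketched as a fixed hourly endpoint fee divided by the per-GB saving over NAT. The default prices below are assumptions, not figures from this post: they are typical list prices for one region and should be replaced with the current prices for yours. The model also assumes the NAT gateway already exists for other traffic, so its own hourly fee is not counted.

```python
HOURS_PER_MONTH = 730


def breakeven_gb_per_month(
    endpoint_usd_per_hour: float = 0.01,  # assumed, per endpoint per AZ
    endpoint_usd_per_gb: float = 0.01,    # assumed endpoint data processing
    nat_usd_per_gb: float = 0.045,        # assumed NAT data processing
) -> float:
    """Traffic volume above which an interface endpoint beats NAT in one AZ."""
    fixed = endpoint_usd_per_hour * HOURS_PER_MONTH
    saving_per_gb = nat_usd_per_gb - endpoint_usd_per_gb
    if saving_per_gb <= 0:
        raise ValueError("endpoint is never cheaper at these prices")
    return fixed / saving_per_gb


def cheaper_path(gb_per_month: float, **prices: float) -> str:
    """Name the cheaper path for one service in one AZ."""
    if gb_per_month > breakeven_gb_per_month(**prices):
        return "interface endpoint"
    return "NAT gateway"


if __name__ == "__main__":
    print(f"break-even: {breakeven_gb_per_month():.0f} GB/month per endpoint per AZ")
    for gb in (20, 150, 900):  # hypothetical volumes measured from Flow Logs
        print(gb, "GB/month ->", cheaper_path(gb))
```

With the assumed defaults the break-even lands close to the 200 GB/month rule of thumb. Feeding in per-service volumes measured from Flow Logs is what turns this into an audit of which endpoints earn their keep.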

KMS keys: per-key vs shared

Customer-managed KMS keys cost $1/month each, plus $0.03 per 10,000 API calls. Per-resource keys (a separate key for every S3 bucket, every RDS instance, every Secrets Manager secret) sound like a strong control, but the cost adds up. A typical estate has 200-500 customer-managed keys and an annual KMS bill of $5000-$15000.

Pragmatic pattern:

  • AWS-managed keys for low-risk encryption (most internal S3 buckets, most RDS).
  • Customer-managed keys per business unit or per high-sensitivity data type, not per resource.
  • Multi-region keys only where the workload is genuinely multi-region.
  • Key rotation enabled, but no scheduled key deletion (keys are cheap to keep, expensive if accidentally deleted).
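To compare a per-resource key layout against a per-business-unit one, a small calculation with the prices quoted above is enough. The key counts and request volume in the usage lines are hypothetical, and the sketch ignores any free request allowance.

```python
KEY_USD_PER_MONTH = 1.00      # per customer-managed key, as quoted above
USD_PER_10K_REQUESTS = 0.03   # per 10,000 API calls, as quoted above


def annual_kms_cost(customer_managed_keys: int, requests_per_month: float) -> float:
    """Yearly KMS bill: flat key fees plus request charges."""
    key_fees = customer_managed_keys * KEY_USD_PER_MONTH * 12
    request_fees = requests_per_month / 10_000 * USD_PER_10K_REQUESTS * 12
    return key_fees + request_fees


if __name__ == "__main__":
    requests = 50_000_000  # hypothetical monthly KMS API calls
    per_resource = annual_kms_cost(400, requests)   # one key per resource
    per_unit = annual_kms_cost(12, requests)        # one key per business unit
    print(f"per-resource: ${per_resource:,.0f}/year")
    print(f"per-unit:     ${per_unit:,.0f}/year")
    print(f"saving:       ${per_resource - per_unit:,.0f}/year")
```

Consolidating keys only removes the flat per-key fee. Request charges depend on how often data is encrypted and decrypted, so they stay the same unless the workload itself changes.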

Where to spend more

After cutting waste, the savings reallocate to:

  1. Identity Center upgrade if still on legacy IAM. Federation, permission set hygiene, session policies all matter more than the encryption posture for most threat models.
  2. Runtime detection (GuardDuty at minimum, Falco/Tetragon for K8s, a commercial EDR for high-value workloads). Detection in production catches what static IaC scanning misses.
  3. Incident response retainer with an IR firm. Average dwell time matters more than another network control.
  4. Security training for engineers. The single highest-leverage spend.

The right framing

Cost optimization and security are not opposed. They are both forms of organizational discipline. A cloud bill with a hundred unused KMS keys, twenty over-provisioned NAT gateways, and CloudWatch retention set to 'never' is a cloud bill that reflects a team without clear priorities. Cleaning it up usually improves the security posture as well, because the same forces that produce waste also produce risk.
