BIPI

CloudTrail data events will eat your budget if you let them

Cloud Security

Management events are cheap and mandatory. Data events on every S3 bucket and DynamoDB table will quietly add five figures a month. Here's the storage and retention model that works.

By Arjun Raghavan, Security & Systems Lead, BIPI · March 12, 2024 · 7 min read

#aws #cloudtrail #logging #cloud-security

A logistics company turned on CloudTrail data events for every S3 bucket and Lambda function 'because the auditor asked for it' and saw its CloudWatch Logs bill jump from 4k a month to 38k. The auditor had not asked for that. The auditor had asked for evidence of access to the four buckets containing customer PII. Turning on data events for everything by default is one of the most common avoidable cloud spend mistakes we encounter.

CloudTrail has three event categories (management, data and Insights) with very different price and value profiles. Treating them as the same thing is what blows the budget. Treating them differently is what gives you forensics when you need them.

Management events: always on, organisation-wide

Management events log control-plane API calls: CreateUser, AssumeRole, RunInstances, AuthorizeSecurityGroupIngress. The first copy of management events is free in every account; every additional copy is billed. Configure one organisation trail in your security account that captures management events from every member account, and turn off any per-account trails that record the same events to avoid paying for duplicates. Send the output to a single S3 bucket whose bucket policy denies object deletion, and keep it under a 7-year retention policy.
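A minimal sketch of that setup, expressed as plain Python dictionaries so it runs without AWS credentials. The trail name and bucket name are hypothetical. The first function returns keyword arguments in the shape boto3's cloudtrail create_trail call accepts; the second returns a single deny statement that would be merged into the bucket policy that already lets CloudTrail write to the bucket, rather than replacing it.

```python
import json


def org_trail_params(trail_name: str, bucket: str) -> dict:
    """Keyword arguments for an organisation-wide, multi-region trail."""
    return {
        "Name": trail_name,
        "S3BucketName": bucket,
        "IsMultiRegionTrail": True,       # capture every region
        "IsOrganizationTrail": True,      # capture every member account
        "EnableLogFileValidation": True,  # digest files for tamper evidence
    }


def deny_delete_statement(bucket: str) -> dict:
    """Bucket policy statement that denies deleting log objects."""
    return {
        "Sid": "DenyLogDeletion",
        "Effect": "Deny",
        "Principal": "*",
        "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    bucket = "example-org-trail-logs"
    print(json.dumps(org_trail_params("org-management", bucket), indent=2))
    print(json.dumps(deny_delete_statement(bucket), indent=2))
```

Keeping these as data makes them easy to review in version control before anything is applied to a real account.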

Data events: scope ruthlessly

Data events log object-level operations: GetObject, PutObject, Lambda Invoke, DynamoDB GetItem. They are charged per event and there is no free tier. A medium-sized S3 application generates millions of events per day. We enable data events on a hand-picked allow-list: buckets containing regulated data, Lambda functions with privileged execution roles, and DynamoDB tables holding PII. We also watch a small set of high-value KMS keys; note that KMS calls such as Decrypt are recorded as management events, so they arrive through the organisation trail rather than a data event selector. Everything else stays off.

  • Buckets tagged data-classification:restricted or data-classification:confidential
  • Lambda functions whose execution role has iam:PassRole or sts:AssumeRole on production roles
  • DynamoDB tables in the customer-data and finance namespaces
  • KMS keys used for envelope-encrypting database backups (KMS calls are logged as management events, so check that the trail does not exclude them)
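One way to express an allow-list like this is with CloudTrail advanced event selectors. The sketch below builds the selector list as plain data, in the shape accepted by the AdvancedEventSelectors parameter of boto3's cloudtrail put_event_selectors call. It covers S3, Lambda and DynamoDB only; the selector names, bucket names and ARNs are hypothetical, and how you discover the resources (for example by tag) is left out.

```python
def data_event_selector(name, resource_type, arn_operator, arns):
    """One advanced event selector limited to specific resource ARNs."""
    return {
        "Name": name,
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": [resource_type]},
            {"Field": "resources.ARN", arn_operator: list(arns)},
        ],
    }


def build_allow_list(bucket_names, lambda_arns, table_arns):
    """Build selectors only for the resource kinds that have entries."""
    selectors = []
    if bucket_names:
        # Object ARNs start with the bucket ARN plus a slash.
        prefixes = [f"arn:aws:s3:::{name}/" for name in bucket_names]
        selectors.append(data_event_selector(
            "restricted-buckets", "AWS::S3::Object", "StartsWith", prefixes))
    if lambda_arns:
        selectors.append(data_event_selector(
            "privileged-functions", "AWS::Lambda::Function", "Equals", lambda_arns))
    if table_arns:
        selectors.append(data_event_selector(
            "pii-tables", "AWS::DynamoDB::Table", "Equals", table_arns))
    return selectors


if __name__ == "__main__":
    # Hypothetical resources, for illustration only.
    for selector in build_allow_list(
        ["example-pii-bucket"],
        ["arn:aws:lambda:eu-west-1:123456789012:function:example-admin-task"],
        ["arn:aws:dynamodb:eu-west-1:123456789012:table/example-customer-data"],
    ):
        print(selector["Name"], len(selector["FieldSelectors"]))
```

Because nothing is selected unless it is listed, a new bucket or table generates no data events until someone adds it deliberately.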

Retention tiers that match your forensic timeline

Most incidents are detected within 30 days. Almost all are detected within 12 months. The 7-year retention requirement comes from compliance, not from the security team. We split CloudTrail retention into three tiers: hot in CloudWatch Logs or a query engine for 30 days, warm in S3 Standard for 12 months, cold in S3 Glacier Deep Archive for the rest of the compliance window. Lifecycle rules move objects automatically.

  • 30d: Hot tier in queryable storage. Sub-second forensic queries.
  • 12mo: Warm tier in S3 Standard. Athena queries, minutes not seconds.
  • 7yr: Cold tier in Glacier Deep Archive. Compliance only, retrieval in hours.
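The warm and cold tiers can be sketched as a single S3 lifecycle rule. The dictionary below follows the shape accepted by the LifecycleConfiguration parameter of boto3's s3 put_bucket_lifecycle_configuration call. The rule ID and the AWSLogs/ prefix are assumptions, and 7 years is approximated as 7 × 365 days. The 30-day hot tier is not part of this rule; in CloudWatch Logs it would be a retention setting on the log group.

```python
HOT_DAYS = 30               # CloudWatch Logs retention, set on the log group
WARM_DAYS = 365             # time in S3 Standard before archiving
COMPLIANCE_DAYS = 7 * 365   # approximate 7-year window, ignoring leap days


def trail_lifecycle(prefix: str = "AWSLogs/") -> dict:
    """Lifecycle configuration: archive after 12 months, expire after ~7 years."""
    return {
        "Rules": [
            {
                "ID": "cloudtrail-retention-tiers",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": WARM_DAYS, "StorageClass": "DEEP_ARCHIVE"},
                ],
                "Expiration": {"Days": COMPLIANCE_DAYS},
            }
        ]
    }


if __name__ == "__main__":
    rule = trail_lifecycle()["Rules"][0]
    print(rule["Transitions"], rule["Expiration"])
```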

Athena over Logs Insights for real forensics

CloudWatch Logs Insights is fine for 'what happened in the last hour'. For real incident response across months of data, Athena over the S3-backed trail is faster over long time ranges and an order of magnitude cheaper. Set up a table partitioned by region and date, store the table definition in version control, and pre-write the queries you will need under pressure: which principal called AssumeRole into role X in the last 90 days, which IPs called GetObject on bucket Y, which sessions made API calls outside business hours.
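As an illustration of a pre-written query, the sketch below generates the Athena SQL for the first question: who assumed a given role in the last 90 days. It assumes a table named cloudtrail_logs with the usual CloudTrail columns (eventsource, eventname, useridentity, sourceipaddress, requestparameters) and a string partition column named day in YYYY/MM/DD form. The table name and partition column are hypothetical and should be adjusted to your own table definition.

```python
import re
from datetime import date, timedelta

# Conservative checks so untrusted input never reaches the SQL text.
ROLE_ARN_RE = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")
TABLE_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def assume_role_query(role_arn: str, today: date, days: int = 90,
                      table: str = "cloudtrail_logs") -> str:
    """Return SQL listing callers that assumed role_arn in the last `days` days."""
    if not ROLE_ARN_RE.match(role_arn):
        raise ValueError(f"not a role ARN: {role_arn!r}")
    if not TABLE_RE.match(table):
        raise ValueError(f"not a valid table name: {table!r}")
    # String comparison works because the partition value is zero-padded.
    since = (today - timedelta(days=days)).strftime("%Y/%m/%d")
    return f"""
SELECT useridentity.arn AS caller_arn,
       sourceipaddress,
       count(*) AS calls
FROM {table}
WHERE day >= '{since}'
  AND eventsource = 'sts.amazonaws.com'
  AND eventname = 'AssumeRole'
  AND json_extract_scalar(requestparameters, '$.roleArn') = '{role_arn}'
GROUP BY 1, 2
ORDER BY calls DESC
""".strip()


if __name__ == "__main__":
    # Hypothetical role ARN, for illustration only.
    print(assume_role_query(
        "arn:aws:iam::123456789012:role/example-prod-admin", date.today()))
```

Filtering on the partition column first keeps the scan limited to the days in question, which is where most of the cost saving comes from.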

Insights events are worth their cost on management trails

CloudTrail Insights costs about 0.35 USD per 100k management events analysed. On a management-only org trail, that is usually a few hundred dollars a month for most orgs, and the value is high: it surfaces unusual API call rates and unusual error rates without you writing any detection logic. We enable it on the org management trail by default and have caught two real credential-theft incidents from it that GuardDuty missed.
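The arithmetic is easy to check against your own volumes. The sketch below uses the 0.35 USD per 100k rate quoted above; the 100 million events per month figure is an illustrative assumption, not a measurement.

```python
def insights_monthly_cost(events_analysed: int, usd_per_100k: float = 0.35) -> float:
    """Estimated monthly Insights charge in USD for a given event volume."""
    return round(events_analysed / 100_000 * usd_per_100k, 2)


if __name__ == "__main__":
    # Hypothetical volume: 100 million management events in a month.
    print(insights_monthly_cost(100_000_000))
```

At that volume the estimate is 1,000 blocks of 100k events at 0.35 USD each, or about 350 USD a month.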

The shape of a working CloudTrail setup is one organisation trail for management events, a hand-picked set of data events on sensitive resources, three retention tiers, and Athena for queries beyond 30 days. Anyone telling you to turn on data events for everything has not seen the bill.
