BIPI

IaC Security Scanning at PR Time: Checkov, tfsec, KICS in 2024

Static IaC scanning catches real misconfigurations but throws false positives that grind engineering teams down. Custom policies, suppression hygiene, and reporting-not-blocking get scanning adopted instead of resented.

By Arjun Raghavan, Security & Systems Lead, BIPI · February 28, 2024 · 7 min read

Static analysis of Terraform, CloudFormation, and Kubernetes manifests catches a meaningful subset of cloud misconfigurations before they reach production. We deploy Checkov, tfsec (now part of Aqua's Trivy), KICS, or Snyk IaC on every Terraform pipeline we touch. We also spend significant time tuning them so engineers do not start hating the security team within two months.

What the tools catch reliably

All the major IaC scanners catch the same core set of issues well: public S3 buckets, security groups open to 0.0.0.0/0, unencrypted EBS volumes, IAM policies with wildcards, RDS instances without backup retention, and EKS clusters without control plane logging. These are well-defined patterns with low false positive rates and high real-world impact.

Where the tools diverge is in the long tail. Checkov has the largest rule set (over 1,500 rules across providers in 2024) but the highest false positive rate. tfsec (Trivy IaC) has stricter Terraform parsing and fewer false positives but a smaller rule set. KICS has the strongest CloudFormation support. Snyk IaC has the best UI for managing findings, but its rule set is smaller than Checkov's.

False positives are the killer

The fastest way to lose adoption is to ship a security scanner that flags 50 issues on every PR and requires the engineer to look up each one. We see this constantly:

Rules that flag 'no MFA on IAM user' when the module does not create IAM users (false positive due to misparsing).
Rules that flag 'CloudTrail not encrypted' when encryption is configured one level up, in the parent module that includes this one.
Rules that flag 'public S3 bucket' on a bucket that is intentionally public for a static website (no exception mechanism in the scanner).
Generic 'avoid wildcards in IAM' rules that flag legitimate read-only policies.

Without a suppression mechanism and tuning, the noise drowns the signal.

Suppression hygiene

Checkov supports skip comments in Terraform: # checkov:skip=CKV_AWS_18:Reason here. This is the right pattern, but we enforce three rules on suppressions:

Every suppression needs a reason in the comment. 'False positive' is not a reason. 'Bucket is intentionally public for static website hosting' is a reason.
Suppressions are reviewed at PR time by the security team if they touch high-severity rules.
Bulk suppression files (a top-level .checkov.yml with 200 skip entries) get flagged for review. If you need to skip 200 rules globally, the wrong tool is being applied to the wrong codebase.
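
The first rule is easy to check mechanically. The following is a minimal sketch, assuming suppressions use the inline comment syntax shown above; the function names and the list of lazy reasons are hypothetical and not part of Checkov.

```python
import re
from dataclasses import dataclass

# Matches inline skips such as "# checkov:skip=CKV_AWS_18:Reason here".
SKIP_RE = re.compile(
    r"#\s*checkov:skip=(?P<check_id>[A-Za-z0-9_]+)(?::(?P<reason>.*))?$"
)

# Hypothetical list of "reasons" that explain nothing; extend as needed.
LAZY_REASONS = {"", "false positive", "fp", "n/a", "todo", "reason here"}


@dataclass
class Suppression:
    line_no: int
    check_id: str
    reason: str


def find_suppressions(terraform_source: str) -> list[Suppression]:
    """Collect every inline skip comment with its line number and reason."""
    found = []
    for line_no, line in enumerate(terraform_source.splitlines(), start=1):
        match = SKIP_RE.search(line.rstrip())
        if match:
            reason = (match.group("reason") or "").strip()
            found.append(Suppression(line_no, match.group("check_id"), reason))
    return found


def lint_suppressions(suppressions: list[Suppression]) -> list[str]:
    """Return one message per suppression whose reason is missing or lazy."""
    problems = []
    for s in suppressions:
        if s.reason.lower().rstrip(".") in LAZY_REASONS:
            problems.append(
                f"line {s.line_no}: {s.check_id} suppressed without a real reason"
            )
    return problems
```

Run against the changed .tf files in a PR, a non-empty result from lint_suppressions could fail the check or simply be posted as a comment. The same find_suppressions output can also feed a periodic inventory of every suppression in the codebase.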

Quarterly we run a report of all suppressions in the codebase and review whether they are still valid. Suppressions added two years ago for a specific reason often outlive the reason.

Custom policies are where the real value is

Out-of-the-box rules cover generic best practices. Organization-specific policies are where IaC scanning earns its keep:

All S3 buckets must have tags Environment, Owner, DataClassification.
All RDS instances must have backup retention >= 7 days in prod, >= 1 day in dev.
All Lambda functions must use a specific log group naming convention so log retention policies apply.
All security groups must reference other SGs, not CIDR blocks (with explicit exceptions for ingress from VPN).

Checkov supports custom policies in YAML or Python. tfsec supports custom checks in Rego. Both have a learning curve, but the policies that come out of organization-specific reviews are the highest-value rules in the scan.
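
To make the shape of such a policy concrete, here is a sketch of the first two rules written against Terraform's plan JSON (the output of terraform show -json on a saved plan) using only the standard library. It is deliberately not written against Checkov's or tfsec's policy APIs; the function name, the environment argument, and the message wording are assumptions for illustration.

```python
# Tag names and retention thresholds are taken from the policy list above.
REQUIRED_S3_TAGS = {"Environment", "Owner", "DataClassification"}
MIN_BACKUP_DAYS = {"prod": 7, "dev": 1}


def check_plan(plan: dict, environment: str) -> list[str]:
    """Return policy violations found in a parsed Terraform plan JSON document."""
    violations = []
    minimum = MIN_BACKUP_DAYS[environment]
    for rc in plan.get("resource_changes", []):
        change = rc.get("change") or {}
        # Resources that are only being destroyed have nothing left to check.
        if change.get("actions") == ["delete"]:
            continue
        after = change.get("after") or {}
        if rc.get("type") == "aws_s3_bucket":
            tags = after.get("tags") or {}
            missing = sorted(REQUIRED_S3_TAGS - set(tags))
            if missing:
                violations.append(
                    f"{rc['address']}: missing tags {', '.join(missing)}"
                )
        elif rc.get("type") == "aws_db_instance":
            retention = after.get("backup_retention_period") or 0
            if retention < minimum:
                violations.append(
                    f"{rc['address']}: backup retention {retention} days "
                    f"is below the {environment} minimum of {minimum}"
                )
    return violations
```

A real implementation would probably live inside the scanner's own policy format so that findings, suppressions, and reporting stay in one place. Values that are only known after apply appear as null in the plan, so this sketch treats them as missing.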

Integration without blocking flow

We split scanner findings into three buckets:

Block PR: only critical issues with no plausible legitimate use case (public bucket without explicit website config, 0.0.0.0/0 ingress on production SGs, hardcoded credentials).
Required comment: high-severity issues require an engineer to acknowledge them in the PR. The bot posts a comment, the engineer responds with rationale, the security reviewer approves.
Report only: medium and low findings accumulate in a dashboard. We review trends, not individual findings.

The default failure mode is to put everything in 'block PR' and watch the team's velocity halve. Tiered enforcement keeps engineers engaged with the scanner.
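
The three buckets can be sketched as a small triage function. This is an illustration only: the Finding shape, the check IDs in BLOCKING_CHECKS, and the choice to route critical findings outside that list to the comment tier are all assumptions, not the output format of any particular scanner.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    REQUIRE_COMMENT = "require_comment"
    REPORT = "report"


@dataclass(frozen=True)
class Finding:
    check_id: str
    severity: str  # "CRITICAL", "HIGH", "MEDIUM", or "LOW"
    resource: str


# Hypothetical IDs for the checks with no plausible legitimate use case.
BLOCKING_CHECKS = {
    "ORG_PUBLIC_BUCKET_NO_WEBSITE",
    "ORG_OPEN_INGRESS_PROD",
    "ORG_HARDCODED_CREDENTIALS",
}


def triage(finding: Finding) -> Action:
    """Map a single finding to its enforcement tier."""
    if finding.severity == "CRITICAL" and finding.check_id in BLOCKING_CHECKS:
        return Action.BLOCK
    # Assumption: critical findings outside the blocking list still need
    # an acknowledgement, the same as high-severity findings.
    if finding.severity in {"CRITICAL", "HIGH"}:
        return Action.REQUIRE_COMMENT
    return Action.REPORT


def pr_should_fail(findings: list[Finding]) -> bool:
    """Fail the PR check only when at least one finding is in the block tier."""
    return any(triage(f) is Action.BLOCK for f in findings)
```

Keeping the blocking list explicit and short makes it a reviewable artifact: adding a check to it is a deliberate decision rather than a side effect of a severity label.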

Drift between scan and reality

IaC scanning checks the IaC code. It does not check what is actually deployed. Drift (manual changes to deployed resources, terraform apply with --target on a subset) means the deployed state diverges from the IaC. We pair IaC scanning with runtime configuration scanning (Prowler, ScoutSuite, AWS Config rules, or commercial CSPM) to close the loop. The two layers catch different problems.
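
Conceptually, drift detection is a comparison between what the code declares and what the cloud account reports. The sketch below shows that comparison on plain dictionaries; the resource IDs, attribute names, and the find_drift function are hypothetical, and in practice the two inputs would come from Terraform state and a runtime scanner or cloud API.

```python
def find_drift(declared: dict, deployed: dict) -> dict:
    """Return {resource_id: {attribute: (declared_value, deployed_value)}}.

    Only attributes that the IaC declares are compared, so extra
    provider-populated attributes on the deployed side are ignored.
    """
    drift = {}
    for resource_id, want in declared.items():
        have = deployed.get(resource_id)
        if have is None:
            # Declared in code but absent from the account.
            drift[resource_id] = {"<resource>": ("present", "missing")}
            continue
        changed = {
            key: (value, have.get(key))
            for key, value in want.items()
            if have.get(key) != value
        }
        if changed:
            drift[resource_id] = changed
    # Resources that exist in the account but were never declared in code.
    for resource_id in deployed.keys() - declared.keys():
        drift[resource_id] = {"<resource>": ("missing", "present")}
    return drift
```

For example, a security group declared with ingress from 10.0.0.0/8 but deployed with 0.0.0.0/0 would pass every IaC scan and only show up in this comparison.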