Semgrep + AI for Infrastructure as Code: Targeted IaC Security Without the Noise

General-purpose AI code scans (Claude Code, Cursor, etc.) are great for broad reviews, but they often skip or under-prioritise infrastructure. Ask for a "security review" and you get application logic, auth, and dependencies, while aws.iam.Role(...) and aws.s3.Bucket(...) slip through. The fix isn't to replace AI; it's to narrow its focus with Semgrep. Run Semgrep to find the exact IaC constructs you care about, then hand those snippets to AI for a deep, pattern-specific review. You'll get noticeably better results than from a single generalised scan.

This post shows how to combine Semgrep with AI (and optionally Semgrep's own AI engine) to review Pulumi, AWS CDK, and Terraform for risky patterns: IAM, S3, Lambda, RDS, security groups, KMS, and more. You get fewer false positives, fewer missed patterns, and no need to manually hunt for every Bucket, Role, or aws_s3_bucket in the repo.


Why Semgrep + AI

General scan: "Review this repo for security issues." The model sees thousands of lines, prioritises application code and obvious vulnerabilities, and often ignores IaC unless you spell it out. IAM policies, S3 ACLs, and encryption settings get less attention.

Semgrep-first: You run Semgrep with rules that match only IAM roles, S3 buckets, Lambda env vars, and so on, and get back a small, curated list of locations. Then you ask AI: "Here are the 12 places we create IAM roles; for each snippet, check if permissions follow least privilege and security best practices." The model's context is scoped and labelled, so those patterns are much harder to overlook.

Semgrep finds the needles; AI explains why they're sharp (or not). You can also author or refine Semgrep rules and ask AI whether the pattern is precise enough for your review process.


What we're targeting

Before we get into the rules, here's the kind of thing we're after. In a typical repo you might have an IAM role that's too permissive:

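# Pulumi: trust policy lets any AWS principal assume the role
import json
import pulumi_aws as aws

aws.iam.Role("app-role", assume_role_policy=json.dumps({
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Principal": {"AWS": "*"}, "Action": "sts:AssumeRole"}],
}))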

Or an S3 bucket with no explicit encryption, versioning, or public access block:

# Terraform: bucket with no encryption or public access block configured
resource "aws_s3_bucket" "data" {
  bucket = "my-app-data"
  # Missing companion resources: aws_s3_bucket_server_side_encryption_configuration,
  # aws_s3_bucket_versioning, aws_s3_bucket_public_access_block
}

Or a security group that opens SSH to the world:

// CDK: 0.0.0.0/0 on port 22
new ec2.SecurityGroup(this, "bastion", {
  vpc,
  allowAllOutbound: true,
});
// ... later: addIngressRule(ec2.Peer.anyIpv4(), ec2.Port.tcp(22), "SSH");

A general AI scan might mention "check your IAM" or "review S3 settings" in passing. With Semgrep we find every place these constructs appear, then hand just those snippets to AI with a clear checklist. The sections below show the exact rules and prompts for each resource type.


1. IAM Permissions Review (Pulumi, AWS CDK & Terraform)

Find every place IAM users or roles are created, then have AI assess whether attached policies and trust relationships are least-privilege and aligned with best practices.

Semgrep: Find the Constructs

Pulumi (Python):

# semgrep rule: find IAM user/role creation in Pulumi
rules:
  - id: pulumi-iam-user-or-role
    pattern-either:
      - pattern: aws.iam.User(...)
      - pattern: aws.iam.Role(...)
    message: "IAM user or role creation - review permissions and trust policy with AI"
    languages: [python]
    severity: WARNING

Pulumi (TypeScript):

- id: pulumi-iam-user-or-role-ts
  pattern-either:
    - pattern: new aws.iam.User(...)
    - pattern: new aws.iam.Role(...)
  message: "IAM user or role creation - review with AI"
  languages: [typescript]
  severity: WARNING

AWS CDK (TypeScript):

- id: cdk-iam-role
  pattern-either:
    - pattern: new iam.Role(...)
    - pattern: new iam.User(...)
  message: "CDK IAM role/user - review policies and principal with AI"
  languages: [typescript]
  severity: WARNING

Terraform (HCL):

- id: terraform-iam-user-or-role
  pattern-either:
    - pattern: |
        resource "aws_iam_role" $NAME {
          ...
        }
    - pattern: |
        resource "aws_iam_user" $NAME {
          ...
        }
  message: "Terraform IAM role/user - review assume_role_policy and attached policies with AI"
  languages: [terraform]
  severity: WARNING

Run Semgrep, collect the file:line list and code snippets, then prompt AI with something like:

"For each of the following IAM role/user definitions [paste snippets], check: (1) least privilege, (2) wildcard actions (e.g. *), (3) trust policy scope (e.g. Principal: '*'), (4) inline policies vs managed policies, (5) conditions (IP, MFA) where appropriate. List issues and suggest minimal policies."

You don't have to open every file and search for iam.Role by hand. Semgrep does the discovery; AI does the judgment.
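The collection step can be scripted. Here is a minimal sketch, assuming Semgrep was run with `--json` and its report saved to a file. The `results`, `check_id`, `path`, `start.line`, and `end.line` fields follow Semgrep's JSON output; the function names are hypothetical. Snippets are read back from the source files, so the sketch does not depend on how the report embeds code.

```python
import json
from pathlib import Path


def load_findings(report_path):
    """Return (rule_id, path, start_line, end_line) for each Semgrep result."""
    report = json.loads(Path(report_path).read_text())
    return [
        (r["check_id"], r["path"], r["start"]["line"], r["end"]["line"])
        for r in report.get("results", [])
    ]


def read_snippet(path, start_line, end_line, root="."):
    """Read the matched lines (1-indexed, inclusive) from the source file."""
    lines = (Path(root) / path).read_text().splitlines()
    return "\n".join(lines[start_line - 1:end_line])


def format_for_prompt(findings, root="."):
    """Label each snippet with rule ID and file:line so the AI can refer back to it."""
    blocks = []
    for rule_id, path, start, end in findings:
        snippet = read_snippet(path, start, end, root)
        blocks.append(f"[{rule_id}] {path}:{start}\n{snippet}")
    return "\n\n".join(blocks)
```

With a report produced by something like `semgrep --config rules/ --json --output findings.json` (the `rules/` directory is a placeholder), `format_for_prompt(load_findings("findings.json"))` yields text that can be pasted where the prompt says "[paste snippets]".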


2. S3 Permissions and Configuration (Pulumi, AWS CDK & Terraform)

Find every S3 bucket definition and have AI verify encryption, versioning, public access, and ACLs.

Semgrep: Find Buckets

Pulumi (Python):

rules:
  - id: pulumi-s3-bucket
    pattern: aws.s3.Bucket(...)
    message: "S3 bucket - verify encryption, versioning, public access, ACLs with AI"
    languages: [python]
    severity: WARNING

AWS CDK (TypeScript):

- id: cdk-s3-bucket
  pattern: new s3.Bucket(...)
  message: "CDK S3 bucket - check encryption, versioning, blockPublicAccess with AI"
  languages: [typescript]
  severity: WARNING

Terraform (HCL):

- id: terraform-s3-bucket
  pattern: |
    resource "aws_s3_bucket" $NAME {
      ...
    }
  message: "Terraform S3 bucket - verify encryption (aws_s3_bucket_server_side_encryption_configuration), versioning, and public access block with AI"
  languages: [terraform]
  severity: WARNING

What to Ask AI

Hand the matched snippets to AI with a checklist:

"For each S3 bucket definition: (1) Is server-side encryption (SSE-S3 or SSE-KMS) enabled? (2) Is versioning enabled where data is critical? (3) Is block public access (or equivalent) enabled? (4) Are there public read/write ACLs or bucket policies that allow s3:GetObject / PutObject for *? (5) Is bucket logging configured? List misconfigurations and suggest fixes."

You can also write a stricter Semgrep rule (e.g. match Bucket only when versioning or server_side_encryption is absent) and ask AI: "Is this rule precise enough for our review process, or will it miss real issues or cause too many false positives?"
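To get a feel for how such a stricter rule would behave before writing it, the same "setting is absent" idea can be tried on snippets you have already collected. This is only a textual sketch, not a Semgrep rule: the function name and the keyword list are assumptions, and a plain substring check is cruder than a real pattern.

```python
# Hypothetical keywords to look for in a Pulumi (Python) bucket snippet.
REQUIRED_S3_SETTINGS = ("versioning", "server_side_encryption")


def missing_settings(snippet, required=REQUIRED_S3_SETTINGS):
    """Return the required setting names that never appear in the snippet."""
    return [name for name in required if name not in snippet]
```

A bucket that configures versioning or encryption through a separate resource would still be reported as missing both, which is exactly the kind of false positive worth raising when you ask AI whether the rule is precise enough.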


3. Lambda: Environment Variables and Permissions (Pulumi, AWS CDK & Terraform)

Flag Lambda function definitions so AI can check for secrets in env vars, overprivileged roles, and unsafe runtime or config.

Semgrep: Find Lambda Functions

Pulumi (Python):

rules:
  - id: pulumi-lambda-function
    pattern: aws.lambda_.Function(...)
    message: "Lambda function - check env vars for secrets, IAM role scope, VPC, and runtime with AI"
    languages: [python]
    severity: WARNING

AWS CDK (TypeScript):

- id: cdk-lambda-function
  pattern: new lambda.Function(...)
  message: "CDK Lambda - check environment, role, vpc, and runtime with AI"
  languages: [typescript]
  severity: WARNING

Terraform (HCL):

- id: terraform-lambda-function
  pattern: |
    resource "aws_lambda_function" $NAME {
      ...
    }
  message: "Terraform Lambda - check environment variables, role (IAM role ARN), vpc_config, and runtime with AI"
  languages: [terraform]
  severity: WARNING

What to Ask AI

"For each Lambda definition: (1) Do any environment variables look like secrets (keys, passwords, URLs with tokens)? (2) Does the attached role have more permissions than needed (e.g. broad S3 or DynamoDB)? (3) Is the function in a VPC when it needs private resource access? (4) Is the runtime up to date and is reserved concurrency set where appropriate?"

Semgrep finds the functions. AI checks the details.


4. RDS and Database Exposure (Pulumi, AWS CDK & Terraform)

Find RDS (or similar) instances and have AI check public accessibility, encryption, and backup or retention.

Semgrep: Find RDS / DB Instances

Pulumi (Python):

rules:
  - id: pulumi-rds-instance
    pattern: aws.rds.Instance(...)
    message: "RDS instance - verify not publicly accessible, encryption, and backup with AI"
    languages: [python]
    severity: WARNING

AWS CDK (TypeScript):

- id: cdk-rds-database
  pattern: new rds.DatabaseInstance(...)
  message: "CDK RDS - check public access, encryption, and backup with AI"
  languages: [typescript]
  severity: WARNING

Terraform (HCL):

- id: terraform-rds-instance
  pattern: |
    resource "aws_db_instance" $NAME {
      ...
    }
  message: "Terraform RDS - check publicly_accessible, storage_encrypted, backup_retention_period with AI"
  languages: [terraform]
  severity: WARNING

What to Ask AI

"For each RDS/database instance: (1) Is it publicly accessible? (2) Is storage encryption enabled? (3) Are automated backups and retention configured? (4) Is the engine version supported and patched?"


5. Security Groups and Network Exposure (Pulumi, AWS CDK & Terraform)

Find security group definitions and have AI check for overly permissive ingress (e.g. 0.0.0.0/0 on sensitive ports).

Semgrep: Find Security Groups

Pulumi (Python):

rules:
  - id: pulumi-security-group
    pattern: aws.ec2.SecurityGroup(...)
    message: "Security group - review ingress/egress rules with AI (avoid 0.0.0.0/0 on DB/SSH)"
    languages: [python]
    severity: WARNING

AWS CDK (TypeScript):

- id: cdk-security-group
  pattern-either:
    - pattern: new ec2.SecurityGroup(...)
    - pattern: ec2.Peer.anyIpv4()
  message: "CDK security group or Peer.anyIpv4() - review ingress/egress with AI"
  languages: [typescript]
  severity: WARNING

Terraform (HCL):

- id: terraform-security-group
  pattern-either:
    - pattern: |
        resource "aws_security_group" $NAME {
          ...
        }
    - pattern: |
        resource "aws_vpc_security_group_ingress_rule" $NAME {
          ...
        }
  message: "Terraform security group or ingress rule - check cidr_blocks for 0.0.0.0/0 on sensitive ports with AI"
  languages: [terraform]
  severity: WARNING

You can add a targeted rule for the riskiest pattern (Pulumi example):

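- id: pulumi-security-group-cidr-open
  # A rule takes one top-level pattern operator, so combine the two under `patterns`:
  # the regex only counts when it falls inside a SecurityGroup(...) call.
  patterns:
    - pattern-inside: aws.ec2.SecurityGroup(...)
    - pattern-regex: (0\.0\.0\.0/0|::/0)
  message: "Security group allows 0.0.0.0/0 or ::/0 - verify with AI if this is acceptable"
  languages: [python]
  severity: WARNING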

Terraform (open CIDR in any .tf file):

- id: terraform-sg-cidr-open
  pattern-regex: (0\.0\.0\.0/0|::/0)
  paths:
    include:
      - "*.tf"
  message: "Open CIDR (0.0.0.0/0 or ::/0) in Terraform - verify with AI if this is an acceptable ingress source"
  languages: [terraform]
  severity: WARNING
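To see what the open-CIDR expression does and does not catch, it can be exercised outside Semgrep. Here is a small sketch using Python's `re` module with the same expression as the `pattern-regex` above; the helper name is made up for illustration.

```python
import re

# Same expression as the pattern-regex in the rules above.
OPEN_CIDR = re.compile(r"(0\.0\.0\.0/0|::/0)")


def has_open_cidr(text):
    """True if the text contains an IPv4 or IPv6 'anywhere' CIDR."""
    return OPEN_CIDR.search(text) is not None
```

`has_open_cidr('cidr_blocks = ["0.0.0.0/0"]')` is true, while `has_open_cidr('cidr_blocks = ["10.0.0.0/8"]')` is false. The match is purely textual, so it may also fire on egress rules or comments, which is why each hit still goes to AI for judgment.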

What to Ask AI

"For each security group snippet: (1) Does any rule allow 0.0.0.0/0 or ::/0 on ports 22, 3389, 5432, 3306, or 27017? (2) Are ports restricted to the minimum required? (3) Are sources/scopes as narrow as possible?"


6. KMS Keys and Key Policies (Pulumi, AWS CDK & Terraform)

Find KMS key resources and have AI review key policies and usage (who can use or administer the key).

Semgrep: Find KMS Keys

Pulumi (Python):

rules:
  - id: pulumi-kms-key
    pattern: aws.kms.Key(...)
    message: "KMS key - review key policy and usage with AI"
    languages: [python]
    severity: WARNING

AWS CDK (TypeScript):

- id: cdk-kms-key
  pattern: new kms.Key(...)
  message: "CDK KMS key - review key policy and rotation with AI"
  languages: [typescript]
  severity: WARNING

Terraform (HCL):

- id: terraform-kms-key
  pattern: |
    resource "aws_kms_key" $NAME {
      ...
    }
  message: "Terraform KMS key - review policy (aws_kms_key_policy) and enable_key_rotation with AI"
  languages: [terraform]
  severity: WARNING

What to Ask AI

"For each KMS key: (1) Does the key policy allow kms:* or overly broad principals? (2) Is key rotation enabled where supported? (3) Are conditions (e.g. encryption context) used to restrict usage?"


7. CloudTrail and Logging (Pulumi, AWS CDK & Terraform)

Find CloudTrail (or equivalent) resources and have AI check for multi-region, encryption, and log integrity.

Semgrep: Find CloudTrail

Pulumi (Python):

rules:
  - id: pulumi-cloudtrail
    pattern: aws.cloudtrail.Trail(...)
    message: "CloudTrail - verify multi-region, encryption, and log file validation with AI"
    languages: [python]
    severity: WARNING

AWS CDK (TypeScript):

- id: cdk-cloudtrail
  pattern: new cloudtrail.Trail(...)
  message: "CDK CloudTrail - check isMultiRegionTrail, encryptionKey, and sendToCloudWatchLogs with AI"
  languages: [typescript]
  severity: WARNING

Terraform (HCL):

- id: terraform-cloudtrail
  pattern: |
    resource "aws_cloudtrail" $NAME {
      ...
    }
  message: "Terraform CloudTrail - check is_multi_region_trail, kms_key_id, enable_log_file_validation with AI"
  languages: [terraform]
  severity: WARNING

What to Ask AI

"For each trail: (1) Is it multi-region (or are all regions covered)? (2) Is the log bucket encrypted and access restricted? (3) Is log file validation enabled?"


Suggested Workflow

  1. Define Semgrep rules for the IaC constructs you care about (IAM, S3, Lambda, RDS, security groups, KMS, CloudTrail, etc.), in the languages you use (Python/TypeScript for Pulumi, TypeScript for AWS CDK, Terraform/HCL for Terraform).
  2. Run Semgrep on the repo and export or copy the list of findings with code snippets.
  3. Batch by resource type (e.g. "all IAM roles", "all S3 buckets") and run one AI prompt per batch with a clear checklist. This keeps context focused and consistent.
  4. Integrate Semgrep into CI so new IaC is always flagged; periodically re-run the AI review on new or changed resources.
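Step 3 can be sketched as follows, assuming findings have already been reduced to `(rule_id, location, snippet)` tuples and that rule IDs follow the `<tool>-<resource>-...` naming used in this post. The `CHECKLISTS` mapping, the keyword matching, and the function names are illustrative assumptions, not part of Semgrep.

```python
from collections import defaultdict

# Hypothetical mapping from a keyword in the rule ID to a review checklist.
CHECKLISTS = {
    "iam": "Check least privilege, wildcard actions, and trust policy scope.",
    "s3": "Check encryption, versioning, public access block, and ACLs.",
    "security-group": "Check for 0.0.0.0/0 or ::/0 on sensitive ports.",
}


def resource_type(rule_id):
    """Map a rule ID such as 'cdk-s3-bucket' to a checklist key."""
    for key in CHECKLISTS:
        if f"-{key}-" in f"-{rule_id}-":
            return key
    return "other"


def build_batches(findings):
    """Group (rule_id, location, snippet) tuples by resource type."""
    batches = defaultdict(list)
    for rule_id, location, snippet in findings:
        batches[resource_type(rule_id)].append((location, snippet))
    return dict(batches)


def build_prompt(kind, items):
    """One prompt per batch: checklist first, then labelled snippets."""
    checklist = CHECKLISTS.get(kind, "Review for security misconfigurations.")
    body = "\n\n".join(f"{loc}\n{snippet}" for loc, snippet in items)
    return f"Resource type: {kind}\n{checklist}\n\n{body}"
```

Keeping one checklist per resource type means every IAM batch is reviewed against the same questions from run to run, which is what makes the results comparable over time.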

Summary

General AI scans tend to under-prioritise IaC; Semgrep + AI fixes that by making infrastructure patterns the explicit target.

Use Semgrep to find IAM users/roles, S3 buckets, Lambda functions, RDS instances, security groups, KMS keys, CloudTrail, and any other resource you care about, in Pulumi, AWS CDK, or Terraform.

Use AI to analyse only those snippets for least privilege, encryption, public access, and best practices, with clear, repeatable prompts.

Have AI suggest fixes, and tune your rules over time so your IaC review stays both broad and precise.

Once this is in place, you get consistent, repeatable coverage of Pulumi, AWS CDK, and Terraform without manually hunting for every risky pattern, and without the noise of a single, generalised scan.
