Implement centralized logging

If several thousand S3 buckets are accessed by a bad actor in the forest, does anybody hear it?

Setting up logging is a core requirement for any company: logs help engineering teams debug and security teams investigate incidents and breaches, and centralized logging is a widely accepted best practice.

However, logging often isn't configured for your existing resources. How can you enable logging en masse? If your environment is managed in Terraform, hitting cloud APIs directly is out of the question - you would cause drift. Wading into your jungle of ELBs, buckets, databases, and VMs without a good remediation framework in place can take years off your life...

In this guide, we'll walk through how you can enable logging on your existing resources without causing Terraform drift. This way you can always have the logging you need, for the resources you expect, without causing yourself or your developer teams a huge headache.

Implementing logging for existing resources

Assigning Jira tickets and asking developers to figure out how to turn on logging on their own is often ineffective, slow, and unpopular.

With Resourcely Campaigns, you can manage this process end-to-end with a single person:

1. Choose from pre-built logging policies or build your own

2. Identify the exact resource addresses in violation, and the relevant policies

3. Add relevant Terraform through a guided IDE, without causing drift

Logging basics

Let's choose a set of resources we want to apply logging to. We'll assume an AWS environment in this case.

  • S3

  • ELBs

  • Redshift

  • OpenSearch

  • RDS

Define your policies

First, we need to define our Terraform policies to cover all five of our resource types and the logging settings we want. Let's write one step by step: start by navigating to the Foundry in Resourcely.
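
To illustrate the structure, here's a minimal single-resource sketch using the same S3 requirements we'll fold into the consolidated Guardrail below; the comments note what each part does.

// Start by naming the Guardrail
GUARDRAIL "Require logging for S3"
  // Match every S3 bucket in scope
  WHEN aws_s3_bucket
    // Require a logging block that points at a destination bucket and prefix
    REQUIRE logging EXISTS
    REQUIRE logging.target_bucket EXISTS
    REQUIRE logging.target_prefix = "aws-access-logs"
  // Allow exceptions, but only with approval
  OVERRIDE WITH APPROVAL @default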


To enable logging on all of our resources, you can make one consolidated Guardrail:

GUARDRAIL "Require logging for S3"
  WHEN aws_s3_bucket
    REQUIRE logging EXISTS
    REQUIRE logging.target_bucket EXISTS
    REQUIRE logging.target_prefix = "aws-access-logs"
  WHEN aws_elb
    REQUIRE access_logs EXISTS 
    REQUIRE access_logs.bucket EXISTS 
    REQUIRE access_logs.enabled = true
    REQUIRE access_logs.interval = 60
  WHEN aws_redshift_cluster
    REQUIRE logging EXISTS 
    REQUIRE logging.bucket_name EXISTS 
    REQUIRE logging.enable = true
  WHEN aws_opensearch_domain
    REQUIRE log_publishing_options EXISTS 
  WHEN aws_db_instance
    REQUIRE monitoring_interval > 0
    REQUIRE monitoring_role_arn EXISTS 
    REQUIRE enabled_cloudwatch_logs_exports = ["error", "general", "slowquery"]
  OVERRIDE WITH APPROVAL @default

Publish this consolidated Guardrail after giving it metadata, and then you're ready to start scanning your existing state!

If you haven't set up the Campaign Agent yet, you'll need to do so before you can scan your existing Terraform state.

You can also do this for other clouds. Here's a Guardrail covering similar GCP resources:

Enforce GCP Logging
// All 4 resources covered by 1 Guardrail

GUARDRAIL "Require logging for Google storage buckets"
  WHEN google_storage_bucket
    REQUIRE logging EXISTS
    REQUIRE logging.log_bucket EXISTS
    REQUIRE logging.log_object_prefix = "gcs-access-logs"
  WHEN google_compute_subnetwork
    REQUIRE log_config EXISTS
    REQUIRE log_config.aggregation_interval = "INTERVAL_5_SEC"
    REQUIRE log_config.flow_sampling = 0.5
    REQUIRE log_config.metadata = "INCLUDE_ALL_METADATA"
  WHEN google_project_iam_audit_config
    REQUIRE audit_log_config.log_type EXISTS 
  WHEN google_container_cluster
    REQUIRE logging_service = "logging.googleapis.com/kubernetes"
  OVERRIDE WITH APPROVAL @default

Scan your existing resources

Now we can identify which of your resources don't have logging! Here's the existing Terraform we're working with:

main.tf
# main.tf

resource "aws_s3_bucket" "campaigns-resource_ckaMshbmm94dxMYO" {
  bucket = "campaigns-resource-test-bucket-demo1"
}

resource "aws_s3_bucket_public_access_block" "campaigns-resource_ckaMshbmm94dxMYO" {
  bucket                  = aws_s3_bucket.campaigns-resource_ckaMshbmm94dxMYO.id
  block_public_acls       = false
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = false
}

resource "aws_s3_bucket_ownership_controls" "campaigns-resource_ckaMshbmm94dxMYO" {
  bucket = aws_s3_bucket.campaigns-resource_ckaMshbmm94dxMYO.id

  rule {
    object_ownership = "BucketOwnerEnforced"
  }
}

resource "aws_s3_bucket_versioning" "campaigns-resource_ckaMshbmm94dxMYO" {
  bucket = aws_s3_bucket.campaigns-resource_ckaMshbmm94dxMYO.id

  versioning_configuration {
    status = "Disabled"
  }
}

# KMS Key for RDS Encryption
resource "aws_kms_key" "kmskey_6rqH9awtV732LCTF" {
  description             = "KMS key for encrypting an RDS instance"
  key_usage               = "ENCRYPT_DECRYPT"
  is_enabled              = true
  deletion_window_in_days = 30
}

# RDS Instance with Best Practices
resource "aws_db_instance" "example-rds_TJXZTFi3qSy724Fv" {
  identifier                   = "example-rds"
  engine                       = "postgres"
  engine_version               = "17.2"
  instance_class               = "db.t3.micro"
  allocated_storage            = 20
  max_allocated_storage        = 100
  db_name                      = "app_database"
  username                     = "db_user"
  password                     = "password"
  backup_retention_period      = 7
  storage_encrypted            = false
  multi_az                     = true
  deletion_protection          = true
  performance_insights_enabled = true
  skip_final_snapshot          = true
  tags = {
    backup = "false"
  }
}

resource "aws_db_instance" "key-application-dev" {
  identifier                   = "key-application-dev"
  engine                       = "postgres"
  engine_version               = "17.2"
  instance_class               = "db.t3.micro"
  allocated_storage            = 20
  max_allocated_storage        = 100
  db_name                      = "app_database"
  username                     = "db_user"
  password                     = "password"
  backup_retention_period      = 7
  storage_encrypted            = false
  multi_az                     = true
  deletion_protection          = true
  performance_insights_enabled = true
  skip_final_snapshot          = true
}

resource "aws_db_instance" "key-application-prod" {
  identifier                   = "key-application-prod"
  engine                       = "postgres"
  engine_version               = "17.2"
  instance_class               = "db.t3.micro"
  allocated_storage            = 20
  max_allocated_storage        = 100
  db_name                      = "app_database"
  username                     = "db_user"
  password                     = "password"
  backup_retention_period      = 7
  storage_encrypted            = false
  multi_az                     = true
  deletion_protection          = true
  performance_insights_enabled = true
  skip_final_snapshot          = true
}

# AWS Virtual Private Cloud (VPC)
resource "aws_vpc" "example-vpc_GqdekqYpVkAFP6QX" {
  cidr_block                       = "10.0.0.0/16"
  enable_dns_support               = true
  enable_dns_hostnames             = true
  instance_tenancy                 = "default"
  assign_generated_ipv6_cidr_block = true
  tags = {
    owner = "Chris"
  }
}

# AWS Subnet
resource "aws_subnet" "public-subnet-1_7nZ8v33Nhz5TQ69k" {
  vpc_id                  = aws_vpc.example-vpc_GqdekqYpVkAFP6QX.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-west-2a"
  map_public_ip_on_launch = true
  tags = {
    owner = "Chris"
  }
}

# AWS Security Group
resource "aws_security_group" "web-server-sg_dJZqk837GhmTNZGA" {
  name   = "web-server-sg"
  vpc_id = aws_vpc.example-vpc_GqdekqYpVkAFP6QX.id
  tags = {
    owner = "Chris"
  }

  ingress {
    protocol    = "tcp"
    from_port   = 5432
    to_port     = 5432
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    protocol    = "tcp"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Internet Gateway
resource "aws_internet_gateway" "public-gateway_vEqghHimEJ47pvV8_igw" {
  vpc_id = aws_vpc.example-vpc_GqdekqYpVkAFP6QX.id
  tags = {
    Name  = "public-gateway-igw"
    Owner = "Chris"
  }
}

# Route Table for Internet Access
resource "aws_route_table" "public-gateway_vEqghHimEJ47pvV8_rt" {
  vpc_id = aws_vpc.example-vpc_GqdekqYpVkAFP6QX.id
  tags = {
    Name  = "public-gateway-route-table"
    Owner = "Chris"
  }

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.public-gateway_vEqghHimEJ47pvV8_igw.id
  }
}

# Route Table Association
resource "aws_route_table_association" "public-gateway_vEqghHimEJ47pvV8_rta" {
  subnet_id      = aws_subnet.public-subnet-1_7nZ8v33Nhz5TQ69k.id
  route_table_id = aws_route_table.public-gateway_vEqghHimEJ47pvV8_rt.id
}

# AWS Elastic Load Balancer (ELB)
resource "aws_elb" "example-elb_zVYxBfXfh7gEiERm" {
  name            = "example-elb"
  internal        = false
  security_groups = [aws_security_group.web-server-sg_dJZqk837GhmTNZGA.id]
  subnets         = [aws_subnet.public-subnet-1_7nZ8v33Nhz5TQ69k.id]
  tags = {
    Owner = "Chris"
  }

  listener {
    instance_port     = 80
    instance_protocol = "HTTP"
    lb_port           = 80
    lb_protocol       = "HTTP"
  }

  health_check {
    target              = "HTTP:80/"
    interval            = 30
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}

resource "aws_iam_group" "my-iam-group_dzNUazB945E6fZSj" {
  name = "my-iam-group"
}

resource "aws_iam_group_membership" "my-iam-group_dzNUazB945E6fZSj_group_membership" {
  name  = "my-iam-group_membership"
  group = aws_iam_group.my-iam-group_dzNUazB945E6fZSj.name
  users = []
}

resource "aws_iam_group_policy_attachment" "my-iam-group_dzNUazB945E6fZSj_policy_attachment_0" {
  group      = aws_iam_group.my-iam-group_dzNUazB945E6fZSj.name
  policy_arn = resource.aws_iam_policy.overly-permissive-policy_0.arn
}

resource "aws_iam_policy" "overly-permissive-policy_0" {
  name        = "overly-permissive-policy"
  description = "Custom IAM policy for my-iam-group group."
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "rds:*"
        ]
        Effect = "Allow"
        Resource = ["*"]
      }
    ]
  })
}


resource "aws_s3_bucket" "app" {
  bucket = "my-app-bucket-no-logs-${random_id.suffix.hex}"
}

resource "random_id" "suffix" {
  byte_length = 4
}

resource "aws_elb" "example" {
  name               = "example-elb-no-logs"
  availability_zones = ["us-west-2a"]

  listener {
    instance_port     = 80
    instance_protocol = "HTTP"
    lb_port           = 80
    lb_protocol       = "HTTP"
  }

  health_check {
    target              = "HTTP:80/"
    interval            = 30
    timeout             = 5
    unhealthy_threshold = 2
    healthy_threshold   = 2
  }
}

resource "aws_redshift_cluster" "example" {
  cluster_identifier      = "redshift-cluster-1"
  node_type               = "dc2.large" # Smallest available for Redshift
  master_username         = "adminuser"
  master_password         = "SuperSecretPass123"
  cluster_type            = "single-node"
  publicly_accessible     = true
  skip_final_snapshot     = true
}

resource "aws_opensearch_domain" "example" {
  domain_name     = "example-domain"
  engine_version  = "OpenSearch_1.3"

  cluster_config {
    instance_type  = "t3.small.search"
    instance_count = 1
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 10
    volume_type = "gp2"
  }
}

resource "aws_db_instance" "example" {
  identifier             = "exampledb"
  engine                 = "mysql"
  instance_class         = "db.t3.micro"
  allocated_storage      = 20
  username               = "adminuser"
  password               = "SuperSecretPass123"
  skip_final_snapshot    = true
  publicly_accessible    = true
}


Our goal is to scan our Terraform and find where our logging policies are being violated.

Note: Resourcely actually scans your Terraform state, not just your Terraform code.

To start scanning, create a Resourcely Campaign. Give it a name, and choose the AWS logging Guardrail that we created.

We found 9 violations of our logging Guardrail

After clicking "Create Campaign", we can inspect specific violations and see aggregated statistics. We found 9 policy violations across a variety of resources, from Redshift to S3 to load balancers to RDS instances.

Aggregated summary
Specific resources that were scanned, and their violation findings

You can scan multiple repos against multiple Guardrails

Guided remediation

Now that we have identified resources that are missing logging, we can jump into remediation. Resourcely automatically identifies the IaC causing each violation and presents it in a guided IDE, where developers get inline guidance and can request exceptions or make context updates.

Remediation screen

Resourcely's guided remediation screen is pictured below. It supports file navigation in your repository, feedback on code errors and warnings, and inline Guardrails that show the user exactly the policy they are violating.

Resourcely's remediation screen

Making remediation changes

Let's remediate some resources! We'll implement logging for some of these findings. Our RDS instance is violating this policy:

RDS violations

To make this change, we'll add a monitoring_interval, monitoring_role_arn, and enabled_cloudwatch_logs_exports.
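
As a sketch, the additions might look like the following, shown here on the aws_db_instance.example MySQL instance (the error, general, and slowquery log types are MySQL log names; Postgres instances export different log types). The aws_iam_role.rds_monitoring reference is hypothetical: you would need an IAM role that RDS can assume, with the AmazonRDSEnhancedMonitoringRole managed policy attached.

resource "aws_db_instance" "example" {
  # ... existing settings unchanged ...

  # Enhanced Monitoring: publish OS metrics every 60 seconds
  monitoring_interval = 60
  # Hypothetical IAM role that RDS assumes to publish monitoring metrics
  monitoring_role_arn = aws_iam_role.rds_monitoring.arn

  # Export database logs to CloudWatch Logs, matching the Guardrail
  enabled_cloudwatch_logs_exports = ["error", "general", "slowquery"]
}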

Adding in missing configuration

Here, we have added missing configuration to our RDS instance. This is reflected with a pending orange hourglass icon on our Guardrail violation. We can repeat this for all of our policy violations, or we could request exceptions for some.
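
The same pattern applies to the other findings. For the aws_s3_bucket.app bucket, a fix that satisfies the Guardrail might look like the sketch below; the my-central-access-logs destination bucket is hypothetical and would need to exist, and on newer AWS provider versions you may prefer the standalone aws_s3_bucket_logging resource over the inline logging argument.

resource "aws_s3_bucket" "app" {
  bucket = "my-app-bucket-no-logs-${random_id.suffix.hex}"

  # Ship server access logs to a central bucket, matching the Guardrail
  logging {
    target_bucket = "my-central-access-logs" # hypothetical destination bucket
    target_prefix = "aws-access-logs"
  }
}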

Request exceptions

If we don't believe that the policy applies in this case, we can request a policy exception for each resource by hitting "Request Exception".

Our exception request is registered as a green check mark. We'll review both types of changes (the finalized exception requests and the pending code fixes) when we submit our proposal as a change request.

Submitting a remediation proposal as a change request

We have two options when submitting our remediated Terraform: evaluating or finalizing a change request.

Evaluate

Hitting Evaluate Changes will submit a draft change request, with accompanying Terraform and Resourcely checks. This does a couple of things:

  • Verifies the user has written valid Terraform

  • Verifies the Terraform doesn't violate any Guardrails (either those that were previously violated, or new policies)

...without submitting this change request for review. Evaluation lets the user work independently to determine if the changes they make are accurate before involving others.

Finalize

Finalizing a change request will submit the PR for review. As with Evaluating, checks will run that verify the user is writing appropriate Terraform and not violating policies. Finalizing will tag in appropriate reviewers for approval using your existing version control, asking for review of both exceptions and any code changes.

Houston: we have logging

After finalizing our PR, we have added logging to the resources that desperately needed it. It is easy to make configuration changes at scale to cloud resources with Resourcely Campaigns - just define the behavior you want, and let us guide you through remediation.

Sign up for your free account at https://portal.resourcely.io today!
