If you take a look at this blog, you’ll see that I’ve begun to tinker with devops quite a bit. If you’ve ever taken the trouble to look me up on LinkedIn, you’ll also see that I’ve had a little history doing security stuff. Given my love of security, the next logical step of my devops journey was to start to look into securing the CI/CD pipeline.
My previous posts covered some of the ways I was able to make my own personal infrastructure easier to maintain while learning various devops tools at the same time. Two of those posts, How to Build a CI/CD Pipeline for Your Database and Automate Your Database Changes with a CI/CD Pipeline, built the sample pipeline that serves as the basis for my examples here. In addition to those articles, I’ve also created some other infrastructure as part of my devops environment. Now let’s secure it!
Selecting Tools to Secure Your CI/CD Pipeline
I think this is a challenging step because there are a number of different tools out there and I have only just begun to research them. My initial research started with the term cloud security posture management (CSPM). I think that Palo Alto has a reasonable definition of it here. I’m also cheap, so I looked for open source tools and found one called CloudQuery. This led me to an interesting article on implementing CloudQuery as a CSPM.
That article seemed really long and had too many steps to implement for my initial dive into security in the pipeline. I then stumbled upon a trusted name in the security world as well as the open source security community, Tenable. Tenable has a solution called Terrascan that looks like exactly what I want to start messing with. Here is why I like this tool:
- I grew up running Nessus, so I’m comfortable with the Tenable name
- Terrascan is open source!
- Terrascan has a Docker container
- Terrascan has a GitHub Action
At this point, I have selected the first tool to drop into my pipeline, Terrascan.
Running My First Terrascan
Since I’ve been doing all of this work with GitHub Actions, it seemed like a no-brainer to just spin up the GitHub Action and go. On reflection, that felt like a really bad idea since I’ve never run anything against my pipeline. Who knows how bad it would be? Who knows what all would need to be fixed? What all would I need to configure? The questions were endless!
I decided that it made more sense to just try running the Docker container on my local machine first, mounting the local copy of my GitHub devops repo to see what would happen. My run was a complete failure:
```
% docker run -v /Users/scott/devops_repo:/opt/code -it tenable/terrascan scan
2023-11-08T19:51:43.527Z error utils/path.go:84 error encountered traversing directories{base path 15 0 / <nil>} {error 26 0 lstat /proc/1/fd/13: no such file or directory}
2023-11-08T19:51:43.527Z error utils/path.go:84 error encountered traversing directories{base path 15 0 / <nil>} {error 26 0 lstat /proc/1/fd/8: no such file or directory}
2023-11-08T19:51:43.527Z error utils/path.go:84 error encountered traversing directories{base path 15 0 / <nil>} {error 26 0 lstat /proc/1/fd/8: no such file or directory}
2023-11-08T19:51:43.527Z error utils/path.go:84 error encountered traversing directories{base path 15 0 / <nil>} {error 26 0 lstat /proc/1/fd/8: no such file or directory}
2023-11-08T19:51:43.527Z error v1/load-dir.go:40 error while searching for iac files%!(EXTRA zapcore.Field={root dir 15 0 / <nil>}, zapcore.Field={error 26 0 lstat /proc/1/fd/8: no such file or directory})
2023-11-08T19:51:43.529Z error utils/path.go:84 error encountered traversing directories{base path 15 0 / <nil>} {error 26 0 lstat /proc/1/fd/8: no such file or directory}

Scan Errors -

IaC Type            : kustomize
Directory           : /
Error Message       : kustomization.y(a)ml file not found in the directory /
-----------------------------------------------------------------------
IaC Type            : helm
Directory           : /
Error Message       : lstat /proc/1/fd/13: no such file or directory
-----------------------------------------------------------------------
IaC Type            : arm
Directory           : /
Error Message       : lstat /proc/1/fd/8: no such file or directory
-----------------------------------------------------------------------
IaC Type            : k8s
Directory           : /
Error Message       : lstat /proc/1/fd/8: no such file or directory
-----------------------------------------------------------------------
IaC Type            : docker
Directory           : /
Error Message       : lstat /proc/1/fd/8: no such file or directory
-----------------------------------------------------------------------
IaC Type            : cft
Directory           : /
Error Message       : lstat /proc/1/fd/8: no such file or directory
-----------------------------------------------------------------------

Scan Summary -

File/Folder         : /
IaC Type            :
Scanned At          : 2023-11-08 19:51:44.737750405 +0000 UTC
Policies Validated  : 785
Violated Policies   : 0
Low                 : 0
Medium              : 0
High                : 0
```
Yup. Complete failure. If you look at the help, you’ll see that the default scan directory is `.`. Don’t let the `/` error messages fool you. I thought that it would just scan the entire Docker container starting at `/`, but that was not the case. I needed to supply the `-d` switch and point it at my `/opt/code` mount like this:
```
% docker run -v /Users/salgatt/devops_repo:/opt/code -it tenable/terrascan scan -d /opt/code
```
I ran this one and got a few more Scan Errors that I can ignore:
```
Scan Errors -

-----------------------------------------------------------------------
IaC Type            : cft
Directory           : /opt/code/terraform/.terraform/modules
Error Message       : error while loading iac file '/opt/code/terraform/.terraform/modules/modules.json', err: failed to find valid Resources key in file: /opt/code/terraform/.terraform/modules/modules.json
-----------------------------------------------------------------------
IaC Type            : cft
Directory           : /opt/code/terraform
Error Message       : error while loading iac file '/opt/code/terraform/postgres-credentials.yaml', err: failed to find valid Resources key in file: /opt/code/terraform/postgres-credentials.yaml
-----------------------------------------------------------------------
IaC Type            : kustomize
Directory           : /opt/code
Error Message       : kustomization.y(a)ml file not found in the directory /opt/code
-----------------------------------------------------------------------
```
I do not have any `kustomize` or `cft` deployment scripts/configurations in this repo, so I can ignore these IaC type scans. At the end of the run, I got the following Scan Summary:
```
Scan Summary -

File/Folder         : /opt/code
IaC Type            : docker,k8s,terraform
Scanned At          : 2023-11-08 19:56:36.344271485 +0000 UTC
Policies Validated  : 76
Violated Policies   : 179
Low                 : 37
Medium              : 111
High                : 31
Vulnerabilities     : 0
```
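Rather than mentally filtering out that noise on every run, it should also be possible to limit the scan to the IaC types actually present in the repo. If I’m reading the terrascan CLI help correctly, the `-i`/`--iac-type` flag does exactly that (a sketch of the idea, not something I’ve wired into my pipeline yet):

```shell
# Hypothetical: scan only the terraform files, so the kustomize/cft/helm
# loaders never run (the -i flag selects the IaC type to scan)
docker run -v /Users/scott/devops_repo:/opt/code -it tenable/terrascan \
  scan -d /opt/code -i terraform
```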
Reviewing the Violation Details
I’ve run the scan and I’ve got a lot of violations, so let’s look at some of the Violation Details:
```
Violation Details -

Description         : Ensure that S3 Buckets have server side encryption at rest enabled with KMS key to protect sensitive data.
File                : terraform/init_resources/aws_resources.tf
Module Name         : root
Plan Root           : terraform/init_resources
Line                : 22
Severity            : HIGH
-----------------------------------------------------------------------
Description         : Ensure DynamoDb is encrypted at rest
File                : terraform/init_resources/aws_resources.tf
Module Name         : root
Plan Root           : terraform/init_resources
Line                : 43
Severity            : MEDIUM
-----------------------------------------------------------------------
Description         : Ensure Point In Time Recovery is enabled for DynamoDB Tables
File                : terraform/init_resources/aws_resources.tf
Module Name         : root
Plan Root           : terraform/init_resources
Line                : 43
Severity            : MEDIUM
-----------------------------------------------------------------------
Description         : Containers Should Not Run with AllowPrivilegeEscalation
File                : mssql/deployment.yml
Line                : 1
Severity            : HIGH
-----------------------------------------------------------------------
```
It looks like the tool is really good at pinpointing the location of concern for Terraform resources. It isn’t even “ok” at pointing to the offending YAML for the Kubernetes errors, though. There are also quite a few redundant violations that can be summarized in the following way:
```
11 AppArmor profile not set to default or custom profile will make the container vulnerable to kernel level threats
14 Apply Security Context to Your Pods and Containers
12 CPU Limits Not Set in config file.
12 CPU Request Not Set in config file.
12 Container images with readOnlyRootFileSystem set as false mounts the container root file system with write permissions
12 Containers Should Not Run with AllowPrivilegeEscalation
 2 Default Namespace Should Not be Used
12 Default seccomp profile not enabled will make the container to make non-essential system calls
 1 Ensure DynamoDb is encrypted at rest
 1 Ensure Point In Time Recovery is enabled for DynamoDB Tables
 1 Ensure that S3 Buckets have server side encryption at rest enabled with KMS key to protect sensitive data.
12 Image without digest affects the integrity principle of image security
12 Memory Limits Not Set in config file.
12 Memory Request Not Set in config file.
12 Minimize Admission of Root Containers
11 No liveness probe will ensure there is no recovery in case of unexpected errors
11 No readiness probe will affect automatic recovery in case of unexpected errors
 9 No tag or container image with
 6 Nodeport service can expose the worker nodes as they have public interface
 4 Prefer using secrets as files over secrets as environment variables
```
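If you want to reproduce a summary like the one above without tallying by hand, a small shell pipeline over the saved scan output does the trick. The sample lines below stand in for the real report (the actual file would come from redirecting the terrascan run above):

```shell
# A few sample lines standing in for the real terrascan report
cat > terrascan_output.txt <<'EOF'
Description : Apply Security Context to Your Pods and Containers
Description : CPU Limits Not Set in config file.
Description : Apply Security Context to Your Pods and Containers
EOF

# Strip everything up to the description text, then count the duplicates
grep 'Description' terrascan_output.txt \
  | sed 's/.*Description[[:space:]]*:[[:space:]]*//' \
  | sort | uniq -c | sort -rn
```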
Investigating Some of The Violations
I’m going to start by addressing the AWS-related violations noted here:
```
Description         : Ensure that S3 Buckets have server side encryption at rest enabled with KMS key to protect sensitive data.
File                : terraform/init_resources/aws_resources.tf
Module Name         : root
Plan Root           : terraform/init_resources
Line                : 22
Severity            : HIGH
-----------------------------------------------------------------------
Description         : Ensure DynamoDb is encrypted at rest
File                : terraform/init_resources/aws_resources.tf
Module Name         : root
Plan Root           : terraform/init_resources
Line                : 43
Severity            : MEDIUM
-----------------------------------------------------------------------
Description         : Ensure Point In Time Recovery is enabled for DynamoDB Tables
File                : terraform/init_resources/aws_resources.tf
Module Name         : root
Plan Root           : terraform/init_resources
Line                : 43
Severity            : MEDIUM
```
In order to investigate these further, we need to look at the referenced `terraform/init_resources/aws_resources.tf` Terraform template:
```hcl
# IAM Requirements : S3
# - CreateBucket
# - PutBucketPublicAccessBlock
# - PutBucketEncryption
# - PutBucketVersioning
resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-k8-tf-state"

  # Prevent accidental deletion of this S3 bucket
  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "enabled" {
  bucket = aws_s3_bucket.terraform_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "default" {
  bucket = aws_s3_bucket.terraform_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "public_access" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# IAM Requirements : Dynamo
# - DescribeTable
# - CreateTable
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "my-k8-tf-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```
Addressing the S3 Related Violations
The first violation points to Line 22, which is

```hcl
resource "aws_s3_bucket_server_side_encryption_configuration" "default" {
```

While there’s obviously nothing wrong with this specific line, we can see that the problem is with this resource definition. The real problem is the rule containing `apply_server_side_encryption_by_default` in the resource block. I currently have it set to:
```hcl
resource "aws_s3_bucket_server_side_encryption_configuration" "default" {
  bucket = aws_s3_bucket.terraform_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}
```
I looked over the `aws_s3_bucket_server_side_encryption_configuration` resource documentation for some guidance. It looks like my options are `AES256`, `aws:kms`, and `aws:kms:dsse`. The scan would like this to be set to `aws:kms`, like this:
```hcl
resource "aws_s3_bucket_server_side_encryption_configuration" "default" {
  bucket = aws_s3_bucket.terraform_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```
After running another scan, I see that I have addressed one of the Violations!
```
Scan Summary -

File/Folder         : /opt/code
IaC Type            : docker,k8s,terraform
Scanned At          : 2023-11-08 20:30:35.897345638 +0000 UTC
Policies Validated  : 76
Violated Policies   : 178
Low                 : 37
Medium              : 111
High                : 30
Vulnerabilities     : 0
```
Addressing the DynamoDB Related Violations
Both of the DynamoDB-related violations point to Line 43, which is this resource definition:

```hcl
resource "aws_dynamodb_table" "terraform_locks" {
```
Once again, I’m off to check out the aws_dynamodb_table resource documentation. After reviewing the documentation, I need to add `server_side_encryption` and `point_in_time_recovery` to the resource, like the following:
```hcl
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "my-k8-tf-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  server_side_encryption {
    enabled = true
  }

  point_in_time_recovery {
    enabled = true
  }

  attribute {
    name = "LockID"
    type = "S"
  }
}
```
Addressing the AllowPrivilegeEscalation Violations
The `AllowPrivilegeEscalation` violation appears many times in my summary, so I’m going to tackle this one next. The problem is that these are showing up in YAML files, and those are handled a little differently. From the output above, the problem is located on Line 1 of the `mssql/deployment.yml` file. If we look at the first few lines of this file, we can see that this pointer is not very useful:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql
  labels:
    app: mssql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  strategy:
    rollingUpdate:
      maxSurge: 1
```
I did some more digging into this and found the following Kubernetes security documentation helpful. It states that I need to add a `securityContext` parameter to my container configurations. Looking at my original YAML:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql
  labels:
    app: mssql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: mssql
    spec:
      containers:
      - name: mssql
        env:
        - name: ACCEPT_EULA
          value: "y"
        - name: MSSQL_PID
          value: "Developer"
        - name: MSSQL_AGENT_ENABLED
          value: "true"
        - name: SA_PASSWORD
          value: "<some_password>"
        - name: TEST_UNUSED
          value: "y"
        image: <IMAGE>
        ports:
        - containerPort: 1433
```
you can see that I do not have a `securityContext` defined. I can also tackle two violations at once using `securityContext`. If I change my above YAML to include `securityContext` like below:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql
  labels:
    app: mssql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: mssql
    spec:
      containers:
      - name: mssql
        env:
        - name: ACCEPT_EULA
          value: "y"
        - name: MSSQL_PID
          value: "Developer"
        - name: MSSQL_AGENT_ENABLED
          value: "true"
        - name: SA_PASSWORD
          value: "<some_password>"
        - name: TEST_UNUSED
          value: "y"
        image: <IMAGE>
        ports:
        - containerPort: 1433
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
```
I can address the `AllowPrivilegeEscalation` and the `readOnlyRootFilesystem` violations at once. The scan results prove my theory; I’ve eliminated two more violations:
```
Scan Summary -

File/Folder         : /opt/code
IaC Type            : docker,k8s,terraform
Scanned At          : 2023-11-08 21:00:06.226143516 +0000 UTC
Policies Validated  : 76
Violated Policies   : 174
Low                 : 37
Medium              : 108
High                : 29
Vulnerabilities     : 0
```
Moving to Github Actions
This is good, but it would have me continuing to work on these violations without any tracking or any way to know which changes mapped to which violations. This is where the GitHub Action comes into play. The GitHub Action documentation describes a parameter called `sarif_upload` that reports the violations in the GitHub repository’s Security tab under code scanning alerts. I think this is a better approach since we’ll be able to manage everything in GitHub.
I’m going to start by creating a new workflow file called terrascan.yaml:
```yaml
on: [push]

jobs:
  terrascan_job:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        iac_type:
          - k8s
          - helm
          - terraform
        include:
          - iac_type: k8s
            iac_version: v1
          - iac_type: helm
            iac_version: v3
          - iac_type: terraform
            iac_version: v14
    name: terrascan-action
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Run Terrascan
        id: terrascan
        uses: tenable/terrascan-action@main
        with:
          iac_type: ${{ matrix.iac_type }}
          iac_version: ${{ matrix.iac_version }}
          find_vulnerabilities: true
          # policy_type: 'aws'
          only_warn: true
          sarif_upload: true
          verbose: true
          # non_recursive:
          # iac_dir:
          # policy_path:
          # skip_rules:
          config_path: .github/workflows/terrascan.config
          # webhook_url:
          # webhook_token:
      - name: Upload SARIF file
        uses: github/codeql-action/upload-sarif@v1
        with:
          sarif_file: terrascan.sarif
```
This workflow makes use of the matrix capability so that I don’t need to create multiple actions or do anything crazy to loop over the different `iac_type` definitions.
I have also created a terrascan.config file that limits my violation results to high severity only for now:

```toml
level = "high"
```
I committed this change and watched the output, only to get an error:

```
Error: Code scanning is not enabled for this repository. Please enable code scanning in the repository settings.
```

It looks like I need to enable code scanning, but the problem is that I am using a personal account with a private repository. I’m not able to show the results of the code scanning, but at least I can show the results of the terrascan execution, where I got another error:

```
cli/register.go:71 error while loading global config{error 26 0 file format ".config" not support for terrascan config file}
```

It looks like terrascan relies heavily on the file extension, so I changed the filename to terrascan.toml and updated the Action to point to it. Now the run generates what I’d expect.
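For anyone following along, the corresponding tweak in the workflow is just the `config_path` value (assuming you keep the config file next to the workflow, as I did):

```yaml
config_path: .github/workflows/terrascan.toml
```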
Conclusion
I only receive `k8s`-related violations now, and they show up under the job output as you can see below.
The next step will be to address these high violations and then move on to the medium ones. Stay tuned as I look into adding more security tools!