Using GitHub Actions To Test Before You Deploy

I’ve been using DigitalOcean for quite some time now and recently set up their App Platform to run my website. The platform works well for me: I build a Docker container running OpenResty and it handles all of my needs. It also does a good job of catching Docker build failures and stops attempting a deployment when that happens. A few weeks ago, I had a concerning realization: they don’t catch problems with my OpenResty configuration until it’s too late. The moment their platform executes openresty inside the container, everything pukes and my site goes offline.

This isn’t terrible because I don’t make many changes to the server configuration, and when I do, I can run the container locally and make sure nothing is wrong. But as I’ve moved further into the DevOps world, I realized how outdated and manual that process was. I also realized that not every change I make to the server configuration actually gets checked manually.
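
For context, the manual check was essentially building the image locally and running the config test inside it, something like this (the image tag is arbitrary, and it assumes the Dockerfile copies the configuration the same way the deployment does):

docker build -t site-test .
docker run --rm site-test openresty -t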

The fear is real

Fast forward to today and you’ll see that I broke my site a few times while tinkering. I wanted to make some additional tricks and configuration changes in OpenResty, so I started making them. Just as luck would have it, the site went down and I didn’t notice right away.

GitHub Actions to Save the Day

I figured the easiest thing would be to replicate what I was doing manually: create a GitHub Action that checks my configuration files using openresty -t, just like I did that one time I actually checked my configuration before pushing a commit. I figured I’d take some simple steps:

  1. Run the action on any commit
  2. Load the files into a container
  3. Run openresty -t
  4. Wait for DigitalOcean to build and deploy the changes

Problem #1 – Where do I even go next?

What I wanted to do was easy since I was already doing it in the Dockerfile in the repo. My initial research and knowledge had me going down the route of building the container, deploying it somewhere, and then attempting to run it. That seemed like way too much overhead, so I kept digging and found the combination of runs-on and container.

jobs:

  build:

    runs-on: ubuntu-latest
    container: 
      image: openresty/openresty:1.19.9.1-4-alpine-fat

The runs-on key just tells GitHub what underlying platform you’d like to run on for the host OS. The container key allows you to run the job inside any available Docker image. It even supports private image registries and a host of other configuration parameters if needed.
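
The image I use is public, but if it lived in a private registry, the job could authenticate with the credentials option. A minimal sketch, assuming a hypothetical image path and secret name:

jobs:

  build:

    runs-on: ubuntu-latest
    container:
      image: ghcr.io/my-org/my-private-openresty:latest
      credentials:
        username: ${{ github.actor }}
        password: ${{ secrets.GHCR_PULL_TOKEN }}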

Problem #2 – How do I mount the repo’s files into the container?

My first thought was to simply use the volumes parameter, tell it to mount GITHUB_WORKSPACE into the container, and off we go! This turned out to be a terrible thought, as you can tell by the error I got running the action.

github action error

If you look closely at the docker command, you’ll see that there’s a workdir specified like so: --workdir /__w/k8-rev-proxy/k8-rev-proxy. The container is starting with a working directory of my repo, which means the repo is already being mounted into the container! I’m now able to extend my Action to do more:

jobs:

  build:

    runs-on: ubuntu-latest
    container: 
      image: openresty/openresty:1.19.9.1-4-alpine-fat

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it.
    - name: Checkout Branch
      uses: actions/checkout@v3

    - name: Copy Server Config
      id: copy-server-config
      run: cp conf/nginx.conf /usr/local/openresty/nginx/conf/nginx.conf

    - name: Copy Default Config
      id: copy-default-config
      run: cp conf/default.conf /etc/nginx/conf.d/

    - name: Copy Site Configs
      id: copy-site-configs
      run: cp conf/site-*.conf /etc/nginx/conf.d/

    - name: Create Lua Directory
      id: create-lua-dir
      run: mkdir /etc/nginx/lua

    - name: Copy Access Lua File
      id: copy-lua-access
      run: cp lua/access.lua /etc/nginx/lua/access.lua

    - name: Test OpenResty Configuration
      id: test-openresty
      run: openresty -t

Problem #3 – The Action Didn’t Stop the DigitalOcean Build

DigitalOcean only watches a branch for pushes and starts a build on any push. I was hoping that my Action would fail the push and therefore stop the DigitalOcean build; that was not the case. The fix was easy: I decided to make all of my commits to a dev branch and configured the Action to trigger on that. I updated my Action a little more:

name: Openresty Configuration Check

on:
  push:
    branches: [ "dev" ]
  pull_request:
    branches: [ "dev" ]

jobs:

  build:

    runs-on: ubuntu-latest
    container: 
      image: openresty/openresty:1.19.9.1-4-alpine-fat

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it.
    - name: Checkout Branch
      uses: actions/checkout@v3

    - name: Copy Server Config
      id: copy-server-config
      run: cp conf/nginx.conf /usr/local/openresty/nginx/conf/nginx.conf

    - name: Copy Default Config
      id: copy-default-config
      run: cp conf/default.conf /etc/nginx/conf.d/

    - name: Copy Site Configs
      id: copy-site-configs
      run: cp conf/site-*.conf /etc/nginx/conf.d/

    - name: Create Lua Directory
      id: create-lua-dir
      run: mkdir /etc/nginx/lua

    - name: Copy Access Lua File
      id: copy-lua-access
      run: cp lua/access.lua /etc/nginx/lua/access.lua

    - name: Test OpenResty Configuration
      id: test-openresty
      run: openresty -t

Problem #4 – I’m Lazy and Don’t Want to Constantly Merge My Changes

I found an Action in the GitHub Marketplace that automatically merges changes from one branch to another. Now I have my final Action:

name: Openresty Configuration Check

on:
  push:
    branches: [ "dev" ]
  pull_request:
    branches: [ "dev" ]

jobs:

  build:

    runs-on: ubuntu-latest
    container: 
      image: openresty/openresty:1.19.9.1-4-alpine-fat

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it.
    - name: Checkout Branch
      uses: actions/checkout@v3

    - name: Copy Server Config
      id: copy-server-config
      run: cp conf/nginx.conf /usr/local/openresty/nginx/conf/nginx.conf

    - name: Copy Default Config
      id: copy-default-config
      run: cp conf/default.conf /etc/nginx/conf.d/

    - name: Copy Site Configs
      id: copy-site-configs
      run: cp conf/site-*.conf /etc/nginx/conf.d/

    - name: Create Lua Directory
      id: create-lua-dir
      run: mkdir /etc/nginx/lua

    - name: Copy Access Lua File
      id: copy-lua-access
      run: cp lua/access.lua /etc/nginx/lua/access.lua

    - name: Test OpenResty Configuration
      id: test-openresty
      run: openresty -t

    - name: Merge dev -> main
      uses: devmasx/merge-branch@master
      with:
          type: now
          target_branch: main
          github_token: ${{ github.token }}

Conclusion

Now I can save me from myself! Anytime I need to make changes to my configuration, I commit them to dev. From there, my Action runs and makes sure the OpenResty configuration is OK. If that passes, my changes are automatically merged into main, which triggers the DigitalOcean App build.

Tuning My Content Security Policy

In my Getting Started With a Content Security Policy post, I set up a report-only CSP so that I could test out a policy before enforcing it. It is now time to parse through the results and see what needs to be updated in my deployed policy. The original policy was very simple:

default-src https:

Inspecting The Violations

I started by looking at the current violations, and it was clear that I had a rather permissive Content Security Policy because not much was being blocked. (A small script for tallying these counts from the raw reports follows the table.)

violated-directive   blocked-uri                                                                                  count
style-src-attr       inline                                                                                        2591
script-src-elem      inline                                                                                        1141
style-src-elem       inline                                                                                         367
img-src              data                                                                                           137
img-src              https://live-blog.shellnetsecurity.com/wp-content/uploads/2020/12/cropped-Quinn-32x32.jpg       88
img-src              https://live-blog.shellnetsecurity.com/wp-content/uploads/2020/12/cropped-Quinn-192x192.jpg     87
script-src           eval                                                                                            85
font-src             data                                                                                            68
img-src              https://live-blog.shellnetsecurity.com/wp-content/uploads/2020/12/cropped-Quinn-180x180.jpg     57
default-src          inline                                                                                          22
script-src-attr      inline                                                                                           5
img-src              https://live-blog.shellnetsecurity.com/wp-content/uploads/2020/12/cropped-Quinn.jpg              4
default-src          data                                                                                             1

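For reference, counts like these can be tallied with a few lines of Python. This is a rough sketch that assumes the raw function events have been exported from Splunk as one JSON object per line into a hypothetical csp_events.json; the field names match the example event in my Getting Started post.

import base64
import json
from collections import Counter

counts = Counter()
with open("csp_events.json") as f:
    for line in f:
        event = json.loads(line)
        # __ow_body holds the base64-encoded csp-report JSON sent by the browser
        report = json.loads(base64.b64decode(event["__ow_body"]))["csp-report"]
        counts[(report["violated-directive"], report["blocked-uri"])] += 1

for (directive, blocked_uri), count in counts.most_common():
    print(directive, blocked_uri, count)
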
I decided to change my policy to the following:

default-src 'self'

and let that run for a while before gathering more violation details like these:

violated-directive   scheme   domain                          count
font-src             https    fonts.gstatic.com                 889
style-src-attr       None     inline                            183
script-src-elem      None     inline                            171
default-src          https    fonts.gstatic.com                  97
style-src-elem       None     inline                             87
script-src-elem      https    adservice.google.com               81
script-src-elem      https    www.googletagmanager.com           79
script-src-elem      https    pagead2.googlesyndication.com      73
connect-src          https    www.google-analytics.com           71
frame-src            https    googleads.g.doubleclick.net        65
img-src              https    secure.gravatar.com                43
script-src-elem      https    connect.facebook.net               42
connect-src          https    pagead2.googlesyndication.com      40
default-src          None     inline                             37
style-src-elem       https    fonts.googleapis.com               23
img-src              https    www.facebook.com                   21

This gives us a better list to work with and a clearer picture of what we need to handle.

Building the New Header

Looking at the results, I’m not personally concerned with images, so I’ll allow those from any https source:

default-src 'self'; img-src https:; 

I know I have fonts loading from Google, so we can allow that next:

default-src 'self'; img-src https:; font-src https://fonts.gstatic.com

The process continues until you have documented all of the known resources and the types of resources you plan to load, and you arrive at a final CSP. Once done, you should still keep the reporting capability in place so that you can identify new content on your site that should be allowed or, worse yet, malicious content that someone is trying to introduce.
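
To give a feel for where this ends up, here is roughly what a later iteration could look like based on the sources in the tables above. Treat the host list as an illustration of the process rather than a finished policy, and note that the inline style and script violations still need a decision of their own ('unsafe-inline', nonces, or hashes):

default-src 'self'; img-src https:; font-src https://fonts.gstatic.com; style-src 'self' https://fonts.googleapis.com; script-src 'self' https://www.googletagmanager.com https://pagead2.googlesyndication.com https://adservice.google.com https://connect.facebook.net; connect-src 'self' https://www.google-analytics.com https://pagead2.googlesyndication.com; frame-src https://googleads.g.doubleclick.net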

Resources

Here are some nice resources that can help you continue building your CSP as well.

Exporting CloudWatch Logs to S3

I had to figure out how to get logs from CloudWatch into S3. The task itself is pretty easy because AWS provides a very nice tutorial, Exporting log data to Amazon S3, that explains how to do it either via the Console or the CLI. My problem was that I needed to do this daily, so automating the task was my next struggle. The AWS tutorial provides details on setting up S3 and IAM for this solution, so I won’t cover that here. I also found a great article by Omar Dulaimi that was the basis for my code (why completely reinvent the wheel?). With both of these laying the groundwork, I got right to putting this together.
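
If you only ever need a one-off export, the CLI equivalent is a single command. The log group, bucket, prefix, and millisecond timestamps below are placeholders:

aws logs create-export-task \
    --log-group-name "my-log-group" \
    --from 1664064000000 \
    --to 1664150400000 \
    --destination "my-export-bucket" \
    --destination-prefix "server01"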

High Level Details

The code is set up to run daily, so I have it use the following flow:

  1. Get the current date and time to use as the end time for our export task
  2. Get the date and time “n” days ago to use as the start time for our export task
  3. Create the CloudWatch export-to-S3 task
  4. Check the status of the export
    1. If the task is still RUNNING, sleep 1 second and check the status again
    2. If the task is still RUNNING, sleep 2x the previous delay and check the status again
    3. Repeat the previous step until we get a COMPLETED status from the task
    4. If we ever get a status other than COMPLETED or RUNNING (see the other possible status codes here), raise an exception
  5. Once COMPLETED, return the export task id and the path in our S3 bucket.

The Code

I decided to put this together as a module in case I needed it for anything in the future.

import boto3
import os
import datetime
import logging
import time

"""
This portion will receive the n_days value (the date/day of the log you want
to export) and calculate the start and end dates of the logs you want to
export to S3. Today = 0; yesterday = 1; so on and so forth...
Ex: If today is April 13th and NDAYS = 0, April 13th logs will be exported.
Ex: If today is April 13th and NDAYS = 1, April 12th logs will be exported.
Ex: If today is April 13th and NDAYS = 2, April 11th logs will be exported.
"""
def generate_date_dict(n_days = 1):
    currentTime = datetime.datetime.now()
    date_dict = {
        "start_date" : currentTime - datetime.timedelta(days=n_days),
        "end_date" : currentTime - datetime.timedelta(days=n_days - 1),
    }
    return date_dict

"""
Convert the from & to Dates to milliseconds
"""
def convert_from_date(date_time = None):
    # default to "now" at call time rather than at import time
    if date_time is None:
        date_time = datetime.datetime.now()
    return int(date_time.timestamp() * 1000)


"""
The following will create the subfolders' structure based on year, month, day
Ex: BucketNAME/LogGroupName/Year/Month/Day
"""
def generate_prefix(s3_bucket_prefix, start_date):
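    # Note: os.path.sep is "/" on Linux/macOS, which matches the separator S3 keys expect;
    # on Windows this would produce backslashes in the prefix.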
    return os.path.join(s3_bucket_prefix, start_date.strftime('%Y{0}%m{0}%d').format(os.path.sep))


"""
Based on the AWS boto3 documentation
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html#CloudWatchLogs.Client.create_export_task
"""
def create_export_task(group_name, from_date, to_date, s3_bucket_name, s3_bucket_prefix):
    logging.info("Creating CloudWatch Log Export Task to S3")
    try:
        client = boto3.client('logs')
        response = client.create_export_task(
             logGroupName=group_name,
             fromTime=from_date,
             to=to_date,
             destination=s3_bucket_name,
             destinationPrefix=s3_bucket_prefix
            )
        if 'taskId' not in response:
            logging.error("Unexpected createExportTask response")
            raise Exception("Unexpected createExportTask response")
        return response['taskId']
    except Exception as e:
        logging.error(e)
        raise Exception("Failed to Create Export Task")

"""
Based on the AWS boto3 documentation
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html#CloudWatchLogs.Client.describe_export_tasks
"""
def export_status_call(task_id):
    try:
        client = boto3.client('logs')
        response = client.describe_export_tasks(
             taskId = task_id
            )

        if 'exportTasks' not in response:
            logging.error("Unexpected describeExportTasks response")
            raise Exception("Unexpected describeExportTasks response")

        if len(response['exportTasks']) != 1:
            logging.error("Wrong number of export tasks found")
            raise Exception("Wrong number of export tasks found")

        if "status" not in response['exportTasks'][0]:
            logging.error("No status found in describeExportTasks response")
            raise Exception("No status found in describeExportTasks response")

        if "code" not in response['exportTasks'][0]['status']:
            logging.error("No status code found in describeExportTasks response")
            raise Exception("No status code found in describeExportTasks response")

        if response['exportTasks'][0]['status']['code'] == 'RUNNING':
            return "RUNNING"

        if response['exportTasks'][0]['status']['code'] == 'COMPLETED':
            return "COMPLETED"

        # Otherwise we should be in a failed state
        logging.error("Task is in an unexpected state")
        raise Exception("Task is in an unexpected state")

    except Exception as e:
        logging.error(e)
        raise Exception("Status Call Failed")

def check_export_status(task_id, delay = 1):
    logging.info("Checking Status of Export Task")
    try:
        response = export_status_call(task_id = task_id)

        # We want to be sure we're not beating up the API and we'll increase our delay until the task completes
        while(response == 'RUNNING'):
            logging.warning("We did not get a COMPLETED response so waiting " + str(delay) + " seconds until checking status again")
            time.sleep(delay)
            response = export_status_call(task_id = task_id)
            delay = delay * 2

        logging.info("Log Export Task has completed")
        return True

    except Exception as e:
        logging.error(e)
        raise Exception("Describe Check Failed")

"""
The main function in here
"""
def run(group_name, s3_bucket_name, s3_bucket_prefix, n_days):
    try:
        """
        Based upon what we've been provided for n_days, we'll generate the dates needed to run
        """
        date_dict = generate_date_dict(n_days = n_days)
        date_dict['start_date_ms'] = convert_from_date(date_time = date_dict['start_date'])
        date_dict['end_date_ms'] = convert_from_date(date_time = date_dict['end_date'])
        s3_bucket_prefix_with_date = generate_prefix(s3_bucket_prefix = s3_bucket_prefix, start_date = date_dict['start_date'])
        export_task_id = create_export_task(group_name = group_name, from_date = date_dict['start_date_ms'], to_date = date_dict['end_date_ms'], s3_bucket_name = s3_bucket_name, s3_bucket_prefix = s3_bucket_prefix_with_date)
        logging.debug("Export Task ID : " + export_task_id)
        check_export_status(task_id = export_task_id)
        return {
          "task_id" : export_task_id,
          "s3_export_path" : s3_bucket_prefix_with_date
        }
    except Exception as e:
        logging.error(e)
        raise e

Running the Code

With this module in place in a modules directory on my machine, I created a quick main.py that I can use to demonstrate the execution of the code.

from modules import cloudwatch_to_s3
import logging
import os

"""
This portion will obtain the Environment variables
"""
GROUP_NAME = os.environ['GROUP_NAME']
DESTINATION_BUCKET = os.environ['DESTINATION_BUCKET']
PREFIX = os.environ['PREFIX']
NDAYS = os.environ['NDAYS']
n_days = int(NDAYS)

if __name__ == '__main__':
    try:
        response = cloudwatch_to_s3.run(group_name = GROUP_NAME, s3_bucket_name = DESTINATION_BUCKET, s3_bucket_prefix = PREFIX, n_days = n_days)
        print(response)
    except Exception as e:
        logging.error(e)
        print(e)

In the above main.py script, you’ll want to set the following environment variables (an example invocation follows the list):

GROUP_NAME – The name of the CloudWatch Log Group you are going to export from
DESTINATION_BUCKET – The name of the S3 bucket that you plan to export your logs into
PREFIX – A directory name within the S3 bucket that you’d like to export into. This can be useful if you intend to export multiple systems into the same S3 bucket. Just remember that AWS limits you to one export task in a RUNNING state at a time.
NDAYS – How many days in the past you’d like to export.
AWS_ACCESS_KEY_ID – The AWS access key associated with the IAM user created in the Exporting log data to Amazon S3 tutorial referenced above. (Optional if a user or role with the required permissions is already attached to the environment where your script is executing, such as Lambda or EC2.)
AWS_SECRET_ACCESS_KEY – The AWS secret key associated with that access key. (Optional under the same conditions as above.)
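
As an example, a run against a hypothetical log group might look like this:

export GROUP_NAME="/aws/lambda/my-function"
export DESTINATION_BUCKET="log_export_testing"
export PREFIX="server01"
export NDAYS="1"
python main.py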

Running main.py results in the following output:

# python main.py
WARNING:root:We did not get a COMPLETED response so waiting 1 seconds until checking status again
WARNING:root:We did not get a COMPLETED response so waiting 2 seconds until checking status again
WARNING:root:We did not get a COMPLETED response so waiting 4 seconds until checking status again
{'task_id': '75050e9d-99dc-487d-9233-93216dd993ae', 's3_export_path': 'server01/2022/09/25'}

You can now go to s3://<BUCKET_NAME>/<PREFIX>/<YYYY>/<MM>/<DD>/<TASK_ID>/ (In my case this would be s3://log_export_testing/server01/2022/09/25/75050e9d-99dc-487d-9233-93216dd993ae/) and see the exported contents of the log group. If the log group had multiple streams, each stream will have its own directory here with the logs gzipped under the directory.
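
If you’d rather verify from code than the console, a quick listing with boto3 works as well. The bucket and prefix here are just the values from my run above:

import boto3

s3 = boto3.client('s3')
response = s3.list_objects_v2(
    Bucket='log_export_testing',       # your DESTINATION_BUCKET
    Prefix='server01/2022/09/25/'      # the s3_export_path returned by the module
)
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])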

Getting Started With a Content Security Policy

I recently needed to set up a Content Security Policy (CSP) on a website and I couldn’t think of where to get started. The first question that came to mind was: what content do I allow, and how do I test everything without having to look through all of the code on the site? This is where the Content-Security-Policy-Report-Only header comes into play. The short version is that it allows you to deploy a policy in report-only mode and collect the results at the endpoint specified via the report-uri directive. That’s great! I have what I need, but how do I collect what’s being reported by the clients to the report-uri, and what do I even use as the report-uri? This was a great place for me to begin testing out DigitalOcean Functions.

Initial Environment Setup

I like to use Splunk in my environment for some of my logs, so I thought this was a great way to gather the data. I won’t cover the full setup here, but I set up a Splunk HTTP Event Collector (HEC) specifically for this. You can read more about the Splunk HEC in the Splunk documentation. With the HEC configured, I have my Splunk HEC URL and Splunk HEC token and I’m ready to move on.
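
If you want to sanity-check the HEC before wiring anything else to it, a quick curl against the collector endpoint should return a success response. The hostname and token below are placeholders; 8088 is the default HEC port:

curl -k "https://splunk.example.com:8088/services/collector/event" \
    -H "Authorization: Splunk <your-HEC-token>" \
    -d '{"event": "hello from curl"}'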

Creating the Function

With my Splunk credentials in hand, it’s now time to create a really simple function that can collect the CSP report details. I went to my DigitalOcean account and followed some of their examples in the Functions documentation. The end result is the following Python script:

import os
import requests

def send_to_splunk(msg):
    if not msg:
        return "fail"
        
    myobj = {
        'event' : msg
    }
    headers = {'Authorization' : 'Splunk ' + os.environ.get("splunkHECToken")}

    try:
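        # Note: verify=False disables TLS certificate verification. That is tolerable for a
        # self-signed certificate on a test HEC, but not something to carry into production.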
        x = requests.post(os.environ.get("splunkUrl"), json = myobj, verify=False, headers=headers)
        return "ok"
    except Exception as e:
        return "error"

def main(args):
    body = send_to_splunk(msg = args)
    return {"body": body}

The nice thing about these functions is that they already have a number of useful libraries installed, such as requests. The main() function is called automatically whenever the function is invoked, and args contains the headers and any GET/POST parameters. I’ve also created two environment variables for this function, splunkHECToken and splunkUrl.
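
If you want to poke at the function before deploying it, appending a small block like this to the bottom of the script lets you smoke-test it locally (with splunkUrl and splunkHECToken set in your shell). This is just a convenience sketch, not part of the deployed function:

if __name__ == '__main__':
    # a minimal fake invocation payload; "e30=" is base64 for "{}"
    fake_args = {"__ow_method": "post", "__ow_body": "e30=", "__ow_headers": {}, "__ow_isBase64Encoded": True}
    print(main(fake_args))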

DigitalOcean Example of Environmental Variables

Updating Nginx to Add The Header

To make sure this header gets added to every request to the site, I configured Nginx to add the response header via the add_header directive. I simply copied the URL for the function from the Source tab in my DigitalOcean Control Panel.

DigitalOcean Function URL Example

With that URL in hand, I then updated my Nginx configuration to send the header

add_header Content-Security-Policy-Report-Only "default-src https:; report-uri https://faas-sfo3-7872a1dd.doser...";

I then restarted Nginx and tested out a few requests.

Reviewing the Results in Splunk

After some time, I checked in Splunk to see if I was getting results. Here is an example log:

{
  "__ow_method": "post",
  "__ow_body": "eyJjc3AtcmVwb3J0Ijp7ImRvY3VtZW50LXVyaSI6ImFib3V0IiwicmVmZXJyZXIiOiIiLCJ2aW9sYXRlZC1kaXJlY3RpdmUiOiJmb250LXNyYyIsImVmZmVjdGl2ZS1kaXJlY3RpdmUiOiJmb250LXNyYyIsIm9yaWdpbmFsLXBvbGljeSI6ImRlZmF1bHQtc3JjIGh0dHBzOjsgcmVwb3J0LXVyaSBodHRwczovL2ZhYXMtc2ZvMy03ODcyYTFkZC5kb3NlcnZlcmxlc3MuY28vYXBpL3YxL3dlYi9mbi1lYjA3OTMwZC01MGQ1LTQ1YzktYWM1Yi1kZjA5OTg3YTIwMTcvZGVmYXVsdC90ZXN0aW5nIiwiZGlzcG9zaXRpb24iOiJyZXBvcnQiLCJibG9ja2VkLXVyaSI6ImRhdGEiLCJzdGF0dXMtY29kZSI6MCwic2NyaXB0LXNhbXBsZSI6IiJ9fQ==",
  "__ow_headers": {
    "accept": "*/*",
    "accept-encoding": "gzip",
    "accept-language": "en-US, en;q=0.9",
    "cdn-loop": "cloudflare",
    "cf-connecting-ip": "1.1.1.1",
    "cf-ipcountry": "US",
    "cf-ray": "74dcabd78bd4190e-EWR",
    "cf-visitor": "{\"scheme\":\"https\"}",
    "content-type": "application/csp-report",
    "dnt": "1",
    "host": "ccontroller",
    "origin": "https://live-blog.shellnetsecurity.com",
    "referer": "https://live-blog.shellnetsecurity.com/",
    "sec-ch-ua": "\"Google Chrome\";v=\"105\", \"Not)A;Brand\";v=\"8\", \"Chromium\";v=\"105\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"macOS\"",
    "sec-fetch-dest": "report",
    "sec-fetch-mode": "no-cors",
    "sec-fetch-site": "cross-site",
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36",
    "x-forwarded-for": "1.1.1.1",
    "x-forwarded-proto": "https",
    "x-request-id": "2b1262be9ec3ec22e5c6e521ea13002b"
  },
  "__ow_path": "",
  "__ow_isBase64Encoded": true
}

There’s a bunch of detail in here, but the important pieces stand out: __ow_isBase64Encoded: true tells us that the details of importance are base64 encoded, and those details live in __ow_body. If we base64 decode that value (a short Python snippet for this follows the decoded report below), we get the report from our client under our test policy of default-src https:

{
  "csp-report": {
    "document-uri": "about",
    "referrer": "",
    "violated-directive": "font-src",
    "effective-directive": "font-src",
    "original-policy": "default-src https:; report-uri https://faas-sfo3-7872a1dd.doserverless.co/api/v1/web/fn-eb07930d-50d5-45c9-ac5b-df09987a2017/default/testing",
    "disposition": "report",
    "blocked-uri": "data",
    "status-code": 0,
    "script-sample": ""
  }
}
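
For reference, the decode is only a couple of lines of Python. The truncated string below stands in for the full __ow_body value from the event above:

import base64
import json

body = "eyJjc3AtcmVwb3J0Ijp7..."   # paste the full __ow_body value here
report = json.loads(base64.b64decode(body))
print(json.dumps(report, indent=2))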

Now we keep collecting this data and will eventually use it to test out an updated policy. Until the next post!