Terraform for Active Directory Testing: A Practical Example

In my current job, I’m one of the resident Active Directory experts. Granted, my knowledge of the subject is a little dated, but I can still get around well enough when needed. To perform testing, we need to spin up Active Directory test environments, and we don’t want to maintain long-lived infrastructure for them. I was constantly spinning these up by hand and thought there had to be a way to create a test Active Directory with Terraform. I was right! In addition to the Active Directory itself, I needed to be able to add member servers, and I found this was all possible with Terraform.

It is very important to note that this example is probably not suitable for a production environment. I’m making use of the user_data argument of aws_instance, which means the passwords can be read in plain text from the user data of the EC2 instances.

You’ve been warned 🙂

What are We Building?

It makes sense to first understand what I am deploying with this Terraform. I’m creating everything in AWS since that’s where we deploy most of our testing systems. I wanted to create a test Active Directory that has its own VPC and subnets. This makes sure it is isolated from all other environments.

In our testing, we also want to be able to deploy different Windows versions so we can check for any nuances between them. I made this configurable in my deployment.

For a very specific testing scenario, we needed to have a Windows Certificate Services server in the domain. As a result, my deployment also includes a certificate server.

Finally, this environment can have varying requirements for other domain members that will be used for Remote Desktop access or as clients. These members could also be used for installing SQL Server. For this reason, I’m also making the number of members joined to the domain configurable.

The end result should be a Domain Controller with a Certificate Services server and a configurable number of Domain Members joined to the domain.

The Terraform Providers and Variables

As I mentioned before, this is being deployed in AWS, so we really only need to configure a single provider. I started with a providers.tf that looks like the following:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.54.0"
    }
  }
}

provider "aws" {
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
  region     = var.aws_region
}

With that in place, I wanted to define a number of variables that I could use for my deployment. I created the following variables.tf:

# Application Environment Tag
# This is used to tag the EC2 instances with an environment tag
variable "app_environment" {
  type        = string
  description = "This is used to add an environment tag to EC2 instances"
}

# AWS Availability Zone
variable "aws_az" {
  type        = string
  description = "Use this to specify the availability zone for the subnets and EC2 instances"
  default     = "us-east-2c"
}

# VPC Variables
variable "vpc_cidr" {
  type        = string
  description = "CIDR for the VPC"
  default     = "blog.shellnetsecurity.com/18"
}

# Subnet Variables
variable "public_subnet_cidr" {
  type        = string
  description = "CIDR for the public subnet"
  default     = "blog.shellnetsecurity.com/24"
}

# Subnet Variables
variable "private_subnet_cidr" {
  type        = string
  description = "CIDR for the private subnet"
  default     = "blog.shellnetsecurity.com/24"
}

# AWS Credentials
variable "aws_access_key" {
  type = string
  description = "AWS access key"
}

variable "aws_secret_key" {
  type = string
  description = "AWS secret key"
}

variable "aws_region" {
  type = string
  description = "AWS region"
}

# Windows Specific Variables
# 
# Supported Values:
# - 2012-R2_RTM-English-64Bit-Base
# - 2016-English-Full-Base
# - 2019-English-Full-Base
# - 2022-English-Full-Base
variable "windows_version" {
  type        = string
  description = "EC2 Windows Server Version"
  default     = "2019-English-Full-Base"
}

variable "windows_instance_type" {
  type        = string
  description = "EC2 instance type for Windows Server"
  default     = "t2.micro"
}

variable "windows_associate_public_ip_address" {
  type        = bool
  description = "Associate a public IP address to the EC2 instance"
  default     = true
}

variable "windows_root_volume_size" {
  type        = number
  description = "Volume size of the root volume of Windows Server"
  default     = 30
}

variable "windows_root_volume_type" {
  type        = string
  description = "Volume type of the root volume of Windows Server."
  default     = "gp2"
}

variable "windows_instance_name" {
  type        = string
  description = "EC2 instance name for Windows Server"
  default     = "tfwinsrv01"
}

variable "windows_ad_domain_name" {
  type = string
  description = "Active Directory Domain Name"
  default = "2019.adfs.cyral.local"
}

variable "windows_ad_nebios_name" {
  type = string
  description = "Active Directory NetBIOS Name"
  default = "ADFS"
}

variable "windows_ad_safe_password" {
  type = string
  description = "Active Directory DSRM Password"
}

variable "windows_ad_user_name" {
  type = string
  description = "Username used for the local Administrator"
  default = "Administrator"
}

variable "windows_domain_member_count" {
  type = number
  description = "Number of domain members to add to this domain"
  default = 2
}

I won’t go into too much detail on these variables as the descriptions “should” cover how they are used.
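Several of these variables have no default, so Terraform will prompt for them unless they are supplied some other way. Here is a minimal sketch of a terraform.tfvars file, assuming placeholder values throughout (none of these values come from my environment, and the region is only chosen to match the default availability zone above):

```hcl
# terraform.tfvars (placeholder values, replace with your own)
app_environment          = "test"
aws_region               = "us-east-2"
aws_access_key           = "REPLACE_ME"
aws_secret_key           = "REPLACE_ME"
windows_ad_safe_password = "REPLACE_ME"
```

Since this file holds credentials, it is probably best kept out of version control.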

My First Battle With Technology

I think it is important to call out a change that I had to make to this deployment. By default, AWS deploys Windows EC2 instances with an auto-generated Administrator password. In the AWS Console, you can use your private key to decrypt that password. I originally created an outputs.tf to get this password that looked like the following:

output "password_decrypted" {
  value     = rsadecrypt(aws_instance.windows-server-dc.password_data, tls_private_key.key_pair.private_key_pem)
  sensitive = true
}

This output does the same thing as the AWS Console by using the private key to decrypt the generated password. The idea was to get the generated password from the domain controller resource, windows-server-dc, and use it to join all of the member servers to the domain. I ran into a couple of problems with this method.

One problem is that rsadecrypt would return the hex value for some special characters. This meant that using the decrypted password would fail authentication when I tried to join domain members.

Another problem is that some of the special characters needed to be escaped when I supplied them to my PowerShell scripts. This also resulted in failures joining member servers to the domain.

To get around this, I decided to ignore the generated password. Instead of retrieving the generated password and trying to reuse it, I updated my PowerShell scripts to use var.windows_ad_safe_password. The scripts change the local Administrator password on all EC2 instances to var.windows_ad_safe_password, and my problem was solved. You just need to make sure you avoid characters that have to be escaped in PowerShell when setting var.windows_ad_safe_password.
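One way to guard against problem characters is a validation rule on the variable itself. The following is only a sketch of how the declaration in variables.tf could look, assuming a Terraform version that supports sensitive variables and validation blocks. The allowed character set is my own conservative guess, not an official list of what PowerShell accepts unescaped.

```hcl
variable "windows_ad_safe_password" {
  type        = string
  description = "Active Directory DSRM Password"

  # Hides the value in plan and apply output.
  # It does NOT hide it in the state file or in the EC2 user data.
  sensitive = true

  validation {
    # Assumed safe set: letters, digits, and a few symbols that need
    # no escaping inside a double-quoted PowerShell string.
    condition     = can(regex("^[A-Za-z0-9!#_+=.-]+$", var.windows_ad_safe_password))
    error_message = "The password may only contain letters, digits, and the characters ! # _ + = . and -."
  }
}
```

With this in place, terraform plan should fail early on a password containing something like $ or a backtick, instead of the domain join failing ten minutes after the apply.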

Foundational Terraform Resources

As I mentioned above, I wanted to deploy everything into its own network. The network.tf creates everything we need to accomplish this:

# Create the VPC
resource "aws_vpc" "vpc" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
}

# Define the private subnet
resource "aws_subnet" "private-subnet" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = var.private_subnet_cidr
  map_public_ip_on_launch = false
  availability_zone       = var.aws_az
}

## PRIVATE ROUTING
# PrivateRouteTable
resource "aws_route_table" "private-rt" {
  vpc_id = aws_vpc.vpc.id
}

# PrivateSubnet1RouteTableAssociation
resource "aws_route_table_association" "perf-crta-private-subnet-1" {
  subnet_id      = aws_subnet.private-subnet.id
  route_table_id = aws_route_table.private-rt.id
}

# Define the public subnet
resource "aws_subnet" "public-subnet" {
  map_public_ip_on_launch = true
  vpc_id            = aws_vpc.vpc.id
  cidr_block        = var.public_subnet_cidr
  availability_zone = var.aws_az
}

# Define the internet gateway
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.vpc.id
}

# Define the public route table
resource "aws_route_table" "public-rt" {
  vpc_id = aws_vpc.vpc.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }
}

# Assign the public route table to the public subnet
resource "aws_route_table_association" "public-rt-association" {
  subnet_id      = aws_subnet.public-subnet.id
  route_table_id = aws_route_table.public-rt.id
}

resource "aws_vpc_dhcp_options" "vpc-dhcp-options" {
  domain_name_servers  = [aws_instance.windows-server-dc.private_ip]
}

resource "aws_vpc_dhcp_options_association" "dns_resolver" {
   vpc_id          =  aws_vpc.vpc.id
   dhcp_options_id = aws_vpc_dhcp_options.vpc-dhcp-options.id
}

# Define the security group for the Windows server
resource "aws_security_group" "aws-windows-sg" {
  name        = "windows-sg"
  description = "Allow incoming connections"
  vpc_id      = aws_vpc.vpc.id
  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["1.1.1.1/32"]
    description = "Allow incoming connections from me"
  }
  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
    description = "Allow all connections from members of this security group"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = {
    Name = "windows-sg"
  }
}

# Define the private security group for the Windows servers
resource "aws_security_group" "aws-windows-private-sg" {
  name        = "windows-private-sg"
  description = "Allow incoming connections"
  vpc_id      = aws_vpc.vpc.id
  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
    description = "Allow all connections from members of this security group"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = {
    Name = "windows-private-sg"
  }
}

I want to draw your attention to two of the statements in this file, lines 56 and 73. Line 56, the domain_name_servers setting in aws_vpc_dhcp_options, configures the VPC to hand out the Domain Controller as the DNS server. This will prove useful when we join the member servers to the domain, since they need to resolve the domain name. Line 73, the cidr_blocks entry in the first ingress rule, is important because it grants access to the EC2 instances. You will want to make sure you replace 1.1.1.1/32 with your address/subnet so you will be able to access the machines remotely.
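Rather than editing network.tf each time your address changes, the allowed range could be pulled into a variable. This is a rough sketch only. The variable name admin_cidr is hypothetical and not part of my repo, and the validation leans on the built-in cidrhost function to reject strings that are not valid CIDR notation.

```hcl
# Hypothetical variable holding the address range allowed to reach the instances
variable "admin_cidr" {
  type        = string
  description = "CIDR block allowed to connect to the Windows instances"

  validation {
    # cidrhost raises an error on malformed input, which can() turns into false
    condition     = can(cidrhost(var.admin_cidr, 0))
    error_message = "The admin_cidr value must be valid CIDR notation, for example 203.0.113.10/32."
  }
}

# Inside aws_security_group.aws-windows-sg, the first ingress rule would then use:
#   cidr_blocks = [var.admin_cidr]
```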

I also have Terraform creating a key pair that can be used for the EC2 instances in my key-pair.tf file:

# Generates a secure private key and encodes it as PEM
resource "tls_private_key" "key_pair" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

# Create the Key Pair
resource "aws_key_pair" "key_pair" {
  key_name   = "windows-key-pair"  
  public_key = tls_private_key.key_pair.public_key_openssh
}

# Save file
resource "local_file" "ssh_key" {
  filename = "${aws_key_pair.key_pair.key_name}.pem"
  content  = tls_private_key.key_pair.private_key_pem
}

These resources were more important when I was trying to get the auto-generated password for each machine. They don’t serve much of a purpose anymore, but I kept them just in case. If the PowerShell command fails to set the password as planned, I can still use the generated private key to decrypt the generated password in the AWS Console.
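If you do keep the private key on disk, it may be worth tightening how it is written. As a sketch, assuming a recent version of the hashicorp/local provider that includes the local_sensitive_file resource, the local_file block above could be swapped for something like this:

```hcl
# Possible replacement for local_file.ssh_key (assumes a recent hashicorp/local provider)
resource "local_sensitive_file" "ssh_key" {
  filename        = "${aws_key_pair.key_pair.key_name}.pem"
  content         = tls_private_key.key_pair.private_key_pem
  file_permission = "0600" # readable and writable by the owner only
}
```

The key still ends up in the Terraform state either way, so this only helps with the copy written to the working directory.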

Terraform Template for the Active Directory Domain Controller

Next, I have my resource definition for the Domain Controller in my windows-dc.tf file:

# Bootstrapping PowerShell Script
data "template_file" "windows-dc-userdata" {
  template = <<EOF
<powershell>
net user Administrator "${var.windows_ad_safe_password}"
$Password = ConvertTo-SecureString "${var.windows_ad_safe_password}" -AsPlainText -Force;
Add-WindowsFeature AD-Domain-Services -IncludeManagementTools
Install-ADDSForest -CreateDnsDelegation:$false -DatabasePath C:\Windows\NTDS -DomainMode WinThreshold -DomainName ${var.windows_ad_domain_name} -DomainNetbiosName ${var.windows_ad_nebios_name} -ForestMode WinThreshold -InstallDns:$true -LogPath C:\Windows\NTDS -NoRebootOnCompletion:$true -SafeModeAdministratorPassword $Password -SysvolPath C:\Windows\SYSVOL -Force:$true;

Restart-Computer;

</powershell>
EOF
}

# Create EC2 Instance
resource "aws_instance" "windows-server-dc" {
  ami = data.aws_ami.windows-server.id
  instance_type = var.windows_instance_type
  subnet_id = aws_subnet.public-subnet.id
  vpc_security_group_ids = [aws_security_group.aws-windows-sg.id]
  source_dest_check = false
  key_name = aws_key_pair.key_pair.key_name
  user_data = data.template_file.windows-dc-userdata.rendered 
  get_password_data = true
  
  # root disk
  root_block_device {
    volume_size           = var.windows_root_volume_size
    volume_type           = var.windows_root_volume_type
    delete_on_termination = true
    encrypted             = true
  }
  
  tags = {
    Name        = "windows-server-dc"
    Environment = var.app_environment
  }
}

This makes use of the aws_instance resource to create the EC2 instance and the template_file data source to supply a user data script to it. The PowerShell in this template changes the Administrator password to the value of var.windows_ad_safe_password and configures the machine as an Active Directory Domain Controller. Once that is done, the machine is rebooted.
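The instance refers to data.aws_ami.windows-server, which is not among the listings in this post. A minimal sketch of what that data source could look like follows, assuming the Amazon-published Windows AMIs follow the Windows_Server-<version>-<date> naming pattern implied by the supported values of var.windows_version. Treat the filter as an assumption and check it against the AMIs available in your region.

```hcl
# Look up the most recent Amazon-owned Windows Server AMI for the chosen version
data "aws_ami" "windows-server" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["Windows_Server-${var.windows_version}-*"]
  }
}
```

Because most_recent is set, a later apply may pick up a newer AMI and want to replace the instances, which is usually acceptable for a short-lived test environment.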

Terraform Template for the Certificate Services Machine

As I mentioned previously, I needed to install a certificate server in my environment for certain testing scenarios. This configuration can be found in the windows-ca.tf file:

# Bootstrapping PowerShell Script
data "template_file" "windows-ca-userdata" {
  template = <<EOF
<powershell>

Set-ExecutionPolicy unrestricted -Force

net user Administrator "${var.windows_ad_safe_password}"
$domain = (Get-WmiObject win32_computersystem).Domain
$hostname = hostname
$domain_username = "${var.windows_ad_domain_name}\${var.windows_ad_user_name}"
$domain_password = ConvertTo-SecureString "${var.windows_ad_safe_password}" -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential($domain_username,$domain_password)

if ($domain -ne '${var.windows_ad_domain_name}'){
  Start-Sleep -Seconds 600
  Add-Computer -DomainName ${var.windows_ad_domain_name} -Credential $credential -Passthru -Verbose -Force -Restart
}

Install-WindowsFeature ADCS-Cert-Authority -IncludeManagementTools
Add-WindowsFeature Adcs-Web-Enrollment

Install-ADcsCertificationAuthority -Credential $credential -CAType EnterpriseRootCa -Force
Install-AdcsWebEnrollment -Credential $credential -Force

#[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
#Install-PackageProvider -Name NuGet -MinimumVersion 2.8.5.201 -Force
#Install-Module -Name PSPKI -Force
#Get-CertificateTemplate -Name WebServer | Get-CertificateTemplate | Add-CertificateTemplateAcl -Identity "Authenticated Users" -AccessType Allow -AccessMask Read, Enroll | Set-CertificateTemplateAcl

Start-Sleep -Seconds 5
</powershell>
<persist>true</persist>
EOF
}

# Create EC2 Instance
resource "aws_instance" "windows-server-ca" {
  ami = data.aws_ami.windows-server.id
  instance_type = var.windows_instance_type
  subnet_id = aws_subnet.public-subnet.id
  vpc_security_group_ids = [aws_security_group.aws-windows-sg.id]
  source_dest_check = false
  key_name = aws_key_pair.key_pair.key_name
  user_data = data.template_file.windows-ca-userdata.rendered 
  
  # root disk
  root_block_device {
    volume_size           = var.windows_root_volume_size
    volume_type           = var.windows_root_volume_type
    delete_on_termination = true
    encrypted             = true
  }
  
  tags = {
    Name        = "windows-server-ca"
    Environment = var.app_environment
  }
}

This uses PowerShell to set the Administrator password to var.windows_ad_safe_password again. I have also added a check to see whether the machine is already a member of the domain. If it is not, the script sleeps for 10 minutes before attempting the join. This gives the Domain Controller time to be configured and rebooted so it is ready for member servers to join. Because the user data is marked with <persist>true</persist>, the script runs again after the restart triggered by the join; by then the domain check passes, and the remaining PowerShell commands configure this EC2 instance as a certificate server in the domain.
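A side note on template_file: it comes from the hashicorp/template provider, which HashiCorp has archived in favor of built-in features. Since the heredoc is already interpolated by Terraform itself, the same user data could probably be built with a plain local value. This is a sketch under that assumption. The local name domain_join_userdata is my own invention and the script body is abbreviated.

```hcl
locals {
  # Hypothetical name; holds the rendered user data without the template provider
  domain_join_userdata = <<-EOT
    <powershell>
    net user Administrator "${var.windows_ad_safe_password}"
    $domain = (Get-WmiObject win32_computersystem).Domain
    if ($domain -ne '${var.windows_ad_domain_name}') {
      # Sleep and join logic as in the full script in this post
    }
    </powershell>
    <persist>true</persist>
  EOT
}

# In the aws_instance resource, the reference would then become:
#   user_data = local.domain_join_userdata
```

Terraform only interpolates the ${...} sequences, so PowerShell variables such as $domain pass through untouched.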

Terraform Template for the Member Servers

The final step is to create our member servers and join them to the domain. I am doing this with my windows-domain-members.tf file:

# Bootstrapping PowerShell Script
data "template_file" "windows-member-userdata" {
  template = <<EOF
<powershell>

Set-ExecutionPolicy unrestricted -Force

net user Administrator "${var.windows_ad_safe_password}"
$domain = (Get-WmiObject win32_computersystem).Domain
$hostname = hostname
$domain_username = "${var.windows_ad_domain_name}\${var.windows_ad_user_name}"
$domain_password = ConvertTo-SecureString "${var.windows_ad_safe_password}" -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential($domain_username,$domain_password)

if ($domain -ne '${var.windows_ad_domain_name}'){
  Start-Sleep -Seconds 600
  Add-Computer -DomainName ${var.windows_ad_domain_name} -Credential $credential -Passthru -Verbose -Force -Restart
}

Start-Sleep -Seconds 5
</powershell>
<persist>true</persist>
EOF
}

# Create EC2 Instance
resource "aws_instance" "windows-server-member" {
  count = var.windows_domain_member_count
  ami = data.aws_ami.windows-server.id
  instance_type = var.windows_instance_type
  subnet_id = aws_subnet.public-subnet.id
  vpc_security_group_ids = [aws_security_group.aws-windows-sg.id]
  source_dest_check = false
  key_name = aws_key_pair.key_pair.key_name
  user_data = data.template_file.windows-member-userdata.rendered
  
  # root disk
  root_block_device {
    volume_size           = var.windows_root_volume_size
    volume_type           = var.windows_root_volume_type
    delete_on_termination = true
    encrypted             = true
  }
  
  tags = {
    Name        = "windows-server-member"
    Environment = var.app_environment
  }
}

#resource "aws_network_interface" "windows-server-member-private" {
#  subnet_id       = aws_subnet.private-subnet.id
#  attachment {
#    device_index = 1
#    instance   = "${element(aws_instance.windows-server-member.*.id,count.index)}"
#  }
#  private_ips_count = 2
#  security_groups = [ aws_security_group.aws-windows-private-sg.id ]
#  count = "${var.windows_domain_member_count}"
#}

This template once again sets the Administrator password and joins the machine to the domain after sleeping for 10 minutes. I want to call your attention to line 28 of this file, the count line in the aws_instance resource. It uses the count meta-argument with var.windows_domain_member_count to tell Terraform how many instances of this aws_instance definition to create. My example variables.tf sets this to 2. This means that I’ll create a Domain Controller, a Certificate Server, and two member servers.

The commented-out block was for additional testing that I needed a while ago, and I left it in case I need it again. It creates extra private network interfaces on the member servers. I don’t need this right now, so it’s commented out.

Deploying and Wrapping Up

I won’t bore you with running terraform apply to create the test Active Directory because that should be pretty straightforward. After the apply is complete, you should wait roughly 10 minutes before checking on your instances, since the certificate server and member servers sleep that long before joining the domain. After that wait, you should have your testing Active Directory ready to use.
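To save a trip to the AWS Console when connecting over Remote Desktop, a couple of outputs could be added to outputs.tf. This is a small sketch. The output names are my own and are not in the repo.

```hcl
# Public IP of the Domain Controller
output "dc_public_ip" {
  value = aws_instance.windows-server-dc.public_ip
}

# Public IPs of all member servers created through count
output "member_public_ips" {
  value = aws_instance.windows-server-member[*].public_ip
}
```

Running terraform output after the apply should then list the addresses to connect to.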

I’ve also tried to make this a little more useful by uploading everything to my blog post file repo on GitHub here.

Enjoy!