7.1 Introduction
The transition from manual configuration to automated, programmatic network management is a cornerstone of NetDevOps. While Ansible and Python excel at imperative task execution and configuration management, Infrastructure as Code (IaC) tools like Terraform offer a powerful, declarative approach to provisioning and managing infrastructure, including network devices and cloud networking services.
This chapter delves into Terraform, focusing on its application within a NetDevOps framework for both traditional network hardware and modern cloud environments. We will explore Terraform’s core principles, its unique capabilities for state management, and how it integrates with diverse network ecosystems, from Cisco IOS XE routers to cloud-native Virtual Private Clouds (VPCs).
What this chapter covers:
- The fundamental concepts of Infrastructure as Code and Terraform.
- Terraform’s architecture, including providers, resources, data sources, and modules.
- How Terraform interacts with multi-vendor network devices using APIs like NETCONF and RESTCONF, leveraging YANG data models.
- Utilizing Terraform for provisioning and managing cloud network infrastructure across major providers (AWS, Azure, GCP).
- Practical configuration examples for Cisco, Juniper, and cloud environments.
- Security best practices, verification, and troubleshooting techniques specific to Terraform.
- Strategies for optimizing Terraform deployments in production.
Why it’s important: Terraform allows network engineers to define their network infrastructure in code, enabling version control, collaboration, idempotency, and the ability to rapidly deploy, modify, and destroy network components with confidence. This declarative approach minimizes configuration drift, enhances auditability, and supports agile development methodologies for network operations, aligning perfectly with NetDevOps principles.
What you’ll be able to do after: Upon completing this chapter, you will understand how to design and implement network infrastructure using Terraform, manage multi-vendor network devices declaratively, provision cloud network resources, and integrate these practices into your NetDevOps workflows. You will be equipped to leverage Terraform for building robust, scalable, and automated network environments.
7.2 Technical Concepts
7.2.1 Infrastructure as Code (IaC) and Declarative Configuration
Infrastructure as Code (IaC) is the practice of managing and provisioning infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. The entire network infrastructure, including devices, firewalls, load balancers, and cloud resources, is described in a high-level language.
Terraform embodies the declarative IaC paradigm. Instead of writing a script that describes how to achieve a desired state (imperative), you describe what the desired end-state should be. Terraform then figures out the necessary steps to reach that state. This is a significant shift from traditional network CLI management.
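A declarative definition states only the end state. A minimal sketch in HCL (the resource name and CIDR are illustrative):

```hcl
# Declarative IaC: describe the end state, not the steps.
# Terraform decides whether to create, update, or leave this VPC alone.
resource "aws_vpc" "lab" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "declared-not-scripted"
  }
}
```

Applying this configuration twice produces no second change, which is exactly the idempotency property discussed next.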
Key IaC Benefits:
- Version Control: Track changes, revert to previous states, collaborate using tools like Git.
- Idempotency: Applying the same configuration multiple times yields the same result without unintended side effects.
- Consistency: Eliminates “snowflake” configurations, ensuring uniformity across environments.
- Repeatability: Easily recreate environments (dev, test, production).
- Reduced Human Error: Automates complex provisioning tasks.
- Auditability: Changes are tracked in code, providing a clear history.
IaC Workflow with Terraform
digraph IaC_Terraform_Workflow {
rankdir=LR;
node [shape=box];
user [label="Network Engineer (HCL Code)"];
git [label="Version Control (Git)"];
terraform_cli [label="Terraform CLI"];
provider [label="Terraform Provider"];
network_devices [label="Network Devices/Cloud APIs", shape=cylinder];
terraform_state [label="Terraform State File", shape=Mrecord];
user -> git [label="Pushes HCL"];
git -> terraform_cli [label="Fetches HCL"];
terraform_cli -> terraform_state [label="Loads Current State"];
terraform_cli -> provider [label="Requests Resource State"];
provider -> network_devices [label="Interacts with APIs"];
network_devices -> provider [label="Returns Actual State"];
provider -> terraform_cli [label="Returns Actual State"];
terraform_cli -> terraform_state [label="Compares, Updates State"];
terraform_cli -> user [label="Shows Plan/Status"];
terraform_cli -> provider [label="Applies Changes (if approved)"];
}
Figure 7.1: Terraform Infrastructure as Code Workflow
7.2.2 Terraform Core Concepts
a) HashiCorp Configuration Language (HCL): Terraform uses HCL, a declarative language designed for configuration files. It’s human-readable and supports expressions, variables, and complex data structures.
b) Providers: Providers are plugins that Terraform uses to interact with an upstream API to manage resources. There are thousands of providers for cloud services (AWS, Azure, GCP), SaaS products, and network devices. For network devices, providers often translate HCL into API calls (NETCONF, RESTCONF, gNMI, vendor-specific APIs) that devices understand.
* Examples: aws, azurerm, google, iosxe (Cisco IOS XE), junos, arista, netconf.
c) Resources: The fundamental building blocks of infrastructure defined in Terraform. A resource block describes one or more infrastructure objects, such as a cloud VPC, a network interface, a router, or a VLAN. Terraform manages the lifecycle of these resources (create, read, update, delete).
d) Data Sources: Allow Terraform to fetch information about existing infrastructure objects, without managing their lifecycle. This is useful for querying current state or referencing resources managed outside of Terraform.
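As a short sketch (the tag value is hypothetical), a data source can look up a VPC created outside Terraform and feed its ID into a managed resource:

```hcl
# Read-only lookup of an existing, externally managed VPC.
data "aws_vpc" "shared" {
  filter {
    name   = "tag:Name"
    values = ["shared-services-vpc"] # hypothetical tag value
  }
}

# A new subnet managed by Terraform inside the unmanaged VPC.
resource "aws_subnet" "addon" {
  vpc_id     = data.aws_vpc.shared.id
  cidr_block = "10.50.5.0/24"
}
```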
e) Modules: Self-contained packages of Terraform configurations that are reusable. Modules encapsulate related resources, promoting organization, reusability, and consistency. A module can define a complete network segment, a VPN tunnel, or a standardized device configuration.
f) State: Terraform maintains a terraform.tfstate file, which maps the real-world resources to your configuration. This state file is crucial:
* It tracks metadata about your infrastructure.
* It’s used to compare the desired state (HCL) with the actual state (remote infrastructure).
* It helps Terraform understand which resources to create, update, or destroy.
* Critical Security Note: State files can contain sensitive information. Store them securely (e.g., in an encrypted remote backend) and never commit them to source control, public or private.
g) Remote Backends: For collaborative environments and security, Terraform state files should be stored in a remote, shared, and versioned backend (e.g., AWS S3, Azure Blob Storage, HashiCorp Consul, Terraform Cloud/Enterprise). Remote backends also support state locking to prevent concurrent modifications.
h) Workspaces: Allow you to manage multiple distinct instances of the same configuration. This is useful for creating separate environments (e.g., dev, stage, prod) from a single set of Terraform files.
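A minimal sketch of workspace-aware configuration (the CIDRs are illustrative); the built-in terraform.workspace value exposes the active workspace name:

```hcl
# Select or create a workspace before applying:
#   terraform workspace new dev
#   terraform workspace select dev

resource "aws_vpc" "main" {
  # One HCL definition, different values per environment.
  cidr_block = terraform.workspace == "prod" ? "10.10.0.0/16" : "10.20.0.0/16"

  tags = {
    Name = "vpc-${terraform.workspace}"
  }
}
```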
7.2.3 Terraform for Network Devices
Modern network devices expose programmatic interfaces, moving beyond CLI scraping. Terraform leverages these APIs, primarily NETCONF and RESTCONF, which utilize YANG data models to define device configurations and operational states.
a) YANG Data Models: YANG (RFC 6020, RFC 7950) is a data modeling language used to define the configuration and state data of network devices. It provides a standardized, structured, and vendor-agnostic way to represent network elements.
- Key Benefit: Enables programmatic interaction with devices, allowing tools like Terraform to understand and manipulate configurations consistently across different vendors.
- RFC References: RFC 6020 (YANG 1.0), RFC 7950 (YANG 1.1)
b) NETCONF: NETCONF (Network Configuration Protocol - RFC 6241) is a standardized, XML-based protocol for installing, manipulating, and deleting the configuration of network devices. It’s connection-oriented and uses RPCs (Remote Procedure Calls). Terraform providers for network devices often use NETCONF for robust, transaction-based configuration management.
- Key Benefit: Strong error handling, configuration validation, transaction support.
- RFC Reference: RFC 6241: Network Configuration Protocol (NETCONF)
c) RESTCONF: RESTCONF (RFC 8040) is an HTTP-based protocol that provides a REST-like interface for interacting with data defined by YANG models. It’s often preferred for simpler integrations and web-based applications. Many modern network devices support both NETCONF and RESTCONF.
- Key Benefit: Simpler to integrate with web-based tools, stateless nature.
- RFC Reference: RFC 8040: RESTCONF Protocol
d) gRPC Network Management Interface (gNMI): While NETCONF/RESTCONF are dominant for configuration, gNMI (a Google-led initiative) is gaining traction for high-performance telemetry and configuration. Some advanced Terraform providers may interface with gNMI for specific use cases, though it’s less common for general configuration provisioning than NETCONF/RESTCONF.
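Where a vendor-native provider exists, the YANG-backed API is hidden behind ordinary HCL. A hedged sketch using the community CiscoDevNet/iosxe provider (resource and attribute names follow that provider's documentation; verify them against the version you install):

```hcl
terraform {
  required_providers {
    iosxe = {
      source = "CiscoDevNet/iosxe"
    }
  }
}

variable "device_password" {
  type      = string
  sensitive = true
}

provider "iosxe" {
  url      = "https://192.168.1.10" # RESTCONF endpoint of the device
  username = "terraform"
  password = var.device_password # never hardcode credentials
}

# Generic RESTCONF resource: 'path' targets the YANG model,
# 'attributes' becomes the JSON payload.
resource "iosxe_restconf" "vlan100" {
  path = "Cisco-IOS-XE-native:native/vlan/vlan-list=100"

  attributes = {
    id   = "100"
    name = "Terraform_VLAN"
  }
}
```

Compared to hand-built XML payloads, this keeps the configuration HCL-native while the provider handles the RESTCONF encoding.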
Terraform Interaction with Network Devices
@startuml
skinparam handwritten true
skinparam cloudBorderColor #ADD8E6
skinparam nodeBorderColor #8A2BE2
skinparam databaseBorderColor #FFD700
skinparam rectangleBorderColor #FFA500
cloud "Terraform Control Plane" as TF {
rectangle "Terraform CLI" as CLI
rectangle "Terraform State" as State
rectangle "Terraform Providers" as Providers
}
package "Network Automation Interfaces" {
component "NETCONF" as NETCONF_API
component "RESTCONF" as RESTCONF_API
component "Vendor API (e.g., ACI)" as Vendor_API
}
node "Cisco IOS XE Device" as IOSXE {
database "YANG Data Model" as IOSXE_YANG
}
node "Juniper JunOS Device" as JUNOS {
database "YANG Data Model" as JUNOS_YANG
}
node "Arista EOS Device" as ARISTA {
database "YANG Data Model" as ARISTA_YANG
}
CLI --> Providers : HCL configuration
Providers --> NETCONF_API : NETCONF RPCs (XML)
Providers --> RESTCONF_API : RESTCONF HTTP (JSON/XML)
Providers --> Vendor_API : Vendor-specific Calls
Providers <--> State : Read/Write State
NETCONF_API --> IOSXE_YANG : Configure Device
RESTCONF_API --> IOSXE_YANG : Configure Device
NETCONF_API --> JUNOS_YANG : Configure Device
RESTCONF_API --> JUNOS_YANG : Configure Device
Vendor_API --> ARISTA_YANG : Configure Device
IOSXE_YANG -- IOSXE
JUNOS_YANG -- JUNOS
ARISTA_YANG -- ARISTA
@enduml
Figure 7.2: Terraform Interaction with Multi-Vendor Network Devices
7.2.4 Terraform for Cloud Networking
Cloud platforms (AWS, Azure, GCP) fundamentally operate on an IaC model, exposing comprehensive APIs for all their services, including networking. Terraform excels at provisioning and managing these cloud network resources, integrating seamlessly with their native APIs.
Common Cloud Network Resources Managed by Terraform:
- Virtual Private Clouds (VPCs) / Virtual Networks (VNets): Isolated network segments in the cloud.
- Subnets: Divisions within a VPC/VNet.
- Route Tables: Control network traffic flow.
- Security Groups / Network Security Groups (NSGs): Stateful firewalls for instances/subnets.
- Load Balancers: Distribute traffic across instances.
- VPN Gateways / Direct Connect / ExpressRoute / Cloud Interconnect: Hybrid cloud connectivity.
- Transit Gateways / Hub-and-Spoke Topologies: Centralized routing and connectivity for multiple VPCs.
- DNS Services: Route 53, Azure DNS, Cloud DNS.
The declarative nature of Terraform maps perfectly to the API-driven nature of cloud networking.
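The same declarative shape carries across providers. A brief sketch of an equivalent Azure virtual network (the resource group name and region are assumptions):

```hcl
resource "azurerm_virtual_network" "netdevops" {
  name                = "netdevops-vnet"
  address_space       = ["10.30.0.0/16"]
  location            = "eastus"
  resource_group_name = "netdevops-rg" # assumed to exist already
}

resource "azurerm_subnet" "app" {
  name                 = "app-subnet"
  resource_group_name  = "netdevops-rg"
  virtual_network_name = azurerm_virtual_network.netdevops.name
  address_prefixes     = ["10.30.1.0/24"]
}
```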
Hybrid Cloud Network Architecture with Terraform
nwdiag {
// Define custom styles for clouds and on-prem
define style cloud {
color = "#ADD8E6"; // Light Blue
border_color = "#4169E1"; // Royal Blue
font_color = "#191970"; // Midnight Blue
border_width = 2;
}
define style on_prem {
color = "#FFDAB9"; // Peach Puff
border_color = "#FFA07A"; // Light Salmon
font_color = "#8B0000"; // Dark Red
border_width = 2;
}
define style device_router {
shape = router;
color = "#E0FFFF";
border_color = "#00BFFF";
}
define style device_server {
shape = box;
color = "#F0FFF0";
border_color = "#3CB371";
}
// Cloud network (AWS)
cloud "AWS Cloud" {
style = cloud;
network "AWS VPC Production" {
address = "10.10.0.0/16"
description = "Terraform-Managed Prod VPC"
network "Private Subnet A" {
address = "10.10.1.0/24"
instance_prod_app [address = "10.10.1.10", style = device_server];
}
network "Private Subnet B" {
address = "10.10.2.0/24"
instance_prod_db [address = "10.10.2.20", style = device_server];
}
network "Public Subnet" {
address = "10.10.0.0/24"
lb_public [address = "10.10.0.5", shape=cloud]; // Representing a Load Balancer
}
}
network "AWS VPC Development" {
address = "10.20.0.0/16"
description = "Terraform-Managed Dev VPC"
dev_router [address = "10.20.0.1", style = device_router];
}
// Transit Gateway for inter-VPC and hybrid connectivity
network "AWS Transit Gateway" {
address = "VPN/Direct Connect Tunnel"
description = "Centralized Routing Hub"
aws_tgw [shape=cloud]; // Abstract representation of TGW
}
aws_tgw -- "AWS VPC Production" : VPC Attachment
aws_tgw -- "AWS VPC Development" : VPC Attachment
}
// On-Premises Network
group "On-Premises Data Center" {
style = on_prem;
network "Internal Network" {
address = "192.168.1.0/24"
onprem_router [address = "192.168.1.1", style = device_router];
onprem_server [address = "192.168.1.100", style = device_server];
}
network "DMZ Network" {
address = "172.16.0.0/24"
firewall [address = "172.16.0.1", shape=firewall];
}
}
// Connectivity between On-Prem and Cloud
onprem_router -- firewall;
firewall -- aws_tgw : IPsec VPN Tunnel
// Implicitly, lb_public has internet access
}
Figure 7.3: Hybrid Cloud Network Topology Managed by Terraform
7.3 Configuration Examples
These examples demonstrate using Terraform to configure network devices and cloud resources. They assume a basic Terraform setup (installed CLI, configured cloud credentials).
7.3.1 Cisco IOS XE (NETCONF/RESTCONF Provider)
This example uses the generic netconf provider to configure a VLAN on a Cisco IOS XE device. The device needs to have NETCONF or RESTCONF enabled and configured for SSH.
Prerequisites:
- Cisco IOS XE device reachable via SSH.
- NETCONF/RESTCONF enabled on the device. Example config:
!
username terraform privilege 15 secret 0 terraform_password
!
netconf-yang ssh
!
restconf transport https port 443
!
- A ~/.netconf.yml or similar file with credentials, or direct inline credentials (less secure).
Terraform Files:
main.tf
# main.tf for Cisco IOS XE VLAN Configuration
# Configure the NETCONF provider
# Ensure host, username, and password are correct for your device
provider "netconf" {
device = "iosxe_router" # Name defined in ~/.netconf.yml or provide directly
# Alternatively, provide host, username, password directly:
# host = "192.168.1.10"
# username = "terraform"
# password = "terraform_password"
port = 830 # Default NETCONF over SSH port
# Use skip_verify for lab environments if self-signed certs
# skip_verify = true
}
# Define a NETCONF device for the provider
# This resource doesn't configure anything on the device,
# but it's used by other resources to target the specific device.
resource "netconf_device" "iosxe_router" {
name = "iosxe_router"
host = "192.168.1.10" # Replace with your IOS XE device IP
}
# Define a resource to manage a VLAN using NETCONF
# This uses a generic 'netconf_edit_config' resource, providing the XML payload.
# For production, consider vendor-specific providers (e.g., CiscoDevNet/iosxe) if available,
# as they abstract away the XML/JSON and provide HCL-native resource definitions.
resource "netconf_edit_config" "vlan_data" {
device_name = netconf_device.iosxe_router.name
target = "running" # Apply to running configuration
config_xml = <<EOF
<config>
<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
<vlan>
<vlan-list>
<id>100</id>
<name>Terraform_VLAN</name>
</vlan-list>
<vlan-list>
<id>101</id>
<name>Another_Terraform_VLAN</name>
</vlan-list>
</vlan>
</native>
</config>
EOF
# Lifecycle rule to prevent Terraform from destroying the VLAN if the resource is removed
# This is a common practice for core network configurations.
# If you want Terraform to manage deletion, remove this block.
lifecycle {
prevent_destroy = true
ignore_changes = [config_xml] # Drift in config_xml on the device won't trigger a re-apply
}
}
# Output the device name after successful application
output "configured_iosxe_device" {
value = netconf_device.iosxe_router.name
description = "The name of the Cisco IOS XE device configured by Terraform."
}
Security Warning: Directly embedding credentials in main.tf is highly insecure for production. Use environment variables, terraform.tfvars, or a secrets management tool like HashiCorp Vault. For netconf provider, storing credentials in ~/.netconf.yml or using SSH agent forwarding is more secure.
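One safer pattern is a sensitive input variable supplied via the environment, sketched below (the variable name is illustrative):

```hcl
variable "device_password" {
  description = "NETCONF password for the IOS XE device"
  type        = string
  sensitive   = true # redacted from plan output and logs
}

provider "netconf" {
  host     = "192.168.1.10"
  username = "terraform"
  password = var.device_password
}

# Supply the value out-of-band, e.g.:
#   export TF_VAR_device_password='...'
```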
Deployment Steps:
- Initialize Terraform: terraform init
- Review the plan: terraform plan
- Apply the configuration: terraform apply
Verification Commands (Cisco IOS XE):
! Verify VLAN configuration
show vlan brief
show running-config | section vlan
Expected Output:
VLAN Name Status Ports
---- -------------------------------- --------- -------------------------------
1 default active
100 Terraform_VLAN active
101 Another_Terraform_VLAN active
...
7.3.2 Juniper JunOS (NETCONF/RESTCONF Provider)
Similar to Cisco, this example uses the netconf provider to configure a logical interface on a Juniper JunOS device.
Prerequisites:
- Juniper JunOS device reachable via SSH.
- NETCONF enabled on the device. Example config:
# Set up NETCONF over SSH
set system services netconf ssh
# Create user for Terraform
set system login user terraform class super-user authentication plain-text-password
# Provide password when prompted
Terraform Files:
main.tf
# main.tf for Juniper JunOS Interface Configuration
# Configure the NETCONF provider
provider "netconf" {
device = "juniper_srx" # Name defined in ~/.netconf.yml or provide directly
# host = "192.168.1.20" # Replace with your JunOS device IP
# username = "terraform"
# password = "your_juniper_password"
port = 830
}
resource "netconf_device" "juniper_srx" {
name = "juniper_srx"
host = "192.168.1.20" # Replace with your JunOS device IP
}
# Define a resource to manage a logical interface on JunOS using NETCONF
resource "netconf_edit_config" "loopback_interface" {
device_name = netconf_device.juniper_srx.name
target = "running" # Apply to running configuration
config_xml = <<EOF
<configuration>
<interfaces>
<interface>
<name>lo0</name>
<unit>
<name>0</name>
<family>
<inet>
<address>
<name>10.0.0.1/32</name>
</address>
</inet>
</family>
</unit>
</interface>
<interface>
<name>ge-0/0/0</name>
<unit>
<name>0</name>
<family>
<inet>
<address>
<name>192.168.20.1/24</name>
</address>
</inet>
</family>
</unit>
</interface>
</interfaces>
</configuration>
EOF
# Commit and synchronize changes
commit_confirmed = false # Set to true for a 'commit confirmed' operation
commit_synchronize = true # Ensures configuration is synchronized to the backup Routing Engine
commit_comment = "Terraform: Configured loopback and ge-0/0/0.0 interfaces"
# lifecycle { prevent_destroy = true } # Consider for critical resources
}
output "configured_juniper_device" {
value = netconf_device.juniper_srx.name
description = "The name of the Juniper JunOS device configured by Terraform."
}
Security Warning: Same as for Cisco, avoid hardcoding credentials.
Deployment Steps:
- Initialize Terraform: terraform init
- Review the plan: terraform plan
- Apply the configuration: terraform apply
Verification Commands (Juniper JunOS):
# Verify interface configuration
show interfaces lo0.0
show interfaces ge-0/0/0.0
show configuration interfaces | display set
Expected Output:
user@juniper-srx> show interfaces lo0.0
Logical interface lo0.0 (Index 66) (SNMP ifIndex 506)
Flags: Up SNMP-Traps 0x4000000 Encapsulation: ENET2
inet addr 10.0.0.1/32
user@juniper-srx> show interfaces ge-0/0/0.0
Logical interface ge-0/0/0.0 (Index 67) (SNMP ifIndex 507)
Flags: Up SNMP-Traps 0x4000000 Encapsulation: ENET2
inet addr 192.168.20.1/24
7.3.3 Cloud Networking (AWS VPC Example)
This example provisions an AWS VPC, subnets, and a security group.
Prerequisites:
- AWS account with appropriate IAM permissions.
- AWS credentials configured for Terraform (e.g., via the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or ~/.aws/credentials).
Terraform Files:
main.tf
# main.tf for AWS VPC and Subnet Configuration
# Configure the AWS provider
provider "aws" {
region = "us-east-1" # Specify your desired AWS region
}
# Create a new VPC
resource "aws_vpc" "netdevops_vpc" {
cidr_block = "10.0.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "NetDevOps-Terraform-VPC"
ManagedBy = "Terraform"
Environment = "Dev"
}
}
# Create a public subnet
resource "aws_subnet" "public_subnet" {
vpc_id = aws_vpc.netdevops_vpc.id
cidr_block = "10.0.1.0/24"
availability_zone = "us-east-1a" # HCL cannot reference provider arguments; keep this in sync with the provider region
map_public_ip_on_launch = true # Instances in this subnet get public IPs
tags = {
Name = "NetDevOps-Public-Subnet"
ManagedBy = "Terraform"
}
}
# Create a private subnet
resource "aws_subnet" "private_subnet" {
vpc_id = aws_vpc.netdevops_vpc.id
cidr_block = "10.0.2.0/24"
availability_zone = "us-east-1a" # Must match the provider region (see note above)
tags = {
Name = "NetDevOps-Private-Subnet"
ManagedBy = "Terraform"
}
}
# Create a Security Group (Firewall Rules)
resource "aws_security_group" "web_sg" {
name = "web-security-group"
description = "Allow HTTP/HTTPS traffic"
vpc_id = aws_vpc.netdevops_vpc.id
ingress {
description = "Allow HTTP from anywhere"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
description = "Allow HTTPS from anywhere"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1" # Allow all outbound traffic
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "WebSecurityGroup"
ManagedBy = "Terraform"
}
}
# Output the VPC ID and Subnet IDs
output "vpc_id" {
value = aws_vpc.netdevops_vpc.id
description = "The ID of the created VPC."
}
output "public_subnet_id" {
value = aws_subnet.public_subnet.id
description = "The ID of the public subnet."
}
output "private_subnet_id" {
value = aws_subnet.private_subnet.id
description = "The ID of the private subnet."
}
Deployment Steps:
- Initialize Terraform: terraform init
- Review the plan: terraform plan
- Apply the configuration: terraform apply
Verification (AWS Console/CLI):
- Navigate to the VPC service in the AWS console.
- Verify the existence of “NetDevOps-Terraform-VPC” with CIDR 10.0.0.0/16.
- Check the subnets for NetDevOps-Public-Subnet (10.0.1.0/24) and NetDevOps-Private-Subnet (10.0.2.0/24).
- Confirm the web-security-group exists with appropriate ingress/egress rules.
Expected Output (CLI):
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
Outputs:
private_subnet_id = "subnet-xxxxxxxxxxxxxxxxxx"
public_subnet_id = "subnet-xxxxxxxxxxxxxxxxxx"
vpc_id = "vpc-xxxxxxxxxxxxxxxxxx"
7.4 Network Diagrams
As demonstrated throughout the examples, clear network diagrams are crucial for understanding and documenting infrastructure managed by Terraform.
7.4.1 Network Topology (nwdiag)
A diagram illustrating a typical multi-region, hybrid cloud network managed by Terraform.
nwdiag {
// Define custom styles for different components
define style cloud_region {
color = "#E0FFFF"; // Azure light blue
border_color = "#4169E1"; // Royal Blue
font_color = "#191970"; // Midnight Blue
border_width = 2;
fontsize = 14;
}
define style on_prem_dc {
color = "#FFFACD"; // Lemon Chiffon
border_color = "#DAA520"; // Goldenrod
font_color = "#8B0000"; // Dark Red
border_width = 2;
fontsize = 14;
}
define style device_router {
shape = router;
color = "#F0FFF0"; // Honeydew
border_color = "#3CB371"; // Medium Sea Green
fontsize = 12;
}
define style device_firewall {
shape = firewall;
color = "#FFEFD5"; // PapayaWhip
border_color = "#FF8C00"; // Dark Orange
fontsize = 12;
}
define style device_server {
shape = box;
color = "#F5F5DC"; // Beige
border_color = "#A0522D"; // Sienna
fontsize = 12;
}
define style service_lb {
shape = cloud;
color = "#F0F8FF"; // AliceBlue
border_color = "#1E90FF"; // DodgerBlue
fontsize = 12;
}
define style internet_cloud {
shape = cloud;
color = "#F8F8FF"; // GhostWhite
border_color = "#778899"; // Light Slate Gray
fontsize = 12;
}
// AWS Region 1 (us-east-1)
cloud "AWS Region A (us-east-1)" {
style = cloud_region;
network "AWS VPC A (10.1.0.0/16)" {
description = "Terraform-Managed"
router_aws_a [style = device_router];
network "Public Subnet (10.1.1.0/24)" {
lb_web_a [style = service_lb];
}
network "Private Subnet (10.1.2.0/24)" {
app_server_a [style = device_server];
}
}
}
// AWS Region 2 (us-west-2)
cloud "AWS Region B (us-west-2)" {
style = cloud_region;
network "AWS VPC B (10.2.0.0/16)" {
description = "Terraform-Managed"
router_aws_b [style = device_router];
network "Public Subnet (10.2.1.0/24)" {
lb_web_b [style = service_lb];
}
network "Private Subnet (10.2.2.0/24)" {
app_server_b [style = device_server];
}
}
}
// On-Premises Data Center
group "On-Premises Data Center" {
style = on_prem_dc;
network "Core Network (192.168.10.0/24)" {
description = "Managed by Terraform/Ansible"
core_router [style = device_router, address="192.168.10.1"];
core_switch [shape = switch];
db_server [style = device_server, address="192.168.10.10"];
}
network "DMZ (172.16.0.0/24)" {
fw_onprem [style = device_firewall, address="172.16.0.1"];
web_proxy [style = device_server, address="172.16.0.10"];
}
}
// Interconnects
internet "Internet" {
style = internet_cloud;
}
// Cloud Interconnections
router_aws_a -- router_aws_b [label="VPC Peering / Transit Gateway"];
fw_onprem -- router_aws_a [label="Direct Connect / VPN Tunnel"];
// Internet Connectivity
lb_web_a -- internet;
lb_web_b -- internet;
web_proxy -- internet;
// Internal On-Prem Connectivity
core_router -- fw_onprem;
core_router -- core_switch;
core_switch -- db_server;
}
Figure 7.4: Multi-Region Hybrid Cloud Network Topology
7.4.2 Protocol Flow (graphviz)
Illustrating the Terraform plan and apply lifecycle.
digraph Terraform_Lifecycle {
rankdir=LR;
node [shape=box, style=filled, fillcolor=lightblue];
edge [color=gray, arrowhead=vee];
start [label="Start (terraform init)", shape=ellipse, fillcolor=lightgreen];
config [label="HCL Configuration (.tf files)"];
provider_plugins [label="Provider Plugins", shape=cylinder, fillcolor=lightgray];
state_file [label="Terraform State (.tfstate)", shape=Mrecord, fillcolor=lightyellow];
current_infra [label="Current Infrastructure (APIs)", shape=cylinder, fillcolor=lightgray];
plan [label="terraform plan", shape=box, fillcolor=orange];
diff [label="Compare (HCL vs State vs Actual)"];
execution_plan [label="Execution Plan (Proposed Changes)", fillcolor=lightpink];
review [label="Review Plan & Approve", shape=hexagon, fillcolor=cyan];
apply [label="terraform apply", shape=box, fillcolor=orange];
provider_action [label="Provider Actions (API Calls)", fillcolor=lightgray];
updated_infra [label="Updated Infrastructure", shape=cylinder, fillcolor=lightgreen];
update_state [label="Update State File"];
end [label="End (Infrastructure Deployed/Modified)", shape=ellipse, fillcolor=lightgreen];
destroy [label="terraform destroy", shape=box, fillcolor=red];
destroy_action [label="Provider Actions (Delete API Calls)", fillcolor=red];
destroyed_infra [label="Destroyed Infrastructure", shape=cylinder, fillcolor=darkgray];
clear_state [label="Clear State File"];
start -> config;
config -> provider_plugins;
provider_plugins -> plan;
plan -> state_file [label="Read State"];
plan -> current_infra [label="Read Current State via APIs"];
plan -> diff;
diff -> execution_plan;
execution_plan -> review;
review -> apply [label="Approve"];
review -> destroy [label="Decide to Destroy (Alternative)"];
apply -> provider_action [label="Execute Changes via APIs"];
provider_action -> updated_infra;
updated_infra -> update_state;
update_state -> state_file;
update_state -> end;
destroy -> destroy_action;
destroy_action -> destroyed_infra;
destroyed_infra -> clear_state;
clear_state -> state_file;
clear_state -> end;
}
Figure 7.5: Terraform Plan and Apply Lifecycle
7.4.3 Architecture (PlantUML)
High-level architecture depicting a NetDevOps pipeline utilizing Terraform for both network and cloud provisioning.
@startuml
skinparam handwritten true
skinparam style strictuml
rectangle "Network Engineering Team" as Team
cloud "Version Control System (GitLab/GitHub)" as VCS
rectangle "CI/CD Pipeline (Jenkins/GitLab CI/GitHub Actions)" as CI_CD
cloud "Terraform Cloud/Enterprise" as TF_Cloud_RemoteState
rectangle "Terraform Workspace" as TF_Workspace
package "Network Automation Layer" {
component "Terraform CLI" as TF_CLI
component "Terraform Providers" as TF_Providers
}
package "Cloud Infrastructure" {
cloud "AWS VPCs & Services" as AWS
cloud "Azure VNets & Services" as Azure
cloud "GCP VPCs & Services" as GCP
}
package "On-Prem Network Infrastructure" {
node "Cisco IOS XE Devices" as Cisco
node "Juniper JunOS Devices" as Juniper
node "Arista EOS Devices" as Arista
}
Team --> VCS : Pushes HCL Code
VCS --> CI_CD : Trigger Pipeline (Webhook)
CI_CD --> TF_Workspace : Checkout Code
CI_CD --> TF_CLI : Executes 'terraform plan/apply'
TF_CLI <--> TF_Cloud_RemoteState : Remote State Backend & Locking
TF_CLI --> TF_Providers : Delegates API Calls
TF_Providers --> AWS : Provision/Manage Cloud Resources
TF_Providers --> Azure : Provision/Manage Cloud Resources
TF_Providers --> GCP : Provision/Manage Cloud Resources
TF_Providers --> Cisco : Config via NETCONF/RESTCONF/API
TF_Providers --> Juniper : Config via NETCONF/RESTCONF/API
TF_Providers --> Arista : Config via EOS API/NETCONF
VCS ..> TF_Cloud_RemoteState : Store TF Modules
@enduml
Figure 7.6: NetDevOps Architecture with Terraform for Hybrid Infrastructure
7.5 Automation Examples (Terraform HCL)
These examples focus on the Terraform HCL configurations themselves, as Terraform is primarily an IaC automation tool.
7.5.1 Modular AWS VPC with Subnets and Route Tables
This demonstrates a more robust, modular approach for AWS networking, creating a reusable VPC module.
Module Definition (modules/vpc/main.tf):
# modules/vpc/main.tf
variable "region" {
description = "AWS region for the VPC"
type = string
}
variable "vpc_cidr" {
description = "CIDR block for the VPC"
type = string
}
variable "public_subnet_cidrs" {
description = "List of CIDR blocks for public subnets"
type = list(string)
}
variable "private_subnet_cidrs" {
description = "List of CIDR blocks for private subnets"
type = list(string)
}
variable "vpc_name" {
description = "Name tag for the VPC"
type = string
default = "terraform-managed-vpc"
}
# AWS VPC
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = var.vpc_name
ManagedBy = "Terraform-Module"
}
}
# Internet Gateway (for public subnets to access internet)
resource "aws_internet_gateway" "main" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${var.vpc_name}-igw"
}
}
# Public Subnets
resource "aws_subnet" "public" {
count = length(var.public_subnet_cidrs)
vpc_id = aws_vpc.main.id
cidr_block = var.public_subnet_cidrs[count.index]
availability_zone = "${var.region}${element(["a", "b", "c"], count.index)}"
map_public_ip_on_launch = true
tags = {
Name = "${var.vpc_name}-public-subnet-${count.index}"
ManagedBy = "Terraform-Module"
}
}
# Private Subnets
resource "aws_subnet" "private" {
count = length(var.private_subnet_cidrs)
vpc_id = aws_vpc.main.id
cidr_block = var.private_subnet_cidrs[count.index]
availability_zone = "${var.region}${element(["a", "b", "c"], count.index)}"
tags = {
Name = "${var.vpc_name}-private-subnet-${count.index}"
ManagedBy = "Terraform-Module"
}
}
# Public Route Table
resource "aws_route_table" "public" {
vpc_id = aws_vpc.main.id
tags = {
Name = "${var.vpc_name}-public-rt"
}
}
resource "aws_route" "public_internet_gateway" {
route_table_id = aws_route_table.public.id
destination_cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main.id
}
# Associate public subnets with public route table
resource "aws_route_table_association" "public" {
count = length(var.public_subnet_cidrs)
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}
# Output values
output "vpc_id" {
value = aws_vpc.main.id
}
output "public_subnet_ids" {
value = [for s in aws_subnet.public : s.id]
}
output "private_subnet_ids" {
value = [for s in aws_subnet.private : s.id]
}
Root Configuration (main.tf in parent directory):
# main.tf in root directory
provider "aws" {
region = "us-east-1"
}
module "prod_vpc" {
source = "./modules/vpc" # Path to your VPC module
region = "us-east-1"
vpc_cidr = "10.10.0.0/16"
public_subnet_cidrs = ["10.10.1.0/24", "10.10.2.0/24"]
private_subnet_cidrs = ["10.10.10.0/24", "10.10.11.0/24"]
vpc_name = "production-vpc"
}
module "dev_vpc" {
source = "./modules/vpc" # Reuse the same module for dev environment
region = "us-east-1"
vpc_cidr = "10.20.0.0/16"
public_subnet_cidrs = ["10.20.1.0/24"]
private_subnet_cidrs = ["10.20.10.0/24"]
vpc_name = "development-vpc"
}
output "prod_vpc_id" {
value = module.prod_vpc.vpc_id
}
output "dev_vpc_id" {
value = module.dev_vpc.vpc_id
}
This modular structure allows you to instantiate multiple VPCs with consistent patterns, drastically reducing code duplication and maintenance overhead.
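Module outputs can also feed other resources in the root configuration. A brief sketch (the security group below is illustrative and not part of the lab; `aws_security_group` and its `ingress` block are standard AWS provider constructs):

```hcl
# Consume the module's outputs elsewhere in the root configuration.
# Illustrative only: attaches a security group to the production VPC by ID.
resource "aws_security_group" "prod_web" {
  name        = "prod-web-sg"
  description = "Allow HTTPS into the production VPC"
  vpc_id      = module.prod_vpc.vpc_id # output exposed by the VPC module

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Because the security group references `module.prod_vpc.vpc_id`, Terraform automatically orders its creation after the VPC module.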
7.6 Security Considerations
Security is paramount in IaC, especially when managing network infrastructure.
7.6.1 State File Security
- Sensitive Data: Terraform state files often contain sensitive information (IP addresses, network topology details, sometimes even plaintext passwords if not handled carefully).
- Remote Backend: Always use a remote backend (e.g., AWS S3 with encryption, Azure Blob Storage, HashiCorp Consul, Terraform Cloud/Enterprise) for state storage. This centralizes state, enables locking, and provides access control.
- Encryption: Ensure the remote backend is configured for encryption at rest (e.g., S3 server-side encryption).
- Access Control: Implement strict Access Control Lists (ACLs) or IAM policies for who can read/write to the state file. Follow the principle of least privilege.
- Never Commit to Git: The `terraform.tfstate` file should never be committed to version control directly. Add it to `.gitignore`.
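One concrete way to enforce this is to add the state and provider-cache patterns to `.gitignore` up front (a minimal sketch; adjust patterns to your repository layout):

```shell
# Append common Terraform ignore patterns to .gitignore so state files and
# provider caches never reach version control (creates the file if absent).
cat >> .gitignore <<'EOF'
terraform.tfstate
terraform.tfstate.*
.terraform/
*.auto.tfvars
crash.log
EOF
```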
7.6.2 Credentials and Secrets Management
- Avoid Hardcoding: Never hardcode API keys, usernames, passwords, or tokens directly in your HCL code.
- Environment Variables: Use environment variables for sensitive provider authentication (e.g., `TF_VAR_aws_access_key_id`, `NETCONF_USERNAME`).
- Secrets Management Tools: Integrate with dedicated secrets management solutions like HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager. These tools securely store and provide access to credentials at runtime.
- Terraform CLI var files: For non-sensitive but environment-specific variables, use `terraform.tfvars`. For sensitive variables, use a `*.auto.tfvars` file that is not committed to Git, or better yet, inject them from a secrets manager in a CI/CD pipeline.
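Terraform automatically maps environment variables named `TF_VAR_<name>` onto input variables, so a sensitive value never has to appear in a committed file. A sketch (the variable name `iosxe_password` is illustrative):

```hcl
# The value is supplied at runtime, e.g.:
#   export TF_VAR_iosxe_password='...'   # or injected by a secrets manager in CI
variable "iosxe_password" {
  description = "NETCONF password for the IOS XE device"
  type        = string
  sensitive   = true # redacted in plan/apply output
}
```

Marking the variable `sensitive = true` keeps its value out of plan and apply output, though it is still recorded in the state file.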
7.6.3 Access Control and RBAC
- Least Privilege: Configure IAM/RBAC roles for Terraform execution with the minimum necessary permissions to provision and manage resources. For example, a role might only have permissions to create/modify VPCs and subnets, but not delete critical production databases.
- Separate Environments: Use separate cloud accounts, IAM roles, or Terraform workspaces/projects for different environments (dev, test, prod) to prevent accidental cross-contamination or unauthorized access.
7.6.4 Drift Detection and Remediation
- Configuration Drift: This occurs when the actual state of infrastructure deviates from the desired state defined in HCL. This can happen due to manual changes, out-of-band configurations, or other automation tools.
- Regular `terraform plan`: Periodically run `terraform plan` in a read-only mode (e.g., in a CI/CD pipeline) to detect drift.
- Automated Remediation (Caution): While `terraform apply` can remediate drift, automating this can be risky if changes are not reviewed. Consider manual approval for remediation, or use drift detection for alerting rather than immediate auto-remediation.
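In a pipeline, `terraform plan -detailed-exitcode` makes drift machine-detectable: exit code 0 means no changes, 1 means an error, and 2 means changes are pending (drift or un-applied updates). A hedged GitHub Actions sketch (workflow file name, schedule, and job layout are illustrative):

```yaml
# .github/workflows/drift.yml (illustrative) — scheduled read-only drift check
name: drift-detection
on:
  schedule:
    - cron: "0 6 * * *" # daily
jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -input=false
      - run: terraform plan -detailed-exitcode -input=false
        # exit code 2 (changes detected) fails the job, alerting the team
```

Failing the job on exit code 2 turns drift into an alert for review rather than an automatic remediation.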
7.6.5 Supply Chain Security (Providers and Modules)
- Trusted Sources: Only use Terraform providers and modules from trusted sources (HashiCorp Registry, verified partners, or internal repositories).
- Versioning: Pin provider and module versions (in `required_providers` and `module` blocks) to ensure consistent behavior and prevent unexpected changes from new releases.
- Security Scanning: Incorporate static analysis tools (e.g., Checkov, Trivy, tfsec) into your CI/CD pipeline to scan HCL code for security misconfigurations and compliance violations before deployment.
7.6.6 Example: Protecting Sensitive Outputs
# main.tf (excerpt) - Sensitive Output Example
# Assume you have a resource that generates a sensitive value, e.g., a shared secret for a VPN
resource "random_string" "vpn_secret" {
length = 32
special = true
numeric = true
upper = true
lower = true
}
output "vpn_shared_secret" {
value = random_string.vpn_secret.result
description = "The shared secret for the VPN connection."
sensitive = true # Mark this output as sensitive
}
When an output is marked `sensitive = true`, Terraform redacts its value in CLI output (shown as `(sensitive value)`), including in `terraform output` listings. Note that the raw value is still stored in plaintext in the state file, so state-file security (Section 7.6.1) remains essential; when needed, retrieve the value deliberately with `terraform output -raw vpn_shared_secret`.
7.7 Verification & Troubleshooting
Terraform provides robust tools for verification and offers clear error messages, but understanding common issues helps.
7.7.1 Verification Commands
Terraform’s primary verification comes from its plan and apply outputs.
- `terraform init`: Initializes your working directory, downloading necessary providers and modules.
  - Verification: Successful download of providers, creation of the `.terraform` directory.
  - Troubleshooting: Network connectivity issues, incorrect provider source in HCL, provider registry unreachable.
- `terraform validate`: Checks the HCL code for syntax errors and internal consistency. It does not access any remote services.
  - Verification: `Success! The configuration is valid.`
  - Troubleshooting: Syntax errors in HCL (missing brackets, typos), incorrect variable declarations.
- `terraform plan`: Generates an execution plan, showing what actions Terraform will take (create, update, destroy) to reach the desired state defined in your HCL. It queries the actual state of the infrastructure.
  - Verification: Review the proposed changes carefully and ensure they match your expectations. Look for `N to add, N to change, N to destroy`.
  - Troubleshooting:
    - Unexpected changes: Often due to drift (manual changes) or incorrect logic in your HCL.
    - Provider errors: If Terraform can't connect to the API or authenticate, `plan` will fail.
    - Resource not found: If `data` sources try to query non-existent resources.
- `terraform apply`: Executes the actions proposed in the plan. Requires explicit approval.
  - Verification: Successful completion of resource provisioning. The output shows `Apply complete! Resources: N added, N changed, N destroyed.`
  - Troubleshooting:
    - API errors: Permissions issues, invalid parameters passed to the device/cloud API, rate limiting.
    - State conflicts: If multiple users apply changes concurrently without state locking.
    - Timeout errors: If resource creation takes longer than the provider's configured timeout.
- `terraform show`: Reads the current state file and prints the resource attributes that Terraform knows about.
  - Verification: Check whether the deployed resources match your configuration and whether sensitive data is redacted.
- `terraform output <output_name>`: Displays the value of a specific output variable.
  - Verification: Quickly check values of key resources (e.g., VPC ID, interface IP).
7.7.2 Common Issues and Resolution Steps
| Issue Category | Common Symptoms | Debug Commands / Resolution Steps |
|---|---|---|
| HCL Syntax Errors | Error: Missing argument, Error: Invalid block definition, Unterminated string literal | terraform validate is your first line of defense. Pay close attention to line numbers in error messages. Use a good IDE with HCL syntax highlighting. |
| Provider Issues | Error: provider.aws: NoCredentialProviders (AWS), Error: dial tcp: i/o timeout (NETCONF), Error: Failed to refresh state | Authentication: Double-check credentials (env vars, ~/.aws/credentials, ~/.netconf.yml). Connectivity: Ping target host, netstat to check port, firewall rules. Provider Version: Ensure required_providers in versions.tf is correct and run terraform init -upgrade. |
| State File Corruption | Error: Failed to load state, Resource already exists in state but not in config | NEVER MANUALLY EDIT terraform.tfstate DIRECTLY. Use terraform state subcommands: terraform state rm, terraform state mv, terraform import to reconcile. In extreme cases, restore from a remote backend backup. |
| Configuration Drift | terraform plan shows unexpected changes to existing resources. | Review the terraform plan output carefully. Determine if the drift was intentional (manual change) or unintentional. If unintentional, terraform apply will revert it. If intentional, consider updating HCL or terraform taint to force recreation. |
| Resource Creation Fails | Error: Error creating EC2 instance: UnauthorizedOperation, Error: VLAN ID 100 already exists | Permissions: Verify IAM/RBAC policies for the user/role running Terraform. API Errors: The provider error message usually contains the underlying API error (e.g., duplicate resource name, invalid value). Consult vendor API documentation. |
| Timeout Errors | Error: timeout while waiting for state to become 'running' | Increase timeout settings in the resource block (if supported by the provider) or provider configuration. This often happens with network device reboots or slow cloud resource provisioning. |
| Dependency Issues | Resources fail to provision because a dependency isn’t ready (e.g., routing table without a VPC). | Terraform usually manages dependencies implicitly. If explicit ordering is needed, use depends_on = [resource.type.name]. Ensure your network architecture is logically sound. |
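For the last row, an explicit dependency forces ordering where Terraform cannot infer it from attribute references. A sketch using resource names from the VPC module earlier in this chapter:

```hcl
# Explicit ordering: ensure the internet gateway exists before the default
# route is created. Here the reference to aws_internet_gateway.main.id would
# already create the dependency edge implicitly; depends_on is shown for the
# cases where the relationship is real but not expressed through any attribute.
resource "aws_route" "public_default" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.main.id

  depends_on = [aws_internet_gateway.main]
}
```

Prefer implicit (reference-based) dependencies where possible; `depends_on` is a last resort because it can over-serialize the graph.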
7.7.3 Root Cause Analysis for Network-Specific Issues
- NETCONF/RESTCONF Specific:
  - Incorrect XML/JSON Payload: Terraform might send valid XML/JSON to the device, but the content itself is semantically incorrect according to the YANG model. Use `terraform plan -json` to inspect the actual payload being sent (if the provider supports it) and validate it against the YANG schema using tools like Cisco YANG Suite.
  - Device Capability Mismatch: Ensure the device supports the specific YANG model or NETCONF operations being attempted. Use `show netconf-yang capabilities` (Cisco) or `show netconf capabilities` (Juniper) on the device.
  - Firewall between Terraform and Device: Ensure TCP port 830 (NETCONF over SSH) or 443/80 (RESTCONF over HTTPS/HTTP) is open.
- Cloud Network Specific:
- CIDR Overlaps: Terraform will often detect and fail on CIDR block overlaps when creating VPCs/VNets or subnets.
- Availability Zone (AZ) Capacity: Occasionally, an AZ may lack capacity for a specific resource type, leading to provisioning failures. Terraform retries can sometimes overcome this, or you may need to adjust your AZ strategy.
- Routing Conflicts: Incorrect route table configurations can cause provisioning failures or lead to unreachability post-deployment.
7.8 Performance Optimization
Optimizing Terraform deployments in complex network environments involves structuring your code and leveraging Terraform features effectively.
7.8.1 Modularization and Granularity
- Small, Focused Modules: Break down large configurations into smaller, reusable modules. For instance, an AWS module for VPCs, another for security groups, and a third for EC2 instances. For network devices, modules could represent a standardized device configuration, a VLAN stack, or a routing protocol setup.
- Reduced Scope: Smaller modules lead to faster `terraform plan` and `terraform apply` times, as Terraform only needs to process a subset of resources.
- Parallel Execution: Terraform can often parallelize resource creation/modification. Well-defined, independent modules can benefit from this.
7.8.2 Remote State Management
- State Locking: Essential for collaborative environments. Remote backends (e.g., S3, Consul, Terraform Cloud) provide state locking to prevent multiple `terraform apply` operations from conflicting.
- Encryption: Store state files encrypted at rest.
- Versioning: Utilize state backend versioning (e.g., S3 versioning) for easy rollbacks and audit trails.
7.8.3 Provider Configuration
- Connection Pooling (where available): Some network device providers might support connection pooling for NETCONF/RESTCONF sessions, reducing overhead for multiple configuration changes.
- Batching API Calls: Advanced providers may batch multiple changes into a single API call, improving performance, especially over high-latency links.
- Rate Limiting: Be aware of API rate limits imposed by cloud providers or network devices. Terraform providers often have built-in retry mechanisms, but excessive resource creation can still hit limits.
7.8.4 CI/CD Integration
- Automated `plan` on PR: Run `terraform plan` automatically on every pull request to get quick feedback on proposed changes and detect drift.
- Targeted Applies: For large configurations, use `terraform apply -target=resource.type.name` to apply changes only to specific resources. Use this with caution, as it can break implicit dependencies and lead to partial deployments; it's generally better to let Terraform manage the entire graph.
- Terraform Cloud/Enterprise: Leverage these platforms for managed remote state, shared module registries, policy enforcement (Sentinel), and streamlined CI/CD integration, which significantly boosts team productivity and performance.
7.8.5 Resource `count` and `for_each`
- Efficiently provision multiple similar resources using the `count` or `for_each` meta-arguments. This reduces HCL boilerplate and simplifies managing large numbers of identical network segments, interfaces, or security groups.
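For instance, a map of segments can drive one subnet per entry. A sketch that assumes an `aws_vpc.main` resource like the one in the earlier module (segment names and CIDRs are illustrative):

```hcl
variable "subnets" {
  type = map(string) # name => CIDR
  default = {
    web = "10.10.1.0/24"
    app = "10.10.2.0/24"
    db  = "10.10.3.0/24"
  }
}

# One subnet per map entry. Unlike count, for_each keys resources by name,
# so adding or removing an entry never renumbers (and recreates) the rest.
resource "aws_subnet" "segments" {
  for_each   = var.subnets
  vpc_id     = aws_vpc.main.id
  cidr_block = each.value
  tags       = { Name = each.key }
}
```

Individual instances are then addressed as `aws_subnet.segments["web"]` rather than by fragile numeric index.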
7.9 Hands-On Lab: Hybrid Network Provisioning
This lab will guide you through provisioning a simple hybrid network using Terraform. It will involve:
- Creating a cloud VPC with subnets (AWS).
- Configuring a loopback interface on a Cisco IOS XE device.
- Connecting the two logically (conceptual, as physical VPN setup is complex for a simple lab).
Lab Topology
nwdiag {
// Define custom styles for clouds and on-prem
define style cloud {
color = "#E0FFFF";
border_color = "#4169E1";
}
define style on_prem {
color = "#FFFACD";
border_color = "#DAA520";
}
define style device_router {
shape = router;
color = "#F0FFF0";
border_color = "#3CB371";
}
define style internet_cloud {
shape = cloud;
color = "#F8F8FF";
border_color = "#778899";
}
cloud "AWS Cloud (us-east-1)" {
style = cloud;
network "NetDevOps VPC (10.0.0.0/16)" {
description = "Terraform-Managed VPC"
aws_router [shape=router]; // Logical representation of AWS routing
network "Public Subnet (10.0.1.0/24)" {
aws_public_instance [address = "10.0.1.10", shape=box];
}
network "Private Subnet (10.0.2.0/24)" {
aws_private_instance [address = "10.0.2.10", shape=box];
}
}
}
group "On-Premises Lab" {
style = on_prem;
network "On-Prem Network (192.168.1.0/24)" {
cisco_iosxe [address = "192.168.1.100", style = device_router];
}
}
// Conceptual VPN/Direct Connect link
cisco_iosxe -- aws_router [label="Conceptual VPN/Direct Connect Link", style="dotted"];
aws_router -- "NetDevOps VPC (10.0.0.0/16)";
aws_public_instance -- internet_cloud; // Public instance has internet access
}
Figure 7.7: Hands-On Lab Hybrid Network Topology
7.9.1 Objectives
- Set up Terraform for AWS and a local Cisco IOS XE device.
- Provision an AWS VPC, subnets, and an Internet Gateway.
- Configure a loopback interface and a VLAN on the Cisco IOS XE device.
- Output key IDs for verification.
7.9.2 Prerequisites
- AWS Account: With credentials configured (e.g., `~/.aws/credentials`).
- Cisco IOS XE Device:
  - Running on a platform like VIRL/CML, EVE-NG, or a physical device.
  - SSH accessible from your Terraform workstation.
  - NETCONF/RESTCONF enabled, with a user `terraform` (password `terraform_password`) at privilege 15.
  - IP address: `192.168.1.100` (adjust if needed).
- Terraform CLI: Installed on your workstation.
7.9.3 Step-by-Step Configuration
Step 1: Create Lab Directory Structure
mkdir netdevops-terraform-lab
cd netdevops-terraform-lab
touch main.tf versions.tf terraform.tfvars
mkdir modules
mkdir modules/cisco_iosxe
touch modules/cisco_iosxe/main.tf modules/cisco_iosxe/variables.tf modules/cisco_iosxe/outputs.tf
Step 2: Define Terraform Providers and Backend (versions.tf)
# versions.tf
terraform {
required_version = ">= 1.0.0" # Ensure compatibility
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
netconf = {
source = "netconf-ng/netconf"
version = "~> 1.0"
}
}
# Example remote backend (uncomment and configure for production)
/*
backend "s3" {
bucket = "your-terraform-state-bucket"
key = "netdevops-lab/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-state-lock" # For state locking
}
*/
}
Step 3: Define Variables (terraform.tfvars)
# terraform.tfvars (Sensitive data should NOT be committed to VCS!)
# For lab purposes, you might keep these here, but for production, use environment variables or a secrets manager.
aws_region = "us-east-1"
aws_vpc_cidr = "10.0.0.0/16"
aws_public_subnet_cidr = "10.0.1.0/24"
aws_private_subnet_cidr = "10.0.2.0/24"
cisco_iosxe_host = "192.168.1.100"
cisco_iosxe_username = "terraform"
cisco_iosxe_password = "terraform_password" # Change in production!
Step 4: AWS Configuration (main.tf - root directory)
# main.tf (root directory)
variable "aws_region" {
description = "AWS region for the VPC"
type = string
}
variable "aws_vpc_cidr" {
description = "CIDR block for the AWS VPC"
type = string
}
variable "aws_public_subnet_cidr" {
description = "CIDR block for the AWS public subnet"
type = string
}
variable "aws_private_subnet_cidr" {
description = "CIDR block for the AWS private subnet"
type = string
}
# Variables consumed by the Cisco IOS XE module call below
# (declared here so the values in terraform.tfvars bind correctly)
variable "cisco_iosxe_host" {
description = "IP address of the Cisco IOS XE device"
type = string
}
variable "cisco_iosxe_username" {
description = "Username for NETCONF access"
type = string
sensitive = true
}
variable "cisco_iosxe_password" {
description = "Password for NETCONF access"
type = string
sensitive = true
}
provider "aws" {
region = var.aws_region
}
# AWS VPC
resource "aws_vpc" "netdevops_vpc" {
cidr_block = var.aws_vpc_cidr
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "NetDevOps-Lab-VPC"
}
}
# Internet Gateway
resource "aws_internet_gateway" "netdevops_igw" {
vpc_id = aws_vpc.netdevops_vpc.id
tags = {
Name = "NetDevOps-Lab-IGW"
}
}
# Public Subnet
resource "aws_subnet" "netdevops_public_subnet" {
vpc_id = aws_vpc.netdevops_vpc.id
cidr_block = var.aws_public_subnet_cidr
availability_zone = "${var.aws_region}a"
map_public_ip_on_launch = true
tags = {
Name = "NetDevOps-Lab-Public-Subnet"
}
}
# Private Subnet
resource "aws_subnet" "netdevops_private_subnet" {
vpc_id = aws_vpc.netdevops_vpc.id
cidr_block = var.aws_private_subnet_cidr
availability_zone = "${var.aws_region}a"
tags = {
Name = "NetDevOps-Lab-Private-Subnet"
}
}
# Public Route Table
resource "aws_route_table" "netdevops_public_rt" {
vpc_id = aws_vpc.netdevops_vpc.id
tags = {
Name = "NetDevOps-Lab-Public-RT"
}
}
resource "aws_route" "public_internet_route" {
route_table_id = aws_route_table.netdevops_public_rt.id
destination_cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.netdevops_igw.id
}
resource "aws_route_table_association" "public_subnet_association" {
subnet_id = aws_subnet.netdevops_public_subnet.id
route_table_id = aws_route_table.netdevops_public_rt.id
}
# Output for AWS resources
output "aws_vpc_id" {
value = aws_vpc.netdevops_vpc.id
description = "ID of the provisioned AWS VPC."
}
output "aws_public_subnet_id" {
value = aws_subnet.netdevops_public_subnet.id
description = "ID of the public subnet."
}
output "aws_private_subnet_id" {
value = aws_subnet.netdevops_private_subnet.id
description = "ID of the private subnet."
}
# Call Cisco IOS XE module
module "cisco_network_config" {
source = "./modules/cisco_iosxe"
iosxe_host = var.cisco_iosxe_host
iosxe_username = var.cisco_iosxe_username
iosxe_password = var.cisco_iosxe_password
}
Step 5: Cisco IOS XE Module (modules/cisco_iosxe/main.tf)
# modules/cisco_iosxe/main.tf
variable "iosxe_host" {
description = "IP address of the Cisco IOS XE device"
type = string
}
variable "iosxe_username" {
description = "Username for NETCONF access"
type = string
sensitive = true
}
variable "iosxe_password" {
description = "Password for NETCONF access"
type = string
sensitive = true
}
provider "netconf" {
host = var.iosxe_host
username = var.iosxe_username
password = var.iosxe_password
port = 830
# skip_verify = true # Lab only: skip host-key/certificate verification (e.g., first-time SSH or self-signed certs)
}
resource "netconf_device" "iosxe_lab_device" {
name = "iosxe-lab-device"
host = var.iosxe_host
}
# Configure a Loopback Interface
resource "netconf_edit_config" "loopback_config" {
device_name = netconf_device.iosxe_lab_device.name
target = "running"
config_xml = <<EOF
<config>
<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
<interface>
<Loopback>
<name>100</name>
<ip>
<address>
<primary>
<address>192.168.255.1</address>
<mask>255.255.255.0</mask>
</primary>
</address>
</ip>
<description>Configured by Terraform NetDevOps Lab</description>
</Loopback>
</interface>
</native>
</config>
EOF
lifecycle { prevent_destroy = true } # Terraform will error on any plan that would destroy this resource
commit_comment = "Terraform: Configured Loopback100"
}
# Configure a VLAN
resource "netconf_edit_config" "vlan_config" {
device_name = netconf_device.iosxe_lab_device.name
target = "running"
config_xml = <<EOF
<config>
<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
<vlan>
<vlan-list>
<id>200</id>
<name>Terraform-Lab-VLAN</name>
</vlan-list>
</vlan>
</native>
</config>
EOF
lifecycle { prevent_destroy = true }
commit_comment = "Terraform: Configured VLAN200"
}
output "iosxe_device_name" {
value = netconf_device.iosxe_lab_device.name
description = "Name of the Cisco IOS XE device managed by Terraform."
}
output "iosxe_loopback_ip" {
value = "192.168.255.1"
description = "IP address of the configured Loopback100 interface."
}
Step 6: Initialize, Plan, and Apply
From the netdevops-terraform-lab root directory:
terraform init
terraform plan
terraform apply
Review the plan output carefully, then type yes to apply.
7.9.4 Verification Steps
AWS Verification:
- Log in to the AWS Management Console.
- Navigate to the VPC service in the `us-east-1` region.
- Confirm the `NetDevOps-Lab-VPC` with CIDR `10.0.0.0/16` exists.
- Check the Subnets section to see `NetDevOps-Lab-Public-Subnet` (`10.0.1.0/24`) and `NetDevOps-Lab-Private-Subnet` (`10.0.2.0/24`).
- Verify the Internet Gateway and Public Route Table are associated correctly.
Cisco IOS XE Verification:
- SSH into your Cisco IOS XE device.
- Run the following commands:
  show ip interface brief | include Loopback100
  show running-config interface Loopback100
  show vlan brief | include Terraform-Lab-VLAN
- Confirm `Loopback100` is up with IP `192.168.255.1` and `VLAN200` exists with the correct name.
7.9.5 Challenge Exercises
- Add a Security Group: Modify the AWS configuration to create a security group allowing SSH (port 22) from your IP and attach it to a hypothetical EC2 instance.
- Juniper Integration: Add a `junos` provider (or use `netconf` with Juniper XML) to configure a loopback interface on a separate Juniper device.
- Variable Inputs: Convert more hardcoded values (e.g., VLAN ID, loopback IP) into variables in the Cisco module.
- Destroy: Once satisfied, understand and execute `terraform destroy` to tear down the provisioned infrastructure.
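As a starting point for the first challenge, a hedged sketch (replace `203.0.113.10/32` with your own IP; the EC2 instance itself is left as part of the exercise):

```hcl
# Security group allowing SSH only from a single admin workstation.
resource "aws_security_group" "lab_ssh" {
  name        = "netdevops-lab-ssh"
  description = "Allow SSH from admin workstation"
  vpc_id      = aws_vpc.netdevops_vpc.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["203.0.113.10/32"] # your IP here (TEST-NET-3 placeholder)
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1" # all protocols
    cidr_blocks = ["0.0.0.0/0"]
  }
}
# Attach via vpc_security_group_ids on an aws_instance (not created in this lab).
```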
7.10 Best Practices Checklist
Applying these best practices will lead to more maintainable, secure, and efficient Terraform deployments for network infrastructure.
- Version Control: Store all HCL code in a Git repository.
- Remote State: Use a remote backend for state management and enable state locking.
- State Encryption: Ensure your remote state backend encrypts data at rest.
- `.gitignore`: Always include `terraform.tfstate*`, `.terraform/`, and `*.tfvars` (for sensitive files) in your `.gitignore`.
- Modularize: Organize configurations into small, reusable modules.
- Variable Inputs: Use variables for all configurable parameters, especially sensitive ones (and mark them `sensitive = true` for outputs).
- Secrets Management: Integrate with a secrets manager (Vault, AWS Secrets Manager) for credentials, avoiding hardcoding.
- Least Privilege: Configure IAM/RBAC roles for Terraform execution with the minimum necessary permissions.
- Provider Pinning: Explicitly define required provider versions to ensure stability.
- Meaningful Naming: Use clear, consistent naming conventions for resources and tags.
- Documentation: Add comments to HCL code and provide READMEs for modules.
- Continuous Integration: Integrate `terraform plan` into your CI pipeline for every pull request.
- Dry Runs: Always perform `terraform plan` before `terraform apply`.
- Small, Incremental Changes: Avoid large, sweeping changes in a single `terraform apply`.
- Destroy Awareness: Understand the impact of `terraform destroy` and use `prevent_destroy` for critical resources.
- Drift Detection: Regularly run `terraform plan` to identify and manage configuration drift.
- Automated Testing: Implement validation tests (e.g., using Terratest or pytest) to verify deployed infrastructure.
- Tagging: Use consistent tagging for cloud resources for cost allocation, automation, and management.
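The "Automated Testing" item can start as light as a script that asserts on `terraform output -json`. A minimal Python sketch (the output names mirror this chapter's lab; the sample payload stands in for a real `terraform output -json` run, which you would capture with `subprocess`):

```python
import json

# In practice: captured = subprocess.run(["terraform", "output", "-json"],
#                                        capture_output=True, text=True).stdout
# Here a sample payload (hypothetical IDs) stands in for a real run.
captured = '''
{
  "aws_vpc_id": {"value": "vpc-0abc123", "type": "string"},
  "aws_public_subnet_id": {"value": "subnet-0def456", "type": "string"}
}
'''

def check_outputs(raw: str, required: list[str]) -> dict:
    """Parse `terraform output -json` and assert the required outputs exist."""
    outputs = {name: body["value"] for name, body in json.loads(raw).items()}
    missing = [name for name in required if name not in outputs]
    if missing:
        raise AssertionError(f"missing outputs: {missing}")
    return outputs

outs = check_outputs(captured, ["aws_vpc_id", "aws_public_subnet_id"])
print(outs["aws_vpc_id"])  # → vpc-0abc123
```

Wired into CI after `terraform apply`, a failed assertion blocks the pipeline before traffic ever depends on a half-provisioned environment.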
7.11 Reference Links
- Terraform Official Documentation: https://developer.hashicorp.com/terraform/docs
- Terraform Providers (Registry): https://registry.terraform.io/
- AWS Provider: https://registry.terraform.io/providers/hashicorp/aws/latest/docs
- NETCONF Provider: https://registry.terraform.io/providers/netconf-ng/netconf/latest/docs
- Cisco IOS XE Provider: https://registry.terraform.io/providers/CiscoDevNet/iosxe/latest/docs (Note: check specific vendor support; a generic `netconf` provider is often a fallback)
- Juniper JunOS Provider: https://registry.terraform.io/providers/Juniper/junos/latest/docs
- HashiCorp Configuration Language (HCL) Docs: https://developer.hashicorp.com/terraform/language/expressions/hcl
- RFC 6241: Network Configuration Protocol (NETCONF): https://datatracker.ietf.org/doc/html/rfc6241
- RFC 8040: RESTCONF Protocol: https://datatracker.ietf.org/doc/html/rfc8040
- RFC 7950: The YANG 1.1 Data Modeling Language: https://datatracker.ietf.org/doc/html/rfc7950
- Cisco DevNet - Infrastructure as Code: https://developer.cisco.com/iac/
- Cisco YANG Suite: https://developer.cisco.com/yangsuite/
7.12 What’s Next
This chapter provided a comprehensive deep dive into leveraging Terraform for Infrastructure as Code in network engineering. You’ve learned about its declarative nature, core components, and how to apply it to both traditional network devices and cloud networking. The ability to define, provision, and manage your network infrastructure in code is a fundamental skill in modern NetDevOps.
In the next chapter, we will shift our focus to “Continuous Integration and Continuous Delivery (CI/CD) for Network Automation.” We will explore how to integrate the Ansible playbooks, Python scripts, and Terraform configurations you’ve learned to build robust, automated pipelines that ensure rapid, reliable, and consistent deployments across your network infrastructure. This includes setting up Git workflows, automated testing, and deployment strategies that truly embody the NetDevOps ethos.