1. Introduction

The journey into NetDevOps with Python is fundamentally defined by the libraries and frameworks that abstract complex network interactions into manageable code. As network engineers transition from manual CLI operations to programmatic automation, a robust understanding of these foundational Python tools becomes paramount. This chapter dives deep into the core Python libraries that enable efficient, scalable, and secure automation across diverse network environments, including Cisco and other multi-vendor platforms.

This chapter will cover:

  • The evolution from basic SSH interaction to advanced API-driven network management.
  • Key Python libraries for CLI automation: Paramiko, Netmiko, Scrapli, and NAPALM.
  • Frameworks for managing inventory and executing tasks: Nornir.
  • Libraries for API interactions: Requests (RESTCONF/HTTP) and ncclient (NETCONF).
  • Handling structured data with JSON, YAML, and XML.

Upon completing this chapter, you will be able to:

  • Identify the appropriate Python library for various network automation tasks.
  • Develop Python scripts to connect to and configure network devices via CLI, RESTCONF, and NETCONF.
  • Leverage structured data for configuration and operational state retrieval.
  • Understand the architectural role of these libraries within a NetDevOps pipeline.
  • Implement secure and efficient automation practices using Python.

2. Technical Concepts

Python’s strength in network automation stems from its rich ecosystem of third-party libraries designed to interact with network devices using various protocols and interfaces. We will explore these core libraries, understanding their purpose, underlying protocols, and how they fit into a modern NetDevOps architecture.

2.1 Abstraction and Device Interaction (CLI & API)

Network devices expose various interfaces for management: the traditional Command Line Interface (CLI), and modern programmatic interfaces like RESTCONF, NETCONF, and gRPC. Python libraries offer different levels of abstraction to interact with these interfaces.

2.1.1 Paramiko: The SSH Foundation

Paramiko is a pure-Python implementation of the SSHv2 protocol (recent releases require Python 3.6 or later). While not network-specific, it forms the bedrock for many CLI automation libraries by providing the capability to establish secure SSH connections, execute commands, and transfer files. It operates at a low level, so the calling script must handle prompt detection, command output parsing, and session management explicitly.

  • Protocol: SSHv2 (RFC 4251)
  • Architecture: Client-side SSH library.
  • Control Plane vs. Data Plane: Operates on the management plane (grouped with the control plane in this chapter): it sends commands and returns raw, unstructured text output.
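
To make "explicit handling of prompt detection" concrete, the following minimal sketch reads from an interactive channel until the buffer ends with something that looks like a device prompt. The helper name read_until_prompt, the prompt regex, and the FakeChannel stand-in are illustrative assumptions; with Paramiko, the recv callable would typically be the recv method of the channel returned by SSHClient.invoke_shell().

```python
import re

# Assumed prompt shape: a hostname followed by '>' or '#' at the end of the buffer.
PROMPT_RE = re.compile(r"[\w.\-]+[>#]\s*$")


def read_until_prompt(recv, prompt_re=PROMPT_RE, max_reads=100):
    """Accumulate decoded chunks from recv() until the buffer ends with a prompt.

    recv is any callable that takes a byte count and returns bytes, for example
    the recv method of a Paramiko channel (an assumption, not a requirement).
    """
    buffer = ""
    for _ in range(max_reads):
        chunk = recv(4096)
        if not chunk:  # an empty read means the remote side closed the channel
            break
        buffer += chunk.decode("utf-8", errors="replace")
        if prompt_re.search(buffer):
            return buffer
    raise TimeoutError("prompt not detected in device output")


class FakeChannel:
    """Stand-in for a real SSH channel so the sketch runs without a device."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def recv(self, nbytes):
        return self._chunks.pop(0) if self._chunks else b""


channel = FakeChannel([b"show clock\r\n", b"12:00:00 UTC\r\n", b"CiscoRouter#"])
output = read_until_prompt(channel.recv)
print(output)
```

Higher-level libraries such as Netmiko perform this kind of loop internally, which is a large part of the convenience they add.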

2.1.2 Netmiko: Simplified CLI Automation

Built on top of Paramiko, Netmiko (developed by Kirk Byers) significantly simplifies CLI interactions with a wide range of network devices. It handles many complexities like enable mode, pagination, prompt detection, and error handling, making it a go-to choice for sending commands and configuration sets to multi-vendor devices.

  • Protocol: SSHv2 (via Paramiko), Telnet.
  • Architecture: Provides a device-agnostic interface for common CLI operations, abstracting vendor-specific nuances. It uses a dictionary of device types to manage platform-specific behaviors.
  • Control Plane vs. Data Plane: Focuses on control plane interactions for configuration and operational state retrieval.
  • RFC/Standard References: Leverages SSHv2 (RFC 4251) for transport.

2.1.3 Scrapli: Modern & Asynchronous CLI Automation

Scrapli is a newer Python library for interacting with network devices that offers both synchronous and asyncio-based drivers. It emphasizes speed and a small, well-typed core, with pluggable transports (for example the system OpenSSH client, Paramiko, asyncssh, and Telnet). Its asyncio support can significantly shorten run times for tasks that touch a large number of devices or require rapid interaction.

  • Protocol: SSHv2, Telnet.
  • Architecture: Designed for speed and flexibility, with clear separation of transport and platform layers. Supports synchronous and asynchronous operations.
  • Control Plane vs. Data Plane: Primarily for control plane automation.
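
The benefit of asyncio for large device counts can be sketched without any device at all. In the example below, fetch_version is a hypothetical stand-in that simulates latency with asyncio.sleep; in a real script its body would await a call on an async driver (for example one of Scrapli's asyncio drivers). The semaphore limit of 5 is an arbitrary assumption.

```python
import asyncio


async def fetch_version(host):
    """Stand-in for an async device call; a real script would await a driver here."""
    await asyncio.sleep(0.1)  # simulates network latency
    return host, f"simulated 'show version' output from {host}"


async def gather_bounded(hosts, limit=5):
    """Run fetch_version for every host with at most `limit` sessions in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(host):
        async with semaphore:
            return await fetch_version(host)

    return dict(await asyncio.gather(*(worker(h) for h in hosts)))


hosts = [f"10.0.0.{n}" for n in range(1, 11)]
results = asyncio.run(gather_bounded(hosts))
print(f"collected output from {len(results)} devices")
```

Because the waits overlap, ten simulated devices complete in roughly two latency periods rather than ten. Bounding concurrency keeps the automation host from opening more sessions than the devices or an AAA server can comfortably handle.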

2.1.4 NAPALM: Network Automation and Programmability Abstraction Layer with Multivendor Support

NAPALM provides a unified API to interact with different network device operating systems (IOS, IOS-XE, NX-OS, Junos, EOS, etc.) to perform common tasks like retrieving information (getters) or pushing configurations. It abstracts away vendor-specific CLI or API differences, allowing engineers to write truly multi-vendor automation scripts. NAPALM focuses on canonical data structures, standardizing the representation of operational state and configuration data.

  • Protocol: Utilizes various underlying protocols depending on the driver (SSH for CLI, NETCONF, eAPI, etc.).
  • Architecture: A driver-based abstraction layer. Each driver translates NAPALM’s standard calls into vendor-specific commands or API calls.
  • Control Plane vs. Data Plane: Primarily interacts with the control plane, offering structured data representations of both configuration and operational state.
  • RFC/Standard References: No single RFC governs NAPALM; it defines its own vendor-neutral data structures for getters and relies on standards-based transports such as SSH and NETCONF where a driver uses them.

2.1.5 Nornir: The Automation Framework

Nornir is an automation framework that enhances network automation scripts by providing an inventory management system and a powerful task runner. It’s not a direct device interaction library like Netmiko or NAPALM but acts as an orchestrator. Nornir allows for concurrent execution of tasks across many devices, manages device inventory (hosts, groups, defaults), and supports a plugin ecosystem for various functionalities (e.g., Netmiko, Scrapli, NAPALM plugins for device interaction; Jinja2 for templating).

  • Protocol: Agnostic; relies on plugins for actual device communication.
  • Architecture: Inventory-driven, concurrent, plugin-based task runner. Facilitates defining workflows and managing target devices.
  • Control Plane vs. Data Plane: Orchestrates control plane interactions.
  • State Machines and Workflows: Ideal for defining complex automation workflows, handling device state, and ensuring consistency across a fleet of devices.
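
The inventory-driven model is easier to reason about once the lookup order is clear: a value set on the host wins, then the host's groups are consulted in order, then the defaults. The sketch below mimics that order with plain dictionaries. It is an illustration of the idea rather than Nornir's implementation; the resolve function and the sample data are hypothetical, and nested groups are ignored for brevity.

```python
def resolve(attr, host, groups, defaults):
    """Return the first value found for attr: host, then its groups in order, then defaults."""
    if attr in host:
        return host[attr]
    for group_name in host.get("groups", []):
        group = groups.get(group_name, {})
        if attr in group:
            return group[attr]
    return defaults.get(attr)


defaults = {"username": "netdevops", "port": 22}
groups = {"cisco_devices": {"platform": "ios"}, "lab": {"port": 2222}}
host = {"hostname": "192.168.1.10", "groups": ["cisco_devices", "lab"]}

for attr in ("hostname", "platform", "port", "username"):
    print(attr, "=", resolve(attr, host, groups, defaults))
```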

2.2 API Interactions: Requests, ncclient, and gRPC

Modern network devices increasingly offer programmatic APIs based on standard data models like YANG, enabling a more structured and robust approach to automation.

2.2.1 Requests: HTTP/RESTCONF Client

The requests library is the de facto standard for making HTTP requests in Python. It is fundamental for interacting with HTTP-based device and controller APIs: RESTCONF on platforms such as Cisco IOS XE and NX-OS, REST APIs on controllers such as Cisco DNA Center and Meraki, the Junos REST API, and Arista eAPI (JSON-RPC over HTTP/HTTPS). Requests handles sessions, authentication, and JSON encoding/decoding, and exposes HTTP status codes for error handling; XML payloads are passed through as text and parsed with a separate XML library.

  • Protocol: HTTP/HTTPS (RFC 9110-9112, which obsolete RFC 7230-7235).
  • Architecture: Client-side HTTP library. Directly interacts with HTTP/RESTCONF endpoints.
  • Control Plane vs. Data Plane: Primarily control plane for configuration and operational data retrieval.
  • RFC/Standard References: RESTCONF (RFC 8040), HTTP Semantics (RFC 9110).
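
RESTCONF resource URLs follow a predictable shape: a data root, a module-qualified container, and optionally a list name with its key. The sketch below builds such URLs with the standard library only. The restconf_url helper and the host address are illustrative assumptions; the /restconf/data root is the conventional one, but a given device may advertise a different root.

```python
from urllib.parse import quote


def restconf_url(host, module, container, list_name=None, key=None, port=443):
    """Build a RESTCONF data resource URL, percent-encoding the list key."""
    url = f"https://{host}:{port}/restconf/data/{module}:{container}"
    if list_name is not None and key is not None:
        # safe="" also encodes '/', which appears in names like GigabitEthernet1/0/1
        url += f"/{list_name}={quote(key, safe='')}"
    return url


print(restconf_url("192.168.1.10", "ietf-interfaces", "interfaces"))
print(restconf_url("192.168.1.10", "ietf-interfaces", "interfaces",
                   "interface", "GigabitEthernet1/0/1"))
```

Percent-encoding the key matters because an unencoded slash in an interface name would otherwise be read as a path separator.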

2.2.2 ncclient: NETCONF Client

ncclient is a Python library for connecting to NETCONF servers. NETCONF (Network Configuration Protocol) is a standard protocol for configuring network devices, and it uses YANG (Yet Another Next Generation) for data modeling. ncclient enables Python scripts to build and send NETCONF RPCs (Remote Procedure Calls) like get-config, edit-config, commit, lock, unlock, and get. This provides a highly structured and vendor-independent way to manage network configurations.

  • Protocol: NETCONF (RFC 6241) over SSH (RFC 6242).
  • Architecture: Client-side NETCONF library, designed to interact with devices exposing a NETCONF server. Handles XML parsing and generation for RPCs.
  • Control Plane vs. Data Plane: Control plane only; used for configuration management and, through the get RPC, operational state retrieval.
  • YANG: ncclient does not parse YANG itself; it sends and receives XML payloads whose structure is defined by the YANG models the device supports.
  • RFC/Standard References: NETCONF (RFC 6241, RFC 6242), YANG (RFC 6020, RFC 7950).
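
Since NETCONF payloads are XML shaped by a YANG model, building them programmatically is often safer than concatenating strings. The following sketch assembles a config payload that sets an interface description using the ietf-interfaces namespace. The function name and sample values are hypothetical, and whether a particular device implements ietf-interfaces is an assumption to verify against its advertised capabilities.

```python
import xml.etree.ElementTree as ET

NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"


def interface_description_config(name, description):
    """Return an XML <config> payload that sets one interface description."""
    config = ET.Element(f"{{{NETCONF_NS}}}config")
    interfaces = ET.SubElement(config, f"{{{IF_NS}}}interfaces")
    interface = ET.SubElement(interfaces, f"{{{IF_NS}}}interface")
    ET.SubElement(interface, f"{{{IF_NS}}}name").text = name
    ET.SubElement(interface, f"{{{IF_NS}}}description").text = description
    return ET.tostring(config, encoding="unicode")


payload = interface_description_config("GigabitEthernet1", "Uplink to core")
print(payload)
```

A string like this could then be passed as the config argument of ncclient's edit_config call on an open session. ElementTree escapes special characters in the text values, which hand-built strings would not.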

2.2.3 gRPC/gNMI (Brief Mention)

gRPC is a modern, high-performance, open-source RPC framework originally developed at Google. gNMI (gRPC Network Management Interface) is an OpenConfig specification built on gRPC for network device configuration and telemetry. Python libraries like grpcio can be used to interact with gRPC services. While grpcio is the core library, more specialized clients like pygnmi exist for gNMI. These are often used for high-frequency telemetry streaming and transaction-based configuration updates, representing a more advanced automation pattern.

  • Protocol: gRPC over HTTP/2 (TCP, normally secured with TLS).
  • Architecture: Client-side gRPC library for interacting with gRPC servers on network devices.
  • Control Plane vs. Data Plane: Primarily control plane, with strong emphasis on streaming telemetry (operational state) and configuration management.
  • YANG: Uses YANG for data modeling with Protobuf for serialization.
  • RFC/Standard References: gRPC documentation, gNMI specification (OpenConfig).
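
gNMI addresses YANG-modeled data through paths made of named elements, each with optional key/value selectors. The sketch below splits a human-readable path string into that element-and-keys form using only the standard library. The parse_gnmi_path helper is hypothetical and deliberately simple: it does not handle escaped brackets, and a real client library would normally do this conversion for you.

```python
import re

# One path element: a name followed by zero or more [key=value] selectors.
ELEM_RE = re.compile(r"/([^/\[\]]+)((?:\[[^\]]+\])*)")
KEY_RE = re.compile(r"\[([^=\]]+)=([^\]]*)\]")


def parse_gnmi_path(path):
    """Split a path string into a list of {'name': ..., 'key': {...}} elements."""
    return [
        {"name": name, "key": dict(KEY_RE.findall(keys))}
        for name, keys in ELEM_RE.findall(path)
    ]


elems = parse_gnmi_path(
    "/interfaces/interface[name=GigabitEthernet1/0/1]/state/counters"
)
for elem in elems:
    print(elem)
```

Note that the slash inside the key value is kept intact, because key selectors are matched as a unit before the next path separator is considered.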

2.3 Data Serialization/Deserialization

Interacting with network devices, especially via APIs, often involves exchanging structured data in formats like JSON, YAML, or XML. Python has excellent built-in and third-party libraries for handling these.

2.3.1 JSON (JavaScript Object Notation)

JSON is a lightweight data-interchange format, widely used with RESTful APIs. Python’s built-in json module provides functions to dump (serialize) Python objects to JSON strings or files, and load (deserialize) JSON strings or files into Python objects (dictionaries, lists).
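
As a minimal illustration of the round trip, the sketch below serializes a Python dictionary to a JSON string and loads it back. The interface data is made up for the example.

```python
import json

interface = {"name": "Loopback100", "enabled": True, "ipv4": ["10.255.255.100/32"]}

text = json.dumps(interface, indent=2, sort_keys=True)  # Python object -> JSON string
restored = json.loads(text)                             # JSON string -> Python object

print(text)
print(restored == interface)
```

Python's True becomes JSON's true on the way out and is converted back on the way in, so the restored object compares equal to the original.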

2.3.2 YAML (YAML Ain’t Markup Language)

YAML is a human-friendly data serialization standard, often used for configuration files (e.g., Ansible playbooks, Nornir inventory). The PyYAML library allows parsing YAML files into Python data structures and vice-versa.

2.3.3 XML (Extensible Markup Language)

XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It’s extensively used in older APIs and is the native encoding for NETCONF. Python’s xml.etree.ElementTree module (built-in) provides a simple and efficient API for parsing and generating XML data.
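
Most XML returned by network devices is namespaced, which is the detail that most often trips up first attempts with ElementTree. The sketch below parses a small, made-up NETCONF-style reply and extracts each interface's enabled state; the reply content and the "if" prefix are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

reply = """
<data xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
    <interface><name>GigabitEthernet1</name><enabled>true</enabled></interface>
    <interface><name>Loopback100</name><enabled>false</enabled></interface>
  </interfaces>
</data>
"""

# Map a local prefix to the namespace URI used in the document.
ns = {"if": "urn:ietf:params:xml:ns:yang:ietf-interfaces"}
root = ET.fromstring(reply)

states = {}
for intf in root.iterfind(".//if:interface", ns):
    name = intf.findtext("if:name", namespaces=ns)
    states[name] = intf.findtext("if:enabled", namespaces=ns) == "true"

print(states)
```

Searching for "interface" without the namespace mapping would return nothing, because every element inherits the default namespace declared on its ancestor.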

2.4 Automation Architecture Diagram

The following diagram illustrates how various Python libraries fit into a comprehensive network automation architecture, facilitating interactions with multi-vendor devices via different protocols.

@startuml
skinparam handwritten false
skinparam monochrome false
skinparam packageStyle rectangle
skinparam defaultFontName "Cascadia Code"

title Python Network Automation Architecture

actor "Network Engineer" as NE

package "Automation Host" {
    cloud "Python Environment" as PythonEnv {
        folder "Core Python Libraries" as Libs {
            component Netmiko
            component Scrapli
            component NAPALM
            component Requests
            component ncclient
            component PyYAML #DDDDDD
            component json #DDDDDD
            component ElementTree #DDDDDD
        }

        component Nornir <<Framework>>
        component "Custom Python Scripts" as Scripts
        component Ansible <<Orchestrator>>
    }
    database "Inventory (YAML)" as Inventory
    file "Templates (Jinja2)" as Templates
}

package "Network Devices" {
    stack "Cisco IOS XE" as IOSXE {
        interface "CLI (SSH)" as IOSXE_CLI
        interface "RESTCONF (HTTP)" as IOSXE_RESTCONF
        interface "NETCONF (SSH)" as IOSXE_NETCONF
    }
    stack "Juniper Junos" as Junos {
        interface "CLI (SSH)" as Junos_CLI
        interface "NETCONF (SSH)" as Junos_NETCONF
        interface "REST API (HTTP)" as Junos_eAPI
    }
    stack "Arista EOS" as EOS {
        interface "CLI (SSH)" as EOS_CLI
        interface "eAPI (REST/HTTP)" as EOS_eAPI
    }
    stack "Other Vendors" as Other {
        interface "Varied Interfaces" as Other_IF
    }
}

NE --> Scripts : Develops / Executes
NE --> Ansible : Develops Playbooks

Scripts --> Nornir : Uses
Nornir --> Inventory : Reads
Nornir --> Templates : Renders config
Nornir --> Libs : Leverages for tasks

Ansible --> Libs : Leverages (via modules)
Ansible --> Inventory : Reads

Netmiko --> IOSXE_CLI : SSH
Netmiko --> Junos_CLI : SSH
Netmiko --> EOS_CLI : SSH

Scrapli --> IOSXE_CLI : SSH
Scrapli --> Junos_CLI : SSH
Scrapli --> EOS_CLI : SSH

NAPALM --> IOSXE_CLI : CLI/API_Driver
NAPALM --> Junos_NETCONF : CLI/API_Driver
NAPALM --> EOS_eAPI : CLI/API_Driver

Requests --> IOSXE_RESTCONF : HTTP
Requests --> Junos_eAPI : HTTP
Requests --> EOS_eAPI : HTTP

ncclient --> IOSXE_NETCONF : SSH
ncclient --> Junos_NETCONF : SSH

PyYAML --> Inventory : Parses / Generates
json --> IOSXE_RESTCONF : Parses / Generates
json --> Junos_eAPI : Parses / Generates
json --> EOS_eAPI : Parses / Generates
ElementTree --> IOSXE_NETCONF : Parses / Generates
ElementTree --> Junos_NETCONF : Parses / Generates

Libs --> Other_IF : Extensible

@enduml

2.5 Protocol Flow Diagram

This GraphViz diagram illustrates a generic flow for a Python script automating a network device, demonstrating the interaction over a chosen protocol.

digraph protocol_flow {
    rankdir=LR;
    node[shape=box];

    PythonScript [label="Python Automation Script"];
    NetworkLibrary [label="Network Library\n(e.g., Netmiko, ncclient, Requests)"];
    NetworkDevice [label="Network Device"];

    PythonScript -> NetworkLibrary [label="Invokes functions\n(e.g., connect, send_command, get_config)"];
    NetworkLibrary -> NetworkDevice [label="Establishes connection\n(SSH, HTTPS, etc.)\nSends commands/RPCs/API calls"];
    NetworkDevice -> NetworkLibrary [label="Sends response\n(CLI output, JSON, XML)"];
    NetworkLibrary -> PythonScript [label="Returns parsed data\nor raw output"];
}

2.6 Simplified NETCONF RPC Packet Structure (Conceptual)

NETCONF messages are XML documents carried over SSH rather than fixed-width binary packets, so the bit offsets in this packetdiag are purely illustrative. The diagram gives a conceptual view of the parts of a NETCONF RPC message and shows where YANG-modeled data defines the content.

packetdiag {
  colwidth = 32
  node_height = 72

  0-31: SSH Header (Encrypted)
  32-63: NETCONF Session ID
  64-95: Message ID
  96-127: Message Length
  128-159: NETCONF RPC Header
  160-191: Operation Type [e.g., <edit-config>]
  192-223: Target Datastore [e.g., <running>]
  224-255: Configuration Data (YANG-modeled XML) [e.g., <interface><name>GigabitEthernet1</name>...]
  256-287: ...
  288-319: RPC End Tag [e.g., </rpc>]
  320-351: End of Message Delimiter
}
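
The diagram above is conceptual; on the wire, an RPC is XML text wrapped in one of two framing schemes defined for NETCONF over SSH (RFC 6242): an end-of-message marker for NETCONF 1.0 sessions, or length-prefixed chunks for NETCONF 1.1. The sketch below builds both forms. The helper names are hypothetical, and libraries such as ncclient take care of this framing automatically.

```python
NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"


def build_rpc(message_id, operation_xml):
    """Wrap an operation in the <rpc> envelope with its message-id attribute."""
    return f'<rpc message-id="{message_id}" xmlns="{NETCONF_NS}">{operation_xml}</rpc>'


def frame_eom(rpc_xml):
    """NETCONF 1.0 framing: the message followed by the end-of-message marker."""
    return rpc_xml.encode("utf-8") + b"]]>]]>"


def frame_chunked(rpc_xml):
    """NETCONF 1.1 framing: a single length-prefixed chunk, then end-of-chunks."""
    data = rpc_xml.encode("utf-8")
    return b"\n#%d\n" % len(data) + data + b"\n##\n"


rpc = build_rpc(101, "<get-config><source><running/></source></get-config>")
print(frame_eom(rpc).decode("utf-8"))
print(frame_chunked(rpc).decode("utf-8"))
```

The message-id attribute is what lets a client match each rpc-reply to the request that produced it.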

3. Configuration Examples (Target Devices)

Before automating, it’s crucial to understand the configurations on the target devices that our Python scripts will interact with. These examples demonstrate basic device configurations that enable CLI access (for Netmiko/NAPALM) and API access (for Requests/ncclient).

3.1 Cisco IOS XE Configuration

Enabling SSH for CLI access and NETCONF/RESTCONF for API-driven automation.

! Enable SSH for CLI automation (Netmiko, Scrapli, NAPALM)
hostname CiscoRouter
ip domain-name example.com
crypto key generate rsa modulus 2048
line vty 0 4
 transport input ssh
 login local

username netdevops privilege 15 secret 0 YourSecurePassword
! Optionally create a separate local account dedicated to NETCONF/RESTCONF sessions
! username netconfuser privilege 15 secret 0 NetconfPass123
!
! Enable NETCONF-YANG over SSH (for ncclient)
! Requires IOS XE 16.x or newer
netconf-yang
!
! Enable RESTCONF-YANG over HTTPS (for Requests)
! Requires IOS XE 16.x or newer
restconf
ip http secure-server
ip http authentication local
!
! Security Warning: Replace 'YourSecurePassword' with a strong, unique password.
! For production environments, consider using RADIUS/TACACS+ or SSH key-based authentication.

3.2 Juniper Junos Configuration

Enabling SSH for CLI access and NETCONF over SSH.

# Set hostname
set system host-name JuniperSwitch

# Configure SSH for CLI automation (Netmiko, Scrapli, NAPALM)
set system services ssh protocol-version v2
set system login user netdevops uid 2000 class super-user
set system login user netdevops authentication plain-text-password
New password: YourSecurePassword
Retype new password: YourSecurePassword
# Or use SSH key-based authentication for production

# Enable NETCONF over SSH (for ncclient)
set system services netconf ssh
#
# Security Warning: Replace 'YourSecurePassword' with a strong, unique password.
# For production environments, consider using RADIUS/TACACS+ or SSH key-based authentication.

3.3 Arista EOS Configuration

Enabling SSH for CLI access and eAPI (a JSON-RPC API) over HTTPS.

! Set hostname
hostname AristaSwitch

! Configure SSH for CLI automation (Netmiko, Scrapli, NAPALM)
aaa authentication login default local
aaa authorization exec default local

username netdevops privilege 15 secret 0 YourSecurePassword
! Or use SSH key-based authentication

! Enable eAPI over HTTPS (for Requests)
management api http-command
  no shutdown
  protocol https
  ! vrf default
  ! ip http-server access-class ACL_MGMT
!
! Security Warning: Replace 'YourSecurePassword' with a strong, unique password.
! For production environments, consider using RADIUS/TACACS+ or SSH key-based authentication.

4. Network Diagrams

Network diagrams are essential for visualizing the automation environment and the devices involved.

4.1 Multi-Vendor Automation Topology

This nwdiag illustrates a simple multi-vendor network topology that our Python automation scripts might target.

nwdiag {
  network core_network {
    address = "10.0.0.0/24"

    CiscoCoreRouter [address = "10.0.0.1", shape = router];
    JuniperCoreSwitch [address = "10.0.0.2", shape = switch];
  }

  network access_network {
    address = "10.0.1.0/24"
    color = "#F0F8FF"; // AliceBlue

    CiscoAccessSwitch [address = "10.0.1.1", shape = switch];
    AristaAccessSwitch [address = "10.0.1.2", shape = switch];
  }

  network automation_network {
    address = "172.16.0.0/24"
    color = "#E6E6FA"; // Lavender

    AutomationServer [address = "172.16.0.10", shape = cloud];
  }

  CiscoCoreRouter -- CiscoAccessSwitch;
  JuniperCoreSwitch -- AristaAccessSwitch;
  AutomationServer -- CiscoCoreRouter;
  AutomationServer -- JuniperCoreSwitch;
  AutomationServer -- CiscoAccessSwitch;
  AutomationServer -- AristaAccessSwitch;
}

5. Automation Examples

This section provides practical Python code examples using the core libraries discussed.

5.1 CLI Automation with Netmiko (Cisco)

This script uses Netmiko to connect to a Cisco IOS XE device, retrieve its show ip interface brief output, and configure a loopback interface.

import os
from netmiko import ConnectHandler
from netmiko.exceptions import NetmikoAuthenticationException, NetmikoTimeoutException

# Device connection details (replace with your lab device info)
device = {
    "device_type": "cisco_ios",
    "host": os.getenv("CISCO_HOST", "192.168.1.10"),
    "username": os.getenv("CISCO_USERNAME", "netdevops"),
    "password": os.getenv("CISCO_PASSWORD", "YourSecurePassword"),
    # "secret": os.getenv("CISCO_ENABLE_PASSWORD", "enable_password") # Uncomment if enable password needed
}

try:
    print(f"Connecting to {device['host']}...")
    with ConnectHandler(**device) as net_connect:
        print("Connection successful!")

        # 1. Get and print operational data
        print("\n--- Current IP Interface Brief ---")
        output = net_connect.send_command("show ip interface brief")
        print(output)

        # 2. Configure a loopback interface
        print("\n--- Configuring Loopback Interface ---")
        config_commands = [
            "interface Loopback100",
            "description Automated Loopback",
            "ip address 10.255.255.100 255.255.255.255",
            "no shutdown"
        ]
        config_output = net_connect.send_config_set(config_commands)
        print(config_output)
        print("Loopback100 configured.")

        # 3. Verify configuration
        print("\n--- Verifying Loopback100 Configuration ---")
        verify_output = net_connect.send_command("show run interface Loopback100")
        print(verify_output)

        # 4. Save configuration (important for persistent changes)
        print("\n--- Saving Configuration ---")
        save_output = net_connect.save_config()  # Netmiko's platform-aware equivalent of 'write mem'
        print(save_output)
        print("Configuration saved.")

except NetmikoAuthenticationException:
    print("Authentication failed. Check username and password.")
except NetmikoTimeoutException:
    print("Connection timed out. Check host IP and network reachability.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
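
The script above prints raw CLI text. To act on it programmatically, the text has to be turned into structured data. The sketch below does this for show ip interface brief with a regular expression; the sample output and the parse_ip_interface_brief helper are illustrative assumptions, and column layout can vary between platforms and releases. Netmiko can also return parsed output directly via the use_textfsm=True argument of send_command when the ntc-templates package is installed.

```python
import re

# Matches one data row; the header row fails on the YES/NO column and is skipped.
LINE_RE = re.compile(
    r"^(?P<interface>\S+)\s+(?P<ip_address>\S+)\s+(?:YES|NO)\s+(?P<method>\S+)\s+"
    r"(?P<status>administratively down|up|down)\s+(?P<protocol>up|down)\s*$"
)

SAMPLE_OUTPUT = """\
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet1       192.168.1.10    YES NVRAM  up                    up
GigabitEthernet2       unassigned      YES NVRAM  administratively down down
Loopback100            10.255.255.100  YES manual up                    up
"""


def parse_ip_interface_brief(output):
    """Return one dictionary per interface row found in the command output."""
    rows = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


rows = parse_ip_interface_brief(SAMPLE_OUTPUT)
for row in rows:
    print(row["interface"], row["ip_address"], row["status"], row["protocol"])
```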

5.2 Multi-Vendor Abstraction with NAPALM (Cisco & Juniper)

This example uses NAPALM to get facts and then push a configuration change to both Cisco IOS XE and Juniper Junos devices using a common Python interface.

import json
import os
from napalm import get_network_driver
from napalm.base.exceptions import ConnectionException, NapalmException

# Device connection details (replace with your lab device info)
devices = [
    {
        "hostname": os.getenv("CISCO_HOST", "192.168.1.10"),
        "vendor": "ios", # For Cisco IOS XE
        "username": os.getenv("CISCO_USERNAME", "netdevops"),
        "password": os.getenv("CISCO_PASSWORD", "YourSecurePassword"),
    },
    {
        "hostname": os.getenv("JUNIPER_HOST", "192.168.1.20"),
        "vendor": "junos", # For Juniper Junos
        "username": os.getenv("JUNIPER_USERNAME", "netdevops"),
        "password": os.getenv("JUNIPER_PASSWORD", "YourSecurePassword"),
    },
]

# Equivalent configuration change, written in each platform's native syntax
cisco_config = """
interface Loopback101
  description NAPALM Configured Loopback
  ip address 10.255.255.101 255.255.255.255
  no shutdown
"""

juniper_config = """
interfaces {
    lo0 {
        unit 101 {
            description "NAPALM Configured Loopback";
            family inet {
                address 10.255.255.101/32;
            }
        }
    }
}
"""

for dev in devices:
    print(f"\n--- Processing device: {dev['hostname']} ({dev['vendor']}) ---")
    driver = get_network_driver(dev['vendor'])

    try:
        with driver(
            hostname=dev['hostname'],
            username=dev['username'],
            password=dev['password'],
            optional_args={} # Add optional_args like {'secret': 'enable_pass'} if needed
        ) as device:
            print("Connection successful!")

            # 1. Get device facts
            facts = device.get_facts()
            print(f"  Vendor: {facts['vendor']}, OS Version: {facts['os_version']}")

            # 2. Load and compare configuration
            # Note: the IOS driver normally transfers the candidate as a file, so the
            # device may need 'ip scp server enable' for this step to succeed.
            if dev['vendor'] == 'ios':
                device.load_merge_candidate(config=cisco_config)
            elif dev['vendor'] == 'junos':
                device.load_merge_candidate(config=juniper_config)

            diff = device.compare_config()
            if diff:
                print("\n  --- Configuration Differences ---")
                print(diff)

                # 3. Commit configuration
                print("  Committing configuration...")
                device.commit_config()
                print("  Configuration committed.")

                # 4. Verify (NAPALM does not have a direct 'show run interface LoopbackX' getter,
                #    so we'd rely on get_config or direct CLI via Netmiko/Scrapli if combining)
                #    For simplicity, we'll just show the successful commit.
            else:
                # 5. Nothing to commit, so drop the loaded candidate
                device.discard_config()
                print("  No configuration changes detected; candidate discarded.")

    except ConnectionException:
        print(f"  Connection failed to {dev['hostname']}. Check credentials/reachability.")
    except NapalmException as e:
        print(f"  NAPALM error for {dev['hostname']}: {e}")
    except Exception as e:
        print(f"  An unexpected error occurred for {dev['hostname']}: {e}")

5.3 Inventory and Task Management with Nornir (Multi-vendor)

This Nornir example leverages netmiko for device interaction to gather show version from multiple devices concurrently.

First, create the inventory/hosts.yaml and inventory/groups.yaml files.

inventory/hosts.yaml:

---
cisco_router:
  hostname: 192.168.1.10 # Replace with your Cisco device IP
  platform: ios
  groups:
    - cisco_devices

juniper_switch:
  hostname: 192.168.1.20 # Replace with your Juniper device IP
  platform: junos
  groups:
    - juniper_devices

arista_switch:
  hostname: 192.168.1.30 # Replace with your Arista device IP
  platform: eos
  groups:
    - arista_devices

inventory/groups.yaml:

---
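# Credentials are inherited by every group that lists 'global' as a parent group
global:
  username: netdevops
  password: YourSecurePassword # Security Warning: Use environment variables or Vault for production

cisco_devices:
  groups:
    - global
  platform: ios

juniper_devices:
  groups:
    - global
  platform: junos

arista_devices:
  groups:
    - global
  platform: eos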

Now the Python script:

import os
from nornir import InitNornir
from nornir_netmiko.tasks import netmiko_send_command
from nornir_utils.plugins.functions import print_result

# Initialize Nornir from a config file that points to the inventory.
nr = InitNornir(config_file="nornir_config.yaml")

# Optionally override inventory credentials from environment variables so that
# real passwords do not have to be stored in the YAML files.
# For production, consider a secrets manager or a custom credential store.
env_username = os.getenv("NORNIR_USERNAME")
env_password = os.getenv("NORNIR_PASSWORD")
for host in nr.inventory.hosts.values():
    if env_username:
        host.username = env_username
    if env_password:
        host.password = env_password

# Example nornir_config.yaml:
# ---
# inventory:
#   plugin: SimpleInventory
#   options:
#     host_file: inventory/hosts.yaml
#     group_file: inventory/groups.yaml
#     defaults_file: inventory/defaults.yaml # Optional
# runner:
#   plugin: threaded
#   options:
#     num_workers: 20

# Task to send 'show version' command
def get_device_version(task):
    """
    Sends 'show version' to the device; the output is stored in the task results.
    """
    # 'show version' is valid on Cisco IOS XE, Juniper Junos, and Arista EOS,
    # so no per-platform branching is needed here.
    task.run(task=netmiko_send_command, command_string="show version")

# Run the task on all devices
results = nr.run(task=get_device_version)

# Print a summary of the results
print_result(results)

Security Warning: Storing passwords directly in groups.yaml is NOT recommended for production. Use environment variables, a secrets management tool (like HashiCorp Vault), or assign credentials at runtime, for example by setting nr.inventory.hosts['mydevice'].password to a value read from a secure store. For this example, replace YourSecurePassword with a temporary password for your lab environment.

5.4 RESTCONF Automation with Requests (Cisco IOS XE)

This example uses the requests library to interact with a Cisco IOS XE device’s RESTCONF API to retrieve interface data and potentially configure an interface.

import os
import requests
import json
from requests.auth import HTTPBasicAuth

# Device connection details (replace with your lab device info)
HOST = os.getenv("CISCO_HOST", "192.168.1.10")
USERNAME = os.getenv("CISCO_USERNAME", "netdevops")
PASSWORD = os.getenv("CISCO_PASSWORD", "YourSecurePassword")
PORT = 443 # RESTCONF typically uses HTTPS

# Base URL for RESTCONF
BASE_URL = f"https://{HOST}:{PORT}/restconf/data"

# Disable SSL warnings for lab environments (NOT for production!)
requests.packages.urllib3.disable_warnings()

headers = {
    "Accept": "application/yang-data+json",
    "Content-Type": "application/yang-data+json"
}

try:
    # 1. Get all interfaces using IETF-interfaces YANG model
    print(f"Retrieving interfaces from {HOST} via RESTCONF...")
    url = f"{BASE_URL}/ietf-interfaces:interfaces"
    response = requests.get(url, headers=headers, auth=HTTPBasicAuth(USERNAME, PASSWORD), verify=False)
    response.raise_for_status() # Raise an exception for HTTP errors
    
    interfaces_data = response.json()
    print("\n--- Current Interfaces ---")
    print(json.dumps(interfaces_data, indent=2))

    # 2. Configure a Loopback interface via RESTCONF (PATCH operation)
    # A plain PATCH merges the payload into an existing resource; per RFC 8040 it does
    # not create a missing target. Patching the parent 'interfaces' container lets the
    # device create Loopback102 if it is absent or update it if it already exists.
    print("\n--- Configuring Loopback Interface via RESTCONF ---")
    config_url = f"{BASE_URL}/ietf-interfaces:interfaces"
    
    # YANG model for interface configuration (simplified)
    # The exact structure depends on the specific YANG model implemented by the device.
    # For Cisco-native, you might use 'Cisco-IOS-XE-native:native/interface/Loopback'
    # For IETF, it would be 'ietf-interfaces:interfaces'
    interface_config = {
        "ietf-interfaces:interfaces": {
            "interface": [
                {
                    "name": "Loopback102",
                    "description": "RESTCONF Automated Loopback",
                    "type": "iana-if-type:softwareLoopback",
                    "enabled": True,
                    "ietf-ip:ipv4": {
                        "address": [
                            {
                                "ip": "10.255.255.102",
                                "netmask": "255.255.255.255"
                            }
                        ]
                    }
                }
            ]
        }
    }

    # Use PUT for full replacement, PATCH for merge/partial update
    config_response = requests.patch(
        config_url,
        headers=headers,
        auth=HTTPBasicAuth(USERNAME, PASSWORD),
        verify=False,
        data=json.dumps(interface_config)
    )
    config_response.raise_for_status()

    if config_response.status_code == 204: # No Content, success for PATCH
        print("Loopback102 configured successfully.")
    else:
        print(f"Configuration response: {config_response.status_code} - {config_response.text}")

    # 3. Verify the configured Loopback interface
    print("\n--- Verifying Loopback102 Configuration ---")
    verify_url = f"{BASE_URL}/ietf-interfaces:interfaces/interface=Loopback102"
    verify_response = requests.get(verify_url, headers=headers, auth=HTTPBasicAuth(USERNAME, PASSWORD), verify=False)
    verify_response.raise_for_status()
    print(json.dumps(verify_response.json(), indent=2))

except requests.exceptions.HTTPError as e:
    print(f"HTTP Error: {e.response.status_code} - {e.response.text}")
except requests.exceptions.ConnectionError:
    print(f"Connection Error: Could not connect to {HOST}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

Security Warning: verify=False disables SSL certificate validation, which is extremely dangerous in production environments as it makes you vulnerable to man-in-the-middle attacks. Always use valid certificates and verify=True in production.

5.5 NETCONF Automation with ncclient (Juniper Junos)

This example uses ncclient to connect to a Juniper Junos device via NETCONF, retrieve interface data, and modify an interface description.

import os
from ncclient import manager
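from xml.dom import minidom


def to_xml_string(xml_text, pretty_print=True):
    """Return a NETCONF reply (an XML string), optionally pretty-printed.

    Local helper: ncclient.xml_ does not provide a function with this name.
    """
    if not pretty_print:
        return xml_text
    return minidom.parseString(xml_text).toprettyxml(indent="  ")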

# Device connection details (replace with your lab device info)
HOST = os.getenv("JUNIPER_HOST", "192.168.1.20")
USERNAME = os.getenv("JUNIPER_USERNAME", "netdevops")
PASSWORD = os.getenv("JUNIPER_PASSWORD", "YourSecurePassword")
PORT = 830 # Default NETCONF port

# Host key checking is disabled below for lab use only (NOT for production!).
# In production, maintain a known_hosts file and leave hostkey_verify enabled.

try:
    print(f"Connecting to {HOST} via NETCONF...")
    with manager.connect(
        host=HOST,
        port=PORT,
        username=USERNAME,
        password=PASSWORD,
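        hostkey_verify=False, # WARNING: Disable for lab, enable/configure for production
        device_params={"name": "junos"}, # Required for Junos RPCs such as load_configuration and compare_configuration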
        allow_agent=False,
        look_for_keys=False,
        timeout=10 # seconds
    ) as m:
        print("Connection successful!")

        # 1. Get device capabilities
        print("\n--- Device Capabilities ---")
        for capability in m.server_capabilities:
            print(f"  {capability}")

        # 2. Get the interface configuration from the running datastore
        print("\n--- Retrieving Interface Configuration ---")
        # The subtree filter limits the reply to the <interfaces> hierarchy.
        # Note: get_config returns configuration only. Operational state (counters,
        # link status) requires a separate operational RPC or a YANG-based filter.
        interface_filter_xml = """
            <configuration>
                <interfaces/>
            </configuration>
        """
        result = m.get_config(source='running', filter=('subtree', interface_filter_xml))
        print(to_xml_string(result.data_xml, pretty_print=True))

        # 3. Configure a description on a logical interface (lo0 unit 0)
        print("\n--- Configuring Interface Description (lo0 unit 0) ---")
        # This XML represents a configuration snippet for Junos
        config_xml = """
            <configuration>
                <interfaces>
                    <interface>
                        <name>lo0</name>
                        <unit>
                            <name>0</name>
                            <description>Managed by ncclient script</description>
                        </unit>
                    </interface>
                </interfaces>
            </configuration>
        """
        
        # Load and commit the configuration
        m.lock('candidate')
        m.load_configuration(config=config_xml, format='xml')
        compare_result = m.compare_configuration()
        if compare_result.xpath('//rpc-error'):
            print(f"Error during configuration comparison: {to_xml_string(compare_result)}")
            m.unlock('candidate')
            raise ValueError("Configuration comparison failed.")

        print("\n--- Configuration Differences ---")
        print(to_xml_string(compare_result, pretty_print=True))

        if "no changes" not in str(compare_result):
            print("Committing configuration...")
            commit_result = m.commit()
            print("Configuration committed.")
        else:
            print("No changes to commit.")
        
        m.unlock('candidate')

        # 4. Verify configuration
        print("\n--- Verifying Configuration on lo0 unit 0 ---")
        verify_filter_xml = """
            <configuration>
                <interfaces>
                    <interface>
                        <name>lo0</name>
                        <unit>
                            <name>0</name>
                        </unit>
                    </interface>
                </interfaces>
            </configuration>
        """
        verify_result = m.get_config(source='running', filter=('subtree', verify_filter_xml))
        print(to_xml_string(verify_result.data_xml, pretty_print=True))

except rpc.RPCError as e:
    print(f"NETCONF RPC Error: {e.info}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

Security Warning: hostkey_verify=False disables host key verification, making your connection vulnerable to man-in-the-middle attacks. In production, ensure host keys are properly verified. Also, avoid hardcoding credentials; use environment variables, a vault, or SSH keys.
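
The examples in this chapter fall back to default credentials when an environment variable is unset, which is convenient in a lab but easy to carry into production by accident. A stricter alternative is to fail fast when a variable is missing. The sketch below uses only the standard library; load_credentials is a hypothetical helper name, and the variable names follow the JUNIPER_USERNAME / JUNIPER_PASSWORD convention used above.

```python
import os


def load_credentials(prefix):
    """Read <PREFIX>_USERNAME and <PREFIX>_PASSWORD, failing fast if either is unset."""
    names = (f"{prefix}_USERNAME", f"{prefix}_PASSWORD")
    missing = [name for name in names if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return os.environ[names[0]], os.environ[names[1]]


# Demonstration only: a real run would export these in the shell or inject them from a vault.
os.environ.setdefault("JUNIPER_USERNAME", "netdevops")
os.environ.setdefault("JUNIPER_PASSWORD", "example-only")
username, password = load_credentials("JUNIPER")
```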

6. Security Considerations

Network automation introduces powerful capabilities, but also new attack vectors if not secured properly.

  • Credential Management:
    • Attack Vector: Hardcoding usernames/passwords in scripts. Storing credentials insecurely on automation hosts.
    • Mitigation: Use environment variables, HashiCorp Vault, Ansible Vault, CyberArk, or other secrets management solutions. Implement SSH key-based authentication for devices where possible, protecting private keys with strong passphrases.
    • Best Practice: Never commit credentials to version control.
  • Access Control (AAA):
    • Attack Vector: Using privileged accounts (e.g., admin) for all automation tasks.
    • Mitigation: Implement Role-Based Access Control (RBAC) on network devices. Create dedicated automation accounts with minimal necessary privileges. Utilize TACACS+/RADIUS for centralized authentication and authorization.
    • Security Configuration Example (Cisco):
      ! Create a specific user for automation with limited privileges
      username automation_user privilege 10 secret 0 AutomationPass
      ! Define an AAA method list that prioritizes TACACS+ or local
      aaa new-model
      aaa authentication login default group tacacs+ local
      aaa authorization exec default group tacacs+ local
      
  • Secure Transport:
    • Attack Vector: Using unencrypted protocols like Telnet or HTTP. Disabling SSL/TLS verification.
    • Mitigation: Always use SSH for CLI/NETCONF and HTTPS for RESTCONF/eAPI. Ensure proper SSL/TLS certificate validation (verify=True in requests).
    • Security Warning: Examples often use verify=False for lab simplicity. This is a critical security vulnerability and must be avoided in production.
  • Input Validation & Sanitization:
    • Attack Vector: Injecting malicious commands or data into automation scripts (e.g., through user input or insecure templating).
    • Mitigation: Rigorously validate all input data before generating configurations. Use templating engines (like Jinja2) with auto-escaping. Avoid directly concatenating user input into commands or API payloads.
  • Logging and Auditing:
    • Attack Vector: Lack of visibility into automation actions, making it hard to detect unauthorized changes or troubleshoot.
    • Mitigation: Implement comprehensive logging in automation scripts, capturing executed commands, API calls, responses, and the user who initiated the automation. Integrate with SIEM systems. Network devices should also log programmatic access.
  • Code Security:
    • Attack Vector: Vulnerabilities in Python code, outdated libraries.
    • Mitigation: Regularly update Python libraries. Perform static code analysis. Follow secure coding guidelines.
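
The input validation mitigation listed above can be sketched with the standard library alone. This is an illustrative example under stated assumptions: build_loopback_commands is a hypothetical helper, and the allowed patterns for interface names and descriptions are deliberately conservative choices, not vendor rules.

```python
import ipaddress
import re

# Conservative allow-lists (assumptions for this sketch, not vendor limits).
LOOPBACK_NAME = re.compile(r"Loopback\d{1,10}")
SAFE_DESCRIPTION = re.compile(r"[A-Za-z0-9 _.\-]{1,80}")


def build_loopback_commands(name, cidr, description):
    """Validate every field before it is placed into a CLI command."""
    if not LOOPBACK_NAME.fullmatch(name):
        raise ValueError(f"Invalid interface name: {name!r}")
    if not SAFE_DESCRIPTION.fullmatch(description):
        raise ValueError(f"Invalid description: {description!r}")
    iface = ipaddress.ip_interface(cidr)  # raises ValueError on malformed input
    if iface.version != 4:
        raise ValueError(f"Expected an IPv4 address, got: {cidr!r}")
    return [
        f"interface {name}",
        f"description {description}",
        f"ip address {iface.ip} {iface.netmask}",
    ]


commands = build_loopback_commands("Loopback100", "10.255.255.100/32", "Automated Loopback")
print(commands)
```

Because the description pattern excludes newlines, an input such as "x\nusername evil privilege 15" is rejected before it can become a second command.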

7. Verification & Troubleshooting

Effective automation relies on robust verification and efficient troubleshooting.

7.1 Verification Commands and Expected Output

After any automation task, it’s crucial to verify the changes on the network device.

Scenario: Configured Loopback100 on Cisco IOS XE via Netmiko.

# Verification Command (Cisco IOS XE)
show run interface Loopback100
! Expected Output:
interface Loopback100
 description Automated Loopback
 ip address 10.255.255.100 255.255.255.255
end

Scenario: Configured lo0 unit 101 on Juniper Junos via ncclient.

# Verification Command (Juniper Junos)
show configuration interfaces lo0 unit 101
# Expected Output (only the contents of the requested hierarchy level are shown):
description "Managed by ncclient script";
family inet {
    address 10.255.255.101/32;
}
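
Manual checks like these can also be scripted. The sketch below compares captured CLI output against the lines you expect to see; missing_config_lines is a hypothetical helper, and the sample text is copied from the Cisco scenario above rather than fetched from a device.

```python
def missing_config_lines(show_output, expected_lines):
    """Return the expected lines that do not appear in the device output."""
    present = {line.strip() for line in show_output.splitlines()}
    return [line for line in expected_lines if line.strip() not in present]


# Sample output copied from the Cisco IOS XE scenario.
sample_output = """interface Loopback100
 description Automated Loopback
 ip address 10.255.255.100 255.255.255.255
end"""

expected = [
    "description Automated Loopback",
    "ip address 10.255.255.100 255.255.255.255",
]
missing = missing_config_lines(sample_output, expected)
print("OK" if not missing else f"Missing: {missing}")
```

In a real script, show_output would come from a call such as Netmiko's send_command("show run interface Loopback100").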

7.2 Troubleshooting Common Issues

| Issue Category | Common Problems | Debug Commands/Techniques | Resolution Steps | Root Cause Analysis |
|---|---|---|---|---|
| Connectivity | NetmikoTimeoutException, ConnectionError | ping <device_ip>, traceroute <device_ip> | Verify IP address, firewall rules, network reachability. | Device unreachable, incorrect IP, port blocked, network segmentation. |
| Authentication | NetmikoAuthenticationException, RPCError | Check username/password, debug ssh authentication | Correct credentials, ensure user exists and has correct permissions on device. | Incorrect password, username, SSH key issue, AAA server unreachable. |
| Authorization | Permission denied, RPCError | Check device logs (show log), user privileges | Assign correct privilege level or RBAC role. | Automation account lacks necessary privileges for the operation. |
| Configuration | ParsingError, Invalid command, API error codes | Pass error_pattern=... to Netmiko's send_config_set(...), check API response status codes/bodies. | Review syntax, consult vendor docs, test the commands manually on the device. | Malformed configuration, invalid YANG path, incorrect data format (JSON/XML). |
| Library Errors | ImportError, unexpected exceptions | Check Python virtual environment, pip list, traceback | Install/update missing libraries, review traceback for code errors. | Missing dependencies, conflicting library versions, logic errors in automation script. |
| API Specific | Incorrect data structures, 4xx/5xx HTTP errors | Use Postman/Insomnia to test API, print API response. | Consult API documentation, correct JSON/XML payload structure. | Incorrect URL, header, payload, or method for the API endpoint. |
| YANG/NETCONF | bad-element, data-not-unique, malformed-message | Check NETCONF error responses, consult YANG model schema. | Verify XML payload against YANG model, ensure data validity. | Incorrect XML syntax, violation of YANG constraints, missing mandatory elements. |

Python Debugging Tips:

  • Print Statements: Liberal use of print() to trace variable values and execution flow.
  • Logging: Use Python’s logging module for structured output.
    import logging
    logging.basicConfig(level=logging.DEBUG)
    # netmiko_logger = logging.getLogger("netmiko")
    # netmiko_logger.setLevel(logging.DEBUG) # Enable Netmiko internal debug logs
    
  • pdb (Python Debugger): Insert import pdb; pdb.set_trace() in your code to stop execution and inspect variables interactively.
  • Context Managers: Use with ConnectHandler(...) or with driver(...) to ensure connections are properly closed even if errors occur.
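
To see why the context manager tip matters, the following sketch uses a stand-in class instead of a real device session. FakeConnection is hypothetical and exists only to show that the cleanup in __exit__ runs even when the body of the with block raises.

```python
class FakeConnection:
    """Stand-in for a device session; tracks whether it is still open."""

    def __init__(self):
        self.connected = False

    def __enter__(self):
        self.connected = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self.disconnect()  # always runs, with or without an exception
        return False       # do not swallow the exception

    def disconnect(self):
        self.connected = False


conn = FakeConnection()
try:
    with conn:
        raise RuntimeError("simulated command failure")
except RuntimeError as err:
    print(f"Handled: {err}; still connected? {conn.connected}")
```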

8. Performance Optimization

Efficient execution is critical for large-scale automation.

  • Concurrency:
    • Technique: Utilize Nornir’s built-in concurrency (num_workers) or Python’s concurrent.futures module for ThreadPoolExecutor/ProcessPoolExecutor to interact with multiple devices simultaneously.
    • Benefit: Reduces total execution time for tasks involving many devices.
  • Asynchronous I/O:
    • Technique: Use libraries designed for async operations, like Scrapli with asyncio, or httpx instead of requests for API calls.
    • Benefit: Allows a single thread to manage multiple I/O operations without blocking, improving responsiveness and throughput, especially for I/O-bound tasks.
  • Persistent Connections:
    • Technique: Reuse established SSH or API sessions (ConnectHandler context managers, requests.Session()) rather than re-establishing for every command or request.
    • Benefit: Reduces connection overhead (handshakes, authentication).
  • Targeted Data Retrieval:
    • Technique: Instead of show run all or retrieving entire API configurations, request only the specific data needed using filters or specific API endpoints.
    • Benefit: Reduces network traffic, device CPU load, and parsing time.
  • Efficient Parsing:
    • Technique: Use specialized parsing libraries (e.g., textfsm, genie) for CLI output or direct API responses. For JSON/XML, parse only relevant sections.
    • Benefit: Faster data extraction and processing.
  • Profiling:
    • Technique: Use Python’s cProfile module or external tools to identify performance bottlenecks in your automation scripts.
    • Benefit: Pinpoints code sections consuming the most time or resources for targeted optimization.
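
The concurrency technique above can be sketched with concurrent.futures from the standard library. The per-device task here is a placeholder: collect_version is a hypothetical function that sleeps to imitate network I/O, and the host list is made up for illustration. In practice the task body would open a Netmiko, ncclient, or HTTP session.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect_version(host):
    """Hypothetical per-device task; replace the sleep with a real SSH or API call."""
    time.sleep(0.1)  # simulates waiting on network I/O
    return f"{host}: ok"


def run_on_all(hosts, task, max_workers=10):
    """Run task(host) for every host in a thread pool and collect results per host."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(task, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = future.result()
            except Exception as exc:  # one failing device should not stop the rest
                results[host] = f"failed: {exc}"
    return results


hosts = [f"192.168.1.{n}" for n in range(10, 20)]
start = time.perf_counter()
results = run_on_all(hosts, collect_version)
elapsed = time.perf_counter() - start
print(f"{len(results)} devices in {elapsed:.2f}s")
```

Threads suit this workload because the time is spent waiting on devices rather than on CPU work. Keep max_workers modest so that AAA servers and device VTY lines are not overwhelmed.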

9. Hands-On Lab: Automating Multi-Vendor Loopback Configuration

This lab will guide you through configuring loopback interfaces on a Cisco IOS XE router and a Juniper Junos switch using Python with Netmiko and ncclient respectively.

9.1 Lab Topology

This nwdiag depicts the simple lab topology.

nwdiag {
  network mgmt_network {
    address = "192.168.1.0/24"
    color = "#ADD8E6"; // LightBlue

    AutomationHost [address = "192.168.1.5", shape = cloud];
    CiscoRouter [address = "192.168.1.10", description = "Cisco IOS XE Router", shape = router];
    JuniperSwitch [address = "192.168.1.20", description = "Juniper Junos Switch", shape = switch];
  }
}

9.2 Objectives

  1. Connect to the Cisco IOS XE router via Netmiko and configure Loopback100 with IP 10.255.255.100/32.
  2. Connect to the Juniper Junos switch via ncclient and configure lo0.100 with IP 10.255.255.100/32 and a description.
  3. Verify the configurations on both devices.

9.3 Step-by-Step Configuration

Step 1: Prepare Network Devices

Ensure your Cisco and Juniper devices are reachable from your AutomationHost and have SSH/NETCONF enabled as per Section 3. Create a user netdevops with password YourSecurePassword (or your chosen lab credentials) with sufficient privileges.

Step 2: Set up Python Environment

  1. Install Libraries:
    python3 -m venv netauto_env
    source netauto_env/bin/activate
    pip install netmiko ncclient napalm nornir nornir-netmiko nornir-utils requests
    
  2. Create Python Script (lab_automation.py):
    import os
    from netmiko import ConnectHandler
    from ncclient import manager
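    from netmiko.exceptions import NetmikoAuthenticationException, NetmikoTimeoutException
    from lxml import etree  # installed as an ncclient dependency
    
    
    def to_xml_string(xml, pretty_print=False):
        """Render reply data as text (ncclient.xml_ does not provide to_xml_string)."""
        raw = xml if isinstance(xml, (str, bytes)) else str(xml)
        if isinstance(raw, str):
            raw = raw.encode()  # lxml rejects str input that carries an encoding declaration
        return etree.tostring(etree.fromstring(raw), pretty_print=pretty_print, encoding="unicode")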
    
    # --- Device Configuration ---
    # Replace with your actual lab IP addresses and credentials
    cisco_ios_xe = {
        "device_type": "cisco_ios",
        "host": os.getenv("CISCO_LAB_HOST", "192.168.1.10"),
        "username": os.getenv("LAB_USERNAME", "netdevops"),
        "password": os.getenv("LAB_PASSWORD", "YourSecurePassword"),
    }
    
    juniper_junos = {
        "host": os.getenv("JUNIPER_LAB_HOST", "192.168.1.20"),
        "username": os.getenv("LAB_USERNAME", "netdevops"),
        "password": os.getenv("LAB_PASSWORD", "YourSecurePassword"),
        "port": 830, # Default NETCONF port
    }
    
    # --- Cisco Automation with Netmiko ---
    def configure_cisco_loopback():
        print(f"\n--- Configuring Cisco IOS XE ({cisco_ios_xe['host']}) with Netmiko ---")
        try:
            with ConnectHandler(**cisco_ios_xe) as net_connect:
                print("Connected to Cisco IOS XE.")
                config_commands = [
                    "interface Loopback100",
                    "description Netmiko Lab Loopback",
                    "ip address 10.255.255.100 255.255.255.255",
                    "no shutdown"
                ]
                output = net_connect.send_config_set(config_commands)
                print(output)
                net_connect.save_config()  # Netmiko helper that saves the running config
                print("Cisco Loopback100 configured and saved.")
        except (NetmikoAuthenticationException, NetmikoTimeoutException) as e:
            print(f"Cisco Connection/Authentication Error: {e}")
        except Exception as e:
            print(f"An unexpected error occurred with Cisco: {e}")
    
    # --- Juniper Automation with ncclient ---
    def configure_juniper_loopback():
        print(f"\n--- Configuring Juniper Junos ({juniper_junos['host']}) with ncclient ---")
        try:
            # WARNING: hostkey_verify=False for lab only, use proper host key management in production
            with manager.connect(
                host=juniper_junos['host'],
                port=juniper_junos['port'],
                username=juniper_junos['username'],
                password=juniper_junos['password'],
                hostkey_verify=False,
                device_params={"name": "junos"},  # Required for load_configuration/compare_configuration
                timeout=10
            ) as m:
                print("Connected to Juniper Junos.")
                config_xml = """
                    <configuration>
                        <interfaces>
                            <interface>
                                <name>lo0</name>
                                <unit>
                                    <name>100</name>
                                    <description>ncclient Lab Loopback</description>
                                    <family>
                                        <inet>
                                            <address>
                                                <name>10.255.255.100/32</name>
                                            </address>
                                        </inet>
                                    </family>
                                </unit>
                            </interface>
                        </interfaces>
                    </configuration>
                """
                m.lock('candidate')
                m.load_configuration(config=config_xml, format='xml')
    
                compare_result = m.compare_configuration()
                if "no changes" not in str(compare_result):
                    print("\n--- Junos Configuration Differences ---")
                    print(to_xml_string(compare_result, pretty_print=True))
                    m.commit()
                    print("Juniper lo0.100 configured and committed.")
                else:
                    print("No changes needed on Juniper.")
                m.unlock('candidate')
    
        except Exception as e:
            print(f"An error occurred with Juniper: {e}")
    
    # --- Main Execution ---
    if __name__ == "__main__":
        configure_cisco_loopback()
        configure_juniper_loopback()
    

Step 3: Run the Automation Script

Execute the script from your AutomationHost:

python lab_automation.py

9.4 Verification Steps

After the script runs, manually verify the configuration on each device:

Cisco IOS XE Router:

show run interface Loopback100
show ip interface brief

Juniper Junos Switch:

show configuration interfaces lo0 unit 100 | display set
show interfaces terse lo0.100

9.5 Challenge Exercises

  1. Error Handling: Modify the scripts to include more specific error handling for common issues (e.g., if a loopback interface already exists, or if a NETCONF operation fails).
  2. Idempotency: Make the Juniper configuration idempotent. How would you ensure the lo0.100 description is only changed if it’s different from the desired state?
  3. Data-Driven: Instead of hardcoding IP addresses, create a simple YAML file to define the loopback configurations for both devices and load this data into your Python script.
  4. NAPALM: Refactor the lab to use NAPALM’s load_merge_candidate for both devices, demonstrating its multi-vendor abstraction capabilities for configuration.
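
As a starting point for the idempotency exercise, the sketch below compares the description found in a configuration reply with the desired value before any change is sent. It is a sketch under assumptions, not a full solution: the function names are hypothetical, the sample XML is hand-written to mirror the lab payload rather than captured from a device, and tags are matched by local name so the logic works whether or not the reply carries namespaces.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional {namespace} prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def child_text(element, name):
    """Return the text of the first direct child with the given local name."""
    for child in element:
        if local_name(child.tag) == name:
            return (child.text or "").strip()
    return None


def current_unit_description(config_xml, interface, unit):
    """Find the description of interface.unit in a configuration reply, or None."""
    root = ET.fromstring(config_xml.strip())
    for iface in root.iter():
        if local_name(iface.tag) != "interface" or child_text(iface, "name") != interface:
            continue
        for node in iface:
            if local_name(node.tag) == "unit" and child_text(node, "name") == unit:
                return child_text(node, "description")
    return None


def description_needs_change(config_xml, interface, unit, desired):
    return current_unit_description(config_xml, interface, unit) != desired


# Hand-written sample shaped like the lab payload (not real device output).
sample_reply = """
<configuration>
  <interfaces>
    <interface>
      <name>lo0</name>
      <unit>
        <name>100</name>
        <description>ncclient Lab Loopback</description>
      </unit>
    </interface>
  </interfaces>
</configuration>
"""
print(description_needs_change(sample_reply, "lo0", "100", "ncclient Lab Loopback"))  # False
print(description_needs_change(sample_reply, "lo0", "100", "New description"))        # True
```

A script built on this check would call load_configuration and commit only when description_needs_change returns True.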

10. Best Practices Checklist

Adhering to these best practices will ensure your Python network automation efforts are robust, scalable, and secure.

  • Credential Security: Use environment variables, vaults, or SSH keys (with passphrases) for all sensitive credentials. Never hardcode or commit them to version control.
  • Idempotency: Design scripts to be idempotent, meaning running them multiple times yields the same result without unintended side effects.
  • Error Handling: Implement robust try-except blocks to gracefully handle connection failures, authentication issues, and API errors.
  • Logging: Log all automation activities, including connection attempts, commands sent, responses received, and any errors. Use Python’s logging module.
  • Virtual Environments: Always use Python virtual environments (venv or conda) to manage project dependencies.
  • Code Readability: Write clean, well-commented code, following PEP 8 style guidelines.
  • Version Control: Store all automation scripts, inventory, and templates in a version control system (e.g., Git).
  • Modularity: Break down complex automation tasks into smaller, reusable functions or modules.
  • Testing: Implement unit tests and integration tests for your automation code to ensure reliability.
  • Documentation: Document your scripts, their purpose, dependencies, and how to run them.
  • Targeted Changes: Apply only the necessary configuration changes. Avoid broad replace-config operations unless explicitly intended.
  • API First (where possible): Prioritize using programmatic APIs (NETCONF, RESTCONF, gRPC) over CLI parsing for greater reliability and structured data.
  • Multi-Vendor Strategy: When dealing with multiple vendors, leverage abstraction layers like NAPALM or Nornir with appropriate plugins.
  • Secure Transport: Always use SSH/HTTPS for device communication; never disable SSL/TLS verification in production.
  • Rollback Strategy: Plan for how to revert changes if an automation task fails or causes unintended issues.
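
The testing item in this checklist can start small: keep configuration rendering in plain functions and test those without touching a device. The following is a minimal sketch using the standard unittest module. loopback_commands is a hypothetical function standing in for your own config builder.

```python
import unittest


def loopback_commands(number, ip, mask):
    """Hypothetical config builder under test."""
    return [f"interface Loopback{number}", f"ip address {ip} {mask}"]


class LoopbackCommandsTest(unittest.TestCase):
    def test_renders_expected_commands(self):
        self.assertEqual(
            loopback_commands(100, "10.255.255.100", "255.255.255.255"),
            ["interface Loopback100", "ip address 10.255.255.100 255.255.255.255"],
        )


# Run the suite programmatically so the example works in a script or notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoopbackCommandsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests of this kind run in seconds in a CI job, which leaves integration tests against lab devices for the behavior that genuinely needs them.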

12. What’s Next

This chapter has provided a robust foundation in Python’s core libraries for network automation, empowering you to interact with network devices using CLI, RESTCONF, and NETCONF. You’ve gained practical experience with tools like Netmiko, NAPALM, Nornir, Requests, and ncclient, understanding their strengths and best use cases in a multi-vendor environment.

In the next chapter, we will expand on this foundation by exploring Infrastructure as Code (IaC) principles with Ansible for Network Automation. We will see how Ansible, a powerful automation engine, can leverage many of these Python libraries through its modules and collections to define network state declaratively, manage inventory, and orchestrate complex workflows efficiently. We’ll also delve into templating with Jinja2 and integrating these tools into a complete NetDevOps pipeline.