Every network engineer has a spreadsheet. Usually several. One for the IP address plan, another for the patch panel layout, maybe a Visio diagram that was last updated two quarters ago. These artifacts live in SharePoint folders, get emailed around, fall out of sync, and silently become liabilities.
We had all of that. And then we moved the entire topology definition into YAML files, committed them to Git, and let automation consume them. The result was a level of consistency and speed in site deployments that spreadsheets could never deliver.
The Problem with Traditional Documentation
When you’re deploying network infrastructure across dozens of sites, every site needs an IP address plan, a VLAN assignment sheet, a cable patch plan, a hardware bill of materials, and usually a Day 1 configuration. Traditionally, an engineer opens the master spreadsheet, manually carves out the next available subnets, figures out port assignments, and types it all up.
This works when you’re doing one site a quarter. It falls apart when you’re doing five a month. People make mistakes. Two sites get overlapping subnets. The patch plan says port 49 but the physical cable lands on port 50. The Visio diagram shows a connection that existed three hardware refreshes ago.
The root issue is that the topology knowledge lives in unstructured, human-only formats. A spreadsheet cell that says “VLAN 100 = Data” is meaningful to an engineer reading it, but it’s useless to a script that needs to generate a configuration.
Defining Topology as Data
The shift was conceptually simple: describe the network topology in YAML, a format that’s easy for humans to read and trivial for machines to parse. Every aspect of the site design got expressed as structured data.
VLANs and Address Allocation
Instead of a spreadsheet tab with VLAN numbers and descriptions, we defined them once:
vlans:
  100:
    name: "Data"
    description: "Corporate data - wired endpoints"
    purpose: "user"
  150:
    name: "Voice"
    description: "Voice/telephony traffic"
    purpose: "user"
  200:
    name: "Guest"
    description: "Guest wireless and wired access"
    purpose: "user"
  300:
    name: "Services"
    description: "Network services, AP registration"
    purpose: "infrastructure"
Subnet sizing varies by site size, so we defined allocation rules per site type:
subnet_allocation:
  small:
    infra_block: "/24"
    user_block: "/23"
    crosslinks: "/26"
    loopbacks: "/27"
    vlan_100: "/24"
    vlan_150: "/26"
    vlan_200: "/25"
  medium:
    infra_block: "/23"
    user_block: "/21"
    crosslinks: "/25"
    loopbacks: "/26"
    vlan_100: "/22"
    vlan_150: "/24"
    vlan_200: "/23"
Give the automation a site type and a parent supernet, and it carves out every subnet deterministically. No guesswork, no overlap, no “let me check the spreadsheet.”
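To make the carving concrete, here is a minimal sketch of deterministic allocation using Python's standard `ipaddress` module. The function name, the inline rule table, and the example supernet are illustrative, not the production tool; allocating largest blocks first keeps every subnet boundary-aligned.

```python
import ipaddress

# Illustrative subset of the allocation rules above (prefix lengths per VLAN).
ALLOCATION = {
    "small": {"vlan_100": 24, "vlan_150": 26, "vlan_200": 25},
    "medium": {"vlan_100": 22, "vlan_150": 24, "vlan_200": 23},
}

def carve_subnets(supernet: str, site_type: str) -> dict:
    """Deterministically carve one subnet per VLAN from the parent supernet."""
    pool = ipaddress.ip_network(supernet)
    rules = ALLOCATION[site_type]
    plan = {}
    cursor = pool.network_address
    # Allocate the largest blocks first so smaller ones pack in cleanly
    # behind them and every block starts on its own boundary.
    for vlan, prefix in sorted(rules.items(), key=lambda kv: kv[1]):
        net = ipaddress.ip_network(f"{cursor}/{prefix}", strict=False)
        plan[vlan] = str(net)
        cursor = net.broadcast_address + 1
    return plan
```

Because the iteration order and the rules are fixed, the same site type and supernet always produce the same plan, which is what makes the output diffable later on.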
Site Types and Hardware Roles
Each site type defines what hardware roles are present, their redundancy model, and the specific device models:
site_types:
  small:
    description: "Small site - single access stack, no core"
    roles:
      mandatory:
        - wan_gw
        - floor_sw
        - console_server
        - wireless
      optional:
        - lab_gw
    redundancy:
      wan: single
      core_sw: none
      console: single
    devices:
      wan_gw:
        model: "C8300-2N2S-4T2X"
        description: "WAN Edge Gateway"
        quantity: 1
      floor_sw:
        model: "C9300X-48HX-A"
        description: "48-port mGig Access Switch"
        max_stack: 6
        ports_per_switch: 48
        ap_ports_per_switch: 8
        uplink_ports: 2
A medium site has dual WAN gateways, dual core switches, optional voice gateways, and optional lab gateway pairs. The YAML captures all of that without ambiguity. An engineer reading it understands the design; a Python script consuming it knows exactly what to generate.
Physical Connectivity
This was the real unlock. We defined every physical cable connection between devices in YAML:
medium:
  connections:
    - link_id: "wan1-to-core1"
      from_role: "wan_gw"
      from_id: 1
      from_port: "ten0/0/4"
      from_media: "SFP-10Base-SR"
      to_role: "core_sw"
      to_id: 1
      to_port: "hun1/0/49"
      to_media: "CVR QSFP28 SFP25G"
      cable_type: "fiber"
      cable: "OM4 MMF"
    - link_id: "core1-to-core2-a"
      from_role: "core_sw"
      from_id: 1
      from_port: "twe1/0/3"
      from_media: "SFP-10/25GBase-CSR"
      to_role: "core_sw"
      to_id: 2
      to_port: "twe1/0/3"
      to_media: "SFP-10/25GBase-CSR"
      cable_type: "fiber"
      cable: "OM4 MMF"
Every connection specifies both ends: the role, the device instance, the exact port, the optic/media type, the cable type, and the cable specification. Optional connections use a condition field:
- link_id: "core1-to-lab1"
  from_role: "core_sw"
  from_id: 1
  from_port: "twe1/0/11"
  from_media: "SFP-10Base-SR"
  to_role: "lab_gw"
  to_id: 1
  to_port: "ten0/0/0"
  to_media: "SFP-10Base-SR"
  cable_type: "fiber"
  cable: "OM4 MMF"
  condition: "lab_enabled"
If the site doesn’t have a lab, those connections are simply skipped during generation. No manual editing required.
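Resolving those conditional links is a one-pass filter. The sketch below assumes connections are loaded as a list of dicts and that each site carries a set of boolean feature flags; both names are illustrative.

```python
# Illustrative connection records, as they would look after loading the YAML.
connections = [
    {"link_id": "core1-to-core2-a"},                           # unconditional
    {"link_id": "core1-to-lab1", "condition": "lab_enabled"},  # optional
]

def resolve_connections(conns, flags):
    """Keep a link if it has no condition, or its condition flag is true."""
    kept = []
    for c in conns:
        cond = c.get("condition")
        if cond is None or flags.get(cond, False):
            kept.append(c)
    return kept
```

A site without a lab simply sets `lab_enabled: false` (or omits it), and the lab links never appear in any downstream output.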
What This Enables
Once the topology is structured data, everything downstream becomes automatable.
Address plan generation. Feed in a site name, type, and parent supernet. The script reads the VLAN definitions and subnet allocation rules, carves out every subnet, assigns gateway addresses, and produces a complete address plan. IPv4 and IPv6, dual stack, every time, with zero manual calculation.
Patch plan generation. The connectivity YAML drives the patch plan output. For every connection, the tool generates a line item: source device, source port, cable type, destination device, destination port. Dynamic floor switch uplinks are calculated from the stack count and a base port offset defined in the data. The output is a ready-to-print patch schedule that the cabling team can follow.
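The core of that generation step is a straight mapping from connection records to printable rows. This is a sketch with an illustrative row format; the field names match the YAML shown earlier.

```python
def patch_rows(connections):
    """Render one printable patch schedule row per physical connection."""
    rows = []
    for c in connections:
        rows.append(
            f"{c['from_role']}{c['from_id']} {c['from_port']} "
            f"-[{c['cable']}]-> "
            f"{c['to_role']}{c['to_id']} {c['to_port']}"
        )
    return rows
```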
Bill of materials. The site type definition gives you the exact hardware, optic, and cable counts. Need to order for three medium sites? Multiply. Need to know how many OM4 fiber patches to stock? Count the connections where cable_type: "fiber".
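Counting optics and cables falls out of the same data. A minimal sketch, assuming one optic per link end and one cable per link:

```python
from collections import Counter

def bill_of_materials(connections):
    """Tally optics (one per link end) and cables (one per link)."""
    optics, cables = Counter(), Counter()
    for c in connections:
        optics[c["from_media"]] += 1
        optics[c["to_media"]] += 1
        cables[c["cable"]] += 1
    return optics, cables
```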
Configuration generation. With the address plan and connectivity data resolved, generating Day 1 switch and router configurations via Jinja2 templates becomes mechanical. The VLAN names, subnet assignments, interface descriptions, and port channel configurations all flow from the same source data.
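As a minimal sketch of that flow, here is an SVI rendered with Jinja2 from the same resolved data. The template text and variable names are illustrative, not the production templates.

```python
from jinja2 import Template

# Illustrative SVI template; real templates live in their own .j2 files.
SVI_TEMPLATE = Template(
    "interface Vlan{{ vlan_id }}\n"
    " description {{ name }}\n"
    " ip address {{ gateway }} {{ netmask }}\n"
)

config = SVI_TEMPLATE.render(
    vlan_id=100, name="Data",
    gateway="10.10.0.1", netmask="255.255.255.0",
)
```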
Git as the Source of Truth
Here’s where it gets powerful. When your topology definition is YAML in a Git repository:
Every change is tracked. Someone added a new VLAN? It’s in the commit history. Subnet allocation changed? There’s a diff showing exactly what moved. The “who changed what and when” question that’s impossible to answer with a shared spreadsheet becomes trivial.
Pull requests enforce review. A junior engineer proposes a change to the site type definition. A senior engineer reviews the YAML diff, catches that the proposed subnet sizing would overlap with an existing allocation, and requests changes. This happens before anything touches the network.
Branching enables parallel work. One team is working on a new site type for data centers while another is adjusting the medium site connectivity for a hardware refresh. Both work on separate branches without stepping on each other.
CI/CD validates automatically. A pre-commit hook or CI pipeline can validate that all YAML files parse correctly, that subnet allocations don’t overlap, that every port referenced in the connectivity file actually exists on the specified device model. These checks catch errors that a human reviewer might miss.
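The overlap check in particular is a few lines with the standard `ipaddress` module. A sketch of one such CI assertion, with an illustrative function name:

```python
import ipaddress

def find_overlaps(subnets):
    """Return every pair of allocated subnets that overlap (should be empty)."""
    nets = [ipaddress.ip_network(s) for s in subnets]
    overlaps = []
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                overlaps.append((str(a), str(b)))
    return overlaps
```

A pipeline step that fails when `find_overlaps` returns anything non-empty blocks the bad allocation at review time instead of at deployment time.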
Readable by Humans and Machines
One of the underappreciated qualities of YAML for topology definition is that it bridges the gap between human documentation and machine input. A network architect can open site_types.yaml, read it top to bottom, and understand the design without running any code. A Python script can yaml.safe_load() the same file and start generating outputs.
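That dual readability is literal. The snippet below stands in for `site_types.yaml` with an inline string; the same bytes a human reads are what the script parses.

```python
import yaml  # PyYAML

# Inline stand-in for site_types.yaml; the real file is loaded from disk.
doc = """
site_types:
  small:
    roles:
      mandatory: [wan_gw, floor_sw]
"""

data = yaml.safe_load(doc)
roles = data["site_types"]["small"]["roles"]["mandatory"]
```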
This is different from trying to parse a Visio diagram programmatically. It’s different from scraping values out of an Excel spreadsheet with openpyxl and hoping nobody moved a column. The YAML is the specification. The automation consumes it directly. There is no translation layer where errors creep in.
With AI and LLM-based agents becoming part of the operational toolkit, this matters even more. A structured YAML topology is something an agent can reason about, query, and generate modifications for. Ask an agent to “add a new VLAN 250 for IoT devices with a /24 in small sites and a /22 in medium sites” and it can produce the exact YAML diff. Try that with a Visio diagram.
Lessons Learned
Start with the topology, not the tools. We spent time getting the YAML schema right before writing any automation. The data model is the hardest part. Once it’s solid, the scripts are straightforward.
Keep it DRY. The VLAN definitions are defined once and referenced everywhere. The subnet allocation rules are defined per site type, not per site. Site-specific overrides exist but are the exception, not the pattern.
Version the schema. When we changed from a flat connectivity list to the current from_role/to_role structure, we bumped a schema version. Old data can be migrated; new data validates against the current schema.
Make the output diffable. The generated address plans and patch plans are deterministic. Same inputs always produce the same outputs. This means you can diff generated outputs between runs and immediately see the impact of a YAML change.
If your team is still managing network topology in spreadsheets and diagrams, consider what it would take to express that same information as structured data. The upfront investment in defining a clean YAML schema pays for itself the moment you need to deploy the second site. By the tenth site, you’ll wonder how you ever did it any other way.