Showing posts with label ACI VPC. Show all posts

Tuesday, 28 April 2026

Leaf Node ID Swap in Cisco ACI: Risks, Precautions, and Steps

Cisco ACI Leaf Node ID Swap Steps When the Leaves Are Part of a vPC

Step 0 – Preconditions
Confirm the maintenance window is approved. Ensure alternate connectivity exists or that downtime is acceptable. Make sure you have console or OOB access to both leaf switches.

Step 1 – Drain Traffic and Clear Endpoints
Shut down or migrate all server-facing interfaces connected to the vPC pair.
From APIC, navigate to Fabric → Inventory → Pod → Node → Leaf → Endpoints.
Verify endpoint count is zero on both leaves.
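The endpoint check can also be scripted against the APIC REST API: class fvCEp lists learned endpoints, and each endpoint's fvRsCEpToPathEp child reports the fabric path it was learned on. The sketch below is illustrative only, parsing canned path strings rather than querying a live APIC; the helper name is ours:

```python
import re

def endpoints_on_nodes(endpoint_paths, node_ids):
    """Count endpoints learned on each leaf.

    endpoint_paths: list of fabric path strings as reported per endpoint
    (e.g. the tDn of its fvRsCEpToPathEp child), such as
    'topology/pod-1/paths-101/pathep-[eth1/4]' or, for a vPC,
    'topology/pod-1/protpaths-101-102/pathep-[vPC_LF101_LF102_1_1]'.
    Returns {node_id: count}; a vPC-learned endpoint counts on both peers.
    """
    counts = {n: 0 for n in node_ids}
    for path in endpoint_paths:
        m = re.search(r'(?:paths|protpaths)-([\d-]+)', path)
        if not m:
            continue
        for node in m.group(1).split('-'):
            if int(node) in counts:
                counts[int(node)] += 1
    return counts

paths = [
    "topology/pod-1/paths-101/pathep-[eth1/4]",
    "topology/pod-1/protpaths-101-102/pathep-[vPC_LF101_LF102_1_1]",
]
print(endpoints_on_nodes(paths, [101, 102]))
```

Before decommissioning, the returned counts for both vPC peers must be zero.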

Step 2 – Remove vPC and Port-Channel Configuration
Delete vPC protection group policies.
Delete all vPC port-channels.
Remove interface policy associations.
Remove all static EPG bindings that reference the vPC or either leaf.
At this stage, the leaves must have no access policy dependencies.

Step 3 – Remove L3Out (If Leaves Are Border Leaves)
If the vPC pair is used for L3Out, remove both leaves from the L3Out logical node profile and logical interface profile.
Confirm external routing is stable via remaining border leaves.

Step 4 – Decommission First Leaf (Leaf A)
In APIC, go to Fabric → Inventory → Fabric Membership.
Select Leaf A and perform Decommission.
Wait until the status shows Decommissioned.
Do not power off yet.

Step 5 – Clean Leaf A
Connect to Leaf A using console or OOB.
Run acidiag touch clean and then reload the switch.
This removes old node ID, certificates, and fabric identity.

Step 6 – Decommission Second Leaf (Leaf B)
In APIC, again go to Fabric → Inventory → Fabric Membership.
Select Leaf B and perform Decommission.
Wait until the status shows Decommissioned.

Step 7 – Clean Leaf B
Connect to Leaf B using console or OOB.
Run acidiag touch clean and reload the switch.
Both leaves are now clean and discovery-ready.

Step 8 – Re-add Leaf A with New Node ID
Power on Leaf A only.
Ensure fabric uplinks to spines are connected.
From APIC Fabric Membership, approve the switch and assign the new desired node ID.
Wait until Leaf A is fully discovered and stable.
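Registration with the new node ID can also be done via the REST API. fabricNodeIdentP posted to /api/mo/uni/controller/nodeidentpol.json is the standard APIC object for node registration, though the exact attribute set below is a minimal sketch and may vary by APIC version:

```python
def node_registration_payload(serial, node_id, name, pod_id=1):
    """Build the JSON body for
    POST /api/mo/uni/controller/nodeidentpol.json, which registers a
    discovered switch (identified by serial number) under the desired
    node ID. Minimal sketch; attributes shown are the common ones.
    """
    return {
        "fabricNodeIdentP": {
            "attributes": {
                "dn": f"uni/controller/nodeidentpol/nodep-{serial}",
                "serial": serial,
                "nodeId": str(node_id),   # the NEW node ID for this leaf
                "name": name,
                "podId": str(pod_id),
            }
        }
    }

# Example: re-adding the cleaned Leaf A under its swapped node ID
# (serial number and values here are placeholders).
print(node_registration_payload("FDO12345ABC", 102, "Leaf102"))
```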

Step 9 – Re-add Leaf B with New Node ID
Power on Leaf B.
From APIC Fabric Membership, approve it and assign the other node ID.
Wait until Leaf B is fully discovered and stable.

Step 10 – Rebuild vPC Configuration
After both leaves are healthy, recreate the vPC protection group.
Recreate vPC port-channels and interface policies.
Reapply static EPG bindings to the vPC.
Do not rebuild vPC if only one leaf is active.

Step 11 – Validation
Verify fabric health is green.
Ensure no vPC, access, or infra faults exist.
Confirm port-channels are up on both leaves.

Step 12 – Restore Traffic
Enable server-facing interfaces.
Bring servers or upstream devices back online.
Verify endpoint learning and confirm no MAC flapping or faults.

Final Rule

Never attempt a live node ID swap. Always decommission, clean, and re-add both vPC peer leaves in a controlled sequence.

Precautions

Swapping node IDs between Cisco ACI leaf switches is a sensitive operation, especially when the leaves are configured as a vPC pair. Unlike traditional networks, Cisco ACI tightly binds policies, forwarding state, and infrastructure objects to node IDs, making a node ID swap a planned maintenance activity, not a live change. When vPC is involved, the risk multiplies because both leaves act as a single logical endpoint for servers and network devices.

This article explains the critical precautions you must follow when performing a Cisco ACI leaf node ID swap in a vPC environment, based on real production experience and Cisco‑accepted operational practices.

Why Node ID Swap Is Risky in vPC‑Based ACI Fabrics

In Cisco ACI, a leaf’s node ID is not just an identifier; it is embedded into multiple internal constructs such as vPC identifiers, static EPG bindings, endpoint tables, and forwarding databases. In a vPC pair, both leaves jointly provide forwarding for a single logical port‑channel. Swapping node IDs without proper preparation can cause MAC flapping, endpoint blackholing, broken port‑channels, and fabric faults.

There is no supported in‑place node ID change in Cisco ACI. The only supported method to swap node IDs is to decommission, clean, and re‑add the leaf switches with the desired node IDs.

Precaution 1: Treat the vPC Pair as a Single Failure Domain

The most important rule is to treat both vPC peers as a single unit, even though they are two physical switches. Never attempt a node ID swap on only one vPC peer while the other peer is actively forwarding traffic. ACI vPC forwarding relies on consistent node information across both leaves. Any mismatch can result in unpredictable traffic loss.

Before starting, ensure:

  • All connected servers or upstream devices are drained or shut down.
  • No single‑homed devices depend on the vPC pair.
  • Maintenance is scheduled during a proper change window.

Precaution 2: Ensure Zero Active Endpoints on Both Leaves

A node ID swap must never be performed while endpoints are active. In ACI, endpoints can be learned dynamically through traffic, and their state is tied to the leaf node ID. If endpoints remain on either vPC peer, swapping node IDs will cause immediate disruption.

From APIC, verify that both leaves show zero endpoints before proceeding. If endpoints are present, migrate workloads, shut down interfaces, or disconnect cables until endpoint learning is cleared.

Precaution 3: Remove vPC and Port‑Channel Policies Before Decommissioning

ACI does not automatically clean up vPC policies during decommissioning. All vPC‑related constructs must be removed manually. This includes:

  • vPC protection group
  • Port‑channel policies
  • Interface policy associations
  • Static EPG bindings referencing the vPC

Leaving these objects in place can block decommissioning or result in orphaned configuration that causes faults after the swap. A clean policy removal ensures that the fabric does not retain references to the old node IDs.

Precaution 4: If the vPC Pair Is Also a Border Leaf, Remove L3Out First

When a vPC pair is serving as a border leaf for L3Out, the risk is even higher. External routing protocols such as BGP or OSPF depend on stable leaf identities. Before any node ID swap:

  • Remove the leaves from all L3Out logical node profiles.
  • Ensure routing is fully operational on alternate border leaves.
  • Validate external reachability before continuing.

Failure to do this can result in complete north‑south traffic outages.

Precaution 5: Always Clean Both Leaves Using acidiag

After decommissioning each leaf, it is mandatory to run:

acidiag touch clean
reload

on both vPC peers. Cleaning only one switch is a common and dangerous mistake. If one leaf still retains fabric identity or certificates, the fabric may encounter node ID conflicts, discovery failures, or inconsistent vPC behavior when the switches are re‑added.

Cleaning ensures that the switch boots in a discovery‑ready state with no residual ACI identity.

Precaution 6: Re‑Add Leaves Sequentially, Not in Parallel

When re‑adding switches with swapped node IDs, never power up or approve both leaves at the same time. Always follow a controlled order:

  1. Bring up the first leaf and assign its new node ID.
  2. Wait for full fabric stability and health.
  3. Bring up the second leaf and assign its new node ID.

This approach avoids node ID collisions, partial vPC instantiation, and confusing APIC fault scenarios.

Precaution 7: Rebuild vPC Only After Both Leaves Are Fully Healthy

Do not recreate vPC configurations until both leaves are fully discovered, healthy, and visible in the fabric. Building vPC with only one peer active leads to port‑channel inconsistencies and deployment failures.

Once both leaves are stable:

  • Recreate vPC protection groups.
  • Recreate port‑channels.
  • Reapply static EPG bindings.
  • Validate that both leaves appear in all bindings.

Only after this should server ports or network devices be reconnected.

Precaution 8: Validate vPC Health Before Allowing Traffic

Before reintroducing traffic, perform strict validation:

  • No vPC‑related faults in APIC.
  • Port‑channels show operational status.
  • No access, fabric, or infra faults.
  • Leaf interfaces are up and error‑free.

Once validation is complete, gradually restore server or upstream connectivity and observe endpoint learning behavior.
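A minimal sketch of automating this fault check, assuming you have already pulled faultInst records from GET /api/node/class/faultInst.json; the helper name and the severity threshold are our own choices:

```python
def blocking_faults(fault_records, severities=("critical", "major")):
    """Return the faults that should block traffic restoration.

    fault_records: list of attribute dicts taken from each record's
    faultInst -> attributes in the APIC response, e.g.
    {'severity': 'major', 'code': 'F0546', 'descr': 'Port is down'}.
    """
    return [f for f in fault_records if f.get("severity") in severities]

# Example against canned records (F0546 is the port-down fault code):
faults = [
    {"severity": "major", "code": "F0546", "descr": "Port is down"},
    {"severity": "warning", "code": "F1234", "descr": "cosmetic issue"},
]
print(blocking_faults(faults))
```

Only when the list of blocking faults is empty should server-facing interfaces be re-enabled.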

Common Mistakes to Avoid

The most common mistakes during node ID swap in vPC environments include attempting a live swap, forgetting to remove vPC policies, cleaning only one leaf, or restoring traffic before full validation. Each of these can result in extended outages and complex recovery procedures.

Final Takeaway

A Cisco ACI leaf node ID swap in a vPC environment is a full teardown and rebuild operation, not a minor change. Success depends on treating both leaves as a single unit, removing all dependencies, cleaning both switches, and performing a controlled re‑addition process. When executed correctly, the swap is safe and fully supported, but shortcuts almost always lead to problems.

One‑Line Summary

In Cisco ACI, swapping node IDs on vPC‑connected leaf switches requires full vPC teardown, clean decommissioning of both leaves, and a controlled rebuild to avoid traffic loss and fabric instability.

Tuesday, 5 August 2025

Concept of vPC in ACI


In Cisco ACI, a Virtual Port Channel (vPC) enables two separate leaf switches to present a unified port channel to a connected endpoint—such as a server, firewall, or another switch that supports link aggregation protocols like LACP.

In this setup, two ACI leaf nodes (e.g., Leaf201 and Leaf202) act as vPC peers, forming a logical construct known as a vPC domain. One of these peers is elected as the primary, while the other assumes the secondary role.




ACI’s MCT-Based Architecture

Unlike traditional vPC implementations that rely on a dedicated peer-link, ACI leverages the fabric itself to manage synchronization and control-plane communication. This architecture is referred to as Multichassis EtherChannel Trunk (MCT).

🔧 Key Characteristics:

  • No physical peer-link is required between Leaf201 and Leaf202.
  • Instead, the ACI fabric handles all peer communication and synchronization.
  • ZMQ (Zero Message Queue) replaces traditional CFS (Cisco Fabric Services) for messaging between vPC peers.

How Peer Communication Works in ACI

  • ZMQ, a high-performance messaging library using TCP, is embedded as libzmq on each switch.
  • Applications that require peer communication (like the vPC manager) use this library to exchange messages.

🔄 Peer Reachability Mechanism:

  • The vPC manager subscribes to routing updates via URIB.
  • When IS-IS discovers a route to the peer (e.g., Leaf202 sees Leaf201), URIB notifies the vPC manager.
  • The manager then attempts to establish a ZMQ socket with the peer.
  • If the route is withdrawn (e.g., due to link failure), the vPC manager is notified and the MCT link is brought down accordingly.
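The reachability logic above can be modeled as a small state machine. This toy model only mimics the described control flow (route present brings the MCT adjacency up, route withdrawn tears it down); it does not speak real ZMQ or IS-IS:

```python
class VpcPeerLinkModel:
    """Toy model of the vPC manager reacting to URIB notifications."""

    def __init__(self, peer_tep):
        self.peer_tep = peer_tep   # peer leaf's fabric TEP address
        self.mct_up = False
        self.events = []

    def urib_notify(self, route_present):
        """Called when URIB reports the route to the peer TEP
        appearing (True) or being withdrawn (False)."""
        if route_present and not self.mct_up:
            self.mct_up = True     # establish the ZMQ session to the peer
            self.events.append(f"MCT up to {self.peer_tep}")
        elif not route_present and self.mct_up:
            self.mct_up = False    # tear the peer session down
            self.events.append(f"MCT down to {self.peer_tep}")

# Example: peer route learned via IS-IS, then withdrawn on link failure.
mgr = VpcPeerLinkModel("10.0.88.65")
mgr.urib_notify(True)
mgr.urib_notify(False)
print(mgr.events)
```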

Upgrade Best Practices with vPC

To ensure high availability during fabric upgrades, it's recommended to divide switches into at least two upgrade groups. For example:

  • Group A: Leaf201, Leaf203, Spine101
  • Group B: Leaf202, Leaf204, Spine102

This strategy ensures that at least one vPC peer remains active during the upgrade, preventing service disruption for connected endpoints.
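The grouping rule is simple to express in code: the two members of every vPC pair must never land in the same upgrade group. A minimal sketch (the function name is ours):

```python
def split_upgrade_groups(vpc_pairs, standalone=()):
    """Place the two members of each vPC pair in different upgrade
    groups, then spread any remaining (non-vPC) switches round-robin."""
    group_a, group_b = [], []
    for first, second in vpc_pairs:
        group_a.append(first)
        group_b.append(second)
    for i, node in enumerate(standalone):
        (group_a if i % 2 == 0 else group_b).append(node)
    return group_a, group_b

# The example groups from the text:
a, b = split_upgrade_groups(
    [("Leaf201", "Leaf202"), ("Leaf203", "Leaf204")],
    ["Spine101", "Spine102"],
)
print(a)  # Group A
print(b)  # Group B
```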


Glossary

ACI – Application Centric Infrastructure
vPC – Virtual Port Channel
MCT – Multichassis EtherChannel Trunk
ZMQ – Zero Message Queue
URIB – Unicast Routing Information Base
IS-IS – Intermediate System to Intermediate System
LACP – Link Aggregation Control Protocol


vPC Design Options

Option 1 – vPC using the SAME leaf interfaces on both leaves, with a combined (shared) profile.

Option 2 – vPC using the SAME leaf interfaces on both leaves, with individual profiles per leaf.

Option 3 – vPC using DIFFERENT leaf interfaces on the two leaves, with individual profiles per leaf.





Monday, 4 August 2025

Complete Steps to Create vPC in Cisco ACI (via APIC GUI)

 Understanding vPC in Cisco ACI: A Modern Approach to High Availability

In the evolving landscape of data center networking, Virtual Port Channel (vPC) stands out as a cornerstone of high availability and link redundancy. While traditional NX-OS environments rely on CLI-driven configurations, Cisco ACI reimagines vPC through a policy-driven, intent-based model that aligns with the fabric’s overarching design philosophy.

Unlike legacy setups, ACI abstracts physical connectivity into logical constructs, allowing administrators to define vPC behavior through interface policy groups, switch profiles, and attachable access entity profiles (AAEPs). This not only simplifies deployment but also ensures consistency across the fabric.

At its core, a vPC in ACI enables two leaf switches to present a unified uplink to a downstream device—be it a server, firewall, or load balancer—without relying on spanning tree protocols. The result is active-active forwarding, improved bandwidth utilization, and seamless failover.

In this guide, we’ll walk through the step-by-step configuration of vPC in Cisco ACI, demystifying each component and highlighting best practices to ensure a robust and scalable deployment.

Note: In Cisco ACI, a Fabric Extender (FEX) can be integrated using a port channel in a straight-through topology, where each FEX connects directly to a single leaf switch. While vPCs can be established between hosts and the FEX for redundancy and load balancing, the FEX itself does not support vPC connectivity to multiple leaf switches.

 Complete Steps to Create vPC in Cisco ACI (via APIC GUI)

Step 1: Leaf Onboarding (One-by-One)

🔍 Monitor Discovery in APIC

  1. Log in to the APIC GUI
  2. Navigate to:
    Fabric → Inventory → Fabric Membership → Nodes Pending Registration
  3. Wait for Leaf101 to appear
    • You’ll see its Serial Number
    • Node Role: Leaf
    • Status: Blank / Not Registered

📝 Register Leaf101

  1. Right-click on Leaf101’s serial number
  2. Click Register
  3. In the registration window, enter:
    • Node ID: 101
    • Node Name: Leaf101
    • Click Register
    • Wait for it to appear in Registered Nodes

📝 Register Leaf102

  1. Repeat the same steps for Leaf102:
    • Wait for it to appear in Nodes Pending Registration
    • Right-click → Register
    • Enter:
      • Node ID: 102
      • Node Name: Leaf102
    • Click Register
    • Wait for it to appear in Registered Nodes

🔢 Step-by-Step ACI Configuration Flow

2. VLAN Pool (VLAN 113)

  • Navigate to:
    Fabric → Access Policies → Pools → Right-click on VLAN and click Create VLAN Pool
  • Create VLAN Pool:
    • Name: VLAN_113_Pool
    • Allocation Mode: Static
    • Click + under Encap Blocks:
      • Range: 113 – 113
      • Allocation mode: Static
    • Click OK → Submit
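The same pool can be created programmatically. fvnsVlanInstP and fvnsEncapBlk are the actual APIC classes behind this GUI step; the helper below is an illustrative sketch of the POST body:

```python
def vlan_pool_payload(name, vlan_start, vlan_end):
    """Body for POST /api/mo/uni/infra.json creating a static VLAN
    pool (class fvnsVlanInstP) with one encap block (fvnsEncapBlk)."""
    dn = f"uni/infra/vlanns-[{name}]-static"
    return {
        "fvnsVlanInstP": {
            "attributes": {"dn": dn, "name": name, "allocMode": "static"},
            "children": [
                {"fvnsEncapBlk": {"attributes": {
                    "from": f"vlan-{vlan_start}",
                    "to": f"vlan-{vlan_end}",
                    "allocMode": "static",
                }}}
            ],
        }
    }

# The pool from this step: VLAN 113 only.
print(vlan_pool_payload("VLAN_113_Pool", 113, 113))
```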

3. Domain (Physical Domain)

  • Go to:
    Fabric → Access Policies → Physical and External Domains → Right-click on Physical Domains → Click Create Physical Domain
  • Create Physical Domain:
    • Name: PhysDom_VLAN113
    • VLAN Pool: VLAN_113_Pool
    • Click Submit

4. AEP (Attachable Access Entity Profile)

  • Navigate to:
    Fabric → Access Policies → Policies → Global → Right-click on Attachable Access Entity Profiles → Click Create Attachable Access Entity Profile
  • Create AEP:
    • Name: AEP_VLAN113
    • Click + under Domains and Associated Domain: PhysDom_VLAN113
    • Click Update ->Next -> Finish

5. Interface Policy Group (vPC)

  • Go to:
    Fabric → Access Policies → Interfaces → Leaf Interfaces → Policy Groups → Right-click on VPC Interface → Click Create VPC Interface Policy Group
  • Create VPC Interface Policy Group:
    • Name:  vPC_LF101_LF102_1_1
    • AEP: AEP_VLAN113
    • Port Channel Policy: system-lacp-Active
    • Link Level Policy: system-link-level-XG-Auto
  • Click Next > Finish

6. Create vPC Explicit Protection Group

  • Go to:
    Fabric → Access Policies → Policies → Switch
  • Right-click on Virtual Port Channel default and click Create VPC Explicit Protection Group:
    • Name: VPC_101_102
    • ID: 10
    • VPC Domain Policy: default
    • Switch 1: Leaf101
    • Switch 2: Leaf102
  • Click Submit

This step defines the vPC pairing at the switch policy level.

7. Interface Profile

  • Navigate to:
    Fabric → Access Policies → Interfaces → Leaf Interfaces → Profiles
  • Right-click on Profiles and click Create Leaf Interface Profile:
    • Name: IntProf_Leaf101_102
  • Click + under Interface Selector:
    • Name: Eth1_4
    • Interface ID: 1/4
    • Policy Group: vPC_LF101_LF102_1_1
  • Click Ok - > Submit

 

8. Switch Profile

  • Go to:
    Fabric → Access Policies → Switches → Leaf Switches → Profiles
  • Right-click on Profiles and click Create Leaf Profile:
    • Name: LeafProf_101_102
  • Click + under Leaf Selector:
    • Name: Leaf101_102
    • Node Block: From 101 to 102
  • Click Update -> Next
  • Attach Interface Profile:
    • IntProf_Leaf101_102
  • Click Finish

9. Create Tenant

  • Navigate to:
    Tenants
  • Click Add Tenant
    • Name: Tenant_WebApp
  • Click Submit

10. Create VRF

  • Navigate to:
    Tenants
  • Click Networking -> VRF -> Right click on VRF -> click Create VRF
    • Name: WebApp_VRF
    • Uncheck Create A Bridge Domain
  • Click Finish

11. Create Bridge Domain (BD)

  • Navigate to:
    Tenants
  • Click Networking -> Bridge Domain -> Right click on Bridge Domain-> click Create Bridge Domain
    • Name: WebApp_BD
    • VRF: WebApp_VRF
    • Click Next
    • Click + under Subnets:
      • Gateway IP: 10.1.1.1/24
      • Check “Make this IP address Primary”
      • Scope: check “Advertised Externally”
    • Click OK → Next → Finish

 

12. Create Application Profile (AP)

  • Inside Tenant_WebApp, go to:
    Application Profiles
  • Right Click on Application Profile  and Create Application Profile:
    • Name: WebApp_AP
  • Click Submit

13. Create Endpoint Group (EPG)

  • Inside WebApp_AP, go to:
    EPGs
  • Right Click on Application EPG and click Create Application EPG:
    • Name: WebApp_EPG
    • Bridge Domain: WebApp_BD
    • Click Finish
  • Right Click on WebApp_EPG and click ADD Physical Domain Association:
    • Domain Association: PhysDom_VLAN113
    • Click Submit

14. Create Contract (Allow TCP Port 80)

  • Go to:
    Tenant_WebApp → Contracts -> Standard
  • Right Click on Standard -> Click Create Contract:
    • Name: Allow_HTTP
  • Click + under Subject:
    • Name: HTTP_Subject
    • Filter: Click + under Filter
      • Click + Under Name
      • Name: HTTP_Filter
      • Click + under Entries
      • Name: HTTP_Entry
      • EtherType: IP
      • IP Protocol: TCP
      • Destination Port: From http – To http
      • Click Update-> Submit
      • Click Update -> Ok -> Submit
  • Provide contract to/from EPG as needed

15. Static Binding of EPG to Port

  • Go to Tenant_WebApp → Application Profiles → WebApp_AP → Application EPGs → WebApp_EPG
  • Right-click on WebApp_EPG → Click Deploy Static EPG on PC, VPC, or Interface
    • Path Type: Virtual Port Channel
    • Path: the vPC policy group spanning Leaf101 and Leaf102 (vPC_LF101_LF102_1_1), covering eth1/4 on both leaves
    • Mode: Trunk
    • Encapsulation: vlan-113
  • Click Submit
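For automation, this GUI step corresponds to posting an fvRsPathAtt object to the EPG. The key detail is that a vPC path is addressed by the node pair ("protpaths-101-102") plus the vPC policy-group name, not the individual member interfaces. The helper below is a sketch:

```python
def vpc_static_binding(tenant, ap, epg, pod, node_pair, ipg, vlan):
    """Return (url, body) for deploying a static vPC binding on an EPG
    via REST. Sketch only; assumes the default 'regular' (trunk) mode.
    """
    url = f"/api/mo/uni/tn-{tenant}/ap-{ap}/epg-{epg}.json"
    tdn = (f"topology/pod-{pod}/protpaths-{node_pair[0]}-{node_pair[1]}"
           f"/pathep-[{ipg}]")
    body = {"fvRsPathAtt": {"attributes": {
        "tDn": tdn,
        "encap": f"vlan-{vlan}",
        "mode": "regular",   # 'regular' = trunk in the object model
    }}}
    return url, body

# The binding from this step:
url, body = vpc_static_binding("Tenant_WebApp", "WebApp_AP", "WebApp_EPG",
                               1, (101, 102), "vPC_LF101_LF102_1_1", 113)
print(url)
print(body)
```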

 

Sunday, 3 August 2025

Forward Error Correction (FEC) in Cisco ACI

In Cisco ACI, Forward Error Correction (FEC) is a mechanism used to improve the reliability of high-speed data transmission across physical links, especially in environments using 25G, 40G, 100G, or 400G interfaces.

🔍 What Is Forward Error Correction?

FEC is a technique where the sender adds redundant data (parity bits) to each transmission. If some bits are corrupted during transit, the receiver can detect and correct those errors without needing a retransmission. Think of it like sending a puzzle with extra pieces so the receiver can still complete it even if a few pieces go missing.
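To make the parity-bit idea concrete, here is a classic Hamming(7,4) code in Python: 4 data bits gain 3 parity bits, and the receiver can locate and flip any single corrupted bit without a retransmission. (Real FEC modes such as RS-FEC use far stronger Reed-Solomon codes; this sketch only illustrates the principle.)

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (Hamming(7,4)).
    Layout: [p1, p2, d1, p3, d2, d3, d4] (parity bits at positions 1,2,4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(code):
    """Detect and fix a single flipped bit, as an FEC receiver would,
    then return the recovered 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]]

# A bit gets corrupted in transit; the receiver still recovers the data.
codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1
print(hamming74_correct(corrupted))
```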

🧠 How FEC Works in Cisco ACI

In ACI, FEC is negotiated between switches and endpoints during auto-negotiation. The devices advertise their supported FEC modes and agree on the best one. Common FEC modes include:

  • FC-FEC (Firecode FEC): Used for 25G links.
  • RS-FEC (Reed-Solomon FEC): Used for 25G, 100G, and 400G links.
  • CL91-RS-FEC and IEEE-RS-FEC: Advanced versions for higher speeds.
  • AUTO-FEC: Automatically selects the best FEC mode based on link capabilities.

⚙️ Why It Matters

FEC is especially important in Cisco ACI because:

  • High-speed links (like 25G or 100G) are more prone to bit errors.
  • Breakout ports (e.g., 4x25G from a 100G port) often require FEC to maintain link stability.
  • Copper DAC cables used in short-distance connections rely on FEC to compensate for signal degradation.

Use Cases

  • Ensuring error-free transmission over high-speed links.
  • Supporting auto-negotiation on breakout ports.
  • Enhancing link reliability without increasing latency or requiring retransmissions.

 

Symmetric hashing in Cisco ACI

 

🔄 Symmetric Hashing in Cisco ACI: A Traffic Balancing Philosophy

Imagine a highway with multiple lanes, and cars (data packets) trying to reach their destination. Normally, each car chooses a lane based on its starting point and destination. But what if the return journey picks a different lane? That’s what happens with asymmetric hashing — the forward and reverse paths of a data flow may travel through different physical links.

In Cisco ACI, symmetric hashing is like a rule that says: “If you go out through lane 3, you must come back through lane 3.” It ensures that both directions of a traffic flow — from source to destination and back — follow the same physical path within a port channel.

This matters a lot when you're dealing with devices like firewalls, load balancers, or any system that tracks sessions. If traffic enters through one link and exits through another, it can confuse these devices, leading to dropped packets or broken connections.
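The lane rule can be illustrated with a toy hash function: by sorting the flow's two endpoints before hashing, the forward and reverse directions always produce the same key and therefore select the same member link. This is only an illustration of the principle, not ACI's actual hardware hash:

```python
import hashlib

def symmetric_link_choice(src_ip, dst_ip, src_port, dst_port, n_links):
    """Pick a port-channel member link so both directions of a flow
    use the same link: sort the (ip, port) endpoints before hashing,
    making (A -> B) and (B -> A) hash identically."""
    lo, hi = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    # A stable digest (not Python's per-process salted hash()) keeps
    # the choice deterministic, as a switch ASIC's hash would be.
    digest = hashlib.sha256(repr((lo, hi)).encode()).digest()
    return digest[0] % n_links

# The forward and return legs of an HTTP flow land on the same link.
fwd = symmetric_link_choice("10.1.1.10", "10.1.1.20", 40000, 80, 4)
rev = symmetric_link_choice("10.1.1.20", "10.1.1.10", 80, 40000, 4)
print(fwd, rev)
```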


Symmetric hashing is not supported on the following switches:
  • Cisco Nexus 93128TX
  • Cisco Nexus 9372PX
  • Cisco Nexus 9372PX-E
  • Cisco Nexus 9372TX
  • Cisco Nexus 9372TX-E
  • Cisco Nexus 9396PX
  • Cisco Nexus 9396TX

🧠 Why Cisco ACI Made It Optional

Cisco ACI’s default behavior is asymmetric — it spreads traffic across links based on a hash of various packet fields (IP, MAC, ports). This works well for general load balancing. But when precision and consistency are needed, ACI gives you the option to enable symmetric hashing in the port-channel policy.

Once enabled, you can choose the hashing algorithm — like using only IP addresses or including Layer 4 ports — to fine-tune how traffic is distributed.

Use Cases That Benefit

  • Firewall clusters that expect consistent ingress/egress paths.
  • Load balancers that rely on session stickiness.
  • Troubleshooting scenarios where symmetric paths simplify packet tracing.