Cisco ACI Leaf Node ID Swap Steps When Leaves Are Part of a vPC Pair
Step 0 – Preconditions
Confirm that the maintenance window is approved and that either alternate connectivity exists or downtime is acceptable. Make sure you have console or OOB access to both leaf switches.
Step 1 – Drain Traffic and Clear Endpoints
Shut down or migrate all server-facing interfaces connected to the vPC pair.
From APIC, navigate to Fabric → Inventory → Pod → Node → Leaf → Endpoints.
Verify endpoint count is zero on both leaves.
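The zero-endpoint check can also be scripted against data exported from APIC. The following is a minimal sketch, assuming you have already retrieved the `imdata` list of an endpoint class query (for example `epmMacEp`) and that each object's DN embeds the node as `/node-<id>/`. The function names and the sample DN are illustrative, not taken from the original procedure.

```python
import re

# Assumption: leaf-local objects carry a DN such as
# "topology/pod-1/node-101/sys/...", so the node ID can be read from it.
NODE_RE = re.compile(r"/node-(\d+)/")


def endpoints_per_node(imdata, node_ids):
    """Count endpoint objects per node ID in an APIC-style 'imdata' list."""
    counts = {int(n): 0 for n in node_ids}
    for item in imdata:
        # Each imdata entry is a one-key dict: {class_name: {"attributes": {...}}}
        for obj in item.values():
            match = NODE_RE.search(obj.get("attributes", {}).get("dn", ""))
            if match and int(match.group(1)) in counts:
                counts[int(match.group(1))] += 1
    return counts


def safe_to_proceed(imdata, node_ids):
    """True only when every listed leaf reports zero endpoints."""
    return all(c == 0 for c in endpoints_per_node(imdata, node_ids).values())


# Illustrative sample: one endpoint still learned on node 101.
sample = [
    {"epmMacEp": {"attributes": {
        "dn": "topology/pod-1/node-101/sys/ctx-[vxlan-2097152]/"
              "bd-[vxlan-15826915]/db-ep/mac-00:50:56:AA:BB:CC"}}},
]
print(endpoints_per_node(sample, [101, 102]))
print(safe_to_proceed(sample, [101, 102]))
```

Treat a `False` result as a hard stop: drain or shut the remaining interfaces and run the check again before moving on.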
Step 2 – Remove vPC and Port-Channel Configuration
Delete vPC protection group policies.
Delete all vPC port-channels.
Remove interface policy associations.
Remove all static EPG bindings that reference the vPC or either leaf.
At this stage, the leaves must have no access policy dependencies.
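To confirm that no static bindings still reference the pair, you can scan the target DNs of the static path attachments. The sketch below assumes the common path forms `topology/pod-1/protpaths-101-102/pathep-[...]` for a vPC and `topology/pod-1/paths-101/pathep-[...]` for a single leaf. The helper names and the sample values are hypothetical.

```python
import re

# Assumed tDn forms:
#   vPC path:    topology/pod-1/protpaths-101-102/pathep-[policy_group_name]
#   single path: topology/pod-1/paths-101/pathep-[eth1/10]
PATH_RE = re.compile(r"/(?:prot)?paths-(\d+)(?:-(\d+))?/")


def nodes_in_path(tdn):
    """Return the set of node IDs referenced by one static path tDn."""
    match = PATH_RE.search(tdn)
    if not match:
        return set()
    return {int(g) for g in match.groups() if g is not None}


def remaining_bindings(tdns, node_ids):
    """List the tDns that still reference any of the given node IDs."""
    targets = {int(n) for n in node_ids}
    return [t for t in tdns if nodes_in_path(t) & targets]


# Illustrative sample data.
tdns = [
    "topology/pod-1/protpaths-101-102/pathep-[vpc_server01]",
    "topology/pod-1/paths-103/pathep-[eth1/5]",
    "topology/pod-1/paths-102/pathep-[eth1/10]",
]
for leftover in remaining_bindings(tdns, [101, 102]):
    print("still bound:", leftover)
```

An empty result from `remaining_bindings` is what you want to see before decommissioning.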
Step 3 – Remove L3Out (If Leaves Are Border Leaves)
If the vPC pair is used for L3Out, remove both leaves from the L3Out logical node profile and logical interface profile.
Confirm external routing is stable via remaining border leaves.
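A similar check can flag L3Outs that still attach either leaf. This sketch assumes the node attachment objects have DNs of the form `uni/tn-<tenant>/out-<l3out>/lnodep-<profile>/rsnodeL3OutAtt-[topology/pod-1/node-<id>]`; verify that layout against your own fabric. The tenant and L3Out names in the sample are made up.

```python
import re

# Assumed DN layout of an L3Out logical node profile attachment.
ATT_RE = re.compile(
    r"^uni/tn-(?P<tenant>[^/]+)/out-(?P<l3out>[^/]+)/lnodep-(?P<profile>[^/]+)"
    r"/rsnodeL3OutAtt-\[topology/pod-\d+/node-(?P<node>\d+)\]$"
)


def l3outs_using_nodes(dns, node_ids):
    """Return (tenant, l3out, node_profile, node_id) for each matching attachment."""
    targets = {int(n) for n in node_ids}
    hits = []
    for dn in dns:
        match = ATT_RE.match(dn)
        if match and int(match.group("node")) in targets:
            hits.append((match.group("tenant"), match.group("l3out"),
                         match.group("profile"), int(match.group("node"))))
    return hits


# Illustrative sample data.
dns = [
    "uni/tn-PROD/out-WAN/lnodep-border/rsnodeL3OutAtt-[topology/pod-1/node-101]",
    "uni/tn-PROD/out-WAN/lnodep-border/rsnodeL3OutAtt-[topology/pod-1/node-201]",
]
print(l3outs_using_nodes(dns, [101, 102]))
```

Any tuple returned identifies a node profile that still needs the leaf removed.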
Step 4 – Decommission First Leaf (Leaf A)
In APIC, go to Fabric → Inventory → Fabric Membership.
Select Leaf A and perform Decommission.
Wait until the status shows Decommissioned.
Do not power off yet.
Step 5 – Clean Leaf A
Connect to Leaf A using console or OOB.
Run acidiag touch clean and then reload the switch.
This removes old node ID, certificates, and fabric identity.
Step 6 – Decommission Second Leaf (Leaf B)
In APIC, again go to Fabric → Inventory → Fabric Membership.
Select Leaf B and perform Decommission.
Wait until the status shows Decommissioned.
Step 7 – Clean Leaf B
Connect to Leaf B using console or OOB.
Run acidiag touch clean and reload the switch.
Both leaves are now clean and discovery-ready.
Step 8 – Re-add Leaf A with New Node ID
Power on Leaf A only. Keep Leaf B powered off or unapproved until Leaf A is stable.
Ensure fabric uplinks to spines are connected.
From APIC Fabric Membership, approve the switch and assign the new desired node ID.
Wait until Leaf A is fully discovered and stable.
Step 9 – Re-add Leaf B with New Node ID
Power on Leaf B.
From APIC Fabric Membership, approve it and assign the remaining node ID (the one Leaf A held before the swap).
Wait until Leaf B is fully discovered and stable.
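Once both leaves are registered, it is worth confirming that the IDs really ended up exchanged. The sketch below compares a serial-to-node-ID record taken before the change with the attribute dictionaries of the fabric node objects afterward. It assumes those objects expose `serial` and `id` attributes, and the serial numbers shown are placeholders.

```python
def expected_after_swap(before):
    """Given {serial: node_id} for exactly two leaves, return the mapping with IDs exchanged."""
    if len(before) != 2:
        raise ValueError("expected exactly two leaves")
    (serial_a, id_a), (serial_b, id_b) = before.items()
    return {serial_a: int(id_b), serial_b: int(id_a)}


def swap_complete(before, fabric_nodes):
    """True when each serial now holds the node ID its peer held before."""
    expected = expected_after_swap(before)
    actual = {
        node["serial"]: int(node["id"])
        for node in fabric_nodes
        if node.get("serial") in expected
    }
    return actual == expected


# Placeholder serials; record the real ones before decommissioning.
before = {"FDO00000001": 101, "FDO00000002": 102}
after = [
    {"serial": "FDO00000001", "id": "102"},
    {"serial": "FDO00000002", "id": "101"},
]
print(swap_complete(before, after))
```

Recording the serial numbers in Step 0 makes this comparison possible, since the serial is the only identifier that survives the clean.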
Step 10 – Rebuild vPC Configuration
After both leaves are healthy, recreate the vPC protection group.
Recreate vPC port-channels and interface policies.
Reapply static EPG bindings to the vPC.
Do not rebuild vPC if only one leaf is active.
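The "both leaves active" rule can be expressed as a simple gate. This is a sketch under the assumption that each fabric node's attributes include `id` and a `fabricSt` state whose healthy value is `active`; the function name is illustrative.

```python
def vpc_rebuild_allowed(fabric_nodes, node_ids):
    """Allow the vPC rebuild only if every listed node exists and is active."""
    states = {int(node["id"]): node.get("fabricSt") for node in fabric_nodes}
    return all(states.get(int(nid)) == "active" for nid in node_ids)


# Illustrative sample: the second leaf is still being discovered.
nodes = [
    {"id": "101", "fabricSt": "active"},
    {"id": "102", "fabricSt": "discovering"},
]
print(vpc_rebuild_allowed(nodes, [101, 102]))
```

A missing node is treated the same as an inactive one, so the gate stays closed while only one leaf has been re-added.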
Step 11 – Validation
Verify fabric health is green.
Ensure no vPC, access, or infra faults exist.
Confirm port-channels are up on both leaves.
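For the fault check, a short summary by severity is often easier to read than the raw list. The sketch below assumes fault records are available as attribute dictionaries with `dn` and `severity` fields, that node-scoped faults carry `/node-<id>/` in the DN, and that cleared faults use the severity value `cleared`. The sample records are illustrative.

```python
import re
from collections import Counter

# Assumption: node-scoped fault DNs contain "/node-<id>/".
NODE_RE = re.compile(r"/node-(\d+)/")


def open_faults_by_severity(faults, node_ids):
    """Count non-cleared faults per severity for the given node IDs."""
    targets = {int(n) for n in node_ids}
    counts = Counter()
    for fault in faults:
        if fault.get("severity") == "cleared":
            continue  # already resolved, not relevant for validation
        match = NODE_RE.search(fault.get("dn", ""))
        if match and int(match.group(1)) in targets:
            counts[fault["severity"]] += 1
    return dict(counts)


# Illustrative sample data.
faults = [
    {"dn": "topology/pod-1/node-101/sys/phys-[eth1/10]/phys/fault-F0532", "severity": "major"},
    {"dn": "topology/pod-1/node-102/sys/phys-[eth1/10]/phys/fault-F0532", "severity": "cleared"},
    {"dn": "topology/pod-1/node-103/sys/phys-[eth1/1]/phys/fault-F0532", "severity": "minor"},
]
print(open_faults_by_severity(faults, [101, 102]))
```

An empty dictionary means neither leaf has open faults in the supplied data. Policy-level faults that are not scoped to a node DN would need a separate check.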
Step 12 – Restore Traffic
Enable server-facing interfaces.
Bring servers or upstream devices back online.
Verify endpoint learning and confirm no MAC flapping or faults.
Final Rule
Never attempt a live node ID swap. Always decommission, clean, and re-add both vPC peer leaves in a controlled sequence.
Precautions
Swapping node IDs between Cisco ACI leaf switches is a sensitive operation, especially when the leaves are configured as a vPC pair. Unlike traditional networks, Cisco ACI tightly binds policies, forwarding state, and infrastructure objects to node IDs, making a node ID swap a planned maintenance activity, not a live change. When vPC is involved, the risk multiplies because both leaves act as a single logical endpoint for servers and network devices.
This article explains the critical precautions you must follow when performing a Cisco ACI leaf node ID swap in a vPC environment, based on real production experience and Cisco‑accepted operational practices.
Why Node ID Swap Is Risky in vPC‑Based ACI Fabrics
In Cisco ACI, a leaf’s node ID is not just an identifier; it is embedded into multiple internal constructs such as vPC identifiers, static EPG bindings, endpoint tables, and forwarding databases. In a vPC pair, both leaves jointly provide forwarding for a single logical port‑channel. Swapping node IDs without proper preparation can cause MAC flapping, endpoint blackholing, broken port‑channels, and fabric faults.
There is no supported in‑place node ID change in Cisco ACI. The only supported method to swap node IDs is to decommission, clean, and re‑add the leaf switches with the desired node IDs.
Precaution 1: Treat the vPC Pair as a Single Failure Domain
The most important rule is to treat both vPC peers as a single unit, even though they are two physical switches. Never attempt a node ID swap on only one vPC peer while the other peer is actively forwarding traffic. ACI vPC forwarding relies on consistent node information across both leaves. Any mismatch can result in unpredictable traffic loss.
Before starting, ensure:
- All connected servers or upstream devices are drained or shut down.
- No single‑homed devices depend on the vPC pair.
- Maintenance is scheduled during a proper change window.
Precaution 2: Ensure Zero Active Endpoints on Both Leaves
A node ID swap must never be performed while endpoints are active. In ACI, endpoints can be learned dynamically through traffic, and their state is tied to the leaf node ID. If endpoints remain on either vPC peer, swapping node IDs will cause immediate disruption.
From APIC, verify that both leaves show zero endpoints before proceeding. If endpoints are present, migrate workloads, shut down interfaces, or disconnect cables until endpoint learning is cleared.
Precaution 3: Remove vPC and Port‑Channel Policies Before Decommissioning
ACI does not automatically clean up vPC policies during decommissioning. All vPC‑related constructs must be removed manually. This includes:
- vPC protection group
- Port‑channel policies
- Interface policy associations
- Static EPG bindings referencing the vPC
Leaving these objects in place can block decommissioning or result in orphaned configuration that causes faults after the swap. A clean policy removal ensures that the fabric does not retain references to the old node IDs.
Precaution 4: If the vPC Pair Is Also a Border Leaf, Remove L3Out First
When a vPC pair is serving as a border leaf for L3Out, the risk is even higher. External routing protocols such as BGP or OSPF depend on stable leaf identities. Before any node ID swap:
- Remove the leaves from all L3Out logical node profiles.
- Ensure routing is fully operational on alternate border leaves.
- Validate external reachability before continuing.
Failure to do this can result in complete north‑south traffic outages.
Precaution 5: Always Clean Both Leaves Using acidiag
After decommissioning each leaf, it is mandatory to run the following on both vPC peers:
acidiag touch clean
reload
Cleaning only one switch is a common and dangerous mistake. If one leaf still retains fabric identity or certificates, the fabric may encounter node ID conflicts, discovery failures, or inconsistent vPC behavior when the switches are re‑added.
Cleaning ensures that the switch boots in a discovery‑ready state with no residual ACI identity.
Precaution 6: Re‑Add Leaves Sequentially, Not in Parallel
When re‑adding switches with swapped node IDs, never power up or approve both leaves at the same time. Always follow a controlled order:
- Bring up the first leaf and assign its new node ID.
- Wait for full fabric stability and health.
- Bring up the second leaf and assign its new node ID.
This approach avoids node ID collisions, partial vPC instantiation, and confusing APIC fault scenarios.
Precaution 7: Rebuild vPC Only After Both Leaves Are Fully Healthy
Do not recreate vPC configurations until both leaves are fully discovered, healthy, and visible in the fabric. Building vPC with only one peer active leads to port‑channel inconsistencies and deployment failures.
Once both leaves are stable:
- Recreate vPC protection groups.
- Recreate port‑channels.
- Reapply static EPG bindings.
- Validate that both leaves appear in all bindings.
Only after this should server ports or network devices be reconnected.
Precaution 8: Validate vPC Health Before Allowing Traffic
Before reintroducing traffic, perform strict validation:
- No vPC‑related faults in APIC.
- Port‑channels show operational status.
- No access, fabric, or infra faults.
- Leaf interfaces are up and error‑free.
Once validation is complete, gradually restore server or upstream connectivity and observe endpoint learning behavior.
Common Mistakes to Avoid
The most common mistakes during node ID swap in vPC environments include attempting a live swap, forgetting to remove vPC policies, cleaning only one leaf, or restoring traffic before full validation. Each of these can result in extended outages and complex recovery procedures.
Final Takeaway
A Cisco ACI leaf node ID swap in a vPC environment is a full teardown and rebuild operation, not a minor change. Success depends on treating both leaves as a single unit, removing all dependencies, cleaning both switches, and performing a controlled re‑addition process. When executed correctly, the swap is safe and fully supported, but shortcuts almost always lead to problems.
One‑Line Summary
In Cisco ACI, swapping node IDs on vPC‑connected leaf switches requires full vPC teardown, clean decommissioning of both leaves, and a controlled rebuild to avoid traffic loss and fabric instability.