Showing posts with label Cisco ACI. Show all posts

Tuesday, 28 April 2026

Leaf Node ID Swap in Cisco ACI: Risks, Precautions, and Steps

Cisco ACI Leaf Node ID Swap Steps When Leaves Are Part of vPC

Step 0 – Preconditions
Confirm maintenance window is approved. Ensure alternate connectivity or downtime is acceptable. Make sure you have console or OOB access to both leaf switches.

Step 1 – Drain Traffic and Clear Endpoints
Shut down or migrate all server-facing interfaces connected to the vPC pair.
From APIC, navigate to Fabric → Inventory → Pod → Node → Leaf → Endpoints.
Verify endpoint count is zero on both leaves.
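If you prefer to script this gate rather than eyeball the GUI, the check can be run against the APIC REST API. The sketch below assumes you have already fetched the `fvCEp` objects scoped to each leaf as JSON; the payload shape and values are illustrative only.

```python
# Sketch: verify both vPC peer leaves report zero learned endpoints before a
# node ID swap. Assumes an APIC-style class query response for fvCEp per
# leaf; the payloads below are illustrative, not real APIC output.

def endpoint_count(fvcep_response: dict) -> int:
    """Count fvCEp managed objects in an APIC-style query response."""
    return int(fvcep_response.get("totalCount", 0))

def safe_to_proceed(leaf_a: dict, leaf_b: dict) -> bool:
    """Both leaves must show zero endpoints before decommissioning."""
    return endpoint_count(leaf_a) == 0 and endpoint_count(leaf_b) == 0

# Example: leaf A is drained, leaf B still holds one stale endpoint.
leaf_a = {"totalCount": "0", "imdata": []}
leaf_b = {"totalCount": "1",
          "imdata": [{"fvCEp": {"attributes": {"mac": "AA:BB:CC:DD:EE:FF"}}}]}

print(safe_to_proceed(leaf_a, leaf_b))  # False: do not proceed yet
```

If the result is False, go back to migrating workloads or shutting interfaces until both counts reach zero.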

Step 2 – Remove vPC and Port-Channel Configuration
Delete vPC protection group policies.
Delete all vPC port-channels.
Remove interface policy associations.
Remove all static EPG bindings that reference the vPC or either leaf.
At this stage, the leaves must have no access policy dependencies.

Step 3 – Remove L3Out (If Leaves Are Border Leaves)
If the vPC pair is used for L3Out, remove both leaves from the L3Out logical node profile and logical interface profile.
Confirm external routing is stable via remaining border leaves.

Step 4 – Decommission First Leaf (Leaf A)
In APIC, go to Fabric → Inventory → Fabric Membership.
Select Leaf A and perform Decommission.
Wait until the status shows Decommissioned.
Do not power off yet.

Step 5 – Clean Leaf A
Connect to Leaf A using console or OOB.
Run acidiag touch clean and then reload the switch.
This removes old node ID, certificates, and fabric identity.

Step 6 – Decommission Second Leaf (Leaf B)
In APIC, again go to Fabric → Inventory → Fabric Membership.
Select Leaf B and perform Decommission.
Wait until the status shows Decommissioned.

Step 7 – Clean Leaf B
Connect to Leaf B using console or OOB.
Run acidiag touch clean and reload the switch.
Both leaves are now clean and discovery-ready.

Step 8 – Re-add Leaf A with New Node ID
Power on Leaf A only.
Ensure fabric uplinks to spines are connected.
From APIC Fabric Membership, approve the switch and assign the new desired node ID.
Wait until Leaf A is fully discovered and stable.

Step 9 – Re-add Leaf B with New Node ID
Power on Leaf B.
From APIC Fabric Membership, approve it and assign the other node ID.
Wait until Leaf B is fully discovered and stable.
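The one-leaf-at-a-time sequencing in Steps 8 and 9 can be captured as a tiny decision helper. This is a sketch only; the state names are made up, loosely mirroring the Fabric Membership workflow.

```python
# Sketch: enforce the "one leaf at a time" re-add order for a cleaned vPC
# pair. State names ("active", etc.) are illustrative, not APIC values.

def next_action(leaf_a_state: str, leaf_b_state: str) -> str:
    """Decide the next safe step when re-adding a cleaned vPC pair."""
    if leaf_a_state != "active":
        return "register leaf A with its new node ID and wait"
    if leaf_b_state != "active":
        return "register leaf B with its new node ID and wait"
    return "both leaves stable: safe to rebuild vPC"

print(next_action("discovering", "powered-off"))
print(next_action("active", "active"))
```

The point is simply that leaf B never enters the picture until leaf A is fully stable.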

Step 10 – Rebuild vPC Configuration
After both leaves are healthy, recreate the vPC protection group.
Recreate vPC port-channels and interface policies.
Reapply static EPG bindings to the vPC.
Do not rebuild vPC if only one leaf is active.

Step 11 – Validation
Verify fabric health is green.
Ensure no vPC, access, or infra faults exist.
Confirm port-channels are up on both leaves.

Step 12 – Restore Traffic
Enable server-facing interfaces.
Bring servers or upstream devices back online.
Verify endpoint learning and confirm no MAC flapping or faults.

Final Rule

Never attempt a live node ID swap. Always decommission, clean, and re-add both vPC peer leaves in a controlled sequence.

Precautions

Swapping node IDs between Cisco ACI leaf switches is a sensitive operation, especially when the leaves are configured as a vPC pair. Unlike traditional networks, Cisco ACI tightly binds policies, forwarding state, and infrastructure objects to node IDs, making a node ID swap a planned maintenance activity, not a live change. When vPC is involved, the risk multiplies because both leaves act as a single logical endpoint for servers and network devices.

This article explains the critical precautions you must follow when performing a Cisco ACI leaf node ID swap in a vPC environment, based on real production experience and Cisco‑accepted operational practices.

Why Node ID Swap Is Risky in vPC‑Based ACI Fabrics

In Cisco ACI, a leaf’s node ID is not just an identifier; it is embedded into multiple internal constructs such as vPC identifiers, static EPG bindings, endpoint tables, and forwarding databases. In a vPC pair, both leaves jointly provide forwarding for a single logical port‑channel. Swapping node IDs without proper preparation can cause MAC flapping, endpoint blackholing, broken port‑channels, and fabric faults.

There is no supported in‑place node ID change in Cisco ACI. The only supported method to swap node IDs is to decommission, clean, and re‑add the leaf switches with the desired node IDs.

Precaution 1: Treat the vPC Pair as a Single Failure Domain

The most important rule is to treat both vPC peers as a single unit, even though they are two physical switches. Never attempt a node ID swap on only one vPC peer while the other peer is actively forwarding traffic. ACI vPC forwarding relies on consistent node information across both leaves. Any mismatch can result in unpredictable traffic loss.

Before starting, ensure:

  • All connected servers or upstream devices are drained or shut down.
  • No single‑homed devices depend on the vPC pair.
  • Maintenance is scheduled during a proper change window.

Precaution 2: Ensure Zero Active Endpoints on Both Leaves

A node ID swap must never be performed while endpoints are active. In ACI, endpoints can be learned dynamically through traffic, and their state is tied to the leaf node ID. If endpoints remain on either vPC peer, swapping node IDs will cause immediate disruption.

From APIC, verify that both leaves show zero endpoints before proceeding. If endpoints are present, migrate workloads, shut down interfaces, or disconnect cables until endpoint learning is cleared.

Precaution 3: Remove vPC and Port‑Channel Policies Before Decommissioning

ACI does not automatically clean up vPC policies during decommissioning. All vPC‑related constructs must be removed manually. This includes:

  • vPC protection group
  • Port‑channel policies
  • Interface policy associations
  • Static EPG bindings referencing the vPC

Leaving these objects in place can block decommissioning or result in orphaned configuration that causes faults after the swap. A clean policy removal ensures that the fabric does not retain references to the old node IDs.
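The deletion order matters: objects that reference the vPC must go before the vPC itself. A small ordering sketch makes this concrete; the category names here are mine, not APIC class names.

```python
# Sketch: order vPC-related policy removals so dependent objects are
# deleted first. Category names are illustrative, not APIC classes.

REMOVAL_RANK = {
    "static-epg-binding": 0,      # references the vPC: remove first
    "interface-policy-assoc": 1,
    "vpc-port-channel": 2,
    "vpc-protection-group": 3,    # the last object standing
}

def removal_order(objects: list) -> list:
    """Sort policy objects into a safe deletion order."""
    return sorted(objects, key=REMOVAL_RANK.__getitem__)

todo = ["vpc-protection-group", "static-epg-binding",
        "vpc-port-channel", "interface-policy-assoc"]
print(removal_order(todo))
```

Working bottom-up like this avoids the "object in use" faults you hit when deleting the protection group while bindings still reference it.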

Precaution 4: If the vPC Pair Is Also a Border Leaf, Remove L3Out First

When a vPC pair is serving as a border leaf for L3Out, the risk is even higher. External routing protocols such as BGP or OSPF depend on stable leaf identities. Before any node ID swap:

  • Remove the leaves from all L3Out logical node profiles.
  • Ensure routing is fully operational on alternate border leaves.
  • Validate external reachability before continuing.

Failure to do this can result in complete north‑south traffic outages.

Precaution 5: Always Clean Both Leaves Using acidiag

After decommissioning each leaf, it is mandatory to run:

acidiag touch clean
reload

on both vPC peers. Cleaning only one switch is a common and dangerous mistake. If one leaf still retains fabric identity or certificates, the fabric may encounter node ID conflicts, discovery failures, or inconsistent vPC behavior when the switches are re‑added.

Cleaning ensures that the switch boots in a discovery‑ready state with no residual ACI identity.

Precaution 6: Re‑Add Leaves Sequentially, Not in Parallel

When re‑adding switches with swapped node IDs, never power up or approve both leaves at the same time. Always follow a controlled order:

  1. Bring up the first leaf and assign its new node ID.
  2. Wait for full fabric stability and health.
  3. Bring up the second leaf and assign its new node ID.

This approach avoids node ID collisions, partial vPC instantiation, and confusing APIC fault scenarios.

Precaution 7: Rebuild vPC Only After Both Leaves Are Fully Healthy

Do not recreate vPC configurations until both leaves are fully discovered, healthy, and visible in the fabric. Building vPC with only one peer active leads to port‑channel inconsistencies and deployment failures.

Once both leaves are stable:

  • Recreate vPC protection groups.
  • Recreate port‑channels.
  • Reapply static EPG bindings.
  • Validate that both leaves appear in all bindings.

Only after this should server ports or network devices be reconnected.

Precaution 8: Validate vPC Health Before Allowing Traffic

Before reintroducing traffic, perform strict validation:

  • No vPC‑related faults in APIC.
  • Port‑channels show operational status.
  • No access, fabric, or infra faults.
  • Leaf interfaces are up and error‑free.

Once validation is complete, gradually restore server or upstream connectivity and observe endpoint learning behavior.
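That validation gate can also be automated by filtering fault records pulled from the APIC. The records below mimic `faultInst` attributes but are purely illustrative, including the fault codes.

```python
# Sketch: hold traffic restoration until no critical/major faults remain
# in the access or infra domains. Fault records are illustrative only.

def blocking_faults(faults: list) -> list:
    """Return critical/major faults in the access or infra domains."""
    return [f for f in faults
            if f.get("domain") in {"access", "infra"}
            and f.get("severity") in {"critical", "major"}]

faults = [
    {"code": "F0001", "domain": "access", "severity": "minor"},
    {"code": "F0002", "domain": "infra", "severity": "critical"},
]
print(len(blocking_faults(faults)))  # 1: hold traffic until it clears
```

Only when this list is empty should server-facing interfaces be re-enabled.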

Common Mistakes to Avoid

The most common mistakes during node ID swap in vPC environments include attempting a live swap, forgetting to remove vPC policies, cleaning only one leaf, or restoring traffic before full validation. Each of these can result in extended outages and complex recovery procedures.

Final Takeaway

A Cisco ACI leaf node ID swap in a vPC environment is a full teardown and rebuild operation, not a minor change. Success depends on treating both leaves as a single unit, removing all dependencies, cleaning both switches, and performing a controlled re‑addition process. When executed correctly, the swap is safe and fully supported, but shortcuts almost always lead to problems.

One‑Line Summary

In Cisco ACI, swapping node IDs on vPC‑connected leaf switches requires full vPC teardown, clean decommissioning of both leaves, and a controlled rebuild to avoid traffic loss and fabric instability.

Sunday, 26 April 2026

Cisco ACI “Unknown” Leaf State Explained: Certificates, LLDP, Software, and Hardware Issues

In a Cisco ACI fabric, one of the most frustrating issues during initial fabric bring‑up, expansion, or node replacement is seeing a leaf switch stuck in an “Unknown” state. When a leaf is in an unknown state, it means the APIC cannot fully discover, authenticate, or manage the node, preventing it from joining the fabric and participating in traffic forwarding.

This issue can occur during initial fabric deployment, adding a new leaf to an existing fabric, replacing failed hardware, performing software upgrades, or moving switches between fabrics.

Understanding why a leaf enters the “Unknown” state is critical for fast recovery. In most cases, the root cause is not a single configuration mistake but a failure in communication, authentication, compatibility, or initialization.

This article explains the most common causes of the “Unknown” leaf state in Cisco ACI, why they happen, and how to systematically troubleshoot them in real‑world environments.

1. What Does “Unknown” Leaf State Mean in Cisco ACI?

When a leaf is shown as “Unknown” in the APIC GUI, it indicates that the APIC can see the node attempting discovery, but the node cannot complete secure authentication or critical control‑plane messaging has failed.

At this stage, the leaf is not operational, not programmable, and cannot forward production traffic.

2. Certificate Issues Between Leaf and APIC

Cisco ACI uses mutual certificate‑based authentication between the APIC controllers and fabric nodes. Every leaf switch must present a valid certificate chain that is signed and trusted by the APIC.

If the certificate exchange fails, the leaf cannot authenticate correctly, and APIC marks it as Unknown.

Common certificate‑related problems include an invalid or corrupted certificate on the leaf, the leaf previously belonging to another ACI fabric, expired or mismatched certificates due to time drift, or incomplete cleanup after node replacement.

These issues are often seen when hardware is reused without full re‑initialization.

The most reliable resolution is to completely wipe and reinitialize the leaf switch, ensure it boots in ACI mode, and allow APIC to generate and install a fresh certificate.

3. LLDP Mismatch or LLDP Failure

Cisco ACI relies heavily on LLDP for fabric discovery and adjacency validation. LLDP is mandatory in ACI for identifying correct topological relationships between leaf and spine switches.

If LLDP is not exchanged correctly, discovery fails and the leaf remains in an Unknown state.

Typical LLDP problems include LLDP being disabled on connected devices, LLDP filtered due to security policies, incorrect cabling such as connecting a leaf to something other than a spine, or the switch running in NX‑OS mode instead of ACI mode.

Symptoms include missing neighbor information, partial discovery, or interfaces appearing operationally down.

To resolve LLDP issues, ensure LLDP is enabled end‑to‑end, verify correct cabling from leaf to spine only, confirm the switch is running in ACI mode, and check optics and interfaces on both ends.
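The cabling check lends itself to scripting: a leaf's fabric uplinks should only ever see spines as LLDP neighbors. The neighbor table below is illustrative; in practice you would pull it from `show lldp neighbors` on the leaf or from the APIC.

```python
# Sketch: flag leaf fabric ports whose LLDP neighbor is not a known spine.
# The neighbor table and device names are illustrative.

def bad_neighbors(neighbors: dict, spines: set) -> list:
    """Return local ports whose LLDP neighbor is not a known spine."""
    return [port for port, peer in neighbors.items() if peer not in spines]

spines = {"spine-201", "spine-202"}
neighbors = {"eth1/49": "spine-201", "eth1/50": "old-nexus-core"}

print(bad_neighbors(neighbors, spines))  # ['eth1/50'] is miscabled
```

Any port returned here points at a miscabled uplink that will keep discovery from completing.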

4. Firmware or Software Incompatibility

ACI fabric components are designed to work within a compatible software matrix. Significant software mismatches between the APIC, leaf, and spine can prevent successful node onboarding.

This often occurs when a leaf is running an unsupported ACI version, the APIC has been upgraded but the leaf image was not updated, or an incorrect software image is installed on the switch.

Typical symptoms include the leaf being detected but never transitioning from Unknown to Active, along with compatibility or image‑related faults.

Resolution requires verifying Cisco’s supported version matrix and ensuring that the leaf software version is compatible with both the APIC and spine versions.

5. Hardware Problems

Physical layer issues are a common but frequently overlooked cause of Unknown leaf state. Even a simple faulty optic can completely prevent discovery.

Common hardware causes include defective or unsupported transceivers, damaged fiber or copper cables, faulty ports on the leaf or spine, or mismatched speed or media types.

Indicators include interfaces staying down, intermittent connectivity, missing LLDP information, or hardware‑related faults in APIC.

Troubleshooting involves replacing suspect cables and optics, using only Cisco‑supported transceivers, testing alternate ports, and validating interface status on both leaf and spine.

6. Time Synchronization Issues

Certificate validation in ACI is time‑sensitive. If the system time on the leaf is significantly out of sync with the APIC, certificate authentication can fail even if the configuration and connectivity are correct.

This is common in environments where NTP is misconfigured, unavailable, or the device has been powered off for an extended period.

Symptoms include authentication failures and persistent Unknown leaf state with no obvious physical or configuration issues.

Resolution involves verifying NTP configuration on APIC, ensuring the leaf can synchronize time, and reinitiating discovery after time correction.
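To illustrate the failure mode, here is a minimal skew check. The five-minute tolerance is an assumed illustrative threshold, not a documented ACI limit; the timestamps are made up.

```python
# Sketch: flag clock skew large enough to break certificate validation
# during discovery. The 5-minute tolerance is an illustrative assumption.

from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=5)

def skew_ok(apic_time: datetime, leaf_time: datetime) -> bool:
    """True when the leaf clock is within tolerance of the APIC clock."""
    return abs(apic_time - leaf_time) <= MAX_SKEW

apic = datetime(2026, 4, 26, 10, 0, 0)
leaf = datetime(2026, 4, 26, 10, 22, 0)   # 22 minutes adrift

print(skew_ok(apic, leaf))  # False: fix NTP before retrying discovery
```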

7. Incorrect Node ID or Serial Number Issues

ACI uniquely identifies nodes using a combination of node ID, serial number, and certificates. If these identifiers do not match what APIC expects, the leaf will fail authentication.

This commonly occurs when a switch was previously part of another fabric, reused after RMA without proper cleanup, or when a node ID conflict exists.

Symptoms include the leaf appearing with unexpected identity information or being rejected during registration.

The safest resolution is to fully wipe the leaf configuration, reboot the device, and allow APIC to assign a fresh node identity.

8. Recommended Troubleshooting Sequence

When a leaf is stuck in Unknown state, follow this sequence:

First, verify physical connectivity and optics.
Second, confirm LLDP adjacency and cabling.
Third, check software compatibility.
Fourth, validate certificates and authentication.
Fifth, ensure correct time synchronization.
Finally, reinitialize the leaf if needed.

Following this order avoids unnecessary configuration changes and reduces downtime.
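The sequence above can be expressed as a simple checklist runner that stops at the first failing layer, which is exactly how you want to troubleshoot: never touch certificates while a cable is still bad. The stub checks below are placeholders for real verification steps.

```python
# Sketch: run the troubleshooting checks in order and report the first
# failing layer. Each check is a callable returning True on pass; the
# lambdas here are stand-ins for real verification.

def first_failure(checks):
    """Return the name of the first failing check, or None if all pass."""
    for name, check in checks:
        if not check():
            return name
    return None

checks = [
    ("physical/optics", lambda: True),
    ("lldp adjacency", lambda: True),
    ("software compatibility", lambda: False),  # stuck at this layer
    ("certificates", lambda: True),
    ("time sync", lambda: True),
]
print(first_failure(checks))  # 'software compatibility'
```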

9. Best Practices to Prevent Unknown Leaf State

Always wipe reused hardware before deployment.
Keep APIC, spine, and leaf software versions compatible.
Use supported Cisco optics and cables.
Ensure stable NTP configuration.
Verify LLDP connectivity during installation.
Document node IDs and serial numbers carefully.

Most Unknown leaf issues are preventable with proper procedures.

10. Conclusion

An Unknown leaf state in Cisco ACI is always a symptom of a failed discovery, authentication, compatibility, or initialization process. Certificate issues, LLDP failures, firmware incompatibility, hardware problems, time synchronization issues, and incorrect node identity are the most common causes.

By understanding these root causes and following a structured troubleshooting approach, engineers can resolve Unknown leaf issues quickly and avoid prolonged deployment delays.

A clean initialization and methodical verification remain the most effective solution in Cisco ACI environments.

Cisco ACI L3Out Interview Questions Explained – Design, Implement, and Troubleshooting


Section 1: Basic Cisco ACI L3Out Interview Questions

1. What is L3Out in Cisco ACI?

L3Out (Layer‑3 Outside) is the ACI construct that provides external Layer‑3 connectivity between the ACI fabric and networks outside the fabric.


2. Why do we need L3Out?

L3Out is used to:

  • Connect ACI to external routers
  • Integrate firewalls
  • Provide north‑south traffic
  • Advertise routes between ACI and external networks

3. Is L3Out mandatory in ACI?

No. L3Out is required only if the ACI fabric needs external Layer‑3 communication.


4. Where is L3Out configured?

L3Out is configured under a Tenant, associated with a VRF, and deployed on leaf switches.


5. Is L3Out Layer‑2 or Layer‑3?

L3Out is strictly a Layer‑3 construct.


Section 2: L3Out Components Interview Questions

6. What are the main components of L3Out?

  • L3Out object
  • Logical Node Profile
  • Logical Interface Profile
  • External EPG
  • Contracts

7. What is a Logical Node Profile?

It defines which leaf nodes participate in the L3Out.


8. What is a Logical Interface Profile?

It defines:

  • Interface type (routed, SVI)
  • IP addressing
  • Encapsulation (VLAN)
  • Connectivity to external device

9. Can L3Out be deployed on multiple leaf switches?

Yes. L3Out is commonly deployed on multiple leaf switches for redundancy.


10. What happens if an L3Out leaf fails?

Traffic fails over to other L3Out‑enabled leaves, assuming proper design (ECMP / routing).


Section 3: L3Out and Routing Protocol Interview Questions

11. Which routing protocols are supported with L3Out?

  • Static routing
  • OSPF
  • BGP

12. Which routing protocol is most commonly used?

BGP, due to scalability and flexibility.


13. Is OSPF supported in L3Out?

Yes, but less commonly used in large deployments.


14. Can static routes be used in L3Out?

Yes, for simple or small environments.


15. Can L3Out support ECMP?

Yes. ACI supports ECMP for L3Out when routing protocols allow it.


Section 4: L3Out and VRF Association Questions

16. Is L3Out associated with a VRF?

Yes. Every L3Out must be associated with exactly one VRF.


17. Can one L3Out be shared across multiple VRFs?

No. One L3Out belongs to only one VRF.


18. Can multiple L3Outs exist in the same VRF?

Yes. A VRF can have multiple L3Outs.


19. Why would you create multiple L3Outs in one VRF?

  • Multiple external devices
  • Separate routing domains
  • Different security or routing policies

20. What happens if VRF association is wrong?

External routing will fail and traffic will be dropped.


Section 5: External EPG Interview Questions

21. What is an External EPG?

An External EPG represents external networks outside the ACI fabric.


22. Why is an External EPG required?

Because ACI is deny‑by‑default, and external networks must also follow ACI security policy.


23. How is traffic allowed between internal EPGs and External EPGs?

Using contracts.


24. Is External EPG similar to internal EPG?

Conceptually yes, but it represents external endpoints.


25. Can there be multiple External EPGs under one L3Out?

Yes.


Section 6: L3Out and Contracts (Very Important)

26. Is traffic allowed by default between ACI and external networks?

No. Traffic is denied by default.


27. How do you allow internal traffic to external networks?

Apply contracts between internal EPG and External EPG.


28. Can External EPG be provider or consumer?

It can be either or both, depending on traffic flow.


29. What happens if no contract is applied?

Traffic will be dropped, even though routing is correct.


30. Why do many L3Out issues occur?

Because routing works, but contracts are missing or incorrect.


Section 7: L3Out Design Interview Questions

31. Routed Interface vs SVI – what is preferred?

Routed interfaces are preferred for simplicity and scale.


32. When would you use SVI‑based L3Out?

When connecting to:

  • Traditional VLAN‑based networks
  • Legacy firewalls

33. Can L3Out connect to firewalls?

Yes, very commonly.


34. Can one firewall connect to multiple L3Outs?

Yes, depending on design.


35. Should L3Out be deployed on border leaf switches?

Yes. Deploying L3Out on dedicated border leaf switches is best practice.


Section 8: Advanced L3Out Interview Questions

36. How is route leaking handled in ACI?

Using Shared Services VRF and contracts.


37. Can L3Out be used with Shared Services VRF?

Yes, very commonly.


38. Can L3Out be stretched across sites?

  • Multi‑Pod: Yes
  • Multi‑Site: Via individual site L3Outs

39. How does L3Out behave in Multi‑Pod?

L3Out is shared across pods.


40. How does L3Out behave in Multi‑Site?

Each site has its own L3Out, orchestrated by NDO.


Section 9: L3Out and External Connectivity Troubleshooting Questions

41. Routing is correct but traffic fails – why?

Most likely contract or filter issue.


42. Endpoint can ping gateway but not internet – why?

External EPG contract missing or incorrect.


43. How to verify routes learned from L3Out?

  • APIC routes view
  • Leaf show commands
  • moquery

44. How do you verify contract programming?

Use:

show zoning-rule
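To show what you are actually scanning for in that output, here is a sketch that searches zoning-rule-style rows for the action between two class IDs (pcTags). The row shape loosely mimics the `show zoning-rule` columns, but the IDs and values are illustrative.

```python
# Sketch: scan zoning rules for the action programmed between two pcTags.
# Row shape mimics "show zoning-rule" columns; all values are illustrative.

def find_rules(rules, src, dst):
    """Return the rules matching a source and destination class ID."""
    return [r for r in rules if r["src"] == src and r["dst"] == dst]

rules = [
    {"id": 4102, "src": 16386, "dst": 32770, "action": "permit"},
    {"id": 4103, "src": 16386, "dst": 49153, "action": "deny"},
]

hits = find_rules(rules, 16386, 49153)
print([r["action"] for r in hits])  # ['deny']
```

If routing looks fine but this lookup shows no permit rule for your EPG pair, the contract is the problem.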

45. How do you verify L3Out operational status?

  • APIC Health score
  • Faults
  • Leaf CLI

Section 10: MoQuery Commands for L3Out Verification

46. Verify L3Out configuration

moquery -c l3extOut

47. Verify External EPGs

moquery -c l3extInstP

48. Verify L3Out subnets

moquery -c l3extSubnet

49. Verify VRF association

moquery -c fvCtx

50. Check faults related to L3Out

moquery -c faultInst
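Raw `faultInst` output covers the whole fabric, so in practice you filter it down to the L3Out you care about. The sketch below scopes fault records by distinguished name; the dn strings follow the usual `uni/tn-<tenant>/out-<l3out>` convention, but the tenant, L3Out, and fault records are made up.

```python
# Sketch: narrow faultInst-style records to those under a given L3Out.
# The dn prefix follows the uni/tn-<tenant>/out-<l3out> convention;
# tenant name, L3Out name, and fault records are illustrative.

def l3out_faults(faults: list, tenant: str, l3out: str) -> list:
    """Return faults whose dn sits under the given tenant's L3Out."""
    prefix = f"uni/tn-{tenant}/out-{l3out}"
    return [f for f in faults if f.get("dn", "").startswith(prefix)]

faults = [
    {"dn": "uni/tn-Prod/out-WAN-L3Out/fault-F0001", "severity": "major"},
    {"dn": "uni/tn-Prod/ap-App1/fault-F0002", "severity": "minor"},
]

print(len(l3out_faults(faults, "Prod", "WAN-L3Out")))  # 1
```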

Section 11: Common L3Out Mistakes (Interview Favorite)

51. Forgetting contracts

Most common mistake.

52. Wrong VRF association

Causes route blackholing.

53. Deploying L3Out on wrong leaf

Traffic won’t exit properly.

54. Using SVI instead of routed interface unnecessarily

Adds complexity.

55. Not planning for redundancy

Leads to single‑point failures.


Section 12: Scenario‑Based L3Out Interview Questions

56. When should you create multiple External EPGs?

When different external networks need different security policies.


57. Can multiple L3Outs advertise the same prefix?

Yes, but routing behavior must be carefully designed.


58. Can L3Out connect to non‑Cisco devices?

Yes. ACI is vendor‑agnostic at Layer‑3.


59. Can L3Out be used for Internet access?

Yes, with proper NAT/firewall integration.


60. What is the biggest design challenge in L3Out?

Balancing security, simplicity, and scalability.


Conclusion

Cisco ACI L3Out is the gateway between the ACI fabric and the external world. Interviews around L3Out focus on design understanding, security enforcement, VRF association, and troubleshooting approach, not just configuration steps.

If you understand:

  • How routing works
  • Why contracts are mandatory
  • Where L3Out should be placed
  • How to verify and troubleshoot

you will handle most Cisco ACI L3Out interview questions confidently.


✅ Interview Tip

When answering L3Out questions, always explain:

  1. Routing
  2. Security (contracts)
  3. Placement (leaf switches)
  4. Verification

Cisco ACI Interview Questions and Answers (ESG, Multi‑Site, NDO, MoQuery Explained)

Cisco Application Centric Infrastructure (ACI) is a cornerstone technology in modern enterprise data centers. As a result, Cisco ACI interview questions appear frequently in interviews for Network Engineers, Data Center Specialists, ACI Architects, and CCIE Data Center candidates.

This comprehensive guide brings together basic, intermediate, and advanced ACI interview questions, including Endpoint Security Groups (ESG), Multi‑Pod, Multi‑Site, and Nexus Dashboard / NDO, with concise, practical answers. It also includes comparison tables frequently used by interviewers to test real‑world understanding.


Section 1: Cisco ACI Fundamentals – Core Interview Questions

1. What is Cisco ACI?
Cisco ACI is a policy‑based data center networking solution that centralizes management and enforces application‑centric policies across a fabric.

2. What problem does Cisco ACI solve?
It reduces operational complexity, configuration drift, and scalability issues found in traditional networking.

3. Which switches are used in ACI?
Cisco Nexus 9000 series switches running in ACI mode.

4. What is APIC?
APIC (Application Policy Infrastructure Controller) is the centralized control and management platform for the ACI fabric.

5. Is APIC part of the data path?
No. APIC is out of the data path; traffic continues even if APIC is unavailable.


Section 2: ACI Architecture Interview Questions

6. What topology does ACI use?
Leaf–spine architecture.

7. What connects to leaf switches?
Endpoints such as servers, firewalls, load balancers, and L3Outs.

8. What is the role of spine switches?
High‑speed packet forwarding between leaf switches.

9. Can endpoints connect to spine switches?
No.

10. What happens if a spine fails?
Traffic reroutes through remaining spines without impact.


Section 3: ACI Logical Model Questions

11. What is a Tenant?
An administrative boundary representing an organization or business unit.

12. What is a VRF in ACI?
A Layer‑3 routing domain providing IP isolation.

13. What is a Bridge Domain (BD)?
A Layer‑2 forwarding domain that defines flooding and gateway behavior.

14. What is an Endpoint Group (EPG)?
A logical group of endpoints that share the same policy.

15. Is an EPG the same as a VLAN?
No. EPGs are policy objects, not VLANs.


Section 4: Traffic Flow and Contracts Interview Questions

16. What is the default traffic behavior in ACI?
Traffic between EPGs is denied by default.

17. How is traffic allowed?
Using contracts.

18. What is a contract?
A policy object that defines who talks, what traffic is allowed, and direction.

19. What are subjects?
Logical groupings of filters within a contract.

20. What is a filter?
Defines protocol, port, and direction.
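The contract → subject → filter hierarchy from questions 18–20 can be modeled in a few lines. This is a purely illustrative sketch of the relationships, not the APIC data model.

```python
# Sketch: a minimal model of the contract -> subject -> filter hierarchy.
# Illustrative only; not the APIC object model.

from dataclasses import dataclass, field

@dataclass
class Filter:
    protocol: str
    port: int

@dataclass
class Subject:
    name: str
    filters: list = field(default_factory=list)

@dataclass
class Contract:
    name: str
    subjects: list = field(default_factory=list)

    def permits(self, protocol: str, port: int) -> bool:
        """A contract permits a flow if any filter in any subject matches."""
        return any(f.protocol == protocol and f.port == port
                   for s in self.subjects for f in s.filters)

web = Contract("web", [Subject("http-https",
                               [Filter("tcp", 80), Filter("tcp", 443)])])
print(web.permits("tcp", 443), web.permits("tcp", 22))
```

The takeaway for interviews: filters describe traffic, subjects group filters, and the contract is what gets consumed and provided between EPGs.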


Section 5: Advanced Policy – vzAny and Taboo

21. What is vzAny?
A special object that represents all EPGs within a VRF.

22. Why use vzAny?
To simplify policy and reduce TCAM usage.

23. What is a Taboo Contract?
A deny contract used to explicitly block traffic.

24. Does Taboo override permit contracts?
Yes. Deny always takes precedence.

25. When should Taboo be used?
Only for specific, unavoidable deny cases.


Section 6: Endpoint Security Group (ESG) Interview Questions

26. What is an ESG?
Endpoint Security Group is a policy‑based security construct independent of topology.

27. How is ESG different from EPG?
EPG is topology‑based; ESG is security‑policy‑based.

28. Can ESG span multiple EPGs?
Yes.

29. Does ESG use contracts?
Yes, contracts are applied directly between ESGs.

30. Is ESG mandatory?
No, it is optional and mainly used for zero‑trust designs.


🔍 Comparison Table: EPG vs ESG

Feature     | EPG            | ESG
------------|----------------|------------------
Based on    | Topology       | Security policy
Dependency  | BD / VLAN      | Independent
Scope       | Limited        | Cross‑EPG
Zero‑Trust  | Basic          | Strong
Use Case    | General policy | Advanced security
Section 7: ACI Multi‑Pod Interview Questions

31. What is ACI Multi‑Pod?
A single ACI fabric stretched across multiple locations (pods).

32. Is Multi‑Pod one fabric?
Yes.

33. How many APICs manage Multi‑Pod?
One APIC cluster.

34. Are L2 and L3 stretched?
Yes.

35. What is IPN?
Inter‑Pod Network connecting pods.

36. What is the main risk of Multi‑Pod?
Increased fault domain.


Section 8: ACI Multi‑Site Interview Questions

37. What is ACI Multi‑Site?
Multiple independent ACI fabrics managed under common policy.

38. Are fabrics independent?
Yes.

39. Is Layer‑2 stretched in Multi‑Site?
No, Multi‑Site is primarily Layer‑3.

40. What is Multi‑Site mainly used for?
Disaster recovery and geo‑redundancy.


🔍 Comparison Table: Multi‑Pod vs Multi‑Site

Feature             | Multi‑Pod | Multi‑Site
--------------------|-----------|---------------
Fabric              | Single    | Multiple
APIC                | Shared    | Separate
L2 Stretch          | Yes       | No
Latency Requirement | Strict    | Relaxed
Fault Isolation     | Low       | High
Best Use Case       | Metro DC  | Geo‑redundancy

Section 9: Nexus Dashboard & NDO Interview Questions

41. What is Nexus Dashboard (ND)?
A unified platform hosting ACI‑related services like NDO, NDI, and Insights.

42. What is Nexus Dashboard Orchestrator (NDO)?
A tool used to orchestrate policies across multiple ACI sites.

43. What was NDO previously called?
MSO (Multi‑Site Orchestrator).

44. Does NDO replace APIC?
No.

45. What is a schema in NDO?
A logical template defining tenants and policies.


Section 10: Nexus Dashboard Insights (NDI)

46. What is NDI?
Nexus Dashboard Insights provides health analytics, anomaly detection, and assurance.

47. Does NDI configure the fabric?
No. It is analytics only.

48. Is NDI mandatory?
No, but highly recommended for large environments.


🔍 Comparison Table: ND vs NDO vs NDI

Component       | Purpose
----------------|----------------------
Nexus Dashboard | Platform
NDO             | Policy orchestration
NDI             | Analytics & assurance
APIC            | Fabric control

Section 11: Troubleshooting Interview Questions

49. What is a health score?
A numeric representation of object health.

50. What is a fault?
An abnormal condition detected in the fabric.

51. What is moquery?
A read‑only CLI tool to query ACI managed objects.

52. Is moquery safe in production?
Yes.

53. Why prefer moquery over GUI?
Speed and accuracy.


Section 12: Automation and Operations

54. Does ACI support automation?
Yes, via native REST APIs.

55. Can Ansible be used with ACI?
Yes.

56. What is Day‑0?
Fabric deployment.

57. What is Day‑1?
Policy configuration.

58. What is Day‑2?
Operations and troubleshooting.


Section 13: ACI Disadvantages (Interview Favorite)

59. What is the biggest challenge in ACI?
Learning curve.

60. Is ACI expensive?
Yes, compared to traditional designs.

61. Is ACI vendor locked?
Yes.

62. Can ACI be over‑engineered?
Yes, with poor design.


Conclusion

Cisco ACI interviews test more than definitions—they assess design thinking, security understanding, architecture choice, and operational awareness. A clear grasp of EPG vs ESG, Multi‑Pod vs Multi‑Site, and NDO vs NDI is critical for senior‑level roles.

If you understand why ACI behaves the way it does, not just how to configure it, you will stand out in interviews.

Cisco ACI Taboo Contract vs vzAny Contract: Complete Guide with Configuration Examples

If you've been working with Cisco ACI contracts and wondering when to use a Taboo Contract versus a vzAny Contract, you're not alone. These two policy constructs are among the most misunderstood concepts in ACI — and using the wrong one in a production environment can lead to unexpected traffic flows, TCAM exhaustion, or even serious security gaps.

In this guide, I'll break down both contract types from the ground up, explain exactly when to use each one, walk through configuration steps in APIC, and share real-world lessons from 10 years of deploying and troubleshooting Cisco ACI and Nexus environments (CCIE DC #XXXXX).

Table of Contents

  1. What are ACI Contracts?
  2. What is a Taboo Contract?
  3. What is a vzAny Contract?
  4. Taboo vs vzAny: Full Comparison Table
  5. Configuring a Taboo Contract in APIC
  6. Configuring vzAny in APIC
  7. TCAM Impact: What You Need to Know
  8. Real-World Use Cases
  9. Which One Should You Choose?
  10. Common Mistakes to Avoid
  11. FAQ

1. What are ACI Contracts?

Before diving into Taboo and vzAny, it's worth quickly grounding ourselves in ACI's policy model. In Cisco ACI, contracts are the mechanism that controls traffic between Endpoint Groups (EPGs). By default, ACI operates on an implicit-deny model — no traffic flows between EPGs unless a contract explicitly permits it.

A standard ACI contract consists of:

  • Subjects — a grouping of filters applied to a traffic flow
  • Filters — define the traffic (protocol, port, direction)
  • Provider EPG — the EPG offering the service
  • Consumer EPG — the EPG consuming the service

Taboo Contracts and vzAny Contracts are special variations of this model that serve very different purposes — one is designed to deny traffic, and the other to simplify large-scale policy across an entire VRF.
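The four components above map directly onto the APIC object model (vzFilter/vzEntry for filters, vzBrCP/vzSubj for contracts and subjects). As a hedged sketch, here is the REST payload a simple HTTPS contract corresponds to; the tenant-level names are hypothetical:

```python
import json

# Hypothetical object names for illustration; vzFilter, vzEntry, vzBrCP,
# vzSubj and vzRsSubjFiltAtt are the standard APIC policy-model classes.
https_filter = {
    "vzFilter": {
        "attributes": {"name": "allow-https-filter"},
        "children": [{
            "vzEntry": {
                "attributes": {
                    "name": "tcp-443",
                    "etherT": "ip",      # EtherType: IP
                    "prot": "tcp",       # IP protocol
                    "dFromPort": "443",  # destination port range 443-443
                    "dToPort": "443",
                }
            }
        }],
    }
}

web_contract = {
    "vzBrCP": {  # the contract itself
        "attributes": {"name": "web-contract"},
        "children": [{
            "vzSubj": {  # subject grouping the filters
                "attributes": {"name": "allow-https"},
                "children": [{
                    "vzRsSubjFiltAtt": {  # reference to the filter above
                        "attributes": {"tnVzFilterName": "allow-https-filter"}
                    }
                }],
            }
        }],
    }
}

print(json.dumps(web_contract, indent=2))
```

Note that the provider and consumer halves are not part of the contract object itself; they are created when EPGs attach the contract (the fvRsProv and fvRsCons relations in the model).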

2. What is a Taboo Contract?

A Taboo Contract is a special ACI contract type used to explicitly deny specific traffic for an EPG. Think of it as a blacklist: you use it when you want to block a particular type of traffic that would otherwise be permitted by a broader "permit-all" contract or contract preferred group.

The name "Taboo" reflects its intent: traffic matched by a Taboo Contract is forbidden. It operates at the EPG level and is typically used alongside other contracts that permit broader access.

Key characteristics of Taboo Contracts:

  • Applied directly to an EPG (not a VRF)
  • Works as a deny override — takes precedence over permit contracts
  • Requires manually creating filters for the traffic you want to block (e.g., TCP port 23 for Telnet, TCP port 80 for HTTP)
  • Generates TCAM entries per filter — the more granular your filters, the more TCAM it consumes
  • Cisco generally discourages Taboo Contracts unless absolutely necessary, because misconfiguration can cause unintended outages

Important: Taboo Contracts were designed for specific use cases — primarily to block cleartext or insecure protocols (Telnet, FTP, HTTP) when a broader permit contract is already in place. They are not a general-purpose security tool and should be used sparingly.

Typical Taboo Contract use case:

Imagine your security policy mandates that no EPG should ever communicate over Telnet (TCP 23) or unencrypted HTTP (TCP 80), even if a broader "permit-all" contract exists. Rather than modifying every single contract across your fabric, you apply a Taboo Contract directly to the EPGs in question, creating a deny entry that overrides any existing permits for those specific ports.
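The deny-override behaviour can be pictured with a toy evaluation model (this illustrates rule precedence only, not how leaf hardware actually walks TCAM):

```python
def allowed(dport: int, permits: set, denies: set) -> bool:
    """Toy model: Taboo deny entries win over any permit entry."""
    if dport in denies:       # Taboo match: forbidden, regardless of permits
        return False
    return dport in permits   # otherwise normal contract lookup

permit_all = set(range(1, 65536))  # a broad "permit-all" contract
taboo = {23, 80}                   # Taboo filters: Telnet and HTTP

print(allowed(443, permit_all, taboo))  # True: HTTPS still permitted
print(allowed(23, permit_all, taboo))   # False: Telnet blocked despite permit-all
```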

3. What is a vzAny Contract?

vzAny is a special managed object in Cisco ACI that represents all EPGs within a VRF. When you associate a contract with vzAny, that contract automatically applies to every EPG in that VRF — both as a provider and as a consumer simultaneously.

In practical terms, vzAny is ACI's way of saying "apply this policy to everything in this VRF at once." It is the most efficient way to enable intra-VRF communication or to apply a common policy across a large number of EPGs without creating individual contract relationships for every pair.

Key characteristics of vzAny:

  • Operates at the VRF level — affects all EPGs inside that VRF
  • Acts as both provider and consumer simultaneously, creating a "wildcard" relationship
  • Dramatically reduces the number of contract relationships needed in large environments
  • Optimises TCAM — instead of n² contract entries for n EPGs, vzAny creates a single group-level entry
  • Must be used carefully in multi-tenant environments to avoid unintended traffic leaks between tenants
  • Recommended for large-scale deployments where many EPGs need common policy

Pro tip: vzAny pairs well with targeted EPG-to-EPG contracts. You can allow broad intra-VRF communication via vzAny while still enforcing stricter policies between specific EPGs, because the more specific EPG contracts take precedence over the vzAny rules.

Typical vzAny use case:

In a large enterprise environment with 50+ EPGs across a VRF, you want all EPGs to communicate freely with a shared set of services (DNS, NTP, monitoring). Without vzAny, you'd need to create individual provider/consumer relationships for every EPG — that's potentially hundreds of contract associations. With vzAny, you create one contract, associate it with vzAny as the consumer, and the shared-service EPG as the provider. Done.

4. Taboo vs vzAny: Full Comparison Table

  • Primary Purpose: Taboo explicitly denies specific traffic types for an EPG; vzAny applies contracts to all EPGs within a VRF simultaneously.
  • How It Works: Taboo acts as a deny filter that overrides permit contracts for matched traffic; vzAny acts as a wildcard representing every EPG in the VRF as both provider and consumer.
  • Scope of Application: Taboo operates at the individual EPG level; vzAny operates at the entire VRF level, covering all EPGs inside it.
  • Primary Use Case: Taboo blocks insecure protocols (Telnet, FTP, HTTP) or specific ports; vzAny enables free intra-VRF communication or simplifies many-to-one service consumption.
  • Configuration Complexity: Taboo is higher, requiring manual filter creation per protocol/port to deny; vzAny is lower, as one contract association covers all EPGs automatically.
  • TCAM Usage: Taboo can be high, since each filter creates additional TCAM entries per leaf; vzAny is optimised, as a single group-level entry significantly reduces TCAM consumption.
  • Traffic Direction Impact: Taboo affects only the specific EPG it is applied to; vzAny affects all EPGs in the VRF, with bidirectional impact.
  • Multi-tenant Safety: Taboo is safe, scoped to a specific EPG; vzAny is risky if misconfigured and can cause unintended cross-tenant traffic leaks.
  • Best Practice: Use Taboo sparingly, only when a specific deny is required and cannot be achieved by restructuring contracts; vzAny is recommended for large-scale environments with shared services or open intra-VRF communication needs.
  • Cisco Recommendation: Taboo is generally discouraged unless absolutely necessary; vzAny is recommended for efficient policy management at scale.
  • Interaction with Other Contracts: Taboo overrides permit contracts, as deny takes precedence; vzAny works alongside EPG-specific contracts, which take precedence where defined.
  • Limitations: Taboo is not suitable for inter-EPG communication control at scale and increases TCAM pressure; vzAny must follow strict guidelines to avoid unintended traffic leaks across EPGs or tenants.

5. Configuring a Taboo Contract in APIC

Here's a step-by-step walkthrough of configuring a Taboo Contract to block Telnet (TCP 23) traffic from an EPG:

Step 1 — Create the Filter

  1. In APIC, navigate to Tenants → [Your Tenant] → Contracts → Filters
  2. Right-click Filters and select Create Filter
  3. Name it something descriptive: deny-telnet-filter
  4. Add a Filter Entry:
    • Name: block-tcp-23
    • EtherType: IP
    • IP Protocol: TCP
    • Destination Port From: 23
    • Destination Port To: 23
  5. Click Submit

Step 2 — Create the Taboo Contract

  1. Navigate to Tenants → [Your Tenant] → Contracts → Taboo Contracts
  2. Right-click and select Create Taboo Contract
  3. Name it: block-insecure-protocols
  4. Under Subjects, add a Subject:
    • Name: deny-telnet
    • Associate the filter: deny-telnet-filter
  5. Click Submit

Step 3 — Apply the Taboo Contract to an EPG

  1. Navigate to Tenants → [Your Tenant] → Application Profiles → [Your AP] → [Your EPG]
  2. Under the EPG, find Taboo Contracts
  3. Right-click and select Add Taboo Contract
  4. Select block-insecure-protocols
  5. Click Submit
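The same Taboo configuration can be pushed as REST payloads instead of clicking through the GUI. A hedged sketch using the names from the walkthrough (vzTaboo, vzTSubj, vzRsDenyRule and fvRsProtBy are the APIC classes involved; verify the exact DNs against your fabric):

```python
# Steps 2 and 3 expressed as APIC REST objects (POSTed under the tenant's DN).
# Assumes the filter "deny-telnet-filter" from Step 1 already exists.
taboo_contract = {
    "vzTaboo": {
        "attributes": {"name": "block-insecure-protocols"},
        "children": [{
            "vzTSubj": {
                "attributes": {"name": "deny-telnet"},
                "children": [{
                    "vzRsDenyRule": {  # deny rule referencing the filter
                        "attributes": {"tnVzFilterName": "deny-telnet-filter"}
                    }
                }],
            }
        }],
    }
}

# Step 3: attach the Taboo Contract to the EPG via an fvRsProtBy child object
epg_taboo_binding = {
    "fvRsProtBy": {"attributes": {"tnVzTabooName": "block-insecure-protocols"}}
}

print(taboo_contract["vzTaboo"]["attributes"]["name"])
```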


Caution: Always verify the Taboo Contract is working as expected in a lab or lower environment before applying it to production EPGs. Use the APIC Troubleshoot → Traffic Map tool or show zoning-rule on the leaf switches to confirm the deny rules are programmed correctly.

6. Configuring vzAny in APIC

Here's how to configure vzAny to allow all EPGs within a VRF to consume a shared DNS service:

Step 1 — Create the Contract (standard contract, not Taboo)

  1. Navigate to Tenants → [Your Tenant] → Contracts
  2. Right-click and select Create Contract
  3. Name it: shared-dns-contract
  4. Add a Subject with a filter permitting UDP port 53
  5. Click Submit

Step 2 — Set the DNS EPG as Provider

  1. Navigate to your DNS EPG
  2. Under Provided Contracts, add shared-dns-contract

Step 3 — Associate vzAny as Consumer

  1. Navigate to Tenants → [Your Tenant] → Networking → VRFs → [Your VRF]
  2. Click on the VRF and find the EPG Collection for VRF (vzAny) section
  3. Under Consumed Contracts, add shared-dns-contract
  4. Click Submit

With this configuration, every EPG inside the VRF automatically becomes a consumer of the DNS service — no individual contract relationships needed.

# REST API call to associate a contract with vzAny
{
  "vzAny": {
    "attributes": {
      "dn": "uni/tn-MyTenant/ctx-MyVRF/any",
      "prefGrMemb": "disabled"
    },
    "children": [
      {
        "vzRsAnyToCons": {
          "attributes": {
            "tnVzBrCPName": "shared-dns-contract"
          }
        }
      }
    ]
  }
}
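That vzAny association can also be built and pushed programmatically. A hedged Python sketch: the APIC address is a placeholder, and the endpoints follow the standard /api/aaaLogin.json and /api/mo/<dn>.json pattern:

```python
import json

APIC = "https://apic.example.com"   # placeholder APIC address
TENANT, VRF = "MyTenant", "MyVRF"   # names from the example above

def vzany_consumer_payload(contract_name: str) -> dict:
    """Build the vzAny consumer association for a contract."""
    return {
        "vzAny": {
            "attributes": {"dn": f"uni/tn-{TENANT}/ctx-{VRF}/any"},
            "children": [{
                "vzRsAnyToCons": {
                    "attributes": {"tnVzBrCPName": contract_name}
                }
            }],
        }
    }

def mo_url() -> str:
    # Managed objects are POSTed to /api/mo/<dn>.json after authenticating
    # against /api/aaaLogin.json (the session token returns as APIC-cookie).
    return f"{APIC}/api/mo/uni/tn-{TENANT}/ctx-{VRF}/any.json"

payload = vzany_consumer_payload("shared-dns-contract")
print(mo_url())
print(json.dumps(payload, indent=2))
# An actual push would look like:
#   requests.post(mo_url(), json=payload, cookies={"APIC-cookie": token}, verify=False)
```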

7. TCAM Impact: What You Need to Know

TCAM (Ternary Content Addressable Memory) is one of the most critical hardware resources in ACI leaf switches. Running out of TCAM space causes contract rules to fail to program, which means traffic silently drops. Understanding how Taboo and vzAny affect TCAM is essential for any production deployment.

Taboo Contract — TCAM behaviour:

  • Each filter entry in a Taboo Contract creates a separate TCAM entry per EPG per leaf where that EPG has endpoints
  • If you have granular filters (e.g., blocking 10 different ports), you get 10 TCAM entries per EPG per leaf
  • In large fabrics with many EPGs and many leaves, Taboo Contracts can consume TCAM rapidly
  • Always check leaf TCAM utilisation with: show system internal eltmc info policy brief

vzAny — TCAM behaviour:

  • vzAny uses a group-level TCAM entry instead of per-EPG entries
  • Instead of creating N entries for N EPGs in the VRF, vzAny creates a single entry representing the entire VRF class
  • This makes vzAny dramatically more TCAM-efficient in environments with many EPGs
  • The trade-off is less granular control — you cannot easily apply vzAny selectively to only some EPGs

TCAM rule of thumb: If you have more than 20 EPGs in a VRF and need a common policy, vzAny will almost always be more TCAM-efficient than individual contracts or Taboo Contracts applied per EPG.
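The rule of thumb is easy to sanity-check with back-of-the-envelope counting (illustrative only; real zoning-rule counts also depend on direction, filter entries and hardware generation):

```python
def per_epg_rules(n_epgs: int, filter_entries: int = 1) -> int:
    # Any-to-any with individual contracts: one rule per ordered EPG pair,
    # per filter entry.
    return n_epgs * (n_epgs - 1) * filter_entries

def vzany_rules(filter_entries: int = 1) -> int:
    # vzAny matches the whole VRF class, so one group-level rule per entry.
    return filter_entries

for n in (10, 20, 50):
    print(f"{n} EPGs: {per_epg_rules(n)} per-EPG rules vs {vzany_rules()} vzAny rule")
```

With 20 EPGs that is already 380 per-EPG rules against a single vzAny rule, which is why the 20-EPG threshold above is a reasonable cut-off.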

8. Real-World Use Cases

When Taboo Contract makes sense:

  • Compliance-driven blocking: Your security team mandates no Telnet or FTP anywhere, even if ops teams have existing "permit-all" contracts. A Taboo Contract enforces this without restructuring all existing policies.
  • Temporary traffic block during maintenance: You need to block a specific port to an EPG temporarily during an upgrade without removing the existing contract.
  • Protocol enforcement: Forcing all communication to be encrypted — block HTTP (80) and enforce HTTPS (443) only for specific EPGs.

When vzAny makes sense:

  • Shared services access: DNS, NTP, syslog, monitoring — services that every EPG in the VRF should be able to reach.
  • Development/test environments: Where you want all EPGs to communicate freely with each other without managing individual contracts.
  • Large-scale deployments: 30+ EPGs that all need the same baseline policy — vzAny avoids an explosion of contract relationships.
  • Migration from traditional networking: When converting a flat network to ACI and initially needing open intra-VRF communication while policy is refined.

9. Which One Should You Choose?

  • Need to block a specific port or protocol for one or a few EPGs → Taboo Contract
  • Need all EPGs in a VRF to access a shared service (DNS, NTP, monitoring) → vzAny (consumer) + standard contract
  • Want open communication between all EPGs in a VRF (dev/test) → vzAny with permit-all contract
  • Concerned about TCAM exhaustion with many EPGs → vzAny, always more TCAM-efficient at scale
  • Need to enforce a security deny that overrides existing permits → Taboo Contract
  • Multi-tenant environment needing strict isolation → avoid vzAny; use targeted contracts per EPG
  • Migrating from a flat network and need temporary open access → vzAny, with a plan to tighten policy over time

My recommendation from the field: In 10 years of ACI deployments, the most common mistake I've seen is over-relying on Taboo Contracts as a security tool. Taboo Contracts should be your last resort — a scalpel, not a hammer. If you find yourself adding many Taboo Contracts across multiple EPGs, that's usually a sign your contract design needs to be re-architected. On the other hand, vzAny is underutilised — most environments with 20+ EPGs sharing common services would benefit enormously from vzAny, yet many engineers avoid it because they don't fully understand it. Used correctly, vzAny is one of the most powerful simplification tools in the entire ACI policy model.

10. Common Mistakes to Avoid

Taboo Contract mistakes:

  • Applying Taboo to the wrong EPG direction: Remember Taboo Contracts only affect the EPG they're applied to — not the EPG on the other side of the communication.
  • Using Taboo instead of fixing contract design: If you need many Taboo Contracts, your underlying contract architecture probably needs to be redesigned.
  • Not testing in a lab first: A misconfigured Taboo Contract can block legitimate traffic and cause outages. Always validate with traffic tests before production rollout.
  • Ignoring TCAM utilisation: Adding many granular Taboo filters across many EPGs can silently exhaust TCAM and cause rule programming failures.

vzAny mistakes:

  • Using vzAny in multi-tenant environments without careful scoping: If your VRF spans multiple tenants, vzAny can unintentionally allow cross-tenant traffic. Always scope VRFs carefully before applying vzAny.
  • Applying vzAny with permit-all in production: Fine for dev/test, but dangerous in production. Always use vzAny with a specific contract that permits only required traffic.
  • Forgetting that vzAny changes apply fabric-wide: a single vzAny contract change alters the zoning rules for every EPG in the VRF at once. Test thoroughly after applying any vzAny change.

11. Frequently Asked Questions

Can I use Taboo Contract and vzAny together?

Yes. A common pattern is to use vzAny to allow broad intra-VRF communication, and then apply a Taboo Contract to specific EPGs to block particular protocols within that open policy. The Taboo deny takes precedence over the vzAny permit for the matched traffic.

Does vzAny work with the Contract Preferred Group?

Yes, but with important caveats. When Contract Preferred Group is enabled on a VRF, EPGs within the preferred group communicate freely without contracts. vzAny can still be used alongside preferred groups to apply contracts to EPGs outside the preferred group. See our post on Contract Preferred Groups in ACI for more detail.

What happens to TCAM if I apply both Taboo and vzAny?

The vzAny entry remains efficient (group-level). The Taboo Contract adds per-EPG entries only for the specific EPGs it is applied to. As long as you keep Taboo Contracts minimal and targeted, the TCAM impact remains manageable.

Can vzAny be used as a provider instead of a consumer?

Yes. vzAny can be configured as a provider, consumer, or both. When used as a provider, all EPGs in the VRF provide the contract — useful for making all EPGs accessible to a specific external consumer (like a monitoring system) without configuring each EPG individually.
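The provider direction uses the vzRsAnyToProv relation instead of vzRsAnyToCons. A minimal sketch (the monitoring contract name is hypothetical):

```python
# All EPGs in the VRF provide "monitoring-contract" to an external consumer.
vzany_provider = {
    "vzAny": {
        "attributes": {"dn": "uni/tn-MyTenant/ctx-MyVRF/any"},
        "children": [{
            "vzRsAnyToProv": {  # provider-side relation
                "attributes": {"tnVzBrCPName": "monitoring-contract"}
            }
        }],
    }
}

print(vzany_provider["vzAny"]["children"][0]["vzRsAnyToProv"]["attributes"]["tnVzBrCPName"])
```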

How do I verify Taboo Contract rules are programmed on the leaf?

SSH to the leaf switch and run:

leaf# show zoning-rule scope [vrf-vnid]

Look for entries with action "deny": these are your Taboo Contract rules. Also check:

leaf# show system internal policy-mgr stats

Summary

Key takeaways

  • Taboo Contract = EPG-level explicit deny. Use it sparingly to block specific protocols. Increases TCAM consumption per EPG.
  • vzAny = VRF-level wildcard. Use it to apply contracts to all EPGs at once. Much more TCAM-efficient at scale.
  • They can be used together — vzAny for broad policy, Taboo for specific overrides.
  • In multi-tenant environments, use vzAny with caution to avoid cross-tenant traffic leaks.
  • If you find yourself creating many Taboo Contracts, it's time to redesign your contract architecture.


Friday, 13 March 2026

VRF Enforced Mode - Cisco ACI

Understanding Cisco ACI VRF Enforced Mode (Simple & Secure Explanation)

In Cisco ACI, VRF Enforced Mode is the default and most secure policy model. It follows a strict white‑list approach, meaning EPGs cannot communicate with each other unless a contract explicitly allows that traffic. Even if two EPGs exist in the same VRF, they still need a contract for inter‑EPG communication.

Why VRF Enforced Mode Matters

  • Secure by default: All inter‑EPG traffic is denied until permitted by a contract.
  • Controlled micro-segmentation: Only required communication paths are opened.
  • Intra‑EPG traffic allowed: Endpoints inside the same EPG can talk freely unless isolation is configured.

Other Policy Modes

  • Unenforced Mode:
    All traffic inside the VRF is allowed without contracts. Used rarely—mainly for testing or troubleshooting.
  • Preferred Group:
    A flexible option where selected EPGs can communicate freely, while others still require contracts. Useful during migrations or gradual policy tightening.

How to Configure VRF Enforced Mode

Navigate to:
Tenants → [Tenant Name] → Networking → VRFs

Under Policy Control Enforcement Preference, choose:
🔒 Enforced

Once enabled, contracts become mandatory for any EPG-to-EPG communication.
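The GUI setting corresponds to the pcEnfPref attribute on the VRF (fvCtx) object. A minimal sketch of the equivalent REST payload (the VRF name is hypothetical):

```python
# "Enforced" in the GUI maps to pcEnfPref="enforced" on the fvCtx object;
# "Unenforced" would be pcEnfPref="unenforced".
vrf_enforced = {
    "fvCtx": {
        "attributes": {
            "name": "Prod-VRF",        # hypothetical VRF name
            "pcEnfPref": "enforced",   # Policy Control Enforcement Preference
        }
    }
}

print(vrf_enforced["fvCtx"]["attributes"]["pcEnfPref"])
```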

Switching from Unenforced to Enforced

If you’re tightening security from an unenforced setup, you can enable Preferred Groups to avoid breaking existing traffic flows during the transition. This allows a smooth shift while still moving toward full policy enforcement.