Today we will talk about protecting the CPU from broadcast / multicast traffic on the Extreme X670 platform. These considerations also apply to the X670V.
X670 family switches are built on the old Broadcom Trident+ chipset and, in comparison with their classmates – the Huawei S6700-48-EI or the Cisco Nexus 3064 – have practically no control plane protection at all.
Extreme does have a feature built into XOS – dos-protect. However, this feature will not save you in the case of a broadcast or multicast storm.
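For reference, dos-protect is enabled roughly like this (the interval and thresholds below are illustrative; verify the exact syntax on your release):
enable dos-protect
configure dos-protect interval 5
configure dos-protect type l3-protect notify-threshold 3500
configure dos-protect type l3-protect alert-threshold 4000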
The best practice to protect against BUM traffic is to use the storm-control / rate-limit flood option in XOS. Here we also get an unpleasant surprise. The X670 G1 platform runs XOS 16.x, in which the rate-limit flood functionality is implemented using tokens with an update interval of 15.625 microseconds. Without going into details, I’ll say right away – the rate-limit flood option will drop legitimate traffic due to the imperfection of this mechanism.
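For reference, the basic per-port syntax in 16.x looks like this (the 5000 pps rate is just an example; the out-actions variant used in the update at the end of this post appeared only in later 16.2.5 patches):
configure ports 1 rate-limit flood broadcast 5000
configure ports 1 rate-limit flood multicast 5000
configure ports 1 rate-limit flood unknown-destmac 5000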
You can read more on the extremenetworks.com forum.
The described methods have been successfully used as the last line of defense for the CPU on L2 aggregation switches. However, the best practice is protection at the access layer.
Theory:
So, it’s time to figure out which traffic is lifted to the CPU:
Queue 0 : Broadcast and IPv6 packets
Queue 1 : sFlow packets
Queue 2 : vMAC destined packets (VRRP MAC and ESRP MAC)
Queue 3 : L3 Miss packets (ARP request not resolved) or L2 Miss packets (Software MAC learning)
Queue 4 : Multicast traffic not hitting the hardware ipmc table (normal IP multicast packets in 224.0.0.0/4 that are neither IGMP nor PIM)
Queue 5 : ARP reply packets or packets destined for switch itself
Queue 6 : IGMP or PIM packets
Queue 7 : Packets whose TOS field is "0xc0" and Ethertype is "0x0800", or STP, EAPS, EDP, OSPF packets
When an L2 loop occurs, broadcast, IPv6 and multicast traffic will be lifted to the CPU.
To dump the traffic hitting the CPU and analyze the current situation, you can use this command:
debug packet capture on interface Broadcom count 1000
The dump will be saved in /usr/local/tmp.
You can use tftp or scp to copy the dump to a local machine for further analysis.
tftp put 11.1.1.1 vr "VR-Default" /usr/local/tmp/2020-02-10_16-54-22_rx_tx.pcap 2020-02-10_16-54-22_rx_tx.pcap
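Once copied, the capture can be opened with Wireshark or tcpdump on the workstation, for example:
tcpdump -nr 2020-02-10_16-54-22_rx_tx.pcap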
Now we can analyze the regular traffic hitting the CPU. It is time to build the protection.
Practice:
First, let’s see what happens to the CPU without any protection if we generate and flood broadcast and multicast traffic.
Lab scheme:
I’m using the pktgen Linux utility to generate the broadcast/multicast traffic, and an nmap script to generate the IPv6 RA flood.
Multicast traffic and IPv6 RA are generated on the gen1 server; broadcast is generated on the gen2 server.
The average generation rate is 3Mpps.
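For reference, a minimal pktgen sketch for the broadcast flood looks like this, assuming pktgen’s /proc interface on a recent kernel (the interface name eth1 and the kpktgend_0 worker thread are assumptions for this lab):
modprobe pktgen
echo "rem_device_all" > /proc/net/pktgen/kpktgend_0
echo "add_device eth1" > /proc/net/pktgen/kpktgend_0
echo "count 0" > /proc/net/pktgen/eth1                    # run until stopped
echo "pkt_size 64" > /proc/net/pktgen/eth1
echo "delay 0" > /proc/net/pktgen/eth1                    # no inter-packet gap
echo "dst_mac ff:ff:ff:ff:ff:ff" > /proc/net/pktgen/eth1  # broadcast frames
echo "dst 255.255.255.255" > /proc/net/pktgen/eth1
echo "start" > /proc/net/pktgen/pgctrl
The multicast flood is the same idea with a 01:00:5e:xx:xx:xx destination MAC and a 239.0.0.0/8 destination address; the RA flood comes from the nmap ipv6-ra-flood NSE script (check the script name and its interface argument on your nmap version).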
With the traffic generators started, here is the CPU usage:
sw2# sh cpu-monitoring | exclude 0.0
CPU Utilization Statistics - Monitored every 5 seconds
-----------------------------------------------------------------------
Process 5 10 30 1 5 30 1 Max Total
secs secs secs min mins mins hour User/System
util util util util util util util util CPU Usage
(%) (%) (%) (%) (%) (%) (%) (%) (secs)
-----------------------------------------------------------------------
System 52.2 50.3 50.4 47.3 48.9 26.0 13.0 69.7 0.32 964.28
mcmgr 39.9 40.4 41.6 43.0 38.6 10.5 5.2 47.1 46.47 356.98
sw2#top
Load average: 6.57 5.38 4.74 4/208 2905
PID PPID USER STAT RSS %MEM CPU %CPU COMMAND
1327 2 root RW 0 0.0 1 41.2 [bcmRX]
1543 1 root R 3852 0.3 0 40.4 ./mcmgr
sw2# debug hal show device port-info system unit 0 | include cpu
MC_PERQ_PKT(0).cpu0 : 1,131,571 +62,271 5,776/s
MC_PERQ_PKT(4).cpu0 : 187,781 +42,768 5,353/s
MC_PERQ_PKT(7).cpu0 : 503 +17
MCQ_DROP_PKT(0).cpu0 : 174,095,428 +29,103,098 3,298,296/s
MCQ_DROP_PKT(4).cpu0 : 78,082,718 +17,834,191 1,021,107/s
sw2# debug hal show congestion
Congestion information for Summit type X670-48x since last query
CPU congestion present: 12414120
Now, with about 6 Mpps of transit broadcast / multicast traffic, we have 100% CPU load and only about 10 Kpps actually reaching the CPU queues; everything else is dropped. The traffic aimed at the CPU queues is many times higher than what gets through.
Now let’s move on to the filters and see what we can do.
To protect Queue 0 (broadcast and IPv6 packets), I will create the policy limit-bc-ipv6.
This config can be used if you are NOT using L3 IPv6 features.
configure meter limit-broadcast committed-rate 1000 Pps out-actions drop
sw#show policy limit-bc-ipv6
entry match-v6 {
if match all {
ethernet-type 0x86DD ;
}
then {
deny-cpu ;
}
}
entry match-bc {
if match all {
ethernet-destination-address ff:ff:ff:ff:ff:ff ;
}
then {
meter limit-broadcast ;
}
}
This policy can be applied to all ports, or only to those where a storm is possible.
configure access-list limit-bc-ipv6 ports 1-48 ingress
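Before and after applying, the policy can be checked and its bindings confirmed (standard 16.x commands):
check policy limit-bc-ipv6
show access-list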
For a complete simulation, I will create a loop on ports 11 and 12.
sw2# sh ports 11-12,46-47 utilization
Link Utilization Averages Fri Apr 3 14:55:26 2020
Port Link Rx Peak Rx Tx Peak Tx
State pkts/sec pkts/sec pkts/sec pkts/sec
===========================================================================
11 A 7484877 8725834 7483368 8724075
12 A 7471259 8710066 7472785 8711816
46 A 2962834 3716425 7435635 8690736
47 A 2807846 3931673 7304028 8541282
Around 3,000 broadcast packets per second are lifted to the CPU; the rest is multicast.
sw2# debug hal show device port-info system unit 0 | include cpu
MC_PERQ_PKT(0).cpu0 : 2,456,584 +290,921 3,043/s
MC_PERQ_PKT(3).cpu0 : 3,104,008 +1,498,923 4,102/s
MC_PERQ_PKT(4).cpu0 : 3,125,047 +394,781 4,102/s
MC_PERQ_PKT(6).cpu0 : 1,222,571 +135,286 4,090/s
CPU utilization is still 100%:
sw2# sh cpu-monitoring
CPU Utilization Statistics - Monitored every 5 seconds
-----------------------------------------------------------------------
Process 5 10 30 1 5 30 1 Max Total
secs secs secs min mins mins hour User/System
util util util util util util util util CPU Usage
(%) (%) (%) (%) (%) (%) (%) (%) (secs)
-----------------------------------------------------------------------
System 50.3 48.9 46.9 51.4 58.1 31.7 15.8 68.8 0.32 1180.54
mcmgr 43.9 43.1 44.8 31.0 6.3 6.3 3.1 48.5 33.33 198.40
Let’s try to protect the CPU from multicast.
To do so, you need to turn off IGMP snooping on all VLANs that face the port where the storm occurs.
If IGMP snooping is enabled on the VLAN, the multicast traffic will be forwarded to the CPU.
sw2# configure igmp snooping filters per-vlan
sw2#disable igmp snooping vlan test
sw2#disable igmp snooping vlan test2
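The per-VLAN snooping state can be confirmed with:
show igmp snooping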
After disabling IGMP snooping on the looped VLANs, only ~3,000 pps of broadcast hits the CPU:
sw2# debug hal show device port-info system unit 0 | include cpu
MC_PERQ_PKT(0).cpu0 : 4,908,231 +6,352 2,994/s
MC_PERQ_PKT(7).cpu0 : 11,214 +3 1/s
CPU utilization ~ 15%.
sw2# sh cpu-monitoring | exclude 0.0
CPU Utilization Statistics - Monitored every 5 seconds
-----------------------------------------------------------------------
Process 5 10 30 1 5 30 1 Max Total
secs secs secs min mins mins hour User/System
util util util util util util util util CPU Usage
(%) (%) (%) (%) (%) (%) (%) (%) (secs)
-----------------------------------------------------------------------
System 9.3 10.5 16.5 13.7 2.7 0.4 0.2 34.1 0.14 18.96
hal 3.5 3.7 3.2 10.2 2.0 0.3 0.1 51.5 1.42 10.67
sw2#top
Mem: 577816K used, 443708K free, 0K shrd, 127032K buff, 157064K cached
CPU: 0.8% usr 16.5% sys 0.0% nic 69.9% idle 0.0% io 2.6% irq 9.9% sirq
Load average: 5.12 2.77 1.10 3/208 2015
PID PPID USER STAT RSS %MEM CPU %CPU COMMAND
1329 2 root SW 0 0.0 1 17.7 [bcmRX]
Summing up, I can say that these switches will do well as L2 aggregators or as P routers in MPLS networks. If you are planning to use the X670 as a combined L3/L2 switch, protecting the CPU will be a nightmare.
UPD(02/2021):
In ExtremeXOS 16.2.5-Patch1-22 (Apr 2020) a storm-control fix was announced:
xos0063205 Even though the traffic rate is below the configured flood rate limit, traffic is dropped.
But, as usual, it was not completely fixed.
I will configure the disable-port and log actions for broadcast and multicast with a threshold of 500 pps, and only the log action for unknown-destmac with the same threshold.
configure port 1 rate-limit flood broadcast 500 out-actions log disable-port
configure port 1 rate-limit flood multicast 500 out-actions log disable-port
configure port 1 rate-limit flood unknown-destmac 500 out-actions log
I fired up the traffic generator with a random destination MAC pattern at a 1,000 pps rate.
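For reference, the random destination MAC and the fixed rate can be set in pktgen roughly like this (flag and parameter names assume a reasonably recent kernel):
echo "flag MACDST_RND" > /proc/net/pktgen/eth1
echo "ratep 1000" > /proc/net/pktgen/eth1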
The problem is that the disable-port action is triggered when any type of traffic reaches its threshold. In our case, when the unknown-destmac rate reaches 500 pps, the port should not be turned off – it should only be logged. However, the port turns off:
sw2#sh ports 1 rate-limit flood out-of-profile refresh
Port Monitor Fri Feb 12 11:48:33 2021
Port Flood Type Status Counter
#-------- ---------------- -------------- ----------
1 Unknown Dest MAC Out of profile 109336
1 Multicast Out of profile 109336
1 Broadcast Out of profile 109336
sw2# sh log
02/12/2021 11:52:14.71 Info:vlan.msgs.portLinkStateDown Port 1 link down
02/12/2021 11:52:14.70 Info:vlan.msgs.FldRateOutActDsblPort Port 1 disabled by Flood Control Rate Limit because the traffic exceeded the configured rate.
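If the port gets disabled like this during testing, it presumably has to be re-enabled manually (an assumption; check whether your release also needs the out-of-profile state cleared):
enable ports 1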
Tested on 16.2.5.4 and 16.2.5.4-patch1-29.
Hopefully this will be fixed in future releases.