802.1d – STP
Last updated
Last updated
BPDUs are sent as 802.3 frames with the source address set to the MAC address of the sending switch port and with a destination address of 01:80:C2:00:00:00. The LLC header has the DSAP and SSAP fields set to 0x42.
Protocol ID is set to 0x0000 (IEEE 802.1d)
The version can be one of 0x00 (802.1d STP), 0x02 (802.1w – RSTP) and 0x03 (802.1s – MSTP).
The Type field indicates wether this BPDU is a Config BPDU(0x00) or a TCN – Topology Change Notification BPDU (0x80).
Bit 1 in the Flags field signals a TCN BPDU, while Bit 8 in the same field signals a TC ACK.
Timer fields like Message Age, Max Age, Hello Time and Forward Delay are stored in units of 1/256 sec.
To find out the Base Mac Address used by STP, you can use the following command:
Also, to see Spanning Tree parameters used for each vlan, you can use:
This method is used by default if the switch can support 1024 unique MAC addresses for its own use. STP uses one MAC address for each instance and since the MAC address is unique, the Bridge ID becomes unique.
Bridge Priority: 0 – 65535. Default: 32768
This method is used if the switch can’t use 1024 unique MAC addresses for itself. STP uses one MAC address for all instances, but uses the VLAN-ID to make the Bridge ID unique for each VLAN.
The Bridge ID is split into Priority Multiplier (4 bits) and VLAN-ID (12 bits). This means that the Bridge Priority is equal to 4096k + VLAN_ID. (0<=k<=15). Default BridgeID = 32768 + VLAN_ID, K=8. The use of the Extended System ID can be disabled, using:
The Bridge ID can be set manually or by using a macro:
The macro command will use the following algorithm to determine what BRIDGE-PRIORITY to assign the switch:
For primary:
If the current Root Bridge Priority is more than 24576 (k=6), then set the BRIDGE-PRIORITY to 24756
If the current Root Bridge Priority is less than or equal to 24576 then set the BRIDGE-PRIORITY to a value 4096 less than the current Root Bridge Priority
If the BRIDGE-PRIORITY should be set to less than 4096, then the macro command will fail and the priority should be manually set
For secondary:
Sets the BRIDGE-PRIORITY to 28762 (k=7)
Using the root macro will overwrite any custom STP timers that were configured
In early implementations, a linear scale is used for the cost of a link:
The disadvantage of using such a formula is that links with bandwidth higher than 1 Gbps (1000 Mbps) were considered of equal cost (1). Nowadays, a non-linear scale is used, according to the following table:
Bandwidth | STP Cost |
---|---|
4 Mbps | 250 |
10 Mbps | 100 |
16 Mbps | 62 |
45 Mbps | 39 |
100 Mbps | 19 |
155 Mbps | 14 |
622 Mbps | 6 |
1 Gbps | 4 |
10 Gbps | 2 |
To modify the default port cost, use:
To verify, use:
Setting the port cost is a method of influencing which ports are in FWD and which are in BLK state on a switch. Setting the cost on the downstream switch will influence the port election on that switch.
The Port ID is a combination of Port Priority and Port Number. The Port Priority can be set as multiples of 16 in the range 0-255. Default value is 128. The Port Number is also in the range 0-255. You cannot change the port Number, but you can change its priority, using:
To verify, use:
To see the Upstream port priority, look for ‘designated port id X.X’ when using:
A lower port priority is preferred. This method is used to influence the election of a forwarding port on a downstream switch, by setting a lower priority for a port on the upstream switch.
The best BPDU is chosen using the following algorithm:
Lowest Root Bridge ID
Lowest Path Cost to the Root Bridge
Lowest Sender Bridge ID
Lowest Sender Port ID
Lowest Receiver Port ID – not part of the BPDU, but evaluated locally
The BPDU mechanism works as follows:
When starting up, all switches think they are the Root Bridge. In 802.1d, as long as a switch thinks it is the Root Bridge, it will generate Config BPDUs every HelloTime and will send them out all (designated) ports, towards the other switches
If a switch receives a BPDU with a lower ID in the Root Bridge field, it will not consider itself anymore as the Root Bridge and will stop sending config BPDUs with itself as Root. It will only send out the Designated ports a modified version of the Config BPDUs received from the Root Bridge (on the Root Port).
A switch will not send BPDUs out on a port as long as it receives better BPDUs than the ones it would out that port. If better BPDUs are not received for a period of time (MaxAge = 20 sec default), the local port can once again send BPDUs.
As an exception, if a switch receives an inferior BPDU on a Designated Port, it will send a BPDU out that port to update the other switch
The STP Convergence is achieved in 3 steps:
Elect a Root Bridge
Elect Root Ports
Elect Designated Ports
When they first boot, all switches start sending BPDUs with themselves as the Root, until they receive a BPDU with a lower Bridge ID. In the end, all switches will receive BPDUs with the lowest Bridge ID as the Root Bridge. This Bridge will be considered the Root Bridge.
To see the role of a port in Spanning Tree, use:
The Root port is the port on the switch that has the best path to the Root Bridge. Unless the switch considers itself to be the Root Bridge, each switch will elect one port as the Root Port for each STP instance. The port where the best BPDU is received will be the Root Port. The Root Ports will transit to the forwarding state – FWD
On each segment, there are usually 2 or more switches. At first, each switch will think it is the designated switch on the segment, and will forward BPDUs on the segment. Eventually, all switches on the segment will find which is the switch that sends the best BPDUs on the segment. This switch will become the Designated Bridge and the port that sends the best BPDU on the segment will become the Designated Port. All ports on the Root Bridge should become Designated Ports, unless there is a loop connecting 2 ports on the Root Bridge. The Designated ports will transit to the forwarding state – FWD
All non-Root and non-Designated ports become Backup Ports. All Backup ports will transit to the Blocked State – BLK.
In 802.1d, there are 4 port states in addition to the Disabled state where the port is shut down or not participating in the STP process.
Blocking State – BLK
Listening State – LIS
Learning State – LRN
Forwarding State – FWD
To transit from BLK to FWD, one port must go through the LIS and LRN phase. Ports can directly transition from all other states to the BLK state. To see the state of a port in spanning tree, use:
All ports start in the BLK State. Ports can transition to the LIS state if they are elected to become Designated or Root ports. This is done only in the following situations:
if the port does not receive better BPDUs than the ones it would send for a MaxAge period (Default: 20 sec)
immediately in case of direct failures of the local root/designated port, based on port priority rules
Ports in the BLK state can only receive BPDUs. Receiving no BPDUs in a MaxAge period will transition the port to the LIS state. All Backup ports are immediately put in the BLK state.
Once in the LIS State, a port can both send and receive BPDUs, but it cannot forward data. Ports in the LIS state will transition back to the BLK state if they lose their Designated or Root status. This can only happen if they receive better BPDUs. Ports in the LIS state will transition to the LRN state if they don’t receive better BPDUs in a Forward Delay period (Default: 15 sec).
Once in the LRN state, a port can both send and receive BPDUs, it still cannot forward data, but it will build its MAC address table based on the traffic it receives. Ports in the LRN state will transition back to the Blocking state if they lose their Designated or Root status. This can only happen if they receive better BPDUs. Ports in the LRN state will transition to the FWD state if they don’t receive better BPDUs in a Forward Delay period (Default: 15 sec).
Root and Designated ports end up in the forwarding state. This state permits the port to send and receive BPDUs, but also to forward traffic (finally). A port in the Forwarding State can only move to the blocking state if it becomes a Backup port (loses its Designated or Root status). When a port first initializes it will take up to 50 seconds for it to get to the FWD state:
There are 2 types of BPDUs.
Config BPDUs are generated by the Root Bridge and go downstream to the Spanning Tree’s leafs.
TCN BPDUs go upstream towards the Root Bridge and are the only BPDUs sent out the Root Port.
How TCN BPDUs work:
A bridge originates a TCN BPDU if:
It transitions a port into FWD state and it has at least one Designated Port
It transitions a port form FWD or LRN to BLK state
A bridge will send TCN BPDUs every Hello Time (locally configured) until the TCN is acknowledged.
The upstream bridge sets the TCN ACK flag in it’s next BPDU that it sends downstream to acknowledge the TCN BPDU received.
The upstream bridge propagates the TCN BPDU out its Root Port.
When the TCN BPDU arrives at the Root Bridge, the Root Bridge sets the TCN Acknowledgement flag and the Topology Change flag in the next Configuration BPDU.
These flags instructs all bridges to shorten their bridge table aging process from the default 300 sec to the current FwdDelay, thus reducing the convergence time from 300 sec to 50 sec.
The Root Bridge continues to set the TCN flag in all its Config BPDUs for the total of FwdDealy + MaxAge (default:35 sec)
Downstream bridges will forward the TC BPDUs to other bridges
Timers can only be modified on the Root Bridge. Timer values are carried in Config BPDUs. Modifications on other bridges will be ignored until they become Root Bridge.
HelloTime is the time between Config BPDUs sent by the Root Bridge. It is also the interval between TCN BPDUs until the bridge receives a TCN ACK. Default: 2 sec. Lowering Hello Time to 1 sec does not improve convergence Hello Time can be set manually:
or by using a macro:
FwdDelay is the duration of Listening and Learning States. Also, when a bridge receives a BPDUs with TC flag set, it will set the bridge table aging period to FwdDelay. Default: 15 sec (This value was calculated using a max radius of 7 bridges, a Hello Time of 2 sec and a max of 3 lost BPDUs). RFC dictates that time spent in the LIS and LRN states must be equal. A simplified formula for FwdDelay calculation is:
For details, see Cisco’s article on Understanding and Tuning Spanning Tree Protocol Timers
Default FwdDelay is 15 sec. You can change it with:
How much time is the Best BPDU stored. If the Best BPDU is not received anymore, it is aged out, and a new Root/Designated Port Election will take place. Convergence is achieved in maximum 50 sec(20+15+15). Direct failures can make the port skip the Max Age timer and put another port directly into a Listening State. In the latter case, convergence is achieved in 30 sec(15+15) A simplified formula for Max Age is:
For details, see Cisco’s article on Understanding and Tuning Spanning Tree Protocol Timers
The default MaxAge value is 20 sec. To change it, use:
Message Age is set to 0 on BPDUs originated by the Root Bridge. Bridges that process BPDUs will increment this value by 1 at every bridge hop before forwarding the BPDU. If BPDUs from the Root are missed, the switch will count the missed BPDUs and when a BPDU from the root is finally received, it will increment this value once for each missed BPDU.
When a BPDU with TCN is received, the Bridge Table Aging Time will change from 300 sec to FwdDelay to reduce convergence time.
This feature enables the transition of a port from the BLK state directly to the FWD state, bypassing LIS and LRN states. This feature should only be used on access ports where only one host is connected. Transitions from FWD to BLK or BLK to FWD on ports configured as portfast do not generate TCN BPDUs out the Root Port. Ports configured as Portfast still run spanning tree in order to block a potential loop, so they still send BPDUs. BPDU Guard and BPDU Filter can be enabled on portfast ports. To enable the portfast status of a port, you can use the command:
You can also use a macro that enables portfast, sets the port to access mode and disables PAgP
The last option is to use a global command that will enable portfast for all non-trunking ports:
To verify, use:
When portfast is off, convergence is achieved in 50 sec, and when it is on, it is instant.
If a switch has multiple uplinks that lead to the Root Bridge, only one of them will be elected as the Root Port, and all others will be Backup Ports. When the primary uplink fails, a new one will be elected as the Root Port, but it still has to go from BLK through LIS and LRN states, before reaching the FWD state. The Uplink Fast feature enables leaf-mode switches at the end of the spanning tree branches to have a functioning Root Port while keeping one or more redundant or potential Root Ports in Blocking state. When the primary root port uplink fails, the next lowest Root Path Cost port is unblocked and used without delay. Uplink Fast is enabled per switch, on all VLANs. To enable it, use:
The command will not be allowed on the Root Bridge. When entering this command the switch will automatically:
Raise Bridge Priority to 49152 (k=12) to make sure that this switch does not become the Root Bridge and that the switch is not used as a transit switch to get to the Root Bridge
Increment Port Cost by 3000 making the ports undesirable as paths to the root for any downstream switches.
If the root port fails and uplink fast is enabled, as soon as a new root port is elected, the switch will sends multicast frames to 0100.0CCD.CDCD on behalf of all entries in the CAM table so that upstream switches can learn about the new uplink. These frames are sent at a configurable rate (max-update-rate). If the rate is set to 0, these frames are not sent. To verify:
Uplink Fast reduces convergence in case of direct failures from 30-50 sec down to zero.
It helps STP to converge faster, in case of indirect link failure. A switch detects an indirect link failure when it receives inferior BPDUs from the Designated Bridge on a Root Port or a Blocked Port. When this happens, the switch will use the RLQ(Root Link Query) protocol to see if upstream switches have a stable path to the Root Bridge. RLQ is enabled only when Backbone Fast is enabled, so you must use Backbone Fast on all switches in the network:
To verify:
If backbone fast is enabled, when the switch receives an inferior BPDU:
If the inferior BPDU came on a Blocked Port – The switch considers the Root Port and all other blocked ports as alternate paths to the Root Bridge. It starts using RLQ
If the inferior BPDU came on the Root Port
If there are no Blocked Ports – The switch assumes it is the Root Bridge – This is done quicker than by using the default MaxAge mechanism
If there are Blocked Ports – The switch considers all Blocked Ports as alternate paths to the Root Brige. It starts using RLQ
RLQ Requests are sent out all alternate paths. When a switch receives a RLQ request it will send a RLQ reply if it thinks it is the Root Bridge (it actually is or it lost connectivity to the Root Bridge). In any other case, it will relay the RLQ request to all other connected switches. The switch that send the RLQ Request will eventually receive a RLQ Reply from the Root Bridge.
If the reply came on the current Root Port, than the path to the Root is stable.
If the reply came on a non-Root Port, then an alternate path must be used. MaxAge will immediately expire so the port will go directly to the LIS state and from there to LRN and FWD. This way, convergence in case of indirect failure is reduced from 50 to 30 sec.
Root Guard should be used on ports where you do not expect root bridges to appear. It will protect against better BPDUs on ports where the Root Bridge should not be found. If better BPDUs are received on a Root Guard enabled port, the switch will move the port in a “Root-inconsistent” state. In this state, no data is sent or received, but the switch can listen to BPDUs. If BPDUs are not received anymore, the port enters normal STP cycle.
Another indication can be seen while looking at the spanning tree data for the VLAN:
BPDU Guard is a feature that protects against any BPDUs received. If any BPDUs are received on such ports, they are immediately put into errdisable state. In errdisable state, the ports are shutdown and have to be manually re-enabled (shut/no shut) or they have to wait for errdisable to timeout. By default, BPDU Guard will shutdown all VLANs on the port, but it can be configured to disable only the offending VLAN if you configure:
BPDU Guard can be enabled per port, regardless of the portfast status:
or it can be enabled globally on every portfast port, using:
To enable auto-recovery for bpduguard errordisabled ports, use:
BPDU Filter disables sending and receiving of BPDUs on the ports it is configured on. It is as if STP is not runnning on those ports. BPDU Filtering can be be enabled globally or per port, but there is a difference in how it works: To enable BPDU Filtering for all portfast ports:
This way, all portfast ports will stop sending BPDUs. If a BPDU is received on a PortFast port with BPDU filtering, the port will lose its portfast status and will disable BPDU filtering. It will now work like a normal STP port.
You can also enable BPDU filtering on each port, regardless of the portfast status.
This way, the port acts as if STP is disabled and may lead to loops in the network.
When enabled, it keeps track of the BPDU activity on non-designated ports. As long as BPDUs are received, the port is allowed to behave normally. When BPDUs go missing, Loop Guard moves the port into Loop-inconsistent state, so that it cannot go into FWD after MaxAge expires. To enable Loop Guard, use:
UDLD is a Cisco proprietary feature that interactively monitors a port to see if the link is truly bidirectional. UDLD expects the far-end switch to echo these frames back across the same link, with the far-end switch port’s identification added. If echoes are not seen back, the link is considered unidirectional. There are 2 modes of operation:
Normal: when an uni-directional link is detected, the port is allowed to continue operations. The port is marked in an undetermined state and a syslog message is generated.
Aggressive: when an uni-directional link is detected, the port sends 8 UDLD messages, once a second. If none is echoed back, the port is put in errdisable state
UDLD doesn’t do anything until the other switch echoes back messages. So, unless UDLD worked before, the link won’t be disabled. UDLD can be enabled globally for all optic links, using:
or it can be enabled on each port, regardless of type, using:
The time between probe messages can be adjusted globally with:
To verify, use:
To reset all ports blocked by UDLD, use:
Originally, IEEE adopted the 802.1d Spanning Tree standard. BPDUs were carried over 802.1q links as untagged frames. This means that only one STP instance was supported. This was known as the MST (Mono Spanning Tree) or CST (Common Spanning Tree)
Cisco decided to run one 802.1d instance on each VLAN, resulting in PVST (Per VLAN Spanning Tree). PVST runs only on ISL trunks so it was not inter-operable with IEEE 802.1d
In the end, Cisco came up with PVST+ a version of PVST that runs also on 802.1q links and is interoperable with IEEE 802.1d.
PVST+ sends different BPDUs for each VLAN. The IEEE 802.1d version sends one BPDU regardless of the number of VLANs. When a Cisco switch connects to a IEEE region, it will exchange untagged BPDUs over the native VLAN with the CST instance and will encapsulate BPDUs for the other VLANs inside a multicast frame that will be forwarded by the IEEE switches out all ports, eventually reaching other Cisco Switches that will understand their special meaning. This mode of tunneling is possible because the VLAN ID information is added to the frame as a TLV field.
PVST+ is the default spanning tree mode running on Cisco switches. If MST or rapid-pvst was enabled, you can re-enable PVST+ using: