PIM 101

PIM basics

PIM is a protocol used to create loop-free multicast delivery trees from the source to the receivers. There are 2 different modes for PIM – Dense Mode and Sparse Mode, and a few other variations: PIM Sparse-Dense Mode, Bidirectional PIM and Source Specific PIM. PIM uses the unicast routing table to do the RPF check of incoming multicast packets. PIM will not forward multicast traffic out the interfaces where PIM is not enabled. By default, no interfaces are running PIM until a PIM mode is defined on them:

R(config-if)# ip pim {dense-mode|sparse-mode|sparse-dense-mode}

To see a list of interfaces running PIM, use:

R# show ip pim interface [INTERFACE] [detail|stats|count]
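The RPF check mentioned above can be sketched in a few lines of Python. This is only an illustration, not how a router implements it: the routing table contents, the interface names and the function names are all made up for the example. The idea is that a multicast packet is accepted only if it arrives on the interface the unicast routing table would use to reach the packet's source.

```python
import ipaddress

# Hypothetical unicast routing table: prefix -> interface used to reach it.
ROUTES = {
    "10.1.0.0/16": "Gi0/0",
    "10.1.2.0/24": "Gi0/1",
    "0.0.0.0/0": "Gi0/2",
}


def rpf_interface(source, routes=ROUTES):
    """Return the interface of the longest-prefix match for the source address."""
    addr = ipaddress.ip_address(source)
    best = None
    for prefix, interface in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface)
    return best[1] if best else None


def passes_rpf(source, arrival_interface, routes=ROUTES):
    """A packet passes RPF only if it arrived on the interface toward its source."""
    return rpf_interface(source, routes) == arrival_interface
```

With this table, a packet from 10.1.2.5 passes the check only when it arrives on Gi0/1, because the /24 route is more specific than the /16.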

PIM Neighbors

PIM uses a neighbor discovery mechanism by sending HELLOs every 30 sec (default) to 224.0.0.13. The interval can be changed with:

R(config-if)# ip pim query-interval HELLO-TIME [msec]
!Default: 30 sec

PIM uses neighbor adjacencies to build the distribution tree. By default, PIM will accept any PIM-enabled router as a neighbor, but you can prevent that with:

R(config-if)# ip pim neighbor-filter ACL

To see a list of PIM neighbors, use:

R# show ip pim neighbor

PIM DR

A PIM DR is elected for both PIM Dense Mode and PIM Sparse Mode. While in PIM Dense Mode it has no explicit role (other than being the IGMPv1 Querier), in PIM Sparse Mode the DR will Register a source to the Rendezvous Point. HELLOs are also used to elect the PIM DR. The DR is elected on a multiaccess network as the router with the highest priority and then the highest IP Address. To set the priority, use:

R(config-if)# ip pim dr-priority PRI
!Default Priority = 1. Highest wins
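The election rule above (highest priority first, then highest IP address) can be modeled as below. This is a minimal sketch: the PimRouter class and the elect_dr function are hypothetical names used only for illustration. Addresses are compared numerically, not as strings, so 10.0.0.10 beats 10.0.0.9.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class PimRouter:
    address: str
    dr_priority: int = 1  # default priority, as noted above


def elect_dr(routers):
    """Pick the DR: highest priority wins, highest IP address breaks ties."""
    return max(
        routers,
        key=lambda r: (r.dr_priority, int(ipaddress.ip_address(r.address))),
    )
```

For example, among three routers with the default priority the one with the highest address is elected, and raising the priority on any other router makes it the DR instead.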

PIM Dense Mode

Building the Shortest Path Tree

PIM Dense Mode was the initial method of delivery of multicast traffic. It uses a “Flood and Prune” mechanism. This means that by default, PIM DM will send traffic out all PIM-enabled interfaces as long as there is a PIM Neighbor or an IGMP client that joined the destination multicast group. By using this approach, the network is “flooded” with multicast traffic until it reaches the intended clients. A multicast “Source Distribution Tree”, a.k.a. “Shortest Path Tree”, is built with the source as the root. More exactly, whenever multicast traffic arrives at the router and passes the RPF check, the router will populate its mroute table with an (S,G) entry. The interface where multicast traffic arrived will be the incoming interface, and all other interfaces where PIM neighbors exist or where directly connected clients exist (known via IGMP) will be added to the OIL (Outgoing Interface List).

To prevent unwanted multicast traffic, a Router can send a PRUNE message upstream, telling its upstream PIM neighbor to stop sending the traffic this way. A router will send PRUNE messages upstream when:

  • traffic is arriving on a non-RPF point-to-point interface

  • it is a leaf router (no PIM neighbors downstream) and no directly connected receivers

  • it is a non-leaf router on a point-to-point link that has received a PRUNE message from the neighbor

  • it is a non-leaf router on a LAN segment that has received a PRUNE from a neighbor and no one overrides that PRUNE.
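The four conditions above can be condensed into a small decision function. This is a simplified sketch for illustration only: the function name and its parameters are invented, and the two non-leaf cases are folded into a single flag that is true when every downstream neighbor has pruned and, on a LAN, nobody overrode the PRUNE.

```python
def should_send_prune(on_rpf_interface, point_to_point, downstream_neighbors,
                      local_receivers, all_downstream_pruned):
    """Decide whether a PIM-DM router sends a PRUNE upstream for an (S,G)."""
    # Traffic arriving on a non-RPF interface is pruned only on point-to-point links.
    if not on_rpf_interface:
        return point_to_point
    # Directly connected receivers (known via IGMP) keep the traffic flowing.
    if local_receivers:
        return False
    # Leaf router: no PIM neighbors downstream and no receivers.
    if downstream_neighbors == 0:
        return True
    # Non-leaf router: prune only once every downstream branch is pruned
    # (on a LAN this means no neighbor sent a PRUNE OVERRIDE).
    return all_downstream_pruned
```

A leaf router without receivers returns True, while a router that still has one unpruned downstream neighbor returns False and keeps receiving the traffic.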

When a PRUNE is received, a 3-second (default) prune delay is started. When a PRUNE is sent on a LAN, any router that still wants the traffic will send a PRUNE OVERRIDE message. If such a message is received before the prune delay times out, the traffic is not pruned.

A PRUNE times out every 3 minutes by default. When it times out, the PIM router will start to send multicast traffic again on the interface. A new PRUNE message is needed for the router to stop the flooding. To prevent the periodic flooding, PIM implemented a PIM STATE REFRESH message that is sent downstream from the routers closest to the source. If traffic is needed on an interface, the downstream routers can send a GRAFT message upstream to request traffic. This message is acknowledged with a GRAFT-ACK before traffic is forwarded on that interface.

To enable PIM Dense Mode, use the following command:

R(config-if)# ip pim dense-mode
! It will also enable IGMP
! If multicast-routing is not enabled, it gives a warning

It is very important to understand the data in the following show command:

R# show ip mroute MULTICAST-GROUP
! (*,G) = Container entry, not used for forwarding in PIM DM
(*, MULTICAST-GROUP), 00:22:41/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    INTERFACE1, Forward/Dense, 00:22:41/stopped
    INTERFACE2, Forward/Dense, 00:22:41/stopped
! (S,G) entry is used for forwarding:
(MULTICAST-SOURCE, MULTICAST-GROUP), 00:00:47/00:02:12, flags: PT
  Incoming interface: INTERFACE1, RPF nbr UPSTREAM-NEIGHBOR
  Outgoing interface list:
    INTERFACE2, Prune/Dense, 00:00:47/00:02:12

The first entry is always a (*,G) entry, but it is not actually used for forwarding. Instead, it is just a container that lists all interfaces that could be part of the OIL. Forwarding is done based on the (S,G) entry. If there are multiple sources that send to the same multicast group, there will be multiple (S,G) entries.

The (S,G) entry in this example tells us that traffic was received on INTERFACE1 and was being forwarded out INTERFACE2, but a PRUNE message was recently received on that interface, so there are no clients on this branch of the tree. If there were no PIM neighbors either, the OIL would have said “Null”. If there were a client on this branch, the interface would have said “Forward/Dense”. When the traffic on an interface is pruned, as in our example, it says “Prune/Dense”.

If the incoming interface in the (S,G) entry is “Null”, it means that the traffic from this source failed the RPF check. This is usually because the routing table points to another interface to reach this source. You can make the traffic pass the RPF check by modifying the routing table accordingly, or by forcing the router to accept the traffic on this interface with a static mroute:

R(config)# ip mroute SOURCE MASK {PREVIOUS-HOP|INCOMING-INTERFACE}
! PREVIOUS-HOP in multicast is the equivalent of NEXT-HOP in the unicast routing table

If there are multiple sources that transmit, you can see a summary of the mroute table with:

R# show ip mroute summary
(*, MULTICAST-GROUP), 01:14:12/stopped, RP 0.0.0.0, OIF count: 3, flags: D
  (SOURCE1, MULTICAST-GROUP), 00:00:46/00:02:13, OIF count: 2, flags: T
  (SOURCE2, MULTICAST-GROUP), 00:06:47/00:01:02, OIF count: 2, flags: T

PIM Forwarder

When sending traffic downstream, PIM ASSERT messages are used to determine the PIM forwarder. If a router receives a multicast packet on an interface that is in the OIL (Outgoing Interface List), it means there is another router forwarding traffic on the same segment. This will result in duplicate packets being sent to the receivers. To solve this, the routers will send a PIM ASSERT message on the LAN. This message contains the metric information [AD/metric] to the source of the traffic. The router with the best route wins: the lowest AD first, then the lowest metric. In case of a tie, the router with the highest IP address wins. The only way you can change the PIM Forwarder is by changing the routing table so that the router advertises a better [AD/metric] in its ASSERT messages.
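The ASSERT comparison can be sketched as follows. The AssertInfo class and the assert_winner function are hypothetical names for this example; the comparison assumes lower AD and lower metric are better, with the highest IP address as the final tie-breaker.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AssertInfo:
    address: str         # IP address of the router on the shared segment
    admin_distance: int  # AD of its route to the source
    metric: int          # metric of its route to the source


def assert_winner(candidates):
    """Lowest AD wins, then lowest metric, then highest IP address."""
    return min(
        candidates,
        key=lambda c: (
            c.admin_distance,
            c.metric,
            -int(ipaddress.ip_address(c.address)),
        ),
    )
```

Two routers advertising the same [AD/metric] are therefore separated only by their addresses, which is why the forwarder can be moved only by changing the route to the source.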

PIM Sparse Mode

PIM Sparse Mode is defined in RFC 4601 (since obsoleted by RFC 7761). It uses an explicit JOIN mechanism: no traffic is sent out an interface unless an explicit JOIN was received on that interface. To enable PIM Sparse Mode on an interface, use:

R(config-if)# ip pim sparse-mode

Sparse Mode assumes that no interface needs the multicast traffic at the beginning, and each router that has directly connected receivers must explicitly send a JOIN upstream. Interfaces that receive JOINs are added to the OIL as forwarding, but they time out after 3 minutes. JOINs are usually sent every minute to keep the interface forwarding.
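The JOIN refresh and timeout behaviour can be modeled with a tiny soft-state table. This is a sketch under simple assumptions: time is passed in as plain seconds, and the OutgoingInterfaceList class is a made-up name, not a real router structure.

```python
class OutgoingInterfaceList:
    """Soft-state OIL: an interface forwards only while its JOIN state is fresh."""

    HOLDTIME = 180  # seconds; the 3-minute timeout described above

    def __init__(self):
        self._expires = {}  # interface name -> expiry time in seconds

    def join(self, interface, now):
        """Record a JOIN received on an interface, restarting its timer."""
        self._expires[interface] = now + self.HOLDTIME

    def forwarding(self, now):
        """Return the interfaces whose JOIN state has not expired yet."""
        return sorted(i for i, t in self._expires.items() if t > now)
```

A JOIN received at t=0 keeps the interface forwarding until t=180; a refresh at t=60 extends that to t=240, which is why periodic JOINs every minute are enough to keep the state alive.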

Since the receivers must explicitly JOIN the distribution tree, there is a need for a Rendezvous Point that will keep track of all receivers and sources in the network.

PIM-SM is basically split into 2 forwarding trees. One from the source to the RP and another from the RP to the receivers.

Another difference between PIM Dense Mode and PIM Sparse Mode is that the entries in the mroute table are generated by multicast traffic in PIM DM, and by PIM JOINs in PIM SM.

Building the Shared Tree

The router that receives IGMP JOINs (PIM DR if more than one) sends a PIM JOIN upstream to the RP. All routers in the path insert (*,G) entries and send the PIM JOIN upstream until it reaches the RP. The (*,G) entries will have an incoming interface set to the interface towards the RP and an OIL containing the interface where the JOIN was received.

If traffic is no longer needed, a (*,G) PRUNE must be sent upstream. When a PRUNE is received, the interface is excluded from the OIL. If the OIL is empty (NULL) then a PRUNE is sent upstream. A (*,G) PRUNE will also prune all (S,G) entries.

Source Registration and Building the Source Distribution Tree

Since the RP is the root of the distribution tree, there must be a way for the sources to send the traffic to the RP for distribution. This is done via a registration process. When a source starts to transmit, the DR on its segment encapsulates the packets in a PIM REGISTER message, which is unicast to the RP. When the RP receives the message from the PIM DR, it will acknowledge it with a REGISTER STOP message and will insert an (S,G) entry into the mroute table. Now, only the DR and the RP know about the (S,G) entry. The RP can be configured with a list of allowed sources that can register to it:

R(config)# ip pim accept-register ACL
! The ACL should be extended. It matches the SOURCE and the GROUP

Once the RP has a (*,G) entry in its mroute table (meaning there is a receiver, i.e. the shared tree was built), it sends its own (S,G) PIM JOIN message upstream toward the source, causing all routers on the upstream path to add (S,G) entries with the OIL pointing toward the RP.

By default, the PIM DR sources the register message from the interface closest to the RP. You can change this with:

R(config)# ip pim register-source INTERFACE

SPT Switchover

A router that has directly connected receivers can switch from the Shared Tree to the SPT to the source when the traffic on the shared tree exceeds a certain threshold:

R(config)# ip pim spt-threshold {KBPS | infinity}
! Default: 0 = always

By default, this threshold is 0 for Cisco Routers, meaning that the router will always switch over. It can be set to never switch over by using the infinity keyword. When this router decides to switch over, it sends an (S,G) JOIN upstream toward the source. It is possible that, at some point, the SPT and the Shared Tree will diverge. Routers will send an (S,G) PRUNE up the shared tree when the upstream neighbor for (S,G) is different from the upstream neighbor toward the RP. This will prune the source's traffic from the shared tree. An (S,G) PRUNE will only prune the (S,G) entry and will not generate a PRUNE upstream.

PIM-SM mRoute Table

Unlike PIM-DM, in PIM-SM (*,G) entries are used to forward multicast traffic down the shared tree. (*,G) entries are created when JOINs are received. The incoming interface of a (*,G) entry is set to the RPF interface that points to the RP, and the OIL contains interfaces where PIM JOINs for this group were received. (S,G) entries are created as a result of explicit JOINs for (S,G). (S,G) JOIN messages are sent by a router that is doing SPT Switchover or by the RP when a REGISTER message is received.

In PIM-SM, pruned interfaces are removed from the OIL, so interfaces will only show up in the OIL if traffic needs to be sent on them. Therefore, they always appear as Forward/Sparse.

Setting the Rendezvous Point (RP)

PIM Sparse Mode uses a shared tree to deliver traffic to the receivers. Without an RP, PIM SM will not work. The RP can be set in 3 ways: manually, with Auto-RP, or with BSR.

To prevent unwanted RPs in the network, a router can be configured to ignore JOIN messages addressed to RPs other than the ones specified in:

R(config)# ip pim accept-rp {RP-ADDR | auto-rp} [ACL]

This means that the router will ignore these JOIN messages so they will not generate entries in the mroute table.

Manual

R(config)# ip pim rp-address RP-ADDR [ACL] [override]

When setting the RP manually, it can be done for all groups or for a subset (matching the ACL). Normally, if the RP is also learned in any other way, the learned RP will be considered a better choice than the manual setting. Use the override keyword to use the statically defined RP even if another one is set using Auto-RP or BSR.

Auto-RP

Auto RP is a feature implemented first by Cisco. It helps routers find the RP in a network. There are 2 components in the design: the RP candidates and the mapping agent. First, the RP candidates must announce themselves at the 224.0.1.39 address.

R(config)# ip pim send-rp-announce INTERFACE scope TTL [group-list ACL] [interval SEC]
!The INTERFACE must run PIM (including Loopback)

Then the mapping agents must be configured to join the 224.0.1.39 address and listen to the announcement. They will announce the mappings to the 224.0.1.40 group, an address that all Cisco PIM-SM routers join by default. To configure a mapping agent, use:

R(config)# ip pim send-rp-discovery INTERFACE scope TTL [interval SEC]

If there are multiple RP candidates, the Mapping Agent will choose the router with the highest IP Address. The mapping agents can filter incoming RP candidate announcements, using:

R(config)# ip pim rp-announce-filter [rp-list RP-ACL] [group-list GROUP-ACL]

The GROUP-ACL must match the ACL used by the RP Candidate in its announcement, otherwise the announcement will be completely filtered. Auto-RP also has a circular dependency: the mapping agent uses Dense Mode to send RP information, but in Sparse Mode a router needs an RP before it can join any group. The available solutions are:

  • Run PIM in sparse-dense mode

  • Statically define the RP for the Auto-RP Groups (224.0.1.39 and 224.0.1.40) – this defeats the purpose of auto-configuration

  • Set the router to be an auto-rp listener, which is a workaround that makes the router work in dense mode just for those 2 addresses, and in sparse mode for all other multicast groups.

    R(config)# ip pim autorp listener

To see the RP mappings, use:

R# show ip pim rp [mapping]

You can filter auto-rp messages going out an interface with:

R(config-if)# ip multicast boundary ACL [filter-autorp]
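As a rough model of how a mapping agent picks among several RP candidates for a group, the sketch below keeps the candidate with the highest IP address. It is deliberately simplified: the select_rp function and the announcements dictionary are invented for this example, and the sketch ignores how overlapping group ranges of different lengths are handled.

```python
import ipaddress


def select_rp(group, announcements):
    """Return the highest-addressed candidate RP that announced the group.

    announcements maps a candidate RP address to the list of group
    prefixes it announced (hypothetical data layout for this sketch).
    """
    addr = ipaddress.ip_address(group)
    candidates = [
        rp
        for rp, prefixes in announcements.items()
        if any(addr in ipaddress.ip_network(p) for p in prefixes)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda rp: int(ipaddress.ip_address(rp)))
```

If 10.0.0.1 announces 224.0.0.0/4 and 10.0.0.2 announces 239.0.0.0/8, a group in 239.0.0.0/8 maps to 10.0.0.2 and any other group maps to 10.0.0.1.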

BSR – BootStrap Router

BSR is an open standard introduced with PIMv2 (RFC 2362) and now specified in RFC 5059, used to automate the process of electing an RP. Similar to Auto-RP, there are several candidate RPs and one or more candidate BSRs; the elected BSR advertises the RP-to-group mappings. The BSR is elected first based on the highest priority and then by the highest IP address. To define a router as a BSR Candidate, use:

R(config)# ip pim bsr-candidate INTERFACE [HASH-LEN] [PRI]
! Default PRI=0 - Highest wins. If tied, the higher IP wins.

The BSR algorithm tries to distribute groups more or less equally among all Candidate RPs, while at the same time making sure that all routers select the same RP for a group. For each Group Address, the BSR algorithm will select the Candidate RP with the lowest Priority (as advertised by the Candidate RP to the BSR). Since this is 0 by default, it often happens that there are multiple RP Candidates with the same lowest Priority. The Hash function, which can be found in RFC 2362, will generate a Hash value for each Candidate RP that is available for a Group and that has the same lowest Priority. The Candidate RP with the highest hash value will be selected as the RP. To see the result of the Hash function, use:

R# show ip pim rp-hash GROUP-ADDR

HASH-LEN represents the length of a mask that is ANDed with the Group Address before the Hash value is computed. As a result, the hash function actually uses only the first HASH-LEN bits of the Group Address, so all groups that share the same first HASH-LEN bits will select the same RP. By default, HASH-LEN is zero, so all groups will select the same RP (you can see that each Candidate RP's hash value stays the same regardless of the Group Address).

If you define a HASH-LEN value, then you can have smaller contiguous chunks of the multicast address space assigned to the same RP. For example, when using a HASH-LEN of 31, every 2 consecutive Group Addresses will be assigned to the same RP. For 30, every 4 Group Addresses, and so on.

At this point you would think that using a HASH-LEN of 1 would assign half of the Groups to one RP and the other half to another RP. But that’s not the case. Since all Multicast Addresses share the same first 4 bits (224.0.0.0/4), a HASH-LEN of 4 or less will always produce the same result from the hash function. Therefore, the minimum HASH-LEN should be 5 if you attempt to assign half of the Group Addresses to one RP and the other half to another RP. However, the Hash results are pseudo-random, and you have better chances of an equal distribution as the number of different hash results grows. With a HASH-LEN of 32, the hash is computed separately for every Group Address, and each group is assigned to one of the Candidate RPs.
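The effect of HASH-LEN is easier to see with the hash written out. The sketch below follows the hash formula given in RFC 2362 (section 3.7) and assumes all candidates already share the same lowest Priority; the function names hash_mask, rp_hash and select_rp are made up for this example, and ties on the hash value are broken by the highest candidate address.

```python
import ipaddress


def hash_mask(hash_len):
    """Build the 32-bit mask that keeps the first hash_len bits of the group."""
    return (0xFFFFFFFF << (32 - hash_len)) & 0xFFFFFFFF


def rp_hash(group, candidate, hash_len):
    """Value(G, M, C) = (1103515245 * ((1103515245 * (G & M) + 12345) XOR C) + 12345) mod 2^31."""
    g = int(ipaddress.ip_address(group)) & hash_mask(hash_len)
    c = int(ipaddress.ip_address(candidate))
    return (1103515245 * ((1103515245 * g + 12345) ^ c) + 12345) % 2**31


def select_rp(group, candidates, hash_len):
    """Highest hash value wins; the highest candidate address breaks ties."""
    return max(
        candidates,
        key=lambda c: (rp_hash(group, c, hash_len), int(ipaddress.ip_address(c))),
    )
```

With a HASH-LEN of 4 or less the masked group is identical for every multicast address, so select_rp returns the same candidate for all groups. With a HASH-LEN of 31, 239.1.1.0 and 239.1.1.1 always map to the same candidate, while 239.1.1.2 may map to a different one.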

BSR information is passed on a hop-by-hop basis to 224.0.0.13 (All PIM Routers). This is how routers will find information about the RP mappings.

RP candidates send their information to the BSR via unicast, because they already know the BSR address:

R(config)# ip pim rp-candidate INTERFACE [group-list ACL] [interval SEC] [priority PRI]
! Default PRI=0. Lowest wins!
! Default SEC=60

The ACL used to filter the groups that the RP candidate advertises can only have permit entries.

You can prevent BSR messages from being sent or received on an interface with:

R(config-if)# ip pim bsr-border

PIM Sparse-Dense Mode

PIM Sparse-Dense Mode uses Sparse Mode when there is an RP for the group, and Dense Mode when there is no RP. It can be used to make Auto-RP work. To enable sparse-dense mode on an interface, use:

R(config-if)# ip pim sparse-dense-mode

There is a problem when the RP becomes unavailable. The groups will fall back to Dense Mode and the traffic will be flooded throughout the multicast network. To prevent this, you can:

  • disable dense-mode fallback:

    R(config)# no ip pim dm-fallback

  • use Auto RP listener:

    R(config)# ip pim autorp listener

  • or set a “sink RP” as a last resort – a static RP that will prevent Dense Mode fallback:

    R(config)# ip pim rp-address RP-ADDR [ACL]

Bidirectional PIM

Bidirectional PIM is based on PIM-SM but differs in the way the source sends its data to the RP. In normal PIM-SM, traffic flows only downstream: from the source to the RP on the SPT, and from the RP to the receivers on the shared tree. In Bidir PIM, traffic flows only on a shared tree: upstream from the source to the RP and downstream from the RP to the receivers.

On every network segment, a Designated Forwarder (DF) is elected to forward the traffic upstream towards the RP. There will be one DF for each RP in the network. There is no source registration in Bidir Mode. Instead, all traffic will reach the RP where it will be dropped if there are no clients.

Routers will only use (*,G) entries in the mroute table, and when performing the RPF check, the interface towards the RP is used. When sending traffic towards the RP, only the DF will forward packets; all other routers will discard them. Bidirectional PIM will skip the RPF check, but all routers in a domain must be configured to support this feature, otherwise loops can occur.

To configure a router to support Bidirectional PIM, use:

R(config)# ip pim bidir-enable

Of course, the routers will need to work in Sparse or Sparse-Dense Mode, and they will need to know the RP for this traffic. To enable an RP for Bidirectional PIM you can use any method, just specify that it works in bidir mode:

! Manual:
R(config)# ip pim rp-address RP-ADDR [ACL] bidir
! Auto-RP
R(config)# ip pim send-rp-announce INTERFACE scope TTL [group-list ACL] bidir
! BSR
R(config)# ip pim rp-candidate INTERFACE [group-list ACL] bidir

PIM SSM – Source Specific Multicast

SSM is a delivery model in which traffic is forwarded to the receivers only from those sources that they explicitly joined. Only IGMPv3 supports this kind of Membership Reports where the receiver specifies the source it wants to receive from. In PIM SSM, the routers only create (S,G) entries, but this applies only to the SSM address range. All other groups are treated as normal Sparse Mode.

R(config)# ip pim ssm {default | range ACL}

The default range is specified by IANA as 232.0.0.0/8 but any other multicast range can be specified using the ACL. Remember to set the IGMP version to 3 on the interfaces where receivers exist:

R(config-if)# ip igmp version 3

In SSM, there is no need for an RP because the clients know the source of the traffic. The Clients generate IGMPv3 (S,G) JOINs, and the routers build the tree up to the source using (S,G) PIM JOINs.
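The split between the SSM range and ordinary Sparse Mode groups can be illustrated as below. This is a simplified sketch: is_ssm_group and join_type are hypothetical helpers, only the IANA default range is configured, and groups outside the range are modeled as plain (*,G) joins.

```python
import ipaddress

SSM_DEFAULT_RANGE = ipaddress.ip_network("232.0.0.0/8")  # IANA default SSM range


def is_ssm_group(group, ssm_ranges=(SSM_DEFAULT_RANGE,)):
    """Return True when the group falls inside a configured SSM range."""
    addr = ipaddress.ip_address(group)
    return any(addr in net for net in ssm_ranges)


def join_type(group, source=None, ssm_ranges=(SSM_DEFAULT_RANGE,)):
    """Return the kind of tree state a membership report leads to."""
    if is_ssm_group(group, ssm_ranges):
        # SSM groups need a source; a report without one creates no state.
        return "(S,G)" if source is not None else None
    # Outside the SSM range the group is handled as normal Sparse Mode.
    return "(*,G)"
```

A report for 232.1.1.1 from source 10.1.1.1 yields (S,G) state, the same report without a source yields nothing, and a report for 239.1.1.1 is handled through the shared tree.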
