Implementing Inter-AS MPLS L3VPN services with Option C

We finish the inter-AS L3VPN series with Option C. As I foreshadowed at the end of the Option B post, Option C is an even more scalable solution than Option B: we'll have a single, globally significant VPN label between the service providers, and the ASBRs will only swap the transport label, so the VPN label stays the same end-to-end.

Inter-AS MPLS L3VPN Option C topology

So what do we start with? I removed the VPNv4 BGP neighborship between R1 and R7 which we used with Option B, and there aren't any VRFs defined on R1 and R7.

About Option C

This time both the VPN label and the LSP are end-to-end, which can actually be advantageous from a QoS perspective. Option C is even more scalable than Option B, but it usually requires more configuration, and it needs a lot of coordination and trust between the service providers, so it is probably the least secure solution.

We form a VPNv4 BGP neighborship between the Route Reflectors (RRs) of the service providers (SPs), so each SP needs to learn the address of the neighboring provider's RR. Moreover, the two RRs are configured with next-hop-unchanged towards each other, so the next hop of the VPNv4 routes points to the PE router in the neighboring SP's infrastructure. This means the SPs also need to learn the loopback addresses of each other's PE routers, and these addresses have to be unique across both providers: they can't overlap.

The ASBRs (R1 and R7) form a Labeled Unicast (LU) BGP neighborship. We don't run LDP between R1 and R7, of course: the transport label between the two ASBRs is generated through BGP. So let's take a look at how we can implement all of this step by step.

Option C Implementation

First we build the VPNv4 BGP neighborship between the two RRs with next-hop-unchanged, so the next hop will point to the remote PE, which is located in the other ISP's infrastructure. This is an eBGP session, so without next-hop-unchanged the RR would set the next hop to its own loopback.

R4(config)#router bgp 16
R4(config-router)#neighbor 10.10.10.10 remote-as 712
R4(config-router)#neighbor 10.10.10.10 update-source lo0
R4(config-router)#neighbor 10.10.10.10 ebgp-multihop 255
R4(config-router)#address-family vpnv4 unicast 
R4(config-router-af)#neighbor 10.10.10.10 activate 
R4(config-router-af)#neighbor 10.10.10.10 next-hop-unchanged 

R10(config)#router bgp 712
R10(config-router)#neighbor 10.4.4.4 remote-as 16
R10(config-router)#neighbor 10.4.4.4 update-source lo0
R10(config-router)#neighbor 10.4.4.4 ebgp-multihop 255
R10(config-router)#address-family vpnv4 unicast 
R10(config-router-af)#neighbor 10.4.4.4 activate 
R10(config-router-af)#neighbor 10.4.4.4 next-hop-unchanged 

We also need ebgp-multihop: an eBGP session uses a TTL of 1 by default, and the RRs are multiple hops away from each other. That's all we need to configure on the RRs. Of course the neighborship won't come up yet, because the RRs don't have routes to each other, so it remains in the Idle state:

R10(config-router-af)#do show bgp vpnv4 unicast all summ | begin Neighbor
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.4.4.4        4           16       0       0        1    0    0 never    Idle
10.7.7.7        4          712      23      25       13    0    0 00:14:28        0
10.11.11.11     4          712      20      25       13    0    0 00:14:06        4
10.12.12.12     4          712      20      25       13    0    0 00:13:59        4

Next we build the IPv4 LU BGP neighborship between R1 and R7 with send-label:

R1(config)#router bgp 16
R1(config-router)#neighbor 100.1.17.7 remote-as 712
R1(config-router)#address-family ipv4 unicast 
R1(config-router-af)#neighbor 100.1.17.7 activate 
R1(config-router-af)#neighbor 100.1.17.7 send-label 
R1(config-router-af)#network 10.4.4.4 mask 255.255.255.255

R7(config)#router bgp 712
R7(config-router)#neighbor 100.1.17.1 remote-as 16
R7(config-router)#address-family ipv4 unicast 
R7(config-router-af)#neighbor 100.1.17.1 activate 
R7(config-router-af)#neighbor 100.1.17.1 send-label 
R7(config-router-af)#network 10.10.10.10 mask 255.255.255.255

Notice that IPv4 Labeled Unicast is actually a separate address family (SAFI 4), yet on IOS-XE we configure it under IPv4 unicast with send-label. On IOS-XR it is different: the neighbor is explicitly configured with the ipv4 labeled-unicast address family.
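
For comparison, here is a rough sketch of what R1's side of the session could look like on IOS-XR. This is not taken from the lab (both ASBRs run IOS-XE here): the AS numbers and addresses mirror R1 above, while the PASS route-policy name and the interface name are made up for illustration. IOS-XR expects an explicit route-policy on eBGP sessions, allocate-label to assign labels to the advertised prefixes, and usually a static /32 route towards the directly connected LU peer:

route-policy PASS
  pass
end-policy
!
router static
 address-family ipv4 unicast
  ! interface name is hypothetical
  100.1.17.7/32 GigabitEthernet0/0/0/0
!
router bgp 16
 address-family ipv4 unicast
  network 10.4.4.4/32
  allocate-label all
 !
 neighbor 100.1.17.7
  remote-as 712
  address-family ipv4 labeled-unicast
   route-policy PASS in
   route-policy PASS out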

BGP OPEN message between R1 and R7 with AFI/SAFI capabilities: both ASBRs must support the IPv4 LU address family, otherwise the neighborship won't come up

I also advertise the loopback of each RR with the network command above, so now the ASBRs (R1 and R7) should learn the loopback of the neighbor ISP's RR.

BGP IPv4 LU NLRI: the prefix is advertised with an MPLS label
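
If you don't have a capture at hand, the label binding can also be checked from the CLI. A short sketch of the commands one could run on R1; the output is left out on purpose because the label values depend on what your lab allocates:

R1#show ip bgp labels
R1#show ip bgp 10.10.10.10/32
R1#show mpls forwarding-table 10.10.10.10 32

The first command lists the in/out labels per BGP prefix, and the last one shows whether the prefix made it into the LFIB.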

But the RRs still don't have routes to each other at this point. Why? That's because we don't have IPv4 unicast BGP sessions between the ASBRs and the RRs. So we redistribute the learned prefixes of the remote RRs into IGP (which is OSPF in this case):

R1(config)#router ospf 1
R1(config-router)#redistribute bgp 16 subnets 

R7(config)#router ospf 1
R7(config-router)#redistribute bgp 712 subnets 
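
The redistribute commands above bring into OSPF whatever the ASBR learns from the other provider over eBGP. In a production network you would probably want to limit that to the loopbacks the other provider is supposed to share. A minimal sketch for R1, assuming the hypothetical names REMOTE_LOOPBACKS and BGP_TO_OSPF (they are not part of the lab config) and allowing only host routes:

R1(config)#ip prefix-list REMOTE_LOOPBACKS permit 0.0.0.0/0 ge 32
R1(config)#route-map BGP_TO_OSPF permit 10
R1(config-route-map)#match ip address prefix-list REMOTE_LOOPBACKS
R1(config-route-map)#router ospf 1
R1(config-router)#redistribute bgp 16 subnets route-map BGP_TO_OSPF

R7 would mirror this with redistribute bgp 712. A stricter prefix list would name the other provider's RR and PE loopbacks one by one.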

Now the RRs should have routes to each other and the neighborship should come up.

R10(config-router-af)#
%BGP-5-ADJCHANGE: neighbor 10.4.4.4 Up 

Now on the RRs we can see the prefixes from the remote ISP's infrastructure, but the next-hop is unreachable:

R4#show bgp vpnv4 unicast all | begin Network
     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 16:1
 * i  14.14.14.14/32   10.6.6.6                 0    100      0 65111 i
 *>i                   10.5.5.5                 0    100      0 65111 i
Route Distinguisher: 16:2
 * i  13.13.13.13/32   10.6.6.6                 2    100      0 ?
 *>i                   10.5.5.5                 2    100      0 ?
 * i  172.2.5.0/24     10.6.6.6                 2    100      0 ?
 *>i                   10.5.5.5                 0    100      0 ?
 *>i  172.2.6.0/24     10.6.6.6                 0    100      0 ?
 * i                   10.5.5.5                 2    100      0 ?
Route Distinguisher: 16:11
 *    15.15.15.15/32   10.11.11.11                            0 712 65222 i
Route Distinguisher: 16:22
 *    16.16.16.16/32   10.11.11.11                            0 712 ?
 *    172.2.11.0/24    10.11.11.11                            0 712 ?
 *    172.2.12.0/24    10.12.12.12                            0 712 ?

Remember that we configured the RRs with next-hop-unchanged, so as we can see the next hop points to R11 (10.11.11.11) and R12 (10.12.12.12), which R4 doesn't have routes to. So next, each ASBR redistributes its own AS's PE loopbacks from OSPF into BGP, so that they are advertised to the other provider. Actually, we're going to redistribute every /32 loopback address this time, not just the PEs':

R1(config)#ip prefix-list ALL_32S permit 0.0.0.0/0 ge 32
R1(config)#route-map REDIST permit 10
R1(config-route-map)#match ip address prefix-list ALL_32S
R1(config)#router bgp 16
R1(config-router)#address-family ipv4 unicast 
R1(config-router-af)#redistribute ospf 1 route-map REDIST

R7(config)#ip prefix-list ALL_32S permit 0.0.0.0/0 ge 32
R7(config)#route-map REDIST permit 10
R7(config-route-map)#match ip address prefix-list ALL_32S
R7(config)#router bgp 712
R7(config-router)#address-family ipv4 unicast 
R7(config-router-af)#redistribute ospf 1 route-map REDIST

You can already see what I outlined above: the service providers need to share their loopback addresses with each other, and these must be globally unique. Now that the next hops are reachable, the VPNv4 routes are both valid and best, and the RRs can advertise them:

R4#show bgp vpnv4 unicast all | begin Network
     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 16:1
 * i  14.14.14.14/32   10.6.6.6                 0    100      0 65111 i
 *>i                   10.5.5.5                 0    100      0 65111 i
Route Distinguisher: 16:2
 * i  13.13.13.13/32   10.6.6.6                 2    100      0 ?
 *>i                   10.5.5.5                 2    100      0 ?
 * i  172.2.5.0/24     10.6.6.6                 2    100      0 ?
 *>i                   10.5.5.5                 0    100      0 ?
 *>i  172.2.6.0/24     10.6.6.6                 0    100      0 ?
 * i                   10.5.5.5                 2    100      0 ?
Route Distinguisher: 16:11
 *>   15.15.15.15/32   10.11.11.11                            0 712 65222 i
Route Distinguisher: 16:22
 *>   16.16.16.16/32   10.11.11.11                            0 712 ?
 *>   172.2.11.0/24    10.11.11.11                            0 712 ?
 *>   172.2.12.0/24    10.12.12.12                            0 712 ?

The RTs are now also globally significant, of course, just like with Option B. If the import/export RTs match, the CEs should be able to reach each other:

R14#traceroute 15.15.15.15 source lo0 numeric 
Type escape sequence to abort.
Tracing the route to 15.15.15.15
VRF info: (vrf in name/id, vrf out name/id)
  1 172.1.5.5 2 msec 2 msec 2 msec
  2 10.0.25.2 [MPLS: Labels 2014/11021 Exp 0] 7 msec 6 msec 5 msec
  3 10.0.12.1 [MPLS: Labels 1013/11021 Exp 0] 6 msec 6 msec 6 msec
  4 100.1.17.7 [MPLS: Labels 7014/11021 Exp 0] 6 msec 5 msec 5 msec
  5 10.7.8.8 [MPLS: Labels 8008/11021 Exp 0] 5 msec 6 msec 6 msec
  6 172.1.11.11 [MPLS: Label 11021 Exp 0] 6 msec 5 msec 4 msec
  7 172.1.11.15 7 msec *  4 msec

Which they do. Notice that the LSP is end-to-end: the VPN (inner) label doesn't change, only the transport label changes, just like with intra-AS L3VPN.

MPLS label stack between R1 and R7: only the transport label changes
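
Reading the traceroute hop by hop makes the swap sequence explicit. Each hop prints the label stack it received, so the operations below are inferred from that output. The router names are an assumption based on the interface addresses (for example 10.0.25.2 as R2 and 10.7.8.8 as R8), and the pop on R8 is assumed to be penultimate hop popping, since R11 receives only the VPN label:

Router   Transport label (in -> out)   VPN label
R5       push 2014                     push 11021
R2       2014 -> 1013                  11021 (untouched)
R1       1013 -> 7014                  11021 (untouched)
R7       7014 -> 8008                  11021 (untouched)
R8       8008 -> pop                   11021 (untouched)
R11      none                          11021 (popped, lookup in the VRF)

The swaps on the two ASBRs could be confirmed in the LFIB with something like:

R1#show mpls forwarding-table labels 1013
R7#show mpls forwarding-table labels 7014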

The VPN label is globally significant, for example R5 (PE of ISP1) learns the label of R11 (PE of ISP2):

R5#show bgp vpnv4 uni all 15.15.15.15/32
BGP routing table entry for 16:1:15.15.15.15/32, version 76
Paths: (1 available, best #1, table RED)
Flag: 0x100
  Advertised to update-groups:
     2         
  Refresh Epoch 1
  712 65222, (Received from a RR-client), imported path from 16:11:15.15.15.15/32 (global)
    10.11.11.11 (metric 1) (via default) from 10.4.4.4 (10.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, internal, best
      Extended Community: RT:16:10
      mpls labels in/out nolabel/11021
      rx pathid: 0, tx pathid: 0x0

As we can see above with the pcap, R5 sends the packet with a VPN label of 11021 to R11.
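
The RT in that output is what ties the two providers' VRFs together. A minimal sketch of the matching VRF on R5, pieced together from the output above: the name RED, the RD 16:1 and the imported RT 16:10 come from it, while the vrf definition syntax and the export RT are assumptions, since this post doesn't show the actual VRF config:

R5(config)#vrf definition RED
R5(config-vrf)#rd 16:1
R5(config-vrf)#address-family ipv4
R5(config-vrf-af)#route-target import 16:10
R5(config-vrf-af)#route-target export 16:10

R11's VRF (RD 16:11 in the RR output) exports RT 16:10, and it would need to import whatever R5 exports for the routes to land in both tables.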