All posts by mdinham

MTU settings on Junos & IOS (part 5) with MPLS L2 VPN

This post follows on from part 4, but this time we’ll be configuring a Layer-2 Ethernet to Ethernet MPLS VPN between the 2 CEs.

MPLS L2VPN

I’m going to configure a Martini Layer 2 VPN. Martini uses LDP to signal and set up the VPN across the MPLS network.

With MPLS L2 VPNs there will also be a minimum of two labels. The top label is the transport label and the bottom label is the VC label. The transport label is swapped hop by hop through the MPLS network. The VC label is used by the egress PE router to identify the Virtual Circuit that the incoming packet relates to.

On an Ethernet segment, the frame would look like this for a payload of 1500 bytes.

l2vpn

  • L2 Header – 14 bytes. This is the Ethernet header for the frame. 14 bytes, or 18 bytes if the interface is an 802.1q trunk.
  • MPLS Label – 4 bytes. The transport label.
  • VC Label – 4 bytes. Identifies the virtual circuit on the egress PE.
  • Control Word – 4 bytes. This is an optional field only required when transporting FR or ATM, carrying additional L2 protocol information.
  • L2 Data – 1514 – 1518 bytes. The encapsulated Layer-2 frame that is being transmitted across the MPLS network. In this case 1500 bytes of data, plus the L2 header (an additional 4 bytes would be added if the VPN interface is a trunk).

This means that we will be putting a minimum of 1540 bytes (14 + 4 + 4 + 4 + 1514) of data on the wire if there are 1500 bytes of customer data in the encapsulated L2 frame. If the customer interface is a trunk and the SP interface is a trunk, then we are up to 1548 bytes on the wire across the P network.
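The sum above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function and constant names are made up for illustration, and the FCS is left out, as it is in the figures above.

```python
# Sizes in bytes, taken from the field list above (FCS not counted).
ETH_HEADER = 14     # untagged Ethernet header
DOT1Q_TAG = 4       # extra bytes when an interface is an 802.1q trunk
MPLS_LABEL = 4      # one label (transport or VC)
CONTROL_WORD = 4    # optional pseudowire control word


def l2vpn_wire_bytes(payload, customer_tagged=False, core_tagged=False,
                     labels=2, control_word=True):
    """Bytes placed on a core link for one encapsulated customer frame."""
    customer_frame = payload + ETH_HEADER + (DOT1Q_TAG if customer_tagged else 0)
    core_header = ETH_HEADER + (DOT1Q_TAG if core_tagged else 0)
    overhead = labels * MPLS_LABEL + (CONTROL_WORD if control_word else 0)
    return core_header + overhead + customer_frame


print(l2vpn_wire_bytes(1500))                                          # 1540
print(l2vpn_wire_bytes(1500, customer_tagged=True, core_tagged=True))  # 1548
```

Setting `control_word=False` shows what the frame would look like if the control word were not negotiated.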

Lab

For this lab, I’ll be using the topology below. The base configurations are using OSPF as the routing protocol and LDP to exchange transport labels.

mplsl2vpn

Software revisions are as follows

  • CE1, CE2, P, PE1: IOS (Cisco 7200 12.4(24)T)
  • PE2: Junos (Firefly 12.1X46)

The base configs are similar to part 3, using OSPF as the IGP and LDP to signal transport labels, so I’ll jump straight in to the Martini VPN config.

PE1

The configuration could not be simpler. The VCID must match at both sides and is set to 12.

interface FastEthernet1/0
 no ip address
 duplex full
 xconnect 2.2.2.2 12 encapsulation mpls
!

PE2

Not much to it on Junos either. Notice that I enable LDP on the loopback.

interfaces {
    ge-0/0/1 {
        encapsulation ethernet-ccc;
        unit 0 {
            family ccc;
        }
    }
}
protocols {
    ldp {
        interface lo0.0;
    }
    l2circuit {
        neighbor 1.1.1.1 {
            interface ge-0/0/1.0 {
                virtual-circuit-id 12;
            }
        }
    }
}

Let’s check that the circuit is up

PE1#sh mpls l2transport vc detail
Local interface: Fa1/0 up, line protocol up, Ethernet up
  Destination address: 2.2.2.2, VC ID: 12, VC status: up
    Output interface: Gi0/0, imposed label stack {18 299776}
    Preferred path: not configured
    Default path: active
    Next hop: 192.168.34.4
  Create time: 03:48:16, last status change time: 00:15:19
  Signaling protocol: LDP, peer 2.2.2.2:0 up
    MPLS VC labels: local 22, remote 299776
    Group ID: local 0, remote 0
    MTU: local 1500, remote 1500
    Remote interface description:
  Sequencing: receive disabled, send disabled
  VC statistics:
    packet totals: receive 3114, send 3124
    byte totals:   receive 319522, send 408559
    packet drops:  receive 0, seq error 0, send 3


root@firefly> show l2circuit connections extensive
Layer-2 Circuit Connections:

Legend for connection status (St)
EI -- encapsulation invalid      NP -- interface h/w not present
MM -- mtu mismatch               Dn -- down
EM -- encapsulation mismatch     VC-Dn -- Virtual circuit Down
CM -- control-word mismatch      Up -- operational
VM -- vlan id mismatch           CF -- Call admission control failure
OL -- no outgoing label          IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC    TM -- TDM misconfiguration
BK -- Backup Connection          ST -- Standby Connection
CB -- rcvd cell-bundle size bad  SP -- Static Pseudowire
LD -- local site signaled down   RS -- remote site standby
RD -- remote site signaled down  XX -- unknown

Legend for interface status
Up -- operational
Dn -- down
Neighbor: 1.1.1.1
    Interface                 Type  St     Time last up          # Up trans
    ge-0/0/1.0(vc 12)         rmt   Up     May  3 21:09:11 2014           1
      Remote PE: 1.1.1.1, Negotiated control-word: Yes (Null)
      Incoming label: 299776, Outgoing label: 22
      Negotiated PW status TLV: No
      Local interface: ge-0/0/1.0, Status: Up, Encapsulation: ETHERNET

Looks good!

My CE config is very simple: I just have a point-to-point interface between the two CEs and OSPF running across the L2 link. Here is CE1’s config

interface Loopback0
 ip address 11.11.11.11 255.255.255.255
 ip ospf 1 area 0
!
interface FastEthernet0/1
 ip address 192.168.12.1 255.255.255.0
 ip ospf 1 area 0
 duplex full
 speed 100
!

CE1#sh ip ro 22.22.22.22
Routing entry for 22.22.22.22/32
 Known via "ospf 1", distance 110, metric 2, type intra area
 Last update from 192.168.12.2 on FastEthernet0/1, 03:47:25 ago
 Routing Descriptor Blocks:
 * 192.168.12.2, from 22.22.22.22, 03:47:25 ago, via FastEthernet0/1
 Route metric is 2, traffic share count is 1

The obvious difference compared to L3 MPLS VPN is that the provider network has no involvement in customer routing.

What’s the maximum ping we can get from CE1 to CE2? The provider core is set to 1500 bytes on the PE-P-PE interfaces. It should be 1474, right? That is 1500 – 14 bytes of customer L2 header – 8 bytes for the 2 labels – 4 bytes for the control word. And it is.

CE1#ping 22.22.22.22 repeat 1 size 1474

Type escape sequence to abort.
Sending 1, 1474-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 56/56/56 ms
CE1#ping 22.22.22.22 repeat 1 size 1475

Type escape sequence to abort.
Sending 1, 1475-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
.
Success rate is 0 percent (0/1)

Let’s look at a capture on the PE1-facing interface of the P router.

l2vpncap

1514 bytes on the wire as expected.

Now how about I make the CE1-CE2 link dot1q? I’ve not changed anything on the PE routers, just enabled an 802.1q tagged interface on each CE. As there will be an extra 4 bytes of overhead, the maximum ping size will drop to 1470.

CE1#ping 22.22.22.22 repeat 1 size 1470

Type escape sequence to abort.
Sending 1, 1470-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 56/56/56 ms
CE1#ping 22.22.22.22 repeat 1 size 1471

Type escape sequence to abort.
Sending 1, 1471-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
.
Success rate is 0 percent (0/1)

We can see the extra header in the packet capture – exactly where you’d expect to see it – in between the CE generated Ethernet header and IP data.

l2vpncapdot1q

Now we know how the data on the wire changes when an MPLS L2 VPN is created, it’s easy to make provision for the additional overhead across the MPLS core by increasing MTUs accordingly.
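Both ping limits seen in this lab, and the core MTU needed to carry a full 1500-byte customer packet, come from the same overhead sum. The sketch below is my own arithmetic rather than anything configured in the lab: the function names are hypothetical, the MTU values exclude the core L2 header, and the 1526 and 1530 results are derived from the overheads listed earlier, not measured. How each platform counts the L2 header in its MTU setting still has to be allowed for, as covered earlier in the series.

```python
LABEL = 4          # bytes per MPLS label
CONTROL_WORD = 4   # bytes, negotiated on this pseudowire
ETH_HEADER = 14    # customer Ethernet header carried inside the pseudowire
DOT1Q_TAG = 4      # added when the CE link is 802.1q tagged


def pw_overhead(customer_tagged=False, labels=2, control_word=True):
    """Bytes added around the customer IP packet inside the core MPLS payload."""
    l2 = ETH_HEADER + (DOT1Q_TAG if customer_tagged else 0)
    return l2 + labels * LABEL + (CONTROL_WORD if control_word else 0)


def max_ce_ip_packet(core_mtu, customer_tagged=False):
    """Largest CE IP packet that fits a given core MTU (core L2 header excluded)."""
    return core_mtu - pw_overhead(customer_tagged)


def core_mtu_needed(ce_ip_packet, customer_tagged=False):
    """Core MTU (core L2 header excluded) needed for a CE IP packet of this size."""
    return ce_ip_packet + pw_overhead(customer_tagged)


print(max_ce_ip_packet(1500))                        # 1474
print(max_ce_ip_packet(1500, customer_tagged=True))  # 1470
print(core_mtu_needed(1500))                         # 1526
print(core_mtu_needed(1500, customer_tagged=True))   # 1530
```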

Thanks for reading my post. We’ve covered both Cisco and Juniper here, but be sure to check out other posts from the #JuniperFan bloggers here.

 

 

MTU settings on Junos & IOS (part 4) with MPLS L3 VPN

I had only intended to do 3 parts to this series, but I can’t really stop at part 3 (MPLS) without posting about how MPLS Layer 3 VPN affects MTU. In this post we’ll build a simple MPLS VPN network and our goal is to transmit 1500 bytes of IP data between the CEs.

MPLS VPN

With MPLS L3 VPNs there will be two labels. The top label being the transport label and the bottom label being the VPN label. The transport label will be swapped hop by hop through the MPLS network. The VPN label is used by the egress PE router to identify the VPN/VRF that the incoming packet relates to. Assuming PHP is being used, the transport label will be removed by the router before the egress PE (in this lab the “P” router), leaving only the VPN label in the stack.

IMG_1067
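As a rough illustration of the label stack described above, the sketch below lists the labels carried on each core link. The label values 18 and 16005 are borrowed from the CE3 to CE1 traceroute later in this post. The function name and the use of `None` to mean "transport label already popped by PHP" are my own conventions, not anything from the routers.

```python
LABEL_BYTES = 4  # each MPLS label is 4 bytes


def stacks_per_link(vpn_label, transport_labels):
    """Return the label stack (top first) seen on each core link.

    transport_labels holds the transport label used on each link in order;
    None means the penultimate hop has popped it (PHP).
    """
    stacks = []
    for transport in transport_labels:
        if transport is None:
            stacks.append([vpn_label])
        else:
            stacks.append([transport, vpn_label])
    return stacks


links = ["PE3 -> P", "P -> PE1"]
for link, stack in zip(links, stacks_per_link(16005, [18, None])):
    print(link, stack, LABEL_BYTES * len(stack), "bytes of labels")
```

The first link carries both labels (8 bytes), which is the worst case the MTU has to allow for.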

Lab

For this lab, I’ll be using the topology below. The base configurations are using OSPF as the routing protocol and LDP to exchange transport labels. A full mesh of MP-BGP VPNv4 sessions will be configured between the PE routers to exchange VPN labels.

MPLSpart4

Software revisions are as follows

  • CE1, CE2, CE3, PE3: IOS (Cisco 7200 12.4(24)T)
  • PE1: IOS-XR (IOS-XRv 5.1.1)
  • PE2: Junos (Firefly 12.1X46)

As with previous posts in this series I’m going to show what needs to be done to enable the required MTU. I’ll give some commentary on the MPLS config but will save the detailed analysis of MPLS control/data plane for another time.

PE1

PE1 is running IOS-XR. The relevant parts of the base config are as below:

interface Loopback0
 ipv4 address 1.1.1.1 255.255.255.255
!
interface GigabitEthernet0/0/0/0
 ipv4 address 192.168.14.1 255.255.255.0
!
router ospf 1
 area 0
  interface Loopback0
   passive enable
  !
  interface GigabitEthernet0/0/0/0
   network point-to-point
  !
 !
!
router bgp 1
 address-family vpnv4 unicast
 !
 neighbor 2.2.2.2
  remote-as 1
  update-source Loopback0
  address-family vpnv4 unicast
  !
 !
 neighbor 3.3.3.3
  remote-as 1
  update-source Loopback0
  address-family vpnv4 unicast
  !
 !
!
mpls ldp
 interface GigabitEthernet0/0/0/0
 !
!
end

PE2

PE2 is running Junos on Firefly in packet mode.

interfaces {
    ge-0/0/0 {
        unit 0 {
            family inet {
                address 192.168.24.2/24;
            }
            family mpls;
        }
    }
    lo0 {
        unit 0 {
            family inet {
                address 2.2.2.2/32;
            }
        }
    }
}
routing-options {
    autonomous-system 1;
}
protocols {
    mpls {
        interface ge-0/0/0.0;
    }
    bgp {
        group ibgp {
            type internal;
            family inet-vpn {
                unicast;
            }
            peer-as 1;
            neighbor 1.1.1.1;
            neighbor 3.3.3.3;
        }
    }
    ospf {
        area 0.0.0.0 {
            interface ge-0/0/0.0 {
                interface-type p2p;
            }
            interface lo0.0 {
                passive;
            }
        }
    }
    ldp {
        interface ge-0/0/0.0;
    }
}
security {
    forwarding-options {
        family {
            mpls {
                mode packet-based;
            }
        }
    }
}

PE3

PE3 is running IOS

interface Loopback0
 ip address 3.3.3.3 255.255.255.255
 ip ospf 1 area 0
!
interface GigabitEthernet0/0
 ip address 192.168.34.3 255.255.255.0
 ip ospf network point-to-point
 ip ospf 1 area 0
 mpls ip
!
router ospf 1
 log-adjacency-changes
!
router bgp 1
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 1.1.1.1 remote-as 1
 neighbor 1.1.1.1 update-source Loopback0
 neighbor 2.2.2.2 remote-as 1
 neighbor 2.2.2.2 update-source Loopback0
 !
 address-family vpnv4
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
 exit-address-family
 !
!

PE1-CE1 Routing

Let’s go ahead and enable the routing and VRF configuration between PE1 and CE1. We’ll use BGP as the PE-CE routing protocol.

CE1. There isn’t anything special about the configuration on the CE. BGP is used as the PE-CE routing protocol and is advertising the local interfaces to PE1.

interface Loopback0
 ip address 11.11.11.11 255.255.255.255
!
interface FastEthernet0/0
 ip address 192.168.11.1 255.255.255.0
!
router bgp 11
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 192.168.11.2 remote-as 1
 !
 address-family ipv4
  neighbor 192.168.11.2 activate
  no auto-summary
  no synchronization
  network 11.11.11.11 mask 255.255.255.255
  network 192.168.11.0
 exit-address-family
!

PE1. All the magic happens on the PE routers. Here the VRFs are defined and we add the BGP neighbor – note that the PE-CE BGP configuration is VRF based.

vrf vpn
 address-family ipv4 unicast
  import route-target
   1:1
  !
  export route-target
   1:1
  !
 !
!
interface GigabitEthernet0/0/0/1
 vrf vpn
 ipv4 address 192.168.11.2 255.255.255.0
!
router bgp 1
 vrf vpn
  rd 1.1.1.1:1
  address-family ipv4 unicast
  !
  neighbor 192.168.11.1
   remote-as 11
   address-family ipv4 unicast
   !
  !
 !
!

On XR, if I don’t configure an inbound and outbound route policy for the CE1 session then no routes will be sent or received. I’ve defined a simple policy to allow all routes to be sent and received.

route-policy announce
  pass
end-policy
!
router bgp 1
 vrf vpn
  neighbor 192.168.11.1
   address-family ipv4 unicast
    route-policy announce in
    route-policy announce out
   !
  !
 !
!

PE2-CE2 Routing

Here we’ll use OSPF as the PE-CE routing protocol.

PE2 is running Junos and most of the configuration happens in a routing instance stanza.

IP addresses are added to the interface like normal:

interfaces {
    ge-0/0/1 {
        unit 0 {
            family inet {
                address 192.168.22.2/24;
            }
        }
    }
}

The routing instance defines the VRF, the route-distinguisher and the route-targets to import and export. All the OSPF configuration takes place here as well. I have an export policy configured so that the BGP VRF routes are exported in to OSPF and distributed onward to CE2.

routing-instances {
    vpn {
        instance-type vrf;
        interface ge-0/0/1.0;
        route-distinguisher 2.2.2.2:1;
        vrf-target {
            import target:1:1;
            export target:1:1;
        }
        protocols {
            ospf {
                export bgp2ospf;
                area 0.0.0.0 {
                    interface ge-0/0/1.0 {
                        interface-type p2p;
                    }
                }
            }
        }
    }
}
policy-options {
    policy-statement bgp2ospf {
        from protocol bgp;
        then accept;
    }
}

PE3-CE3 Routing

PE3 is running IOS. Again, OSPF is used as the PE-CE routing protocol. Mutual redistribution is configured between the VRF based BGP and OSPF protocols.

ip vrf vpn
 rd 3.3.3.3:1
 route-target export 1:1
 route-target import 1:1
!
interface FastEthernet1/0
 ip vrf forwarding vpn
 ip address 192.168.33.3 255.255.255.0
 ip ospf network point-to-point
 ip ospf 33 area 0
!
router ospf 33 vrf vpn
 log-adjacency-changes
 redistribute bgp 1 subnets
!
router bgp 1
 address-family ipv4 vrf vpn
  redistribute ospf 33 vrf vpn
  no synchronization
 exit-address-family
!

VPN Testing

At this point we have an any to any MPLS VPN configured between the 3 CE routers. They should have full reachability, let’s see!

CE1#ping 22.22.22.22

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/46/56 ms
CE1#ping 33.33.33.33

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 33.33.33.33, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/65/92 ms

CE2#ping 33.33.33.33

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 33.33.33.33, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/60/68 ms
CE2#ping 11.11.11.11

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 11.11.11.11, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/44/68 ms
CE2#

CE3#ping 11.11.11.11

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 11.11.11.11, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 52/60/72 ms
CE3#ping 22.22.22.22

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 60/64/68 ms

Looks good!

But at this point the MTU settings on the PE-P-PE links are all set to the default of 1500 bytes, so a 1500 byte ping isn’t going to work.

CE1#ping 22.22.22.22 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
Packet sent with the DF bit set
M
Success rate is 0 percent (0/1)
CE1#
*Apr 26 17:25:27.099: ICMP: dst (192.168.11.1) frag. needed and DF set unreachable rcv from 192.168.14.1
CE1#ping 33.33.33.33 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 33.33.33.33, timeout is 2 seconds:
Packet sent with the DF bit set
M
Success rate is 0 percent (0/1)
CE1#
*Apr 26 17:25:33.855: ICMP: dst (192.168.11.1) frag. needed and DF set unreachable rcv from 192.168.14.1

OK, we are getting an ICMP unreachable back from PE1 exactly as expected.

We know that MPLS VPNs add 2 labels to the stack, and therefore add an extra 8 bytes of overhead, so all we need to do is increase the interface MTU to allow this extra data. From previous posts in this series we know how the different software does things.

On PE1, the IOS-XR router, the interface MTU will be increased to 1522 (1514 + 8), and the MPLS MTU will be set to 1508.

On P, the interface MTU will be increased to 1508 and the MPLS MTU will be set to 1508. The same config will be required on PE3 as both are running IOS.

On PE2, the Junos router, the interface MTU will be increased to 1522 (1514 + 8), and the MPLS MTU will be set to 1508.
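The three sets of values above all come from the same sum. Here is a minimal sketch of it, assuming untagged core links with a 14-byte Ethernet header; the function name and dictionary keys are illustrative only.

```python
LABEL = 4        # bytes per MPLS label
ETH_HEADER = 14  # untagged Ethernet header, counted by XR and Junos in the interface MTU


def l3vpn_mtus(ip_payload=1500, labels=2):
    """MTU values needed to carry ip_payload bytes of IP under a label stack."""
    mpls_mtu = ip_payload + labels * LABEL
    return {
        "mpls_mtu": mpls_mtu,                          # all platforms
        "interface_mtu_ios": mpls_mtu,                 # IOS excludes the L2 header
        "interface_mtu_xr_junos": mpls_mtu + ETH_HEADER,  # XR and Junos include it
    }


print(l3vpn_mtus())
# {'mpls_mtu': 1508, 'interface_mtu_ios': 1508, 'interface_mtu_xr_junos': 1522}
```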

Verification

CE1#ping 22.22.22.22 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 24/24/24 ms
CE1#ping 22.22.22.22 df-bit re 1 size 1500
*Apr 26 17:41:02.223: ICMP: echo reply rcvd, src 22.22.22.22, dst 192.168.11.1
CE1#ping 33.33.33.33 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 33.33.33.33, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 52/52/52 ms
CE1#
*Apr 26 17:41:05.975: ICMP: echo reply rcvd, src 33.33.33.33, dst 192.168.11.1

CE2#ping 11.11.11.11 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 11.11.11.11, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 20/20/20 ms
CE2#ping 33.33.33.33 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 33.33.33.33, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 4/4/4 ms

CE3#ping 11.11.11.11 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 11.11.11.11, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 44/44/44 ms
CE3#ping 22.22.22.22 df-bit re 1 size 1500

Type escape sequence to abort.
Sending 1, 1500-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
Packet sent with the DF bit set
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 56/56/56 ms

CE3#traceroute 11.11.11.11

Type escape sequence to abort.
Tracing the route to 11.11.11.11

  1 192.168.33.3 20 msec 16 msec 8 msec
  2 192.168.34.4 [MPLS: Labels 18/16005 Exp 0] 44 msec 8 msec 36 msec
  3 192.168.14.1 [MPLS: Label 16005 Exp 0] 60 msec 8 msec 60 msec
  4 192.168.11.1 8 msec 64 msec 8 msec

CE3#traceroute 22.22.22.22

Type escape sequence to abort.
Tracing the route to 22.22.22.22

  1 192.168.33.3 36 msec 28 msec 4 msec
  2 192.168.34.4 [MPLS: Labels 17/299888 Exp 0] 12 msec 56 msec 12 msec
  3 192.168.24.2 [MPLS: Label 299888 Exp 0] 28 msec 28 msec 28 msec
  4 192.168.22.1 60 msec 12 msec 32 msec

CE1#traceroute 33.33.33.33

Type escape sequence to abort.
Tracing the route to 33.33.33.33

  1 192.168.11.2 8 msec 8 msec 8 msec
  2 192.168.14.4 [MPLS: Labels 16/20 Exp 0] 40 msec 16 msec 36 msec
  3 192.168.33.3 [AS 1] [MPLS: Label 20 Exp 0] 20 msec 16 msec 16 msec
  4 192.168.33.1 [AS 1] 48 msec 44 msec 40 msec

Success!

In this simple lab we configured an any to any MPLS VPN and verified the settings required for 1500 bytes of payload to be transmitted across our MPLS VPN network.

In a later post I will come back to this topology and go through the MPLS configuration in more detail.

There will also be a part 5 which will cover MPLS L2 VPN (pseudowire).

CCIE recertified, CCDE Written Exam Review

This week I took the CCDE written to recertify my CCIEs. That chalks up 8 years now of being a CCIE.

Firstly, why did I recert? Ethan Banks recently wrote a great post “Is the CCIE Certification Losing value?” in which Ethan talks about his upcoming re-certification deadline, and if he’ll take the exam or not.

I decided to recertify for a few simple reasons:

  • It took a shed load of time and effort to become a dual CCIE. So much went in to achieving the R&S and SP, I’m not ready to let them go just yet.
  • In comparison to the actual lab, the written is a fairly simple way to keep the certs alive. I don’t necessarily agree with being able to recertify any CCIE by taking a single expert level exam, but as I don’t have much spare time,  it sure makes life easier!
  • Having the CCIE is a meaningful and simple way for clients to gauge my skill level, to a point of course. As with many CCIEs, my experience extends way beyond working on the CLI (which is one of the reasons I originally became interested in the CCDE).
  • In my opinion, the CCIE is valuable and isn’t going away any time soon. Whilst it has a place I’ll aim to keep the badges.
  • I like learning.

One thing for sure though – if you are working on the CCIE right now, great, but when you’ve finished I think it’s important to learn more than just networking. How about Python? Do you know virtualisation? What do you know about SDN? etc

352-001

Now on to the exam itself. I took the CCDE v1 written exam at the last recert, when I thought I might take the CCDE practical. I still might, who knows, but this time I wanted to check out the v2 exam.

The 1st thing to note about the exam blueprint is that it’s huge. You can be tested on pretty much anything – networking, wireless, datacentre, standards, process, etc. There is no formal training requirement for this exam, just a load of books to read. Check out the list. Over the years I’ve read most of the books on this list already cover to cover. Some more than once! Having a Safari books subscription is a great investment. If you don’t have access, I’d recommend it – particularly whilst you are working up to taking an exam. This time around I re-read my notes from last time, picked out the sections from the blueprint that I wanted to improve on, and went to town on the book list, CCO etc.

The exam itself is quite challenging – it is very different to other Cisco exams that I’ve taken in the past. Whilst it might not always go in to the depth and nuts and bolts that you’d see right the way through, for example, the CCIE SP written, there’s a lot to the questions and you need to be prepared to take time to think clearly through several different designs, technologies & concepts before finding the best answer and clicking that next button.

You can go from R&S topics, to SP, to Datacentre! Everything is in there. But, in my opinion, it is a fair exam with a good distribution of the topics. Most of the questions were quite clear, but as can be expected there were several that I left comments on – some of the questions were just crazy, but that’s exams for you…

Overall I enjoyed the exam, and that’s the CCIE re-certification done now for another 2 years. Next time around it will be 10 years! That’s flown by.

If you are taking the CCIE or CCDE exams right now, or any vendor expert level certification for that matter, best of luck with the journey. It’s hard work, but most importantly make sure you enjoy the ride!

No rest for me yet though, now I’m working through the Juniper certs…

 

 

 

MTU settings on Junos & IOS (part 3) with RSVP MPLS/802.1q

Part 1 of this series focussed on the interface MTU configuration, looking at how different vendors implement the setting. Some include the layer 2 headers, some don’t. Part 2 looked at Jumbo Frames, IP MTU and OSPF. In this post we’ll build a simple 4 node MPLS network and check out how the default MTU settings affect the transmission of data. We’ll also enable a couple of 802.1q tagged interfaces. The configuration will be adjusted accordingly to enable the transmission of 2000 bytes of IP payload across the MPLS core.

802.1q Frame

First of all a quick refresher on what an Ethernet frame looks like if it contains an 802.1q tag in the header.

802.1q frame

802.1q does not encapsulate the Ethernet frame, simply a new 4 byte header is inserted between the source MAC address and the Ethertype. Now with 1500 bytes payload, the frame size has increased to 1522 bytes (1518 not counting the FCS). The 802.1q tag consists of the following fields:

  • TPID: Tag Protocol ID. A 16-bit field with a value set to 0x8100 to identify the frame as IEEE 802.1q tagged
  • Priority. A 3 bit field which refers to the IEEE 802.1p priority. i.e. class of service
  • DEI: Drop eligible indicator. A 1-bit field used to indicate if frames can be dropped.
  • VID: VLAN ID. A 12-bit field specifying the VLAN.

Multiple 802.1q tags can be present in the frame. This is known as IEEE 802.1ad Q-in-Q. More on this another time.
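To make the field layout concrete, here is a small sketch that packs and unpacks a single 802.1q tag using Python's `struct` module. The helper names are my own, and VLAN 23 is used only because it is the VLAN ID that appears later in this lab.

```python
import struct

TPID_8021Q = 0x8100  # identifies the frame as 802.1q tagged


def pack_dot1q(vid, priority=0, dei=0):
    """Build the 4-byte tag: 16-bit TPID, then 3-bit priority, 1-bit DEI, 12-bit VID."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VID must fit in 12 bits")
    if not 0 <= priority <= 7 or dei not in (0, 1):
        raise ValueError("priority is 3 bits, DEI is 1 bit")
    tci = (priority << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)


def unpack_dot1q(tag):
    """Split a 4-byte tag back into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {"tpid": tpid, "priority": tci >> 13, "dei": (tci >> 12) & 1, "vid": tci & 0xFFF}


tag = pack_dot1q(vid=23)
print(len(tag), tag.hex())   # 4 81000017
print(unpack_dot1q(tag))
```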

MPLS Frame

Also let’s take a look at an Ethernet Frame containing an MPLS header.

MPLS Frame

The 32-bit MPLS label is inserted between the L2 header and the protocol data. This is why it’s sometimes called a shim header. The ethertype is set to 0x8847 to identify the payload as MPLS. The MPLS label contains the following fields:

  • Label. The label itself – 20 bits.
  • EXP. A 3-bit field used for QoS markings.
  • S. 1-bit to represent if a label is the last in the stack.
  • TTL. An 8-bit time to live field.

As the S bit implies, there can be a stack of labels. More on this another time.
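The four fields can be packed in to the 32-bit shim like this. It is a sketch only: the helper names are hypothetical, and label 16000 is borrowed from the traceroute output later in this post.

```python
import struct


def pack_label(label, exp=0, s=1, ttl=255):
    """Build one 4-byte MPLS label: 20-bit label, 3-bit EXP, 1-bit S, 8-bit TTL."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def unpack_label(raw):
    """Split a 4-byte MPLS label back into its fields."""
    (word,) = struct.unpack("!I", raw)
    return {"label": word >> 12, "exp": (word >> 9) & 0x7,
            "s": (word >> 8) & 0x1, "ttl": word & 0xFF}


# A two-label stack: only the bottom entry has S set.
stack = pack_label(16000, s=0) + pack_label(299824, s=1)
print(len(stack))                 # 8
print(unpack_label(stack[:4]))
print(unpack_label(stack[4:]))
```

Each extra entry in the stack adds another 4 bytes in front of the protocol data, which is where the MTU overhead comes from.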

Lab

For this lab, I’ll be using the topology below. We’ll start with a base IP/OSPF configuration and add MPLS. MTUs will be adjusted to enable a 2000 byte MPLS payload to be transmitted across the topology.

MPLS Topology

Software revisions are as follows

  • IOS (Cisco 7200 12.4(24)T)
  • IOS-XR (IOS-XRv 5.1.1)
  • Junos (Olive 12.3R5.7)

R1 is running IOS, the base configuration is as below.

interface Loopback0
 ip address 1.1.1.1 255.255.255.255
!
interface GigabitEthernet1/0
 ip address 192.168.12.1 255.255.255.0
 ip ospf network point-to-point
 negotiation auto
!
router ospf 1
 log-adjacency-changes
 network 1.1.1.1 0.0.0.0 area 0
 network 192.168.12.1 0.0.0.0 area 0
!

MTUs are at the defaults

R1#show interfaces g1/0 | i MTU
  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
R1#show ip interface g1/0 | i MTU
  MTU is 1500 bytes

R2 is running IOS-XR with a configuration as below

interface Loopback0
 ipv4 address 2.2.2.2 255.255.255.255
!
interface GigabitEthernet0/0/0/0
!
interface GigabitEthernet0/0/0/0.23
 description Link to R3
 ipv4 address 192.168.23.2 255.255.255.0
 encapsulation dot1q 23
!
interface GigabitEthernet0/0/0/1
 description Link to R1
 ipv4 address 192.168.12.2 255.255.255.0
!
router ospf 1
 area 0
  interface Loopback0
   passive enable
  !
  interface GigabitEthernet0/0/0/0.23
   network point-to-point
  !
  interface GigabitEthernet0/0/0/1
   network point-to-point
  !
 !
!

There is nothing special about the link from R1 to R2, the MTU is set to the default 1514, so we can expect 1500 to be the IP MTU. Notice that the link from R2 to R3 is an 802.1q tagged interface.

XR has automatically increased the MTU on the 802.1q subinterface to 1518, to keep 1500 bytes available to IP.

RP/0/0/CPU0:R2-XRv#show ipv4 interface GigabitEthernet0/0/0/0.23 | i MTU
  MTU is 1518 (1500 is available to IP)

Here is the Junos configuration on R3.

interfaces {
    em0 {
        vlan-tagging;
        unit 23 {
            vlan-id 23;
            family inet {
                address 192.168.23.3/24;
            }
        }
    }
    em1 {
        unit 0 {
            family inet {
                address 192.168.34.3/24;
            }
        }
    }
    lo0 {
        unit 0 {
            family inet {
                address 3.3.3.3/32;
            }
        }
    }
}
protocols {
    ospf {
        area 0.0.0.0 {
            interface lo0.0 {
                passive;
            }
            interface em0.23 {
                interface-type p2p;
            }
            interface em1.0 {
                interface-type p2p;
            }
        }
    }
}

On the tagged interface to VLAN23, Junos has also increased the MTU to 1518 to accommodate the VLAN tag and enable 1500 bytes of protocol data. Note, if you configure the interface MTU manually on XR or Junos you’d still need to allow for the dot1q tags.

matt@R3-Junos> show interfaces em0 | match "Phy|Log|MTU|Tag"
Physical interface: em0, Enabled, Physical link is Up
  Type: Ethernet, Link-level type: Ethernet, MTU: 1518
  Logical interface em0.23 (Index 65) (SNMP ifIndex 506)
    Flags: SNMP-Traps VLAN-Tag [ 0x8100.23 ]  Encapsulation: ENET2
    Protocol inet, MTU: 1500

I like how Junos includes the VLAN Tag 0x8100.23 in the output to indicate an 802.1q tag of 23.

Jumping over to R4, the configuration is as below

interfaces {
    em1 {
        unit 0 {
            family inet {
                address 192.168.34.4/24;
            }
        }
    }
    lo0 {
        unit 0 {
            family inet {
                address 4.4.4.4/32;
            }
        }
    }
}
protocols {
    ospf {
        area 0.0.0.0 {
            interface lo0.0 {
                passive;
            }
            interface em1.0 {
                interface-type p2p;
            }
        }
    }
}

The MTU will be 1514 for the Physical, and 1500 for family inet.

matt@R4-Junos> show interfaces em1 | match "Phy|Log|MTU"
Physical interface: em1, Enabled, Physical link is Up
  Type: Ethernet, Link-level type: Ethernet, MTU: 1514
  Logical interface em1.0 (Index 69) (SNMP ifIndex 24)
    Protocol inet, MTU: 1500

At this point the biggest ping that we’ll be able to get from R4 to R1’s loopback address 1.1.1.1 is going to be 1472, representing 1500 bytes of IP data and 1514 on the wire (1472 data + 8 ICMP header + 20 IP header + 14 Ethernet).

matt@R4-Junos> ping rapid do-not-fragment count 1 size 1472 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 1472 data bytes
!
--- 1.1.1.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 14.124/14.124/14.124/nan ms
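The ping size used above can be worked out from the IP MTU. A quick sketch, assuming a 20-byte IPv4 header with no options and an untagged link; the function names are just for illustration. The Junos `size` value is the ICMP data only, which is why it is 1472 here rather than 1500.

```python
IP_HEADER = 20    # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header
ETH_HEADER = 14   # untagged Ethernet header (FCS not counted)


def junos_ping_size(ip_mtu):
    """Largest ICMP data size that fits in one unfragmented IP packet."""
    return ip_mtu - IP_HEADER - ICMP_HEADER


def wire_bytes(ping_data, l2_header=ETH_HEADER):
    """Bytes on the wire for a ping carrying ping_data bytes of ICMP data."""
    return ping_data + ICMP_HEADER + IP_HEADER + l2_header


print(junos_ping_size(1500))   # 1472
print(wire_bytes(1472))        # 1514
```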

MPLS

MPLS will now be enabled across the topology. LSPs will be RSVP signalled and I’ll demonstrate how the MTU settings need to be updated. As this post is focussing on the approach different vendors take to MTU, I’ll skip the MPLS detail and save it for another time, but the relevant MPLS config is below.

On R1 I’ve enabled a Tunnel interface and statically routed IP traffic to R4’s Loopback via this MPLS TE tunnel. LDP has not been enabled and doesn’t need to be.

mpls traffic-eng tunnels
!
interface Tunnel0
 ip unnumbered Loopback0
 tunnel destination 4.4.4.4
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng path-option 1 dynamic
 no routing dynamic
!
interface GigabitEthernet1/0
 mpls traffic-eng tunnels
!
router ospf 1
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0
!
ip route 4.4.4.4 255.255.255.255 Tunnel0

R2. In IOS-XR, the interfaces are added to the protocol.

router ospf 1
 area 0
  mpls traffic-eng
 !
 mpls traffic-eng router-id Loopback0
!
rsvp
 interface GigabitEthernet0/0/0/1
 !
 interface GigabitEthernet0/0/0/0.23
 !
!
mpls traffic-eng
 interface GigabitEthernet0/0/0/1
 !
 interface GigabitEthernet0/0/0/0.23

R3. On Junos we add the Interfaces to the protocols MPLS and RSVP and enable family mpls on the logical unit.

interfaces {
    em0 {
        unit 23 {            
            family mpls;
        }
    }
    em1 {
        unit 0 {
            family mpls;
        }
    }
}
protocols {
    rsvp {
        interface em0.23;
        interface em1.0;
    }
    mpls {
        interface em0.23;
        interface em1.0;
    }
    ospf {
        traffic-engineering;
    }
}

R4. I’ve created a dynamic label switched path to R1. This path gets installed in table inet.3 but as inet.3 is only used for BGP next hops by default, my traffic test from R4 to 1.1.1.1 won’t be labelled. I want to keep this lab simple, so I’ve used traffic-engineering mpls-forwarding to ensure that the LSP to 1.1.1.1 is placed in to table inet.0

interfaces {
    em1 {
        unit 0 {
            family mpls;
        }
    }
}
protocols {
    rsvp {
        interface em1.0;
    }
    mpls {
        traffic-engineering mpls-forwarding;
        label-switched-path R4toR1 {
            to 1.1.1.1;
        }
        interface em1.0;
    }
    ospf {
        traffic-engineering;
    }
}

matt@R4-Junos> show route table inet.0 1.1.1.1

inet.0: 9 destinations, 11 routes (9 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

1.1.1.1/32         @[OSPF/10] 00:09:15, metric 4
                    > to 192.168.34.3 via em1.0
                   #[RSVP/7/1] 00:05:45, metric 4
                    > to 192.168.34.3 via em1.0, label-switched-path R4toR1

OK, so at this point we have a functioning MPLS topology with traffic from R1 to R4 (4.4.4.4) and from R4 to R1 (1.1.1.1) being label switched.

R1#traceroute 4.4.4.4

Type escape sequence to abort.
Tracing the route to 4.4.4.4

  1 192.168.12.2 [MPLS: Label 16000 Exp 0] 32 msec 20 msec 8 msec
  2 192.168.23.3 [MPLS: Label 299824 Exp 0] 8 msec 8 msec 12 msec
  3 4.4.4.4 12 msec 16 msec 8 msec

matt@R4-Junos> traceroute 1.1.1.1
traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 40 byte packets
 1  192.168.34.3 (192.168.34.3)  1.050 ms  0.863 ms  0.765 ms
     MPLS Label=299792 CoS=0 TTL=1 S=1
 2  192.168.23.2 (192.168.23.2)  5.398 ms  3.393 ms  10.241 ms
     MPLS Label=16001 CoS=0 TTL=1 S=1
 3  192.168.12.1 (192.168.12.1)  10.008 ms  7.667 ms  4.486 ms

The MTU settings are still at the defaults, so let’s see what the maximum size of IP payload is that we are able to transmit. We know that only one label is in the stack, so will the maximum be 1496?

R1#ping 4.4.4.4 rep 2 df-bit size 1488

Type escape sequence to abort.
Sending 2, 1488-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
!!
Success rate is 100 percent (2/2), round-trip min/avg/max = 12/14/16 ms
R1#ping 4.4.4.4 rep 2 df-bit size 1489

Type escape sequence to abort.
Sending 2, 1489-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
..
Success rate is 0 percent (0/2)

matt@R4-Junos> ping rapid do-not-fragment count 2 size 1460 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 1460 data bytes
!!
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 5.558/6.949/8.340/1.391 ms

matt@R4-Junos> ping rapid do-not-fragment count 2 size 1461 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 1461 data bytes
..
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

No, the maximum that could be transmitted in both directions was 1488 bytes of IP data. Don’t forget that for Junos, 28 bytes of headers are added to the ping size.
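The two ping conventions can be reconciled with a little arithmetic. The Python below is only a sketch of that conversion: the function names are hypothetical, and it assumes plain IPv4 with no options (20 byte header) and an 8 byte ICMP echo header.

```python
IP_HEADER = 20    # IPv4 header without options (assumption)
ICMP_HEADER = 8   # ICMP echo header

def ios_size_to_junos_size(ios_size):
    """IOS 'size' is the whole IP packet; Junos 'size' is the ICMP data only."""
    return ios_size - IP_HEADER - ICMP_HEADER

def junos_size_to_ios_size(junos_size):
    """Add the headers back to get the IP packet size that IOS would quote."""
    return junos_size + ICMP_HEADER + IP_HEADER

print(ios_size_to_junos_size(1488))  # 1460
print(junos_size_to_ios_size(1472))  # 1500
```

So the IOS ping of 1488 and the Junos ping of 1460 are the same size of IP packet.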

By default Junos derives the MPLS MTU from the Interface settings and makes allowance for 3 labels in the stack, so with the default 1514 byte interface MTU (1500 bytes of payload), the MPLS MTU would be set to 1488. This is why we can only ping with 1488 bytes of protocol data.
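The default derivation is simple enough to sketch in a few lines of Python. This only illustrates the arithmetic described above, not how Junos implements it; the function name and constants are hypothetical.

```python
LABEL_BYTES = 4          # size of one MPLS label stack entry
DEFAULT_MAX_LABELS = 3   # the default label allowance described above

def junos_default_mpls_mtu(payload_mtu, max_labels=DEFAULT_MAX_LABELS):
    """MPLS MTU derived when none is configured: payload minus label allowance."""
    return payload_mtu - max_labels * LABEL_BYTES

print(junos_default_mpls_mtu(1500))  # 1488
```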

On both of the Junos routers I’m going to go ahead and set the Interface MTU to 9192, but since my goal is to test an MPLS switched payload of 2000 bytes IP data, I’ll set the MPLS MTU and leave the IP MTU at 1500 bytes. Note if the MPLS MTU is user configured then the configured value includes the labels. I’ll set the MPLS MTU to 2004 to allow the 2000 bytes of data and for this test I only need to make allowance for 1 label.

em0 {
    vlan-tagging;
    mtu 9192;
    unit 23 {
        vlan-id 23;
        family inet {
            mtu 1500;
            address 192.168.23.3/24;
        }
        family mpls {
            mtu 2004;
        }
    }
}
em1 {
    mtu 9192;
    unit 0 {
        family inet {
            mtu 1500;
            address 192.168.34.3/24;
        }
        family mpls {
            mtu 2004;
        }
    }
}

matt@R3-Junos> show interfaces em[01]* | match "em|proto"
Physical interface: em0, Enabled, Physical link is Up
  Logical interface em0.23 (Index 72) (SNMP ifIndex 506)
    Protocol inet, MTU: 1500
    Protocol mpls, MTU: 2004, Maximum labels: 3
Physical interface: em1, Enabled, Physical link is Up
  Logical interface em1.0 (Index 65) (SNMP ifIndex 24)
    Protocol inet, MTU: 1500
    Protocol mpls, MTU: 2004, Maximum labels: 3

I love how Junos allows a regexp to be used pretty much however I feel like using it.

At this point the largest IP packet that can be pinged from either side of the network is 1496 bytes. This is as expected because we have 1 label in the stack and the Cisco routers are still set to 1500 bytes on the Interface MTU.

Let’s have a look what’s going on with the interface MTU parameters on the Ciscos.

R1#show mpls interfaces g1/0 detail
Interface GigabitEthernet1/0:
        IP labeling enabled (ldp):
          Interface config
        LSP Tunnel labeling not enabled
        BGP labeling not enabled
        MPLS operational
        MTU = 1500

RP/0/0/CPU0:R2-XRv#show im database interface GigabitEthernet0/0/0/0.23

View: OWN - Owner, L3P - Local 3rd Party, G3P - Global 3rd Party, LDP - Local Data Plane
      GDP - Global Data Plane, RED - Redundancy, UL - UL

Node 0/0/CPU0 (0x0)

Interface GigabitEthernet0/0/0/0.23, ifh 0x00000700 (up, 1518)
  Interface flags:          0x0000000000800597 (ROOT_IS_HW|IFINDEX
                            |SUP_NAMED_SUB|BROADCAST|CONFIG|VIS|DATA|CONTROL)
  Encapsulation:            dot1q
  Interface type:           IFT_VLAN_SUBIF
  Control parent:           GigabitEthernet0/0/0/0
  Data parent:              GigabitEthernet0/0/0/0
  Views:                    UL|GDP|LDP|G3P|L3P|OWN

  Protocol        Caps (state, mtu)
  --------        -----------------
  None            vlan_jump (up, 1518)
  None            spio (up, 1518)
  None            dot1q (up, 1518)
  arp             arp (up, 1500)
  ipv4            ipv4 (up, 1500)
  mpls            mpls (up, 1500)

It would appear that the MPLS MTU is 1500, so straight away we can see that Cisco are including the labels in the 1500, otherwise our ping of 1500 would be working.

Let’s go ahead and check this theory by increasing the Interface MTU and the MPLS MTU by 4 bytes on each Cisco. The IP MTU will remain set at 1500.

R1#show mpls interfaces g1/0 detail | i MTU
        MTU = 1504

RP/0/0/CPU0:R2-XRv#show im database interface GigabitEthernet0/0/0/0.23 | i "ipv4|mpls"
  ipv4            ipv4 (up, 1500)
  mpls            mpls (up, 1504)

R1#ping 4.4.4.4 rep 2 df-bit size 1500

Type escape sequence to abort.
Sending 2, 1500-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
!!
Success rate is 100 percent (2/2), round-trip min/avg/max = 16/16/16 ms

matt@R4-Junos> ping rapid do-not-fragment count 2 size 1472 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 1472 data bytes
!!
--- 1.1.1.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 8.091/9.467/10.844/1.377 ms

The 1500 byte ping now works exactly as expected.
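The behaviour seen on the Ciscos can be summarised as "the labels count against the MPLS MTU". A minimal Python sketch of that rule follows; the function name is hypothetical and it assumes every label is 4 bytes.

```python
LABEL_BYTES = 4  # size of one MPLS label stack entry

def max_labelled_ip_packet(mpls_mtu, labels=1):
    """Largest IP packet that fits when the labels are included in the MPLS MTU."""
    return mpls_mtu - labels * LABEL_BYTES

for mpls_mtu in (1500, 1504, 2004):
    print(mpls_mtu, "->", max_labelled_ip_packet(mpls_mtu))
# 1500 -> 1496
# 1504 -> 1500
# 2004 -> 2000
```

With one label, an MPLS MTU of 1504 is what allows the full 1500 byte IP packet, and 2004 is what a 2000 byte packet would need.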

Since my goal is to get a 2000 byte label switched ping working, I’ll go ahead and set the MPLS MTU on the Ciscos for the test. I’ll set this to 2004 like I did on the Junos routers.

R1:
interface GigabitEthernet1/0
 mtu 4000
 ip mtu 1500
 mpls mtu 2004
end
R1#show mpls interfaces g1/0 detail | i MTU
        MTU = 2004

R2:
interface GigabitEthernet0/0/0/1
 mtu 9000
 ipv4 mtu 1500
 mpls
  mtu 2004
 !
!
RP/0/0/CPU0:R2-XRv#show im database interface GigabitEthernet0/0/0/0.23 | i "ipv4|mpls"
  ipv4            ipv4 (up, 1500)
  mpls            mpls (up, 2004)

OK so now we have the capability for an MPLS LSP with 2000 bytes payload. But can I ping from R4 to R1 with my 2000 byte ping?

Well actually I can’t.

matt@R4-Junos> ping 1.1.1.1 source 4.4.4.4 do-not-fragment size 1972
PING 1.1.1.1 (1.1.1.1): 1972 data bytes
76 bytes from 192.168.12.2: frag needed and DF set (MTU 1500)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 07d0 186f   2 0000  3e  01 12b5 4.4.4.4  1.1.1.1

^C
--- 1.1.1.1 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

We’re being told that the packet is too big at R2. Why’s this? Well, remember that MPLS will pop the label at the hop before the destination. This is known as penultimate hop popping. So the data between R2 and R1 is transmitted as IP, not MPLS. As I left the MTU at 1500 for IP data, the echo request cannot be transmitted.

I actually did this on purpose by changing the config on R1 so that the LSPs were implicit null signalled. If I go ahead and change the signalling to explicit null, the LSP will then be labelled end to end and the ping will work.
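The effect of penultimate hop popping on the MTU check can be modelled hop by hop. The Python below is a simplified sketch, not a router implementation: it assumes a single label, the same IP MTU (1500) and MPLS MTU (2004) on every link as configured in this lab, and the function and variable names are hypothetical.

```python
LABEL_BYTES = 4
IP_MTU = 1500     # IP MTU left at 1500 on every link in the lab
MPLS_MTU = 2004   # MPLS MTU configured on every link (labels included)

def first_drop(ip_packet, hops, php=True):
    """Return the router that would drop the packet, or None if it fits.

    hops is the ordered list of transmitting routers. With PHP (implicit
    null) the last transmitting router has popped the label and sends
    plain IP, so the IP MTU applies there instead of the MPLS MTU.
    """
    for i, router in enumerate(hops):
        labelled = not (php and i == len(hops) - 1)
        if labelled:
            fits = ip_packet + LABEL_BYTES <= MPLS_MTU
        else:
            fits = ip_packet <= IP_MTU
        if not fits:
            return router
    return None

path = ["R4", "R3", "R2"]                 # routers transmitting towards R1
print(first_drop(2000, path, php=True))   # R2
print(first_drop(2000, path, php=False))  # None
```

With implicit null the 2000 byte packet is dropped at R2, which matches the "frag needed" message from 192.168.12.2. With explicit null the packet stays labelled on the last hop and only the MPLS MTU is checked.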

On R1 I turn off implicit-null and on R4 I turn on explicit-null.

R1(config)#no mpls traffic-eng signalling advertise implicit-null

matt@R4-Junos# set protocols mpls explicit-null

R1#ping 4.4.4.4 source 1.1.1.1 rep 2 df size 2000

Type escape sequence to abort.
Sending 2, 2000-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
Packet sent with the DF bit set
!!
Success rate is 100 percent (2/2), round-trip min/avg/max = 16/20/24 ms

matt@R4-Junos> ping 1.1.1.1 source 4.4.4.4 do-not-fragment size 1972
PING 1.1.1.1 (1.1.1.1): 1972 data bytes
1980 bytes from 1.1.1.1: icmp_seq=0 ttl=253 time=11.802 ms
^C
--- 1.1.1.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 11.802/11.802/11.802/0.000 ms

Excellent! The capture below was taken on R2’s interface to R1, and you can see the 2018 bytes on the wire (1972 data + 8 ICMP + 20 IP + 4 MPLS + 14 Ethernet). Notice Label 0 is used for Explicit Null on IPv4.

ExplictNull
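The on-the-wire figure can be reproduced with a short calculation. This Python sketch uses the same accounting as the text (preamble and FCS are not counted); the function name and the optional VLAN tag parameter are hypothetical additions for illustration.

```python
ETHERNET_HEADER = 14  # untagged Ethernet header, FCS not counted
DOT1Q_TAG = 4         # extra bytes per 802.1q tag
LABEL_BYTES = 4       # one MPLS label stack entry
IP_HEADER = 20        # IPv4 header without options (assumption)
ICMP_HEADER = 8       # ICMP echo header

def bytes_on_wire(icmp_data, labels=0, vlan_tags=0):
    """Frame size for an ICMP echo carrying icmp_data bytes of payload."""
    return (icmp_data + ICMP_HEADER + IP_HEADER
            + labels * LABEL_BYTES
            + ETHERNET_HEADER + vlan_tags * DOT1Q_TAG)

print(bytes_on_wire(1972, labels=1))  # 2018
print(bytes_on_wire(1472))            # 1514
```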

Summary

In this post we focussed on MPLS MTU and how to use MPLS MTU to enable a payload >1500 bytes. I’ve demonstrated that the Interface MTU, MPLS MTU and IP MTU can be set to totally different values.

Why does any of this matter? In a provider core you need to be able to transport your customers’ full size frames across your network. You can guarantee that customer data will be at least 1500 bytes, but depending on the service specification that you want to sell, it could be as large as 9000.

I’ve introduced quite a lot of MPLS terms in this lab. In a later post I will go through this topology again in detail, and talk through the MPLS specifics.

 

 

MTU settings on Junos & IOS (part 2) with OSPF

Part 1 of this series focussed on the interface MTU configuration, looking at how different vendors implement the setting. Some include the layer 2 headers, some don’t. Part 3 will add MPLS MTU to the mix.

This post looks at Jumbo Frames, IP MTU and we’ll also introduce OSPF to the mix. OSPF exchanges IP MTU information in DBD packets when forming a neighbor adjacency and will detect any MTU mismatch, so if the MTU settings are wrong or mismatched, we won’t fully establish the adjacency. Of course, this feature can be turned off, but following the OSPF Version 2 specification (RFC 2328):

If the Interface MTU field in the Database Description packet
indicates an IP datagram size that is larger than the router can
accept on the receiving interface without fragmentation, the
Database Description packet is rejected.
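The rule quoted above can be expressed as a single comparison. The Python below is a sketch of that rule only; real implementations may log or behave more strictly, and the function name and the mtu_ignore flag (standing in for "this feature can be turned off") are hypothetical.

```python
def accept_dbd(dbd_interface_mtu, local_ip_mtu, mtu_ignore=False):
    """Apply the RFC 2328 check to a received Database Description packet.

    The packet is rejected when the neighbor advertises a larger MTU than
    the receiving interface can accept without fragmentation.
    """
    if mtu_ignore:
        return True
    return dbd_interface_mtu <= local_ip_mtu

print(accept_dbd(2000, 1500))  # False
print(accept_dbd(1500, 2000))  # True
```

Only the router with the smaller IP MTU rejects the packet under this rule, but that is enough to stall the adjacency because the database exchange has to complete in both directions.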

Jumbo Frames

Why would anyone actually want to increase the MTU size beyond 1500? Well, back in the day, larger packets were desirable because they resulted in less overhead on the server: fewer CPU interrupts, fewer CPU cycles wasted etc. Today with Large Send Offload (LSO) etc the performance increase might not be what you’d expect. Take a look here for more info. As always, implement something that is suitable for your environment, and test before you do. Anyway, you are not here to debate the use of Jumbo frames or otherwise, so let’s crack on.

IP MTU

As with Ethernet frames, the protocol MTU can be changed for IP packets. The accepted “standard” payload for a Jumbo Frame is 9000 bytes (i.e. an IP MTU set to 9000).

Here the interface MTU will be increased to the maximum supported by the interface hardware but, to demonstrate that the interface MTU and IP MTU can be different, we will set the IP MTU to a consistent value of 2000 bytes.

Topology

The same virtual topology will be used as Part 1.

IMG_0098

Software revisions are as follows

  • IOS (Cisco 7200 12.4(24)T)
  • IOS-XE (CSR1000V 15.4(1)S)
  • IOS-XR (IOS-XRv 5.1.1)
  • Junos (12.3R5.7)
  • Junos (Firefly 12.1X46)

IOS/IOS-XE

Our Interface config is below. The interface MTU has been changed to 9216, and the protocol MTU to 2000. From our earlier tests, we know this means that the maximum IP packet size is 2000 bytes, which would result in 2014 bytes being put on the wire including the L2 headers.

interface GigabitEthernet2
 mtu 9216
 ip address 192.168.1.4 255.255.255.0
 ip mtu 2000
 negotiation auto
 cdp enable
end

OK so let’s go ahead and enable OSPF, nothing much to see here

interface Loopback0
 ip address 4.4.4.4 255.255.255.255
!
router ospf 1
 network 4.4.4.4 0.0.0.0 area 0
 network 192.168.1.4 0.0.0.0 area 0
!

I already enabled OSPF on another router, but didn’t change the MTU from 1500. We’re not going to get a full adjacency here, but let’s troubleshoot.

1000v#show ip ospf neighbor
Neighbor ID     Pri   State           Dead Time   Address         Interface
192.168.1.3       1   INIT/DROTHER    00:00:35    192.168.1.3     GigabitEthernet2
1000v#debug ip ospf adj
 OSPF adjacency debugging is on
 1000v#
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: Neighbor change event
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: DR/BDR election
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: Elect BDR 192.168.1.3
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: Elect DR 192.168.1.4
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: DR: 192.168.1.4 (Id) BDR: 192.168.1.3 (Id)
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: Neighbor change event
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: DR/BDR election
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: Elect BDR 192.168.1.3
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: Elect DR 192.168.1.4
 *Apr 3 18:31:08.455: OSPF-1 ADJ Gi2: DR: 192.168.1.4 (Id) BDR: 192.168.1.3 (Id)
 *Apr 3 18:31:08.484: OSPF-1 ADJ Gi2: Send DBD to 192.168.1.3 seq 0x341 opt 0x52 flag 0x7 len 32
 *Apr 3 18:31:08.484: OSPF-1 ADJ Gi2: Retransmitting DBD to 192.168.1.3 [2]
 *Apr 3 18:31:09.102: OSPF-1 ADJ Gi2: Rcv DBD from 192.168.1.3 seq 0x26B1 opt 0x52 flag 0x7 len 32 mtu 1500 state EXSTART
 *Apr 3 18:31:09.102: OSPF-1 ADJ Gi2: Nbr 192.168.1.3 has smaller interface MTU

Well that’s pretty clear why there is a problem. Let’s move on to IOS-XR.

IOS-XR

Let’s see what IOS-XR has to say about OSPF & MTU

RP/0/0/CPU0:ios#show ospf neighbor
* Indicates MADJ interface

Neighbors for OSPF 1

Neighbor ID     Pri   State           Dead Time   Address         Interface
4.4.4.4         1     EXSTART/BDR     00:00:38    192.168.1.4     GigabitEthernet0/0/0/0
    Neighbor is up for 00:00:06

RP/0/0/CPU0:ios#show ospf trace errors
11   Apr  3 18:33:41.776* ospf_rcv_dbd: WARN nbr 4.4.4.4 larger MTU dbd_if_mtu 2000 oi_ip_mtu 1500

Pretty clear that there is an MTU problem. Below I’ve set the Interface MTU to 9000 – remember IOS-XR includes the L2 headers in the Interface MTU, so the maximum encapsulated data on this wire would be 8986.
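Because the platforms disagree on whether the interface MTU includes the L2 header, it can help to normalise everything to "bytes available above the Ethernet header". The Python below is a rough sketch for untagged Ethernet interfaces only; the table reflects the behaviour observed in this series rather than a complete vendor reference, and the names are hypothetical.

```python
ETHERNET_HEADER = 14  # untagged Ethernet header, FCS not counted

# Does the configured interface MTU include the L2 header?
# Based on the behaviour seen in this series (untagged interfaces only).
INCLUDES_L2_HEADER = {"ios": False, "ios-xr": True, "junos": True}

def max_l3_payload(platform, interface_mtu):
    """Bytes left for the encapsulated packet on an untagged interface."""
    if INCLUDES_L2_HEADER[platform]:
        return interface_mtu - ETHERNET_HEADER
    return interface_mtu

print(max_l3_payload("ios-xr", 9000))  # 8986
print(max_l3_payload("junos", 1514))   # 1500
print(max_l3_payload("ios", 9216))     # 9216
```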

UPDATE 23/12/15:  I’m setting the interface MTU to a different value to the IOS router’s MTU setting,  to show that it’s the IP MTU that is the important setting and must match for two OSPF routers to establish an adjacency.

The IP MTU is set to 2000 to match the other router.

interface Loopback0
 ipv4 address 3.3.3.3 255.255.255.255
!
interface GigabitEthernet0/0/0/0
 mtu 9000
 ipv4 mtu 2000
 ipv4 address 192.168.1.3 255.255.255.0
!
router ospf 1
 area 0
 interface Loopback0
 !
 interface GigabitEthernet0/0/0/0
 !
 !
!
end

OSPF has established with the 1000v now that the IPv4 MTU has been changed to 2000, and we can see the route to the 1000v’s loopback interface 4.4.4.4.

Notice that for OSPF to be happy, it only matters that the IP MTU is the same on both routers; the physical MTU can be different.

RP/0/0/CPU0:ios#show ospf neighbor
Thu Apr 3 21:07:09.053 UTC
* Indicates MADJ interface

Neighbors for OSPF 1

Neighbor ID     Pri   State           Dead Time   Address         Interface
4.4.4.4         1     FULL/DR         00:00:31    192.168.1.4     GigabitEthernet0/0/0/0
    Neighbor is up for 01:13:55

Total neighbor count: 1

RP/0/0/CPU0:ios#show route 4.4.4.4
Thu Apr 3 21:07:29.291 UTC
Routing entry for 4.4.4.4/32
 Known via "ospf 1", distance 110, metric 2, type intra area
 Installed Apr 3 19:53:14.416 for 01:14:14
 Routing Descriptor Blocks
 192.168.1.4, from 4.4.4.4, via GigabitEthernet0/0/0/0
 Route metric is 2
 No advertising protos.

RP/0/0/CPU0:ios#ping 4.4.4.4 donotfrag size 2000 co 1
Thu Apr 3 21:07:36.781 UTC
Type escape sequence to abort.
Sending 1, 2000-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 1/1/1 ms
RP/0/0/CPU0:ios#

Junos

Finally we’ll configure the Junos router and make sure that we are able to ping with 2000 bytes of protocol data to the 1000v and XRv.

The Junos configuration is as follows; I’ve not added the MTU settings yet.

interfaces {
    ge-0/0/1 {
        unit 0 {
            family inet {
                address 192.168.1.5/24;
            }
        }
    }
    lo0 {
        unit 0 {
            family inet {
                address 5.5.5.5/32;
            }
        }
    }
}
protocols {
    ospf {
        area 0.0.0.0 {
            interface lo0.0 {
                passive;
            }
            interface ge-0/0/1.0;
        }
    }
}

The neighbor isn’t going to establish so let’s add some traceoptions to double check what the problem is.

root@firefly> show configuration protocols
ospf {
    traceoptions {
        file ospf-log;
        flag error;
    }
}
root@firefly> show log ospf-log
Apr 3 21:39:00 firefly clear-log[1261]: logfile cleared
Apr  3 21:39:01.701093 OSPF packet ignored: MTU mismatch from 192.168.1.3 on intf ge-0/0/1.0 area 0.0.0.0
Apr  3 21:39:04.794783 OSPF packet ignored: MTU mismatch from 192.168.1.4 on intf ge-0/0/1.0 area 0.0.0.0

OK, that’s pretty clear. Let’s fix the MTU and IP MTU.

root@firefly# show | compare
[edit interfaces ge-0/0/1]
+   mtu 9192;
[edit interfaces ge-0/0/1 unit 0 family inet]
+      mtu 2000;

Notice that the IP MTU is configured under family inet.

root@firefly> show interfaces ge-0/0/1 | match "ge-|MTU:"
Physical interface: ge-0/0/1, Enabled, Physical link is Up
  Link-level type: Ethernet, MTU: 9192, Link-mode: Full-duplex, Speed: 1000mbps,
  Logical interface ge-0/0/1.0 (Index 74) (SNMP ifIndex 519)
    Protocol inet, MTU: 2000
root@firefly> show ospf neighbor
Address          Interface              State     ID               Pri  Dead
192.168.1.4      ge-0/0/1.0             Full      4.4.4.4            1    38
192.168.1.3      ge-0/0/1.0             Full      3.3.3.3            1    37

root@firefly> show route protocol ospf

inet.0: 6 destinations, 6 routes (6 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

3.3.3.3/32         *[OSPF/10] 00:02:15, metric 2
                    > to 192.168.1.3 via ge-0/0/1.0
4.4.4.4/32         *[OSPF/10] 00:02:15, metric 2
                    > to 192.168.1.4 via ge-0/0/1.0

All good, we can see routes to 3.3.3.3 and 4.4.4.4

Now for the ping. Remember that the Junos ping size excludes the ICMP (8 bytes) and IP (20 bytes) headers, so we’ll be expecting the maximum working ping size to be 1972 bytes, for 2000 bytes of protocol data and 2014 bytes on the wire.

root@firefly> ping rapid count 1 do-not-fragment size 1972 3.3.3.3
PING 3.3.3.3 (3.3.3.3): 1972 data bytes
!
--- 3.3.3.3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 3.293/3.293/3.293/0.000 ms

root@firefly> ping rapid count 1 do-not-fragment size 1972 4.4.4.4
PING 4.4.4.4 (4.4.4.4): 1972 data bytes
!
--- 4.4.4.4 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.961/0.961/0.961/0.000 ms

root@firefly> ping rapid count 1 do-not-fragment size 1973 3.3.3.3
PING 3.3.3.3 (3.3.3.3): 1973 data bytes
ping: sendto: Message too long
.
--- 3.3.3.3 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

NX-OS

I don’t have a Nexus in my lab, but for completeness below is the config for updating the MTU on NX-OS.

On the 7k the system will be enabled for Jumbo by default. If you need to change this value, it’s done like this

switch(config)#system jumbomtu 9216

Make sure you check the Interfaces and Vlan interfaces have the correct MTU. That’s done with the interface command “mtu X”. For Layer 2 interfaces, configure either the default MTU size (1500 bytes) or up to the system jumbo MTU size.

On the 5k it’s done a bit differently – in a QoS policy map! For NX-OS >4.1

switch(config)#policy-map type network-qos jumbo
switch(config-pmap-nq)#class type network-qos class-default
switch(config-pmap-c-nq)#mtu 9216
switch(config-pmap-c-nq)#exit
switch(config-pmap-nq)#exit
switch(config)#system qos
switch(config-sys-qos)#service-policy type network-qos jumbo

Summary

In this post I’ve shown how the physical MTU can vary from the IP MTU, and how it’s important to have the same IP MTU when working with OSPF.

Also discussed were debugging steps to troubleshoot OSPF MTU issues on IOS, IOS-XR and Junos.

Gotta love Junos for keeping things consistent!