Category Archives: Juniper

Juniper vMX – Getting Started Guide

I’m excited to finally have the opportunity to play with Juniper’s vMX! Since it was announced last year I’ve been eagerly waiting for release – a couple of client projects already have passed by where the vMX would have been a perfect fit. vMX already won an award earlier this year at Interop Tokyo 2015!

In this post I’ll be giving a bit of background on the vMX architecture and licensing, and then go on to walk through a lab based configuration of vMX.

The vMX is a virtual MX Series Router that is optimized to run as software on x86 servers. Like other MX routers, it runs Junos, and Trio has been compiled for x86! Yes, that means the sophisticated L2, L2.5 and L3 forwarding features we are used to on the MX are present on the vMX.

Architecture

vMX can be installed on server hardware of your choice, so long as it is x86 based and running Linux (although I’m sure a version to run on vmware won’t be too far away).

vMX itself actually consists of two separate VMs – a virtual forwarding plane (VFP) running the vTrio, and a virtual control plane (VCP) running Junos.

The Linux virtualisation solution KVM is what Juniper are using to spin up the virtual instances of the control and forwarding planes, and multiple instances of vMX can be run on the same hardware. To see Juniper using Linux and KVM is no surprise as this is what we are used to on Juniper’s other products such as the QFX.

The VMs are managed by a simple orchestration script which is used to create, stop and start the vMX instances. A simple configuration file defines parameters such as memory and vCPUs to allocate to the VCP and VFP.

A couple of Linux bridges are created by the orchestration script. Clearly VCP and VFP need to be able to communicate directly so an “internal” bridge is automatically created for each vMX instance to enable this communication.  An “external” bridge is also created, this is used to enable the management interface on the Linux Physical host to be used for the virtual management interfaces on the VCP and VFP.

For data interfaces, there are a couple of techniques available for packet I/O depending on the required vMX throughput –

  • Paravirtualisation using KVMs virtio drivers
  • PCI passthrough using single root I/O virtualisation (SR-IOV), enabling packets to bypass the hypervisor and therefore increase I/O.

Juniper recommend virtio or SR-IOV up to 3Gbps, and SR-IOV over 3Gbps (using a minimum of 2 x 10GE interfaces).

Which you will choose will ultimately depend on  your use case for the vMX.

Licensing

Now this is what I really like about vMX! Licensing is based on a combination of throughput and features, and the lowest available throughput license is 100Mbps! Yes – you don’t need to be shifting multi-Gigabits of traffic to start with vMX. You can start small and pay-as-you-grow with vMX.

Below 1Gbps there are only 3 options – 100Mbps, 250Mbps and 500Mbps. Full scale features are included! List price on the 100Mbps option is a very reasonable $750.

At 1Gbps and above, licences are a combination of features (Base, Advance, and Premium) and full duplex throughput (1G, 5G, 10G, 40G)

  • Base – IP routing with 32,000 routes in the forwarding table. Basic Layer 2 functionality, Layer 2 bridging and switching.
  • Advance – Features in the BASE application package IP routing with routes up to platform scale in the forwarding table. IP and MPLS switching for unicast and multicast applications. Layer 2 features include Layer 2 VPN, VPLS, EVPN, and Layer 2 Circuit
    VXLAN.
  • Premium – Features in the BASE and ADVANCE application packages. Layer 3 VPN for IP and multicast

Setting up vMX on Ubuntu

Now I’m going to walk through setting up vMX on Ubuntu 14.04 LTS server (Juniper’s recommended flavour of Linux for vMX). Just for fun this is actually running as a nested Vmware VM on my Macbook Pro – fine for a lab, but don’t try this in production! 🙂 I have allocated 8GB RAM, 4 vCPUs  and two vNICs to the Ubuntu VM. Also the VM is enabled to support hypervisor applications within the VM.

At this point Ubuntu Server has been freshly installed, and the option to install virtualisation was selected during setup.

First things first, let’s update all packages, install the prerequsite packages and restart the system

[email protected]:~$ sudo apt-get upgrade
<snip>
[email protected]:~$ sudo apt-get install bridge-utils qemu-kvm libvirt-bin python python-netifaces vnc4server libyaml-dev python-yaml numactl libparted0-dev libpciaccess-dev libnuma-dev libyajl-dev libxml2-dev libglib2.0-dev libnl-dev python-pip python-dev libxml2-dev libxslt-dev
<snip>
[email protected]:~$ sudo reboot

Configuring vMX

As this is a lab based build, I will be using virtio for the virtual NIC. There are two options on the VFP – a “Lite” version PFE for labs and performance version for normal operation.

Note: Ubuntu 14.04 provides libvirt 1.2.2 which works for VFP lite version. However for the VFP performance version you must upgrade to libvirt 1.2.8.

Let’s extract the vMX application bundle and get going!

[email protected]:~$ tar xzf vmx-14.1R5.4-1.tgz
[email protected]:~$ cd vmx-14.1R5.4-1/
[email protected]:~/vmx-14.1R5.4-1$ ls
config drivers env images scripts vmx.sh

First of all we need to setup the vmx config file, this is done by editing config/vmx.conf

First of all I set an instance name for vmx, and set the correct vmx images. I’m using vPFE-lite.

---
#Configuration on the host side - management interface, VM images etc.
HOST:
    identifier                : vmx1   # Maximum 4 characters
    host-management-interface : eth0
    routing-engine-image      : "/home/mdinham/vmx-14.1R5.4-1/images/jinstall64-vmx-14.1R5.4-domestic.img"
    routing-engine-hdd        : "/home/mdinham/vmx-14.1R5.4-1/images/vmxhdd.img"
    forwarding-engine-image   : "/home/mdinham/vmx-14.1R5.4-1/images/vPFE-lite-20150707.img"

Now the parameters the control plane and forwarding plane.

I’ve allocated 1 vCPU to vRE and 3 vCPU to vPFE. 1GB RAM to the RE and 6GB to the forwarding plane, as per the defaults for 14.1

UPDATE: Feb 2016
For vMX on 15.1, allocate 1 vCPU to vRE and 3 vCPU to vPFE. 2GB RAM to the RE and 8GB to the forwarding plane.

I have also tried vMX with 2GB allocated to the vPFE and the forwarding plane loaded, which could be fine for lab purposes. I’d expect 1GB to be the minimum on the vRE. 3 x vCPU seems to be the minimum for the vPFE.

Note that device-type is set to “virtio” for the interfaces.

---
#External bridge configuration
BRIDGES:
    - type  : external
      name  : br-ext                  # Max 10 characters

---
#vRE VM parameters
CONTROL_PLANE:
    vcpus       : 1
    memory-mb   : 2048
    console_port: 8601

    interfaces  :
      - type      : static
        ipaddr    : 10.102.144.94
        macaddr   : "0A:00:DD:C0:DE:0E"

---
#vPFE VM parameters
FORWARDING_PLANE:
    memory-mb   : 6144
    vcpus       : 3
    console_port: 8602
    device-type : virtio

    interfaces  :
      - type      : static
        ipaddr    : 10.102.144.98
        macaddr   : "0A:00:DD:C0:DE:10"

---
#Interfaces
JUNOS_DEVICES:
 - interface : ge-0/0/0
 mac-address : "02:06:0A:0E:FF:F0"
 description : "ge-0/0/0 interface"

I will only be using one interface in this lab, but up to 10 can be configured. For SR-IOV, things are done slightly differently – see this vMX doc for reference.

I now need to deploy the vMX instance using the orchestration script. “-lv” provides verbose logging. My vMX instance will be created by the script and automatically started.

[email protected]:~/vmx-14.1R5.4-1$ sudo ./vmx.sh -lv --install
==================================================
 Welcome to VMX
==================================================
Date..............................................07/18/15 13:19:03
VMX Identifier....................................vmx1
Config file......................................./home/mdinham/vmx-14.1R5.4-1/config/vmx.conf
Build Directory.................................../home/mdinham/vmx-14.1R5.4-1/build/vmx1
Environment file................................../home/mdinham/vmx-14.1R5.4-1/env/ubuntu_virtio.env
Junos Device Type.................................virtio
Initialize scripts................................[OK]
Copy images to build directory....................[OK]
==================================================
 VMX Environment Setup Completed
==================================================
==================================================
 VMX Install & Start
==================================================
Linux distribution................................ubuntu
Check GRUB........................................[Disabled]
Installation status of qemu-kvm...................[OK]
Installation status of libvirt-bin................[OK]
Installation status of bridge-utils...............[OK]
Installation status of python.....................[OK]
Installation status of libyaml-dev................[OK]
Installation status of python-yaml................[OK]
Installation status of numactl....................[OK]
Installation status of libnuma-dev................[OK]
Installation status of libparted0-dev.............[OK]
Installation status of libpciaccess-dev...........[OK]
Installation status of libyajl-dev................[OK]
Installation status of libxml2-dev................[OK]
Installation status of libglib2.0-dev.............[OK]
Installation status of libnl-dev..................[OK]
Check Kernel Version..............................[Disabled]
Check Qemu Version................................[Disabled]
Check libvirt Version.............................[Disabled]
Check virsh connectivity..........................[OK]
IXGBE Enabled.....................................[Disabled]
==================================================
 Pre-Install Checks Completed
==================================================
Check for VM vcp-vmx1.............................[Not Running]
Check for VM vfp-vmx1.............................[Not Running]
Cleanup VM states.................................[OK]
Check if bridge br-ext exists.....................[No]
Cleanup VM bridge br-ext..........................[OK]
Cleanup VM bridge br-int-vmx1.....................[OK]
==================================================
 VMX Stop Completed
==================================================
Check VCP image...................................[OK]
Check VFP image...................................[OK]
VMX Model.........................................Lite
Check VCP Config image............................[OK]
Check management interface........................[OK]
Setup huge pages to 8192..........................[OK]
Attempt to kill libvirt...........................[OK]
Attempt to start libvirt..........................[OK]
Sleep 2 secs......................................[OK]
Check libvirt support for hugepages...............[OK]
==================================================
 System Setup Completed
==================================================
Get Management Address of eth0....................[OK]
Generate libvirt files............................[OK]
Sleep 2 secs......................................[OK]
Find configured management interface..............eth0
Find existing management gateway..................eth0
Check if eth0 is already enslaved to br-ext.......[No]
Gateway interface needs change....................[Yes]
Create br-ext.....................................[OK]
Get Management Gateway............................192.168.100.254
Flush eth0........................................[OK]
Start br-ext......................................[OK]
Bind eth0 to br-ext...............................[OK]
Get Management MAC................................00:0c:29:76:a8:15
Assign Management MAC 00:0c:29:76:a8:15...........[OK]
Add default gw 192.168.100.254....................[OK]
Create br-int-vmx1................................[OK]
Start br-int-vmx1.................................[OK]
Check and start default bridge....................[OK]
Define vcp-vmx1...................................[OK]
Define vfp-vmx1...................................[OK]
Wait 2 secs.......................................[OK]
Start vcp-vmx1....................................[OK]
Start vfp-vmx1....................................[OK]
Wait 2 secs.......................................[OK]
==================================================
 VMX Bringup Completed
==================================================
Check if br-ext is created........................[Created]
Check if br-int-vmx1 is created...................[Created]
Check if VM vcp-vmx1 is running...................[Running]
Check if VM vfp-vmx1 is running...................[Running]
Check if tap interface vcp_ext-vmx1 exists........[OK]
Check if tap interface vcp_int-vmx1 exists........[OK]
Check if tap interface vfp_ext-vmx1 exists........[OK]
Check if tap interface vfp_int-vmx1 exists........[OK]
==================================================
 VMX Status Verification Completed.
==================================================
Log file..........................................
 /home/mdinham/vmx-14.1R5.4-1/build/vmx1/logs/vmx_1437221943.log
==================================================
 Thankyou for using VMX
==================================================

Connecting to the console port on the VMs

We can now connect to the vMX control plane! This is done using the vmx.sh script again.

Specify vcp (control plane – Junos) or vcf (vPFE) and the instance name.

[email protected]:~/vmx-14.1R5.4-1$ ./vmx.sh --console vcp vmx1
--
Login Console Port For vcp-vmx1 - 8601
Press Ctrl-] to exit anytime
--
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.


Amnesiac (ttyd0)

login:

After a while, the FPC and interfaces will come online

root> show interfaces terse | match ge-0/0/0
ge-0/0/0 up up

root> show chassis fpc
 Temp CPU Utilization (%) Memory Utilization (%)
Slot State (C) Total Interrupt DRAM (MB) Heap Buffer
 0 Online Absent 100 0 512 14 0

I’ll go ahead and add an IP address to ge-0/0/0. Note: if I was using the management interface I could configure interface FXP0 also now. Remember FXP0 will be bridged to the host eth0 adapter (or an adapter you specify).

root# set interfaces ge-0/0/0.0 family inet address 192.168.100.5/24

Can I ping anything?

root> ping 192.168.100.5
PING 192.168.100.5 (192.168.100.5): 56 data bytes
64 bytes from 192.168.100.5: icmp_seq=0 ttl=64 time=0.059 ms
^C
--- 192.168.100.5 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.059/0.059/0.059/0.000 ms

root> ping 192.168.100.254
PING 192.168.100.254 (192.168.100.254): 56 data bytes
^C
--- 192.168.100.254 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

OK, so I can ping the interface but nothing else on the host. As I’m using virtio I need to create a device binding between the host physical NIC and the vMX interface.

Creating a virtio binding

This is done in the config file config/vmx-junosdev.conf.

virtio bindings are flexible and can be used to map multiple vMX instances to the same physical host interface, or to connect vMX instances together.

A new Linux bridge will be created between host interface eth1 and ge-0/0/0 on vmx1.

##############################################################
#
#  vmx-junos-dev.conf
#  - Config file for junos device bindings.
#  - Uses YAML syntax.
#  - Leave a space after ":" to specify the parameter value.
#  - For physical NIC, set the 'type' as 'host_dev'
#  - For junos devices, set the 'type' as 'junos_dev' and
#    set the mandatory parameter 'vm-name' to the name of
#    the vPFE where the device exists
#  - For bridge devices, set the 'type' as 'bridge_dev'
#
##############################################################
interfaces :

     - link_name  : vmx_link
       endpoint_1 :
         - type        : junos_dev
           vm_name     : vmx1
           dev_name    : ge-0/0/0
       endpoint_2 :
         - type        : host_dev
           dev_name    : eth1

If eth1 is not already up on the Linux host, bring it up

sudo ifconfig eth1 up

Again the orchestration script vmx.sh is used to create the device bindings

[email protected]:~/vmx-14.1R5.4-1$ sudo ./vmx.sh --bind-dev
Bind Link vmx_link(ge-0.0.0-vmx1, eth1)...........[OK]

And we can see a new bridge has been created called “vmx_link” as referenced in the bindings configuration file

[email protected]:~/vmx-14.1R5.4-1$ brctl show
bridge name     bridge id               STP enabled     interfaces
br-ext          8000.000c2976a815       yes             br-ext-nic
                                                        eth0
                                                        vcp_ext-vmx1
                                                        vfp_ext-vmx1
br-int-vmx1             8000.52540050c859       yes             br-int-vmx1-nic
                                                        vcp_int-vmx1
                                                        vfp_int-vmx1
virbr0          8000.fe060a0efff1       yes             ge-0.0.1-vmx1
                                                        ge-0.0.2-vmx1
                                                        ge-0.0.3-vmx1
vmx_link                8000.000c2976a81f       no              eth1
                                                        ge-0.0.0-vmx1

Now to retry that ping!

[email protected]:~/vmx-14.1R5.4-1$ ./vmx.sh --console vcp vmx1
--
Login Console Port For vcp-vmx1 - 8601
Press Ctrl-] to exit anytime
--
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.


root> ping 192.168.100.254
PING 192.168.100.254 (192.168.100.254): 56 data bytes
64 bytes from 192.168.100.254: icmp_seq=0 ttl=64 time=4.951 ms
64 bytes from 192.168.100.254: icmp_seq=1 ttl=64 time=2.081 ms
^C
--- 192.168.100.254 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.081/3.516/4.951/1.435 ms

Success! At this point I’ve a working vMX with an interface mapped to a NIC on the Ubuntu host. What happens if I turn on OSPF and LDP?

root> show ospf neighbor
Address          Interface              State     ID               Pri  Dead
192.168.100.254  ge-0/0/0.0             Full      10.0.0.2           1    37
192.168.100.1    ge-0/0/0.0             Full      10.0.0.1         128    39

root> show ldp neighbor
Address            Interface          Label space ID         Hold time
192.168.100.254    ge-0/0/0.0         10.0.0.2:0               13

Excellent, now the fun can really begin, but I’ll save that for another time!

vPFE

One last thing – what does the VFP look like?

[email protected]:~/vmx-14.1R5.4-1$ sudo ./vmx.sh --console vfp vmx1
--
Login Console Port For vfp-vmx1 - 8602
Press Ctrl-] to exit anytime
--
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.


Wind River Linux 6.0.0.12 localhost console

localhost login: root
Password:
Last login: Sat Jul 18 12:20:49 UTC 2015 on console

The riot process is where all the magic happens!

top - 12:52:55 up 33 min,  1 user,  load average: 0.38, 0.74, 0.62
Tasks: 102 total,   1 running, 101 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  2.1%sy,  0.0%ni, 96.8%id,  0.0%wa,  0.0%hi,  0.6%si,  0.0%st
Mem:   5824060k total,  4454308k used,  1369752k free,    12184k buffers
Swap:        0k total,        0k used,        0k free,    44552k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1019 root      20   0 36.6g  37m  10m S   12  0.7   4:25.04 riot
<snip>

Further reference

I hope you enjoyed this vMX post! For further reference on any of the above material please see the Juniper Release Notes for vMX

Junos – securing the RE (filter order is important – eBGP running slow?)

First of all – the Juniper Day One books are a superb resource for learning Junos. If you’ve not checked out the library already – do it now! 😉

Recently a client of mine found a bit of a gotcha with the framework filters discussed in the Day One – Securing the Routing Engine (and also the O’Reilly MX book which references the same material).

These books are both great references for deriving a set of RE filters to secure your Junos routers, so I won’t go in to the detail here – just check out the books!

Now on the the gotcha… I’m also mailing a link to this post to the Juniper community folks, so hopefully a revised copy can be linked on the website soon 🙂

UPDATE August 2015: there is now a revised version of the day 1 that mentions the issue discussed in this post and I have also reviewed the O’Reilly MX book and suggested an edit.

The books provide a detailed framework set of filters for protecting the routing engine, allowing you to pick and choose and chain filters as necessary. But if you chain the filters exactly as shown in the example material, you might have some problems. The example material is great, just remember to adapt it to your environment and only include what you need.

Below is an example of the filter order

input-list [ accept-common-services accept-ospf accept-rip accept-bfd accept-bgp accept-ldp accept-rsvp discard-all ];

Filter “accept-common-services” is first in the chain, which essentially references a further set of filters to be checked in order. Let’s take a look at what that is doing. Pretty self explanatory 🙂

filter accept-common-services {
   apply-flags omit;
   term accept-icmp {
      filter accept-icmp;
   }
   term accept-traceroute {
      filter accept-traceroute;
   }
   term accept-ssh {
      filter accept-ssh;
   }
   term accept-snmp {
      filter accept-snmp;
   }
   term accept-ntp {
      filter accept-ntp;
   }
   term accept-web {
      filter accept-web;
   }
   term accept-dns {
      filter accept-dns;
   }
}

The filter we need to look is “accept-traceroute”. In the Day 1, there are terms for UDP, ICMP and TCP, but I’ve only shown the TCP term below.

filter accept-traceroute {
/* omitted text - UDP and ICMP terms */ 
  apply-flags omit;
   term accept-traceroute-tcp {
      from {
         destination-prefix-list {
            router-ipv4;
            router-ipv4-logical-systms;
         }
         protocol tcp;
         ttl 1;
      }
      then {
         policer management-1m;
         count accept-traceroute-tcp;
         accept;
      }
   }
}

Let’s take a look at what this is doing – matching TCP traffic to IPv4 interfaces on the router, with a TTL of 1. If the traffic is matches, then we accept the traffic, count the packets and police to 1Mbps. At first glance this sounds good – traceroute traffic (e.g. tcptraceroute) using TCP is rate limited (the accept-traceroute also does the same for UDP and ICMP).

But what about TCP traffic with a legitimate TTL of 1?

Which routing protocol uses TCP, a TTL of 1 and might send a lot of data – eBGP !

So the unintended effect is if the filters are chained in as shown in the examples, i.e. as below:

[ accept-common-services accept-ospf accept-rip accept-bfd accept-bgp accept-ldp accept-rsvp discard-all ]

then the “accept-bgp” filter will not be matched for eBGP trafic – the “accept-common-services” filter will match eBGP, and eBGP traffic will be rate limited to 1Mbps! Clearly you don’t want this if you are synchronising a full routing table.

The quick way to look for this is to run the command “show firewall filter lo0.0-i”, and look for the counters “accept-traceroute-tcp-lo0.0-i” and the policer “management-1m-accept-traceroute-tcp-lo0.0-i”. If either of these are showing high counter values, then either you are experiencing excessive TCP traceroute, or perhaps you might want to look at the filter order! 🙂

There is a simple fix of course – either put the “accept-bgp” filter first in the chain, or if TCP traceroute isn’t of interest then remove the configuration from the “accept-traceroute” filter.

So if you’ve used these filters as a basis for you own RE filters, it might be worth taking a look at the filter chain order.

Mapping traffic to an LSP on Junos – BGP and table inet.3 (part 2)

Now that we know about IGP based LSP forwarding on Junos, the 2nd part in this series focuses on BGP and table inet.3. 

We also continue from where part 1 left off, by looking at how traffic-engineering bgp-igp and mpls-forwarding can affect route redistribution from OSPF into BGP.

Lab Topology

For this lab, I’ll be using the topology below.

lsplab

Software revisions are as follows

  • CE Routers (CE1, CE2): IOS (Cisco 7200 12.4(24)T)
  • P Routers (R1, R2, R3, R4, R5): IOS (Cisco 7200 12.4(24)T)
  • PE Routers (Junos1, Junos2): Junos (Olive 12.3R5.7)

As with Part 1, the base configurations are using OSPF as the routing protocol and LDP to exchange transport labels.

Route redistribution (bgp-igp and mpls-forwarding)

Here’s what I changed on Junos 1. The OSPF route 102.102.102.102 learnt via OSPF from CE2 will be redistributed in to BGP.

By the way, I’m not suggesting that your CEs should be part of your core IGP, but for the purposes on this lab test… 🙂

[email protected]# show | compare
[edit protocols bgp group internal]
+    export ospf2bgp;
[edit policy-options]
+   policy-statement ospf2bgp {
+       from {
+           protocol ospf;
+           route-filter 102.102.102.102/32 exact;
+       }
+       then accept;
+   }

The configuration on Junos1 is still running with “traffic-engineering mpls-forwarding” so the routing table has OSPF as the active route for routing, and the LDP route is active for forwarding

[email protected]> show route 102.102.102.102

inet.0: 21 destinations, 35 routes (21 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

102.102.102.102/32 @[OSPF/150] 00:00:28, metric 0, tag 0
                    > to 192.168.46.4 via em0.0
                   #[LDP/9] 00:00:28, metric 1
                    > to 192.168.46.4 via em0.0, Push 28

Hence you would definitely expect the routing policy to match on the OSPF route 102.102.102.102 and therefore we’ll see the route in BGP right? Sure enough if I hop over to Junos2, the route is there:

[email protected]> show route receive-protocol bgp 6.6.6.6

inet.0: 22 destinations, 23 routes (22 active, 0 holddown, 0 hidden)
  Prefix                  Nexthop              MED     Lclpref    AS path
  102.102.102.102/32      192.168.46.4         0       100        I

OK show what happens if I change over to traffic-engineering bgp-igp  on Junos1? The LDP route becomes the active route for routing and forwarding, and isn’t matched by my policy, so is not advertised to Junos2.

inet.0: 22 destinations, 36 routes (22 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

102.102.102.102/32 *[LDP/9] 00:00:04, metric 1
                    > to 192.168.46.4 via em0.0, Push 28
                    [OSPF/150] 00:04:54, metric 0, tag 0
                    > to 192.168.46.4 via em0.0

[email protected]> show route receive-protocol bgp 6.6.6.6

inet.0: 22 destinations, 22 routes (22 active, 0 holddown, 0 hidden)

inet.3: 14 destinations, 14 routes (14 active, 0 holddown, 0 hidden)

BGP (LSP forwarding) and table inet.3

We’ll now have a look at how BGP operates. I’ve set the “traffic-engineering” on Junos1 back to the defaults. We should expect BGP to recursively resolve it’s next-hop via inet.3 and therefore MPLS route the traffic. Let’s see!

iBGP Peering

There is an iBGP peering session between Junos1 and Junos2. No other routers are running iBGP

eBGP Peering

Junos2 has an eBGP peering with CE1. CE has a second Loopback 112.112.112.112 being advertised via this eBGP session.

[email protected]> show configuration protocols bgp
group as102 {
    peer-as 102;
    neighbor 192.168.102.1;
}
group internal {
    local-address 7.7.7.7;
    peer-as 1;
    neighbor 6.6.6.6;
}

[email protected]> show route receive-protocol bgp 192.168.102.1

inet.0: 22 destinations, 22 routes (22 active, 0 holddown, 0 hidden)
  Prefix                  Nexthop              MED     Lclpref    AS path
* 112.112.112.112/32      192.168.102.1        0                  102 I

LSP Fowarding and Routing

So how does Junos1 route to CE2s IP address 112.112.112.112? Let’s take a look at the routing tables.

[email protected]> show route 112.112.112.112

inet.0: 22 destinations, 22 routes (22 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

112.112.112.112/32 *[BGP/170] 00:05:00, MED 0, localpref 100, from 7.7.7.7
                      AS path: 102 I, validation-state: unverified
                    > to 192.168.46.4 via em0.0, Push 27

OK, so we see 112.112.112.112/32 in table inet.0 as expected, and it looks like label 27 is going to be pushed. Let’s take a look at this in more detail:

inet.0: 22 destinations, 22 routes (22 active, 0 holddown, 0 hidden)
112.112.112.112/32 (1 entry, 1 announced)
TSI:
KRT in-kernel 112.112.112.112/32 -> {indirect(131070)}
        *BGP    Preference: 170/-101
                Next hop type: Indirect
                Address: 0x9378ba4
                Next-hop reference count: 3
                Source: 7.7.7.7
                Next hop type: Router, Next hop index: 561
                Next hop: 192.168.46.4 via em0.0, selected
                Label operation: Push 27
                Label TTL action: prop-ttl
                Session Id: 0x1
                Protocol next hop: 192.168.102.1
                Indirect next hop: 93b8000 131070 INH Session ID: 0x2
                State: 
                Local AS:     1 Peer AS:     1
                Age: 5:41       Metric: 0       Metric2: 1
                Validation State: unverified
                Task: BGP_1.7.7.7.7+179
                Announcement bits (2): 0-KRT 6-Resolve tree 2
                AS path: 102 I
                Accepted
                Localpref: 100
                Router ID: 7.7.7.7
                Indirect next hops: 1
                        Protocol next hop: 192.168.102.1 Metric: 1
                        Indirect next hop: 93b8000 131070 INH Session ID: 0x2
                        Indirect path forwarding next hops: 1
                                Next hop type: Router
                                Next hop: 192.168.46.4 via em0.0
                                Session Id: 0x1
                        192.168.102.0/24 Originating RIB: inet.3
                          Metric: 1                       Node path count: 1
                          Forwarding nexthops: 1
                                Nexthop: 192.168.46.4 via em0.0

The key here is the protocol next hop – 192.168.102.1.

192.168.102.1 isn’t directly attached to Junos1 – it is CE2s address on the Junos2<->CE2 segment, Therefore BGP will recursively resolve this next hop via table inet.3 and inet.0. As the inet.3 LDP route has a lower preference compared to the inet.0 OSPF route, the inet.3 route will be chosen and traffic will be placed on the LSP automatically, pushing label 27 in this case.

[email protected]> show route 192.168.102.1

inet.0: 22 destinations, 22 routes (22 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

192.168.102.0/24   *[OSPF/10] 00:06:30, metric 5
                    > to 192.168.46.4 via em0.0

inet.3: 14 destinations, 14 routes (14 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

192.168.102.0/24   *[LDP/9] 00:06:30, metric 1
                    > to 192.168.46.4 via em0.0, Push 27

[email protected]> show route forwarding-table destination 112.112.112.112
Routing table: default.inet
Internet:
Destination        Type RtRef Next hop           Type Index NhRef Netif
112.112.112.112/32 user     0                    indr 131070     2
                              192.168.46.4      Push 27   561     2 em0.0

[email protected]> traceroute 112.112.112.112
traceroute to 112.112.112.112 (112.112.112.112), 30 hops max, 40 byte packets
 1  192.168.46.4 (192.168.46.4)  28.026 ms  27.884 ms  28.510 ms
     MPLS Label=27 CoS=0 TTL=1 S=1
 2  192.168.34.3 (192.168.34.3)  29.767 ms  25.848 ms  28.571 ms
     MPLS Label=28 CoS=0 TTL=1 S=1
 3  192.168.35.5 (192.168.35.5)  29.831 ms  26.455 ms  28.586 ms
     MPLS Label=27 CoS=0 TTL=1 S=1
 4  192.168.57.7 (192.168.57.7)  29.478 ms  25.518 ms  29.075 ms
 5  192.168.102.1 (192.168.102.1)  32.961 ms  31.147 ms  33.398 ms

Traffic is labelled!

But what about IGP traffic to the protocol next hop? Well that won’t follow the LSP of course because we don’t have “mpls traffic-engineering” configured.

[email protected]> show route forwarding-table destination 192.168.102.1
Routing table: default.inet
Internet:
Destination        Type RtRef Next hop           Type Index NhRef Netif
192.168.102.0/24   user     0 192.168.46.4       ucst   555    32 em0.0

Exactly as expected!

I’ve shown that BGP is using table inet.3 to resolve next hops, where as normal IGP routing is using inet.0.

Another thing to remember with BGP & inet.3… if inet.0 contains a better route (e.g. better preference) then BGP would use the inet.0 route and traffic would not be forwarded on the LSP.

In this case, as none of the P routers are running BGP, this would break the connectivity (the P routers don’t know how to get to 112.112.112.112 so would drop the traffic). Hence, the traffic has to follow the LSP for the traffic to reach CE2.

Mapping traffic to an LSP on Junos (part 1)

It’s been a while since I wrote a Junos post… there are a few things that I’ve been meaning to write up for a while, so this series will focus on MPLS label switched paths (LSP) and how to put traffic on to an LSP.

By default IGPs won’t use an LSP, so this 1st part focuses on what you’ll need to configure should you want both BGP and IGP traffic forwarding to use LSPs.

Later parts will look at other options available in Junos to forward traffic via an LSP, and more advanced topics e.g. LSP selection based on BGP extended community.

I’ll start with a quick recap of the Junos routing tables – inet.0, inet.3 and mpls.0

Junos Routing Tables (IPv4)

inet.0

Table inet.0 is the primary unicast routing table used by IPv4. It’s where IGPs will resolve next hops.

inet.3

Table inet.3 is the routing table populated by MPLS protocols such as RSVP or LDP. A lookup here will result in a label being pushed.

It’s also worth noting that inet.3 is used by BGP to resolve BGP next-hops. BGP examines both inet.3 and inet.0, choosing a next-hop based on the lowest Junos preference value. In case of a tie, inet.3 is used.

mpls.0

This is the MPLS label switching table and is used by label switch routers. Routers along the LSP will use this table to swap and pop labels as appropriate.

Lab Topology

For this lab, I’ll be using the topology below.

lsplab

Software revisions are as follows

  • CE Routers (CE1, CE2): IOS (Cisco 7200 12.4(24)T)
  • P Routers (R1, R2, R3, R4, R5): IOS (Cisco 7200 12.4(24)T)
  • PE Routers (Junos1, Junos2): Junos (Olive 12.3R5.7)

The base configurations are using OSPF as the routing protocol and LDP to exchange transport labels.

LSP Traffic Forwarding

Default MPLS forwarding on Cisco and Juniper

Let’s have a look at Junos1 and R4 and see how traffic would be forwarded by default.

Junos1

102.102.102.102 is  an IP address assigned to the Looback0 on CE2. Let’s take a look at the routing table on Junos1:

[email protected]> show route 102.102.102.102

inet.0: 21 destinations, 21 routes (21 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

102.102.102.102/32 *[OSPF/150] 00:08:21, metric 0, tag 0
> to 192.168.46.4 via em0.0

inet.3: 14 destinations, 14 routes (14 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

102.102.102.102/32 *[LDP/9] 00:04:32, metric 1
> to 192.168.46.4 via em0.0, Push 28

Table inet.0 contains only the OSPF leant route. Remember that although there is a LDP route in table inet.3, it won’t be used for IGP route lookups, and we can verify this with a look at the forwarding table for prefix 102.102.102.102.

[email protected]> show route forwarding-table destination 102.102.102.102
Routing table: default.inet
Internet:
Destination        Type RtRef Next hop           Type Index NhRef Netif
102.102.102.102/32 user     0 192.168.46.4       ucst   555    30 em0.0

Note that type is “ucst”, if the traffic was to be labelled it would say “Push” followed by the label number to be pushed.

Therefore IPv4 unicast traffic by default on Junos will not be labelled.

R4

Now let’s take a look at R4

R4#show mpls forwarding-table 102.102.102.102
Local  Outgoing      Prefix            Bytes Label   Outgoing   Next Hop
Label  Label or VC   or Tunnel Id      Switched      interface
28     29            102.102.102.102/32   \
                                       0             Fa1/0      192.168.34.3

R4#sh ip cef 102.102.102.102 detail
102.102.102.102/32, epoch 0
  local label info: global/28
  nexthop 192.168.34.3 FastEthernet1/0 label 29

R4 shows a very different story – traffic will be labelled, and will push label 29.

Also note that R4 expects labelled traffic going to 102.102.102.102 to be received with label 28.

Verification

Let’s verify this situation with a traceroute on Junos1

[email protected]> traceroute 102.102.102.102
traceroute to 102.102.102.102 (102.102.102.102), 30 hops max, 40 byte packets
 1  192.168.46.4 (192.168.46.4)  8.855 ms  2.253 ms  4.572 ms
 2  192.168.34.3 (192.168.34.3)  28.783 ms  26.034 ms  28.780 ms
     MPLS Label=29 CoS=0 TTL=1 S=1
 3  192.168.35.5 (192.168.35.5)  27.265 ms  25.306 ms  28.669 ms
     MPLS Label=28 CoS=0 TTL=1 S=1
 4  192.168.57.7 (192.168.57.7)  29.321 ms  26.853 ms  28.707 ms
 5  192.168.102.1 (192.168.102.1)  34.871 ms  30.649 ms  33.596 ms

Well that’s pretty clear that the 1st hop traffic from Junos was not labelled, and traffic from the IOS box R4 was labelled. The egress label applied for the R4->R3 traffic was label 29 as expected.

So what if we want Junos to forward IGP traffic via an LSP? Well there are a couple of MPLS configuration options: traffic-engineering bgp-igp and mpls-forwarding.

traffic-engineering bgp-igp

Traffic-engineering bgp-igp configures BGP and the IGPs to use LSPs for forwarding traffic destined for egress routers. The bgp-igp option causes all inet.3 routes to be moved to the inet.0 routing table.

[email protected]# show | compare
[edit protocols mpls]
+   traffic-engineering bgp-igp;

[email protected]> show route 102.102.102.102

inet.0: 21 destinations, 35 routes (21 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

102.102.102.102/32 *[LDP/9] 00:00:33, metric 1
                    > to 192.168.46.4 via em0.0, Push 28
                    [OSPF/150] 00:00:33, metric 0, tag 0
                    > to 192.168.46.4 via em0.0

The LDP route and the OSPF route are now in table inet.0, and the LDP route with preference 9 is now the best route.

The traceroute should now show the 1st hop being labelled – let’s see:

[email protected]> traceroute 102.102.102.102
traceroute to 102.102.102.102 (102.102.102.102), 30 hops max, 40 byte packets
 1  192.168.46.4 (192.168.46.4)  30.523 ms  28.792 ms  28.561 ms
     MPLS Label=28 CoS=0 TTL=1 S=1
 2  192.168.34.3 (192.168.34.3)  28.502 ms  26.692 ms  28.756 ms
     MPLS Label=29 CoS=0 TTL=1 S=1
 3  192.168.35.5 (192.168.35.5)  28.296 ms  26.501 ms  28.403 ms
     MPLS Label=28 CoS=0 TTL=1 S=1
 4  192.168.57.7 (192.168.57.7)  28.205 ms  26.365 ms  28.581 ms
 5  192.168.102.1 (192.168.102.1)  33.717 ms  30.332 ms  33.996 ms

Spot on – and the 1st hop label is 28, as expected.

traffic-engineering mpls-forwarding

traffic-engineering bgp-igp will allow high-priority LSPs to supersede IGP routes in the inet.0 routing table.  Essentially this means that the active route might not be the IGP route, and therefore IGP routing might not be as expected, e.g. routing policy routes may not be matched.

The mpls-forwarding option enables LSPs to be used for forwarding but not route selection. Routes are added to both the inet.0 and inet.3 routing tables.

Let’s take a look at the routing table with the mpls-forwarding option in place:

[email protected]> show route 102.102.102.102

inet.0: 21 destinations, 35 routes (21 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

102.102.102.102/32 @[OSPF/150] 00:03:56, metric 0, tag 0
                    > to 192.168.46.4 via em0.0
                   #[LDP/9] 00:00:02, metric 1
                    > to 192.168.46.4 via em0.0, Push 28

inet.3: 14 destinations, 14 routes (14 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

102.102.102.102/32 *[LDP/9] 00:00:02, metric 1
                    > to 192.168.46.4 via em0.0, Push 28

Note that both the OSPF routes and the LDP routes a present in table inet.0 but the OSPF route is marked “Routing Use Only” and the LDP route “Forwarding Use Only”.

Therefore the outcome for traffic forwarding will be the identical to “bgp-igp”:

[email protected]> traceroute 102.102.102.102
traceroute to 102.102.102.102 (102.102.102.102), 30 hops max, 40 byte packets
 1  192.168.46.4 (192.168.46.4)  28.526 ms  28.594 ms  29.658 ms
     MPLS Label=28 CoS=0 TTL=1 S=1
 2  192.168.34.3 (192.168.34.3)  28.161 ms  26.584 ms  28.419 ms
     MPLS Label=29 CoS=0 TTL=1 S=1
 3  192.168.35.5 (192.168.35.5)  28.882 ms  25.807 ms  28.394 ms
     MPLS Label=28 CoS=0 TTL=1 S=1
 4  192.168.57.7 (192.168.57.7)  28.521 ms  26.170 ms  28.564 ms
 5  192.168.102.1 (192.168.102.1)  34.301 ms  31.283 ms  34.266 ms

The difference can be noted when we are matching against a protocol in a routing policy.

In the second part in this series, I will look at BGP forwarding via an LSP, and will also demonstrate how these two MPLS options affect route redistribution from OSPF to BGP.

 

 

 

MTU settings on Junos & IOS (part 5) with MPLS L2 VPN

This post follows on from part 4, but this time we’ll be configuring a Layer-2 Ethernet to Ethernet MPLS VPN between the 2 CEs.

MPLS L2VPN

I’m going to configure a Martini Layer 2 VPN. Martini uses LDP to signal and setup the VPN across the MPLS network.

With MPLS L2 VPNs there will also be a minimum of two labels. The top label being the transport label and the bottom label being the VC label. The transport label will be swapped hop by hop through the MPLS network. The VC label is used by the egress PE router to identify the Virtual Cirtcuit that the incoming packet relates to.

On an Ethernet segment, the frame would look like this for a payload of 1500 bytes.

l2vpn

  • L2 Header – 14 bytes. This is the Ethernet header for the frame. 14 bytes, or 18 bytes if the interface is an 802.1q trunk.
  • MPLS Label – 4 bytes. The transport label.
  • VC Label – 4 bytes. The VC ID.
  • Control Word – 4 bytes. This is an optional field only required when transporting FR or ATM, carrying additional L2 protocol information.
  • L2 Data – 1514 – 1518 bytes. The encapsulated Layer-2 frame that is being transmitted across the MPLS network. In this case 1500 bytes of data, plus the L2 header (an additional 4 bytes would be added if the VPN interface is a trunk).

Well this means that we will be putting a minimum of 1540 bytes (14 +4+4 + 4+1514) of data on the wire if there was 1500 bytes of customer data in the encapsulated L2 frame. If the customer interface is a trunk and the SP interface is a trunk, then we are up to 1548 bytes on the wire across the P network.

Lab

For this lab, I’ll be using the topology below. The base configurations are using OSPF as the routing protocol and LDP to exchange transport labels.

mplsl2vpnSoftware revisions are as follows

  • CE1, CE2, P, PE1: IOS (Cisco 7200 12.4(24)T)
  • PE2: Junos (Firefly 12.1X46)

The base configs are similar to part 3, using OSPF as the IGP and LDP to signal transport labels, so I’ll jump straight in to the Martini VPN config.

PE1

The configuration could not be simpler. The VCID must match at both sides and is set to 12.

interface FastEthernet1/0
 no ip address
 duplex full
 xconnect 2.2.2.2 12 encapsulation mpls
!

PE2

Not much to it on Junos either. Notice that I enable LDP on the loopback.

interfaces {
    ge-0/0/1 {
        encapsulation ethernet-ccc;
        unit 0 {
            family ccc;
        }
    }
}
protocols {
    ldp {
        interface lo0.0;
    }
    l2circuit {
        neighbor 1.1.1.1 {
            interface ge-0/0/1.0 {
                virtual-circuit-id 12;
            }
        }
    }
}

Let’s check that the circuit is up

PE1#sh mpls l2transport vc detail
Local interface: Fa1/0 up, line protocol up, Ethernet up
  Destination address: 2.2.2.2, VC ID: 12, VC status: up
    Output interface: Gi0/0, imposed label stack {18 299776}
    Preferred path: not configured
    Default path: active
    Next hop: 192.168.34.4
  Create time: 03:48:16, last status change time: 00:15:19
  Signaling protocol: LDP, peer 2.2.2.2:0 up
    MPLS VC labels: local 22, remote 299776
    Group ID: local 0, remote 0
    MTU: local 1500, remote 1500
    Remote interface description:
  Sequencing: receive disabled, send disabled
  VC statistics:
    packet totals: receive 3114, send 3124
    byte totals:   receive 319522, send 408559
    packet drops:  receive 0, seq error 0, send 3


[email protected]> show l2circuit connections extensive
Layer-2 Circuit Connections:

Legend for connection status (St)
EI -- encapsulation invalid      NP -- interface h/w not present
MM -- mtu mismatch               Dn -- down
EM -- encapsulation mismatch     VC-Dn -- Virtual circuit Down
CM -- control-word mismatch      Up -- operational
VM -- vlan id mismatch           CF -- Call admission control failure
OL -- no outgoing label          IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC    TM -- TDM misconfiguration
BK -- Backup Connection          ST -- Standby Connection
CB -- rcvd cell-bundle size bad  SP -- Static Pseudowire
LD -- local site signaled down   RS -- remote site standby
RD -- remote site signaled down  XX -- unknown

Legend for interface status
Up -- operational
Dn -- down
Neighbor: 1.1.1.1
    Interface                 Type  St     Time last up          # Up trans
    ge-0/0/1.0(vc 12)         rmt   Up     May  3 21:09:11 2014           1
      Remote PE: 1.1.1.1, Negotiated control-word: Yes (Null)
      Incoming label: 299776, Outgoing label: 22
      Negotiated PW status TLV: No
      Local interface: ge-0/0/1.0, Status: Up, Encapsulation: ETHERNET

Looks good!

My CE config is very simple, I just have a point to point interface between the two CEs and OSPF running across the L2 link. Here is CE1’s config

interface Loopback0
 ip address 11.11.11.11 255.255.255.255
 ip ospf 1 area 0
!
interface FastEthernet0/1
 ip address 192.168.12.1 255.255.255.0
 ip ospf 1 area 0
 duplex full
 speed 100
!

CE1#sh ip ro 22.22.22.22
Routing entry for 22.22.22.22/32
 Known via "ospf 1", distance 110, metric 2, type intra area
 Last update from 192.168.12.2 on FastEthernet0/1, 03:47:25 ago
 Routing Descriptor Blocks:
 * 192.168.12.2, from 22.22.22.22, 03:47:25 ago, via FastEthernet0/1
 Route metric is 2, traffic share count is 1

The obvious difference compared to L3 MPLS VPN is that the provider network has no involvement in customer routing.

What’s the maximum ping we can get from CE1 to CE2? The provider core is set to 1500 bytes on the PE-P-PE interfaces. It should be 1474 right? 1500 – 14 bytes of L2 headers – 2 labels – 1 control word. And it is.

CE1#ping 22.22.22.22 repeat 1 size 1474

Type escape sequence to abort.
Sending 1, 1474-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 56/56/56 ms
CE1#ping 22.22.22.22 repeat 1 size 1475

Type escape sequence to abort.
Sending 1, 1475-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
.
Success rate is 0 percent (0/1)

Let’s look at a capture on the PE1 facing interfaces on the P router.

l2vpncap

1514 bytes on the wire as expected.

Now how about I make the CE1-CE2 link dot1q. I’ve not changed anything on the PE routers, just enabled a 802.1q tagged interface using on each CE. As there will be an extra 4 bytes of overhead, the maximum ping size will drop to 1470.

CE1#ping 22.22.22.22 repeat 1 size 1470

Type escape sequence to abort.
Sending 1, 1470-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 56/56/56 ms
CE1#ping 22.22.22.22 repeat 1 size 1471

Type escape sequence to abort.
Sending 1, 1471-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
.
Success rate is 0 percent (0/1)

We can see the extra header in the packet capture – exactly where you’d expect to see it – in between the CE generated Ethernet header and IP data.

l2vpncapdot1q

Now we know how the data on the wire changes when an MPLS L2 VPN is created, it’s easy to make provision for the additional overhead across the MPLS core by increasing MTUs accordingly.

Thanks for reading my post. We’ve covered both Cisco and Juniper here, but be sure to check out other posts from the #JuniperFan bloggers here.