Senior Network Consultant Kamran Raza explains how we have improved latencies across our pan-European fabric by up to 30% by favouring lower-latency route selection using the OSPF protocol.
What is OSPF?
The OSPF protocol is one of a family of IP routing protocols. It is an Interior Gateway Protocol (IGP) for the Internet, used to distribute IP routing information throughout a single Autonomous System in an IP network. It’s very common. Most networks run this protocol to know how to send traffic from point A to point B. We use OSPF in our network, and it has been deployed since the birth of the network. At its core, OSPF performs a shortest-path calculation.
How does it work?
We assign a number, known as a cost, to each link on the way to a destination. Let's say there are four nodes with values of, say, 10 to 20, 20 to 30, 30 to 40 and so on. OSPF calculates the shortest path between these nodes, meaning the one with the lowest total cost. It’s like using Google Maps, and as with Google, the fact that OSPF finds the quickest path does not mean that it is the best path. Maybe you prefer another route that is better, for whatever reason.
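To make the shortest-path idea concrete, here is a minimal sketch of a Dijkstra-style calculation, the kind of lowest-total-cost search that OSPF relies on. The four-node topology, the node names A to D and the link costs are hypothetical and chosen only for illustration; they are not taken from the NL-ix network.

```python
import heapq


def shortest_path(graph, source, target):
    """Return (total_cost, path) for the lowest-cost route between two nodes."""
    # Priority queue of (cost so far, node, path taken to reach that node).
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    # No route exists between source and target.
    return float("inf"), []


# Hypothetical four-node topology; each number is an illustrative link cost.
links = {
    "A": {"B": 10, "C": 30},
    "B": {"A": 10, "C": 10, "D": 40},
    "C": {"A": 30, "B": 10, "D": 10},
    "D": {"B": 40, "C": 10},
}

cost, path = shortest_path(links, "A", "D")
print(cost, path)
```

In this example the three-hop route A, B, C, D is chosen because its total cost of 30 is lower than the two-hop alternatives, which shows that the "shortest" path is the one with the lowest sum of costs, not necessarily the fewest hops.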
Why did you change the settings?
We identified an issue in our network: sometimes OSPF route selection was not optimal. This was because the values assigned a lower number (which OSPF prefers) to higher bandwidth, so it wants to use the fattest pipes. The problem was that this did not account adequately for latency, which meant that sometimes traffic would take an indirect and therefore longer path. So, a few months ago, we set up brainstorming sessions with our network architects and created a new formula that takes latency into account as well. We harmonised all our OSPF values and then gave latency a higher weighting than bandwidth. As a result, some of the links that were not used before are now preferred because they are faster latency-wise. In the medium term we can increase the bandwidth on those links to reflect the routing needs of our customers.
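The actual NL-ix formula is not published in this interview, so the following is only a sketch of the general idea under stated assumptions. It compares a conventional bandwidth-only cost (a reference bandwidth divided by the link bandwidth) with a hypothetical latency-weighted cost. The reference bandwidth, the weights, the function names and the two example paths are all invented for illustration.

```python
# Assumed reference bandwidth for the bandwidth-only cost (illustrative value).
REFERENCE_BANDWIDTH_GBPS = 100


def bandwidth_only_cost(bandwidth_gbps, latency_ms):
    """Conventional-style cost: inversely proportional to bandwidth, latency ignored."""
    return max(1, round(REFERENCE_BANDWIDTH_GBPS / bandwidth_gbps))


def latency_aware_cost(bandwidth_gbps, latency_ms, latency_weight=10, bandwidth_weight=1):
    """Hypothetical cost that weights latency more heavily than bandwidth."""
    bandwidth_term = bandwidth_weight * bandwidth_only_cost(bandwidth_gbps, latency_ms)
    return round(latency_weight * latency_ms + bandwidth_term)


def path_cost(path, cost_fn):
    """Sum the per-link costs along a path of (bandwidth_gbps, latency_ms) links."""
    return sum(cost_fn(bandwidth, latency) for bandwidth, latency in path)


# Hypothetical example: one direct 10G link versus two 100G links via a third city.
direct = [(10, 8.0)]
indirect = [(100, 6.0), (100, 6.0)]

for name, cost_fn in [("bandwidth-only", bandwidth_only_cost),
                      ("latency-aware", latency_aware_cost)]:
    direct_cost = path_cost(direct, cost_fn)
    indirect_cost = path_cost(indirect, cost_fn)
    preferred = "direct" if direct_cost < indirect_cost else "indirect"
    print(f"{name}: direct={direct_cost} indirect={indirect_cost} -> {preferred}")
```

With these made-up numbers, the bandwidth-only cost prefers the indirect route over the fatter pipes (12 ms end to end), while the latency-weighted cost prefers the direct link (8 ms), which is the behaviour change described above.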
Is the new solution running across the whole NL-ix network?
Yes, it is already up and running. We tested the solution out on our core network nodes first in and around Amsterdam and it worked very well. Now it’s deployed across the whole network.
What are the implications for clients?
It means that we offer much faster routes. Let’s say a customer uses a Copenhagen to London route. We have plenty of routes between these metros, all with plenty of bandwidth. Route selection is now based on choosing the lowest-latency path. This means clients will see improved speed when they browse the internet or when they connect to their company network.
This will make a big difference to many clients. For instance, we connect a lot of customers that use financial exchanges, such as forex or trading platforms, and they are very sensitive to latency because orders are executed in milliseconds. We always try to keep their traffic on a direct low-latency route, and the new OSPF settings will make sure that this is always the case. They will already be seeing the benefit.
Have you measured the impact on traffic movement?
Yes. We observed improvements in the region of 3 to 5 milliseconds for a number of routes. Expressed as a percentage, that’s an improvement of roughly 30%.
Do the OSPF settings help with network management?
Yes. This solution gives us extra insight into supplier speeds, and that helps us tell how we are doing, and helps our customers too. We use Metronet devices in all our nodes to measure speeds with a ping – like a heartbeat that reveals the latency in the network. This provides active comparison of different routes. So if a supplier tells us that this route is, say, 10 milliseconds, we can check via the OSPF model. It's difficult for suppliers to mislead us on what they deliver versus what is really in the network.
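The comparison between what a supplier quotes and what the probes measure can be sketched as follows. This is a hypothetical illustration only: the function name, the 1 ms tolerance and the sample values are assumptions, and it does not represent how the measurement devices or the NL-ix tooling actually work.

```python
from statistics import median


def check_quoted_latency(quoted_ms, samples_ms, tolerance_ms=1.0):
    """Compare a supplier's quoted latency with the median of measured ping samples.

    Returns (measured_ms, within_tolerance). The median is used so that a single
    slow ping does not distort the comparison.
    """
    measured_ms = median(samples_ms)
    within_tolerance = (measured_ms - quoted_ms) <= tolerance_ms
    return measured_ms, within_tolerance


# Hypothetical ping samples (ms) for a route the supplier quotes at 10 ms.
good_route = [10.2, 10.4, 10.1, 13.9, 10.3]
slow_route = [14.8, 15.1, 15.0]

print(check_quoted_latency(10.0, good_route))
print(check_quoted_latency(10.0, slow_route))
```

The first route passes despite one outlier sample, while the second is flagged because its measured latency sits well above the quoted figure.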
Is there an impact on network resilience?
Yes. Let's say there is an outage in one city, maybe an indirect failure in the network that is not visible to us. We will then suddenly see increased latency on the network, so we can contact that supplier, ask them what's going on, and they can fix it for us.
Do other service providers do this on their networks?
This is unlikely, but it is impossible to say definitively as no other networks are as transparent about their routing criteria and latency. But, in my experience, most service providers treat OSPF as a default, which means they just turn it on and run it as native. The default settings automatically prefer the higher bandwidth routes. So what we are doing is pretty innovative.
The network team say that NL-ix probably runs the fastest network in Europe. Would you agree?
Well, the problem is that other networks are either different in scope, i.e. they are smaller and regional, or if they have international reach they don’t publish accurate real-time latencies like we do. We just keep adding better routes, more bandwidth and efficiencies like OSPF and we keep forging ahead in terms of speed.
The formulae are in place now. If someone else thinks they have a faster or more resilient network, or a better approach to traffic management, then we would be very interested to hear about it.
Senior Network Consultant Kamran received his degree in Software Engineering from Lahore University. He has been a networking specialist for 14 years, working with Etisalat, Nokia, Amazon and, most recently, NL-ix.