How can I optimize the throughput of a VPN across a WAN-based link?
I was recently asked this question by a client. After seeing the results (transfer speeds nearly tripled), I thought it would make an interesting article.
My client had a site-to-site VPN between their office in the UK and another office in East Africa. At either end of the VPN were a number of Windows 2k3 and 2k8 boxes.
Their goal was to optimize the VPN to ensure the maximum throughput between each of the sites could be achieved.
As you would expect given the distance between the offices and the regions involved, the path was high in latency and susceptible to packet loss.
To optimize throughput between the sites we will cover a number of areas: fragmentation, MSS clamping, MSS/RWIN tuning, and ensuring that the path between the sites is used as efficiently as possible.
Note: Before any changes were made, a file transfer was initiated across the VPN and the transfer rates were recorded.
To ensure that traffic between each of the sites is not fragmented the MSS (Maximum Segment Size) is clamped.
MSS clamping works by rewriting the MSS value that is announced within the 3 way handshake.
So, how do we calculate the MSS?
First of all a ping is sent to one of the servers via the VPN, ensuring that the DF (Don’t Fragment) bit is set. This ensures that if the MTU on any of the hops further upstream is smaller than the packet sent, an ‘ICMP Fragmentation Needed’ message will be returned rather than the packet being fragmented.
This allows us to increase the data payload length within our ping until we receive the “ICMP Fragmentation Needed” message. At the point we see this message we know that the packet size was too big for the path MTU.
Because our pings are ICMP based, and the ICMP header is 8 bytes whereas a TCP header is 20 bytes, we also subtract a further 12 bytes from our payload length to ensure that once TCP is used rather than ICMP the packets are not fragmented.
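The probing procedure described above can be automated as a binary search rather than stepping the payload one byte at a time. The following is a minimal sketch only: `find_max_df_payload` and the `probe` callback are hypothetical names, and the probe here is a stand-in that pretends the path passes payloads up to 1408 bytes (the value this article measures) instead of sending real pings.

```python
from typing import Callable


def find_max_df_payload(probe: Callable[[int], bool],
                        low: int = 0, high: int = 1472) -> int:
    """Return the largest payload size in [low, high] for which probe() succeeds.

    probe(size) is a hypothetical callback that would send one ping with
    `size` payload bytes and the DF bit set, returning True if a reply
    came back. Assumes that if a size gets through, every smaller size does.
    The default upper bound of 1472 is a 1500 byte MTU minus 28 bytes of
    IP and ICMP headers.
    """
    if not probe(low):
        raise ValueError("even the smallest payload did not get through")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid        # mid fits, so try larger payloads
        else:
            high = mid - 1   # mid was too big, so try smaller payloads
    return low


def fake_probe(size: int) -> bool:
    """Stand-in for a real DF ping: payloads up to 1408 bytes get through."""
    return size <= 1408


print(find_max_df_payload(fake_probe))  # 1408
```

In practice the probe would wrap the ping command shown in the next step; it is kept as a callback here so the search logic can be checked without touching the network.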
We start by pinging the remote host (across the VPN) using the following flags:
| -f | set the ‘do not fragment’ flag |
| -l | the packet payload length, in bytes |
| -n | the number of packets to send |
We start with a payload length of 1408. As you can see we get the expected response.
C:\Users\rick>ping 192.168.1.10 -f -l 1408 -n 2
Pinging 192.168.1.10 with 1464 bytes of data:
Reply from 192.168.1.10: bytes=64 (sent 1464) time=39ms TTL=45
Reply from 192.168.1.10: bytes=64 (sent 1464) time=39ms TTL=45
Ping statistics for 192.168.1.10:
Packets: Sent = 2, Received = 2, Lost = 0 (0% loss),
Next we increase our payload length to 1409 bytes. This time we can see that the packet is too large and requires fragmentation.
C:\Users\rick>ping 192.168.1.10 -f -l 1409 -n 2
Pinging 192.168.1.10 with 1465 bytes of data:
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Ping statistics for 192.168.1.10:
Packets: Sent = 2, Received = 0, Lost = 2 (100% loss),
Based on these tests we take the maximum payload length (which is 1408 bytes) and subtract a further 12 bytes (because of TCP headers), leaving us with our optimal MSS.
MAXIMUM PAYLOAD LENGTH – 12 BYTES = OPTIMAL MSS
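As a worked example of the formula above, the calculation can be written out as follows. This is a small sketch: `optimal_mss` is a hypothetical helper name, and it assumes a 20 byte TCP header with no options, as the article does.

```python
ICMP_HEADER_BYTES = 8   # header carried by our test pings
TCP_HEADER_BYTES = 20   # TCP header without options (assumption)


def optimal_mss(max_ping_payload: int) -> int:
    """Shrink the largest unfragmented ping payload by the extra header
    space that TCP needs compared with ICMP (20 - 8 = 12 bytes)."""
    return max_ping_payload - (TCP_HEADER_BYTES - ICMP_HEADER_BYTES)


print(optimal_mss(1408))  # 1396
```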
Note: Based on our maximum payload length, the headers that make up the remainder of the packet are:
MAXIMUM PAYLOAD (1408 bytes) + IP HEADER (20 bytes) + ICMP HEADER (8 bytes) + IPSEC HEADER (64 bytes) = 1500 MTU.
The interesting point here is that the IPSEC header size can change based on the ciphers used.
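To illustrate why the cipher choice matters, the sketch below derives the tunnel overhead from the measured payload and then shows how the usable MSS moves when that overhead changes. The function names are hypothetical, and the 80 byte overhead in the last line is an invented figure for illustration only, not a measurement of any particular cipher.

```python
IP_HEADER_BYTES = 20
ICMP_HEADER_BYTES = 8
TCP_HEADER_BYTES = 20  # TCP header without options (assumption)


def tunnel_overhead(link_mtu: int, max_ping_payload: int) -> int:
    """Bytes left over for the IPsec encapsulation once the ping payload
    and its IP and ICMP headers are accounted for."""
    return link_mtu - (max_ping_payload + IP_HEADER_BYTES + ICMP_HEADER_BYTES)


def mss_for_overhead(link_mtu: int, overhead: int) -> int:
    """Largest TCP payload that still fits in link_mtu after the tunnel
    overhead and the inner IP and TCP headers."""
    return link_mtu - overhead - IP_HEADER_BYTES - TCP_HEADER_BYTES


print(tunnel_overhead(1500, 1408))   # 64, matching the breakdown above
print(mss_for_overhead(1500, 64))    # 1396
print(mss_for_overhead(1500, 80))    # 1380 (hypothetical larger overhead)
```

If the ciphers on the VPN are changed, it is therefore worth repeating the ping test rather than reusing the old MSS.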
Note: The Cisco ASA clamps the MSS (of the initial SYN) in each direction.
Cisco ASA v7.0 and later introduced a feature that blocks traffic containing an MSS higher than that announced by its peer (within the 3 way handshake). Though this is meant to limit potential buffer overruns and similar problems, some servers do not always adhere to this behaviour, so you may find instances where this feature needs to be disabled.
To confirm whether this feature is blocking packets and needs to be disabled, run the command ‘show asp drop’ before and after any of your tests.
To ensure that PMTUD (Path MTU Discovery) can operate, the ICMP message ‘Fragmentation needed and DF set’ is permitted to all servers at both sites from any location.
Though this isn’t required within our scenario due to MSS clamping, it is still considered good practice should it ever be required in the future.
Selective ACK (SACK)
One issue with TCP is how the receiver acknowledges which packets it has received within the sliding window.
Consider the scenario where a host has sent you 10,000 bytes, but bytes 2001-4000 were lost in transit. Normally the ACK sent would say, “I’ve got everything up to 2000 bytes” and the sender would then resend 2001-10000.
Selective ACK allows the receiver to say, “I got 1-2000 and 4001-10000”. The sender would then only resend bytes 2001-4000.
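The difference can be modelled in a few lines. This is a simplified sketch of the scenario described above, not of a real TCP stack: the function names are hypothetical, byte ranges are inclusive, and the non-SACK case follows the article's model of resending everything after the first gap.

```python
def missing_ranges(total_bytes, received):
    """Return the inclusive (first, last) byte ranges that never arrived.

    received is a list of inclusive (first, last) ranges that did arrive.
    """
    gaps = []
    next_expected = 1
    for first, last in sorted(received):
        if first > next_expected:
            gaps.append((next_expected, first - 1))
        next_expected = max(next_expected, last + 1)
    if next_expected <= total_bytes:
        gaps.append((next_expected, total_bytes))
    return gaps


def resend_with_sack(total_bytes, received):
    """With SACK the sender learns exactly which ranges are missing."""
    return missing_ranges(total_bytes, received)


def resend_without_sack(total_bytes, received):
    """Without SACK the sender only learns where the first gap starts,
    so (in this simplified model) it resends from there to the end."""
    gaps = missing_ranges(total_bytes, received)
    return [(gaps[0][0], total_bytes)] if gaps else []


arrived = [(1, 2000), (4001, 10000)]
print(resend_without_sack(10000, arrived))  # [(2001, 10000)]
print(resend_with_sack(10000, arrived))     # [(2001, 4000)]
```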
The Nagle algorithm is a method to alleviate network overhead by combining a number of smaller packets into one. Since TCP/IP requires a 40 byte header (20 bytes TCP, 20 bytes IP), a 1 byte data payload results in a packet that is 41 bytes in length.
Though this can greatly improve efficiency on a TCP/IP network, it doesn’t work well with applications that rely on small packets being sent immediately, such as telnet, where each keystroke is sent to the server. Because of this, Nagle is normally best used where large data transfers are required rather than with interactive applications such as RDP or SSH.
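To put numbers on the overhead, and to show how an individual application can opt in or out of Nagle, here is a small sketch. The helper names are hypothetical. It uses the standard `TCP_NODELAY` socket option, which controls Nagle per socket; this is separate from the server-wide setting applied in this article and is shown purely as an illustration.

```python
import socket

HEADER_BYTES = 40  # 20 bytes TCP + 20 bytes IP, without options


def payload_efficiency(payload_bytes: int) -> float:
    """Fraction of each packet that is application data."""
    return payload_bytes / (payload_bytes + HEADER_BYTES)


def set_nagle(sock: socket.socket, enabled: bool) -> None:
    """Enable or disable Nagle on one socket (TCP_NODELAY=1 disables it)."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0 if enabled else 1)


def nagle_enabled(sock: socket.socket) -> bool:
    """Nagle is active whenever TCP_NODELAY is not set."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


print(f"{payload_efficiency(1):.1%}")     # 2.4%  (one keystroke per packet)
print(f"{payload_efficiency(1396):.1%}")  # 97.2% (a full segment at our MSS)

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    set_nagle(s, False)       # what an interactive application might do
    print(nagle_enabled(s))   # False
```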
Receive Window Tuning
The Receive Window (RWIN) is advertised by the receiver and corresponds to the number of bytes available within the receive buffer. Also known as the Window size, the RWIN ensures that the receiver does not receive more data than it is able to process.
The sender is able to send data up to the window size without requiring an acknowledgment back. At any point the Window size can be adjusted by the receiver; this acts as a form of flow control for the receive buffer.
Why do we need to tune the Receive Window (RWIN)?
If the RWIN is too small then it can limit throughput whilst the sender waits for the receiver to acknowledge the traffic. If it is too big then the sender will have to resend the entire RWIN every time packet loss occurs (however SACK can relieve such problems to a certain extent).
Windows 7 and 2008 include a feature called “AutoTuning”. This automatically calculates and adjusts the RWIN based on what it believes to be the optimal size. For all other operating systems the RWIN size can be calculated and manually tuned, if required. This is done using the following formula:
BANDWIDTH(kbps) / 8 * AVERAGE LATENCY(Millisec) = RWIN SIZE(Bytes)
The RWIN size is then rounded off to a multiple of the Maximum Segment Size (MSS).
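The formula and the rounding step can be sketched as follows. The function names are hypothetical, the 2000 kbps and 300 ms figures are invented for illustration (they are not the client's measurements), and rounding up rather than down is an assumption, since the article only says the value is rounded to a multiple of the MSS.

```python
import math


def rwin_bytes(bandwidth_kbps: float, latency_ms: float) -> float:
    """BANDWIDTH(kbps) / 8 * AVERAGE LATENCY(ms) = RWIN SIZE(bytes).

    The kilo in kbps and the milli in ms cancel, so the result is in bytes.
    """
    return bandwidth_kbps / 8 * latency_ms


def round_up_to_mss(rwin: float, mss: int) -> int:
    """Round the window up to a whole number of segments (assumption)."""
    return math.ceil(rwin / mss) * mss


raw = rwin_bytes(2000, 300)         # hypothetical link: 2000 kbps, 300 ms
print(raw)                          # 75000.0
print(round_up_to_mss(raw, 1396))   # 75384, i.e. 54 segments of 1396 bytes
```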
Note: In most instances your RWIN size should calculate to 64k or smaller. However, should it be larger than 64k, Window Scaling (WS) can be set. The Window Scale value is advertised as a TCP option during the 3 way handshake and acts as a multiplier to the Receive Window size.
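As a rough illustration of how the scale value relates to the window size: the window field in the TCP header is 16 bits (at most 65535), and the Window Scale option carries a shift count, so the multiplier is a power of two. The sketch below finds the smallest shift that can represent a given RWIN. `window_scale_shift` is a hypothetical helper, and 75384 is the illustrative window from a hypothetical 2000 kbps, 300 ms link, not a measured value.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16 bit window field
MAX_SHIFT = 14               # largest shift count the option permits


def window_scale_shift(rwin_bytes: int) -> int:
    """Smallest shift such that 65535 * 2**shift can hold rwin_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= rwin_bytes:
            return shift
    raise ValueError("window too large to represent even with scaling")


print(window_scale_shift(65535))    # 0, fits without scaling
print(window_scale_shift(75384))    # 1, multiplier of 2
print(window_scale_shift(1000000))  # 4, multiplier of 16
```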
Taking on board the above, and in line with our case example, the following changes were implemented:
1. The optimal MSS value was calculated and MSS clamping was set on the UK gateway. As this was a Cisco ASA the following command was used,
ciscoasa(config)# sysopt connection tcpmss <MSS IN BYTES>
2. MSS blocking was disabled on the UK gateway. Again as this was a Cisco ASA the following commands were used,
ciscoasa(config)# access-list MSS-EXCEEDED-ACL permit tcp any any
ciscoasa(config)# class-map MSS-EXCEEDED-MAP
ciscoasa(config-cmap)# match access-list MSS-EXCEEDED-ACL
ciscoasa(config)# tcp-map mss-map
ciscoasa(config-tcp-map)# exceed-mss allow
ciscoasa(config)# policy-map global_policy
ciscoasa(config-pmap)# class MSS-EXCEEDED-MAP
ciscoasa(config-pmap-c)# set connection advanced-options mss-map
3. As the VPN was mainly used for data transfers, Nagle was enabled on all of the servers on both sides of the VPN.
4. Selective ACK (SACK) was enabled on all servers on both sides of the VPN.
5. The access control policies on both VPN gateways were updated to permit the ICMP Type 3 Code 4 “Fragmentation needed and DF set” inbound to all servers from any source address.
6. The RWIN was calculated and updated on all servers not running Windows 2008.