Very low TCP OpenVPN throughput (100Mbit port, low CPU utilization)

I am experiencing extremely slow OpenVPN transfer speeds between two servers. For this question, I'll call them Server A and Server B.

Both Server A and Server B run CentOS 6.6. Both sit in data centers with 100Mbit lines, and data transfers between the two servers outside of OpenVPN run at close to ~88Mbps.

However, when I try to transfer any files over the OpenVPN connection I have established between Server A and Server B, I get throughput of only about 6.5Mbps.

Test results from iperf:

[  4] local 10.0.0.1 port 5001 connected with 10.0.0.2 port 49184
[  4]  0.0-10.0 sec  7.38 MBytes  6.19 Mbits/sec
[  4]  0.0-10.5 sec  7.75 MBytes  6.21 Mbits/sec
[  5] local 10.0.0.1 port 5001 connected with 10.0.0.2 port 49185
[  5]  0.0-10.0 sec  7.40 MBytes  6.21 Mbits/sec
[  5]  0.0-10.4 sec  7.75 MBytes  6.26 Mbits/sec
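
These numbers come from plain iperf runs over the tunnel addresses, roughly like the following (default port 5001, default 10-second test; the exact invocation is a reconstruction):

# on Server A (OpenVPN server, tunnel IP 10.0.0.1)
iperf -s

# on Server B (OpenVPN client), aimed at the tunnel IP
iperf -c 10.0.0.1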

Apart from these OpenVPN iperf tests, both servers are virtually idle with zero load.

Server A is assigned the IP 10.0.0.1 and is the OpenVPN server. Server B is assigned the IP 10.0.0.2 and is the OpenVPN client.

The OpenVPN configuration for Server A is as follows:

port 1194
proto tcp-server
dev tun0
ifconfig 10.0.0.1 10.0.0.2
secret static.key
comp-lzo
verb 3

The OpenVPN configuration for Server B is as follows:

port 1194
proto tcp-client
dev tun0
remote 204.11.60.69
ifconfig 10.0.0.2 10.0.0.1
secret static.key
comp-lzo
verb 3
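
The point-to-point setup itself is unremarkable; roughly, the shared secret is generated once with OpenVPN's built-in key generator and each side is started from its config file (the config file names below are placeholders):

# generate the shared static key once, then copy it to both servers
openvpn --genkey --secret static.key

# Server A
openvpn --config server-a.conf

# Server B
openvpn --config server-b.conf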

What I've noticed:

1. My first thought was that I was hitting a CPU bottleneck on the server. OpenVPN is single-threaded, and both servers run Intel Xeon L5520 processors, which are not the fastest. However, I ran top during one of the iperf tests and pressed 1 to view per-core CPU utilization, and found that CPU load was very low on every core:

top - 14:32:51 up 13:56,  2 users,  load average: 0.22, 0.08, 0.06
Tasks: 257 total,   1 running, 256 sleeping,   0 stopped,   0 zombie
Cpu0  :  2.4%us,  1.4%sy,  0.0%ni, 94.8%id,  0.3%wa,  0.0%hi,  1.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.3%st
Cpu3  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    946768k total,   633640k used,   313128k free,    68168k buffers
Swap:  4192188k total,        0k used,  4192188k free,   361572k cached

2. Ping times over the OpenVPN tunnel increase significantly while iperf is running. When iperf is not running, ping times over the tunnel are a consistent 60ms (normal). But when iperf is running and pushing heavy traffic, the ping times become erratic. You can see below how the ping times are stable up until the 4th ping, which is when I started the iperf test:

PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=60.1 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=60.1 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=60.2 ms
** iperf test begins **
64 bytes from 10.0.0.2: icmp_seq=4 ttl=64 time=146 ms
64 bytes from 10.0.0.2: icmp_seq=5 ttl=64 time=114 ms
64 bytes from 10.0.0.2: icmp_seq=6 ttl=64 time=85.6 ms
64 bytes from 10.0.0.2: icmp_seq=7 ttl=64 time=176 ms
64 bytes from 10.0.0.2: icmp_seq=8 ttl=64 time=204 ms
64 bytes from 10.0.0.2: icmp_seq=9 ttl=64 time=231 ms
64 bytes from 10.0.0.2: icmp_seq=10 ttl=64 time=197 ms
64 bytes from 10.0.0.2: icmp_seq=11 ttl=64 time=233 ms
64 bytes from 10.0.0.2: icmp_seq=12 ttl=64 time=152 ms
64 bytes from 10.0.0.2: icmp_seq=13 ttl=64 time=216 ms

3. As mentioned above, when I run iperf outside of the OpenVPN tunnel, throughput is normal, around 88Mbps.
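
That outside-the-tunnel comparison is simply the same iperf client run pointed at Server A's public address instead of the tunnel address, roughly:

# through the tunnel: ~6.2 Mbit/s
iperf -c 10.0.0.1

# direct to the public IP, bypassing the tunnel: ~88 Mbit/s
iperf -c 204.11.60.69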

What I've tried:

1. I thought compression might be causing the problem, so I turned compression off by removing comp-lzo from both configs and restarting OpenVPN. No improvement.

2. Even though I had already found CPU utilization to be low, I thought the default cipher might be a bit too intensive for the system to keep up with. So I added cipher RC2-40-CBC (a very lightweight cipher) to both configs and restarted OpenVPN. No improvement. (See the config sketch after this list.)

3. I had read on various forums how tweaking fragment, mssfix and tun-mtu could help performance. I tried a few of the changes described in this article, but again, no improvement.
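
To illustrate items 2 and 3, the extra directives added to both config files looked roughly like this (the MTU-related values are examples only, not the exact numbers tested; note also that fragment and mssfix only take effect with proto udp, so on this tcp setup they are effectively no-ops):

# item 2: a very lightweight cipher
cipher RC2-40-CBC

# item 3: MTU-related tweaks (example values)
tun-mtu 1500
fragment 1300
mssfix 1300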

Any ideas on what could be causing this poor OpenVPN performance?

Best Answer
After a lot of Googling and config file tweaking, I found the solution. I now get sustained speeds of 60Mbps and bursts of up to 80Mbps. It's a bit slower than the transfer rates I get outside the VPN, but I think this is as good as it will get.

The first step was to set sndbuf 0 and rcvbuf 0 in the OpenVPN config for both the server and the client.

I made this change after seeing the suggestion in a public forum post (an English translation of a Russian original post), which I will quote here:

It’s July, 2004. Usual home internet speed in developed countries is 256-1024 Kbit/s, in less developed countries is 56 Kbit/s. Linux 2.6.7 has been released not a long ago and 2.6.8 where TCP Windows Size Scaling would be enabled by default is released only in a month. OpenVPN is in active development for 3 years already, 2.0 version is almost released.
One of the developers decides to add some code for socket buffer, I think to unify buffer sizes between OSes. In Windows, something goes wrong with adapters’ MTU if custom buffers sizes are set, so finally it transformed to the following code:

#ifndef WIN32
o->rcvbuf = 65536;
o->sndbuf = 65536;
#endif

If you used OpenVPN, you should know that it can work over TCP and UDP. If you set custom TCP socket buffer value as low as 64 KB, TCP Window Size Scaling algorithm can’t adjust Window Size to more than 64 KB. What does that mean? That means that if you’re connecting to other VPN site over long fat link, i.e. USA to Russia with ping about 100 ms, you can’t get speed more than 5.12 Mbit/s with default OpenVPN buffer settings. You need at least 640 KB buffer to get 50 Mbit/s over that link.
UDP would work faster because it doesn’t have window size but also won’t work very fast.

As you already may guess, the latest OpenVPN release still uses 64 KB
socket buffer size. How should we fix this issue? The best way is to
disallow OpenVPN to set custom buffer sizes. You should add the
following code in both server and client config files:

sndbuf 0
rcvbuf 0
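
The quoted 5.12 Mbit/s figure is just the bandwidth-delay product limit for a 64 KB window over a 100 ms round trip; a rough back-of-the-envelope version of the arithmetic:

max throughput ≈ TCP window / RTT
              = 65536 bytes * 8 bits per byte / 0.1 s
              ≈ 5 Mbit/s

window needed for 50 Mbit/s ≈ 50 Mbit/s * 0.1 s ≈ 625 KB (roughly the 640 KB mentioned above)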

The author goes on to describe how to push the buffer size adjustments to clients if you don't control the client configs yourself.
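
In a regular TLS-based client/server setup (unlike the static-key point-to-point configs above), that server-side approach looks roughly like the following, assuming an OpenVPN version recent enough to allow pushing sndbuf/rcvbuf:

# in the server config
sndbuf 0
rcvbuf 0
push "sndbuf 0"
push "rcvbuf 0"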

After I made these changes, my throughput went up to 20Mbps. I then noticed that CPU utilization on a single core was somewhat high, so I removed comp-lzo (compression) from the configs on both the client and the server. Eureka! Transfer speeds shot up to a sustained 60Mbps with bursts of 80Mbps.
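
For reference, after both changes the Server A config ends up looking essentially like the original above, with comp-lzo dropped and the buffer directives added:

port 1194
proto tcp-server
dev tun0
ifconfig 10.0.0.1 10.0.0.2
secret static.key
sndbuf 0
rcvbuf 0
verb 3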

I hope this helps someone else resolve their own OpenVPN slowness issues!
