Python网络速度非常慢

Python网络速度非常慢,python,performance,networking,dns,gethostbyname,Python,Performance,Networking,Dns,Gethostbyname,我有两台服务器(让我把它们命名为A和B) 事实: starttime = time.time() for domain in domain_list_1: ip = socket.gethostbyname(domain) print '%.1f items per second' % (100/(time.time()-starttime)) >> Server A Results: 3.3 items per second >> Server B Result

我有两台服务器(让我把它们命名为A和B)

事实:

starttime = time.time()
for domain in domain_list_1:
    ip = socket.gethostbyname(domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 0.7 items per second
starttime = time.time()
for domain in domain_list_2:
    os.system('nslookup %s > /dev/null' % domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 3.3 items per second
12879 21:29:24.182590 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000035>
12879 21:29:24.182694 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000018>
12879 21:29:24.182778 sendto(5, "'!\1\0\0\1\0\0\0\0\0\0\njanadrakka\3com\0\0\1\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000040>
12879 21:29:24.182881 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.067000>
12879 21:29:24.249987 ioctl(5, FIONREAD, [130]) = 0 <0.000022>
12879 21:29:24.250100 recvfrom(5, "'!\201\200\0\1\0\1\0\2\0\2\njanadrakka\3com\0\0\1\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 130 <0.000032>
12879 21:29:24.250287 close(5)          = 0 <0.000053>
4850  21:28:55.501276 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000019>
4850  21:28:55.501348 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000014>
4850  21:28:55.501419 sendto(5, "\346\10\1\0\0\1\0\0\0\0\0\0\fdeghatgostar\3com\0\0\1"..., 34, MSG_NOSIGNAL, NULL, 0) = 34 <0.000036>
4850  21:28:55.501506 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.615731>
4850  21:28:56.117335 ioctl(5, FIONREAD, [129]) = 0 <0.000033>
4850  21:28:56.117429 recvfrom(5, "\346\10\201\200\0\1\0\1\0\2\0\2\fdeghatgostar\3com\0\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 129 <0.000011>
4850  21:28:56.117499 close(5)          = 0 <0.000009>
  • 他们有相同的CPU,内存,主板,硬盘,上行速度
  • 它们都在Ubuntu12.04和Python2.7.3以及Django的最新版本上
  • 它们还位于同一数据中心,具有相同的服务器设置名称
  • 它们的ping和traceroute结果与名称服务器类似
服务器A工作正常。我的问题是服务器B在使用python连接到internet时非常慢

下面是我在两台服务器上进行的测试(domain_list_1和domain_list_2是两个列表,每个列表中包含100个唯一的域):

测试一:

starttime = time.time()
for domain in domain_list_1:
    ip = socket.gethostbyname(domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 0.7 items per second
starttime = time.time()
for domain in domain_list_2:
    os.system('nslookup %s > /dev/null' % domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 3.3 items per second
12879 21:29:24.182590 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000035>
12879 21:29:24.182694 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000018>
12879 21:29:24.182778 sendto(5, "'!\1\0\0\1\0\0\0\0\0\0\njanadrakka\3com\0\0\1\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000040>
12879 21:29:24.182881 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.067000>
12879 21:29:24.249987 ioctl(5, FIONREAD, [130]) = 0 <0.000022>
12879 21:29:24.250100 recvfrom(5, "'!\201\200\0\1\0\1\0\2\0\2\njanadrakka\3com\0\0\1\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 130 <0.000032>
12879 21:29:24.250287 close(5)          = 0 <0.000053>
4850  21:28:55.501276 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000019>
4850  21:28:55.501348 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000014>
4850  21:28:55.501419 sendto(5, "\346\10\1\0\0\1\0\0\0\0\0\0\fdeghatgostar\3com\0\0\1"..., 34, MSG_NOSIGNAL, NULL, 0) = 34 <0.000036>
4850  21:28:55.501506 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.615731>
4850  21:28:56.117335 ioctl(5, FIONREAD, [129]) = 0 <0.000033>
4850  21:28:56.117429 recvfrom(5, "\346\10\201\200\0\1\0\1\0\2\0\2\fdeghatgostar\3com\0\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 129 <0.000011>
4850  21:28:56.117499 close(5)          = 0 <0.000009>
测试二:

starttime = time.time()
for domain in domain_list_1:
    ip = socket.gethostbyname(domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 0.7 items per second
starttime = time.time()
for domain in domain_list_2:
    os.system('nslookup %s > /dev/null' % domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 3.3 items per second
12879 21:29:24.182590 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000035>
12879 21:29:24.182694 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000018>
12879 21:29:24.182778 sendto(5, "'!\1\0\0\1\0\0\0\0\0\0\njanadrakka\3com\0\0\1\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000040>
12879 21:29:24.182881 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.067000>
12879 21:29:24.249987 ioctl(5, FIONREAD, [130]) = 0 <0.000022>
12879 21:29:24.250100 recvfrom(5, "'!\201\200\0\1\0\1\0\2\0\2\njanadrakka\3com\0\0\1\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 130 <0.000032>
12879 21:29:24.250287 close(5)          = 0 <0.000053>
4850  21:28:55.501276 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000019>
4850  21:28:55.501348 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000014>
4850  21:28:55.501419 sendto(5, "\346\10\1\0\0\1\0\0\0\0\0\0\fdeghatgostar\3com\0\0\1"..., 34, MSG_NOSIGNAL, NULL, 0) = 34 <0.000036>
4850  21:28:55.501506 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.615731>
4850  21:28:56.117335 ioctl(5, FIONREAD, [129]) = 0 <0.000033>
4850  21:28:56.117429 recvfrom(5, "\346\10\201\200\0\1\0\1\0\2\0\2\fdeghatgostar\3com\0\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 129 <0.000011>
4850  21:28:56.117499 close(5)          = 0 <0.000009>
从测试2中可以看到,服务器B上的联网没有问题

我对urllib2做了类似的测试,结果是相同的(服务器A可以,但服务器B使用urllib2比使用wget或curl做同样的工作要慢)。所以我认为这是一个Python问题。我只是不知道服务器B上的Python设置出了什么问题

是否有一种方法可以分析内部流程,并找出代码的哪一部分会减慢整个流程


提前谢谢你

根据Greg给出的建议,我查看了strace输出,发现如下:

服务器A:

starttime = time.time()
for domain in domain_list_1:
    ip = socket.gethostbyname(domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 0.7 items per second
starttime = time.time()
for domain in domain_list_2:
    os.system('nslookup %s > /dev/null' % domain)
print '%.1f items per second' % (100/(time.time()-starttime))
>> Server A Results: 3.3 items per second
>> Server B Results: 3.3 items per second
12879 21:29:24.182590 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000035>
12879 21:29:24.182694 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000018>
12879 21:29:24.182778 sendto(5, "'!\1\0\0\1\0\0\0\0\0\0\njanadrakka\3com\0\0\1\0\1", 32, MSG_NOSIGNAL, NULL, 0) = 32 <0.000040>
12879 21:29:24.182881 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.067000>
12879 21:29:24.249987 ioctl(5, FIONREAD, [130]) = 0 <0.000022>
12879 21:29:24.250100 recvfrom(5, "'!\201\200\0\1\0\1\0\2\0\2\njanadrakka\3com\0\0\1\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 130 <0.000032>
12879 21:29:24.250287 close(5)          = 0 <0.000053>
4850  21:28:55.501276 connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, 16) = 0 <0.000019>
4850  21:28:55.501348 poll([{fd=5, events=POLLOUT}], 1, 0) = 1 ([{fd=5, revents=POLLOUT}]) <0.000014>
4850  21:28:55.501419 sendto(5, "\346\10\1\0\0\1\0\0\0\0\0\0\fdeghatgostar\3com\0\0\1"..., 34, MSG_NOSIGNAL, NULL, 0) = 34 <0.000036>
4850  21:28:55.501506 poll([{fd=5, events=POLLIN}], 1, 5000) = 1 ([{fd=5, revents=POLLIN}]) <0.615731>
4850  21:28:56.117335 ioctl(5, FIONREAD, [129]) = 0 <0.000033>
4850  21:28:56.117429 recvfrom(5, "\346\10\201\200\0\1\0\1\0\2\0\2\fdeghatgostar\3com\0\0\1"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("206.251.73.9")}, [16]) = 129 <0.000011>
4850  21:28:56.117499 close(5)          = 0 <0.000009>

最后,感谢Greg的建议。

“连接到internet”与您所做的“按名称查找DNS条目”大不相同。您的名称解析系统可能在服务器B上配置错误。这不太可能是Python的问题,因为Python只是在调用操作系统。@Greg,如果操作系统级别的名称解析有问题,那么如何解释nslookup比Python的socket快得多。gethostbyname()?nslookup绕过您的本地解析程序库,因为它特定于DNS(请记住,名称可以通过多种方式解析,其中只有一种是DNS)。请检查主机名的查找方式。@Greg,/etc/nsswitch.conf在A&B上是完全相同的。
主机的行:
说明了什么?该行列出的服务是解析程序库检查主机名查找的内容。其中一个条目可能是
dns
,但可能还有其他条目(这可能会导致您的问题)。您还可以在
strace
下运行程序,并查看额外延迟的位置(但这样可能很难跟踪)。