python whois库对活动域名不返回任何内容_Python_Dns_Records_Whois

python whois库对活动域名不返回任何内容

python dns

python whois库对活动域名不返回任何内容,python,dns,records,whois,Python,Dns,Records,Whois,我正在尝试使用python whois库来收集一些网站的whois记录问题是，我没有得到一些网站，如nih.gov，这是一个活跃的域名 w = whois.whois("nih.gov") print w {u'updated_date': None, u'status': u'ACTIVE', u'name': None, u'dnssec': None, u'city': None, u'expiration_date': None, u'zipcode': None, u'domain_

我正在尝试使用python whois库来收集一些网站的whois记录

问题是，我没有得到一些网站，如nih.gov，这是一个活跃的域名

w = whois.whois("nih.gov")
print w
{u'updated_date': None, u'status': u'ACTIVE', u'name': None, u'dnssec': None, u'city': None, u'expiration_date': None, u'zipcode': None, u'domain_name': u'NIH.GOV', u'country': None, u'whois_server': None, u'state': None, u'registrar': None, u'referral_url': None, u'address': None, u'name_servers': None, u'org': None, u'creation_date': None, u'emails': None}

我不明白问题出在哪里，我应该使用哪个库或如何使用它来涵盖所有情况？

以下是一些代码

import sys
import socket
from datetime import datetime as dt
import time

def whois(ip):

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("whois.arin.net", 43))
    s.send(('n ' + ip + '\r\n').encode())

    response = b""

    # setting time limit in secondsmd
    startTime = time.mktime(dt.now().timetuple())
    timeLimit = 3
    while True:
        elapsedTime = time.mktime(dt.now().timetuple()) - startTime
        data = s.recv(4096)
        response += data
        if (not data) or (elapsedTime >= timeLimit):
            break
    s.close()

    print(response.decode())

def main():
    domain = sys.argv[1];
    ip = socket.gethostbyname(domain);
    whois(ip)

main()

例如：

c:\Temp>py test.py www.google.com

#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#


#
# The following results may also be obtained via:
# https://whois.arin.net/rest/nets;q=216.58.213.196?showDetails=true&showARIN=false&showNonArinTopLevelNet=false&ext=netref2
#

NetRange:       216.58.192.0 - 216.58.223.255
CIDR:           216.58.192.0/19
NetName:        GOOGLE
NetHandle:      NET-216-58-192-0-1
Parent:         NET216 (NET-216-0-0-0-0)
NetType:        Direct Allocation
OriginAS:       AS15169
Organization:   Google LLC (GOGL)
RegDate:        2012-01-27
Updated:        2012-01-27
Ref:            https://whois.arin.net/rest/net/NET-216-58-192-0-1


OrgName:        Google LLC
OrgId:          GOGL
Address:        1600 Amphitheatre Parkway
City:           Mountain View
StateProv:      CA
PostalCode:     94043
Country:        US
RegDate:        2000-03-30
Updated:        2017-12-21
Ref:            https://whois.arin.net/rest/org/GOGL


OrgAbuseHandle: ABUSE5250-ARIN
OrgAbuseName:   Abuse
OrgAbusePhone:  +1-650-253-0000
OrgAbuseEmail:  network-abuse@google.com
OrgAbuseRef:    https://whois.arin.net/rest/poc/ABUSE5250-ARIN

OrgTechHandle: ZG39-ARIN
OrgTechName:   Google LLC
OrgTechPhone:  +1-650-253-0000
OrgTechEmail:  arin-contact@google.com
OrgTechRef:    https://whois.arin.net/rest/poc/ZG39-ARIN


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#

特别是www.nih.gov，我们得到：

c:\Temp>py test.py www.nih.gov

#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#


#
# The following results may also be obtained via:
# https://whois.arin.net/rest/nets;q=23.21.241.1?showDetails=true&showARIN=false&showNonArinTopLevelNet=false&ext=netref2
#

NetRange:       23.20.0.0 - 23.23.255.255
CIDR:           23.20.0.0/14
NetName:        AMAZON-EC2-USEAST-10
NetHandle:      NET-23-20-0-0-1
Parent:         NET23 (NET-23-0-0-0-0)
NetType:        Direct Allocation
OriginAS:       AS16509
Organization:   Amazon.com, Inc. (AMAZO-4)
RegDate:        2011-09-19
Updated:        2014-09-03
Comment:        The activity you have detected originates from a dynamic hosting environment.
Comment:        For fastest response, please submit abuse reports at http://aws-portal.amazon.com/gp/aws/html-forms-controller/contactus/AWSAbuse
Comment:        For more information regarding EC2 see:
Comment:        http://ec2.amazonaws.com/
Comment:        All reports MUST include:
Comment:        * src IP
Comment:        * dest IP (your IP)
Comment:        * dest port
Comment:        * Accurate date/timestamp and timezone of activity
Comment:        * Intensity/frequency (short log extracts)
Comment:        * Your contact details (phone and email) Without these we will be unable to identify the correct owner of the IP address at that point in time.
Ref:            https://whois.arin.net/rest/net/NET-23-20-0-0-1


OrgName:        Amazon.com, Inc.
OrgId:          AMAZO-4
Address:        Amazon Web Services, Inc.
Address:        P.O. Box 81226
City:           Seattle
StateProv:      WA
PostalCode:     98108-1226
Country:        US
RegDate:        2005-09-29
Updated:        2017-01-28
Comment:        For details of this service please see
Comment:        http://ec2.amazonaws.com/
Ref:            https://whois.arin.net/rest/org/AMAZO-4


OrgAbuseHandle: AEA8-ARIN
OrgAbuseName:   Amazon EC2 Abuse
OrgAbusePhone:  +1-206-266-4064
OrgAbuseEmail:  abuse@amazonaws.com
OrgAbuseRef:    https://whois.arin.net/rest/poc/AEA8-ARIN

OrgTechHandle: ANO24-ARIN
OrgTechName:   Amazon EC2 Network Operations
OrgTechPhone:  +1-206-266-4064
OrgTechEmail:  amzn-noc-contact@amazon.com
OrgTechRef:    https://whois.arin.net/rest/poc/ANO24-ARIN

OrgNOCHandle: AANO1-ARIN
OrgNOCName:   Amazon AWS Network Operations
OrgNOCPhone:  +1-206-266-4064
OrgNOCEmail:  amzn-noc-contact@amazon.com
OrgNOCRef:    https://whois.arin.net/rest/poc/AANO1-ARIN


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#

不同的选项

这是另一个选择

这段代码在脚本文件夹中创建了一个文件，其中包含来自不同服务的whois请求的HTML。你可以修改它以满足你的需要，我刚刚写了一些基础知识

import urllib.request
import tempfile
import io
from bs4 import BeautifulSoup
import sys

def writeFile(text):
    with io.open('whoisData.txt', "w", encoding="utf-8") as f:
        f.write(text)
    f.close()

def readHTML(domain):
    url = 'https://www.whois.com/whois/' + domain
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html)

    # kill all script and style elements
    for script in soup(["script", "style"]):
        script.extract()    # rip it out

    # get text
    text = soup.get_text()

    # break into lines and remove leading and trailing space on each
    lines = (line.strip() for line in text.splitlines())
    # break multi-headlines into a line each
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # drop blank lines
    text = '\n'.join(chunk for chunk in chunks if chunk)
    writeFile(text)

def main():
    domain = sys.argv[1]
    readHTML(domain)

main()

从（解析HTMLs）中获取了一些参考。

这里有一些代码可以完成这项工作

import sys
import socket
from datetime import datetime as dt
import time

def whois(ip):

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("whois.arin.net", 43))
    s.send(('n ' + ip + '\r\n').encode())

    response = b""

    # setting time limit in secondsmd
    startTime = time.mktime(dt.now().timetuple())
    timeLimit = 3
    while True:
        elapsedTime = time.mktime(dt.now().timetuple()) - startTime
        data = s.recv(4096)
        response += data
        if (not data) or (elapsedTime >= timeLimit):
            break
    s.close()

    print(response.decode())

def main():
    domain = sys.argv[1];
    ip = socket.gethostbyname(domain);
    whois(ip)

main()

例如：

c:\Temp>py test.py www.google.com

#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#


#
# The following results may also be obtained via:
# https://whois.arin.net/rest/nets;q=216.58.213.196?showDetails=true&showARIN=false&showNonArinTopLevelNet=false&ext=netref2
#

NetRange:       216.58.192.0 - 216.58.223.255
CIDR:           216.58.192.0/19
NetName:        GOOGLE
NetHandle:      NET-216-58-192-0-1
Parent:         NET216 (NET-216-0-0-0-0)
NetType:        Direct Allocation
OriginAS:       AS15169
Organization:   Google LLC (GOGL)
RegDate:        2012-01-27
Updated:        2012-01-27
Ref:            https://whois.arin.net/rest/net/NET-216-58-192-0-1


OrgName:        Google LLC
OrgId:          GOGL
Address:        1600 Amphitheatre Parkway
City:           Mountain View
StateProv:      CA
PostalCode:     94043
Country:        US
RegDate:        2000-03-30
Updated:        2017-12-21
Ref:            https://whois.arin.net/rest/org/GOGL


OrgAbuseHandle: ABUSE5250-ARIN
OrgAbuseName:   Abuse
OrgAbusePhone:  +1-650-253-0000
OrgAbuseEmail:  network-abuse@google.com
OrgAbuseRef:    https://whois.arin.net/rest/poc/ABUSE5250-ARIN

OrgTechHandle: ZG39-ARIN
OrgTechName:   Google LLC
OrgTechPhone:  +1-650-253-0000
OrgTechEmail:  arin-contact@google.com
OrgTechRef:    https://whois.arin.net/rest/poc/ZG39-ARIN


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#

特别是www.nih.gov，我们得到：

c:\Temp>py test.py www.nih.gov

#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#


#
# The following results may also be obtained via:
# https://whois.arin.net/rest/nets;q=23.21.241.1?showDetails=true&showARIN=false&showNonArinTopLevelNet=false&ext=netref2
#

NetRange:       23.20.0.0 - 23.23.255.255
CIDR:           23.20.0.0/14
NetName:        AMAZON-EC2-USEAST-10
NetHandle:      NET-23-20-0-0-1
Parent:         NET23 (NET-23-0-0-0-0)
NetType:        Direct Allocation
OriginAS:       AS16509
Organization:   Amazon.com, Inc. (AMAZO-4)
RegDate:        2011-09-19
Updated:        2014-09-03
Comment:        The activity you have detected originates from a dynamic hosting environment.
Comment:        For fastest response, please submit abuse reports at http://aws-portal.amazon.com/gp/aws/html-forms-controller/contactus/AWSAbuse
Comment:        For more information regarding EC2 see:
Comment:        http://ec2.amazonaws.com/
Comment:        All reports MUST include:
Comment:        * src IP
Comment:        * dest IP (your IP)
Comment:        * dest port
Comment:        * Accurate date/timestamp and timezone of activity
Comment:        * Intensity/frequency (short log extracts)
Comment:        * Your contact details (phone and email) Without these we will be unable to identify the correct owner of the IP address at that point in time.
Ref:            https://whois.arin.net/rest/net/NET-23-20-0-0-1


OrgName:        Amazon.com, Inc.
OrgId:          AMAZO-4
Address:        Amazon Web Services, Inc.
Address:        P.O. Box 81226
City:           Seattle
StateProv:      WA
PostalCode:     98108-1226
Country:        US
RegDate:        2005-09-29
Updated:        2017-01-28
Comment:        For details of this service please see
Comment:        http://ec2.amazonaws.com/
Ref:            https://whois.arin.net/rest/org/AMAZO-4


OrgAbuseHandle: AEA8-ARIN
OrgAbuseName:   Amazon EC2 Abuse
OrgAbusePhone:  +1-206-266-4064
OrgAbuseEmail:  abuse@amazonaws.com
OrgAbuseRef:    https://whois.arin.net/rest/poc/AEA8-ARIN

OrgTechHandle: ANO24-ARIN
OrgTechName:   Amazon EC2 Network Operations
OrgTechPhone:  +1-206-266-4064
OrgTechEmail:  amzn-noc-contact@amazon.com
OrgTechRef:    https://whois.arin.net/rest/poc/ANO24-ARIN

OrgNOCHandle: AANO1-ARIN
OrgNOCName:   Amazon AWS Network Operations
OrgNOCPhone:  +1-206-266-4064
OrgNOCEmail:  amzn-noc-contact@amazon.com
OrgNOCRef:    https://whois.arin.net/rest/poc/AANO1-ARIN


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/public/whoisinaccuracy/index.xhtml
#

不同的选项

这是另一个选择

这段代码在脚本文件夹中创建了一个文件，其中包含来自不同服务的whois请求的HTML。你可以修改它以满足你的需要，我刚刚写了一些基础知识

import urllib.request
import tempfile
import io
from bs4 import BeautifulSoup
import sys

def writeFile(text):
    with io.open('whoisData.txt', "w", encoding="utf-8") as f:
        f.write(text)
    f.close()

def readHTML(domain):
    url = 'https://www.whois.com/whois/' + domain
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html)

    # kill all script and style elements
    for script in soup(["script", "style"]):
        script.extract()    # rip it out

    # get text
    text = soup.get_text()

    # break into lines and remove leading and trailing space on each
    lines = (line.strip() for line in text.splitlines())
    # break multi-headlines into a line each
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # drop blank lines
    text = '\n'.join(chunk for chunk in chunks if chunk)
    writeFile(text)

def main():
    domain = sys.argv[1]
    readHTML(domain)

main()

从（解析HTMLs）中获取了一些参考。

比较，比如。这似乎是

whois

提供的所有信息。比较一下，比如。看来这是whois提供的所有信息。谢谢@oBit91，我会用的。效率很高。还有一件事，当我使用python whois库时，我可以提取whois响应的注册记录。我们可以通过response.decode（）提取注册器名称吗？很抱歉，我现在无法访问我的笔记本电脑write。但它似乎不返回whois记录。我使用python test.py bbc.com进行测试，打印的值与whois.whois（“bbc.com”）输出的值不同，后者更有效。@ShahroozPooryousef我添加了第二个选项，使用在线web解析HTML文件。无论哪种方式，您都可以看到最适合自己的内容。

whois.arin.net

将用于IP地址，因此您可以使用解析的主机名（

www.google.com

，在您的示例中）来查询它，而不是使用域名，实际上可能无法解析。至于使用whois.com这只是众多提供whois访问权限的公司之一，你应该确保阅读他们的TOS，如果他们不收集或使用你的数据，你应该感到高兴。。。（换言之：最好是查询注册表whois服务器，而不是第三方）谢谢@oBit91，我将使用它。效率很高。还有一件事，当我使用python whois库时，我可以提取whois响应的注册记录。我们可以通过response.decode（）提取注册器名称吗？很抱歉，我现在无法访问我的笔记本电脑write。但它似乎不返回whois记录。我使用python test.py bbc.com进行测试，打印的值与whois.whois（“bbc.com”）输出的值不同，后者更有效。@ShahroozPooryousef我添加了第二个选项，使用在线web解析HTML文件。无论哪种方式，您都可以看到最适合自己的内容。

whois.arin.net

将用于IP地址，因此您可以使用解析的主机名（

www.google.com

，在您的示例中）来查询它，而不是使用域名，实际上可能无法解析。至于使用whois.com这只是众多提供whois访问权限的公司之一，你应该确保阅读他们的TOS，如果他们不收集或使用你的数据，你应该感到高兴。。。（换言之：最好查询注册表whois服务器，而不是第三方）