Python HTTP banner grabbing


I am trying to implement HTTP banner grabbing. I wrote:

import socket

# ip_address holds the target host and is defined elsewhere
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(2)
s.connect((ip_address, 80))

byte = str.encode("Server:\r\n")
s.send(byte)
banner = s.recv(1024)
print(banner)

It should print Bad Request along with further information about the server, but instead it prints the HTML that a browser would receive.

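When an HTTP web server receives something like Server:\r\n from your client where it expects an HTTP method, the request is meaningless to it, and it may return a response that contains both headers and content.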
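To make the point concrete, here is a rough sketch of how a server might check the first line of a request. The regular expression and the helper name `parse_request_line` are illustrative assumptions, not how any particular web server is implemented; real servers are more permissive and more involved.

```python
import re

# Rough shape of an HTTP/1.x request line: METHOD SP TARGET SP HTTP/x.y
# (an assumption for illustration, not a full grammar)
REQUEST_LINE = re.compile(r'^([A-Z]+) (\S+) (HTTP/\d\.\d)$')


def parse_request_line(raw):
    """Return (method, target, version), or None if `raw` is not a request line."""
    first_line = raw.split('\r\n', 1)[0]
    match = REQUEST_LINE.match(first_line)
    return match.groups() if match else None


print(parse_request_line("HEAD / HTTP/1.1\r\n"))  # ('HEAD', '/', 'HTTP/1.1')
print(parse_request_line("Server:\r\n"))          # None
```

The payload from the question has no method, target, or version, so a check like this rejects it, and the server answers with an error response.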

RFC 2616, section 10.4 (Client Error 4xx), says:

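The 4xx class of status code is intended for cases in which the client seems to have erred. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These status codes are applicable to any request method. User agents SHOULD display any included entity to the user.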

So if you only need the header part of the response, send an HTTP HEAD request.

Here is an example:

import socket


def http_banner_grabber(ip, port=80, method="HEAD",
                        timeout=60, http_type="HTTP/1.1"):
    assert method in ['GET', 'HEAD']
    # @see: http://stackoverflow.com/q/246859/538284
    assert http_type in ['HTTP/0.9', "HTTP/1.0", 'HTTP/1.1']
    cr_lf = '\r\n'
    lf_lf = '\n\n'
    crlf_crlf = cr_lf + cr_lf
    res_sep = ''
    # number of bytes to request from the socket on each recv() call
    rec_chunk = 4096
    s = socket.socket()
    s.settimeout(timeout)
    s.connect((ip, port))
    # the request line looks like 'HEAD / HTTP/1.1\r\n'
    req_data = "{} / {}{}".format(method, http_type, cr_lf)
    # if is a HTTP 1.1 protocol request,
    if http_type == "HTTP/1.1":
        # then we need to send Host header (we send ip instead of host here!)
        # adding host header to req_data like 'Host: google.com:80\r\n'
        req_data += 'Host: {}:{}{}'.format(ip, port, cr_lf)
        # set connection header to close for HTTP 1.1
        # adding connection header to req_data like 'Connection: close\r\n'
        req_data += "Connection: close{}".format(cr_lf)
    # headers join together with `\r\n` and ends with `\r\n\r\n`
    # adding '\r\n' to end of req_data
    req_data += cr_lf
    # the s.send() method may send only partial content. 
    # so we used s.sendall()
    s.sendall(req_data.encode())
    res_data = b''
    # default maximum header response is different in web servers: 4k, 8k, 16k
    # @see: http://stackoverflow.com/a/8623061/538284
    # the s.recv(n) method may receive less than n bytes, 
    # so we used it in while.
    while 1:
        try:
            chunk = s.recv(rec_chunk)
            res_data += chunk
        except socket.error:
            break
        if not chunk:
            break
    if res_data:
        # decode `res_data` after reading all content of data buffer;
        # undecodable bytes are replaced instead of raising UnicodeDecodeError
        res_data = res_data.decode('utf-8', errors='replace')
    else:
        return '', ''
    # detect header and body separated that is '\r\n\r\n' or '\n\n'
    if crlf_crlf in res_data:
        res_sep = crlf_crlf
    elif lf_lf in res_data:
        res_sep = lf_lf
    # for request types below HTTP/1.0, a server may send
    # just the body without any header
    if res_sep not in [crlf_crlf, lf_lf] or res_data.startswith('<'):
        return '', res_data
    # split header and body sections from a
    # `HEADER\r\n\r\nBODY` or `HEADER\n\nBODY` response at the first
    # separator only, so blank lines inside the body are preserved
    banner, body = res_data.split(res_sep, 1)
    return banner, body

You can also try it with the GET method and other options:

for domain, ip in addresses.items():
    banner, body = http_banner_grabber(ip, method="GET", http_type='HTTP/0.9')
    print('*' * 24)
    print(domain, ip, 'GET HTTP/0.9')
    print(banner)
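
The output of the first example (HEAD over HTTP/1.1) and of the second example (GET with HTTP/0.9) is listed at the end of this page.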

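Now, if you compare the Server header of msdn.microsoft.com and google.com across our two examples, this tool lets us discover something new:

  • For an HTTP 1.1 request, the google.com Server is gws; for an HTTP 0.9 request, Server changes to GFE/2.0.

  • For an HTTP 1.1 request, the msdn.microsoft.com Server is Microsoft-IIS/8.0; for an HTTP 0.9 request, Server changes to Microsoft-HTTPAPI/2.0.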

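Making a HEAD request with httplib might be simpler.

@J.F.Sebastian Yes, httplib would work too, but socket is more conceptual.

Thanks, your suggestion has been applied.

The response may use \n instead of \r\n (you should send \r\n, but be prepared to receive either \r\n or \n).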

Output (first example, HEAD over HTTP/1.1):

************************
google.com 216.239.32.20 HEAD HTTP/1.1
HTTP/1.1 200 OK
Date: Mon, 31 Mar 2014 01:25:53 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: **** line was too long and was removed ****
P3P: **** line was too long and was removed ****
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Connection: close
************************
msdn.microsoft.com 157.56.148.19 HEAD HTTP/1.1
HTTP/1.1 301 Moved Permanently
Content-Length: 0
Location: http://157.56.148.19/en-us/default.aspx
Server: Microsoft-IIS/8.0
P3P: **** line was too long and was removed ****
X-Powered-By: ASP.NET
X-Instance: CH104
Date: Mon, 31 Mar 2014 01:25:53 GMT
Connection: close

Output (second example, GET with HTTP/0.9):

************************
msdn.microsoft.com 157.56.148.19 GET HTTP/0.9
HTTP/1.1 400 Bad Request
Content-Type: text/html; charset=us-ascii
Server: Microsoft-HTTPAPI/2.0
Date: Mon, 31 Mar 2014 01:27:13 GMT
Connection: close
Content-Length: 311
************************
google.com 216.239.32.20 GET HTTP/0.9
HTTP/1.0 400 Bad Request
Content-Type: text/html; charset=UTF-8
Content-Length: 1419
Date: Mon, 31 Mar 2014 01:27:14 GMT
Server: GFE/2.0