Receiving the body content of an HTTP response in Python


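I am trying to get the content of the body, but whenever I call sock.recv I always get 0 bytes back. I already receive the header and that works fine, although I read it byte by byte. My problem now is: I have the content length, the length of the header, and the header itself. Now I want to separate the body from it (Task 3(d)).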

PS: I know it cannot work the way it is shown in the screenshot, but I have not found another solution yet.

  # -*- coding: utf-8 -*-
"""
task3.simple_web_browser
XX-YYY-ZZZ
<Your name>
"""

from socket import gethostbyname, socket, timeout, AF_INET, SOCK_STREAM
from sys import argv


HTTP_HEADER_DELIMITER = b'\r\n\r\n'
CONTENT_LENGTH_FIELD = b'Content-Length:'
HTTP_PORT = 80
ONE_BYTE_LENGTH = 1

def create_http_request(host, path, method='GET'):
    '''
    Create a sequence of bytes representing an HTTP/1.1 request of the given method.

    :param host: the string contains the hostname of the remote server
    :param path: the string contains the path to the document to retrieve
    :param method: the string contains the HTTP request method (e.g., 'GET', 'HEAD', etc...)
    :return: a bytes object contains the HTTP request to send to the remote server

    e.g.,) An HTTP/1.1 GET request to http://compass.unisg.ch/
    host: compass.unisg.ch
    path: /
    return: b'GET / HTTP/1.1\nHost: compass.unisg.ch\r\n\r\n'
    '''
    ###   Task 3(a)   ###

    # Hint 1: see RFC7230-7231 for the HTTP/1.1 syntax and semantics specification
    # https://tools.ietf.org/html/rfc7230
    # https://tools.ietf.org/html/rfc7231
    # Hint 2: use str.encode() to create an encoded version of the string as a bytes object
    # https://docs.python.org/3/library/stdtypes.html#str.encode
    r =  '{} {} HTTP/1.1\nHost: {}\r\n\r\n'.format(method, path, host)
    response = r.encode()

    return response


    ### Task 3(a) END ###


def get_content_length(header):
    '''
    Get the integer value from the Content-Length HTTP header field if it
    is found in the given sequence of bytes. Otherwise returns 0.

    :param header: the bytes object contains the HTTP header
    :return: an integer value of the Content-Length, 0 if not found
    '''
    ###   Task 3(c)   ###

    # Hint: use CONTENT_LENGTH_FIELD to find the value
    # Note that the Content-Length field may not be always at the end of the header.
    for line in header.split(b'\r\n'):
        if CONTENT_LENGTH_FIELD in line:
            return int(line[len(CONTENT_LENGTH_FIELD):])
    return 0


    ### Task 3(c) END ###


def receive_body(sock, content_length):
    '''
    Receive the body content in the HTTP response

    :param sock: the TCP socket connected to the remote server
    :param content_length: the size of the content to recieve
    :return: a bytes object contains the remaining content (body) in the HTTP response
    '''
    ###   Task 3(d)   ###
    body = bytes()
    data = bytes()


    while True:
        data = sock.recv(content_length)
        if len(data)<=0:
            break
        else:
            body += data

    return body 


    ### Task 3(d) END ###


def receive_http_response_header(sock):
    '''
    Receive the HTTP response header from the TCP socket.

    :param sock: the TCP socket connected to the remote server
    :return: a bytes object that is the HTTP response header received
    '''
    ###   Task 3(b)   ###

    # Hint 1: use HTTP_HEADER_DELIMITER to determine the end of the HTTP header
    # Hint 2: use sock.recv(ONE_BYTE_LENGTH) to receive the chunk byte-by-byte

    header = bytes() 
    chunk = bytes()

    try:
        while HTTP_HEADER_DELIMITER not in chunk:
            chunk = sock.recv(ONE_BYTE_LENGTH)
            if not chunk:
                break
            else:
                header += chunk
    except socket.timeout:
        pass

    return header  

    ### Task 3(b) END ###


def main():
    # Change the host and path below to test other web sites!
    host = 'example.com'
    path = '/index.html'
    print(f"# Retrieve data from http://{host}{path}")

    # Get the IP address of the host
    ip_address = gethostbyname(host)
    print(f"> Remote server {host} resolved as {ip_address}")

    # Establish the TCP connection to the host
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect((ip_address, HTTP_PORT))
    print(f"> TCP Connection to {ip_address}:{HTTP_PORT} established")

 # Uncomment this comment block after Task 3(a)
    # Send an HTTP GET request
    http_get_request = create_http_request(host, path)
    print('\n# HTTP GET request ({} bytes)'.format(len(http_get_request)))
    print(http_get_request)
    sock.sendall(http_get_request)
 # Comment block for Task 3(a) END

 # Uncomment this comment block after Task 3(b)
    # Receive the HTTP response header
    header = receive_http_response_header(sock)
    print(type(header))
    print('\n# HTTP Response Header ({} bytes)'.format(len(header)))
    print(header)
 # Comment block for Task 3(b) END

#  Uncomment this comment block after Task 3(c)
    content_length = get_content_length(header)
    print('\n# Content-Length')
    print(f"{content_length} bytes")
 # Comment block for Task 3(c) END

 # Uncomment this comment block after Task 3(d)
    body = receive_body(sock, content_length)
    print('\n# Body ({} bytes)'.format(len(body)))
    print(body)
 # Comment block for Task 3(d) END

if __name__ == '__main__':
    main()

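"I have the content length, the length of the header, and the header"

You don't. In receive_http_response_header, the check for HTTP_HEADER_DELIMITER only ever looks at the most recent byte (chunk instead of header), which means you never match the end of the header: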
    while HTTP_HEADER_DELIMITER not in chunk:
        chunk = sock.recv(ONE_BYTE_LENGTH)
        if not chunk:
            break
        else:
            header += chunk

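You then assume that you have read the complete header, when in reality you have read the complete response. This means that another recv, issued when you try to read the response body, just returns 0 bytes because there is no more data: the body is already contained in what you treat as the HTTP header.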
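One way the header loop could be corrected is sketched below. It keeps the byte-by-byte reading from the exercise but tests the accumulated header rather than the last chunk. The use of import socket (instead of the from-import in the question) and the socketpair-based demonstration are assumptions made only to keep the example self-contained.

```python
import socket

HTTP_HEADER_DELIMITER = b'\r\n\r\n'
ONE_BYTE_LENGTH = 1


def receive_http_response_header(sock):
    """Read one byte at a time until the accumulated data ends with the delimiter."""
    header = bytes()
    # Test the whole header collected so far, not only the last byte received.
    while not header.endswith(HTTP_HEADER_DELIMITER):
        chunk = sock.recv(ONE_BYTE_LENGTH)
        if not chunk:
            # The peer closed the connection before the header was complete.
            break
        header += chunk
    return header


if __name__ == '__main__':
    # Local demonstration with a connected socket pair instead of a real server.
    server_side, client_side = socket.socketpair()
    server_side.sendall(b'HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello')
    print(receive_http_response_header(client_side))
    server_side.close()
    client_side.close()
```

Because the loop stops right after the delimiter, the body bytes stay in the socket buffer and can be read by a later call.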
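Apart from that, receive_body is wrong too, because you make a similar mistake there as in receive_http_response_header: the goal is not to recv content_length bytes again and again until no more bytes are available, as you currently do. The goal is to return as soon as the length of body matches content_length, and to keep reading the remaining data while the body has not been fully read.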
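A minimal sketch of a receive_body that follows this rule might look like the following. It assumes that content_length was parsed correctly from the header and that the header bytes have already been consumed from the socket. The import socket line is only there for the local test described afterwards.

```python
import socket


def receive_body(sock, content_length):
    """Read exactly content_length bytes, or fewer if the peer closes early."""
    body = bytes()
    while len(body) < content_length:
        # Ask only for the bytes that are still missing.
        data = sock.recv(content_length - len(body))
        if not data:
            # Connection closed before the full body arrived.
            break
        body += data
    return body
```

Since the loop ends once len(body) reaches content_length, it does not wait for the server to close a keep-alive connection. It can be tried locally by writing a few bytes into one end of socket.socketpair() and calling receive_body on the other end.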

If it is an HTTP request, why can't you use the Python requests package? It is very simple to use, e.g. requests.get("http://example.com").content

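In this exercise you are not allowed to use the requests library…

@Chris Thanks for the hint, I have never used Stack Overflow before.

"I have the content length, the length of the header, and the header" - do you? In receive_http_response_header, the check for HTTP_HEADER_DELIMITER only ever looks at the most recent byte (chunk instead of header), which means you never match the end of the header.

@SteffenUllrich Yes, I had already found that mistake. Thanks anyway.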