Receiving the body content of an HTTP response in Python

python, sockets, http, recv

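I am trying to get the content of the body, but whenever I call sock.recv I get 0 bytes back. I already receive the header and that works fine, although I read it byte by byte. My problem now is this: I have the content length, the length of the header, and the header itself. Now I want to separate the body from it (Task 3d).

PS: I know it cannot work the way it is shown here, but I have not found another solution yet.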

# -*- coding: utf-8 -*-
"""
task3.simple_web_browser
XX-YYY-ZZZ
<Your name>
"""

from socket import gethostbyname, socket, timeout, AF_INET, SOCK_STREAM
from sys import argv


HTTP_HEADER_DELIMITER = b'\r\n\r\n'
CONTENT_LENGTH_FIELD = b'Content-Length:'
HTTP_PORT = 80
ONE_BYTE_LENGTH = 1

def create_http_request(host, path, method='GET'):
    '''
    Create a sequence of bytes representing an HTTP/1.1 request of the given method.

    :param host: the string contains the hostname of the remote server
    :param path: the string contains the path to the document to retrieve
    :param method: the string contains the HTTP request method (e.g., 'GET', 'HEAD', etc...)
    :return: a bytes object contains the HTTP request to send to the remote server

    e.g.,) An HTTP/1.1 GET request to http://compass.unisg.ch/
    host: compass.unisg.ch
    path: /
    return: b'GET / HTTP/1.1\nHost: compass.unisg.ch\r\n\r\n'
    '''
    ###   Task 3(a)   ###

    # Hint 1: see RFC7230-7231 for the HTTP/1.1 syntax and semantics specification
    # https://tools.ietf.org/html/rfc7230
    # https://tools.ietf.org/html/rfc7231
    # Hint 2: use str.encode() to create an encoded version of the string as a bytes object
    # https://docs.python.org/3/library/stdtypes.html#str.encode
    r =  '{} {} HTTP/1.1\nHost: {}\r\n\r\n'.format(method, path, host)
    response = r.encode()

    return response


    ### Task 3(a) END ###


def get_content_length(header):
    '''
    Get the integer value from the Content-Length HTTP header field if it
    is found in the given sequence of bytes. Otherwise returns 0.

    :param header: the bytes object contains the HTTP header
    :return: an integer value of the Content-Length, 0 if not found
    '''
    ###   Task 3(c)   ###

    # Hint: use CONTENT_LENGTH_FIELD to find the value
    # Note that the Content-Length field may not be always at the end of the header.
    for line in header.split(b'\r\n'):
        if CONTENT_LENGTH_FIELD in line:
            return int(line[len(CONTENT_LENGTH_FIELD):])
    return 0


    ### Task 3(c) END ###


def receive_body(sock, content_length):
    '''
    Receive the body content in the HTTP response

    :param sock: the TCP socket connected to the remote server
    :param content_length: the size of the content to receive
    :return: a bytes object contains the remaining content (body) in the HTTP response
    '''
    ###   Task 3(d)   ###
    body = bytes()
    data = bytes()


    while True:
        data = sock.recv(content_length)
        if len(data)<=0:
            break
        else:
            body += data

    return body 


    ### Task 3(d) END ###


def receive_http_response_header(sock):
    '''
    Receive the HTTP response header from the TCP socket.

    :param sock: the TCP socket connected to the remote server
    :return: a bytes object that is the HTTP response header received
    '''
    ###   Task 3(b)   ###

    # Hint 1: use HTTP_HEADER_DELIMITER to determine the end of the HTTP header
    # Hint 2: use sock.recv(ONE_BYTE_LENGTH) to receive the chunk byte-by-byte

    header = bytes() 
    chunk = bytes()

    try:
        while HTTP_HEADER_DELIMITER not in chunk:
            chunk = sock.recv(ONE_BYTE_LENGTH)
            if not chunk:
                break
            else:
                header += chunk
    except socket.timeout:
        pass

    return header  

    ### Task 3(b) END ###


def main():
    # Change the host and path below to test other web sites!
    host = 'example.com'
    path = '/index.html'
    print(f"# Retrieve data from http://{host}{path}")

    # Get the IP address of the host
    ip_address = gethostbyname(host)
    print(f"> Remote server {host} resolved as {ip_address}")

    # Establish the TCP connection to the host
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect((ip_address, HTTP_PORT))
    print(f"> TCP Connection to {ip_address}:{HTTP_PORT} established")

 # Uncomment this comment block after Task 3(a)
    # Send an HTTP GET request
    http_get_request = create_http_request(host, path)
    print('\n# HTTP GET request ({} bytes)'.format(len(http_get_request)))
    print(http_get_request)
    sock.sendall(http_get_request)
 # Comment block for Task 3(a) END

 # Uncomment this comment block after Task 3(b)
    # Receive the HTTP response header
    header = receive_http_response_header(sock)
    print(type(header))
    print('\n# HTTP Response Header ({} bytes)'.format(len(header)))
    print(header)
 # Comment block for Task 3(b) END

#  Uncomment this comment block after Task 3(c)
    content_length = get_content_length(header)
    print('\n# Content-Length')
    print(f"{content_length} bytes")
 # Comment block for Task 3(c) END

 # Uncomment this comment block after Task 3(d)
    body = receive_body(sock, content_length)
    print('\n# Body ({} bytes)'.format(len(body)))
    print(body)
 # Comment block for Task 3(d) END

if __name__ == '__main__':
    main()
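"I have the content length, the length of the header, and the header."

You don't. In receive_http_response_header, the check for HTTP_HEADER_DELIMITER always looks only at the latest byte (chunk instead of header), which means you will never match the end of the header: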

    while HTTP_HEADER_DELIMITER not in chunk:
        chunk = sock.recv(ONE_BYTE_LENGTH)
        if not chunk:
            break
        else:
            header += chunk
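You then assume that you have read the complete header, when in fact you have read the complete response. As a result, the next recv, issued when you try to read the response body, returns 0 bytes because no data is left: the body is already contained in what you treat as the HTTP header.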
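A minimal sketch of how the header loop could be repaired, assuming the same constants as in the question and keeping the byte-by-byte reading the exercise asks for. The only substantive change is that the delimiter is tested against the accumulated header rather than against the one-byte chunk; the use of bytearray and endswith is an illustrative choice, not part of the original task.

```python
HTTP_HEADER_DELIMITER = b'\r\n\r\n'
ONE_BYTE_LENGTH = 1


def receive_http_response_header(sock):
    """Read one byte at a time until the header delimiter has been received."""
    header = bytearray()
    # Test the accumulated bytes: a one-byte chunk can never contain
    # the four-byte delimiter.
    while not header.endswith(HTTP_HEADER_DELIMITER):
        chunk = sock.recv(ONE_BYTE_LENGTH)
        if not chunk:
            # The peer closed the connection before the header was complete.
            break
        header += chunk
    return bytes(header)
```

Because the loop stops right after the delimiter, the body bytes stay unread in the socket and are still available to a later recv call.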
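Apart from that, receive_body is also wrong, because it makes a mistake similar to the one in receive_http_response_header. The goal is not to call recv for content_length bytes over and over until no more bytes are available, as you currently do. The goal is to return once the length of body matches content_length, and to keep reading the remaining data while the body has not been fully read.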
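The loop described above could look roughly like the following sketch. It assumes that content_length was parsed correctly from the header and that the header was consumed exactly up to the delimiter; it does not handle chunked transfer encoding or responses without a Content-Length field.

```python
def receive_body(sock, content_length):
    """Read exactly content_length bytes, or fewer if the peer closes early."""
    body = bytearray()
    while len(body) < content_length:
        # recv may return fewer bytes than requested, so ask only for
        # what is still missing and keep looping.
        data = sock.recv(content_length - len(body))
        if not data:
            # Connection closed before the full body arrived.
            break
        body += data
    return bytes(body)
```

Returning as soon as len(body) reaches content_length matters on HTTP/1.1 connections, where the server may keep the connection open and a further recv would block instead of returning 0 bytes.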

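If it is an HTTP request, why can't you use the Python requests package? It is very simple to use, e.g. requests.get("http://example.com").content

In this exercise you are not allowed to use the requests library...

@Chris Thanks for the hint, I had never used stackoverflow before.

"I have the content length, the length of the header, and the header" - do you? In receive_http_response_header you check for HTTP_HEADER_DELIMITER only in the latest byte (chunk instead of header), which means you will never match the end of the header.

@SteffenUllrich Yes, I had already found that mistake. Thanks anyway.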