Python - How do I use a custom buffer size in io.BufferedReader?
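As far as I understand, the buffer_size argument of io.BufferedReader should control the size of the read buffer passed to the underlying reader.

However, that is not the behaviour I see. Instead, when I call reader.read() to read the whole file, io.DEFAULT_BUFFER_SIZE is used and buffer_size is ignored. When I call reader.read(length), length is used as the buffer size, and the buffer_size argument is again ignored.

Minimal example: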
import io


class MyReader(io.RawIOBase):
    """Raw stream of `length` bytes that reports the size of every buffer it is asked to fill."""

    def __init__(self, length):
        self.length = length
        self.position = 0

    def readinto(self, b):
        print('read buffer length: %d' % len(b))
        length = min(len(b), self.length - self.position)
        self.position += length
        # Bytes literal so the assignment works on both Python 2.7 and Python 3
        b[:length] = b'a' * length
        return length

    def readable(self):
        return True

    def seekable(self):
        return False


print('# read entire file')
reader = io.BufferedReader(MyReader(20000), buffer_size=100)
print('output length: %d' % len(reader.read()))

print('\n# read part of file file')
reader = io.BufferedReader(MyReader(20000), buffer_size=100)
print('output length: %d' % len(reader.read(10000)))

print('\n# read beyond end of file file')
reader = io.BufferedReader(MyReader(20000), buffer_size=100)
print('output length: %d' % len(reader.read(30000)))
Output:
# read entire file
read buffer length: 8192
read buffer length: 8192
read buffer length: 8192
read buffer length: 8192
read buffer length: 8192
output length: 20000
# read part of file file
read buffer length: 10000
output length: 10000
# read beyond end of file file
read buffer length: 30000
read buffer length: 10000
output length: 20000
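Am I misunderstanding how BufferedReader works?

The point of io.BufferedReader is to keep an internal buffer, and buffer_size sets the size of that buffer. The buffer is used to satisfy smaller reads, so that many read calls on a slower I/O device are avoided.

The buffer does not, however, try to limit the size of reads.

From the io.BufferedReader documentation:

"When reading data from this object, a larger amount of data may be requested from the underlying raw stream, and kept in an internal buffer. The buffered data can then be returned directly on subsequent reads."

The object inherits from io.BufferedIOBase, whose documentation states:

"The main difference with RawIOBase is that methods read(), readinto() and write() will try (respectively) to read as much input as requested or to consume all given output, at the expense of making perhaps more than one system call."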
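The difference described in that quote can be seen in a small sketch. It assumes Python 3 and a hypothetical raw stream, TrickleRaw (not part of the question), that deliberately hands out at most 10 bytes per readinto() call. Reading from the raw stream directly returns a short result, while the buffered wrapper keeps calling the raw stream until the request is satisfied.

```python
import io


class TrickleRaw(io.RawIOBase):
    """Hypothetical raw stream that hands out at most 10 bytes per call."""

    def __init__(self, length):
        self.remaining = length
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1
        n = min(len(b), self.remaining, 10)
        if n == 0:
            return 0  # end of stream
        b[:n] = b"x" * n
        self.remaining -= n
        return n


# Raw read: a single readinto() call, so the result may be shorter than asked.
raw = TrickleRaw(1000)
raw_data = raw.read(100)
raw_calls = raw.calls

# Buffered read: calls the raw stream repeatedly until 100 bytes are collected.
wrapped = TrickleRaw(1000)
buffered_data = io.BufferedReader(wrapped).read(100)
buffered_calls = wrapped.calls

print("raw:", len(raw_data), "bytes in", raw_calls, "call(s)")
print("buffered:", len(buffered_data), "bytes in", buffered_calls, "call(s)")
```

The exact number of raw calls the buffered reader makes is an implementation detail. What the documentation describes is that it tries to return the full amount requested.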
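Because you called .read() on the object, larger blocks are read from the wrapped object so that all data is read to the end. The internal buffer held by the BufferedReader instance does not come into play here; you asked for all the data, after all.

The buffer does come into play if you read in smaller blocks: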
>>> reader = io.BufferedReader(MyReader(2048), buffer_size=512)
>>> __ = reader.read(42) # initial read, fill buffer
read buffer length: 512
>>> __ = reader.read(123) # within the buffer, no read to underlying file needed
>>> __ = reader.read(456) # deplete buffer, another read needed to re-fill
read buffer length: 512
>>> __ = reader.read(123) # within the buffer, no read to underlying file needed
>>> __ = reader.read() # read until end, uses larger blocks to read from wrapped file
read buffer length: 8192
read buffer length: 8192
read buffer length: 8192
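If the goal is to keep requests to the underlying stream small, one option is to never ask the buffered reader for more than buffer_size bytes at a time. The sketch below assumes Python 3 on CPython. RecordingRaw and read_in_chunks are hypothetical helpers, not part of the io module, and the bound on raw request sizes is observed behaviour rather than a documented guarantee.

```python
import io


class RecordingRaw(io.RawIOBase):
    """Hypothetical raw stream that records the size of every readinto() request."""

    def __init__(self, length):
        self.remaining = length
        self.requests = []

    def readable(self):
        return True

    def readinto(self, b):
        self.requests.append(len(b))
        n = min(len(b), self.remaining)
        if n == 0:
            return 0  # end of stream
        b[:n] = b"a" * n
        self.remaining -= n
        return n


def read_in_chunks(reader, chunk_size):
    """Yield successive chunks of at most chunk_size bytes until end of stream."""
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        yield chunk


BUFFER_SIZE = 100
source = RecordingRaw(20000)
chunked_reader = io.BufferedReader(source, buffer_size=BUFFER_SIZE)
total = sum(len(chunk) for chunk in read_in_chunks(chunked_reader, BUFFER_SIZE))

print("bytes read:", total)
print("largest raw request:", max(source.requests))
```

Compared with a single reader.read(), this trades fewer large requests for many small ones, which only makes sense when the underlying stream really does need bounded request sizes.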