sqlite in a Python pandas notebook returns an IncompleteRead error
I'm connecting to a website, and I've created two tables to store the data. I'm trying to collect data after a certain date, but an error is returned. I don't know whether I need to add code to my handler section, but the code I use to collect the data looks like this:
record_cnt = 0
for link in data_list_post:
    data = pd.read_table(link, sep=',')  # fetches the CSV over HTTP
    print('%s: %s rows %s columns' % (link[-10:-4], data.shape[0], data.shape[1]))
    record_cnt += data.shape[0]
    data.to_sql(name='post', con=conPost, flavor='sqlite', if_exists='append')
The error returned is:
IncompleteRead: IncompleteRead(8437886 bytes read)
Full error traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Program Files\Anaconda3\lib\http\client.py in _get_chunk_left(self)
540 try:
--> 541 chunk_left = self._read_next_chunk_size()
542 except ValueError:
C:\Program Files\Anaconda3\lib\http\client.py in _read_next_chunk_size(self)
507 try:
--> 508 return int(line, 16)
509 except ValueError:
ValueError: invalid literal for int() with base 16: b'00004000\r00:00,REGULAR,0000262144,0000327687 \n'
During handling of the above exception, another exception occurred:
IncompleteRead Traceback (most recent call last)
C:\Program Files\Anaconda3\lib\http\client.py in _readall_chunked(self)
557 while True:
--> 558 chunk_left = self._get_chunk_left()
559 if chunk_left is None:
C:\Program Files\Anaconda3\lib\http\client.py in _get_chunk_left(self)
542 except ValueError:
--> 543 raise IncompleteRead(b'')
544 if chunk_left == 0:
IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
IncompleteRead Traceback (most recent call last)
<ipython-input-13-e9dcb24183ff> in <module>()
1 record_cnt = 0
2 for link in data_list_post:
----> 3 data = pd.read_table(link, sep=',')
4 print('%s:%s rows %s columns' % (link[-10:-4],data.shape[0], data.shape[1])) #printing out values makes me feel safe....
5 record_cnt += data.shape[0]
C:\Program Files\Anaconda3\lib\site-packages\pandas\io\parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
560 skip_blank_lines=skip_blank_lines)
561
--> 562 return _read(filepath_or_buffer, kwds)
563
564 parser_f.__name__ = name
C:\Program Files\Anaconda3\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
299 filepath_or_buffer, _, compression = get_filepath_or_buffer(
300 filepath_or_buffer, encoding,
--> 301 compression=kwds.get('compression', None))
302 kwds['compression'] = (inferred_compression if compression == 'infer'
303 else compression)
C:\Program Files\Anaconda3\lib\site-packages\pandas\io\common.py in get_filepath_or_buffer(filepath_or_buffer, encoding, compression)
315 # cat on the compression to the tuple returned by the function
316 to_return = (list(maybe_read_encoded_stream(req, encoding,
--> 317 compression)) +
318 [compression])
319 return tuple(to_return)
C:\Program Files\Anaconda3\lib\site-packages\pandas\io\common.py in maybe_read_encoded_stream(reader, encoding, compression)
235 reader = BytesIO(reader.read())
236 else:
--> 237 reader = StringIO(reader.read().decode(encoding, errors))
238 else:
239 if compression == 'gzip':
C:\Program Files\Anaconda3\lib\http\client.py in read(self, amt)
453
454 if self.chunked:
--> 455 return self._readall_chunked()
456
457 if self.length is None:
C:\Program Files\Anaconda3\lib\http\client.py in _readall_chunked(self)
563 return b''.join(value)
564 except IncompleteRead:
--> 565 raise IncompleteRead(b''.join(value))
566
567 def _readinto_chunked(self, b):
IncompleteRead: IncompleteRead(8437886 bytes read)
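The inner `ValueError` in the traceback shows that a data row (`b'00004000\r00:00,REGULAR,...'`) ended up where `http.client` expected a hex chunk-size header, i.e. the server's chunked transfer encoding is malformed, which is why pandas' own HTTP reader gives up mid-download. One workaround is to download the bytes yourself, tolerate the `IncompleteRead` (its `partial` attribute holds everything received before the failure), and hand pandas an in-memory buffer instead of the URL. The following is a minimal sketch under that assumption; `fetch_text` and `load_csv_text` are hypothetical helper names, and it assumes the links point at plain CSV:

```python
import io
import sqlite3
import urllib.request
from http.client import IncompleteRead

import pandas as pd


def fetch_text(url, retries=3):
    """Download url, tolerating the server's broken chunked encoding.

    IncompleteRead carries the bytes received before the failure in
    e.partial, so once the retries run out we fall back to the largest
    partial payload rather than losing the whole file.
    """
    best_partial = b''
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read().decode('utf-8')
        except IncompleteRead as e:
            if len(e.partial) > len(best_partial):
                best_partial = e.partial
    if best_partial:
        return best_partial.decode('utf-8', errors='replace')
    raise RuntimeError('could not download %s' % url)


def load_csv_text(text, con, table='post'):
    """Parse CSV text in memory and append it to a sqlite table."""
    df = pd.read_csv(io.StringIO(text))
    df.to_sql(table, con, if_exists='append', index=False)
    return df.shape[0]
```

The original loop would then become `record_cnt += load_csv_text(fetch_text(link), conPost)`. Note that falling back to a partial payload can silently truncate the last file, so comparing `record_cnt` against an expected row count is worthwhile.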
Does it happen every time? This is my first time doing something like this, so I don't know if you have any suggestions. — I mean, if you run it many times, does it always fail? — It has happened three times. — Can you provide the full error traceback?