Python 3.x: 'utf-8' codec can't decode byte and IndexError: list index out of range

python-3.x

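When I open the file with the code below, I get the following error:

import sys
dataset = open('file-00.csv','r')
dataset_l = dataset.readlines()

**UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 156: invalid start byte**

So I changed the code to the version below. I also tried errors='ignore'. With either option the initial error disappears, but later in my code I get another error (the traceback follows the modified code):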

import sys
dataset = open('file-00.csv','r', errors='replace')
dataset_l = dataset.readlines()

File "Label_Classify_Dataset.py", line 56, in <module>
    dataset_w_label += dataset_l[it].strip() + ',' + find_class_1(l) + ',' + find_class_2(l) + '\n'
File "Label_Classify_Dataset.py", line 40, in find_class_1
    if line[0] == row[2] and line[1] == row[4] and line[2] == row[5]:
IndexError: list index out of range

The function that raises it:

def find_class_1(row):
    global file_l_sp
    for line in file_l_sp:
        if line[0] == row[2] and line[1] == row[4] and line[2] == row[5]:
            return line[3].strip()
    return 'other'

How can I fix either the first or the second error?

Update

I used readline to enumerate and print each line, and managed to work out which line is causing the error. It is indeed some random character that tshark must have substituted. Deleting it removes the error, but obviously I would rather skip such lines than delete them.

I'm sure there is a better way to enumerate lol

Try the approach below:

with open('file.csv') as f:
    for i, line in enumerate(f):
        print('{} = {}'.format(i+1, line.strip()))
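
Building on the enumerate idea, here is a minimal sketch for locating the offending lines without printing the whole file. It assumes the file is read in binary mode so that nothing is decoded implicitly; the helper name find_undecodable_lines is hypothetical and not part of the original code.

```python
def find_undecodable_lines(raw_lines, encoding="utf-8"):
    """Return (line_number, byte_offset) for each line that fails to decode.

    raw_lines is any iterable of bytes objects, for example a file
    opened with mode 'rb'. Line numbers start at 1; byte_offset is the
    position of the first invalid byte within that line.
    """
    bad = []
    for lineno, raw in enumerate(raw_lines, start=1):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as exc:
            bad.append((lineno, exc.start))
    return bad


# Possible usage with the file from the question:
# with open('file-00.csv', 'rb') as f:
#     print(find_undecodable_lines(f))
```

The returned line numbers can then be used to skip those lines while processing, rather than deleting them from the file.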

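The b in the mode specifier of open() means the file is treated as binary, so the contents stay as bytes. No decoding is performed in this mode.

Try opening the file with 'rb', like:

dataset = open('file-00.csv','rb')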
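
One consequence of 'rb' is that every line is a bytes object, so it cannot be joined to a str until it has been decoded. The sketch below illustrates this with made-up sample bytes held in an io.BytesIO, which stands in for the real file. Decoding with errors='replace' is only one possible choice and turns each invalid byte into U+FFFD.

```python
import io

# Stand-in for open('file-00.csv', 'rb'); the sample bytes are invented.
raw_file = io.BytesIO(b"1,2,ok\n3,4,\xfe\n")

decoded_lines = []
for raw in raw_file:
    # raw is bytes here; raw + ',label' would raise TypeError,
    # so decode to str before appending the new column.
    text = raw.decode("utf-8", errors="replace")
    decoded_lines.append(text.strip() + ",label")
```

After the explicit decode, the rest of the script can keep working with ordinary strings.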

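Without seeing the data it is hard to guess. What encoding does it really have? Don't ignore encoding errors. Open the file with the correct encoding. Evidently, utf8 is not the correct encoding. Also, don't use .readlines() and .split() on CSV files; use the csv module. Third, avoid global variables. You don't need them for what you are doing here.

@RaunaqJain Thanks, I tried it; see the comment below for the new error lol

@ThomasWeller The data should be utf8, since it is a pcap file that was converted to csv with LibreOffice, specifying utf8.

Thanks. I tried 'rb', but that again causes an error later on, because a new column (a string) is added after classification, so I get the error: TypeError: can't concat str to bytes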
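
The suggestions in the first comment (use the csv module, avoid globals) can be sketched roughly as follows. This is not the asker's actual script: find_class, rules and the sample rows are hypothetical, and the length check is an assumed way to skip short rows instead of hitting the IndexError.

```python
import csv
import io


def find_class(row, rules, default="other"):
    """Return the label of the first rule matching columns 2, 4 and 5 of row."""
    if len(row) < 6:
        # Row is too short to have row[5]; fall back instead of raising IndexError.
        return default
    for rule in rules:
        if len(rule) >= 4 and rule[:3] == [row[2], row[4], row[5]]:
            return rule[3].strip()
    return default


# Invented sample data; with a real file one would use something like
# open('file-00.csv', newline='', encoding=...) with the file's true encoding.
sample = io.StringIO("a,b,tcp,c,80,443\nbroken\n")
rules = [["tcp", "80", "443", "https "]]

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
for row in csv.reader(sample):
    # Append the label as a new column; rules is passed in, not global.
    writer.writerow(row + [find_class(row, rules)])

result = out.getvalue()
```

Passing rules as an argument replaces the global file_l_sp, and csv.reader handles the field splitting that .split() was doing by hand.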
