
Python: How do I convert a CSV string to a list in pandas?

I am working with a CSV file in the following format:

"Id","Sequence"
3,"1,3,13,87,1053,28576,2141733,508147108,402135275365,1073376057490373,9700385489355970183,298434346895322960005291,31479360095907908092817694945,11474377948948020660089085281068730"
7,"1,2,1,5,5,1,11,16,7,1,23,44,30,9,1,47,112,104,48,11,1,95,272,320,200,70,13,1,191,640,912,720,340,96,15,1,383,1472,2464,2352,1400,532,126,17,1,767,3328,6400,7168,5152,2464,784,160,19,1,1535,7424"
8,"1,2,4,5,8,10,16,20,32,40,64,80,128,160,256,320,512,640,1024,1280,2048,2560,4096,5120,8192,10240,16384,20480,32768,40960,65536,81920,131072,163840,262144,327680,524288,655360,1048576,1310720,2097152"
11,"1,8,25,83,274,2275,132224,1060067,3312425,10997342,36304451,301432950,17519415551,140456757358,438889687625,1457125820233,4810267148324,39939263006825,2321287521544174,18610239435360217"
I want to read it into a DataFrame in which df['Id'] is integer-like and df['Sequence'] is list-like.

I currently have the following clumsy code:

import pandas as pd

def clean(seq_string):
    # Split the comma-separated string and convert each piece to int
    return list(map(int, seq_string.split(',')))

# Read data
training_data_file = "data/train.csv"
train = pd.read_csv(training_data_file)
train['Sequence'] = list(map(clean, train['Sequence'].values))
This seems to work, but I feel the same result could be achieved with pandas and numpy alone.

Does anyone have a suggestion?

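You can specify converters for the Sequence column when calling read_csv:

converters : dict, default None

Dict of functions for converting values in certain columns. Keys can either be integers or column labels.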
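As a minimal sketch of the converters approach, the example below parses an in-memory copy of two rows from the question (with the sequences shortened) instead of reading data/train.csv. The helper name to_int_list and the variable csv_text are illustrative assumptions, not part of the original answer.

```python
import io

import pandas as pd

# Two rows taken from the question's sample, with the sequences shortened.
csv_text = '"Id","Sequence"\n3,"1,3,13,87,1053"\n7,"1,2,1,5,5"\n'


def to_int_list(seq_string):
    """Hypothetical helper: split on commas and convert each piece to int."""
    return [int(piece) for piece in seq_string.split(',')]


# converters maps a column label to a function that receives each raw cell string.
train = pd.read_csv(io.StringIO(csv_text), converters={'Sequence': to_int_list})
```

Here each Sequence cell is converted while the file is parsed, so no second pass over the column is needed, and Id is still inferred as an integer column.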


The following also works, except that Sequence ends up as a list of strings rather than a list of ints:

df = pd.read_csv(training_data_file)
df['Sequence'] = df['Sequence'].str.split(',')
To convert each element to an int:

df = pd.read_csv(training_data_file)
df['Sequence'] = df['Sequence'].str.split(',').apply(lambda s: list(map(int, s)))
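One detail that matters for this particular data: several values in the sample exceed the 64-bit integer range, so the lists have to hold plain Python ints inside an object column. The sketch below, which assumes a shortened in-memory row from the question rather than the real file, illustrates that the split-and-convert approach keeps such values exact.

```python
import io

import pandas as pd

# One row from the question's sample, shortened but keeping two very large values.
csv_text = '"Id","Sequence"\n3,"1,3,13,9700385489355970183,298434346895322960005291"\n'

df = pd.read_csv(io.StringIO(csv_text))
# Split each string on commas, then convert every piece to a Python int.
df['Sequence'] = df['Sequence'].str.split(',').apply(lambda s: [int(v) for v in s])

# Python ints have arbitrary precision, so nothing is truncated.
largest = max(df['Sequence'][0])
```

Because the column stores Python lists, its dtype is object. Packing these values into a fixed-width numpy integer array would likely not work for the largest entries.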

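Another solution is to use literal_eval from the ast module. literal_eval evaluates the string as a Python literal, so a comma-separated string of integers comes back as a sequence of ints (strictly a tuple, which can be wrapped in list() if a list is required).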

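from ast import literal_eval

import pandas as pd

def clean(x):
    # literal_eval parses "1,3,13" as a Python literal (a tuple of ints)
    return literal_eval(x)

train = pd.read_csv(training_data_file, converters={'Sequence': clean})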
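To make the behaviour of literal_eval on this kind of input concrete, here is a small sketch. The wrapper clean_to_list is a hypothetical variant, not the answer's code; it assumes you want a list in every case, including a sequence that contains a single number.

```python
from ast import literal_eval

# A comma-separated string of ints is parsed as a tuple literal.
raw = literal_eval("1,3,13,87")


def clean_to_list(x):
    """Hypothetical variant of clean() that always returns a list of ints."""
    value = literal_eval(x)
    # A lone number such as "5" evaluates to an int, not a tuple.
    return list(value) if isinstance(value, tuple) else [value]


parsed = clean_to_list("1,3,13,87")
single = clean_to_list("5")
```

The wrapper could be passed to read_csv through converters in the same way as clean.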

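Beautiful. I thought it would be something simple like this. :) Cheers. If I wanted to convert it to a list of ints, I could add .convert_objects(convert_numeric=True), right?

That command appears to be deprecated, so you would need to loop over the list and convert the values manually. But somehow that brings us back to the original solution.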