Python: concatenating one column from multiple files in a single line
Tags: python, windows, pandas

I open all the CSV files in a folder whose names end with a number, then take the "IMO" column from each selected file and concatenate them into the DataFrame df:
import pandas as pd
df = pd.concat([pd.read_csv(path + '/' + f) for f in all_names if f.split('_')[3][:-4].isdigit()]['IMO'])
But I want to do it in one line (purely as a challenge, nothing else).
So far it returns an error:
IndexError: list index out of range
Here is the result of print(all_names):
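Use pandas to filter out the unwanted file names, and the usecols parameter to read only the IMO column.
In pandas, str[3] does not fail: if the 4th element of the split list does not exist, it returns NaN instead of raising an error.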
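The difference between the two behaviours can be seen on a few names taken from the list in the question. This is only an illustrative sketch: the variable names (names, raised, parts, mask, selected) and the extra .astype(bool) are assumptions added here to keep the mask strictly boolean, not part of the original answer.

```python
import pandas as pd

names = ['AIS_SIGHTINGS_Q1_2009.csv',
         'AIS_SIGHTINGS_Q1_2009_corrected.csv',
         'unique_ports.csv']

# Plain Python: 'unique_ports.csv' splits into only two pieces,
# so indexing element 3 raises IndexError.
try:
    [n.split('_')[3] for n in names]
    raised = False
except IndexError:
    raised = True

# pandas: .str[3] gives NaN where the split list is too short.
parts = pd.Series(names).str.split('_').str[3]

# NaN propagates through .str[:-4] and .str.isdigit(); fillna(False) drops those rows.
mask = parts.str[:-4].str.isdigit().fillna(False).astype(bool)
selected = pd.Series(names)[mask].tolist()

print(raised)
print(selected)
```

The "_corrected" file is also rejected, because its 4th piece is '2009' and removing the last four characters leaves an empty string, for which isdigit() is False.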
#one line solution
df = pd.concat([pd.read_csv(path + '/' + f, usecols=['IMO']) for f in pd.Series(all_names)[pd.Series(all_names).str.split('_').str[3].str[:-4].str.isdigit().fillna(False)]])
This is the same as:
s = pd.Series(all_names)
v = s[s.str.split('_').str[3].str[:-4].str.isdigit().fillna(False)]
df = pd.concat([pd.read_csv(path + '/' + f, usecols=['IMO']) for f in v])
Verification:
all_names = ['AIS_SIGHTINGS_Q1_2009.csv', 'AIS_SIGHTINGS_Q1_2009_corrected.csv', 'AIS_SIGHTINGS_Q1_2009_corrected_short.csv', 'AIS_SIGHTINGS_Q1_2010.csv', 'AIS_SIGHTINGS_Q1_2011.csv', 'AIS_SIGHTINGS_Q1_2012.csv', 'AIS_SIGHTINGS_Q1_2013.csv', 'AIS_SIGHTINGS_Q1_2014.csv', 'AIS_SIGHTINGS_Q2_2009.csv', 'AIS_SIGHTINGS_Q2_2010.csv', 'AIS_SIGHTINGS_Q2_2011.csv', 'AIS_SIGHTINGS_Q2_2012.csv', 'AIS_SIGHTINGS_Q2_2013.csv', 'AIS_SIGHTINGS_Q2_2014.csv', 'AIS_SIGHTINGS_Q3_2009.csv', 'AIS_SIGHTINGS_Q3_2010.csv', 'AIS_SIGHTINGS_Q3_2011.csv', 'AIS_SIGHTINGS_Q3_2012.csv', 'AIS_SIGHTINGS_Q3_2013.csv', 'AIS_SIGHTINGS_Q3_2014.csv', 'AIS_SIGHTINGS_Q4_2009.csv', 'AIS_SIGHTINGS_Q4_2010.csv', 'AIS_SIGHTINGS_Q4_2011.csv', 'AIS_SIGHTINGS_Q4_2012.csv', 'AIS_SIGHTINGS_Q4_2013.csv', 'AIS_SIGHTINGS_Q4_2014.csv', 'a_few_boats_AIS.csv', 'unique_boat_names.csv', 'unique_ports.csv', 'unique_vessel.csv']
s = pd.Series(all_names)
v = s[s.str.split('_').str[3].str[:-4].str.isdigit().fillna(False)]
print (v)
0 AIS_SIGHTINGS_Q1_2009.csv
3 AIS_SIGHTINGS_Q1_2010.csv
4 AIS_SIGHTINGS_Q1_2011.csv
5 AIS_SIGHTINGS_Q1_2012.csv
6 AIS_SIGHTINGS_Q1_2013.csv
7 AIS_SIGHTINGS_Q1_2014.csv
8 AIS_SIGHTINGS_Q2_2009.csv
9 AIS_SIGHTINGS_Q2_2010.csv
10 AIS_SIGHTINGS_Q2_2011.csv
11 AIS_SIGHTINGS_Q2_2012.csv
12 AIS_SIGHTINGS_Q2_2013.csv
13 AIS_SIGHTINGS_Q2_2014.csv
14 AIS_SIGHTINGS_Q3_2009.csv
15 AIS_SIGHTINGS_Q3_2010.csv
16 AIS_SIGHTINGS_Q3_2011.csv
17 AIS_SIGHTINGS_Q3_2012.csv
18 AIS_SIGHTINGS_Q3_2013.csv
19 AIS_SIGHTINGS_Q3_2014.csv
20 AIS_SIGHTINGS_Q4_2009.csv
21 AIS_SIGHTINGS_Q4_2010.csv
22 AIS_SIGHTINGS_Q4_2011.csv
23 AIS_SIGHTINGS_Q4_2012.csv
24 AIS_SIGHTINGS_Q4_2013.csv
25 AIS_SIGHTINGS_Q4_2014.csv
dtype: object
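For an end-to-end check that does not depend on the real data, the same filter can be run against a temporary folder. This is a rough sketch under stated assumptions: the sample files, their IMO values and the OTHER column are made up, and os.path.join, .astype(bool) and ignore_index=True are small additions that the answer above does not use.

```python
import os
import tempfile
import pandas as pd

with tempfile.TemporaryDirectory() as path:
    # Hypothetical sample data: two files that should match, one that should not.
    samples = {
        'AIS_SIGHTINGS_Q1_2009.csv': [1, 2],
        'AIS_SIGHTINGS_Q2_2009.csv': [3],
        'unique_ports.csv': [99],
    }
    for name, imos in samples.items():
        pd.DataFrame({'IMO': imos, 'OTHER': 0}).to_csv(
            os.path.join(path, name), index=False)

    all_names = sorted(os.listdir(path))
    s = pd.Series(all_names)
    # Same filter as above: keep names whose 4th '_' piece is digits plus '.csv'.
    v = s[s.str.split('_').str[3].str[:-4].str.isdigit().fillna(False).astype(bool)]
    # usecols=['IMO'] reads only that column from each selected file.
    df = pd.concat(
        [pd.read_csv(os.path.join(path, f), usecols=['IMO']) for f in v],
        ignore_index=True)

print(df)
```

With ignore_index=True the result gets a fresh 0..n-1 index; without it, each file's own row numbers are kept.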
The final (working) version of this code is:
df = pd.concat([pd.read_csv(path + '/' + f, usecols=['IMO']) for f in all_names if f.split('.')[0][-1].isdigit()])
I haven't tried your version yet, but it looks like it should work fine (once you fix the bracket problem ;)).
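The condition in that final version can be checked on its own, without reading any file. The sketch below uses a few names from the list; the helper ends_with_digit is hypothetical and only shows one possible way to make the check tolerate an empty name. It uses os.path.splitext, so it can behave differently from split('.')[0] for names that contain more than one dot.

```python
import os

all_names = ['AIS_SIGHTINGS_Q1_2009.csv',
             'AIS_SIGHTINGS_Q1_2009_corrected.csv',
             'a_few_boats_AIS.csv',
             'unique_ports.csv']

# Condition from the final version: last character before the first dot is a digit.
kept = [f for f in all_names if f.split('.')[0][-1].isdigit()]

def ends_with_digit(filename):
    """Hypothetical helper: True if the name without its extension ends in a digit."""
    stem = os.path.splitext(filename)[0]
    # Slicing with [-1:] returns '' for an empty stem instead of raising IndexError.
    return stem[-1:].isdigit()

kept_safe = [f for f in all_names if ends_with_digit(f)]

print(kept)
print(kept_safe)
```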
Thanks for your answer. What is all_names, and what condition are you checking?