DataFrame replace not working with encoding='ISO-8859-1', Python 3.6
I have imported some CSV files into a DataFrame:
Data = pd.read_csv(filePath, encoding = 'ISO-8859-1', dtype=object)
Then I replace some values in the "Indicator" column:
DataT['Indicator'] = DataT['Indicator'].str.replace('export(us$ mil)', 'exports (in us$ mil)')
DataT['Indicator'] = DataT['Indicator'].str.replace('import(us$ mil)', 'imports (in us$ mil)')
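But the replacement does not work, apparently because of an encoding issue.
Please suggest how to resolve this.
File download location:
Code used to import all the CSV files: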
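import os
import pandas as pd

DataAll = pd.DataFrame()  # accumulator; sourcePath is assumed to be defined earlier
for file in os.listdir(sourcePath):  # listdir yields names only, so no (i, file) unpacking
    if file.upper().endswith('.CSV'):
        filePath = os.path.join(sourcePath, file)
        Data = pd.read_csv(filePath, encoding = 'ISO-8859-1', dtype=object)
        Data['FileName'] = file
        DataAll = pd.concat([DataAll, Data], sort=False)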
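When loading a sample of your data, I noticed that the values in the "Indicator" column are not all lowercase, i.e. 'Export(US$ Mil)' rather than 'export(us$ mil)'. You need to either use the correct values, or: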
DataT['Indicator'] = DataT['Indicator'].str.lower().replace('export(us$ mil)',
'exports (in us$ mil)')
You can always check a column's unique values with df[col].unique().
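The advice above can be sketched as follows. This is only an illustration under assumptions: the `sample` frame and its three labels are made up to resemble the Indicator column, and the loop over a `replacements` dict is one possible way to apply both substitutions without lowercasing the whole column.

```python
import re

import pandas as pd

# Hypothetical sample mimicking the mixed-case labels in the Indicator column.
sample = pd.DataFrame(
    {"Indicator": ["Export(US$ Mil)", "Import(US$ Mil)", "Country Growth (%)"]}
)

# Inspect what is really stored before writing the search string.
print(sample["Indicator"].unique())

# case=False ignores letter case; re.escape keeps "(", ")" and "$" literal.
replacements = {
    "export(us$ mil)": "exports (in us$ mil)",
    "import(us$ mil)": "imports (in us$ mil)",
}
cleaned = sample["Indicator"]
for old, new in replacements.items():
    cleaned = cleaned.str.replace(re.escape(old), new, case=False, regex=True)

print(cleaned.tolist())
```

Unlike calling `.str.lower()` first, this leaves labels that are not being replaced (here "Country Growth (%)") in their original case.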
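After a lot of attempts, I arrived at the solution below, which only requires importing the re module.
You can also simplify your loading code to: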
import pandas as pd
from glob import glob  # import the function; the glob module itself is not callable
import re

for f in glob('/your_Dir_path/somefiles*.csv'):
    Data = pd.read_csv(f, encoding = 'ISO-8859-1', dtype=object)
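Note that the loop above rebinds Data on every iteration, so only the last file remains afterwards. If all files are needed in one frame, as in the question, a sketch along these lines might help. The helper name `load_csvs` is hypothetical, and the FileName column simply mirrors the question's code.

```python
from glob import glob
import os

import pandas as pd


def load_csvs(pattern):
    """Read every CSV matching `pattern` into one DataFrame (hypothetical helper)."""
    frames = []
    for path in sorted(glob(pattern)):
        frame = pd.read_csv(path, encoding="ISO-8859-1", dtype=object)
        frame["FileName"] = os.path.basename(path)  # remember the source file
        frames.append(frame)
    # Concatenate once at the end instead of growing a DataFrame in the loop.
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True, sort=False)
```

It could be called as `DataAll = load_csvs('/your_Dir_path/somefiles*.csv')`. Collecting the frames in a list and concatenating once avoids copying the accumulated data on every iteration.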
Dataset:
>>> Data['Indicator'].head()
0 GDP (current US$ Mil)
1 No. Of Export partners
2 No. Of Export products
3 No. Of Import partners
4 No. Of Import products
Name: Indicator, dtype: object
>>> Data['Indicator'].head(100)
0 GDP (current US$ Mil)
1 No. Of Export partners
2 No. Of Export products
3 No. Of Import partners
4 No. Of Import products
5 No. Of Tariff Agreement
6 Trade Balance (current US$ Mil)
7 Trade (US$ Mil)-Top 5 Export Partner
8 Trade (US$ Mil)-Top 5 Export Partner
9 Trade (US$ Mil)-Top 5 Export Partner
10 Trade (US$ Mil)-Top 5 Export Partner
11 Trade (US$ Mil)-Top 5 Import Partner
12 Trade (US$ Mil)-Top 5 Export Partner
13 Trade (US$ Mil)-Top 5 Import Partner
14 Trade (US$ Mil)-Top 5 Export Partner
15 Trade (US$ Mil)-Top 5 Import Partner
16 Trade (US$ Mil)-Top 5 Export Partner
17 Trade (US$ Mil)-Top 5 Export Partner
18 Trade (US$ Mil)-Top 5 Import Partner
Result:
>>> Data['Indicator'].str.replace(re.escape("Trade (US$ Mil)"), "IN Trade (US$ Mil)").head(100)
0 GDP (current US$ Mil)
1 No. Of Export partners
2 No. Of Export products
3 No. Of Import partners
4 No. Of Import products
5 No. Of Tariff Agreement
6 Trade Balance (current US$ Mil)
7 IN Trade (US$ Mil)-Top 5 Export Partner
8 IN Trade (US$ Mil)-Top 5 Export Partner
9 IN Trade (US$ Mil)-Top 5 Export Partner
10 IN Trade (US$ Mil)-Top 5 Export Partner
11 IN Trade (US$ Mil)-Top 5 Import Partner
12 IN Trade (US$ Mil)-Top 5 Export Partner
13 IN Trade (US$ Mil)-Top 5 Import Partner
14 IN Trade (US$ Mil)-Top 5 Export Partner
15 IN Trade (US$ Mil)-Top 5 Import Partner
16 IN Trade (US$ Mil)-Top 5 Export Partner
17 IN Trade (US$ Mil)-Top 5 Export Partner
18 IN Trade (US$ Mil)-Top 5 Import Partner
19 IN Trade (US$ Mil)-Top 5 Import Partner
20 IN Trade (US$ Mil)-Top 5 Import Partner
21 IN Trade (US$ Mil)-Top 5 Export Partner
22 IN Trade (US$ Mil)-Top 5 Export Partner
23 IN Trade (US$ Mil)-Top 5 Export Partner
24 IN Trade (US$ Mil)-Top 5 Export Partner
25 IN Trade (US$ Mil)-Top 5 Export Partner
26 IN Trade (US$ Mil)-Top 5 Export Partner
27 IN Trade (US$ Mil)-Top 5 Export Partner
28 IN Trade (US$ Mil)-Top 5 Import Partner
29 IN Trade (US$ Mil)-Top 5 Export Partner
...
70 Partner share(%)-Top 5 Export Partner
71 Partner share(%)-Top 5 Import Partner
72 Partner share(%)-Top 5 Export Partner
73 Partner share(%)-Top 5 Import Partner
74 Partner share(%)-Top 5 Export Partner
75 Partner share(%)-Top 5 Export Partner
76 Partner share(%)-Top 5 Import Partner
77 Partner share(%)-Top 5 Import Partner
78 Partner share(%)-Top 5 Import Partner
79 Partner share(%)-Top 5 Export Partner
80 Partner share(%)-Top 5 Export Partner
81 Partner share(%)-Top 5 Export Partner
82 Partner share(%)-Top 5 Export Partner
83 Partner share(%)-Top 5 Export Partner
84 Partner share(%)-Top 5 Export Partner
85 Partner share(%)-Top 5 Export Partner
86 Partner share(%)-Top 5 Import Partner
87 Partner share(%)-Top 5 Export Partner
88 Partner share(%)-Top 5 Import Partner
89 Partner share(%)-Top 5 Export Partner
90 Country Growth (%)
91 Duty Free Tariff Lines Share (%)
92 Export Product share(%)
93 Export Product share(%)
94 Export Product share(%)
95 Export Product share(%)
96 Export Product share(%)
97 Export Product share(%)
98 Export Product share(%)
99 Export Product share(%)
Name: Indicator, Length: 100, dtype: object
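For your example, you should try the following:
import re
# regex=True is explicit because pandas 2.0 changed the default to regex=False
DataT['Indicator'] = DataT['Indicator'].str.replace(re.escape('export(us$ mil)'), 'exports (in us$ mil)', regex=True)
DataT['Indicator'] = DataT['Indicator'].str.replace(re.escape('import(us$ mil)'), 'imports (in us$ mil)', regex=True)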
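To see why escaping matters, here is a small sketch on made-up values (the `labels` Series is an assumption, not taken from the real files). It compares the unescaped pattern, the escaped pattern, and plain substring replacement with `regex=False`, which is an alternative to `re.escape` rather than what the answer above uses.

```python
import re

import pandas as pd

labels = pd.Series(["export(us$ mil)", "import(us$ mil)"])  # hypothetical values

# As a regex, "(...)" is a group and "$" an end-of-string anchor, so the
# pattern never matches the literal text and nothing is replaced.
as_regex = labels.str.replace("export(us$ mil)", "exports (in us$ mil)", regex=True)

# Escaping the pattern makes every character literal.
escaped = labels.str.replace(
    re.escape("export(us$ mil)"), "exports (in us$ mil)", regex=True
)

# regex=False asks for plain substring replacement, so no escaping is needed.
literal = labels.str.replace("export(us$ mil)", "exports (in us$ mil)", regex=False)

print(as_regex.tolist())
print(escaped.tolist())
print(literal.tolist())
```

This suggests the failure in the question comes from regex metacharacters in the search string (together with the letter-case mismatch noted earlier), not from the ISO-8859-1 encoding.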
Can you provide sample data and the error you are getting? Also, what happens if you try DataT['Indicator'] = DataT['Indicator'].astype(str).str.replace('export(us$mil)', 'exports(us$mil)', regex=True)?
@pygo, thanks, it does not work. I do not get any error. I have attached the data and a screenshot of the source link.
What if you just remove regex=True and try again?
@pygo, I am using all the files. You can try "en_AGO_AllYears_WITS_Trade_Summary.CSV".
Thanks for the reply. I converted to lowercase after importing the 200+ CSV files.
Great, this is working. Thanks a lot for your effort. It helped me a lot.
@SPy, great, glad it helped.