Get specific rows in Python
I have two CSV files. The first one is:
"CONS_NO","DATA_DATE","KWH_READING","KWH_READING1","KWH"
"1652714033","2015/1/12","4747.3800","4736.8000","10.5800"
"3332440062","2015/1/12","408.6800","407.8200","0.8600"
"7804314033","2015/1/12","1794.3500","1792.5000","1.8500"
"0114314033","2015/1/12","3525.2000","3519.4400","5.7600"
"1742440062","2015/1/12","3097.1900","3091.4100","5.7800"
"8230100023","2015/1/12","1035.0500","1026.8400","8.2100"
It has about six million rows in total.
The other one is:
6360609057
8771218657
1338004100
2500009393
9184968250
9710581700
8833903141
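It has about ten thousand rows in total.
The second CSV file contains only CONS_NO values. I want to find the rows in the first CSV file whose CONS_NO appears in the second CSV file, and delete all the other rows from the first CSV file, using Python.

You can use the merge method in pandas to merge the two DataFrames.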
I changed your sample data to the following, so that some rows actually match.
test1.csv is:
"CONS_NO","DATA_DATE","KWH_READING","KWH_READING1","KWH"
"1652714033","2015/1/12","4747.3800","4736.8000","10.5800"
"3332440062","2015/1/12","408.6800","407.8200","0.8600"
"7804314033","2015/1/12","1794.3500","1792.5000","1.8500"
"8833903141","2015/1/12","3525.2000","3519.4400","5.7600"
"1742440062","2015/1/12","3097.1900","3091.4100","5.7800"
"8833903141","2015/1/12","1035.0500","1026.8400","8.2100"
test2.csv is:
6360609057
8771218657
1338004100
2500009393
9184968250
9710581700
8833903141
Now you can merge them with the following code:
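import pandas as pd

# test1.csv has a header row; test2.csv has none, so name its single column
df1 = pd.read_csv('test1.csv')
df2 = pd.read_csv('test2.csv', names=['CONS_NO'])
# Inner join (the default): keeps only the rows of df1 whose CONS_NO is in df2
result = pd.merge(df1, df2, on='CONS_NO')
print(result)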
It gives the following output:
      CONS_NO  DATA_DATE  KWH_READING  KWH_READING1   KWH
0  8833903141  2015/1/12      3525.20       3519.44  5.76
1  8833903141  2015/1/12      1035.05       1026.84  8.21
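
Since the real file has about six million rows and the goal is to drop the non-matching rows, here is a rough sketch of an alternative that filters with isin instead of merge and streams the big file in chunks. The function name filter_csv_by_ids, the output file name, and the chunk size are my own assumptions, not something from the question. Reading CONS_NO as a string is also an assumption on my part: it keeps values such as "0114314033" from losing their leading zero, which would happen if the column were parsed as an integer. Unlike merge, isin does not repeat a row when the same number is listed twice in the second file.

```python
import os
import tempfile

import pandas as pd


def filter_csv_by_ids(big_csv, ids_csv, out_csv, chunksize=100_000):
    """Write to out_csv only the rows of big_csv whose CONS_NO is listed in ids_csv.

    Hypothetical helper for illustration; the chunk size is an arbitrary default.
    """
    # Read the IDs as strings so a value like "0114314033" keeps its leading zero.
    ids = pd.read_csv(ids_csv, names=["CONS_NO"], dtype=str)
    wanted = set(ids["CONS_NO"].str.strip())

    # Stream the large file in chunks rather than loading every row at once.
    reader = pd.read_csv(big_csv, dtype={"CONS_NO": str}, chunksize=chunksize)
    for i, chunk in enumerate(reader):
        kept = chunk[chunk["CONS_NO"].isin(wanted)]
        # The first chunk creates the file and writes the header; later chunks append.
        kept.to_csv(out_csv, mode="w" if i == 0 else "a", header=(i == 0), index=False)


# Small demonstration with the sample data from this answer.
TEST1 = '''"CONS_NO","DATA_DATE","KWH_READING","KWH_READING1","KWH"
"1652714033","2015/1/12","4747.3800","4736.8000","10.5800"
"3332440062","2015/1/12","408.6800","407.8200","0.8600"
"7804314033","2015/1/12","1794.3500","1792.5000","1.8500"
"8833903141","2015/1/12","3525.2000","3519.4400","5.7600"
"1742440062","2015/1/12","3097.1900","3091.4100","5.7800"
"8833903141","2015/1/12","1035.0500","1026.8400","8.2100"
'''
TEST2 = "6360609057\n8771218657\n1338004100\n2500009393\n9184968250\n9710581700\n8833903141\n"

with tempfile.TemporaryDirectory() as tmp:
    big_path = os.path.join(tmp, "test1.csv")
    ids_path = os.path.join(tmp, "test2.csv")
    out_path = os.path.join(tmp, "filtered.csv")
    with open(big_path, "w") as f:
        f.write(TEST1)
    with open(ids_path, "w") as f:
        f.write(TEST2)

    # A tiny chunk size here only to exercise the chunked path on six rows.
    filter_csv_by_ids(big_path, ids_path, out_path, chunksize=2)
    result = pd.read_csv(out_path, dtype={"CONS_NO": str})

print(result)
```

Writing to a separate output file rather than overwriting the first CSV in place is a deliberate choice in this sketch: if something goes wrong halfway through, the original data is still intact.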
What have you tried so far? pandas supports this. Try to solve it yourself, and if you get stuck, edit the question with some code.
Thanks, let me try.