Strings in a DataFrame read from a csv file keep their quotes in Python
I have a data frame, shown below, which is read from a csv file:
COMPOUND CELL_LINE AUC
0 'ADAM17' 'A549' 97.228927
1 'ADAM17' 'BT-20' 75.409415
2 'ADAM17' 'BT-549' 66.641992
3 'ADAM17' 'CAL-120' 82.707886
4 'ADAM17' 'CAL-148' 59.822385
5 'ADAM17' 'CAL-51' 79.014796
6 'ADAM17' 'CAMA-1' 66.700791
7 'ADAM17' 'Calu-3' 302.225056
8 'ADAM17' 'Calu-6' 99.496544
When I index AUC.iloc[0,0], it gives me 'ADAM17'.
I tried testing
AUC.iloc[0,0] == 'ADAM17'
and the result is False.
What is going on?
My second question is how to read the csv file so that the strings come in directly without the single quotes, like this:
COMPOUND CELL_LINE AUC
0 ADAM17 A549 97.228927
1 ADAM17 BT-20 75.409415
2 ADAM17 BT-549 66.641992
3 ADAM17 CAL-120 82.707886
4 ADAM17 CAL-148 59.822385
5 ADAM17 CAL-51 79.014796
6 ADAM17 CAMA-1 66.700791
7 ADAM17 Calu-3 302.225056
8 ADAM17 Calu-6 99.496544
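I think you need to compare against the string with both sets of quotes, "'ADAM17'". The default quote character of read_csv is the double quote ", so the single quotes in the file are not stripped and are read as part of the string: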
import pandas as pd
import io
temp=u"""COMPOUND,CELL_LINE,AUC
'ADAM17','A549',97.228927
'ADAM17','BT-20',75.409415
'ADAM17','BT-549',66.641992
'ADAM17','CAL-120',82.707886
'ADAM17','CAL-148',59.822385
'ADAM17','CAL-51',79.014796
'ADAM17','CAMA-1',66.700791
'ADAM17','Calu-3',302.225056
'ADAM17','Calu-6',99.496544"""
# after testing, replace io.StringIO(temp) with the filename
AUC = pd.read_csv(io.StringIO(temp))
print(AUC)
COMPOUND CELL_LINE AUC
0 'ADAM17' 'A549' 97.228927
1 'ADAM17' 'BT-20' 75.409415
2 'ADAM17' 'BT-549' 66.641992
3 'ADAM17' 'CAL-120' 82.707886
4 'ADAM17' 'CAL-148' 59.822385
5 'ADAM17' 'CAL-51' 79.014796
6 'ADAM17' 'CAMA-1' 66.700791
7 'ADAM17' 'Calu-3' 302.225056
8 'ADAM17' 'Calu-6' 99.496544
print(AUC.iloc[0,0] == 'ADAM17')
False
print(AUC.iloc[0,0] == "ADAM17")
False
print(AUC.iloc[0,0] == "'ADAM17'")
True
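To see why the first two comparisons fail, it can help to look at the stored value itself rather than its printed form. The following is a minimal sketch, assuming a shortened two-row copy of the sample data; the names csv_text and value are only illustrative.

```python
import io

import pandas as pd

# Shortened, illustrative copy of the sample data (assumption: comma-separated).
csv_text = (
    "COMPOUND,CELL_LINE,AUC\n"
    "'ADAM17','A549',97.228927\n"
    "'ADAM17','BT-20',75.409415\n"
)

# With the default quotechar ('"'), single quotes are ordinary characters.
AUC = pd.read_csv(io.StringIO(csv_text))
value = AUC.iloc[0, 0]

# repr() makes the embedded quotes visible; len() counts them as characters.
print(repr(value))
print(len(value))

# The stored string only equals a literal that also contains the quotes.
print(value == "ADAM17")
print(value == "'ADAM17'")
```

The length is 8 rather than 6, because the two single quotes are part of the string.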
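Comment: By the way, in my original csv file the strings already contain single quotes, and I want to remove them in the DataFrame. Original file: COMPOUND CELL_LINE AUC 'ADAM17' 'BT-20' 84.86402756 'ADAM17' 'BT-549' 69.95587388 'ADAM17' 'CAL-120' 70.06211297
You can add the parameter quotechar="'" to read_csv to remove the ' characters: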
import pandas as pd
import io
temp=u"""COMPOUND,CELL_LINE,AUC
'ADAM17','A549',97.228927
'ADAM17','BT-20',75.409415
'ADAM17','BT-549',66.641992
'ADAM17','CAL-120',82.707886
'ADAM17','CAL-148',59.822385
'ADAM17','CAL-51',79.014796
'ADAM17','CAMA-1',66.700791
'ADAM17','Calu-3',302.225056
'ADAM17','Calu-6',99.496544"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), quotechar="'")
print(df)
COMPOUND CELL_LINE AUC
0 ADAM17 A549 97.228927
1 ADAM17 BT-20 75.409415
2 ADAM17 BT-549 66.641992
3 ADAM17 CAL-120 82.707886
4 ADAM17 CAL-148 59.822385
5 ADAM17 CAL-51 79.014796
6 ADAM17 CAMA-1 66.700791
7 ADAM17 Calu-3 302.225056
8 ADAM17 Calu-6 99.496544
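If the DataFrame has already been loaded with the quotes in place, one alternative is to strip them afterwards with the pandas string accessor. This is only a sketch under the assumption that the quotes appear solely at the start and end of each value; csv_text and the column list are illustrative, not taken from the original file.

```python
import io

import pandas as pd

# Shortened, illustrative copy of the sample data (assumption: comma-separated).
csv_text = (
    "COMPOUND,CELL_LINE,AUC\n"
    "'ADAM17','A549',97.228927\n"
    "'ADAM17','BT-20',75.409415\n"
)

# Read with the defaults, so the single quotes stay inside the strings.
df = pd.read_csv(io.StringIO(csv_text))

# Strip leading and trailing single quotes from the text columns only.
for col in ["COMPOUND", "CELL_LINE"]:
    df[col] = df[col].str.strip("'")

print(df)
print(df.iloc[0, 0] == "ADAM17")
```

Passing quotechar="'" at read time is usually simpler; stripping afterwards is mainly useful when the file cannot be re-read.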