Python: keep the original values in a DataFrame
I have a CSV file in the following format:
A, -0.1234540756893158
B, 0.123450496711731
C, 0.12345994493484497
D, -0.12345484461784363
E, 12344656.0
F, -1234648.0
G, 12342316.0
H, 12552.37109375
I, 16247.228515625
J, -12.123796875
K, 1081104201
L, 123
I read it with:
df = pd.read_csv('/output.csv', header=None, names=['c1','c2'])
Then I select the rows at the indices I am interested in and save them to a CSV:
my_list = [0,1,2,3,4,5,6,7,8,9,10,11]
df[df.index.isin(my_list)].to_csv(thefile2, sep=',', header=None, index = False)
But when I check the contents of thefile2, I get output like this:
A,-0.123454075689
B,0.123450496712
C,0.123459944935
D,-0.123454844618
E,12344656.0
F,-1234648.0
G,12342316.0
H,12552.3710938
I,16247.2285156
J,-12.123797
K,1081104201.0
L,123.0
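As you can see, the values for A, B, C, D, H, I and J are rounded, and K and L gain a trailing .0 in the output file. My question is: how can I keep the original values in the second column?
Use the parameter dtype=str, which forces all values to be read as strings.
Sample: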
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is gone from current pandas releases
temp=u"""A,-0.1234540756893158
B,0.123450496711731
C,0.12345994493484497
D,-0.12345484461784363
E,12344656.0
F,-1234648.0
G,12342316.0
H,12552.37109375
I,16247.228515625
J,-12.123796875
K,1081104201
L,123"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), header=None, names=['c1','c2'], dtype=str)
print (df)
    c1                    c2
0    A   -0.1234540756893158
1    B     0.123450496711731
2    C   0.12345994493484497
3    D  -0.12345484461784363
4    E            12344656.0
5    F            -1234648.0
6    G            12342316.0
7    H        12552.37109375
8    I       16247.228515625
9    J         -12.123796875
10   K            1081104201
11   L                   123
print (type(df.loc[0, 'c2']))
<class 'str'>
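To tie this back to the question, here is a minimal sketch of the full round trip: read with dtype=str, select rows by index, and write them out again. It uses a shortened copy of the sample data and an in-memory buffer in place of thefile2; the names raw, out and written are illustrative and not from the original post.

```python
import io
import pandas as pd

# Shortened copy of the sample data from the question.
raw = """A,-0.1234540756893158
H,12552.37109375
J,-12.123796875
K,1081104201
L,123
"""

# dtype=str keeps every field as the exact text found in the file.
df = pd.read_csv(io.StringIO(raw), header=None, names=['c1', 'c2'], dtype=str)

my_list = [0, 1, 2, 3, 4]  # row positions to keep, as in the question
out = io.StringIO()        # in-memory stand-in for thefile2
df[df.index.isin(my_list)].to_csv(out, sep=',', header=False, index=False)

# Compare line by line so the platform's line ending does not matter.
written = out.getvalue().splitlines()
print(written == raw.splitlines())
```

Because the second column never becomes a float, nothing is rounded and no trailing .0 is added on the way out. The cost is that c2 holds strings, so it has to be converted before any arithmetic.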
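Comments:
Is it possible to read c2 as a string? pd.read_csv('/output.csv', header=None, names=['c1','c2'], dtype=str)
Thanks, that's the answer! If you want the bounty, maybe you can write it up in the answer section :D
pandas infers the dtype as float because you have fractions; this is expected behavior. Note that keeping only the text representation may be useful for writing the file back out, but you will need to set the dtype every time you process this CSV. So what is the real problem here, given that float is the appropriate dtype?
The question author and other visitors to this page might find it useful to know that the values in the second column can be read as floats to their full precision with the following statement: df = pd.read_csv(StringIO(temp), header=None, names=['c1','c2'], dtype={'c1': str, 'c2': np.float64})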
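The two points raised in the comments can be checked with a small sketch. The first read shows the inferred dtype when no dtype is given; the second reads c2 as float64. The float_precision='round_trip' argument is an addition that does not appear in the thread: it is a read_csv option that asks the C parser for its round-trip converter. The variable names here are illustrative.

```python
import io
import pandas as pd

# Shortened copy of the sample data from the question.
raw = """A,-0.1234540756893158
E,12344656.0
K,1081104201
L,123
"""

# No dtype given: c2 mixes fractions and integers, so pandas infers float64.
# That is why K and L come back out with a trailing .0.
inferred = pd.read_csv(io.StringIO(raw), header=None, names=['c1', 'c2'])
print(inferred['c2'].dtype)

# Explicit float64 for c2, parsed with the round-trip converter (an assumption
# added here, not part of the original comment).
precise = pd.read_csv(io.StringIO(raw), header=None, names=['c1', 'c2'],
                      dtype={'c1': str, 'c2': 'float64'},
                      float_precision='round_trip')

# Reference values parsed by Python's own float() from the same text.
expected = [float(line.split(',')[1]) for line in raw.splitlines()]
print(precise['c2'].tolist() == expected)
```

Reading as float keeps the numeric value but not the original spelling: an integer such as 123 is still stored as 123.0, so dtype=str remains the option to use when the file has to be reproduced character for character.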