python dataframe: naming columns is generating new columns
I have CSV data in a txt file, like:
20050601, 25.22, 25.31, 24.71, 24.71, 27385
20050602, 24.68, 25.71, 24.68, 25.45, 16919
20050603, 25.07, 25.40, 24.72, 24.82, 12632
I want to put this data into a pandas DataFrame with the column names `date`, `close`, `high`, `low`, `open`, and `volume`.
When I use this code:
df = pd.read_table(File, header=None, names=['date', 'close', 'high', 'low', 'open', 'volume'])
the output is:
date close high low \
0 20050601, 25.22, 25.31, 24.71, ... NaN NaN NaN
1 20050602, 24.68, 25.71, 24.68, ... NaN NaN NaN
2 20050603, 25.07, 25.40, 24.72, ... NaN NaN NaN
open volume
0 NaN NaN
1 NaN NaN
2 NaN NaN
When I use:
df = pd.read_table(File, header=None)
the output is:
0
0 20050601, 25.22, 25.31, 24.71, ...
1 20050602, 24.68, 25.71, 24.68, ...
2 20050603, 25.07, 25.40, 24.72, ...
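I think that when `header` is set to `None`, the zero in the header sits on the rightmost column and causes the new names to be shifted to its right, creating new columns. I'm not sure, though.
Thanks to anyone who can help.
I solved the problem: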
df = pd.read_table(File,names=['date','close','high','low','open','volume'],sep=',' )
Does anyone know why `sep=','` would make it take twice as long?
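For context, here is a small sketch of why the original call probably produced the NaN columns. `read_table` uses a tab as its default separator, and the sample data contains no tabs, so each line is read as a single field. The names `raw` and `names` are just for illustration, and the in-memory string stands in for the real file.

```python
import io
import pandas as pd

# Sample data from the question, held in memory instead of a file.
raw = """20050601, 25.22, 25.31, 24.71, 24.71, 27385
20050602, 24.68, 25.71, 24.68, 25.45, 16919
20050603, 25.07, 25.40, 24.72, 24.82, 12632"""

names = ['date', 'close', 'high', 'low', 'open', 'volume']

# Default separator is a tab, so every line becomes one string field.
one_field = pd.read_table(io.StringIO(raw), header=None)
print(one_field.shape)

# Six names but one parsed field: the whole line lands in 'date'
# and the remaining named columns are filled with NaN.
padded = pd.read_table(io.StringIO(raw), header=None, names=names)
print(padded.isna().sum())

# Telling the parser to split on commas yields six real columns.
fixed = pd.read_table(io.StringIO(raw), header=None, names=names, sep=',')
print(fixed.shape)
```

So the extra columns seem to come from the separator rather than from the `0` header label.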
You can use the separator `,\s+` to match a `,` followed by one or more whitespace characters:
import io
import pandas as pd

temp = u"""20050601, 25.22, 25.31, 24.71, 24.71, 27385
20050602, 24.68, 25.71, 24.68, 25.45, 16919
20050603, 25.07, 25.40, 24.72, 24.82, 12632"""

# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 sep=r",\s+",   # regex: a comma followed by whitespace
                 header=None,
                 names=['date', 'close', 'high', 'low', 'open', 'volume'],
                 engine='python')  # regex separators need the python engine
print(df)
date close high low open volume
0 20050601 25.22 25.31 24.71 24.71 27385
1 20050602 24.68 25.71 24.68 25.45 16919
2 20050603 25.07 25.40 24.72 24.82 12632
Because it works now, whereas it did not before?
If your separator is `,`, why not use `pd.read_csv`?
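Following up on the `pd.read_csv` suggestion, here is a minimal sketch that assumes every field is separated by a comma plus optional spaces. `skipinitialspace=True` is an existing `read_csv` parameter that skips spaces after the delimiter, so neither a regex separator nor `engine='python'` is needed. The names `raw` and `names` are illustrative only.

```python
import io
import pandas as pd

# Sample data from the question, held in memory instead of a file.
raw = """20050601, 25.22, 25.31, 24.71, 24.71, 27385
20050602, 24.68, 25.71, 24.68, 25.45, 16919
20050603, 25.07, 25.40, 24.72, 24.82, 12632"""

names = ['date', 'close', 'high', 'low', 'open', 'volume']

# read_csv splits on ',' by default; skipinitialspace drops the
# space that follows each comma before the value is parsed.
df = pd.read_csv(io.StringIO(raw),
                 header=None,
                 names=names,
                 skipinitialspace=True)
print(df)
print(df.dtypes)
```

Because this keeps the default parser, it may also be quicker than the regex version on large files, though that would need measuring on the real data.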