python dataframe: naming columns is generating new columns


I have CSV data in a txt file, like:

20050601,      25.22,      25.31,      24.71,      24.71,   27385
20050602,      24.68,      25.71,      24.68,      25.45,   16919
20050603,      25.07,      25.40,      24.72,      24.82,   12632
I want to put this data into a pandas DataFrame whose column names are date, close, high, low, open and volume.

When I use this code:

df = pd.read_table(File,header=None,names=['date', 'close', 'high', 'low', 'open', 'volume'])
The output is:

                                             date  close  high  low  \
0     20050601,      25.22,      25.31,      24.71, ...    NaN   NaN  NaN   
1     20050602,      24.68,      25.71,      24.68, ...    NaN   NaN  NaN   
2     20050603,      25.07,      25.40,      24.72, ...    NaN   NaN  NaN   
  open  volume  
0      NaN     NaN  
1      NaN     NaN  
2      NaN     NaN  
When I use:

df = pd.read_table(File,header=None)
The output is:

                                                      0
0     20050601,      25.22,      25.31,      24.71, ...
1     20050602,      24.68,      25.71,      24.68, ...
2     20050603,      25.07,      25.40,      24.72, ...
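I think that when header is set to None, the 0 in the header sits over the rightmost column and pushes the new names to its right, which creates the new columns. I'm not sure, though.

Thanks to anyone who can help.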

I solved the problem:

df = pd.read_table(File,names=['date','close','high','low','open','volume'],sep=',' )
Does anyone know why sep=',' makes it take twice as long?

You can use read_csv with the separator ,\s+ which matches a comma followed by one or more whitespace characters:

import pandas as pd
import io

temp=u"""20050601,      25.22,      25.31,      24.71,      24.71,   27385
20050602,      24.68,      25.71,      24.68,      25.45,   16919
20050603,      25.07,      25.40,      24.72,      24.82,   12632"""


# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 sep=r",\s+",   # raw string: comma followed by whitespace
                 header=None,
                 names=['date','close','high','low','open','volume'],
                 engine='python')   # regex separators need the python engine

print(df)

       date  close   high    low   open  volume
0  20050601  25.22  25.31  24.71  24.71   27385
1  20050602  24.68  25.71  24.68  25.45   16919
2  20050603  25.07  25.40  24.72  24.82   12632
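
If parsing speed matters, a possible alternative is sketched below. It is not part of the answer above. A regex separator forces the slower python engine, so this version keeps a plain comma separator and lets pandas drop the spaces after each comma with skipinitialspace=True, which works with the default C engine. The name df_fast is only illustrative, and any speed gain on the real file is an assumption that would need timing.

```python
import io
import pandas as pd

# Same sample rows as in the question
temp = """20050601,      25.22,      25.31,      24.71,      24.71,   27385
20050602,      24.68,      25.71,      24.68,      25.45,   16919
20050603,      25.07,      25.40,      24.72,      24.82,   12632"""

# skipinitialspace=True skips the spaces that follow each comma,
# so a plain "," separator is enough and the default C engine is used
df_fast = pd.read_csv(io.StringIO(temp),
                      sep=",",
                      header=None,
                      names=['date', 'close', 'high', 'low', 'open', 'volume'],
                      skipinitialspace=True)

print(df_fast)
```

As with the code above, replace io.StringIO(temp) with the filename once it works on the sample.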

Because it works properly now, whereas before it didn't?

If your separator is a comma, why not use pd.read_csv?
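
To illustrate the point of that comment, here is a small sketch of the difference in defaults: read_table splits on tabs unless told otherwise, while read_csv splits on commas. The variable names one_col and six_cols are made up for this example.

```python
import io
import pandas as pd

# Same sample rows as in the question
temp = """20050601,      25.22,      25.31,      24.71,      24.71,   27385
20050602,      24.68,      25.71,      24.68,      25.45,   16919
20050603,      25.07,      25.40,      24.72,      24.82,   12632"""

# read_table defaults to a tab separator; the data has no tabs,
# so every line ends up in a single column
one_col = pd.read_table(io.StringIO(temp), header=None)

# read_csv defaults to a comma separator, so the six fields are split
six_cols = pd.read_csv(io.StringIO(temp), header=None)

print(one_col.shape)   # (3, 1)
print(six_cols.shape)  # (3, 6)
```

This is why passing six names to read_table without sep filled only the first column and left the other five as NaN.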