Python: DataFrame columns dropped when inserting data
I am trying to create a DataFrame with 3 columns, but for some reason only one column is added:
# Create a new DataFrame from our transformed data
stock_incident_df = pd.DataFrame(stock_incident_data, columns=['date', 'number_of_incidents', 'stock_price_close'])
print(stock_incident_df.describe())
number_of_incidents
count 1551.000000
mean 154.629916
std 25.782985
min 77.000000
25% 137.000000
50% 154.000000
75% 171.000000
max 342.000000
The problem also occurs if I separate the constructor call from adding the data and append the data afterwards:
stock_incident_df = pd.DataFrame(columns=['date', 'number_of_incidents', 'stock_price_close'])
print(stock_incident_df.describe())
stock_incident_df = stock_incident_df.append(stock_incident_data)
print(stock_incident_df.describe())
date number_of_incidents stock_price_close
count 0 0 0
unique 0 0 0
top NaN NaN NaN
freq NaN NaN NaN
1
count 1551.000000
mean 154.629916
std 25.782985
min 77.000000
25% 137.000000
50% 154.000000
75% 171.000000
max 342.000000
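A side note on the column labelled 1 in the second summary: a bare list of lists carries no column names, so when it is wrapped in a DataFrame its columns get the integer labels 0, 1, 2 and do not line up with the named columns. Also, DataFrame.append was removed in pandas 2.0, so the sketch below uses pd.concat instead. The two sample rows and the variable names are illustrative assumptions, not the original data.

```python
import pandas as pd

columns = ["date", "number_of_incidents", "stock_price_close"]

# Illustrative rows in the same shape as the input data (assumed values).
rows = [
    [pd.Timestamp("2014-01-02"), 119, 16441.35],
    [pd.Timestamp("2014-01-03"), 124, 16469.99],
]

empty = pd.DataFrame(columns=columns)

# Rows wrapped without column names get the integer labels 0, 1, 2,
# so they land in new columns instead of the named ones.
unlabeled = pd.DataFrame(rows)
mismatched = pd.concat([empty, unlabeled], ignore_index=True)
print(list(mismatched.columns))

# Naming the columns first makes the rows line up with the target frame.
labeled = pd.DataFrame(rows, columns=columns)
combined = pd.concat([empty, labeled], ignore_index=True)
print(list(combined.columns))
```

Recent pandas versions may emit a FutureWarning when an empty frame is passed to pd.concat; building the frame directly with pd.DataFrame(rows, columns=columns) avoids the empty frame altogether.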
My input data is a list of lists in the following format:
[
[Timestamp('2014-01-02 00:00:00'), 119, 16441.35],
[Timestamp('2014-01-03 00:00:00'), 124, 16469.99],
[Timestamp('2014-01-06 00:00:00'), 100, 16425.11],
[Timestamp('2014-01-07 00:00:00'), 115, 16530.94]
]
Was Timestamp imported correctly? It seems to work if you use pd.Timestamp:
import pandas as pd
stock_incident_data=[
[pd.Timestamp('2014-01-02 00:00:00'), 119, 16441.35],
[pd.Timestamp('2014-01-03 00:00:00'), 124, 16469.99],
[pd.Timestamp('2014-01-06 00:00:00'), 100, 16425.11],
[pd.Timestamp('2014-01-07 00:00:00'), 115, 16530.94]
]
stock_incident_df = pd.DataFrame(stock_incident_data, columns=['date', 'number_of_incidents', 'stock_price_close'])
stock_incident_df
Out[17]:
date number_of_incidents stock_price_close
0 2014-01-02 119 16441.35
1 2014-01-03 124 16469.99
2 2014-01-06 100 16425.11
3 2014-01-07 115 16530.94
My mistake: the output of the describe() method does not include the date column. Simply printing the DataFrame shows the data:
stock_incident_df = pd.DataFrame(stock_incident_data, columns=['date', 'number_of_incidents', 'stock_price_close'])
print(stock_incident_df)
date ... stock_price_close
0 2014-01-02 ... 16441.3
1 2014-01-03 ... 16470
2 2014-01-06 ... 16425.1
3 2014-01-07 ... 16530.9
4 2014-01-08 ... 16462.7
... ... ... ...
1546 2018-03-18 ... 31585 24946.51
Name: Close, dtype: float64
1547 2018-03-24 ... 31590 23533.2
Name: Close, dtype: float64
1548 2018-03-25 ... 31590 23533.2
Name: Close, dtype: float64
1549 2018-03-30 ... 31594 24103.11
Name: Close, dtype: float64
1550 2018-03-31 ... 31594 24103.11
Name: Close, dtype: float64
[1551 rows x 3 columns]
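To illustrate why describe() can hide columns: on a DataFrame with mixed dtypes, describe() by default summarizes only the numeric columns (whether datetime columns are also included depends on the pandas version). The sketch below uses made-up sample rows and deliberately stores stock_price_close as an object column. That is an assumption, not something confirmed in this thread: the "Name: Close, dtype: float64" fragments in the printout above suggest each cell may have held a Series rather than a plain float.

```python
import pandas as pd

# Hypothetical sample data; the price column is stored with dtype=object
# to mimic a column that pandas does not treat as numeric.
df = pd.DataFrame({
    "date": pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06", "2014-01-07"]),
    "number_of_incidents": [119, 124, 100, 115],
    "stock_price_close": pd.Series([16441.35, 16469.99, 16425.11, 16530.94], dtype=object),
})

# Checking dtypes first shows which columns describe() will treat as numeric.
print(df.dtypes)

# Default summary: the object column is left out.
default_summary = df.describe()
print(default_summary)

# include="all" summarizes every column regardless of dtype.
full_summary = df.describe(include="all")
print(full_summary)

# Converting the object column to a numeric dtype brings it into the default summary.
df["stock_price_close"] = pd.to_numeric(df["stock_price_close"])
numeric_summary = df.describe()
print(numeric_summary)
```

If the cells really do contain Series objects, pd.to_numeric will not be enough, and the scalar value would have to be extracted from each cell before the DataFrame is built.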
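Works perfectly for me.
Was Timestamp imported correctly?
If I copy and paste the code, no error is raised and I get this output: number_of_incidents stock_price_close
I'm running Google Colab, if that makes any difference. I see the problem: I was printing .describe(), which omits the date column.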