How to add a row made up of pd.Timestamp and float values to a DataFrame in Python (Python, pandas, DataFrame, Series)


python pandas dataframe


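I am trying to append a row containing some timestamps and some float values to a DataFrame using the following code:

import pandas as pd

pair_columns = ['T1 Time', 'T1 Active', 'T1 Reactive', 'T2 Time', 'T2 Active', 'T2 Reactive']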

# an empty dataframe
matched_pairs = pd.DataFrame(columns=pair_columns)


# A list with some Timestamp values
value_with_timestamp = [pd.Timestamp('2011-10-21 20:08:42+0000', tz='UTC'), 21.847724815467735, -78.998453511820344, pd.Timestamp('2011-10-21 20:08:54+0000', tz='UTC'), -74.608437575303114, 48.537725275212779]
ser_timestamp = pd.Series(value_with_timestamp)


# This runs, but the DataFrame gets a row containing only NaN
matched_pairs.loc[len(matched_pairs)] = ser_timestamp
print("Dataframe with series containing timestamp")
print(matched_pairs.head())

# Raises TypeError: data type not understood
matched_pairs.loc[len(matched_pairs)] = value_with_timestamp
print(matched_pairs.head())

# Raises TypeError: data type not understood
matched_pairs = matched_pairs.append(ser_timestamp, ignore_index=True)
print(matched_pairs.head())

This code does not work, but if I use strings instead of timestamps, everything works fine:

import pandas as pd

matched_pairs_string = pd.DataFrame(columns=pair_columns)

# The same list, but with strings instead of timestamps
value_string = ['2011-10-21 20:08:42+0000', 21.847724815467735, -78.998453511820344, '2011-10-21 20:08:54+0000', -74.608437575303114, 48.537725275212779]

# Add the list with the strings to the DataFrame; this works like a charm
matched_pairs_string.loc[len(matched_pairs_string)] = value_string
print("Dataframe with string instead of timestamp")
print(matched_pairs_string.head())

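What am I doing wrong? Is there a way to achieve what I want? I just want to add this data as a row as is, without converting the timestamps to another type.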

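Technically, the problem is not the timestamps but the type of object you assign to the row: a Series (attempted in the first code block) versus a list (attempted in the second code block).

Each column of a pandas DataFrame is itself a pandas Series, and a Series assigned to a row is aligned by label: the index of ser_timestamp is the default 0 to 5, none of which matches a column name, so the row ends up holding only NaN. Consider converting the Series to a list for the row assignment, using Series.tolist():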

matched_pairs.loc[len(matched_pairs)] = ser_timestamp.tolist()
#               T1 Time  T1 Active  T1 Reactive             T2 Time  T2 Active  T2 Reactive
# 0 2011-10-21 20:08:42  21.847725   -78.998454 2011-10-21 20:08:54 -74.608438     48.53772

matched_pairs.loc[len(matched_pairs)] = value_with_timestamp
#               T1 Time  T1 Active  T1 Reactive             T2 Time  T2 Active  T2 Reactive
# 0 2011-10-21 20:08:42  21.847725   -78.998454 2011-10-21 20:08:54 -74.608438     48.53772
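To make the Series-versus-list difference concrete, here is a minimal sketch of the label alignment described above. It assumes a reasonably recent pandas release (the behaviour on 0.19 may differ, as discussed further down), and the shortened float values and the names `misaligned` and `aligned` are illustrative only, not from the original post.

```python
import pandas as pd

pair_columns = ['T1 Time', 'T1 Active', 'T1 Reactive',
                'T2 Time', 'T2 Active', 'T2 Reactive']
values = [pd.Timestamp('2011-10-21 20:08:42', tz='UTC'), 21.85, -79.0,
          pd.Timestamp('2011-10-21 20:08:54', tz='UTC'), -74.61, 48.54]

# Series with the default integer index 0..5: no label matches a column
# name, so alignment leaves every cell of the new row as NaN.
misaligned = pd.DataFrame(columns=pair_columns)
misaligned.loc[len(misaligned)] = pd.Series(values)

# Series indexed by the column names: the labels line up and the values
# are kept, so converting to a list is not the only option.
aligned = pd.DataFrame(columns=pair_columns)
aligned.loc[len(aligned)] = pd.Series(values, index=pair_columns)

print(misaligned.isna().values.all())
print(aligned.loc[0, 'T1 Active'])
```

If the row already lives in a Series, giving it the column names as its index is an alternative to calling tolist().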

Doing so also gives the columns the appropriate dtypes:

print(matched_pairs.dtypes)

# T1 Time        datetime64[ns]
# T1 Active             float64
# T1 Reactive           float64
# T2 Time        datetime64[ns]
# T2 Active             float64
# T2 Reactive           float64
# dtype: object
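A note for readers on current pandas: the DataFrame.append call from the question was removed in pandas 2.0, so that line fails there for an unrelated reason. A rough equivalent, sketched with pd.concat, is shown here; the shortened values and the names `first_row`, `second_row`, and `new_row` are illustrative assumptions, not part of the original post.

```python
import pandas as pd

pair_columns = ['T1 Time', 'T1 Active', 'T1 Reactive',
                'T2 Time', 'T2 Active', 'T2 Reactive']
first_row = [pd.Timestamp('2011-10-21 20:08:42', tz='UTC'), 21.85, -79.0,
             pd.Timestamp('2011-10-21 20:08:54', tz='UTC'), -74.61, 48.54]
second_row = [pd.Timestamp('2011-10-21 20:09:10', tz='UTC'), 10.0, -5.0,
              pd.Timestamp('2011-10-21 20:09:22', tz='UTC'), -3.0, 7.5]

# Build the frame from the first row instead of starting from an empty one.
matched_pairs = pd.DataFrame([first_row], columns=pair_columns)

# Wrap the new row in a one-row DataFrame and concatenate it;
# ignore_index=True renumbers the rows 0, 1, ...
new_row = pd.DataFrame([second_row], columns=pair_columns)
matched_pairs = pd.concat([matched_pairs, new_row], ignore_index=True)

print(matched_pairs)
```

When many rows arrive one at a time, collecting them in a plain Python list and building the DataFrame once at the end is usually cheaper than concatenating repeatedly.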

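As the OP pointed out, there may be a version issue: on pandas 0.19 the code above raised an exception:

TypeError: data type not understood

One possible solution is to explicitly define the dtypes (timestamp and float) on the empty DataFrame before the row assignment. Since there is no single dtype() call for this, a loop converts each column: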

pair_columns = ['T1 Time', 'T1 Active', 'T1 Reactive', 'T2 Time', 'T2 Active', 'T2 Reactive']
pair_dtypes = ['M8[ms]', 'float', 'float', 'M8[ms]', 'float', 'float']

# an empty dataframe
matched_pairs = pd.DataFrame(columns=pair_columns)
datatypes = {k:v for k,v in zip(pair_columns, pair_dtypes)}

for k,v in datatypes.items():
    matched_pairs[k] = matched_pairs[k].astype(v)

...
matched_pairs.loc[len(matched_pairs)] = ser_timestamp.tolist()
# matched_pairs.loc[len(matched_pairs)] = value_with_timestamp
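The per-column loop above can also be written as one DataFrame.astype call with a column-to-dtype mapping. This is only a sketch: it assumes a pandas version in which astype accepts a dict (to my knowledge 0.19 and later), and it uses 'datetime64[ns]' rather than 'M8[ms]' because older releases store datetimes at nanosecond resolution anyway. The names `pair_dtypes` (as a dict) and `typed` are illustrative.

```python
import pandas as pd

pair_columns = ['T1 Time', 'T1 Active', 'T1 Reactive',
                'T2 Time', 'T2 Active', 'T2 Reactive']

# Column name -> dtype, applied to the empty frame in a single call.
pair_dtypes = {
    'T1 Time': 'datetime64[ns]', 'T1 Active': 'float64', 'T1 Reactive': 'float64',
    'T2 Time': 'datetime64[ns]', 'T2 Active': 'float64', 'T2 Reactive': 'float64',
}

typed = pd.DataFrame(columns=pair_columns).astype(pair_dtypes)

print(typed.dtypes)
```

Whether these predefined dtypes survive the first row assignment depends on the pandas version, so it is worth printing matched_pairs.dtypes again after adding a row.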

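Thanks for the reply, but I can't see the difference between your code and mine. If the point is to assign a plain list to the DataFrame row, I had already tried that. On my system your solution does not work (either by converting the Series or by using a plain list); it raises TypeError: data type not understood. My pandas version is 0.19.1 on Ubuntu 16.04.

Hmmm, very interesting. Indeed, I do see the next line in your first code block that uses the list, but with the exact posted data I cannot reproduce the error. I'm using Python 3.4 and pandas 0.18 on Windows 10, and no error occurs. Can you try from another machine or environment? Are you sure you are running exactly this posted example? Try a clean new session with just this code (not your actual project).

Yes, you're right, it is an environment issue. Interestingly, of the three environments I tested (two with Ubuntu 16.04, Python 2.7.12, and pandas 0.19.1, and one with Windows 10, Python 2.7.12, and pandas 0.19.1), only one works... Thank you!

Curious, does this happen with pandas 0.19? With Python 2.7? I added an update that explicitly defines dtypes on the empty DataFrame before the row assignment. Possibly pd.Timestamp() is being coerced to object (the default dtype).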