Python pandas groupby apply: strange behavior with Series

Tags: python, pandas

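Can someone explain why groupby().apply() produces different results on two nearly identical DataFrames?

In pred2, the 'p1' column is converted to float, and relevant information is lost: '23' becomes 23.0 and '36R' becomes NaN.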

import pandas as pd

def predictions(tool):
    out = pd.Series(index=['p1', 'p2', 'useTime'], dtype=object)
    if 'step1' in list(tool.State):
        out['p1'] = str(tool[tool.State == 'step1'].Machine.values[0])
    if 'step2' in list(tool.State):
        out['p2'] = str(tool[tool.State == 'step2'].Machine.values[0])
        out['useTime'] = str(tool[tool.State == 'step2'].oTime.values[0])
    return out


df1 = pd.DataFrame({'Key': ['B', 'B', 'A', 'A'],
                   'State': ['step1', 'step2', 'step1', 'step2'],
                   'oTime': ['', '2016-09-19 05:24:33', '', '2016-09-19 23:59:04'],
                   'Machine': ['23', '36L', '36R', '36R']})

df2 = df1.copy()
df2.oTime = pd.to_datetime(df2.oTime)


pred1 = df1.groupby('Key').apply(predictions)
pred2 = df2.groupby('Key').apply(predictions)

print(pred1)
print(pred2)

The output (pred1 first, then pred2) is:

      p1   p2              useTime
Key                               
A    36R  36R  2016-09-19 23:59:04
B     23  36L  2016-09-19 05:24:33
       p1   p2                        useTime
Key                                          
A     NaN  36R  2016-09-19T23:59:04.000000000
B    23.0  36L  2016-09-19T05:24:33.000000000

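Note the difference in the p1 column. df1 and df2 are nearly identical; the only change is that the oTime column in df2 was converted to timestamps.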
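
A possible workaround, offered as a sketch rather than an explanation of the root cause: the conversion appears to happen when apply() stitches the per-group Series back together and re-infers column dtypes, so one option is to skip that step and build the result frame by hand with dtype=object. The helper name predictions_dict is hypothetical (it is not from the question), and it reads values with .iloc[0] instead of .values[0], so useTime is formatted as a pandas Timestamp string.

```python
import pandas as pd


def predictions_dict(tool):
    # Hypothetical variant of predictions(): returns a plain dict of
    # strings (or None) so no dtype inference happens per group.
    out = {'p1': None, 'p2': None, 'useTime': None}
    step1 = tool[tool.State == 'step1']
    step2 = tool[tool.State == 'step2']
    if not step1.empty:
        out['p1'] = str(step1.Machine.iloc[0])
    if not step2.empty:
        out['p2'] = str(step2.Machine.iloc[0])
        out['useTime'] = str(step2.oTime.iloc[0])
    return out


df2 = pd.DataFrame({'Key': ['B', 'B', 'A', 'A'],
                    'State': ['step1', 'step2', 'step1', 'step2'],
                    'oTime': ['', '2016-09-19 05:24:33', '', '2016-09-19 23:59:04'],
                    'Machine': ['23', '36L', '36R', '36R']})
# errors='coerce' turns anything unparseable into NaT.
df2['oTime'] = pd.to_datetime(df2['oTime'], errors='coerce')

# Iterate over the groups explicitly instead of calling apply().
keys, records = [], []
for key, group in df2.groupby('Key'):
    keys.append(key)
    records.append(predictions_dict(group))

# dtype=object keeps every cell exactly as the helper returned it.
pred = pd.DataFrame(records, index=pd.Index(keys, name='Key'), dtype=object)
print(pred)
```

With this approach the p1 values should stay as the strings '36R' and '23' regardless of the dtype of oTime, at the cost of doing the grouping loop manually.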