Python: getting a specific value in a DataFrame


python python-3.x pandas


I have a df whose values are dictionaries:

df:                               
                                      A
    2017-05-31    {'price': '7.25', 'weight': 0.0, 'time': '4.05am'}
    2017-06-01    {'price': '7.22', 'weight': 0.0, 'time': '4.08am'}
    2017-06-02    {'price': '7.24', 'weight': 0.0, 'time': '5.08am'}
    2017-06-05    {'price': '7.25', 'weight': 0.0, 'time': '6.07am'}
    2017-06-06    {'price': '7.19', 'weight': 0.0, 'time': '3.33am'}
    2017-06-07    {'weight': 0.0, 'price': 7.12, 'time': '1.09am'}
    2017-06-09    {'weight': 0.0, 'price': 7.46, 'time': '2.08am'}

I want to get the value of the key 'price' in each row. The desired output is:

df:                               
                                  A
2017-05-31                       7.25
2017-06-01                       7.22
2017-06-02                       7.24
2017-06-05                       7.25
2017-06-06                       7.19
2017-06-07                       7.12
2017-06-09                       7.46

If every dictionary followed the same price, weight, time key order, I could simply apply code like this:

format = lambda x: list(x.values())[0]
print(df.applymap(format))

Unfortunately, that is not the case: the key order differs between rows.
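To illustrate why the positional approach breaks, here is a minimal sketch using two rows modelled on the data above (the variable name `first_values` is just for illustration). Taking the first value of each dict returns whichever key happened to be inserted first, so the result mixes prices and weights:

```python
import pandas as pd

# Two sample rows modelled on the question's data; the key order differs.
df = pd.DataFrame(
    {
        "A": [
            {"price": "7.25", "weight": 0.0, "time": "4.05am"},
            {"weight": 0.0, "price": 7.12, "time": "1.09am"},
        ]
    },
    index=["2017-05-31", "2017-06-07"],
)

# Positional access picks the first inserted key of each dict,
# which is 'price' in the first row but 'weight' in the second.
first_values = df["A"].apply(lambda d: list(d.values())[0])
print(first_values.tolist())  # ['7.25', 0.0]
```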

I considered sorting the dictionary values, but I am not sure how to do that inside the df.

Can anyone help me solve this?

Use apply with a lambda to select the value by key:

df['A'] = df['A'].apply(lambda x: x['price'])
print (df)
               A
2017-05-31  7.25
2017-06-01  7.22
2017-06-02  7.24
2017-06-05  7.25
2017-06-06  7.19
2017-06-07  7.12
2017-06-09  7.46
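A self-contained version of the same idea might look like the sketch below. The sample rows are copied from the question; the extra `pd.to_numeric` step is an addition not in the original answer, included on the assumption that you want floats, since some prices are stored as strings and others as numbers.

```python
import pandas as pd

# Sample rows from the question; note the mixed str/float prices.
df = pd.DataFrame(
    {
        "A": [
            {"price": "7.25", "weight": 0.0, "time": "4.05am"},
            {"price": "7.19", "weight": 0.0, "time": "3.33am"},
            {"weight": 0.0, "price": 7.12, "time": "1.09am"},
        ]
    },
    index=["2017-05-31", "2017-06-06", "2017-06-07"],
)

# Look up by key, so the order of keys inside each dict does not matter.
prices = df["A"].apply(lambda d: d["price"])

# Optional: convert the mixed strings and floats to a numeric dtype.
prices = pd.to_numeric(prices)
print(prices.tolist())  # [7.25, 7.19, 7.12]
```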

To get all the values, use the DataFrame constructor:

df1 = pd.DataFrame(df['A'].values.tolist(), index=df.index)
print (df1)
           price    time  weight
2017-05-31  7.25  4.05am     0.0
2017-06-01  7.22  4.08am     0.0
2017-06-02  7.24  5.08am     0.0
2017-06-05  7.25  6.07am     0.0
2017-06-06  7.19  3.33am     0.0
2017-06-07  7.12  1.09am     0.0
2017-06-09  7.46  2.08am     0.0
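As a runnable sketch of the constructor approach, with sample rows taken from the question: the column order you see may differ from the output above, because recent pandas versions generally follow the order in which keys are first encountered rather than sorting them alphabetically.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [
            {"price": "7.25", "weight": 0.0, "time": "4.05am"},
            {"weight": 0.0, "price": 7.12, "time": "1.09am"},
        ]
    },
    index=["2017-05-31", "2017-06-07"],
)

# A list of dicts becomes one column per key; rows are aligned by key,
# not by position, so differing key order is handled correctly.
df1 = pd.DataFrame(df["A"].tolist(), index=df.index)
print(df1)
```

Every key becomes its own column, so `df1["price"]`, `df1["time"]` and `df1["weight"]` are all available afterwards.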

You can use apply and pass a lambda to access the key of interest:

df['A'].apply(lambda x: x['price'])

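Personally, I would avoid storing non-scalar values in a df, because IMO you lose all of the vectorization benefits of using pandas. If the dicts all have the same keys, I would simply expand them so that the keys become columns holding the corresponding values; then you can do df['price'] and perform vectorized arithmetic on it.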
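A minimal sketch of that suggestion, assuming the dicts share the same keys: the names `expanded` and `daily_change` are hypothetical, and the `pd.to_numeric` conversion is an added assumption needed because some prices in the sample data are strings.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [
            {"price": "7.25", "weight": 0.0, "time": "4.05am"},
            {"price": "7.22", "weight": 0.0, "time": "4.08am"},
            {"weight": 0.0, "price": 7.12, "time": "1.09am"},
        ]
    },
    index=["2017-05-31", "2017-06-01", "2017-06-07"],
)

# Expand the dicts so that each key becomes a regular column.
expanded = pd.DataFrame(df["A"].tolist(), index=df.index)

# Convert the mixed str/float prices to floats before doing arithmetic.
expanded["price"] = pd.to_numeric(expanded["price"])

# Vectorized operations now work directly on the column.
daily_change = expanded["price"].diff()  # row-to-row price difference
print(expanded["price"].mean())
print(daily_change)
```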

I think df['A'].apply(lambda x: x['price']) works. The OP needs the 'price' key rather than the 'time' key, no?

I will consider your suggestion. Thank you very much.