Python: populating a DataFrame from dictionary keys and values


python pandas dictionary dataframe


I have the following dataframe as an example:

import pandas as pd

df_test = pd.DataFrame(data=None, index=["green","yellow","red","pink"], columns=["bear","dog","cat"], dtype=None, copy=False)

I also have the following dictionary, whose keys and values match (or relate to) the index and columns of my dataframe:

d = {"green":["bear","dog"], "yellow":["bear"], "red":["bear"]}

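I want to fill my dataframe based on the keys and values shown, and if a key is not present in the dictionary, I want to fill its row with Empty.

Expected output

I can only think of building lists and searching through them with loops. Is there a simple way to achieve this? Or a function that could help me?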

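Loop over the dictionary and set the matching cells to 'Yes', then use mask to replace every row that is still entirely missing with 'Empty', and finally use fillna to replace the remaining missing values with 'No':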

for k, v in d.items():
    for x in v:
        df_test.loc[k, x] = 'Yes'

df_test = df_test.mask(df_test.isnull().all(axis=1), 'Empty').fillna('No')
print (df_test)
         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
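
For anyone who wants to run the loop approach in isolation, here is a minimal self-contained sketch. The helper name `fill_from_dict` and its parameters are made up for illustration and are not part of the answer above. It marks the untouched rows before filling, which should give the same table as the `mask` / `fillna` chain.

```python
import pandas as pd


def fill_from_dict(index, columns, mapping):
    """Build a Yes/No/Empty table (hypothetical helper, for illustration only)."""
    # Start from an all-NaN frame with object dtype so strings can be assigned
    table = pd.DataFrame(index=index, columns=columns, dtype=object)

    # Mark every (key, value) pair found in the dictionary
    for key, values in mapping.items():
        for value in values:
            table.loc[key, value] = "Yes"

    # Rows that are still entirely NaN never received a value:
    # the key is absent from the mapping (or its list is empty)
    untouched = table.isna().all(axis=1)

    table = table.fillna("No")
    table.loc[untouched] = "Empty"
    return table


d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}
result = fill_from_dict(["green", "yellow", "red", "pink"], ["bear", "dog", "cat"], d)
print(result)
```

Note that a key whose list is empty would also end up as an 'Empty' row here, just as it would with the `mask` version.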

This is a vectorized solution via … and …
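
One possible loop-free sketch is shown below. It assumes `Series.explode`, `pd.get_dummies` and `DataFrame.reindex`; these are my own choice of tools and not necessarily the functions this answer had in mind.

```python
import numpy as np
import pandas as pd

index = ["green", "yellow", "red", "pink"]
columns = ["bear", "dog", "cat"]
d = {"green": ["bear", "dog"], "yellow": ["bear"], "red": ["bear"]}

# One row per (key, value) pair: green/bear, green/dog, yellow/bear, red/bear
pairs = pd.Series(d).explode()

# Indicator columns per value, collapsed back to one row per key
flags = pd.get_dummies(pairs).groupby(level=0).max().astype(bool)

# Restore the original key order and add columns no key refers to (here "cat")
flags = flags.reindex(index=list(d), columns=columns, fill_value=False)

# Turn booleans into labels, then add the rows missing from the dictionary
labels = pd.DataFrame(
    np.where(flags.to_numpy(), "Yes", "No"),
    index=flags.index,
    columns=flags.columns,
)
result = labels.reindex(index).fillna("Empty")
print(result)
```

Mapping to 'Yes' / 'No' before the final `reindex` keeps the missing rows easy to tell apart, since they are the only NaN cells left at that point.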

You can achieve your goal as follows:

# You can use elements that are not in the original dataframe
# and the row will be filled with empty

index_list = ["green", "yellow", "red", "pink", "purple"]

import numpy as np

replace_dict = {True: 'Yes', False: 'No', np.nan:'Empty'}

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
        index=x.index), axis=1).reindex(index_list).replace(replace_dict) 

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
purple  Empty  Empty  Empty

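Explanation

You can do what you want by checking, for each row, whether the dataframe's columns appear in the corresponding entry of the dict: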

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
    index=x.index), axis=1)

        bear    dog    cat
green   True   True  False
yellow  True  False  False
red     True  False  False

Then reindex with the full list of colors, so that the colors missing from the dict are added as rows filled with NaN:

index_list = ["green","yellow","red","pink", "purple"]

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
       index=x.index), axis=1).reindex(index_list)

        bear    dog    cat
green   True   True  False
yellow  True  False  False
red     True  False  False
pink     NaN    NaN    NaN
purple   NaN    NaN    NaN
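
As a small standalone illustration of this step, the sketch below rebuilds the boolean table by hand and reindexes it. It also shows the `fill_value` argument of `reindex`, which is an alternative I am suggesting here and not something the answer itself uses.

```python
import pandas as pd

# Boolean table as produced by the previous step (typed in by hand here)
flags = pd.DataFrame(
    {"bear": [True, True, True], "dog": [True, False, False], "cat": [False, False, False]},
    index=["green", "yellow", "red"],
)
index_list = ["green", "yellow", "red", "pink", "purple"]

# Labels absent from the original index become rows of NaN
expanded = flags.reindex(index_list)
print(expanded)

# Alternatively, the filler for new rows can be chosen directly
labelled = flags.reindex(index_list, fill_value="Empty")
print(labelled)
```

With `fill_value`, only the True / False cells would still need replacing afterwards.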

Then, if you want to change the values, you can replace them using a dictionary like this:

replace_dict = {True: 'Yes', False: 'No', np.nan:'Empty'}

df_test.loc[list(d.keys())].apply(lambda x : pd.Series(x.index.isin(d[x.name]),
        index=x.index), axis=1).reindex(index_list).replace(replace_dict) 

         bear    dog    cat
green     Yes    Yes     No
yellow    Yes     No     No
red       Yes     No     No
pink    Empty  Empty  Empty
purple  Empty  Empty  Empty

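@may - The solution was modified, please check.

Thanks! I think it works now! I will accept it soon :)

@may - Do you think the lists in the dict are empty? Or are there NaN values?

For me it works fine, without the last pink row. Is the problem in the real data? Or in the sample as well?

Only yours works! Sorry, this is the best answer for my question :D

Same problem. pink is not in the dictionary, and it disappears with this solution.

@may, no, I did not make up my results. pink shows up as the last row. Hence the index=np.hstack((df_test.index, df.index.difference(df_test.index))) part.

@may - So it works for you with index=["green", "yellow", "red", "pink"]?

Yes, it works! Just pass the list to reindex. If a label does not exist, its row will be filled with Empty. Added an example.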
