Alternative values in a column to test with .isin() (Python)
Tags: python, pandas, dataframe

Consider two DataFrames:

import pandas as pd

df1 = pd.DataFrame(['apple and banana are sweet fruits','how fresh is the banana','cherry from japan'],columns=['fruits_names'])
df2 = pd.DataFrame([['apple','red'],['banana','yellow'],['cherry','black']],columns=['fruits','colors'])
Then this code:

colors = []
for f in df1.fruits_names.str.split().apply(set):   # turn each row into a set of its words
    color = [df2[df2['fruits'].isin(f)]['colors']]  # colors of the fruits found among those words
    colors.append(color)
I can then easily insert the colors into df1:

df1['color'] = colors

output:
                    fruits_names            color
0  apple and banana are sweet fruits  [[red, yellow]]
1            how fresh is the banana       [[yellow]]
2                  cherry from japan        [[black]]
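The problem arises when the 'fruits' column holds alternative values, such as:

df2 = pd.DataFrame([[['green apple|opal apple'],'red'],[['banana|cavendish banana'],'yellow'],['cherry','black']],columns=['fruits','colors'])
How can I keep this code working?

The last thing I tried was to create a new column containing the separated fruit names:

df2['Types'] = df2['fruits'].str.split('|')
and then to use .apply(tuple) on it.

But it doesn't match.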

I think you need:

print(df1)

    fruits_names
0   green apple and banana are sweet fruits
1   how fresh is the banana
2   cherry and opal apple from japan
Use split and df.explode() on df2.

Output:

   fruits              colors
0   green apple        red
0   opal apple         red
1   banana             yellow
1   cavendish banana   yellow
2   cherry             black
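
The snippet for the split and df.explode() step is not shown above, so here is a minimal sketch of what it might look like. It assumes df2 stores the alternatives as plain '|'-separated strings (not wrapped in lists, as in the question's second df2), and it is only one way to arrive at the exploded table.

```python
import pandas as pd

df2 = pd.DataFrame(
    [['green apple|opal apple', 'red'],
     ['banana|cavendish banana', 'yellow'],
     ['cherry', 'black']],
    columns=['fruits', 'colors'],
)

# Split each cell on the literal '|' to get a list of alternative names,
# then explode so that every alternative gets its own row.
# The original row index is repeated for rows that came from the same cell.
df2 = df2.assign(fruits=df2['fruits'].str.split('|')).explode('fruits')
print(df2)
```

After this step each fruit name sits in its own row next to its color, which is what the dict conversion relies on.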
Convert it to a dict:

d = dict(zip(df2["fruits"], df2["colors"]))
Create the column based on the condition:

df1["colors"] = [[v for k,v in d.items() if k in x] for x in df1["fruits_names"]]

print(df1)
Final output:

    fruits_names                            colors
0   green apple and banana are sweet fruits [red, yellow]
1   how fresh is the banana                 [yellow]
2   cherry and opal apple from japan        [red, black]
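
One thing to watch with the substring test `k in x`: a text that mentions "cavendish banana" also contains "banana", so the same color would be collected twice. The sketch below is a possible variation, not part of the answer above; the helper name colors_for and the sample sentences are made up for illustration. It drops repeated colors while keeping their first-seen order.

```python
import pandas as pd

# Hypothetical sample texts, chosen so that one row hits two names of the same color
df1 = pd.DataFrame(
    ['cavendish banana and cherry', 'how fresh is the banana'],
    columns=['fruits_names'],
)

# Lookup with one fruit name per key, the same shape as the dict d built earlier
d = {
    'green apple': 'red', 'opal apple': 'red',
    'banana': 'yellow', 'cavendish banana': 'yellow',
    'cherry': 'black',
}

def colors_for(text):
    # Collect the color of every fruit name that occurs in the text ...
    hits = [color for name, color in d.items() if name in text]
    # ... then remove repeats; dict.fromkeys keeps the first-seen order.
    return list(dict.fromkeys(hits))

df1['colors'] = df1['fruits_names'].map(colors_for)
print(df1)
```

Whether duplicates should be removed depends on what the colors column is used for; if the count of matches matters, the plain list comprehension is the better fit.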

Hi, try this. You can customize it further with comprehensions over the data structures.
import pandas as pd

df1 = pd.DataFrame(['green apple and banana are sweet fruits','how fresh is the banana','cherry from japan'],columns=['fruits_names'])
df2 = pd.DataFrame([['green apple|opal apple','red'],['banana|cavendish banana','yellow'],['cherry','black']],columns=['fruits','colors'])

# Split each 'fruits' entry on '|' into a list of alternative fruit names
df2['sep_fruits'] = df2['fruits'].str.split('|')

# Map each color to its list of fruit names
dic = dict(zip(df2['colors'].tolist(), df2['sep_fruits'].tolist()))

final = []
for text in df1['fruits_names']:
    list1 = []
    for key, value in dic.items():
        for item in value:
            # Substring test: record the color when a fruit name occurs in the text
            if item in text:
                list1.append(key)
    final.append(list1)

df1['colors'] = final
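
As a possible alternative to the nested loops, note that a string such as 'green apple|opal apple' is already a valid regex alternation, so Series.str.contains can test all alternatives of one color at once. This is only a sketch under two assumptions that do not come from the answers above: the fruit names contain no regex metacharacters other than the '|' separator (otherwise they would need escaping with re.escape), and each color appears only once in df2. The name masks is introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    ['green apple and banana are sweet fruits',
     'how fresh is the banana',
     'cherry from japan'],
    columns=['fruits_names'],
)
df2 = pd.DataFrame(
    [['green apple|opal apple', 'red'],
     ['banana|cavendish banana', 'yellow'],
     ['cherry', 'black']],
    columns=['fruits', 'colors'],
)

# One boolean Series per color: True where any alternative name occurs in the text.
# Assumes unique colors, since the color is used as the dict key.
masks = {
    color: df1['fruits_names'].str.contains(pattern, regex=True)
    for pattern, color in zip(df2['fruits'], df2['colors'])
}

# For each row of df1, keep the colors whose mask is True at that position.
df1['colors'] = [
    [color for color, mask in masks.items() if mask.iloc[i]]
    for i in range(len(df1))
]
print(df1)
```

Because each color is tested once per row, a text matching several alternatives of the same color still yields that color only once.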