Python: when you need to pass 'self' as an argument (pandas, DataFrame, apply)

python pandas dataframe

I have a DataFrame df in which one column is 'keywords' and another is 'possible keywords', so the first two rows look like this:

df['keywords'][0] = 'traveling'
df['possible keywords'][0] = ['traveling', 'fishing','cooking']

df['keywords'][1] = 'fishing'
df['possible keywords'][1] = ['traveling', 'fishing','cooking']

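Let's assume that every element of the df['possible keywords'] series contains the same list, with the same elements (['traveling', 'fishing', 'cooking']).

I want to generate a third column containing the 'possible keywords' that are not in the 'keywords' column, so the corresponding rows look like this: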

df['non keywords'][0] = ['fishing','cooking']
df['non keywords'][1] = ['traveling','cooking']
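
For reference, a frame like the one described above could be built as follows. This is only a minimal sketch: the two rows are taken from the example, while the names `possible` and `expected_non_keywords` are made up for illustration.

```python
import pandas as pd

# Sample frame built from the two rows shown above.
possible = ['traveling', 'fishing', 'cooking']
df = pd.DataFrame({
    'keywords': ['traveling', 'fishing'],
    # Each row gets its own copy of the list so rows never share one object.
    'possible keywords': [list(possible), list(possible)],
})

# Target values for the new column, written out by hand for comparison.
expected_non_keywords = [['fishing', 'cooking'], ['traveling', 'cooking']]
```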

I can achieve this with the following code:

def establish(X):
    my_list = ['traveling', 'fishing','cooking']
    for element in my_list:
        if element in X:
            my_list.remove(element)
            return my_list

data['non keywords'] = data['keywords'].apply(establish)

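However, this forces me to hard-code the values from the 'possible keywords' column as my_list inside the establish function.

How can I pass the values in 'possible keywords' to the establish function as an argument?

What I have tried so far runs into the following problems:

New version of the establish function: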

def establish(my_list,X):
    for element in my_list:
        if element in X:
            my_list.remove(element)
            return my_list

my_list = ['traveling', 'fishing','cooking']
data['non keywords'] = data['keywords'].apply(establish(my_list))

Traceback (most recent call last):
  File "C:\Users\xxx\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-21-859ebaa71600>", line 1, in <module>
    data['non keywords'] = data['keywords'].apply(establish(my_list))
TypeError: establish() missing 1 required positional argument: 'X'

Passing both arguments, as in establish(my_list, data['keywords']), fails differently:

Traceback (most recent call last):
  File "C:\Users\xxx\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3035, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-22-ee891e061f5a>", line 1, in <module>
    data['non keywords'] = data['original_keyword'].apply(establish(my_list,data['keywords']))
  File "C:\Users\xxxx\Anaconda3\lib\site-packages\pandas\core\series.py", line 2058, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\src\inference.pyx", line 1046, in pandas.lib.map_infer (pandas\lib.c:56983)
TypeError: 'NoneType' object is not callable

Any help is greatly appreciated.

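The apply() method expects a function or other callable as its argument, which is exactly what you pass in your first example when you pass establish. Internally, pandas calls the function you pass once for each entry of the specified column, with that entry as the argument.

Calling establish(my_list) will not work, because your function now takes two arguments.

Calling establish(my_list, data['keywords']) is a "valid" function call, but it returns None and receives the wrong type for its second argument, since establish expects a single entry rather than a whole column. Once it returns None, that None is what actually gets passed to apply(). None is obviously not callable, so pandas raises TypeError: 'NoneType' object is not callable.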
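
Both failures can be reproduced without calling apply() at all. The following is a minimal sketch that reuses the two-argument establish from the question on a made-up two-row frame; it assumes the default integer index, which is why the membership test never matches.

```python
import pandas as pd

def establish(my_list, X):
    # Two-argument version from the question, unchanged.
    for element in my_list:
        if element in X:
            my_list.remove(element)
            return my_list

data = pd.DataFrame({'keywords': ['traveling', 'fishing']})
my_list = ['traveling', 'fishing', 'cooking']

# `in` on a Series tests the index labels (0 and 1 here), not the values,
# so no element matches, the loop finishes, and the function returns None.
result = establish(my_list, data['keywords'])
print(result)            # None
print(callable(result))  # False

# Calling with a single argument fails before apply() is ever reached.
try:
    establish(my_list)
except TypeError as exc:
    print(exc)
```

In both cases the expression inside apply(...) is evaluated first, so apply() never receives the establish function itself.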

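One solution is to create a helper function that "pre-bakes" the first argument and takes the second argument as its only parameter, then calls establish() with both. You can then pass that helper function to the apply() method. A convenient way to do this is with functools.partial, as in the snippet at the end of this post.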
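
The "pre-baking" idea can also be written out by hand, which may make it clearer what the helper does. This is a minimal sketch: `make_helper` is a hypothetical name, and establish is replaced here by a non-mutating variant (an assumption, not the question's code) so that each row gets an independent result.

```python
import pandas as pd

def establish(my_list, X):
    # Non-mutating variant (an assumption, not the question's code).
    return [elem for elem in my_list if elem != X]

data = pd.DataFrame({'keywords': ['traveling', 'fishing']})
my_list = ['traveling', 'fishing', 'cooking']

def make_helper(fixed_list):
    # Hypothetical factory: returns a one-argument function with the list baked in.
    def helper(X):
        return establish(fixed_list, X)
    return helper

# apply() receives a callable that takes exactly one entry of the column.
data['non keywords'] = data['keywords'].apply(make_helper(my_list))

# Equivalent one-liner with a lambda.
via_lambda = data['keywords'].apply(lambda X: establish(my_list, X))
```

Either form gives apply() what it needs: a callable that accepts a single column entry.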

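Hi, thank you very much for your reply. I tried your approach, but it put None into every entry of data['non keywords'].

Hmm.. I just tried it with your example and got ['cooking'] for both rows of 'non keywords'. The problem is that your establish() method keeps modifying the same list (in your case, None is what you would get if the establish function returned None regardless of the input). Try changing the establish() method to do something like return [elem for elem in my_list if elem != X], or at least make sure you copy my_list rather than editing it in place. Hope that helps.

Thank you very much lemonhead, the syntax you suggested works well: def establish(x, my_list): return [elem for elem in my_list if elem not in x]. One more thing to note: I can use data['keywords'].apply(establish, args=(my_list,)), which I believe works in newer pandas versions, as an alternative to using functools.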
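
The two points from the comments above can be checked side by side. This is a sketch on a made-up two-row frame; `establish_inplace` is a hypothetical name given to the question's mutating version so that both variants can coexist.

```python
import functools
import pandas as pd

def establish_inplace(my_list, X):
    # Question's version: removes from the list it was given.
    for element in my_list:
        if element in X:
            my_list.remove(element)
            return my_list

def establish(x, my_list):
    # Version from the comment above: builds a new list on each call.
    return [elem for elem in my_list if elem not in x]

data = pd.DataFrame({'keywords': ['traveling', 'fishing']})

shared = ['traveling', 'fishing', 'cooking']
buggy = data['keywords'].apply(functools.partial(establish_inplace, shared))
# Both cells point at the same, twice-shrunk list object.
print(buggy.tolist())   # [['cooking'], ['cooking']]

my_list = ['traveling', 'fishing', 'cooking']
# args is appended after the element, so the call is establish(x, my_list).
fixed = data['keywords'].apply(establish, args=(my_list,))
print(fixed.tolist())   # [['fishing', 'cooking'], ['traveling', 'cooking']]
```

Note the argument order: with args=(my_list,), the column entry comes first, which is why the comment's version is declared as establish(x, my_list).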

import functools
my_list = ['traveling', 'fishing','cooking']
helper_func = functools.partial(establish, my_list) # note that helper_func is an actual function that you can call
data['non keywords'] = data['keywords'].apply(helper_func)
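
If the list should really come from the 'possible keywords' column of each row rather than from one shared my_list, a row-wise DataFrame.apply is another option. This is a sketch under that assumption, not something proposed in the thread; `non_keywords` is a hypothetical helper name and the frame is made up from the question's example.

```python
import pandas as pd

df = pd.DataFrame({
    'keywords': ['traveling', 'fishing'],
    'possible keywords': [['traveling', 'fishing', 'cooking'],
                          ['traveling', 'fishing', 'cooking']],
})

def non_keywords(row):
    # Hypothetical helper: reads both columns from the same row.
    return [kw for kw in row['possible keywords'] if kw != row['keywords']]

# axis=1 passes each row (as a Series) to the function.
df['non keywords'] = df.apply(non_keywords, axis=1)
```

Because the helper builds a new list for every row, no list is shared or modified in place.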