Python: query (eval) filter for many-to-many matching

python pandas

Suppose I have a column whose values are lists:

df = pd.DataFrame([[["item1", "item2"]]], columns=["a"])
     a
0   [item1, item2]

I want to match rows whose list contains any item from this list:

mylist = ["item1", "item3"]

without getting partial matches.

This works, but it also returns partial matches:

df.query('a.str.join(" ").str.contains("|".join(@mylist))', engine='python')
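To make the partial-match problem concrete, here is a minimal sketch using hypothetical data (the extra rows and the value "item10" are not from the original question). It applies the same join-then-contains logic directly on the Series, outside of query:

```python
import pandas as pd

# Hypothetical data: "item10" is not in mylist, but contains "item1" as a substring.
df = pd.DataFrame({"a": [["item1", "item2"], ["item10", "item2"], ["item4"]]})
mylist = ["item1", "item3"]

# Join each list into one string: "item1 item2", "item10 item2", "item4".
joined = df["a"].str.join(" ")

# Plain alternation "item1|item3" matches anywhere in the string,
# so the second row is reported as a match even though it should not be.
mask = joined.str.contains("|".join(mylist))
print(mask.tolist())  # [True, True, False]
```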

One approach I thought of is to use whole-word matching with str.contains, like this:

df[df.a.str.join(" ").str.contains(r"\bitem1\b")]

This works well on its own, but not inside query or eval. When I try to implement it in query like this, it does not work:

df.query('a.str.join(" ").str.contains(r"\bitem1\b")', engine='python') # also use @mylist here

I have already received an answer on how to do this without query (df[[bool(set(x).intersection(mylist)) for x in df['a']]]), but in my system I have to use query or eval if I want to avoid rewriting most of the code.
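For reference, the non-query approach mentioned above can be sketched as follows. The sample rows are hypothetical; the set-intersection expression follows the one quoted in the question:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [["item1", "item2"], ["item10", "item2"]]})
mylist = ["item1", "item3"]

# Exact membership test: a row matches only if one of its list items
# is equal to an item in mylist, so "item10" does not count as "item1".
wanted = set(mylist)
mask = [bool(wanted.intersection(x)) for x in df["a"]]
result = df[mask]
print(result.index.tolist())  # [0]
```

This gives the exact-match semantics being asked for, which is useful as a check on whatever query-based expression is used instead.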

Your regular expression should work; you just need to escape the backslashes:

df.query('a.str.join(" ").str.contains(r"\\bitem1\\b")', engine='python')
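The reason, as far as I can tell, is that the query expression is an ordinary (non-raw) Python string, so "\b" is turned into a backspace character before pandas ever parses the expression. The sketch below shows the escaped form on hypothetical data, and one possible way to extend it to the whole of mylist by building the pattern outside the expression and referencing it with @. The variable name pattern and the extra rows are assumptions for illustration, not part of the original answer:

```python
import re
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [["item1", "item2"], ["item10", "item2"], ["item3"]]})
mylist = ["item1", "item3"]

# "\\b" in the outer string becomes a literal backslash + "b" in the
# expression, which the inner r"..." then passes to the regex engine
# as a word boundary.
single = df.query('a.str.join(" ").str.contains(r"\\bitem1\\b")', engine="python")

# Build a whole-word alternation such as \b(?:item1|item3)\b.
# re.escape guards against regex metacharacters in the items; the
# non-capturing group avoids pandas' warning about match groups.
pattern = r"\b(?:" + "|".join(map(re.escape, mylist)) + r")\b"
multi = df.query('a.str.join(" ").str.contains(@pattern)', engine="python")

print(single.index.tolist())  # [0]
print(multi.index.tolist())   # [0, 2]
```

Note that \b only separates word characters from non-word characters, so this sketch assumes the items consist of letters, digits, and underscores; items containing spaces or punctuation could still produce unintended matches after joining.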

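What is wrong with using a regular expression in the query function? Do you get an error?

@KenSyme I get no match for item1; it seems to treat \b as part of the word?

Could you try escaping the \, maybe? So .contains(r"\\bitem1\\b")?

@KenSyme You are right, I had to escape it. I thought I had tried that. Thanks.

No problem. I have posted an answer; could you accept it to close the question?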