The isin function skips correct values in Python


python python-3.x pandas


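I am working with the JSON response from a GET API request. I have worked out how to flatten the response, and I now want to filter the resulting DataFrame so that it keeps only the records containing a PDF file extension, which will then be used to retrieve the files of interest. The code is as follows: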

from flatten_json import flatten
import requests
import pandas as pd
import re

# Query parameters for the debates endpoint
payload = {"chamber_type": "committee", "chamber": "dail", "date_start": "2018-01-01", "date_end": "2018-12-31", "limit": "1000"}
test = requests.get("https://api.oireachtas.ie/v1/debates", params=payload)
text = test.content.decode("utf-8")
print(text)
test.json()
# Flatten the nested JSON into a single dict, then turn it into a one-column DataFrame
test1 = flatten(test.json())
df = pd.Series(test1).to_frame()
# Problem line: isin() only matches index labels that are exactly "uri_pdf"
df[["pdf"]] = df[df.index.isin(["uri_pdf"])]

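The whole df comes back as NaN, even though it should give a positive result.

I also tried filtering the index with the same expression, but the result was an empty df.

Where is isin going wrong here?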

.isin() may not work the way you expect (for example, it is not a "contains" check). IIUC, you need str.contains() instead.

Or, in your case, you could use str.endswith().
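To make the difference concrete, here is a minimal sketch comparing the three filters on a small stand-in for the flattened response. The key names and URLs in `flat` are made up for illustration; the real API response may use different ones.

```python
import pandas as pd

# Hypothetical flattened keys; the real response may name them differently.
flat = {
    "results_0_formats_pdf_uri": "https://example.org/a.pdf",
    "results_0_formats_xml_uri": "https://example.org/a.xml",
    "results_1_formats_pdf_uri": "https://example.org/b.pdf",
}
df = pd.Series(flat).to_frame(name="value")

# isin() keeps only labels that are exactly equal to one of the listed values.
exact = df[df.index.isin(["pdf_uri"])]

# str.contains() keeps labels that contain the pattern anywhere.
contains = df[df.index.str.contains("pdf_uri")]

# str.endswith() keeps labels that end with the given suffix.
endswith = df[df.index.str.endswith("pdf_uri")]

print(len(exact), len(contains), len(endswith))
```

No label is exactly "pdf_uri", so the isin() filter is empty, while both string methods keep the two PDF rows. Note that str.contains() treats its argument as a regular expression by default; pass regex=False if the pattern should be taken literally.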

Hi Danail, have you checked whether it works? In my case it does not. Either way, "index" is not defined, not even when I iterate over df.index as in my original post.

What do you get if you simply run df[df['index'].str.endswith("uri_pdf")]? From what I can see in your output, the name is capitalized: "Index".

Index is actually the DataFrame's index, not a column, which is why the output is:
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas
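The index-versus-column confusion in this exchange can be reproduced with a short sketch. The two keys below are invented for illustration. Looking up df["index"] fails because the labels live in the index, not in a column; filtering on df.index directly works, as does moving the index into a column with reset_index(), which names the new column "index" when the index has no name.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the flattened response.
df = pd.Series({"a_pdf_uri": "x.pdf", "a_xml_uri": "x.xml"}).to_frame(name="value")

# The labels are the index, so there is no column called "index".
try:
    df["index"]
except KeyError:
    column_lookup_failed = True
else:
    column_lookup_failed = False

# Option 1: apply the string method to the index itself.
via_index = df[df.index.str.endswith("pdf_uri")]

# Option 2: turn the index into an ordinary column first.
as_column = df.reset_index()
via_column = as_column[as_column["index"].str.endswith("pdf_uri")]

print(column_lookup_failed, list(via_index["value"]), list(via_column["value"]))
```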

df[df.index.str.contains('pdf_uri')]

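After some adjustments, your solution finally worked for me. Now I know that isin works with exact matches, whereas str.contains is better suited to searching for a fragment within a string.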

df[df.index.str.contains('pdf_uri')]

df[df.index.str.endswith('pdf_uri')]
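As a rough sketch of why the question's assignment produced only NaN, and of one way to collect the PDF links instead: assigning a Series to a column aligns it on the index, so an empty filter result leaves every row as NaN. The keys and URLs below are invented, and the simplified single-column assignment is an assumption standing in for the original df[["pdf"]] = ... line, whose exact behaviour may vary between pandas versions.

```python
import pandas as pd

# Hypothetical flattened keys; not taken from the real API response.
flat = {
    "results_0_formats_pdf_uri": "https://example.org/a.pdf",
    "results_0_formats_xml_uri": "https://example.org/a.xml",
}
df = pd.Series(flat).to_frame(name="value")

# The exact-match filter finds nothing, so the result has zero rows.
empty = df[df.index.isin(["pdf_uri"])]

# Assignment aligns on the index; with no matching labels every row becomes NaN.
df["pdf"] = empty["value"]
all_nan = bool(df["pdf"].isna().all())

# Selecting the values directly avoids the alignment step altogether.
pdf_urls = df.loc[df.index.str.endswith("pdf_uri"), "value"].tolist()

print(all_nan, pdf_urls)
```

The resulting pdf_urls list can then be passed on to whatever code downloads the files.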