Python: Identifying itunes:keywords and itunes:category separately with feedparser?

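I am using feedparser to parse RSS feeds, but I cannot unambiguously identify the `itunes:category` values. Looking at the list of tests, it seems that both `itunes:keywords` and `itunes:category` values are put into the `feed['tags']` dictionary. From the test for category: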

<!--
Description: iTunes channel category
Expect:      not bozo and feed['tags'][0]['term'] == 'Technology'
-->
<rss xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd">
    <channel>
        <itunes:category text="Technology"></itunes:category>
    </channel>
</rss>

And for the example feed above, the entry is:

<itunes:keywords>Hurley, Liss, feelings</itunes:keywords>

Is there any way to uniquely identify the values that come from the `itunes:category` tags?

I could not find a way to do this using just feedparser, so I also made use of BeautifulSoup:

feedparser implements specific `itunes:x` attributes.

`itunes:category` is available in feedparser as `category`.

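`itunes:keywords` are indeed renamed to tags in feedparser, with each keyword stored in the tag's `term`.

However, the channel keywords are mixed in with the item keywords. To identify the item keywords separately, use `scheme` as a filter.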

import feedparser

# `url` is the address of the feed to parse
feedp = feedparser.parse(url)
# terms of every tag on the feed (item and channel keywords alike)
keywords = [k["term"] for k in feedp["feed"]["tags"]]
# only the terms whose scheme is the iTunes one
itunes_keywords = [t["term"] for t in feedp["feed"]["tags"] if t["scheme"] == 'http://www.itunes.com/']

This may drop other tags (if any are present), but when `itunes:keywords` and ordinary tags coexist they are duplicates of each other anyway.
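
To see what the `scheme` filter actually keeps, here is a minimal sketch on plain dicts shaped like the entries of `feed['tags']`, so it runs without fetching a feed. The first five dicts are the output shown in the question; the last one, the `ITUNES_SCHEME` constant, and the `terms_with_scheme` helper are my own illustrative additions, not part of feedparser.

```python
# Plain dicts shaped like the entries of feed['tags'].
# The final "podcast" tag is a hypothetical non-iTunes tag added for illustration.
tags = [
    {"label": None, "scheme": "http://www.itunes.com/", "term": "Hurley"},
    {"label": None, "scheme": "http://www.itunes.com/", "term": "Liss"},
    {"label": None, "scheme": "http://www.itunes.com/", "term": "feelings"},
    {"label": None, "scheme": "http://www.itunes.com/", "term": "Society & Culture"},
    {"label": None, "scheme": "http://www.itunes.com/", "term": "Technology"},
    {"label": None, "scheme": None, "term": "podcast"},
]

ITUNES_SCHEME = "http://www.itunes.com/"


def terms_with_scheme(tags, scheme):
    """Return the terms of all tags whose scheme equals `scheme`."""
    return [t["term"] for t in tags if t.get("scheme") == scheme]


itunes_terms = terms_with_scheme(tags, ITUNES_SCHEME)
print(itunes_terms)
```

With the data from the question, the keywords and the categories carry the same scheme, so this filter separates iTunes tags from other tags but does not by itself tell a keyword from a category.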

`itunes:duration` is available as `itunes_duration`.
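
Once you have the duration string, you will probably want it as a number. A small sketch, assuming the value is either a plain count of seconds or a colon-separated `H:MM:SS` / `MM:SS` string; the `duration_to_seconds` helper is my own name, not something feedparser provides.

```python
def duration_to_seconds(value):
    """Convert an itunes:duration string to whole seconds.

    Assumes `value` is a plain number of seconds ("3600") or a
    colon-separated string with at most three parts ("1:02:03", "45:10").
    """
    parts = [int(part) for part in value.strip().split(":")]
    if len(parts) > 3:
        raise ValueError("unexpected duration format: %r" % value)
    seconds = 0
    for part in parts:
        # each step to the right is worth 60 times less than the previous one
        seconds = seconds * 60 + part
    return seconds


print(duration_to_seconds("1:02:03"))
print(duration_to_seconds("45:10"))
```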

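A bit off-topic, but to complete the answer:

If multiple categories are available, they are exposed as tuples in `categories`, as described in the feedparser reference.

But iTunes does not have multiple categories.

There is no longer any need to parse the feed a second time with BeautifulSoup4.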

<itunes:category text="Society &amp; Culture"/>
<itunes:category text="Technology"/>

[{'label': None, 'scheme': 'http://www.itunes.com/', 'term': 'Hurley'},
 {'label': None, 'scheme': 'http://www.itunes.com/', 'term': 'Liss'},
 {'label': None, 'scheme': 'http://www.itunes.com/', 'term': 'feelings'},
 {'label': None, 'scheme': 'http://www.itunes.com/', 'term': 'Society & Culture'},
 {'label': None, 'scheme': 'http://www.itunes.com/', 'term': 'Technology'}]

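import bs4

# raw_data holds the raw XML text of the feed
soup = bs4.BeautifulSoup(raw_data, "lxml")

def is_itunes_category(tag):
    # match only <itunes:category> elements
    return tag.name == 'itunes:category'

# the category name is stored in each element's "text" attribute
categories = [tag.attrs['text'] for tag in soup.find_all(is_itunes_category)]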
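
If you would rather not add BeautifulSoup as a dependency, the same second pass can be sketched with the standard library's `xml.etree.ElementTree`. This is an alternative technique, not what the answer above uses. The sample feed is assembled from the snippets in the question (putting the keywords inside `<channel>` is my assumption), and `split_tag` / `is_itunes` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Sample feed assembled from the snippets in the question; placing
# <itunes:keywords> directly under <channel> is an assumption.
raw_data = """<rss xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd" version="2.0">
  <channel>
    <itunes:category text="Society &amp; Culture"/>
    <itunes:category text="Technology"/>
    <itunes:keywords>Hurley, Liss, feelings</itunes:keywords>
  </channel>
</rss>"""


def split_tag(tag):
    """Split an ElementTree tag '{namespace}local' into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def is_itunes(element, local_name):
    # Compare the namespace case-insensitively, because feeds spell the
    # iTunes namespace URI with differing capitalisation.
    namespace, local = split_tag(element.tag)
    return local == local_name and "itunes.com" in namespace.lower()


root = ET.fromstring(raw_data)

# Categories live in the "text" attribute of each <itunes:category>.
categories = [el.get("text") for el in root.iter() if is_itunes(el, "category")]

# Keywords are a comma-separated string inside <itunes:keywords>.
keywords = [
    word.strip()
    for el in root.iter()
    if is_itunes(el, "keywords")
    for word in (el.text or "").split(",")
    if word.strip()
]

print(categories)
print(keywords)
```

Because the two lists come from different elements, the categories and keywords never get mixed, which is the distinction the question asks for.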

import feedparser
feedp = feedparser.parse(url)
category = feedp.feed.category

import feedparser
feedp = feedparser.parse(url)
# itunes:duration is an item-level element, so read it from an entry
duration = feedp.entries[0]["itunes_duration"]

>>> import feedparser
>>> feedp = feedparser.parse(url)
>>> categories = feedp.feed.categories
>>> print(categories)
[(u'Syndic8', u'1024'),
(u'dmoz', 'Top/Society/People/Personal_Homepages/P/')]
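
A list of tuples like this is easier to query once it is grouped. A minimal sketch, assuming the first element of each tuple names the categorisation scheme and the second is the category value; the list is copied from the output above and `by_scheme` is just an illustrative name.

```python
from collections import defaultdict

# Tuples copied from the printed output above.
categories = [
    ("Syndic8", "1024"),
    ("dmoz", "Top/Society/People/Personal_Homepages/P/"),
]

# Group the category values under the scheme they belong to
# (assumes tuple order is: scheme first, value second).
by_scheme = defaultdict(list)
for scheme, value in categories:
    by_scheme[scheme].append(value)

print(dict(by_scheme))
```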