Using Python to analyze YouTube comments -- parameter has disabled comments


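I am trying to run text analysis on YouTube comments. I have been using the code from the following site to scrape YouTube:

The script starts working, but one part of the code raises an error when comments are disabled for a video. I can't find a way to check whether comments are disabled or whether any comments exist, so that a video with nothing to scrape is skipped and the script moves on to the next one.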

The relevant block of code that raises the error is:

# =============================================================================
# Get Comments of Top Videos
# =============================================================================

video_id_pop = []
channel_pop = []
video_title_pop = []
video_desc_pop = []
comments_pop = []
comment_id_pop = []
reply_count_pop = []
like_count_pop = []

from tqdm import tqdm

for i, video in enumerate(tqdm(video_id, ncols = 100)):
    response = service.commentThreads().list(
                    part = 'snippet',
                    videoId = video,
                    maxResults = 100, # Only take top 100 comments...
                    order = 'relevance', #... ranked on relevance
                    textFormat = 'plainText',
                    ).execute()
    
    comments_temp = []
    comment_id_temp = []
    reply_count_temp = []
    like_count_temp = []
    for item in response['items']:
        comments_temp.append(item['snippet']['topLevelComment']['snippet']['textDisplay'])
        comment_id_temp.append(item['snippet']['topLevelComment']['id'])
        reply_count_temp.append(item['snippet']['totalReplyCount'])
        like_count_temp.append(item['snippet']['topLevelComment']['snippet']['likeCount'])
    comments_pop.extend(comments_temp)
    comment_id_pop.extend(comment_id_temp)
    reply_count_pop.extend(reply_count_temp)
    like_count_pop.extend(like_count_temp)
    
    video_id_pop.extend([video_id[i]]*len(comments_temp))
    channel_pop.extend([channel[i]]*len(comments_temp))
    video_title_pop.extend([video_title[i]]*len(comments_temp))
    video_desc_pop.extend([video_desc[i]]*len(comments_temp))
    
query_pop = [query] * len(video_id_pop)
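Edited to add:

The person who wrote the code left a message about fixing the error, saying:

You can wrap the query part of the code in a try…except statement. If the try block (the query part) fails, the except block can push a blank response or an "error" string to the list.

That may make sense to other people, but I have NFI how to implement it…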

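Note: this is not necessarily "good" coding style, but it is what I would do if I hit this problem in a script written for my own short-term personal use.

Python (like many other languages) has a way to catch exceptions and handle them without crashing. Used properly, it is a very good way to deal with bad data.

The linked page is a good overview of exceptions. In general, they take a form like this: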

try:
    code_that_can_error()
except ExceptionThatWillBeThrown as ex:
    handle_exception()
    print(ex) # ex is an object that has information about what went wrong
finally:
    clean_up()
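(finally is especially useful when you have something that needs to be closed, such as a file. If an exception is raised you might otherwise never close it, but the finally block is guaranteed to run even when an exception is raised.)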
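To make the try/except/finally flow above concrete, here is a minimal self-contained sketch. The function name `parse_count` and the `log` list are made up for illustration; the only point is to show which blocks run on success and on failure.

```python
def parse_count(text):
    """Try to convert text to an int, recording which blocks ran."""
    log = []
    try:
        value = int(text)            # may raise ValueError
        log.append("parsed")
    except ValueError as ex:
        value = None                 # fall back to a placeholder value
        log.append(f"error: {ex}")   # ex describes what went wrong
    finally:
        log.append("cleanup")        # runs whether or not an exception occurred
    return value, log


print(parse_count("42"))
print(parse_count("abc"))
```

In both calls the last entry in the log is "cleanup", which is the guarantee that finally provides.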

In your case, all we need to do is ignore the error and move on to the next video:

for i, video in enumerate(tqdm(video_id, ncols = 100)):
    try:
        response = service.commentThreads().list(
                        part = 'snippet',
                        videoId = video,
                        maxResults = 100, # Only take top 100 comments...
                        order = 'relevance', #... ranked on relevance
                        textFormat = 'plainText',
                        ).execute()
    
        comments_temp = []
        [...]
        video_desc_pop.extend([video_desc[i]]*len(comments_temp))
    except Exception as ex:
        # Something threw an error. Skip that video and move on
        print(f"{video} has comments disabled, or something else went wrong: {ex}")

query_pop = [query] * len(video_id_pop)
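If you would rather skip only the videos that really have comments disabled and still see any other failure, one option is to inspect the error body. The sketch below rests on some assumptions: that the YouTube Data API reports this case as an HTTP 403 whose JSON body lists the reason `commentsDisabled`, and that your client exposes the status and raw body (with google-api-python-client these would come from `googleapiclient.errors.HttpError` as `ex.resp.status` and `ex.content`). `FakeApiError` and `fake_fetch_comments` are hypothetical stand-ins so the example runs without network access or an API key.

```python
import json


def is_comments_disabled(status, content):
    """Return True if an error body lists the reason 'commentsDisabled'."""
    if status != 403:
        return False
    try:
        payload = json.loads(content)
    except (ValueError, TypeError):
        return False  # body was not valid JSON
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    return any(
        isinstance(err, dict) and err.get("reason") == "commentsDisabled"
        for err in error.get("errors", [])
    )


class FakeApiError(Exception):
    """Hypothetical stand-in for the API client's HTTP error."""

    def __init__(self, status, content):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.content = content


def fake_fetch_comments(video):
    """Hypothetical fetcher: 'vid_off' simulates disabled comments."""
    if video == "vid_off":
        body = {"error": {"code": 403, "errors": [{"reason": "commentsDisabled"}]}}
        raise FakeApiError(403, json.dumps(body).encode())
    return [f"comment on {video}"]


video_ids = ["vid_a", "vid_off", "vid_b"]
comments, owners, skipped = [], [], []

for video in video_ids:
    try:
        fetched = fake_fetch_comments(video)
    except FakeApiError as ex:
        if is_comments_disabled(ex.status, ex.content):
            skipped.append(video)  # expected case: note it and move on
            continue
        raise  # anything else is unexpected, so let it surface
    # Only extend the parallel lists after a successful fetch,
    # so they always stay the same length.
    comments.extend(fetched)
    owners.extend([video] * len(fetched))

print(f"skipped: {skipped}")
```

Keeping only the query inside the try block means an unrelated bug in the processing code is not silently reported as "comments disabled". The `skipped` list also gives you a record of which videos were left out of the analysis.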