YouTube search command for discord.py (Python)

python discord discord.py

A simple one. I'm writing a YouTube search command for a Discord bot in Python. The code is as follows:

async def youtube(ctx, *, search):
    query_string = urllib.parse.urlencode({
        'search_query': search
    })
    htm_content = urllib.request.urlopen(
        'http://www.youtube.com/results?' + query_string
    )
    search_results = re.findall('href=\"\\/watch\\?v=(.{11})', htm_content.read().decode())
    await ctx.send('http://www.youtube.com/watch?v=' + search_results[0])

My error is:

Ignoring exception in command youtube:
Traceback (most recent call last):
  File "C:\Users\Ryzen\AppData\Roaming\Python\Python37\site-packages\discord\ext\commands\core.py", line 83, in wrapped
    ret = await coro(*args, **kwargs)
  File "C:\Users\Ryzen\Desktop\ae\bot\bot 2.0\bot.py", line 738, in youtube
    await ctx.send('http://www.youtube.com/watch?v=' + search_results[0])
IndexError: list index out of range

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Ryzen\AppData\Roaming\Python\Python37\site-packages\discord\ext\commands\bot.py", line 892, in invoke
    await ctx.command.invoke(ctx)
  File "C:\Users\Ryzen\AppData\Roaming\Python\Python37\site-packages\discord\ext\commands\core.py", line 797, in invoke
    await injected(*ctx.args, **ctx.kwargs)
  File "C:\Users\Ryzen\AppData\Roaming\Python\Python37\site-packages\discord\ext\commands\core.py", line 92, in wrapped
    raise CommandInvokeError(exc) from exc
discord.ext.commands.errors.CommandInvokeError: Command raised an exception: IndexError: list index out of range

Thanks.

Nimeshka is correct: whatever you are using to populate the search_results list is not finding anything.

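There are a few things you can do to help debug. First, try capturing the contents of htm_content.read().decode() to a file to see what you are actually getting. There is a good chance you are receiving a captcha, an error page, or some other unusable response, because you are not sending a User-Agent with the request.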
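A minimal sketch of that debugging step, assuming the standard-library urllib from the question is kept: build the request with a User-Agent header and save whatever comes back so it can be inspected. The helper names build_request and dump_response and the file name debug.html are hypothetical, and the header value is only an example.

```python
import urllib.parse
import urllib.request

RESULTS_URL = "https://www.youtube.com/results"


def build_request(search):
    """Build a search request that carries a browser-like User-Agent."""
    query_string = urllib.parse.urlencode({"search_query": search})
    return urllib.request.Request(
        RESULTS_URL + "?" + query_string,
        headers={"User-Agent": "Mozilla/5.0"},
    )


def dump_response(html, path="debug.html"):
    """Write the decoded response body to a file so it can be inspected."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


# Manual usage (performs a network call):
# with urllib.request.urlopen(build_request("some search")) as resp:
#     dump_response(resp.read().decode("utf-8", errors="replace"))
```

Opening debug.html in an editor then shows whether the response is a results page, a consent page, or something else entirely.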

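Assuming you do get the DOM response you want, having a copy in a file helps you write the regular expression more accurately. Beyond that, online testing/debugging tools such as regexr or regex101 will help further. Using raw strings (r"...") also means you can copy the regex over directly without having to double every backslash.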
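To illustrate the raw-string point, here is the pattern from the question written both ways. The HTML snippet is made up for the example and is not what YouTube necessarily returns.

```python
import re

# The same pattern written twice: escaped by hand, then as a raw string.
escaped = "href=\"\\/watch\\?v=(.{11})"
raw = r'href="\/watch\?v=(.{11})'
print(escaped == raw)  # True

sample = '<a href="/watch?v=abcdefghijk">video</a>'  # made-up HTML snippet
matches = re.findall(raw, sample)
print(matches)  # ['abcdefghijk']
```

The raw form is what a tool like regex101 shows, so it can be pasted in unchanged.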

import re

import aiohttp

BASE = "https://youtube.com/results"

async def youtube(ctx, *, search):
    p = {"search_query": search}
    # Spoof a user agent header or the request will immediately fail
    h = {"User-Agent": "Mozilla/5.0"}
    async with aiohttp.ClientSession() as client:
        async with client.get(BASE, params=p, headers=h) as resp:
            dom = await resp.text()
            # open("debug.html", "w").write(dom)
    # Capture the 11-character video ID that follows href="/watch?v=
    found = re.findall(r'href="\/watch\?v=([a-zA-Z0-9_-]{11})', dom)
    return f"https://youtu.be/{found[0]}"

One last caveat: Google tends to put ads in place of search results, so keep in mind what your regex is actually returning ;)
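Since the match list can also come back empty, which is what raises the IndexError in the question, it may help to check before indexing. A small sketch, assuming a looser pattern without the href= prefix; first_video_url is a hypothetical helper name.

```python
import re

VIDEO_ID = re.compile(r"watch\?v=([a-zA-Z0-9_-]{11})")


def first_video_url(dom):
    """Return a URL for the first video ID found in dom, or None."""
    match = VIDEO_ID.search(dom)
    if match is None:
        return None
    return "https://www.youtube.com/watch?v=" + match.group(1)
```

In the command, a None result can then be turned into a "no results found" message instead of an exception.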

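Alternatively, I would suggest setting up a new developer project, since that lets you skip the web scraping part altogether and use an API client instead. Install google-api-python-client with pip (pip install google-api-python-client):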
from googleapiclient.discovery import build

def get_service():
    # Get developer key from "credentials" tab of api dashboard
    return build("youtube", "v3", developerKey="key")

def search(term, channel):
    service = get_service()
    resp = service.search().list(
        part="id",
        q=term,
        # videoDimension is only accepted when type is "video"
        type="video",
        # safeSearch="none" if channel.is_nsfw() else "moderate",
        videoDimension="2d",
    ).execute()
    return resp["items"][0]["id"]["videoId"]
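A search can return no items at all, so indexing resp["items"][0] directly can fail in the same way as the scraping version. The sketch below assumes the response is a dict with an "items" list whose entries carry an "id" dict; first_video_id is a hypothetical helper and the sample responses in the usage lines are made up.

```python
def first_video_id(resp):
    """Return the first videoId in a search response dict, or None."""
    for item in resp.get("items", []):
        video_id = item.get("id", {}).get("videoId")
        if video_id:
            return video_id
    return None


# Made-up responses for illustration only.
empty = {"items": []}
mixed = {
    "items": [
        {"id": {"kind": "youtube#channel", "channelId": "UC_example"}},
        {"id": {"kind": "youtube#video", "videoId": "abc-DEF_123"}},
    ]
}
print(first_video_id(empty))  # None
print(first_video_id(mixed))  # abc-DEF_123
```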

I changed the regex line to:
re.findall(r"watch\?v=(\S{11})", html_content.read().decode())

After that, it worked for me.

Let's look at this part:
search_results = re.findall('href=\"\\/watch\\?v=(.{11})', htm_content.read().decode())
await ctx.send('http://www.youtube.com/watch?v=' + search_results[0])

I solved the problem by creating a variable, search_content, so I could look at the whole HTML page:
search_content = html_content.read().decode()

Then I tried to find this pattern in the HTML content:
search_results = re.findall(r'\/watch\?v=\w+', search_content)

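Now your bot can send the first result to the Discord server.
This pattern finds the \/watch\?v= part and then captures the characters that follow it with \w+. Those characters are followed by a non-word character (a quotation mark), so re.findall stops capturing there.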
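One thing to watch with \w+ is that it matches only letters, digits, and underscores, while video IDs can also contain hyphens. The snippet below uses a made-up ID to show the difference; the alternative character class is a suggestion, not part of the answer above.

```python
import re

sample = 'href="/watch?v=abc-DEF_123" title="x"'  # made-up snippet

# \w+ stops at the first character that is not a letter, digit or underscore.
short = re.findall(r"\/watch\?v=\w+", sample)
print(short)  # ['/watch?v=abc']

# A class that also allows '-' captures the whole 11-character ID.
full = re.findall(r"\/watch\?v=[\w-]{11}", sample)
print(full)  # ['/watch?v=abc-DEF_123']
```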

Here is the full code:
import re
from urllib import parse, request

# Assumes bot is an existing discord.ext.commands.Bot instance
@bot.command()
async def youtube(ctx, *, search):
    query_string = parse.urlencode({'search_query': search})
    html_content = request.urlopen('http://www.youtube.com/results?' + query_string)
    search_content = html_content.read().decode()
    search_results = re.findall(r'\/watch\?v=\w+', search_content)
    #print(search_results)
    await ctx.send('https://www.youtube.com' + search_results[0])

I found a very useful page where you can debug regular expressions and look for patterns in text; it helped me solve this problem.
I hope this post helps you.
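…m/watch?v=' + search_results[0]
Maybe the list is empty and has no index 0.
Nimeshka is correct. Besides that, you are using asyncio, so you should make the request with an asynchronous HTTP client rather than a blocking call.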
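If switching HTTP libraries is not an option, one alternative is to push the blocking call onto a worker thread so the bot's event loop stays responsive. This is a sketch of that idea using only asyncio; run_blocking is a hypothetical helper name.

```python
import asyncio
import functools


async def run_blocking(func, *args, **kwargs):
    """Run a blocking callable in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    call = functools.partial(func, *args, **kwargs)
    return await loop.run_in_executor(None, call)
```

Inside a command this could wrap the blocking fetch, for example html_content = await run_blocking(request.urlopen, url), where url is the search URL built earlier.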