Python 3.x InvalidSchema:未找到任何连接适配器python3.5.2_Python 3.x_Httprequest

Python 3.x InvalidSchema:未找到任何连接适配器python3.5.2

python-3.x

Python 3.x InvalidSchema:未找到任何连接适配器python3.5.2,python-3.x,httprequest,Python 3.x,Httprequest,我正在尝试从网页中提取电子邮件，以下是我的电子邮件抓取功能： def emlgrb(x): email_set = set() for url in x: try: response = requests.get(url) soup = bs.BeautifulSoup(response.text, "lxml") emails = set(re.findall(r"[a-z0-9\.\-+_

我正在尝试从网页中提取电子邮件，以下是我的电子邮件抓取功能：

def emlgrb(x):
    email_set = set()
    for url in x:
        try:
            response = requests.get(url)
            soup = bs.BeautifulSoup(response.text, "lxml")
            emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", soup.text, re.I))
            email_set.update(emails)
        except (requests.exceptions.MissingSchema, requests.exceptions.ConnectionError):
        continue
    return email_set

def handle_local_links(url, link):
    if link.startswith("/"):
         return "".join([url, link])
    return link

def get_links(url):
    try:
        response = requests.get(url, timeout=5)
        soup = bs.BeautifulSoup(response.text, "lxml")
        body = soup.body
        links = [link.get("href") for link in body.find_all("a")]
        links = [handle_local_links(url, link) for link in links]
        links = [str(link.encode("ascii")) for link in links]
        return links

此函数应由另一个函数提供，该函数创建url列表。馈线功能：

def emlgrb(x):
    email_set = set()
    for url in x:
        try:
            response = requests.get(url)
            soup = bs.BeautifulSoup(response.text, "lxml")
            emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", soup.text, re.I))
            email_set.update(emails)
        except (requests.exceptions.MissingSchema, requests.exceptions.ConnectionError):
        continue
    return email_set

def handle_local_links(url, link):
    if link.startswith("/"):
         return "".join([url, link])
    return link

def get_links(url):
    try:
        response = requests.get(url, timeout=5)
        soup = bs.BeautifulSoup(response.text, "lxml")
        body = soup.body
        links = [link.get("href") for link in body.find_all("a")]
        links = [handle_local_links(url, link) for link in links]
        links = [str(link.encode("ascii")) for link in links]
        return links

它会继续执行许多异常，如果引发这些异常，将返回空列表（不重要）。但是，get_links（）的返回值如下所示：

["b'https://pythonprogramming.net/parsememcparseface//'"]

['https://pythonprogramming.net/parsememcparseface//']

当然，列表中有很多链接（不能发布-声誉）。emlgrb（）函数无法处理列表（InvalidSchema：未找到任何连接适配器），但是如果手动删除b和冗余引号，则列表如下所示：

["b'https://pythonprogramming.net/parsememcparseface//'"]

['https://pythonprogramming.net/parsememcparseface//']

emlgrb（）可以工作。欢迎任何关于问题所在或创建“清洁功能”以从第一个列表中获取第二个列表的建议

谢谢

解决方案是删除

.encode（'ascii'）

您可以在

str（）

中添加编码，例如：

str（object=b''，encoding='utf-8'，errors='strict'）

这是因为str（）在

对象上调用。\uuuurepr\uuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu。实际上，这就是当您执行打印（bytes\u obj）
时打印的内容。在str对象上调用.ecnode（）
，将创建bytes对象
 如果删除.encode（'ascii'），输出会是什么样子？实际上，效果很好-谢谢。我认为在str（）中也可以指定编码？如果你需要；）我在答案中添加了一些解释，效果好吗？：）很抱歉反应太晚。工作完全符合预期。谢谢