Trouble finding hidden emails on a web page with Python

python web-scraping web-crawler

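Although the web page does not display any email address, running my scraper does fetch it in the console, but it comes mixed in with a lot of other text. Is there any way to keep only the email and phone number in the file? Here is my script: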

import requests
from lxml import html

def Mainpage():
    # Listing page with links to the individual professional profiles
    url = "https://www.houzz.de/professionals/c/Deutschland"
    response = requests.get(url)
    tree = html.fromstring(response.text)
    titles = tree.xpath('//div[@class="name-info"]')
    for title in titles:
        # The first link inside each name block points to the profile page
        Name = title.xpath('.//a/@href')[0]
        FindingEmail(Name)

def FindingEmail(pagelink):
    response = requests.get(pagelink)
    tree = html.fromstring(response.text)
    # Text nodes of the contact box; every fragment is printed, not only the email
    titles = tree.xpath('//div[@class="professional-info-content"]/text()')
    for title in titles:
        print(title.strip())

Mainpage()

Here is the part being scraped:

I finally found a solution:

import requests
from lxml import html

def Mainpage():
    url = "https://www.houzz.de/professionals/c/Deutschland"
    response = requests.get(url)
    tree = html.fromstring(response.text)
    titles = tree.xpath('//div[@class="name-info"]')
    for title in titles:
        Name=title.xpath('.//a/@href')[0]
        FindingEmail(Name)

def FindingEmail(pagelink):
    response = requests.get(pagelink)
    tree = html.fromstring(response.text)
    titles = tree.xpath('//div[@class="professional-info-content"]/text()')
    for title in titles:
        # Keep only the fragments that carry an email or fax label
        if "E-Mail:" in title or "Fax:" in title:
            print(title)

Mainpage()
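
The question also asks about keeping only the contact details in a file. The label filter above could be paired with a small writer, roughly as sketched below. This is only a sketch under assumptions: the helper names `extract_contacts`, `find_emails` and `save_contacts`, the `Telefon:` label, and the CSV layout are all hypothetical and not taken from the page; the text fragments are assumed to begin with their label once stripped.

```python
import csv
import re

# Assumed label prefixes; the real profile pages may word these differently.
LABELS = ("E-Mail:", "Telefon:", "Fax:")

# Deliberately simple email pattern, good enough for a rough filter.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_contacts(fragments):
    """Return (label, value) pairs for fragments that start with a known label."""
    contacts = []
    for fragment in fragments:
        text = fragment.strip()
        for label in LABELS:
            if text.startswith(label):
                value = text[len(label):].strip()
                contacts.append((label.rstrip(":"), value))
                break
    return contacts


def find_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_RE.findall(text)


def save_contacts(contacts, path):
    """Write the (label, value) pairs to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["field", "value"])
        writer.writerows(contacts)
```

Inside `FindingEmail`, the list returned by the `text()` XPath query could be passed to `extract_contacts` and the result handed to `save_contacts` instead of being printed.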

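They are probably hiding it on purpose from people like you ^^^ What do you need it for?

Without the page URL it is hard to help you.

There is no email address on the web page. There is a link to a web form.

Hi, the page URL is in the post.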