Python email crawler (web)

This application is supposed to run against a website, find all email addresses on it, and return them.

def testEmails(url):
    'Test the emails() function'
    email = ''
    content = urlopen(url).read().decode()
    pattern='[A-Za-z0-9_.]+\@[A-Za-z0-9_.]+\.'
    for attr in content:
        if attr[0] == 'href':
           print(attr)
           email+='{} '.format(attr)
    emails = re.findall(pattern,email)
    return emails
I keep getting a blank string. Does anyone know why?

Edit:

Here is my current code:

def emails(content):
    'return list of email addresses contained in string content'
    email = []
    content = urlopen(url).read().decode()
    pattern='[A-Za-z0-9_.]+\@[A-Za-z0-9_.]+\....'
    email.append(re.findall(pattern,content))
    print(email)
But for some reason I get:

[['somePERSON@university.ca"']]
instead of:

['somePERSON@university.ca']
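
Two things in the edited code appear to explain this output. First, re.findall already returns a list, so email.append(...) nests that list inside another one. Second, the pattern ends in \.... (a literal dot followed by any three characters), so for an address ending in .ca the third character is the closing quote of the surrounding HTML attribute. The following is a minimal sketch of one possible fix, not the only one; EMAIL_PATTERN and the sample string are assumptions made for illustration, and the pattern is still far looser than a real address validator.

```python
import re

# Assumed pattern: after the final dot, accept letters only, so a trailing
# quote or other punctuation is no longer swallowed into the match.
EMAIL_PATTERN = r'[A-Za-z0-9_.]+@[A-Za-z0-9_.]+\.[A-Za-z]+'

def emails(content):
    'return list of email addresses contained in string content'
    # re.findall already returns a list of matches, so return it directly
    # instead of appending it to another list (which caused the nesting).
    return re.findall(EMAIL_PATTERN, content)

# Hypothetical sample input standing in for a downloaded page.
sample = '<a href="mailto:somePERSON@university.ca">contact</a>'
print(emails(sample))
```

Here the function takes the page text as its argument, so the download step (urlopen(url).read().decode()) would happen in the caller.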

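urlopen().read().decode() returns a Unicode string, so looping over it iterates over individual characters, not the HTML attributes you are looking for. That is why attr[0] == 'href' is never true and the result stays empty. You should either use HTMLParser to extract the attributes, or run re.findall on the whole document (cruder, but it will also pick up email addresses that appear as plain text).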
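
The HTMLParser route mentioned above could look roughly like the sketch below. It uses html.parser.HTMLParser from the standard library; the names HrefCollector, emails_from_hrefs, and EMAIL_PATTERN are hypothetical and chosen only for this example, and the pattern is an assumed, deliberately simple one.

```python
import re
from html.parser import HTMLParser

# Assumed, simplified email pattern for illustration only.
EMAIL_PATTERN = r'[A-Za-z0-9_.]+@[A-Za-z0-9_.]+\.[A-Za-z]+'

class HrefCollector(HTMLParser):
    'Collect the value of every href attribute seen while parsing.'

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        for name, value in attrs:
            if name == 'href' and value:
                self.hrefs.append(value)

def emails_from_hrefs(html):
    'Return email addresses found inside href attributes of html.'
    collector = HrefCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        # extend keeps the result a flat list of strings.
        found.extend(re.findall(EMAIL_PATTERN, href))
    return found
```

With this approach, an address that only appears in the visible text of the page is skipped, whereas running re.findall on the whole document would catch it. Which behavior is preferable depends on what the crawler is meant to collect.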

Yeah, I was trying to avoid extracting from the whole document, but if there is no other way...