Python: extracting Email:Name:Phone from adjacent lines of a record-like file


I have a text file containing a list like the one below. Each item is on its own line in the file:

Email: jonsmith@emailaddie.com 
Name: Jon Smith
Phone Number: 555-1212
Email: jonsmith@emailaddie.com
Name: Jon Smith
Phone Number: 555-1212
Email: jonsmith@emailaddie.com
Name: Jon Smith
Phone Number: 555-1212
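I am trying to combine each group [email, name, phone] and export the result to another text file, with each group on its own line.

Here is what I have tried so far (if I can get it to print correctly to the terminal, I know how to write it to another file):

I am running Ubuntu Linux.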

import re

stuff = list()

#get line
with open("a2.txt", "r") as ins:
    array = []
    for line in ins:
        if re.match("Email Address: ", line):
            array.append(line)
            if re.match("Phone Number: ", line):
                array.append(line)
                if re.match("Name: ", line):
                    array.append(line)
                    print(line)

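As noted in the comments, the nested if statements all test the same line. No line in the sample matches all three regular expressions, so the code never extracts anything. In any case, regular expressions are not needed here; a simple line.startswith() is enough when you are looking for a single static string or a small set of static strings.

Instead, you want something like this: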

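array = []
with open("a2.txt", "r") as ins:
    for line in ins:
        # The sample data uses "Email:", the question's code "Email Address:";
        # startswith() accepts a tuple, so both prefixes are covered.
        if line.startswith(('Email:', 'Email Address:')):
            # Keep everything after the first colon, without surrounding whitespace.
            array.append(line.split(':', 1)[1].strip())
        elif line.startswith('Name:'):
            array.append(line.split(':', 1)[1].strip())
        elif line.startswith('Phone Number:'):
            array.append(line.split(':', 1)[1].strip())
            # The phone number closes a group: print it and start a new one.
            print(array)
            array = []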
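
A minimal sketch of the index-and-fields variant that the next paragraph describes. The names FIELDS and parse_records are illustrative assumptions, the prefixes follow the sample data in the question, and skipping any line that is not the expected field is just one possible policy:

```python
import io

# Hypothetical names for illustration; the prefixes follow the sample data.
FIELDS = ("Email:", "Name:", "Phone Number:")


def parse_records(lines):
    """Yield one [email, name, phone] list per complete group of lines."""
    index = 0    # position in FIELDS of the prefix expected next
    record = []  # values collected so far for the current group
    for line in lines:
        prefix = FIELDS[index]
        if not line.startswith(prefix):
            continue  # assumption: lines that are not the expected field are skipped
        record.append(line[len(prefix):].strip())
        index += 1
        if index == len(FIELDS):
            # All fields collected: emit the group and wrap index back to zero.
            yield record
            record = []
            index = 0


sample = io.StringIO(
    "Email: jonsmith@emailaddie.com\n"
    "Name: Jon Smith\n"
    "Phone Number: 555-1212\n"
    "Email: jonsmith@emailaddie.com\n"
    "Name: Jon Smith\n"
    "Phone Number: 555-1212\n"
)
records = list(parse_records(sample))
for record in records:
    print(record)
```

Because parse_records accepts any iterable of lines, an open file object can be passed in place of the io.StringIO sample.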

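A variant of this idea is a little hard to read at first, but you should get the hang of it quickly: we keep an index that tells us which entry of fields to expect next, print the collected information when fields runs out, and wrap index back to zero. This also gives us a convenient way to refer to the field we are currently expecting.

If you are sure the fields (email, name and phone number) always appear in the same order, the code below will work as it is. If they might not, handle that case in the else branch, where you can either save the incomplete values or raise an exception.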

with open("path to the file") as fh:
    # Tracks which field is expected next: 0 = Email, 1 = Name, 2 = Phone Number
    counter = 0
    # One combined string per complete group
    all_info = []
    # Fields collected so far for the current group
    current_info = []
    for line in fh:
        line = line.strip()
        if line.startswith("Email:") and counter == 0:
            counter = 1
            current_info.append(line)
        elif line.startswith("Name:") and counter == 1:
            counter = 2
            current_info.append(line)
        elif line.startswith("Phone Number:") and counter == 2:
            counter = 0
            current_info.append(line)
            # Join the three fields into a single comma-separated line.
            all_info.append(",".join(current_info))
            current_info = []
        else:
            # You can handle incomplete information here.
            counter = 0
            current_info = []
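
To finish the task from the question (one group per line in another text file), the same idea can be wrapped in a function that also writes the result out. This is only a sketch: combine_groups, the file names and the ", " separator are assumptions made for illustration, and a partial group is simply discarded when an unexpected line shows up.

```python
import os
import tempfile


def combine_groups(src_path, dst_path):
    """Write each Email/Name/Phone Number group in src_path as one line of dst_path."""
    prefixes = ("Email:", "Name:", "Phone Number:")
    groups = []
    current = []  # fields collected so far; its length says which prefix comes next
    with open(src_path) as src:
        for line in src:
            line = line.strip()
            if line.startswith(prefixes[len(current)]):
                current.append(line)
                if len(current) == len(prefixes):
                    groups.append(", ".join(current))
                    current = []
            else:
                # Assumption: an out-of-order or unrelated line discards the partial group.
                current = []
    with open(dst_path, "w") as dst:
        for group in groups:
            dst.write(group + "\n")
    return groups


# Usage with a throwaway directory; a2.txt is the file name from the question,
# combined.txt is a made-up output name.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "a2.txt")
dst_path = os.path.join(workdir, "combined.txt")
with open(src_path, "w") as f:
    f.write("Email: jonsmith@emailaddie.com\nName: Jon Smith\nPhone Number: 555-1212\n" * 2)

result = combine_groups(src_path, dst_path)
with open(dst_path) as f:
    print(f.read(), end="")
```

Using the length of current as the state avoids a separate counter variable; the behavior for unexpected lines matches the else branch above.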

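Is the indentation of the code you pasted correct?

@DroidX86 I think so, yes.

So you only want to print the line when all 3 conditions match, right?

@DroidX86 Correct. This challenge project started with me going through 3000 PDF files, converting them to text files, and then scraping names, emails and phone numbers from them using Python. But the output file ended up with each person's information on separate lines. I am trying to correct that with another py script.

Thank you so much. I was going about it completely wrong. For me, the hardest part of programming has to be the approach, the thought process that comes before you even write a line of code.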