Python: extracting Email: / Name: / Phone Number: from adjacent lines of a record-like file


I have a text file containing a list like the one below. Each item is on its own line in the file:

Email: jonsmith@emailaddie.com 
Name: Jon Smith
Phone Number: 555-1212
Email: jonsmith@emailaddie.com
Name: Jon Smith
Phone Number: 555-1212
Email: jonsmith@emailaddie.com
Name: Jon Smith
Phone Number: 555-1212

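I am trying to combine each group [email, name, phone] and export it to another text file, with each group on its own line.

Here is what I have tried so far (once I can get it to print correctly to the terminal, I know how to write it to another file):

I am running Ubuntu Linux.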

import re

stuff = list()

#get line
with open("a2.txt", "r") as ins:
    array = []
    for line in ins:
        if re.match("Email Address: ", line):
            array.append(line)
            if re.match("Phone Number: ", line):
                array.append(line)
                if re.match("Name: ", line):
                    array.append(line)
                    print(line)

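As noted in the comments, the nested if statements all examine the same line. No line in the sample matches all three regular expressions, so the code never extracts anything. In any case, regular expressions are not needed here; a simple line.startswith() is enough when you are looking for a single static string or a small set of static strings.

Instead, you want: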

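with open("a2.txt") as ins:
    array = []
    for line in ins:
        # The sample data uses "Email:", not "Email Address:"
        if line.startswith('Email:'):
            # Keep only the text after the first colon
            array.append(line.split(':', 1)[1].strip())
        elif line.startswith('Name:'):
            array.append(line.split(':', 1)[1].strip())
        elif line.startswith('Phone Number:'):
            array.append(line.split(':', 1)[1].strip())
            # The phone number closes a group: print it and start a new one
            print(array)
            array = []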
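One way to generalize this is to keep the expected prefixes in a list and step an index through it, wrapping back to zero once the list is exhausted. The following is only a minimal sketch under stated assumptions: the names FIELDS and group_records are hypothetical, every record is assumed to list its fields in the fixed order Email, Name, Phone Number, and lines that do not start with the expected prefix are silently skipped.

```python
import io

# Hypothetical names for illustration; not from the original post.
FIELDS = ["Email:", "Name:", "Phone Number:"]


def group_records(lines):
    """Return a list of [email, name, phone] lists from an iterable of lines."""
    groups = []
    current = []
    index = 0  # position in FIELDS of the prefix expected next
    for line in lines:
        line = line.strip()
        if not line.startswith(FIELDS[index]):
            continue  # assumption: ignore anything that is not the expected field
        # Store the text that follows the prefix
        current.append(line[len(FIELDS[index]):].strip())
        index = (index + 1) % len(FIELDS)
        if index == 0:  # ran out of fields, so the group is complete
            groups.append(current)
            current = []
    return groups


# Stand-in for the real file, using the sample data from the question
sample = io.StringIO(
    "Email: jonsmith@emailaddie.com \n"
    "Name: Jon Smith\n"
    "Phone Number: 555-1212\n"
    "Email: jonsmith@emailaddie.com\n"
    "Name: Jon Smith\n"
    "Phone Number: 555-1212\n"
)

for group in group_records(sample):
    print(", ".join(group))  # one group per line
```

To read from a real file, pass the open file object instead of the StringIO stand-in, and write each joined group to the output file instead of printing it.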

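It is a little hard to read at first, but you should get the hang of it quickly. We have an index that tells us which of the fields to expect next; when we run out of fields, we print the collected information and wrap the index back to zero. This also gives us a convenient way to refer to the fields.

If you are sure the fields (email, name, and phone number) always appear in the same order, the code below will work as is. If they do not, handle that case in the else branch, where you can either save the incomplete values or raise an exception.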

with open("path to the file") as fh:
    # Tracks which field of the current group is expected next
    counter = 0
    # List of all information
    all_info = []
    # Information collected so far for the current group
    current_info = ""
    for line in fh:
        # Drop the trailing newline and any stray spaces
        line = line.strip()
        if line.startswith("Email:") and counter == 0:
            counter = 1
            current_info = "{}{},".format(current_info, line)
        elif line.startswith("Name:") and counter == 1:
            counter = 2
            current_info = "{}{},".format(current_info, line)
        elif line.startswith("Phone Number:") and counter == 2:
            counter = 0
            all_info.append("{}{}".format(current_info, line))
            current_info = ""
        else:
            # You can handle incomplete information here.
            counter = 0
            current_info = ""
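The else branch above simply discards an out-of-order group. As a hedged sketch of the "save the incomplete values" option, the function below sets aside partial groups instead of dropping them. The names PREFIXES and collect_groups are hypothetical, and the policy (an out-of-order Email: line starts a fresh group, any other stray line is recorded on its own) is just one assumption about how incomplete records might be treated.

```python
# Hypothetical names for illustration; not from the original post.
PREFIXES = ("Email:", "Name:", "Phone Number:")  # expected order within a group


def collect_groups(lines):
    """Return (complete, incomplete) lists of comma-joined group strings."""
    complete, incomplete = [], []
    current = []  # lines collected for the group in progress
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # len(current) doubles as the index of the field expected next
        if line.startswith(PREFIXES[len(current)]):
            current.append(line)
            if len(current) == len(PREFIXES):
                complete.append(",".join(current))
                current = []
            continue
        # Out-of-order line: set aside whatever was collected so far
        if current:
            incomplete.append(",".join(current))
        if line.startswith(PREFIXES[0]):
            current = [line]  # an Email: line starts a fresh group
        else:
            incomplete.append(line)  # any other stray line is kept on its own
            current = []
    if current:
        incomplete.append(",".join(current))  # group cut off at end of input
    return complete, incomplete


lines = [
    "Email: a@example.com",
    "Name: A",
    "Email: b@example.com",
    "Name: B",
    "Phone Number: 555-1212",
]
complete, incomplete = collect_groups(lines)
print(complete)
print(incomplete)
```

Raising an exception instead would mean replacing the two incomplete.append calls inside the loop with a raise ValueError(...), which suits input that is supposed to be strictly well formed.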

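Is the code you pasted indented correctly?

@DroidX86 I think so, yes.

So you only want to print the line if all 3 conditions match? You know that, right?

@DroidX86 Correct. This challenge project started with me going through 3000 PDF files, converting them to text files, and then pulling the names, emails, and phone numbers out of them using Python. But the output file ended up with each person's information on separate lines. I am trying to correct that with another Python script.

Thank you very much. I was going about this completely wrong. For me, the hardest part of programming has to be the approach, the thought process that comes before even writing a line of code.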
