Python: read a .txt file and split words at the @ symbol

python string

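I have an 11 GB .txt file of email addresses. I only want to save the part of each line up to the @ symbol. My output only contains the first line. I reused code from an earlier project. I want to save the output in another .txt file. I hope someone can help me.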

My code:

import re 

def get_html_string(file,start_string,end_string):
    answer="nothing"
    with open(file, 'rb') as open_file: 
        for line in open_file:
            line = line.rstrip()
            if re.search(start_string, line) :
                answer=line
                break
    start=answer.find(start_string)+len(start_string)
    end=answer.find(end_string)
    #print(start,end,answer)
    return answer[start:end]


beginstr=''
end='@'
file='test.txt'
readstring=str(get_html_string(file,beginstr,end))


print readstring

If your file looks like this example:

user@google.com
user2@jshds.com
Useruser@jsnl.com

you can use the following:

def get_email_name(file_name):
    with open(file_name) as file:
        lines = file.readlines()
    result = list()
    for line in lines:
        result.append(line.split('@')[0])
    return result

get_email_name('emails.txt')

Output:

['user', 'user2', 'Useruser']

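Your file is quite large (11 GB), so you should not keep all of those strings in memory. Instead, process the file line by line and write each result before reading the next line.

This should work: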

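# Stream the file line by line so the 11 GB input never sits in memory.
with open('test.txt', 'r') as input_file:
    with open('result.txt', 'w') as output_file:
        for line in input_file:
            # Keep only the text before the first '@'.
            prefix = line.split('@')[0]
            output_file.write(prefix + '\n')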
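
A slightly more defensive variant of the loop above might look like the sketch below. The function name extract_local_parts, the choice to skip lines that contain no '@', and the UTF-8 encoding with errors="replace" are all assumptions made for illustration; none of them come from the question.

```python
def extract_local_parts(input_path, output_path):
    """Write the text before the first '@' of each line to output_path.

    Returns the number of lines written. Lines without '@' are skipped,
    which is an assumption: the loop above would copy them through instead.
    """
    written = 0
    # errors="replace" is an assumption, so that a stray undecodable byte
    # somewhere in a very large file does not abort the whole run.
    with open(input_path, "r", encoding="utf-8", errors="replace") as src, \
         open(output_path, "w", encoding="utf-8") as dst:
        for line in src:  # the file object yields one line at a time
            # partition splits at the first '@'; sep is '' if none was found
            local, sep, _domain = line.strip().partition("@")
            if sep and local:
                dst.write(local + "\n")
                written += 1
    return written
```

Calling extract_local_parts('test.txt', 'result.txt') would then stream the input once and return how many prefixes were written, which can serve as a quick sanity check on a file of this size.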

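Comments:

- break exits the for loop, so your loop stops at the first matching line. Why does this code look so complicated for such a simple task? Does your input file have one address per line?

- This is exactly what I needed! Thank you very much!! :)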