Python: printing a section of text from a file


I'm still learning Python, and I have an example file:

 RDKit          3D

  0  0  0  0  0  0  0  0  0  0999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 552 600 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C 7.3071 41.3785 19.7482 0
M  V30 2 C 7.5456 41.3920 21.2703 0
M  V30 3 C 8.3653 40.1559 21.6876 0
M  V30 4 C 9.7001 40.0714 20.9228 0
M  V30 5 C 9.4398 40.0712 19.4042 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 0 1 1 2
M  V30 1 1 1 6
M  V30 2 1 1 10
M  V30 3 1 1 11
M  V30 4 1 2 3
M  V30 END BOND
M  V30 END CTAB
M  END

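I only want to print the information between:

M  V30 BEGIN ATOM

and:

M  V30 END ATOM

Since the number of atoms differs from file to file, I'd like a generic approach. Can anyone help?

Thank you very much.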

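Try this:

with open('filename.txt', 'r') as f:
    ok_to_print = False
    for line in f:  # iterate lazily instead of loading every line with readlines()
        line = line.strip()  # remove surrounding whitespace, including the newline
        if line == 'M  V30 BEGIN ATOM':
            ok_to_print = True   # start marker: print the lines that follow
        elif line == 'M  V30 END ATOM':
            ok_to_print = False  # end marker: stop printing
        elif ok_to_print:
            print(line)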

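This processes the file line by line as it is read, which is ideal for large files that cannot fit in memory. For small files, you can read the entire contents into memory and use a regular expression:

import re
with open('filename.txt', 'r') as f:
    data = f.read()
# re.DOTALL lets "." match newlines, so the lazy group can span several lines
a = re.compile(r'M  V30 BEGIN ATOM(.+?)M  V30 END ATOM', re.DOTALL)
for result in a.findall(data):
    print(result.strip())

Note: none of this code has been tested; I just wrote it off the top of my head.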
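As a variation on the regular-expression idea above, here is a minimal sketch that anchors both markers to whole lines, so a marker appearing in the middle of another line is not matched. The names `SAMPLE`, `ATOM_BLOCK`, and `atom_blocks` are made up for illustration, and the sample text is a shortened stand-in for the file in the question.

```python
import re

# Shortened stand-in for the file contents; a real file would be
# read with open(...).read() instead.
SAMPLE = """\
M  V30 BEGIN CTAB
M  V30 COUNTS 2 1 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C 7.3071 41.3785 19.7482 0
M  V30 2 C 7.5456 41.3920 21.2703 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 0 1 1 2
M  V30 END BOND
M  V30 END CTAB
M  END
"""

# With re.MULTILINE, ^ and $ match at line boundaries; with re.DOTALL,
# "." also matches newlines, so the lazy group can cover several lines.
ATOM_BLOCK = re.compile(
    r"^M  V30 BEGIN ATOM[ \t]*\n(.*?)^M  V30 END ATOM[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def atom_blocks(text):
    """Return the text between each BEGIN ATOM / END ATOM pair."""
    return [m.group(1).rstrip("\n") for m in ATOM_BLOCK.finditer(text)]


for block in atom_blocks(SAMPLE):
    print(block)
```

Because `finditer` is used, a file containing several atom blocks would yield one string per block.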

You could try the following:

# Read file contents
with open("file.txt") as file:
    inside = False
    for line in file:
        # Start section of interest
        if line.rstrip() == "M  V30 BEGIN ATOM":
            inside = True
        # End section of interest
        elif line.rstrip() == "M  V30 END ATOM":
            inside = False
        # Inside section of interest
        elif inside:
            print(line.rstrip())
        else:
            pass
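If the lines are needed for further processing rather than just printed, the same flag-based loop can be wrapped in a generator. This is only a sketch: `iter_section` and `DEMO_TEXT` are hypothetical names, and `io.StringIO` stands in for an open file.

```python
import io


def iter_section(lines, start="M  V30 BEGIN ATOM", end="M  V30 END ATOM"):
    """Yield the lines strictly between the start and end marker lines."""
    inside = False
    for raw in lines:
        line = raw.rstrip()
        if line == start:
            inside = True
        elif line == end:
            inside = False
        elif inside:
            yield line


# Shortened stand-in for the file in the question.
DEMO_TEXT = (
    "M  V30 BEGIN ATOM\n"
    "M  V30 1 C 7.3071 41.3785 19.7482 0\n"
    "M  V30 2 C 7.5456 41.3920 21.2703 0\n"
    "M  V30 END ATOM\n"
    "M  V30 BEGIN BOND\n"
    "M  V30 0 1 1 2\n"
    "M  V30 END BOND\n"
)

# Any iterable of lines works, including a real file object.
atoms = list(iter_section(io.StringIO(DEMO_TEXT)))
for atom in atoms:
    print(atom)
```

Passing different `start` and `end` values (for example the `BEGIN BOND` / `END BOND` markers) reuses the same function for other sections.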

This is how I would do it (with csv)

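In the interest of keeping the logic short, sweet, and separated, and given that you want a portable approach:

def print_atoms_from_file(full_file_path):
    with open(full_file_path, 'r') as f:
        start_printing = False
        for line in f:

            # Substring match, so this does not depend on the exact "M  V30" prefix
            if 'BEGIN ATOM' in line:
                start_printing = True
                continue

            if 'END ATOM' in line:
                start_printing = False
                continue

            if start_printing:
                # print() is a function in Python 3; rstrip() avoids a doubled newline
                print(line.rstrip())

print_atoms_from_file('test_file_name.txt')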

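You could try the following function:

def extract_lines(filename, start_line, stop_line):
    with open(filename, 'r') as f:
        # Strip only the newline so the markers can be compared exactly
        list_of_lines = [line.rstrip('\n') for line in f]

    # index() raises ValueError if a marker is missing
    start_point = list_of_lines.index(start_line)
    # Look for the stop marker only after the start marker
    stop_point = list_of_lines.index(stop_line, start_point + 1)

    return "\n".join(list_of_lines[start_point + 1:stop_point])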
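The same "skip up to the start marker, then take until the stop marker" idea can also be written lazily with `itertools`, without building the full list of lines first. This is a sketch under stated assumptions: `lines_between` and `sample` are hypothetical names, and unlike `list.index` this version returns an empty list when the start marker is missing and reads to the end of the input when the stop marker is missing.

```python
from itertools import dropwhile, takewhile


def lines_between(lines, start_line, stop_line):
    """Return the lines after start_line and before stop_line."""
    stripped = (line.rstrip("\n") for line in lines)
    # Skip everything before the start marker.
    after_start = dropwhile(lambda s: s != start_line, stripped)
    # Discard the start marker itself (no-op if it was never found).
    next(after_start, None)
    # Collect lines until the stop marker is reached.
    return list(takewhile(lambda s: s != stop_line, after_start))


# Shortened stand-in for the lines of the file in the question.
sample = [
    "M  V30 BEGIN CTAB\n",
    "M  V30 BEGIN ATOM\n",
    "M  V30 1 C 7.3071 41.3785 19.7482 0\n",
    "M  V30 2 C 7.5456 41.3920 21.2703 0\n",
    "M  V30 END ATOM\n",
    "M  END\n",
]

print("\n".join(lines_between(sample, "M  V30 BEGIN ATOM", "M  V30 END ATOM")))
```

An open file object can be passed in place of `sample`, since it is also an iterable of lines.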

What have you tried so far? Using a module could probably make it generic.

Start reading the file -> when you find the start string, begin capturing data in a list/new string -> stop when you find the string you want to stop at.

This depends on the second field always being V30, but the model is generally good: look for a pattern. Maybe start capturing lines when one contains "BEGIN ATOM" and stop when one contains "END ATOM"?

OK, yes, it could be changed to line.endswith('BEGIN BOND').

I was going to upvote too, but then I saw that you use a regular expression. So, no upvote.

@Wychh Great! Could you mark the question as answered? :)