Python: printing a section of text from a file


I'm still learning Python, and I have a sample file:

 RDKit          3D

  0  0  0  0  0  0  0  0  0  0999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 552 600 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C 7.3071 41.3785 19.7482 0
M  V30 2 C 7.5456 41.3920 21.2703 0
M  V30 3 C 8.3653 40.1559 21.6876 0
M  V30 4 C 9.7001 40.0714 20.9228 0
M  V30 5 C 9.4398 40.0712 19.4042 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 0 1 1 2
M  V30 1 1 1 6
M  V30 2 1 1 10
M  V30 3 1 1 11
M  V30 4 1 2 3
M  V30 END BOND
M  V30 END CTAB
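M  END

I only want to print the information between this line:

M  V30 BEGIN ATOM

and this one:

M  V30 END ATOM

Since the number of atoms differs from file to file, I'd like a general method. Can anyone help?

Many thanks.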

Try this:

with open('filename.txt', 'r') as f:
    ok_to_print = False
    for line in f:
        line = line.strip()  # remove surrounding whitespace
        if line == 'M  V30 BEGIN ATOM':
            ok_to_print = True
        elif line == 'M  V30 END ATOM':
            ok_to_print = False
        elif ok_to_print:
            print(line)

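This processes the file line by line as you read it, which is ideal for large files that cannot be held in memory all at once. For small files, you can read the entire contents into memory and use a regular expression:

import re

with open('filename.txt', 'r') as f:
    data = f.read()

# re.DOTALL lets "." match newlines, so the capture can span several lines
pattern = re.compile('M  V30 BEGIN ATOM(.+?)M  V30 END ATOM', re.DOTALL)
for result in pattern.findall(data):
    print(result.strip())

Note: none of this code has been tested; I wrote it blind.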

You can try the following:

# Print only the lines between the ATOM block markers
with open("file.txt") as file:
    inside = False
    for line in file:
        stripped = line.rstrip()
        # Start of the section of interest
        if stripped == "M  V30 BEGIN ATOM":
            inside = True
        # End of the section of interest
        elif stripped == "M  V30 END ATOM":
            inside = False
        # Inside the section of interest
        elif inside:
            print(stripped)

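Because the question asks for an approach that works whatever the number of atoms, the same flag logic could be wrapped in a small reusable generator. This is only a sketch: the name `iter_section` and its parameters are illustrative and do not come from the answers here, and an in-memory `io.StringIO` stands in for a real file so the example runs on its own.

```python
import io


def iter_section(lines, start, end):
    """Yield the lines found strictly between a start and an end marker.

    `lines` is any iterable of strings, such as an open file.
    `iter_section` is a hypothetical helper name used for illustration.
    """
    inside = False
    for line in lines:
        stripped = line.rstrip()
        if stripped == start:
            inside = True
        elif stripped == end:
            inside = False
        elif inside:
            yield stripped


# In-memory stand-in for a real file (assumption for the demo only).
sample = io.StringIO(
    "M  V30 BEGIN ATOM\n"
    "M  V30 1 C 7.3071 41.3785 19.7482 0\n"
    "M  V30 2 C 7.5456 41.3920 21.2703 0\n"
    "M  V30 END ATOM\n"
    "M  V30 BEGIN BOND\n"
    "M  V30 0 1 1 2\n"
    "M  V30 END BOND\n"
)

atom_lines = list(iter_section(sample, "M  V30 BEGIN ATOM", "M  V30 END ATOM"))
for atom_line in atom_lines:
    print(atom_line)
```

With a real file, pass the open file object in place of `sample`. Changing the two markers should extract the BOND block in the same way.
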
Here is how I would do it (with csv):
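As one possible reading of a csv-based approach, the sketch below lets `csv.reader` split each line on spaces and keeps the fields of the rows inside the ATOM block. The function name `read_atom_rows` and the decision to drop the leading `M` and `V30` fields are assumptions made for illustration; this is not the answerer's code.

```python
import csv
import io


def read_atom_rows(lines):
    """Return the fields of each row between BEGIN ATOM and END ATOM.

    `read_atom_rows` is a hypothetical helper. `lines` is any iterable
    of strings, such as an open file.
    """
    # skipinitialspace=True collapses runs of spaces such as "M  V30".
    reader = csv.reader(lines, delimiter=" ", skipinitialspace=True)
    rows = []
    inside = False
    for fields in reader:
        if fields[:4] == ["M", "V30", "BEGIN", "ATOM"]:
            inside = True
        elif fields[:4] == ["M", "V30", "END", "ATOM"]:
            inside = False
        elif inside:
            # Drop the leading "M" and "V30" fields (an assumption).
            rows.append(fields[2:])
    return rows


# In-memory stand-in for a real file (assumption for the demo only).
sample = io.StringIO(
    "M  V30 BEGIN ATOM\n"
    "M  V30 1 C 7.3071 41.3785 19.7482 0\n"
    "M  V30 END ATOM\n"
)
atom_rows = read_atom_rows(sample)
for row in atom_rows:
    print(row)
```

Having the fields already split may be convenient if the coordinates are needed as numbers later, for example via `float(row[2])`.
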

To keep the logic short, sweet, and self-contained, and since you want a portable approach:

def print_atoms_from_file(full_file_path):
    with open(full_file_path, 'r') as f:
        start_printing = False
        for line in f:
            if 'BEGIN ATOM' in line:
                start_printing = True
                continue

            if 'END ATOM' in line:
                start_printing = False
                continue

            if start_printing:
                # Strip the trailing newline; print() adds its own
                print(line.rstrip('\n'))

print_atoms_from_file('test_file_name.txt')

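You can try the following function:

def extract_lines(filename, start_line, stop_line):
    # Read the whole file into memory, one list entry per line
    with open(filename, 'r') as f:
        lines = f.readlines()

    list_of_lines = [line.rstrip('\n') for line in lines]

    # list.index raises ValueError if a marker is not present
    start_point = list_of_lines.index(start_line)
    stop_point = list_of_lines.index(stop_line)

    # Return only the lines strictly between the two markers
    return "\n".join(list_of_lines[start_point + 1:stop_point])
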
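`list.index` raises `ValueError` when a marker is missing, and searching for both markers from the start of the list gives an empty result if the stop marker also occurs before the start marker. One way to handle both cases is sketched below. The helper name `extract_between` and the choice to return an empty list for a missing marker are assumptions, not part of the `extract_lines` answer.

```python
def extract_between(lines, start_line, stop_line):
    """Return the lines strictly between start_line and the next stop_line.

    `extract_between` is a hypothetical name; returning [] when a marker
    is missing is a design choice made for this sketch.
    """
    cleaned = [line.rstrip("\n") for line in lines]
    try:
        start_point = cleaned.index(start_line)
        # Search for the stop marker only after the start marker.
        stop_point = cleaned.index(stop_line, start_point + 1)
    except ValueError:
        return []
    return cleaned[start_point + 1:stop_point]


# A list of lines stands in for f.readlines() (assumption for the demo only).
sample_lines = [
    "M  V30 BEGIN ATOM\n",
    "M  V30 1 C 7.3071 41.3785 19.7482 0\n",
    "M  V30 END ATOM\n",
]
found = extract_between(sample_lines, "M  V30 BEGIN ATOM", "M  V30 END ATOM")
missing = extract_between(sample_lines, "M  V30 BEGIN BOND", "M  V30 END BOND")
print(found)
print(missing)
```
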

What have you tried so far? Using a module could probably make it generic.

Start reading the file -> when you find the start string, begin capturing data into a list or new string -> stop when you find the string to stop at.

This depends on the second field always being V30, but the model is generally fine: look for a pattern. Maybe start capturing lines when one contains "BEGIN ATOM" and stop when one contains "END ATOM"?

OK, yes, you could change it to line.endswith('BEGIN BOND').

I was going to upvote too, but then I saw that you use a regular expression. So, no upvote.

@Wychh Great! Could you mark the question as answered? :)