Parsing text with Python

Tags: python-3.x, split


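I have some sample data in a text file. What I want to do is search the text file and return everything between "SpecialStuff" and the next ";", as in my Output example. I'm very new to Python, so any tips would be much appreciated. For example, would .split() work?

You can try the following: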

# Open the original file for reading and a new file to write to.
# The with statement closes both files when the block ends.
with open("filename.txt", "r") as file, open("result.txt", "w") as output:
    seenSpecialStuff = 0  # Tracks whether the 'SpecialStuff' line has been seen.
    for line in file:
        if ";" in line:
            seenSpecialStuff = 0  # Set tracker to 0 when a semicolon is seen.
        if seenSpecialStuff == 1:
            output.write(line)  # Write the line if the tracker is active.
        if "SpecialStuff" in line:
            seenSpecialStuff = 1  # Set tracker to 1 when SpecialStuff is seen.

This produces a file named result.txt that contains:

  select
    numbers
    ,othernumbers
    words

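This code can be improved! Since this may be a homework assignment, you may want to research how to make it more efficient. I hope it serves as a useful starting point.

Cheers

Edit

If you want the code to react only to a line that is exactly "SpecialStuff" (rather than any line that contains "SpecialStuff"), you can easily change the "if" statements to be more specific: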

with open("my.txt", "r") as file, open("result.txt", "w") as output:
    seenSpecialStuff = 0
    for line in file:
        if line.replace("\n", "") == ";":  # the line is exactly ";"
            seenSpecialStuff = 0
        if seenSpecialStuff == 1:
            output.write(line)
        if line.replace("\n", "") == "SpecialStuff":  # the line is exactly "SpecialStuff"
            seenSpecialStuff = 1
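
To see why the exact comparison matters, here is a small sketch that runs both tests on a few sample lines. The lines and the helper names `contains_marker` and `is_marker_line` are made up for illustration; they are not taken from the asker's file.

```python
# Hypothetical sample lines; the real file contents are not shown in the question.
lines = ["abcSpecialStuffpdq\n", "SpecialStuff\n", "  SpecialStuff  \n"]


def contains_marker(line):
    # Substring test, as in the first version of the answer.
    return "SpecialStuff" in line


def is_marker_line(line):
    # Exact test, as in the edited version: only newline characters are removed.
    return line.replace("\n", "") == "SpecialStuff"


substring_hits = [contains_marker(line) for line in lines]
exact_hits = [is_marker_line(line) for line in lines]

print(substring_hits)  # [True, True, True]
print(exact_hits)      # [False, True, False]
```

Note that the exact test also rejects the third line, because its surrounding spaces are kept. Comparing `line.strip()` instead would accept it, if that is what you need.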

Don't use str.split() for this; str.find() is enough:

parsed = None
marker = "SpecialStuff"
with open("example.dat", "r") as f:
    data = f.read()  # load the file into memory for convenience
    start_index = data.find(marker)  # find the beginning of your block
    if start_index != -1:
        end_index = data.find(";", start_index)  # find the end of the block
        if end_index != -1:
            # grab everything between the end of the marker and the semicolon
            parsed = data[start_index + len(marker):end_index]
if parsed is None:
    print("`SpecialStuff` block not found")
else:
    print(parsed)

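Keep in mind that this captures everything between the two, including newlines and other whitespace. If you don't want that, you can also call parsed.strip() to remove the leading and trailing whitespace.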
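
As a rough illustration of what strip() changes, the sketch below runs the same find-and-slice logic on an in-memory string. The contents of `data` are a made-up stand-in for the file, and the example assumes that both the marker and the semicolon are present.

```python
# Hypothetical contents, held in a string instead of being read from a file.
data = "header\nSpecialStuff\n  select\n    numbers\n;\nfooter\n"

marker = "SpecialStuff"
start_index = data.find(marker)          # assumed to be found in this sample
end_index = data.find(";", start_index)  # assumed to be found in this sample

raw = data[start_index + len(marker):end_index]
cleaned = raw.strip()

print(repr(raw))      # '\n  select\n    numbers\n'
print(repr(cleaned))  # 'select\n    numbers'
```

strip() only trims the two ends of the captured text, so the indentation of lines inside the block is left as it was.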
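Thanks, this is very close to what I want. The only problem is that some parts of the file contain strings like "abcSpecialStuffpdq", so it captures everything after those as well. How can I change the code so that it only captures what follows the string "SpecialStuff"?

You can try making the "if" statement something like if line.replace("\n", "") == "SpecialStuff":, so that only a line containing exactly SpecialStuff sets the tracker to 1. If you want it to look only for specific markers, you can do the same for the other lines. I have edited the answer to reflect this. If you later also need the information contained in "abcSpecialStuffpdq", you will have to add a separate "if" statement so that the code can recognize it.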
with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:  # open the input and output files
    wanted = False  # do we want the current line in the output?
    for line in infile:
        if line.strip() == "SpecialStuff":  # marks the beginning of a wanted block
            wanted = True
            continue
        if line.strip() == ";" and wanted:  # marks the end of a wanted block
            wanted = False
            continue

        if wanted:
            outfile.write(line)
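
The same loop can be wrapped in a function so it can be tried without touching any files. This is only a sketch: the function name `extract_blocks` and the sample text are assumptions made for illustration, with `io.StringIO` standing in for the opened input file.

```python
import io


def extract_blocks(lines):
    """Collect the lines between a "SpecialStuff" line and the next ";" line."""
    wanted = False
    kept = []
    for line in lines:
        stripped = line.strip()
        if stripped == "SpecialStuff":   # start of a wanted block
            wanted = True
            continue
        if stripped == ";" and wanted:   # end of a wanted block
            wanted = False
            continue
        if wanted:
            kept.append(line)
    return kept


# Hypothetical input, including a line that merely contains the marker.
sample = io.StringIO(
    "abcSpecialStuffpdq\n"
    "ignored\n"
    ";\n"
    "SpecialStuff\n"
    "select\n"
    "numbers\n"
    ";\n"
    "trailing\n"
)
result = extract_blocks(sample)
print(result)  # ['select\n', 'numbers\n']
```

Because the comparison is made against the whole stripped line, "abcSpecialStuffpdq" does not start a block, which is the behavior asked for in the comment above.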
