Python 3.x: Parsing text with Python

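I have sample data like the following in a text file. What I want to do is search the text file and return everything between "SpecialStuff" and the next ";", as I did in the Output example. I am very new to Python, so any tips would be much appreciated. For example, would .split() work?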

You can try the following:

# Open the original file to read and a new file to write to; both are closed automatically
with open("filename.txt", "r") as file, open("result.txt", "w") as output:
    seenSpecialStuff = False  # Tracks whether a 'SpecialStuff' line has been seen
    for line in file:
        if ";" in line:
            seenSpecialStuff = False  # Stop copying when a semicolon is seen
        if seenSpecialStuff:
            output.write(line)  # Copy the line while the tracker is active
        if "SpecialStuff" in line:
            seenSpecialStuff = True  # Start copying after a SpecialStuff line
This produces a file named result.txt that contains:

  select
    numbers
    ,othernumbers
    words
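This code can be improved! Since this may be a homework assignment, you may want to research how to make it more efficient. I hope it serves as a useful starting point.

Cheers

Edit

If you want the code to react only to a line that is exactly "SpecialStuff" (rather than any line that contains "SpecialStuff"), you can easily make the "if" statements more specific: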

with open("my.txt", "r") as file, open("result.txt", "w") as output:
    seenSpecialStuff = False
    for line in file:
        if line.rstrip("\n") == ";":  # the line is exactly ";"
            seenSpecialStuff = False
        if seenSpecialStuff:
            output.write(line)
        if line.rstrip("\n") == "SpecialStuff":  # the line is exactly "SpecialStuff"
            seenSpecialStuff = True
Don't use str.split() for this; str.find() is enough:

marker = "SpecialStuff"
parsed = None
with open("example.dat", "r") as f:
    data = f.read()  # load the whole file into memory for convenience
    start_index = data.find(marker)  # find the beginning of your block
    if start_index != -1:
        end_index = data.find(";", start_index)  # find the end of the block
        if end_index != -1:
            # grab everything between the end of the marker and the semicolon
            parsed = data[start_index + len(marker):end_index]
if parsed is None:
    print("`SpecialStuff` block not found")
else:
    print(parsed)

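Keep in mind that this captures everything between the two, including newlines and other whitespace. If you don't want that, you can also call parsed.strip() to remove the leading and trailing whitespace.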
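The str.find() approach above stops at the first block. A minimal sketch of how the same idea might be extended to collect every block, with strip() applied to each one, is shown here. The function name find_blocks, its parameters, and the sample string are hypothetical and not part of the original answer.

```python
def find_blocks(data, marker="SpecialStuff", terminator=";"):
    """Return the stripped text between each marker and the next terminator."""
    blocks = []
    pos = 0
    while True:
        start = data.find(marker, pos)  # next occurrence of the marker
        if start == -1:
            break
        end = data.find(terminator, start + len(marker))  # terminator after it
        if end == -1:
            break  # marker without a closing terminator: stop searching
        blocks.append(data[start + len(marker):end].strip())
        pos = end + len(terminator)  # continue after this block
    return blocks


# Made-up sample text, only for illustration
sample = "x\nSpecialStuff\n  select\n    numbers\n;\nother\nSpecialStuff\nwords\n;\n"
print(find_blocks(sample))
```

Like the original snippet, this matches the marker anywhere in the text, including inside longer words.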

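Thanks, this is very close to what I want. The only problem is that some parts of the file contain strings like "abcSpecialStuffpdq", so it captures everything after those as well. How can I change the code so that it only captures what follows the exact string "SpecialStuff"?

You can try making the "if" statement something like if line.replace("\n", "") == "SpecialStuff": so that only a line containing exactly SpecialStuff sets the tracker to 1! If you want other lines to match only a specific string, you can do the same for them. I edited the answer to reflect this! If you later also need the information that follows "abcSpecialStuffpdq", you will have to add a separate "if" statement so the code can recognize it.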
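For comparison, the same "exact line" requirement might be expressed with a regular expression. This is only a sketch: it assumes the marker sits alone on its own line and that the block ends at a line containing only ";". The names BLOCK_RE and extract_blocks and the sample text are hypothetical.

```python
import re

# Assumption: "SpecialStuff" is alone on its line (surrounding spaces or tabs allowed),
# and the block ends at the next line that contains only ";".
BLOCK_RE = re.compile(
    r"^[ \t]*SpecialStuff[ \t]*\n(.*?)^[ \t]*;[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_blocks(text):
    """Return the raw text of every block, without the marker and the ';' line."""
    return BLOCK_RE.findall(text)


# Made-up sample: the first marker is embedded in a longer word and is skipped
sample = "abcSpecialStuffpdq\nignored\n;\nSpecialStuff\n  select\n    numbers\n;\n"
print(extract_blocks(sample))
```

Because the pattern is anchored to line starts with re.MULTILINE, a line such as "abcSpecialStuffpdq" does not open a block.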
with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:  # open the input and output files
    wanted = False  # do we want the current line in the output?
    for line in infile:
        if line.strip() == "SpecialStuff":  # marks the beginning of a wanted block
            wanted = True
            continue
        if line.strip() == ";" and wanted:  # marks the end of a wanted block
            wanted = False
            continue

        if wanted:
            outfile.write(line)  # copy lines that fall inside a wanted block
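
The line-by-line logic above can also be wrapped in a generator so that it can be tried on in-memory text before it is pointed at real files. This is a sketch under that assumption; the name wanted_lines, its parameters, and the sample text are hypothetical.

```python
import io


def wanted_lines(lines, start="SpecialStuff", end=";"):
    """Yield the lines that lie between a `start` line and the next `end` line."""
    wanted = False
    for line in lines:
        stripped = line.strip()
        if stripped == start:  # beginning of a wanted block
            wanted = True
            continue
        if wanted and stripped == end:  # end of a wanted block
            wanted = False
            continue
        if wanted:
            yield line


# io.StringIO stands in for an open file; the text is made up for illustration
sample = io.StringIO("header\nSpecialStuff\n  select\n    numbers\n;\nfooter\n")
result = list(wanted_lines(sample))
print("".join(result), end="")
```

With real files, the same generator could be fed the open input file and its output passed to outfile.writelines().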