In Python, how do I get only the first line containing a word after a specific line?

python python-3.x

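In a text file I am working with, several lines contain the word "TOP", but I only want the first match that occurs after a line containing the word "IPT". My second question is whether it would be better to use the pandas library, since it is a CSV (comma-separated values) file.

Here is my code, but it gets every line that contains the word "TOP":

A sample of my text file: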

....
....
...SAT...
...
...TOP # I don't want to get this line
...
...
**...IPT...
...
...
...TOP... # I want to get this line**
...
...
...SAT...
...
...TOP... # I don't want to get this line.
**...IPT...
...TOP... # I want to get this line.**

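To fix your code, just add a variable, a flag, that records whether IPT has been found:

temp = {}  # Keys are line numbers; values are the lines containing "IPT", with the newline removed
found_ipt = False
with open("myfile.txt", "r") as myfile:
    fileNum = 0
    for line in myfile.readlines():
        fileNum += 1
        # Assumes "IPT" occupies the same three columns as "TOP"; a 5-character slice can never equal "IPT"
        if line[12:15] == "IPT":
            temp[fileNum] = line.replace('\n', '')
            found_ipt = True
        elif line[12:15] == "TOP" and found_ipt:
            print(line)
            found_ipt = False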

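Track whether IPT has been found in a variable named found. Then look for TOP only while found is True. The first TOP you see while found is True is the one you are looking for, and you can stop searching.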

temp = {}  # Keys are line numbers; values are the lines containing "IPT", with the newline removed
with open("myfile.txt", 'r') as myfile:
    fileNum = 0
    found = False
    for line in myfile.readlines():
        fileNum += 1
        # Assumes "IPT" occupies the same three columns as "TOP"; a 5-character slice can never equal "IPT"
        if line[12:15] == "IPT":
            temp[fileNum] = line.replace('\n', '')
            found = True
        if found and line[12:15] == "TOP":
            print(line)
            # Stops after the first match; set found = False here instead to handle later IPT lines too
            break

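This should do it:

temp = {}  # Keys are line numbers; values are the matching "TOP" lines, with the newline removed
with open("myfile.txt", "r") as myfile:
    # This flag records whether "IPT" has been found
    string_found = False
    # enumerate yields tuples: the first value is the index (starting at 0), the second is the line content
    for line_num, line in enumerate(myfile.readlines()):
        # If "IPT" is in the line and no earlier IPT is pending, set string_found to True so the next "TOP" can be captured
        if "IPT" in line and not string_found:
            string_found = True
        # If "TOP" is in the line and an IPT was seen earlier, save the line
        elif "TOP" in line and string_found:
            temp[line_num] = line.replace("\n", "")
            string_found = False
print(temp)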

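You need to code two actions:

Action 1: when you have not yet seen IPT and IPT is in the line, save the line and start looking for TOP.

Action 2: when you see TOP and have already seen IPT, print the line and stop looking for TOP.

Also, just use a basic string-containment check for "TOP" in the line rather than looking at specific indexes; you do not need to be that specific here.
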
temp = {}
with open("myfile.txt", 'r') as myfile:
    search_mode = False
    for idx, line in enumerate(myfile):       # enumerate() yields (index, line) tuples
        if not search_mode and "IPT" in line: # action 1
            temp[idx] = line.rstrip()
            search_mode = True
        elif search_mode and "TOP" in line:   # action 2
            print(line)
            search_mode = False

Which gives:
import json
print(json.dumps(temp, indent=4))
# >>>
...TOP... # I want to get this line**

...TOP... # I want to get this line.**
{
    "7": "**...IPT...",
    "16": "**...IPT..."
}
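
The two actions above can also be wrapped in a small generator so the logic is testable without a file. This is only a sketch: the function name `first_top_after_each_ipt`, its parameters, and the sample lines are made up for illustration and are not part of the original answer.

```python
def first_top_after_each_ipt(lines, start="IPT", target="TOP"):
    """Yield (index, line) for the first `target` line after each `start` line."""
    searching = False
    for idx, line in enumerate(lines):
        if not searching and start in line:
            searching = True           # action 1: start looking for the target
        elif searching and target in line:
            yield idx, line.rstrip("\n")
            searching = False          # action 2: stop until the next start line


# Hypothetical sample data standing in for the real file
sample = [
    "...TOP... # not wanted\n",
    "...IPT...\n",
    "...\n",
    "...TOP... # wanted\n",
    "...TOP... # not wanted\n",
    "...IPT...\n",
    "...TOP... # wanted too\n",
]
matches = list(first_top_after_each_ipt(sample))
print(matches)
```

Because an open file is itself an iterable of lines, the same function should accept the file object directly, and `next(first_top_after_each_ipt(f), None)` would return only the very first match.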

A pandas DataFrame is meant for holding labeled data (think of the contents of a CSV), which is not what you have here.
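
If the file really is comma-separated, the standard-library `csv` module can apply the same flag to parsed rows without pandas. The following is a rough sketch under assumptions: the marker word is taken to be the whole value of one column, and the column index, function name `first_top_rows`, and sample rows are all hypothetical.

```python
import csv
import io


def first_top_rows(rows, column=1, start="IPT", target="TOP"):
    """Return the first row whose `column` equals `target` after each `start` row."""
    found = []
    searching = False
    for row in rows:
        if len(row) <= column:
            continue                   # skip short or empty rows
        if row[column] == start:
            searching = True
        elif row[column] == target and searching:
            found.append(row)
            searching = False
    return found


# Hypothetical CSV content; a real file would be opened with open(path, newline="")
data = io.StringIO("1,TOP,a\n2,IPT,b\n3,SAT,c\n4,TOP,d\n5,TOP,e\n")
result = first_top_rows(csv.reader(data))
print(result)
```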
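lines = myfile.readlines()
for i, line in enumerate(lines):

...

    # Assumes "IPT" occupies the same three columns as "TOP"; a 5-character slice can never equal "IPT"
    if line[12:15] == "IPT":
        temp[fileNum] = line.replace('\n', '')

        # Scan from the IPT line onward and stop at the first TOP
        for j, line2 in enumerate(lines[i:]):
            if line2[12:15] == "TOP":
                print(line2)
                break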

When it finds an IPT line, it runs a second loop over the lines, sliced from the IPT line onward.
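
The nested scan can be written more compactly with the built-in `next()` and a generator expression. This is a sketch rather than the answerer's code: it uses substring checks instead of fixed slices, and the function name `first_top_after` and the sample lines are invented for illustration.

```python
def first_top_after(lines, start="IPT", target="TOP"):
    """Map each `start` line index to the first later line containing `target`."""
    result = {}
    for i, line in enumerate(lines):
        if start in line:
            # Scan only the lines after the start line; None if no target follows
            result[i] = next((l for l in lines[i + 1:] if target in l), None)
    return result


# Hypothetical sample data standing in for myfile.readlines()
lines = ["...TOP...\n", "...IPT...\n", "...\n", "...TOP... first\n", "...TOP... second\n"]
found = first_top_after(lines)
print(found)
```

Note that this rescans the tail of the list for every IPT line, so it needs the whole file in memory and can be slow on large inputs; the single-pass flag approach avoids both costs.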
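Are the ellipses (…) in the sample output just stand-ins for other data, or are they the actual contents of the file?

@IainShelvington They stand in for other, unrelated data. The word "TOP" is at the line[12:15] substring.

Read the whole question; he wants to find all the TOP lines that come after an IPT line. Also, a code-only answer has no value, and "I post the code and then write the text" is not a solution either ;)

@azro Yes, it will find all the TOP lines after IPT.

Your code does not output the result the OP expects. Look more closely: not every TOP should be found.

@azro I do not want to find all the "TOP" lines after the "IPT" line. Only the first one after the "IPT" line.

Please take a look at my answer below; it is 100% what you expect ;)

You could explain your solution rather than just posting code and leaving.

The dictionary temp contains every line that contains "TOP". I only want the one that follows the "IPT" line.

@azro I added comments, sorry about that.

@OmerT I tested it with your input file and it only returns the lines with "TOP" that come after "IPT", so I am not sure what went wrong for you.

This works, although I had to comment out the last if block. Can you tell me how the continue statement affects the control flow of the code?

Yes, I think moving this to pandas would make this kind of operation easier.

You can indeed drop the continue. You could use pass instead, or simply use neither…

I adjusted the answer slightly, since the two texts should not appear on the same line. If you go up one level, you need to move the last one. Then continue actually makes sense…

Yes, "TOP" and "IPT" indeed never appear on the same line. However, I still think the continue statement is unnecessary, because the elif branch will not run unless the found_ipt variable is True, which depends on whether the if line[12:17] == "IPT": statement has executed. Am I mistaken?
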
result = {}
with open("myfile.txt", 'r') as f:
    ipt_found = False
    for index, line in enumerate(f):
        # For every line number and line in the file
        if 'IPT' in line:
            # If we find IPT in the line then we set ipt_found to True
            ipt_found = True
        elif 'TOP' in line and ipt_found:
            # If we find TOP in the line and ipt_found is True then we add the line
            result[index] = line
            # Set ipt_found to False so we don't append anymore lines with TOP in
            # until we find another line with IPT in
            ipt_found = False
print(result)