How to read input in Python until the next occurrence

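So the problem is: given the input below, I want to separate each URL (a line starting with [URL, [LINK or [WEBSITE) from the text. I want to put each URL, in order, into a list of links, and each text into a list of texts.

I also want to merge each text into a single line, so that every link matches its corresponding text. Here is an example: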

[URL - https://url1.com]
news_line1 word
news_line2 word word
news_line3 word word word

[LINK - https://url2.com]
headline_line1 letter
headline_line2 letter letter
headline_line3 letter letter letter

[WEBSITE - https://url3.com]
date_line1 sentence
date_line2 sentence sentence
date_line3 sentence sentence sentence
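
The desired output is the list of links, in order, together with the list of texts, where each text is merged into a single line.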

My current code is:

import sys

inFile = sys.argv[1]

with open(inFile) as f:
    content = f.readlines()

content = [x.strip() for x in content]
url_links = []
sentences = []

for entry in content:
    sentence = ""
    if entry.startswith(("[news_text", "[headline", "[date")):
        url_links.append(entry)

    else:
        sentence = sentence + entry

    sentences.append(sentence)

for sentence in sentences:
    print(sentence)

The current output I get is:

news_line1 word
news_line2 word word
news_line3 word word word


headline_line1 letter
headline_line2 letter letter
headline_line3 letter letter letter


date_line1 sentence
date_line2 sentence sentence
date_line3 sentence sentence sentence

How can I adjust it so that it gives the correct output?

If you don't want unnecessary blank lines in the output, add this to the loop:

if not entry:
    continue

To get the desired output, you can use string joining (str.join):
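
As a minimal sketch of what str.join does here (the list name block_lines is an assumption made for illustration, not taken from the original answer): the separator string decides whether the joined lines run together or are separated by a single space.

```python
# Hypothetical lines belonging to one link's block of text.
block_lines = [
    "news_line1 word",
    "news_line2 word word",
    "news_line3 word word word",
]

# An empty separator glues the lines together with nothing in between.
glued = ''.join(block_lines)

# A single-space separator puts exactly one space between lines,
# with no leading or trailing space.
spaced = ' '.join(block_lines)

print(glued)
print(spaced)
```

Joining with ' ' avoids having to prepend a space to each continuation line by hand.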

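To split the text into blocks, let's add a boolean variable that indicates whether a block has just ended (a block ends when a new entry for url_links starts being processed).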


Use a list to store the pieces of each sentence, and remember that startswith() is case-sensitive. The relevant part of the modified code is:

url_links = []
sentences = []
sentence = []    # lines collected for the current block
for entry in s.split('\n'):    # s holds your string
    entry = entry.strip()    # strip() returns a new string, so reassign it
    if entry.startswith(("[URL", "[LINK", "[WEBSITE")):
        url_links.append(entry)
        if sentence:    # add only a non-empty list
            sentences.append(' '.join(sentence))
        sentence = []
    elif entry:    # skip blank lines
        sentence.append(entry)
else:   # this else belongs to the for loop; it flushes the last block
    if sentence:
        sentences.append(' '.join(sentence))


for sentence in sentences:
    print(sentence)
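
To try this approach end to end, here is a self-contained sketch that wraps the same logic in a function and runs it on the sample input from the question. The function name split_links_and_texts and the SAMPLE constant are assumptions made for illustration. The final zip pairs each link with its text by position, which assumes every link is followed by at least one line of text.

```python
SAMPLE = """\
[URL - https://url1.com]
news_line1 word
news_line2 word word
news_line3 word word word

[LINK - https://url2.com]
headline_line1 letter
headline_line2 letter letter
headline_line3 letter letter letter

[WEBSITE - https://url3.com]
date_line1 sentence
date_line2 sentence sentence
date_line3 sentence sentence sentence
"""

# startswith() is case-sensitive and accepts a tuple of prefixes.
PREFIXES = ("[URL", "[LINK", "[WEBSITE")


def split_links_and_texts(text):
    """Return (url_links, sentences), one merged sentence per block."""
    url_links, sentences, current = [], [], []
    for entry in text.splitlines():
        entry = entry.strip()
        if entry.startswith(PREFIXES):
            if current:  # close the previous block, if any
                sentences.append(' '.join(current))
                current = []
            url_links.append(entry)
        elif entry:  # skip blank lines
            current.append(entry)
    if current:  # flush the last block
        sentences.append(' '.join(current))
    return url_links, sentences


links, texts = split_links_and_texts(SAMPLE)
for link, text in zip(links, texts):
    print(link, '->', text)
```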

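If I join, everything ends up on one line. I'm trying to put each text on its own line, so that every text below a [URL etc. is on one line.

I edited my answer with code that adds a new line after each block.

That definitely gets the job done! But it doesn't produce the correct output. When multiple lines are joined, it just puts them together without spaces. When I use " ".join, it adds a space at the start of every line after the first one. Any suggestions?

Please check the edited answer.

For example, I get "news_line1 word news_line2 word news_line3 word headline_line1 letter" instead of "news_line1 word news_line2 word word news_line3 word word word".

It doesn't produce the correct output if the input text has line breaks inside it, so s.split('\n') may split one item into several texts.

Please share an input example where the code fails. If \n marks the end of a line, it should work.

If you want the output on one line, why not use print(sentence, end='') to cut off the newline that every print() statement includes by default?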
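
The print() suggestion in the last comment can be tried in isolation. The following is a minimal sketch, assuming a hypothetical list named lines and writing to an io.StringIO buffer so that the result can be inspected. Note that end='' would glue the lines together with nothing in between, so a space is used as the terminator here instead.

```python
import io

# Hypothetical lines belonging to one block of text.
lines = ["news_line1 word", "news_line2 word word", "news_line3 word word word"]

buffer = io.StringIO()
for line in lines:
    # end=' ' replaces the default trailing newline with a space.
    print(line, end=' ', file=buffer)

# Drop the space left behind after the final line.
one_line = buffer.getvalue().rstrip()
print(one_line)
```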
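
# Relevant part only: content is the list of stripped lines from the question's code.
url_links = []
sentences = []
previous_block_end = False    # True right after a link line has been seen
for entry in content:
    if not entry:    # skip blank lines
        continue

    sentence = ""
    if entry.startswith(("[URL", "[LINK", "[WEBSITE")):
        previous_block_end = True
        url_links.append(entry)
    else:
        sentence = entry
        if previous_block_end and len(url_links) > 1:
            sentence = '\n' + sentence    # first line of every block after the first
        if not previous_block_end:
            sentence = ' ' + sentence    # continuation line within a block
        previous_block_end = False

    sentences.append(sentence)

result = ''.join(sentences)
print(result)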