Iterating over and replacing words in the lines of a tuple in Python

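I want to iterate over this tuple and, for each line, iterate over its words, using a regex to find and replace some of the words (internet addresses, to be exact) while keeping the lines intact.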

aList=
[
  "being broken changes people, \nand rn im missing the old me", 
  "@SaifAlmazroui @troyboy621 @petr_hruby you're all missing the point", 
  "#News #Detroit Detroit water customer receives shutoff threat over missing 10 cents: - Theresa Braxton is a l... T.CO/CHPBRVH9WKk", 
  "@_EdenRodwell \ud83d\ude29\ud83d\ude29ahh I love you!! Missing u, McDonald's car park goss soon please \u2764\ufe0f\u2764\ufe0fxxxxx", 
  "This was my ring tone, before I decided change was good and missing a call was insignificant T.CO?BUXLVZFDWQ", 
  "want to go on holiday again, missing the sun\ud83d\ude29\u2600\ufe0f"
]
The code below almost does this, but it splits the list into words, printing one word per line:

i=0
while i<len(aList):
    for line in aList[i].split():
        line = re.sub(r"^[http](.*)\/(.*)$", "", line)
        print (line)
        i+=1

Thanks

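Your question is a little unclear, but I think I understand what you mean.

newlist = [re.sub(r"{regex}", "", line) for line in aList]
should iterate over the list of strings and, using a Python list comprehension, replace any text that matches the regex pattern with an empty string. Here {regex} stands in for your actual pattern.

Side note:

Look closely at your regex; it does not seem to do what you think it does. I would take a look at this Stack Overflow post on matching URLs with regex.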
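To make the side note concrete, here is a minimal sketch of what the pattern from the question actually matches, next to a pattern that spells out http:// literally. The sample words and the second pattern are my own assumptions for illustration, not something from the question.

```python
import re

# Pattern from the question. [http] is a character class: it matches ONE
# character that is h, t, or p. It does not match the text "http".
original = re.compile(r"^[http](.*)\/(.*)$")

# Assumed alternative: the literal text "http://" followed by non-space characters.
literal = re.compile(r"^http://\S+$")

# Hypothetical sample words, not taken from the question.
words = ["http://example.com/abc", "this/that", "people", "example.com/abc"]

matched_original = [w for w in words if original.search(w)]
matched_literal = [w for w in words if literal.search(w)]

print(matched_original)
print(matched_literal)
```

The original pattern also accepts this/that, because it only asks for a word that starts with h, t, or p and contains a slash somewhere.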



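From this:

re.sub(r"^[http](.*)\/(.*)$", "", line)
it looks to me as if you expect all of the URLs to be at the end of the line. In that case, try:

[re.sub('http://.*', '', s) for s in aList]
Here, http:// matches the literal text http://, and .* matches everything that follows it up to the end of the line.

Example: here is the list with some URLs added: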

aList = [
  "being broken changes people, \nand rn im missing the old me",
  "@SaifAlmazroui @troyboy621 @petr_hruby you're all missing the point",
  "#News #Detroit Detroit water customer receives shutoff threat over missing 10 cents: - Theresa Braxton is a http://example.com/CHPBRVH9WKk",
  "@_EdenRodwell ahh I love you!! Missing u, McDonald's car park goss soon please xxxxx",
  "This was my ring tone, before I decided change was good and missing a call was insignificant http://example.com?BUXLVZFDWQ",
  "want to go on holiday again, missing the sun"
  ]
This gives:

>>> [re.sub('http://.*', '', s) for s in aList]
['being broken changes people, \nand rn im missing the old me',
 "@SaifAlmazroui @troyboy621 @petr_hruby you're all missing the point",
 '#News #Detroit Detroit water customer receives shutoff threat over missing 10 cents: - Theresa Braxton is a ',
 "@_EdenRodwell ahh I love you!! Missing u, McDonald's car park goss soon please xxxxx",
 'This was my ring tone, before I decided change was good and missing a call was insignificant ',
 'want to go on holiday again, missing the sun']
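The addresses in the original question have no http:// prefix and are written in capitals (T.CO/... and T.CO?...), so the pattern above would not touch them. The following is only a sketch under the assumption that every address is a t.co link, optionally preceded by http:// or https://. It uses \S+ instead of .* so that text after an address in the middle of a line is kept. The sample lines are shortened and partly made up.

```python
import re

# Assumption: addresses look like "T.CO/..." or "T.CO?...", in any letter case,
# with an optional http:// or https:// in front.
URL_PATTERN = re.compile(r"(?:https?://)?\bt\.co[/?]\S+", re.IGNORECASE)

# Shortened sample lines; the third one is hypothetical.
lines = [
    "missing a call was insignificant T.CO?BUXLVZFDWQ",
    "Theresa Braxton is a l... T.CO/CHPBRVH9WKk",
    "see T.CO/abc123 for details",
    "missing the sun",
]

# One output string per input line, so the lines stay lines.
cleaned = [URL_PATTERN.sub("", line) for line in lines]

for line in cleaned:
    print(repr(line))
```

Removing an address leaves the spaces around it in place; call .strip() on the result or collapse repeated spaces if that matters.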

You have an infinite loop. Python has no plus-plus operator; you have to use plus-equals.

That i++ was a typo, bpachev, and I have corrected it. John1024, the code runs without the typo. I am not allowed to put internet addresses in the question. An example of an address is (T.CO?BUXLVZFDWQ); I put them in all caps.

The regex [http] means 1 character, which can be h, t, or p. The regex http means 4 characters in order: h, t, t, then p.

Yes, Aprillion. I see. Thanks.

This works, but it only iterates over the lines, not over each word in a line. So it only replaces a whole line that consists solely of an internet address.
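For the last comment, one way to go word by word and still end up with one string per line is to split each line, drop the words that look like addresses, and join the rest again. This is a rough sketch: the function name strip_url_words and the pattern are assumptions of mine, and str.split() collapses all whitespace, so an embedded "\n" becomes a single space.

```python
import re

# Assumption: a word is an address if it starts with http:// or https://,
# or looks like the t.co links in the question (any letter case).
URL_WORD = re.compile(r"^(?:https?://\S+|t\.co[/?]\S+)$", re.IGNORECASE)


def strip_url_words(line):
    """Return the line with address-like words removed (hypothetical helper)."""
    kept = [word for word in line.split() if not URL_WORD.match(word)]
    return " ".join(kept)


aList = [
    "being broken changes people, \nand rn im missing the old me",
    "missing a call was insignificant T.CO?BUXLVZFDWQ",
    "Theresa Braxton is a http://example.com/CHPBRVH9WKk today",
]

# Still one string per original line.
cleaned = [strip_url_words(line) for line in aList]

for line in cleaned:
    print(line)
```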