How do I apply a regular expression to every sublist of a list in Python?

Suppose I have a list like this:

lis_ = [['"Fun is the enjoyment of pleasure"\t\t',
         '@username det fanns ett utvik med "sabrina without a stitch". acke nothing. @username\t\t','Report by @username  - #JeSuisCharlie Movement Leveraged to Distribute DarkComet Malware https://t.co/k9sOEpKjbg\t\t'],
        ['I just became the mayor of Porta Romana on @username! http://4sq.com/9QROVv\t\t', "RT benturner83 Someone's chucking stuff out of the window of an office on tottenham court road #tcr street evacuated http://t.co/heyOhpb1\t\t", "@username Don't use my family surname for your app ???? http://t.co/1yYLXIO9\t\t"]
        ]
I want to remove the links from every sublist, so I tried the following regular expression:

new_list = re.sub(r'^https?:\/\/.*[\r\n]*', '', tweets, flags=re.MULTILINE)
I used the MULTILINE flag because when I print the list it looks like this:

[]
[]
[]
...
[]
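The problem with this approach is that I get a TypeError: expected string or buffer. Apparently I can't pass the sublists to the regular expression like this. How can I apply the regular expression above to every sublist in the list, so that I end up with sublists that contain no links of any kind?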

Can this be done with map, or is there another efficient way?


Thanks in advance, everyone.

It seems you have a list of lists of strings.

In that case, you just need to iterate over those lists in the right way:

import re

list_ = [['blablablalba', 'blabalbablbla', 'blablala', 'http://t.co/xSnsnlNyq5'], ['blababllba', 'blabalbla', 'blabalbal'],['http://t.co/xScsklNyq5'], ['blablabla', 'http://t.co/xScsnlNyq3']]

def remove_links(sublist):
    # keep only the strings in which the link pattern is not found
    return [s for s in sublist if not re.search(r'https?:\/\/.*[\r\n]*', s)]

# list() makes this a real list on Python 3 too (on Python 2.7 map already returns a list)
final_list = list(map(remove_links, list_))
# [['blablablalba', 'blabalbablbla', 'blablala'], ['blababllba', 'blabalbla', 'blabalbal'], [], ['blablabla']]
If you later want to drop any empty sublists:

final_final_list = [l for l in final_list if l]
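
Note that the filter above discards every string that contains a link, surrounding text included. If only the URL should go and the rest of the string should stay, substituting inside each string is one option. The following is a minimal sketch and not part of the original answer: the name strip_links, the made-up sample data and the pattern https?://\S+ (which stops at the first whitespace character) are all assumptions chosen for illustration.

```python
import re

# Assumed pattern: a URL runs from "http://" or "https://" up to the next whitespace.
URL = re.compile(r'https?://\S+')

def strip_links(sublist):
    # Hypothetical helper: remove the URL text but keep the rest of each string.
    return [URL.sub('', s) for s in sublist]

# Made-up sample data in the same shape as the answer's list_.
sample = [['blablablalba', 'see http://t.co/xSnsnlNyq5 now'], ['http://t.co/xScsklNyq5']]

# Filtering drops whole strings; substituting keeps the non-URL text.
filtered = [[s for s in sub if not URL.search(s)] for sub in sample]
stripped = [strip_links(sub) for sub in sample]

print(filtered)
print(stripped)
```

With substitution, a string that consisted only of a link becomes an empty string rather than disappearing, so a further filtering step may still be wanted.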

You need to use \b instead of the start-of-line anchor:

>>> import re
>>> lis_ = [['"Fun is the enjoyment of pleasure"\t\t',
         '@username det fanns ett utvik med "sabrina without a stitch". acke nothing. @username\t\t','Report by @username  - #JeSuisCharlie Movement Leveraged to Distribute DarkComet Malware https://t.co/k9sOEpKjbg\t\t'],
        ['I just became the mayor of Porta Romana on @username! http://4sq.com/9QROVv\t\t', "RT benturner83 Someone's chucking stuff out of the window of an office on tottenham court road #tcr street evacuated http://t.co/heyOhpb1\t\t", "@username Don't use my family surname for your app ???? http://t.co/1yYLXIO9\t\t"]
        ]
>>> [[re.sub(r'\bhttps?:\/\/.*[\r\n]*', '', i)] for x in lis_ for i in x]
[['"Fun is the enjoyment of pleasure"\t\t'], ['@username det fanns ett utvik med "sabrina without a stitch". acke nothing. @username\t\t'], ['Report by @username  - #JeSuisCharlie Movement Leveraged to Distribute DarkComet Malware '], ['I just became the mayor of Porta Romana on @username! '], ["RT benturner83 Someone's chucking stuff out of the window of an office on tottenham court road #tcr street evacuated "], ["@username Don't use my family surname for your app ???? "]]
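
To see why the anchor matters: with ^ the pattern can only match where a line begins, and in these tweets the link sits at the end of the text. Here is a small check on one string taken from the question. The variable names are illustrative, and the \/ escapes are dropped because / needs no escaping in Python.

```python
import re

tweet = 'I just became the mayor of Porta Romana on @username! http://4sq.com/9QROVv\t\t'

# Anchored at the start of a line: the link is not at a line start, so nothing matches.
anchored = re.sub(r'^https?://.*[\r\n]*', '', tweet, flags=re.MULTILINE)

# Word boundary: matches "http" wherever it begins a word, then consumes the rest of the line.
with_boundary = re.sub(r'\bhttps?://.*[\r\n]*', '', tweet)

print(anchored == tweet)
print(repr(with_boundary))
```

Because .* runs to the end of the line, anything that follows the link (here the trailing tabs) is removed as well.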


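You should fix your list_ example, because right now it isn't valid Python, so it's hard to know exactly what it is. I'm guessing it's a list containing lists of strings, but we shouldn't have to guess such things.

What is your expected output?

@AvinashRaj, I edited it. Thanks for the help, everyone!

Thanks for your help. The problem is that each of my sublists holds strings like this: [blablablalba blabalbablla blablablala] rather than ['blablablalba','blabalbablla','blablablablala']. In each sublist I have one big comment.

[blablablalba blabalbablla] is not valid Python code. Can you be clearer?

Sorry for using blabla, I was trying to explain it in a simple way. I edited.

If you run the code on your new input, it returns [['"Fun is the enjoyment of pleasure"\t\t', '@username det fanns ett utvik med "sabrina without a stitch". acke nothing. @username\t\t'], []], which seems correct?

I got: [['"Fun is the enjoyment of pleasure"\t\t', '@username det fanns ett utvik med "sabrina without a stitch". acke nothing. @username\t\t'], []]. In the last sublist it removed everything, rather than removing only the links.

Thanks for the help. I tried one of the approaches and got this result: [[],[]]

This works for me. I gave it the exact input you provided.

Thanks for your help. On another note, what if I get lists like this: ['"Fun is the enjoyment of pleasure"\t\t', 'fanns ett utvik med "a stitch". @username\t\t'] ['I just ! http://4sq.com/9QROVv\t\t', 'Torino on @username! http://4sq.com/9iydG3\t\t'] ... [another sentence in a list]. I mean, not a list of lists, just a bunch of lists?

How would you assign a bunch of lists to a variable? Of course they must be inside a list.

By the way, I was just curious, thanks.
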
The same substitution written as nested loops, which keeps each sublist intact (lis_ as defined above):

>>> l = []
>>> for i in lis_:
        m = []
        for j in i:
            m.append(re.sub(r'\bhttps?:\/\/.*[\r\n]*', '', j))
        l.append(m)


>>> l
[['"Fun is the enjoyment of pleasure"\t\t', '@username det fanns ett utvik med "sabrina without a stitch". acke nothing. @username\t\t', 'Report by @username  - #JeSuisCharlie Movement Leveraged to Distribute DarkComet Malware '], ['I just became the mayor of Porta Romana on @username! ', "RT benturner83 Someone's chucking stuff out of the window of an office on tottenham court road #tcr street evacuated ", "@username Don't use my family surname for your app ???? "]]
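
The nested loop above can also be written more compactly. The following is a sketch under stated assumptions rather than the answerer's code: remove_links_nested is a hypothetical name, the pattern is compiled once for reuse, and the sample is a shortened copy of the question's data. Since the question asked about map, a map-based variant is shown alongside.

```python
import re
from functools import partial

# Same pattern as in the answer, compiled once (the "\/" escapes are not needed in Python).
LINK = re.compile(r'\bhttps?://.*[\r\n]*')

def remove_links_nested(list_of_lists):
    # Hypothetical helper: apply the substitution to every string, keeping the sublists.
    return [[LINK.sub('', s) for s in sub] for sub in list_of_lists]

# Shortened copy of the question's data.
sample = [
    ['"Fun is the enjoyment of pleasure"\t\t',
     'I just became the mayor of Porta Romana on @username! http://4sq.com/9QROVv\t\t'],
    ["@username Don't use my family surname for your app ???? http://t.co/1yYLXIO9\t\t"],
]

result = remove_links_nested(sample)

# map-based variant: partial(LINK.sub, '') calls LINK.sub('', s) for each string s.
result_map = [list(map(partial(LINK.sub, ''), sub)) for sub in sample]

print(result)
print(result == result_map)
```

Both variants return new lists and leave the input untouched. Choosing between them is mostly a matter of taste, although the comprehension is usually considered the more readable of the two.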