List management in Python

I have extracted a list of URLs and want to process it. Here is a sample of the extracted list:

http://help.naver.com/service/svc_index.jsp?selected_nodeId=NODE0000000235
http://www.naver.com/rules/service.html
http://news.naver.com/main/principle.nhn
http://www.naver.com/rules/privacy.html
http://www.naver.com/rules/disclaimer.html
http://help.naver.com/claim_main.asp
http://news.naver.com/main/ombudsman/guidecenter.nhn?mid=omb
http://www.nhncorp.com/
http://www.nhncorp.com/

I only want to extract the URLs that start with "http://www.naver.com", so the final list I want looks like this:

http://www.naver.com/rules/privacy.html
http://www.naver.com/rules/disclaimer.html
http://www.naver.com/rules/service.html

How can I extract only the entries I want?

If your old list contains all the URLs as strings, you can use a list comprehension to filter them:
new = [url for url in old if url.startswith('http://www.naver.com')]

You could write it as an explicit loop instead, but that only adds a few lines of code:
new = []
for url in old:
    if url.startswith('http://www.naver.com'):
        new.append(url)
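
As a quick check, here is a minimal sketch that runs both forms on the sample list from the question and confirms they give the same result. The names PREFIX, from_comprehension and from_loop are made up for this illustration, and the prefix is assumed to be the one used in the answer above. Note that the result keeps the order of the original list.

```python
# Sample list copied from the question.
old = [
    "http://help.naver.com/service/svc_index.jsp?selected_nodeId=NODE0000000235",
    "http://www.naver.com/rules/service.html",
    "http://news.naver.com/main/principle.nhn",
    "http://www.naver.com/rules/privacy.html",
    "http://www.naver.com/rules/disclaimer.html",
    "http://help.naver.com/claim_main.asp",
    "http://news.naver.com/main/ombudsman/guidecenter.nhn?mid=omb",
    "http://www.nhncorp.com/",
    "http://www.nhncorp.com/",
]

PREFIX = "http://www.naver.com"  # assumed prefix, taken from the answer above

# List comprehension: keep only the URLs that start with PREFIX.
from_comprehension = [url for url in old if url.startswith(PREFIX)]

# Equivalent explicit loop.
from_loop = []
for url in old:
    if url.startswith(PREFIX):
        from_loop.append(url)

for url in from_comprehension:
    print(url)
```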

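If you were planning to remove items from the original list while looping over it: never do that, it won't work. You can modify the original list in place with the same list comprehension by assigning the result to the slice old[:].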
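
To illustrate the warning, here is a small sketch of what can go wrong when items are removed during iteration, followed by the slice-assignment form. The function name remove_while_iterating, the three-item sample list and the alias variable are invented for this example only.

```python
PREFIX = "http://www.naver.com"

def remove_while_iterating(urls, prefix):
    """Buggy on purpose: removes items from the list it is iterating over."""
    urls = list(urls)  # work on a copy so the caller's list is untouched
    for url in urls:
        if not url.startswith(prefix):
            urls.remove(url)  # shifts later items left, so the next one is skipped
    return urls

sample = [
    "http://help.naver.com/claim_main.asp",
    "http://news.naver.com/main/principle.nhn",
    "http://www.naver.com/rules/privacy.html",
]

# The news.naver.com entry survives because the loop skips over it.
buggy = remove_while_iterating(sample, PREFIX)

# Slice assignment replaces the contents of the same list object,
# so every other reference to that list sees the filtered result.
old = list(sample)
alias = old
old[:] = [url for url in old if url.startswith(PREFIX)]
```

A plain `old = [...]` would instead bind the name to a new list and leave any other reference pointing at the unfiltered one.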
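You can do this with a list comprehension. These are a very powerful way of working with lists in Python.
By adding an if to the list comprehension, you can filter the list.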

Assuming your URLs are stored in the variable myurls:
filteredurls = [url for url in myurls if url.startswith('http://www.naver.com')]

Someone suggested this alternative answer based on filter() but then deleted it. For completeness, I am posting it here again:
newList = filter(lambda url: url.startswith('http://www.naver.com'), oldList)
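
One caveat if this is run under Python 3: there filter() returns a lazy iterator rather than a list, so it has to be wrapped in list() to get something that can be indexed or measured with len(). A minimal sketch, with a made-up two-item oldList:

```python
PREFIX = "http://www.naver.com"

# Hypothetical input, reusing two URLs from the question.
oldList = [
    "http://www.naver.com/rules/privacy.html",
    "http://www.nhncorp.com/",
]

# In Python 3 this is a lazy filter object, not a list.
lazy = filter(lambda url: url.startswith(PREFIX), oldList)

# Materialize it to get an actual list.
newList = list(lazy)
print(newList)
```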

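The list comprehension approach seems to be faster (and, in my opinion, more readable), as the timeit runs further down show.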
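
What exactly do you want? You gave an example answer, but I can't work out the question: do you want the static .html pages rather than the dynamic ones?

Naming a variable "list" in Python is a really bad practice :]

If the OP really is a beginner, show him the non-list-comprehension equivalent. (Not that it is hard to derive, but I think it is nice to see the difference between the two.)

Hi, old[:] = [url for url in old if url.startswith('…')] works fine for me. Thanks again!

@gnud: then it isn't used ;)

Haha, I guess I was in a bit of a hurry. Thanks anyway.

This is the non-list-comprehension version. Is that a coincidence?

Can http://www.naver.com appear later in the URL?

@quamrana: according to this code, 'http://www.naver.com' can be anywhere in the URL. I did this to accommodate leading whitespace and other markup. The URL itself should "technically" not appear anywhere except at the start, so this code should not go wrong.

@inspectorG4dget: can 'http://naver.com' appear after the ? in some URL?

I don't see why that would happen, unless we are talking about something like a PHP script. But in that case, my code allows it.
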
urlList = [ ... ] # your list of urls
extractedList = [url for url in urlList if url.startswith('http://www.naver.com')]

result = []
for url in myListOfUrls:
    if 'http://www.naver.com' in url:
        result.append(url)
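
The difference discussed in the comments between the substring test and startswith() can be sketched as follows. The three sample strings are invented for illustration (the example.com redirect URL in particular is hypothetical), and stripping whitespace before the prefix test is just one possible way to handle the leading-whitespace case mentioned above.

```python
PREFIX = "http://www.naver.com"

urls = [
    "http://www.naver.com/rules/privacy.html",
    "  http://www.naver.com/rules/service.html",  # leading whitespace
    "http://example.com/redirect?to=http://www.naver.com/",  # hypothetical
]

# Substring test: matches the prefix anywhere, including in a query string.
by_substring = [u for u in urls if PREFIX in u]

# Strict prefix test: leading whitespace makes the second entry fail.
by_prefix = [u for u in urls if u.startswith(PREFIX)]

# Prefix test after stripping whitespace: tolerates padding,
# but still rejects the prefix appearing later in the URL.
by_stripped_prefix = [u for u in urls if u.strip().startswith(PREFIX)]

print(len(by_substring), len(by_prefix), len(by_stripped_prefix))
```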

$ python -m timeit -c "filter(lambda url: url.startswith('1'), map(str, range(100)))"
10000 loops, best of 3: 143 usec per loop

$ python -m timeit -c "[ url for url in map(str, range(100)) if url.startswith('1') ]"
10000 loops, best of 3: 117 usec per loop
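
For anyone who wants to repeat the comparison from a script, here is a rough sketch using the timeit module directly. It is not the exact measurement above: the input list is built once in the setup, filter() is wrapped in list() so that both statements produce a full list under Python 3, and the helper name best_time is made up. Absolute numbers will vary by machine and Python version.

```python
import timeit

# Build the input once so only the filtering itself is timed.
SETUP = "urls = [str(n) for n in range(100)]"

FILTER_STMT = "list(filter(lambda url: url.startswith('1'), urls))"
COMPREHENSION_STMT = "[url for url in urls if url.startswith('1')]"

def best_time(stmt, number=10000, repeat=3):
    """Return the best per-loop time in seconds over `repeat` runs."""
    totals = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
    return min(totals) / number

filter_time = best_time(FILTER_STMT)
comprehension_time = best_time(COMPREHENSION_STMT)

print("filter:             %.2f usec per loop" % (filter_time * 1e6))
print("list comprehension: %.2f usec per loop" % (comprehension_time * 1e6))
```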