How do I create a list by splitting list elements in Python?
Suppose I have:
sentences = ['The girls are gorgeous', 'I'm mexican']
I want to get:
words = ['The','girls','are','gorgeous', 'I'm', 'mexican']
I tried:
words = [w.split(' ') for w in sentences]
But it doesn't produce the result I expected.
Will this work with Counter(words) when I need to get the frequencies?

Try it like this:
sentences = ["The girls are gorgeous", "I'm mexican"]
words = [word for sentence in sentences for word in sentence.split(' ')]
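Your approach didn't work because split returns a list, so your code creates a nested list. You need to flatten it before you can use it with Counter. There are many ways to flatten it, for example: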
from itertools import chain
from collections import Counter
Counter(chain.from_iterable(words))
This is the best way to flatten the nested list and find the frequencies. But you can also use a generator expression, like this:
sentences = ['The girls are gorgeous', "I'm mexican"]
from collections import Counter
print(Counter(item for items in sentences for item in items.split()))
# Counter({'The': 1, 'girls': 1, 'are': 1, 'gorgeous': 1, "I'm": 1, 'mexican': 1})
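This takes each sentence, splits it to get a list of words, iterates over those words, and flattens the nested structure.
If you want to find the top 10 words, you can use the most_common method, like this: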
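To make the difference concrete, here is a minimal sketch (assuming Python 3; the names nested, flat_chain and flat_gen are only for illustration) that compares the nested list from the original attempt with the two flattening approaches described above:

```python
from collections import Counter
from itertools import chain

sentences = ["The girls are gorgeous", "I'm mexican"]

# The original attempt: one inner list per sentence, i.e. a nested list.
nested = [s.split(' ') for s in sentences]

# Counter(nested) would raise a TypeError here, because the inner
# lists are unhashable and cannot be used as dictionary keys.

# Option 1: flatten the existing nested list with itertools.chain.
flat_chain = list(chain.from_iterable(nested))

# Option 2: split and flatten in a single comprehension.
flat_gen = [word for s in sentences for word in s.split()]

# Either flat list can be counted directly.
counts = Counter(flat_chain)
print(flat_chain)
print(counts["girls"])
```

Both options should give the same flat list of six words for this input, so which one to use is mostly a matter of taste.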
Counter(item for items in sentences for item in items.split()).most_common(10)
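Since every word in the example above occurs only once, the ranking is easier to see with repeated words. The following is a small sketch with made-up sample sentences (hypothetical data, not from the question), assuming Python 3:

```python
from collections import Counter

# Hypothetical sample data with repeated words, for illustration only.
sentences = ["the cat saw the dog", "the dog ran"]

counts = Counter(word for s in sentences for word in s.split())

# most_common(n) returns up to n (word, count) pairs, highest count first.
top_two = counts.most_common(2)
print(top_two)  # [('the', 3), ('dog', 2)]

# Asking for more entries than there are distinct words returns them all.
top_ten = counts.most_common(10)
print(len(top_ten))  # 5
```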
Try this:
words = ' '.join(sentences).split()
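As a rough sketch of why this works (assuming Python 3; the string messy is a hypothetical input added for illustration): joining the sentences with a space produces one long string, and split() with no argument then breaks it on any run of whitespace. That differs slightly from split(' '), which yields empty strings when two spaces are adjacent:

```python
sentences = ["The girls are gorgeous", "I'm mexican"]

# Join everything into one string, then split on whitespace.
words = ' '.join(sentences).split()
print(words)

# Hypothetical input containing a double space, for illustration only.
messy = "a  b"
by_whitespace = messy.split()   # collapses runs of whitespace
by_space = messy.split(' ')     # keeps an empty string between the two spaces
print(by_whitespace)  # ['a', 'b']
print(by_space)       # ['a', '', 'b']
```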
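'I'm mexican' is invalid syntax. Use "I'm mexican" instead.
What if I want to get the 10 most repeated words right away?
@diegoaguilar You can use something like this: print(Counter(item for items in sentences for item in items.split()).most_common(10))
Nice @thefourye, you should edit your question so it helps more people in the future. Thanks deeply again.
@diegoaguilar You mean the answer, right? ;) I updated it :)
Oh my, yes, the answer. I'm so tired right now that I got confused.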