Python: performing set operations on a list of tuples


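I am trying to find the difference between two containers, but the containers have an odd structure, so I don't know the best way to diff them. I cannot change the type or structure of one container, but I can change the other (the variable delims).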

Here is how I would do it:

delims = set(['on','with','to','and','in','the','from','or'])
# ...
descriptive_words = filter(lambda x: x[0] not in delims, words)

That uses the filter method. A viable alternative is:

delims = set(['on','with','to','and','in','the','from','or'])
# ...
descriptive_words = [(word, count) for word, count in words if word not in delims]

Either way, make sure delims is a set, so that each membership test is an O(1) lookup on average.
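A minimal sketch of both variants side by side, assuming `words` is a list of `(word, count)` tuples (the sample data is made up for illustration). In Python 3, `filter` returns a lazy iterator, so it is wrapped in `list` here to make the two results comparable.

```python
# Hypothetical sample data: a list of (word, count) tuples.
words = [("the", 2), ("a", 9), ("diplomacy", 1), ("on", 3)]
delims = {"on", "with", "to", "and", "in", "the", "from", "or"}

# Variant 1: filter + lambda; in Python 3 this yields a lazy iterator.
filtered = list(filter(lambda pair: pair[0] not in delims, words))

# Variant 2: list comprehension with tuple unpacking.
comprehended = [(word, count) for word, count in words if word not in delims]

print(filtered)      # [('a', 9), ('diplomacy', 1)]
print(comprehended)  # [('a', 9), ('diplomacy', 1)]
```

Both variants preserve the original order of `words`, which a true set difference would not.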

If you are going to iterate over it anyway, why bother converting them to sets?

dwords = [delim[0] for delim in delims]
words  = [word for word in words if word[0] not in dwords]
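If `delims` really is a sequence of tuples, as this answer appears to assume, the same idea can keep fast lookups by collecting the first elements into a set rather than a list. A sketch with made-up data:

```python
# Hypothetical inputs: both containers hold (word, count) tuples.
delims = [("the", 0), ("on", 0), ("and", 0)]
words = [("the", 2), ("a", 9), ("diplomacy", 1)]

# A set comprehension gives average O(1) membership tests,
# whereas a list would be scanned linearly for every word.
dwords = {delim[0] for delim in delims}
kept = [word for word in words if word[0] not in dwords]

print(kept)  # [('a', 9), ('diplomacy', 1)]
```

The loop over `words` still runs once, but each `in` test no longer walks the whole delimiter collection.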

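How about modifying words by deleting all the delimiters from it?

import collections

words = collections.Counter(s.split())
# Deleting a key that is not in a Counter does not raise KeyError.
for delim in delims:
    del words[delim]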
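A runnable version of this approach, using the sample string and delimiter set from the other answers as assumed inputs. It relies on the fact that `Counter` overrides `__delitem__` so that deleting an absent key is a no-op; a plain `dict` would raise `KeyError` instead.

```python
import collections

s = "the a a a a the a a a a a diplomacy"
delims = {"on", "with", "to", "and", "in", "the", "from", "or"}

words = collections.Counter(s.split())
for delim in delims:
    # Counter.__delitem__ silently ignores keys that are absent,
    # so delimiters that never occur in s are harmless here.
    del words[delim]

print(words)  # Counter({'a': 9, 'diplomacy': 1})
```

Note that this mutates `words` in place, so the original counts for the delimiters are lost.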

For performance, you can use a lambda function:
filter(lambda word: word[0] not in delims, words)

The simplest answer is:
import collections

s = "the a a a a the a a a a a diplomacy"
delims = {'on','with','to','and','in','the','from','or'}
# For older versions of Python without set literals:
# delims = set(['on','with','to','and','in','the','from','or'])
words = collections.Counter(s.split())

not_delims = {key: value for (key, value) in words.items() if key not in delims}
# For older versions of Python without dict comprehensions:
# not_delims = dict(((key, value) for (key, value) in words.items() if key not in delims))

This gives us:
{'a': 9, 'diplomacy': 1}

Another option is to filter pre-emptively:
import collections

s = "the a a a a the a a a a a diplomacy"
delims = {'on','with','to','and','in','the','from','or'}
counted_words = collections.Counter(word for word in s.split() if word not in delims)

Here the filter is applied to the list of words before it is handed to the Counter, which gives the same counts.
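Since the question is about set difference, it may be worth noting that dictionary key views are set-like in Python 3, so the difference can be written with the `-` operator directly. This is a sketch of an alternative, not something the answer above proposes:

```python
import collections

s = "the a a a a the a a a a a diplomacy"
delims = {"on", "with", "to", "and", "in", "the", "from", "or"}
words = collections.Counter(s.split())

# Dictionary key views are set-like, so "-" computes a set difference.
kept_keys = words.keys() - delims

# Rebuild a mapping from the surviving keys to their counts.
not_delims = {key: words[key] for key in kept_keys}
print(not_delims == {"a": 9, "diplomacy": 1})  # True
```

The result of `words.keys() - delims` is a plain `set`, so the order of the surviving keys is not guaranteed.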

This looks efficient and I think I will use it, but words is a list of tuples, so how can I write "words[delim]"?

@JakeM - apply it directly to the Counter object.

Ah, I take it words is the Counter object.

@RobYoung Yes, I am trying to avoid iterating over them, for efficiency. Any solution that does not iterate would be best.

I think this is a bad idea. It will be O(n^2), won't it?

The first method uses "in". Doesn't that mean we iterate over the whole of delims on every comparison?

Not if they are sets or dicts. Those give O(1) lookup.

filter + lambda is less readable than a list comprehension, and a list comprehension would do here. Second, since delims is a list, it is still doing O(n^2) work.
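The complexity point raised in the comments can be checked with a rough timing sketch. The workload below is hypothetical, and the numbers will vary by machine; with only eight delimiters the gap may be modest, but it grows with the size of the delimiter collection.

```python
import timeit

delims_list = ["on", "with", "to", "and", "in", "the", "from", "or"]
delims_set = set(delims_list)

# Hypothetical workload: many words, none of which is a delimiter,
# so each list lookup has to scan all of delims_list.
words = [("word%d" % i, i) for i in range(10000)]

def with_list():
    # "in" on a list is a linear scan: O(len(delims_list)) per word.
    return [w for w in words if w[0] not in delims_list]

def with_set():
    # "in" on a set is a hash lookup: O(1) on average per word.
    return [w for w in words if w[0] not in delims_set]

print("list:", timeit.timeit(with_list, number=20))
print("set: ", timeit.timeit(with_set, number=20))
```

Both functions return the same list; only the cost of each membership test differs.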