Python: removing duplicate objects from a list (tags: python, class, duplicates)


I have a waveform object, defined as follows:

class wfm:
    """Class defining a waveform characterized by:
        - A name
        - An electrode configuration
        - An amplitude (mA)
        - A pulse width (microseconds)"""

    def __init__(self, name, config, amp, width=300):
        self.name = name
        self.config = config
        self.amp = amp
        self.width = width

    def __eq__(self, other):
        return (type(other) is self.__class__
                and other.name == self.name
                and other.config == self.config
                and other.amp == self.amp
                and other.width == self.width)

    def __ne__(self, other):
        return not self.__eq__(other)

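After parsing, I end up with a list called waveforms that contains 770 wfm instances. Many of them are duplicates, and I need to remove them.

My idea was to get the IDs (list indices) of equivalent objects, store the largest ID in a list, and then loop over all the waveforms from the end, popping each duplicate.

Code:

The result (thanks to the prints) is that some duplicates never show up in the ID list. For example, 750 is a duplicate of 763 (the print shows it, and I also tested it), but neither ID appears in my list of duplicates.

I'm pretty sure there is a better solution than this approach (which doesn't work yet anyway), and I'd be glad to hear it. Thanks for your help.

Edit: a more complex scenario

I have a more complex situation. I have two classes, wfm (see above) and stim: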

class stim:
    """Class defining the waveform used for a stimulation by:
        - Duration (milliseconds)
        - Frequency (Hz)
        - Pattern of waveforms"""

    def __init__(self, dur, f, pattern):
        self.duration = dur
        self.fq = f
        self.pattern = pattern

    def __eq__(self, other):
        return (type(other) is self.__class__
                and other.duration == self.duration
                and other.fq == self.fq
                and other.pattern == self.pattern)

    def __ne__(self, other):
        return not self.__eq__(other)

I parse my files to fill a dict called paradigm. It looks like this:

paradigm[file name STR] = (list of waveforms, list of stimulations)

# example:
paradigm['myfile.xml'] = ([wfm1, ..., wfm10], [stim1, ..., stim5])

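Once again, I want to remove the duplicates. Two entries count as duplicates when:

the waveforms are the same
and the stim is the same

For example: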

file1 has 10 waveforms and file2 has the same 10 waveforms.
file1 has stim1 and stim2; file2 has stim3, stim4 and stim5.

stim1 and stim3 are the same, so since the waveforms are also the same, I want to keep:
file1: 10 waveforms, stim1 and stim2
file2: 10 waveforms, stim4 and stim5

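This dependency gets a bit tangled in my head, so I'm struggling to find the right way to store the waveforms and stimulations so that they can be compared easily. If you have any ideas, I'd be glad to hear them. Thanks.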
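One possible reading of the example above, sketched under assumptions: a stim is dropped from a file when the same stim has already been kept for a file with the same set of waveforms. The function name drop_repeated_stims and the wfm_key / stim_key parameters are hypothetical; they stand for any function that maps a waveform or stim to a hashable value that is equal for equal objects. Plain strings stand in for the real wfm and stim instances.

```python
def drop_repeated_stims(paradigm, wfm_key, stim_key):
    """Keep each (waveform set, stim) combination only the first time it appears.

    Assumption: waveform order within a file does not matter.
    """
    seen = set()
    result = {}
    for fname, (waveforms, stims) in paradigm.items():
        # Hashable fingerprint of this file's waveforms
        wkey = frozenset(wfm_key(w) for w in waveforms)
        kept = []
        for s in stims:
            combo = (wkey, stim_key(s))
            if combo not in seen:
                seen.add(combo)
                kept.append(s)
        result[fname] = (waveforms, kept)
    return result


# Stand-in data: strings instead of wfm / stim objects
paradigm = {
    'file1': (['wfmA', 'wfmB'], ['stimX', 'stimY']),
    'file2': (['wfmA', 'wfmB'], ['stimX', 'stimZ', 'stimW']),
}
cleaned = drop_repeated_stims(paradigm, wfm_key=lambda w: w, stim_key=lambda s: s)
```

Here file1 keeps both of its stims, while file2 loses stimX because it was already kept for the same waveforms in file1. Files are processed in dict insertion order, so the first file to contain a combination is the one that keeps it.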

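The problem

The .index method uses the __eq__ method you overloaded. So waveforms.index(waveforms[j]) will always find the first instance in the list that has the same attributes as waveforms[j]: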

w1 = wfm('a', {'test_param': 4}, 3, 2.0)
w2 = wfm('b', {'test_param': 4}, 3, 2.0)
w3 = wfm('a', {'test_param': 4}, 3, 2.0)

w1 == w3  # True
w2 == w3  # False

waveforms = [w1, w2, w3]
waveforms.index(waveforms[2]) == waveforms.index(waveforms[0]) == 0  # True
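If the positions of the duplicates are really needed, one option is to take them from enumerate rather than from .index. The following is only a sketch: duplicate_positions is a hypothetical helper, and plain dicts are used as stand-ins for wfm instances because they also compare by value.

```python
def duplicate_positions(items):
    """Return the indices of items that are equal to an earlier item."""
    positions = []
    for j, item in enumerate(items):
        # j comes from enumerate, so it is the real position of this item
        if any(item == earlier for earlier in items[:j]):
            positions.append(j)
    return positions


items = [{'name': 'a'}, {'name': 'b'}, {'name': 'a'}]
first_match = items.index(items[2])   # 0: .index stops at the first equal item
dupes = duplicate_positions(items)    # [2]
```

This version still compares every pair, so it is quadratic in the length of the list.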

Solution

Immutable

If you do this immutably (by building a new list), you don't need to store the list indices:

key = lambda w: hash(str(vars(w)))
dupes = set()
unique = [dupes.add(key(w)) or w for w in waveforms if key(w) not in dupes]

unique == [w1, w2]  # True
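The key above depends on the string representation of the attributes. An alternative sketch, assuming config is a flat dict whose values are hashable, is to give the class a __hash__ that is consistent with __eq__. HashableWfm, _key and unique_in_order are hypothetical names used only for this illustration.

```python
class HashableWfm:
    """Variant of wfm whose instances can be stored in sets and dict keys."""

    def __init__(self, name, config, amp, width=300):
        self.name = name
        self.config = config
        self.amp = amp
        self.width = width

    def _key(self):
        # Assumption: config is a flat dict with hashable values
        return (self.name, frozenset(self.config.items()), self.amp, self.width)

    def __eq__(self, other):
        return type(other) is type(self) and other._key() == self._key()

    def __hash__(self):
        return hash(self._key())


def unique_in_order(items):
    # dict keys keep insertion order (Python 3.7+), so first occurrences win
    return list(dict.fromkeys(items))


h1 = HashableWfm('a', {'test_param': 4}, 3, 2.0)
h2 = HashableWfm('b', {'test_param': 4}, 3, 2.0)
h3 = HashableWfm('a', {'test_param': 4}, 3, 2.0)
deduped = unique_in_order([h1, h2, h3])
```

Instances should not be mutated while they are stored in a set or dict, since their hash would then change.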

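Mutable

To modify the list in place instead, record the indices of the duplicates and pop them starting from the end of the list.

Big O analysis

When writing algorithms, it is good practice to consider Big O complexity (although optimization should come at the expense of readability only when it is needed). In this case, these solutions are more readable and also have better Big O complexity.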

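Because of the double for loop, the original solution is O(n^2).

Both provided solutions find the duplicates in O(n). Note that in the mutable version each waveforms.pop(idx) also shifts the elements after idx, so removing many duplicates can cost more than O(n) overall.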
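The difference can be sketched with two generic helpers. Both names are hypothetical, and plain dicts stand in for the waveform objects. The first relies only on ==, so each membership test scans the list; the second relies on a hashable key, so each membership test is a set lookup (constant time on average).

```python
def dedupe_quadratic(items):
    """O(n^2): 'item not in unique' compares against every kept item."""
    unique = []
    for item in items:
        if item not in unique:
            unique.append(item)
    return unique


def dedupe_linear(items, key):
    """O(n) on average: one set lookup per item."""
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique


data = [{'n': i % 5} for i in range(20)]
slow = dedupe_quadratic(data)
fast = dedupe_linear(data, key=lambda d: tuple(sorted(d.items())))
```

Both calls return the same five dicts in order of first appearance; only the amount of work differs as the list grows.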

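This is unrelated to your question, but I think you wanted width to be a default argument and put the value in the wrong place, right?

If you used a namedtuple instead of a class, removing duplicates might be easier: WFM = namedtuple('WFM', ['name', 'config', 'amp', 'width']). Calling set() on a list of WFM instances will then remove all duplicates.

Yes, thanks for the comment about width... a small mistake on my part, since it is constant most of the time. I did find some posts that looked similar, but I couldn't solve my problem with them, which is why I posted this question. Thanks for your answer, Alex, I'll give it a try.

@Matthew Did that solve the problem?

Building on what you did, a very similar problem has come up and I can't solve it... Would you mind taking a look at the edit?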
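The namedtuple suggestion from the comments can be sketched as follows. One caveat: set() needs every field to be hashable, so a dict config (as in the answer's example) would raise TypeError. The sketch therefore assumes config is stored as a tuple of (key, value) pairs, which is not how the original code stores it.

```python
from collections import namedtuple

WFM = namedtuple('WFM', ['name', 'config', 'amp', 'width'])

# Assumption: config is a tuple of pairs so that the whole record is hashable
n1 = WFM('a', (('test_param', 4),), 3, 2.0)
n2 = WFM('b', (('test_param', 4),), 3, 2.0)
n3 = WFM('a', (('test_param', 4),), 3, 2.0)

unique_set = set([n1, n2, n3])                      # order is not preserved
unique_list = list(dict.fromkeys([n1, n2, n3]))     # keeps first-seen order
```

namedtuple instances compare and hash by their field values, which is why no custom __eq__ is needed here.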
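# Mutable: remove the duplicates from waveforms in place
key = lambda w: hash(str(vars(w)))
seen = set()
# idxs[i] is i when waveforms[i] repeats an earlier waveform, else None
# (seen.add() returns None)
idxs = [i if key(w) in seen else seen.add(key(w)) for i, w in enumerate(waveforms)]

# Pop from the end so the remaining indices stay valid; filter(None, ...)
# drops the None entries (index 0 is never a duplicate, so nothing is lost)
for idx in filter(None, idxs[::-1]):
    waveforms.pop(idx)

waveforms == [w1, w2]  # True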