Algorithm: How do I compute the delta (inserted/deleted/moved indexes) of two lists?


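Assuming I have two lists of objects that have unique IDs and an attribute that determines their order, how can I efficiently get the delta indexes (which indexes were inserted, which were deleted, and which were moved)?

Example of input: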

let before: [(id: String, timestamp: String)] = [
    ("A", "2015-06-04T12:38:09Z"),
    ("B", "2015-06-04T10:12:45Z"),
    ("C", "2015-06-04T08:39:55Z"),
    ("D", "2015-06-03T23:58:32Z"),
    ("E", "2015-06-01T00:05:51Z"),
]

let after: [(id: String, timestamp: String)] = [
    ("F", "2015-06-04T16:13:01Z"),
    ("C", "2015-06-04T15:10:29Z"),
    ("A", "2015-06-04T12:38:09Z"),
    ("B", "2015-06-04T10:12:45Z"),
]

let delta = deltaFn(before, after)

Here is a visual representation of the above:

BEFORE                                   AFTER
+-------+----+----------------------+    +-------+----+----------------------+
| index | id | timestamp            |    | index | id | timestamp            |
+-------+----+----------------------+    +-------+----+----------------------+
|     0 |  A | 2015-06-04T12:38:09Z |    |     0 |  F | 2015-06-04T16:13:01Z |
|     1 |  B | 2015-06-04T10:12:45Z |    |     1 |  C | 2015-06-04T15:10:29Z |
|     2 |  C | 2015-06-04T08:39:55Z |    |     2 |  A | 2015-06-04T12:38:09Z |
|     3 |  D | 2015-06-03T23:58:32Z |    |     3 |  B | 2015-06-04T10:12:45Z |
|     4 |  E | 2015-06-01T00:05:51Z |    |     - |    |                      |
+-------+----+----------------------+    +-------+----+----------------------+
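
Expected result (delta):

Inserted indexes: [0]
Deleted indexes: [3, 4]
Moved indexes: (from: 0, to: 2), (from: 1, to: 3), (from: 2, to: 1)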


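It can be solved by using two maps, each mapping an element's ID to its index in one of the lists, and comparing them.

The time complexity is O(n) with hash maps and O(n log n) with tree-based maps.

Pseudocode: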

map1 = empty map
map2 = empty map
for each element x with index i in before:
    map1.insert(x,i)
for each element x with index i in after:
    map2.insert(x,i)

// Find moved and deleted:
for each key x in map1:
   id1 = map1.get(x)
   id2 = map2.get(x)
   if id2 == nil:
       add id1 to "deleted indexes"
   else:
       if id1 != id2:
           add (id1,id2) to "moved indexes"
       // Remove every matched key, even when the index is unchanged,
       // so that only new elements remain in map2.
       map2.delete(x)
// Find new indexes:
for each key x in map2:
    add map2.get(x) to "inserted indexes"
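
As a rough illustration, the pseudocode above could be written in current Swift along these lines. The function name deltaUsingTwoMaps and the choice to pass plain arrays of ids (for example before.map { $0.id }) are assumptions made for this sketch, and the results are sorted only because dictionary iteration order is unspecified.

```swift
// Sketch only: ids are assumed to be unique within each list, as the question states.
func deltaUsingTwoMaps(_ beforeIds: [String], _ afterIds: [String])
    -> (inserted: [Int], deleted: [Int], moved: [(from: Int, to: Int)]) {
    // Build id -> index lookups for both lists.
    var oldIndex: [String: Int] = [:]
    var newIndex: [String: Int] = [:]
    for (i, id) in beforeIds.enumerated() { oldIndex[id] = i }
    for (i, id) in afterIds.enumerated() { newIndex[id] = i }

    var deleted: [Int] = []
    var moved: [(from: Int, to: Int)] = []
    for (id, from) in oldIndex {
        if let to = newIndex[id] {
            if from != to { moved.append((from: from, to: to)) }
            // Drop every matched id so that only new ids remain in newIndex.
            newIndex[id] = nil
        } else {
            deleted.append(from)
        }
    }

    // Sort for a deterministic result; dictionaries do not preserve order.
    let inserted = newIndex.values.sorted()
    deleted.sort()
    moved.sort { $0.from < $1.from }
    return (inserted, deleted, moved)
}
```

With the question's data, deltaUsingTwoMaps(["A", "B", "C", "D", "E"], ["F", "C", "A", "B"]) reports index 0 as inserted, indexes 3 and 4 as deleted, and the moves 0 to 2, 1 to 3 and 2 to 1.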
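
Edit: (suggested in comments)

You can reduce the memory usage to O(min{m,n}) and, for a tree-based map, the time to O(max{m,n} log(min{m,n})), where m,n are the sizes of the two lists, by mapping only the smaller list and then iterating over the array that was not mapped rather than over the map.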

map = empty map
for each element x with index i in smaller list:
    map.insert(x,i)

for each element x with index i1 in larger list:
   i2 = map.get(x)
   if i2 != nil:   // index 0 is a valid match, so compare against nil
       if i1 != i2:
           add (i2, i1) to "moved indexes" if smaller list is before
           add (i1, i2) to "moved indexes" if smaller list is after
       map.delete(x)
   else:
       add i1 to "inserted indexes" if smaller list is before
       add i1 to "deleted indexes" if smaller list is after

// Whatever is left in the map exists only in the smaller list:
for each key x in map:
    add map.get(x) to "deleted indexes" if smaller list is before
    add map.get(x) to "inserted indexes" if smaller list is after
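
The variant above, which maps only the smaller list, might look roughly like this in current Swift. The name deltaMappingSmallerList and the id-array parameters are hypothetical; Dictionary.removeValue(forKey:) is used to look up and delete a key in one step.

```swift
// Sketch only: builds a dictionary for the shorter list and scans the longer one.
func deltaMappingSmallerList(_ beforeIds: [String], _ afterIds: [String])
    -> (inserted: [Int], deleted: [Int], moved: [(from: Int, to: Int)]) {
    let smallerIsBefore = beforeIds.count <= afterIds.count
    let (smaller, larger) = smallerIsBefore ? (beforeIds, afterIds) : (afterIds, beforeIds)

    var indexInSmaller: [String: Int] = [:]
    for (i, id) in smaller.enumerated() { indexInSmaller[id] = i }

    var inserted: [Int] = []
    var deleted: [Int] = []
    var moved: [(from: Int, to: Int)] = []

    for (i1, id) in larger.enumerated() {
        if let i2 = indexInSmaller.removeValue(forKey: id) {
            // Present in both lists; orient the pair as (before index, after index).
            if i1 != i2 {
                moved.append(smallerIsBefore ? (from: i2, to: i1) : (from: i1, to: i2))
            }
        } else if smallerIsBefore {
            inserted.append(i1)   // only in the larger "after" list
        } else {
            deleted.append(i1)    // only in the larger "before" list
        }
    }

    // Ids still in the dictionary exist only in the smaller list.
    let leftovers = indexInSmaller.values.sorted()
    if smallerIsBefore { deleted += leftovers } else { inserted += leftovers }
    return (inserted, deleted, moved)
}
```

Because the larger list is scanned in array order, the moved and inserted/deleted indexes from that list come out already ordered; only the leftovers from the dictionary need sorting.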
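
A possible solution (similar to @amit's answer, but using only a single map), with optionals indicating the absence of an old or new index, as suggested by @doisk: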

// A dictionary mapping each id to a pair
//    ( oldIndex, newIndex )
// where oldIndex = nil for inserted elements
// and newIndex = nil for deleted elements.
var map : [ String : (from: Int?, to: Int?)] = [:]

// Add [ id : (from, nil) ] for each id in before:
for (idx, elem) in enumerate(before) {
    map[elem.id] = (from: idx, to: nil)
}

// Update [ id : (from, to) ] or add [ id : (nil, to) ] for each id in after:
for (idx, elem) in enumerate(after) {
    map[elem.id] = (map[elem.id]?.from, idx)
}

// Compare:
var insertedIndices : [Int] = []
var deletedIndices : [Int] = []
var movedIndices : [(from: Int, to: Int)] = []

for pair in map.values {
    switch pair {
    case (let .Some(fromIdx), let .Some(toIdx)):
        // Only report a move when the index actually changed.
        if fromIdx != toIdx {
            movedIndices.append(from: fromIdx, to: toIdx)
        }
    case (let .Some(fromIdx), .None):
        deletedIndices.append(fromIdx)
    case (.None, let .Some(toIdx)):
        insertedIndices.append(toIdx)
    default:
        fatalError("Oops") // This should not happen!
    }
}
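
The snippet above uses pre-Swift 2 syntax (the free function enumerate and the .Some/.None case names). As a hedged sketch, the same single-map idea in current Swift might look like the following; the function name deltaUsingOneMap and the id-array parameters are assumptions, and the sorting is there only because dictionary order is unspecified.

```swift
// Sketch only: one dictionary from id to (old index, new index), both optional.
func deltaUsingOneMap(_ beforeIds: [String], _ afterIds: [String])
    -> (inserted: [Int], deleted: [Int], moved: [(from: Int, to: Int)]) {
    var map: [String: (from: Int?, to: Int?)] = [:]

    // Record the old index of every id in "before".
    for (idx, id) in beforeIds.enumerated() {
        map[id] = (from: idx, to: nil)
    }
    // Fill in the new index, keeping the old one if the id was already seen.
    for (idx, id) in afterIds.enumerated() {
        map[id] = (from: map[id]?.from, to: idx)
    }

    var inserted: [Int] = []
    var deleted: [Int] = []
    var moved: [(from: Int, to: Int)] = []
    for pair in map.values {
        switch pair {
        case (.some(let oldIdx), .some(let newIdx)):
            if oldIdx != newIdx { moved.append((from: oldIdx, to: newIdx)) }
        case (.some(let oldIdx), .none):
            deleted.append(oldIdx)
        case (.none, .some(let newIdx)):
            inserted.append(newIdx)
        case (.none, .none):
            break // Cannot occur: every entry is created with at least one index.
        }
    }

    inserted.sort()
    deleted.sort()
    moved.sort { $0.from < $1.from }
    return (inserted, deleted, moved)
}
```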

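My solution (the reduce-based deltaFn shown further down) does not use a map. Its computational complexity is O(n * m), where
n: elements in before
m: elements in after

I'm afraid this is not the best solution available... however, here it is :)

Here's what I managed: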

var map: [String : (bef: Int?, aft: Int?)] = [:]

for (idx, (bef, aft)) in zipWithPadding(before, after).enumerate()
  where bef?.id != aft?.id {
  bef.map{map[$0.id] = (idx, map[$0.id]?.aft)}
  aft.map{map[$0.id] = (map[$0.id]?.bef, idx)}
}

for (val, id) in map {
  switch id {
  case (_, nil):  print("\(val): del at \(id.bef!)")
  case (nil, _):  print("\(val): ins at \(id.aft!)")
  default:        print("\(val): mov from \(id.bef!) to \(id.aft!)")
  }
}

//D: del at 3
//E: del at 4
//F: ins at 0
//B: mov from 1 to 3
//A: mov from 0 to 2
//C: mov from 2 to 1
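
This method is pretty much the same as the other map answers, except that it has one fewer loop and it skips elements that sit at the same index in both arrays. The map here is a dictionary from strings (the ids in your arrays) to tuples. Each tuple holds two Ints: the index of a given id in the first array and the index of that same id in the second array. The Ints are optional, and this is how we work out what happened to each id. If the first is nil and the second is not, the id was inserted. If the second is nil, it was deleted. And if neither Int is nil, the id has been moved from the first index to the second.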

The map is filled by looping over the output of the zipWithPadding function, which is here:

func zipWithPadding <
  S0: SequenceType, S1: SequenceType, E0, E1 where
  S0.Generator.Element == E0, S1.Generator.Element == E1
  > (s0: S0, _ s1: S1) -> AnyGenerator<(E0?, E1?)> {

    var (g0, g1) :
    (S0.Generator?, S1.Generator?) =
    (s0.generate(), s1.generate())

    return anyGenerator {
      let e0: E0? = g0?.next() ?? {g0 = nil; return nil}()
      let e1: E1? = g1?.next() ?? {g1 = nil; return nil}()
      return (e0 != nil || e1 != nil) ? (e0, e1) : nil
    }
}
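
Because there is no guarantee that a generator keeps returning nil after it has returned nil once (a generator returns nil to signal that it is finished), you can't just keep calling the same generator for your tuple values. That is why each generator is itself set to nil once it returns nil: that way it is never called again.

However, array generators do seem to keep returning nil after the last value. So, if you don't mind relying on undefined behaviour: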

func zipWithPadding <
  S0: SequenceType, S1: SequenceType, E0, E1 where
  S0.Generator.Element == E0, S1.Generator.Element == E1
  > (s0: S0, _ s1: S1) -> AnyGenerator<(E0?, E1?)> {

    var (g0, g1) = (s0.generate(), s1.generate())

    return anyGenerator {
      let (e0, e1) = (g0.next(), g1.next())
      return e0 != nil || e1 != nil ? (e0, e1) : nil
    }
}

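And it seems to work! In some basic testing, this zipWithPadding function ran quite quickly. It appeared to be faster than two for loops, even when the two lists have no elements in common.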
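
For comparison, here is a rough sketch of the same padded zip in current Swift, where GeneratorType has become IteratorProtocol and anyGenerator has become AnyIterator. As far as I can tell, IteratorProtocol now documents that next() keeps returning nil once a sequence is exhausted, so the shorter form should no longer depend on undefined behaviour; treat that as an assumption to verify for custom iterators.

```swift
// Sketch only: pairs up two sequences, padding the shorter one with nil.
func zipWithPadding<S0: Sequence, S1: Sequence>(_ s0: S0, _ s1: S1)
    -> AnyIterator<(S0.Element?, S1.Element?)> {
    var it0 = s0.makeIterator()
    var it1 = s1.makeIterator()
    return AnyIterator {
        // Advance both iterators on every call; an exhausted one yields nil.
        let e0 = it0.next()
        let e1 = it1.next()
        // Stop only when both sequences have run out.
        return (e0 != nil || e1 != nil) ? (e0, e1) : nil
    }
}
```

AnyIterator is also a Sequence, so the result can be used directly in a for-in loop or passed to Array(...).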

Here is my example:

func deltaFn(before: [(id: String, timestamp: String)], after: [(id: String, timestamp: String)] ) -> ([Int], [Int], [String]) {

    // Get arrays of just the ids...
    let beforeIds = before.map { $0.id }
    let afterIds = after.map { $0.id }

    // Get the inserted and moved indexes...
    let (inserted, moved) = reduce(0..<afterIds.count, (inserted: [Int](), moved: [String]())) { 

        (var changes, index) -> ([Int], [String]) in

        if let beforeIndex = find(beforeIds, afterIds[index])  {
            if beforeIndex != index {
                changes.moved.append("(from: \(beforeIndex), to: \(index))")
            }
        } else {
            changes.inserted.append(index)
        }
        return changes
    }

    // Get the deleted indexes...
    let deleted = reduce(0..<beforeIds.count, [Int]()) { deleted, index in
        return contains(afterIds, beforeIds[index])
            ? deleted
            : deleted + [index]
    }

    // Return them all as a tuple...
    return (inserted, deleted, moved)
}

let (inserted, deleted, moved) = deltaFn(before, after)

println("Inserted: \(inserted)")  // Inserted: [0]
println("Deleted: \(deleted)")    // Deleted: [3, 4]
println("Moved: \(moved)")        // Moved: [(from: 2, to: 1), (from: 0, to: 2), (from: 1, to: 3)]
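
In Swift 2, where reduce is a method rather than a free function, the call reduce(0..<afterIds.count, (inserted: [Int](), moved: [String]())) becomes: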

(0..<afterIds.count).reduce((inserted: [Int](), moved: [String]()))
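
For readers on a current toolchain, the quadratic approach above might be sketched roughly as follows. The free functions find and contains no longer exist, so this sketch uses Array.firstIndex(of:) and contains(_:) instead; the name deltaQuadratic and the id-array parameters are hypothetical, and moves are returned as tuples rather than strings.

```swift
// Sketch only: O(n * m), because every lookup is a linear scan of the other array.
func deltaQuadratic(_ beforeIds: [String], _ afterIds: [String])
    -> (inserted: [Int], deleted: [Int], moved: [(from: Int, to: Int)]) {
    var inserted: [Int] = []
    var moved: [(from: Int, to: Int)] = []

    for (index, id) in afterIds.enumerated() {
        if let beforeIndex = beforeIds.firstIndex(of: id) {
            // Found in "before": it is a move only if the index changed.
            if beforeIndex != index {
                moved.append((from: beforeIndex, to: index))
            }
        } else {
            inserted.append(index)
        }
    }

    // An index is deleted when its id no longer appears in "after".
    let deleted = beforeIds.indices.filter { !afterIds.contains(beforeIds[$0]) }
    return (inserted, deleted, moved)
}
```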

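Does the timestamp of a given id always stay the same?

No, it's the value that determines the order, so it has to change for the order of the items to change. In the example above, C's timestamp changes, which moves it above A and B.

@Blixt Would you mind telling us what you found unsatisfying in the current solutions, and what you expect from a solution? The current solutions include an algorithmic approach with complexity analysis (mine, for example) as well as code (@MartinR's, for example). What exactly are you looking for?

Would optionals for the (from, to) tuple be more efficient? Then you could write the second for loop like this:
map[elem.id]=(map[elem.id]?.from,idx)

@doisk: That's a good suggestion. I had actually prepared a version with optionals as well, but dropped it because the switch statement was a bit "ugly". I did not think of this simplified update statement, though. I will add it as an alternative.

I think it can be done with only one map. It should be enough to convert just the "before" list into a map. What do you think?

@GentianKasa Thanks for your comment. If you load only "before" (or only "after", arbitrarily one of them), it makes no real difference (asymptotically). However, if you are smart about it and load only the shorter list, you reduce the memory usage to
O(min{m,n})
, as well as the time for tree-based maps. See the edit.

@amit I tried to follow your heuristic (I updated it, since it said
if id2 == nil then add id2…
, which I think was wrong), but I'm not getting the right results. Please have a look at this example and let me know if you have any suggestions:

@Blixt Yes, you're correct - even for the same index you need to remove the item from the list. Did that solve your problem?

@amit I think so, thanks!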
Array(zipWithPadding([1, 2, 3], [1, 2]))
//[({Some 1}, {Some 1}), ({Some 2}, {Some 2}), ({Some 3}, nil)]
struct PaddedZipGenerator<G0: GeneratorType, G1: GeneratorType> : GeneratorType {

  typealias E0 = G0.Element
  typealias E1 = G1.Element

  typealias Element = (E0?, E1?)

  private var (g0, g1): (G0?, G1?)

  mutating func next() -> PaddedZipGenerator.Element? {
    let e0: E0? = g0?.next() ?? {g0 = nil; return nil}()
    let e1: E1? = g1?.next() ?? {g1 = nil; return nil}()
    return (e0 != nil || e1 != nil) ? (e0, e1) : nil
  }
}

struct PaddedZip<S0: SequenceType, S1: SequenceType> : SequenceType {

  typealias Generator = PaddedZipGenerator<S0.Generator, S1.Generator>

  private let (s0, s1): (S0, S1)

  func generate() -> PaddedZip.Generator {
    return PaddedZipGenerator(g0: s0.generate(), g1: s1.generate())
  }
}

func zipWithPadding<S0: SequenceType, S1: SequenceType>(s0: S0, _ s1: S1) -> PaddedZip<S0, S1> {
  return PaddedZip(s0: s0, s1: s1)
}