Java: Replace duplicates with unique numbers from 0-(N-1)

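Background: I have an N-length array of positive random numbers that is certain to contain duplicates, e.g. 10,4,5,7,10,9,10,9,8,10,5
Edit: N is likely to be 32, or some other power of two of about that size.

The problem: I am trying to find the fastest way to replace the duplicates with the missing numbers from 0-(N-1). Using the example above, I want a result that looks like this:
10,4,5,7,0,9,1,2,8,3,6
The goal is to end up with exactly one of each number from 0 to N-1, without simply replacing all the numbers with 0-(N-1) (the random order is important).
Edit: It is also important that this replacement is deterministic, i.e. the same input always produces the same output (not random).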
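
The requirements above can be made concrete as a small checker. This is only a sketch: the class name ReplacementCheck and the method isValid are made up for illustration, and the rule that the first occurrence of each value keeps its position is an assumption read off the example, not something the question states explicitly.

```java
class ReplacementCheck {
    /**
     * Returns true if output is a permutation of 0..N-1 that keeps the first
     * occurrence of every in-range input value at its original position.
     */
    static boolean isValid(int[] input, int[] output) {
        int n = input.length;
        if (output.length != n) return false;

        // Every value 0..n-1 must appear exactly once in the output.
        boolean[] seenOut = new boolean[n];
        for (int v : output) {
            if (v < 0 || v >= n || seenOut[v]) return false;
            seenOut[v] = true;
        }

        // The first occurrence of each input value must be left untouched.
        boolean[] seenIn = new boolean[n];
        for (int i = 0; i < n; i++) {
            int v = input[i];
            if (v < 0 || v >= n) continue; // out-of-range inputs have to be replaced anyway
            if (!seenIn[v]) {
                seenIn[v] = true;
                if (output[i] != v) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] in = {10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5};
        int[] out = {10, 4, 5, 7, 0, 9, 1, 2, 8, 3, 6};
        System.out.println(isValid(in, out)); // true
    }
}
```

A checker like this is handy for comparing the different answers against each other, since several replacement orders can satisfy the stated goal.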

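My solution: Currently implemented in Java, it uses two boolean arrays to keep track of used/unused numbers (unique numbers/missing numbers in the range [0, N)), and has an approximate worst-case runtime of N + N*sqrt(N).
The code follows: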

public byte[] uniqueify(byte[] input)
{
    final int N = input.length; // assumption: N is the length of the input array
    boolean[] usedNumbers = new boolean[N];
    boolean[] unusedIndices = new boolean[N];
    byte[] result = new byte[N];

    for(int i = 0; i < N; i++) // first pass through
    {
        int newIdx = (input[i] + 128) % N; // first make positive
        if(!usedNumbers[newIdx]) // if this number has not been used
        {
            usedNumbers[newIdx] = true; // mark as used
            result[i] = (byte) newIdx; // save it in the result
        }
        else // if the number is used
        {
            unusedIndices[i] = true; // add it to the list of duplicates
        }
    }

    // handle all the duplicates
    for(int idx = 0; idx < N; idx++) // iterate through all positions
    {
        if(unusedIndices[idx]) // if this position held a duplicate
            for(int i = 0; i < N; i++) // go through all numbers again
            {
                if(!usedNumbers[i]) // if this number is still unused
                {
                    usedNumbers[i] = true; // mark as used
                    result[idx] = (byte) i; // write the unused number into the duplicate's position
                    break;
                }
            }
    }
    return result;
}


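This seems about as fast as I can hope for, but I thought I would ask the Internet, because there are people far cleverer than I am who may have a better solution.
Note: suggestions/solutions do not have to be in Java.
Thank you.
Edit: I forgot to mention that I am converting this to C++. I posted my Java implementation because it is more complete.

Use a binary search tree (BST) to keep track of used/unused numbers instead of a boolean array. Then your running time will be n log n.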
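The most straightforward solution would be:
1. Go through the list and build the "unused" BST.
2. Go through the list again, keeping track of the numbers seen so far in a "used" BST.
3. If a duplicate is found, replace it with a random element of the "unused" BST.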
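
The three steps above could be sketched in Java roughly as follows, using java.util.TreeSet (a red-black tree) as the BST. Two things here are assumptions rather than part of the answer: the class name BstUniqueify is invented, and the sketch takes the smallest unused value instead of a random one so that the result is deterministic, as the question requires. It also assumes every input value lies in 0..N-1.

```java
import java.util.Arrays;
import java.util.TreeSet;

class BstUniqueify {
    // Assumes every input value lies in 0..input.length-1.
    static int[] uniqueify(int[] input) {
        int n = input.length;

        // First pass: the "unused" tree is 0..n-1 minus the values present.
        TreeSet<Integer> unused = new TreeSet<>();
        for (int v = 0; v < n; v++) unused.add(v);
        for (int v : input) unused.remove(v);

        // Second pass: the "used" tree records values seen so far.
        TreeSet<Integer> used = new TreeSet<>();
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            if (used.add(input[i])) {
                result[i] = input[i];           // first occurrence: keep it
            } else {
                result[i] = unused.pollFirst(); // duplicate: take the smallest unused value
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] out = uniqueify(new int[] {10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5});
        System.out.println(Arrays.toString(out)); // [10, 4, 5, 7, 0, 9, 1, 2, 8, 3, 6]
    }
}
```

Each tree operation costs O(log n), which is where the n log n bound comes from.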
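My approach would be:
1. Copy the array to a Set in Java.
A Set automatically removes duplicates with the best possible complexity (Sun Microsystems implemented it, and their approach is generally the fastest, e.g. the use of TimSort for sorting).
2. Calculate the size() of the set.
The size gives you the count without duplicates.
3. Now copy the array 0 to n-1 into the same set; the missing values will be inserted.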
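
A minimal sketch of these steps, assuming a LinkedHashSet is used so that insertion order is kept (the answer does not say which Set implementation it has in mind, and the names SetFill and fill are invented here). Note what the sketch produces: the distinct values in order of first appearance followed by the missing values, so the replacements land at the end rather than at the positions of the duplicates.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class SetFill {
    // Assumes every input value lies in 0..input.length-1.
    static int[] fill(int[] input) {
        // Step 1: copying into a set drops the duplicates.
        Set<Integer> set = new LinkedHashSet<>();
        for (int v : input) set.add(v);

        // Step 2: the size is now the number of distinct values.
        int distinct = set.size();
        System.out.println("distinct values: " + distinct);

        // Step 3: adding 0..n-1 inserts only the values that were missing.
        for (int v = 0; v < input.length; v++) set.add(v);

        int[] result = new int[set.size()];
        int k = 0;
        for (int v : set) result[k++] = v;
        return result;
    }

    public static void main(String[] args) {
        int[] out = fill(new int[] {10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5});
        System.out.println(Arrays.toString(out)); // [10, 4, 5, 7, 9, 8, 0, 1, 2, 3, 6]
    }
}
```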
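I think it is even possible with running time n. The idea is to keep track of the items used in the original list and of the additional items used during processing in separate arrays. A possible Java implementation could look like this: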
int[] list = { 10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5 };

boolean[] used = new boolean[list.length];
for (int i : list) {
    used[i] = true;
}

boolean[] done = new boolean[list.length];
int nextUnused = 0;

Arrays.fill(done, false);

for (int idx = 0; idx < list.length; idx++) {
    if (done[list[idx]]) {
        list[idx] = nextUnused;
    }
    done[list[idx]] = true;
    while (nextUnused < list.length && (done[nextUnused] || used[nextUnused])) {
        nextUnused++;
    }
}

System.out.println(Arrays.toString(list));

Here is how I would write it:
public static int[] uniqueify(int... input) {
    Set<Integer> unused = new HashSet<>();
    for (int j = 0; j < input.length; j++) unused.add(j);
    for (int i : input) unused.remove(i);
    Iterator<Integer> iter = unused.iterator();
    Set<Integer> unique = new LinkedHashSet<>();
    for (int i : input)
        if (!unique.add(i))
            unique.add(iter.next());
    int[] result = new int[input.length];
    int k = 0;
    for (int i : unique) result[k++] = i;
    return result;
}

public static void main(String... args) {
    System.out.println(Arrays.toString(uniqueify(10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5)));
}

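This should run in about 2n. The list operations are constant time, and even though the second loop looks nested, the outer loop runs significantly fewer than n iterations, while the inner loop runs only n times in total.
C#, but it should be easy to convert to Java. O(n)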

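Edit: I tried to convert it from C# to Java. I do not have Java here, so it may not compile, but it should be easy to fix. The arrays may need to be initialized to false if Java does not do that automatically.

The fastest way to do this is probably the most straightforward one. I would take a pass through the list of data, keeping a count of each distinct value and marking where duplicates appeared. Then ...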
[10, 4, 5, 7, 0, 9, 1, 2, 8, 3, 6]

List<Integer> needsReplaced = new LinkedList<Integer>();
boolean[] seen = new boolean[input.length];

for (int i = 0; i < input.length; ++i) {
    if (seen[input[i]]) {
        needsReplaced.add(i);
    } else {
        seen[input[i]] = true;
    }

}

int replaceWith = 0;
for (int i : needsReplaced) {
    while (seen[replaceWith]) {
        ++replaceWith;
    }
    input[i] = replaceWith++;
}

        int[] list = { 0, 0, 6, 0, 5, 0, 4, 0, 1, 2, 3 };
        int N = list.length;

        boolean[] InList = new boolean[N];
        boolean[] Used = new boolean[N];
        int[] Unused = new int[N];

        for (int i = 0; i < N; i++) InList[list[i]] = true;
        for (int i = 0, j = 0; i < N; i++) 
            if (InList[i] == false)
                Unused[j++] = i;

        int UnusedIndex = 0;
        for (int i = 0; i < N; i++)
        {
            if (Used[list[i]] == true)
                list[i] = Unused[UnusedIndex++];
            Used[list[i]] = true;
        }

#include <iostream>
#include <cstring>

using namespace std;

int main()
{
  int data[] = { 10, 4, 5, 7, 10, 9, 10, 9, 8, 10, 5 };
  const int N = sizeof(data) / sizeof(data[0]); // const, so the arrays below are fixed-size rather than variable-length arrays

  int tally[N];
  memset(tally, 0, sizeof(tally));

  int dup_indices[N];
  int ndups = 0;

  // Build a count of each value and a list of indices of duplicate data
  for (int i = 0; i < N; i++) {
    if (tally[data[i]]++) {
      dup_indices[ndups++] = i;
    }
  }

  // Replace each duplicate with the next value having a zero count
  int t = 0;
  for (int i = 0; i < ndups; i++) {
    while (tally[t]) t++;
    data[dup_indices[i]] = t++;
  }

  for (int i = 0; i < N; i++) {
    cout << data[i] << " ";
  }

  return 0;
}

10 4 5 7 0 9 1 2 8 3 6