Replace a repeated number in a file with random numbers (String, Bash, Sed)

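I want to use sed to replace the number that appears on every line of a file with a random number. For example, if every line of my file contains the number 892, I want to replace each occurrence with a unique random number between 800 and 900.

Input file: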

temp11;djaxfile11;892  
temp12;djaxfile11;892  
temp13;djaxfile11;892  
temp14;djaxfile11;892  
temp15;djaxfile11;892

Expected output file:

temp11;djaxfile11;805  
temp12;djaxfile11;846  
temp13;djaxfile11;833  
temp14;djaxfile11;881  
temp15;djaxfile11;810

I am trying the following:

sed -i -- "s/;892/;`echo $RANDOM % 100 + 800 | bc`/g" file.txt

However, it replaces every occurrence of 892 with the same single random number between 800 and 900.

Output file:

temp11;djaxfile11;821  
temp12;djaxfile11;821  
temp13;djaxfile11;821  
temp14;djaxfile11;821  
temp15;djaxfile11;821

Could you please help me fix my code? Thanks in advance.

With GNU sed, you can do something like this:

sed '/;892$/ { h; s/.*/echo $((RANDOM % 100 + 800))/e; x; G; s/892\n// }' filename

...but it would be much saner to do this with awk:

awk -F \; 'BEGIN { OFS = FS } $NF == 892 { $NF = int(rand() * 100 + 800) } 1' filename

To make sure the random numbers are unique, change the awk code as follows:

awk -F \; 'BEGIN { OFS = FS } $NF == 892 { do { $NF = int(rand() * 100 + 800) } while (seen[$NF]++) } 1' filename

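Doing that with sed would be too crazy for me. Note that this only works as long as the file has no more than 100 lines whose last field is 892: the range 800 to 899 holds only 100 distinct values, so a 101st such line would make the loop run forever.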
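The limit described above can be made explicit. The following is a minimal sketch, not part of the original answer: it adds a call to `srand()` (some awk implementations, gawk for one, otherwise start `rand()` from the same seed on every run) and a counter that stops with an error once all 100 values are used up. The temporary sample file and the variable names `used`, `n`, and `result` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample input, written to a temporary file for this sketch.
input=$(mktemp)
printf 'temp%d;djaxfile11;892\n' 11 12 13 14 15 > "$input"

result=$(awk -F ';' '
BEGIN { OFS = FS; srand() }          # seed from the clock so runs differ
$NF == 892 {
  if (used >= 100) {                 # every value in 800..899 is already taken
    print "no unique values left at line " NR > "/dev/stderr"
    exit 1
  }
  do { n = int(rand() * 100) + 800 } while (n in seen)   # redraw until unused
  seen[n] = 1
  used++
  $NF = n
}
1' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

With thousands of matching lines, as mentioned in the comments, uniqueness within 800 to 899 is impossible, so the sketch fails loudly instead of hanging.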

Explanation: the sed code reads as follows:

/;892$/ {                              # if a line ends with ;892
  h                                    # copy it to the hold buffer
  s/.*/echo $((RANDOM % 100 + 800))/e  # replace the pattern space with the
                                       # output of echo $((...))
                                       # Note: this is a GNU extension
  x                                    # swap pattern space and hold buffer
  G                                    # append the hold buffer to the PS
                                       # the PS now contains line\nrandom number
  s/892\n//                            # remove the old field and the newline
}
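The per-line evaluation that the sed script achieves with the `e` flag can also be sketched as a plain bash loop. This is an illustration rather than the answerer's method, and `replace_892` is a hypothetical helper name. It also shows why the attempt in the question produced a single number: there, the backtick substitution is expanded once by the shell before sed starts, whereas here `$RANDOM` is read again on every iteration. The sketch does not enforce uniqueness.

```shell
#!/usr/bin/env bash
# replace_892 is a hypothetical helper used only in this sketch.
# It reads lines on stdin and rewrites a trailing ";892" field.
replace_892() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      *';892')
        # Strip the trailing 892 and append a fresh draw from 800..899.
        line="${line%892}$((RANDOM % 100 + 800))"
        ;;
    esac
    printf '%s\n' "$line"
  done
}

out=$(printf 'temp11;djaxfile11;892\ntemp12;djaxfile11;892\nother;line;100\n' | replace_892)
printf '%s\n' "$out"
```

A shell loop like this is usually slower than awk on large files, since each line is handled by the shell itself.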

The awk code is much more straightforward. With

-F \; we tell awk to split lines at semicolons, then
BEGIN { OFS = FS }  # output field separator is input FS, so the output
                    # is also semicolon-separated
$NF == 892 {        # if the last field is 892
                    # replace it with a random number
  $NF = int(rand() * 100 + 800)
}
1                   # print.

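The revised awk code replaces
$NF = int(rand() * 100 + 800)

with
do { $NF = int(rand() * 100 + 800) } while (seen[$NF]++)

...in other words, it keeps a table of the random numbers it has already used and keeps drawing until it gets one it has not seen before.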
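An alternative to drawing and rejecting is to shuffle the whole pool of candidate values once and hand them out in order. The sketch below swaps in that technique; it is not from the original answer. It assumes `shuf` from GNU coreutils is available, and the file names and the variables `pool`, `n`, and `shuffled` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sample input and value pool, both in temporary files.
input=$(mktemp)
pool=$(mktemp)
printf 'temp%d;djaxfile11;892\n' 11 12 13 14 15 > "$input"
shuf -i 800-899 > "$pool"   # each value in 800..899 exactly once, in random order

shuffled=$(awk -F ';' -v pool="$pool" '
BEGIN { OFS = FS }
$NF == 892 {
  # Take the next unused value; getline returns 0 at end of the pool file.
  if ((getline n < pool) <= 0) {
    print "pool exhausted at line " NR > "/dev/stderr"
    exit 1
  }
  $NF = n
}
1' "$input")

printf '%s\n' "$shuffled"
rm -f "$input" "$pool"
```

Every value is used at most once by construction, so there is no retry loop, and running out of values ends with an error rather than an endless loop.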
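Comments:

Do you absolutely have to do this in sed? This is easy in Python or Perl.

So your file will never have more than 101 lines, right? And the numbers are not really random, since each one is at least partly determined by the preceding lines?

My file actually has thousands of records. The sed suggestion from Wintermute works well, although it takes some time. Is awk faster in terms of performance? Any thoughts?

Thank you very much! I tried the sed code you suggested and it works fine. I will try the awk option and explore which approach is the fastest and best.

Answered my own question (now deleted): (1) awk arrays accept string keys, so this should also work for string replacements; (2) if you see odd behavior while using a system() call, it may be returning the status code and printing (not returning) the output.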