Sorting a file based on another file in the Unix shell


shell unix


I have two files, refer.txt and parse.txt.

refer.txt contains the following:

julie,remo,rob,whitney,james

parse.txt contains:

remo/hello/1.0,remo/hello2/2.0,remo/hello3/3.0,whitney/hello/1.0,julie/hello/2.0,julie/hello/3.0,rob/hello/4.0,james/hello/6.0

Now my output.txt should list the entries from parse.txt in the order specified in refer.txt.

For example, output.txt should be:

julie/hello/2.0,julie/hello/3.0,remo/hello/1.0,remo/hello2/2.0,remo/hello3/3.0,rob/hello/4.0,whitney/hello/1.0,james/hello/6.0

I tried the following command:

sort -nru refer.txt parse.txt

but had no luck.

Please help me. TIA

You can use GNU awk:

awk -F/ -v RS=',|\n' 'FNR==NR{a[$1] = (a[$1])? a[$1] "," $0 : $0 ; next}
              {s = (s)? s "," a[$1] : a[$1]} END{print s}' parse.txt refer.txt

Output:

julie/hello/2.0,julie/hello/3.0,remo/hello/1.0,remo/hello2/2.0,remo/hello3/3.0,rob/hello/4.0,whitney/hello/1.0,james/hello/6.0

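Another option is a short loop that calls grep once per name in refer.txt:

while read line; do
  grep -w "^$line" <(tr , "\n" < parse.txt)
done < <(tr , "\n" < refer.txt) | paste -s -d , -

That said, using egrep might be an equally tight solution, at least for small data sets, but given the specific question asked, I would definitely use this approach. (Or maybe not! The approach above is likely to be faster and more robust.)

Comments:

So the fields are comma-separated, and slash-separated internally, and you need to sort them by the first and third fields?

Added some explanation to the answer.

This is almost certainly the best answer for large inputs, so I upvoted it. I'm inclined to keep my own answer, because frankly I think the builtins-only native bash approach is more readable.

Can we use cut and grep?

Any approach using cut and grep would be far less efficient than the code here. (It would need one grep per line, with a lot of invocation overhead, and it would re-read the input file once per line of output!) I'm not interested in teaching people how to do things the wrong way, so I decline to provide any such solution.

I'm torn on whether to upvote this. On one hand, it is correct (as long as the names don't evaluate to anything other than themselves when treated as regular expressions, and as long as the input contains no literal backslashes or trailing whitespace that needs to be preserved). On the other hand, it is really bad from a performance perspective, and I think people following this kind of bad practice are what led some groups to believe that all shell scripts inherently perform badly and must die.

Also, this doesn't filter by field, so "hello" would match every entry, even though "hello" isn't a name in parse.txt at all.

Thanks for the heads-up, @CharlesDuffy. I updated the grep command so that it only matches from the start of the string. Also, I can't deny your comment about performance. It isn't lightning fast, but its main merit is brevity. If the OP is just doing something simple, it is fine, but there is clearly a trade-off between brevity and performance.

In pure native bash (4.x):
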
# read each file into an array
IFS=, read -r -a values <parse.txt
IFS=, read -r -a ordering <refer.txt

# create a map from content before "/" to comma-separated full values in preserved order
declare -A kv=( )
for value in "${values[@]}"; do
  key=${value%%/*}
  if [[ ${kv[$key]} ]]; then
    kv[$key]+=",$value" # already exists, comma-separate
  else
    kv[$key]="$value"
  fi
done

# go through refer list, putting full value into "out" array for each entry
out=( )
for value in "${ordering[@]}"; do
  out+=( "${kv[$value]}" )
done

# print "out" array in comma-separated form
IFS=,
printf '%s\n' "${out[*]}" >output.txt

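
The comments raise three correctness concerns about the per-name grep loop: names are interpreted as regular expressions, read without -r and IFS= can alter backslashes and whitespace, and the match should be tied to the name field. A minimal sketch that addresses those three points (but not the performance concern, since it still rescans the data once per name) could look like this. The function name ordered_by_prefix and the use of awk's index() for a literal prefix test are illustrative choices, not part of the original answer.

```shell
# ordered_by_prefix DATA_FILE ORDER_FILE   (hypothetical helper name)
ordered_by_prefix() {
  tr ',' '\n' <"$2" |
  while IFS= read -r name; do     # IFS= and -r keep whitespace and backslashes intact
    [ -n "$name" ] || continue    # ignore empty names
    # Literal (non-regex) test: keep entries that start with "<name>/".
    # The prefix is passed through the environment so awk does not
    # reinterpret backslashes in it.
    tr ',' '\n' <"$1" |
      prefix="$name/" awk 'index($0, ENVIRON["prefix"]) == 1'
  done |
  paste -s -d, -                  # rejoin the lines with commas
}
```

For example, ordered_by_prefix parse.txt refer.txt >output.txt. A name such as hello, which only occurs in the middle of entries, selects nothing here.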

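# number the names: one "id name" pair per line
tr , "\n" <refer.txt | cat -n >person_id.txt  # 'cat -n' is not POSIX; nl, or sed and paste, also work

# store each id in a file named after the person
while read -r person_id person_key
do
    printf '%s\n' "$person_id" >"$person_key"
done <person_id.txt

# prefix each entry with its name: "name name/rest"
tr , "\n" <parse.txt | sed 's/^\([^/]*\)\(\/.*\)$/\1 \1\2/' >person_data.txt

# look up the id for each entry and write "id entry" lines
: >merge.txt
while read -r foreign_key person_data
do
    person_id="$(<"$foreign_key")"
    printf '%s %s\n' "$person_id" "$person_data" >>merge.txt
done <person_data.txt

# sort numerically by id, then drop the id and rejoin with commas
sort -n merge.txt | cut -d' ' -f2- | paste -s -d, - >output.txt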
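
The numbering idea above (tag each entry with the position of its name in refer.txt, sort on that tag, then drop it) can also be written as a single pipeline without one file per name. The following is only a sketch under the question's assumptions (one comma-separated line per file); the function name sort_by_rank is hypothetical, and entries whose name is not listed in refer.txt are dropped, which is a choice made here rather than something the original specifies.

```shell
# sort_by_rank DATA_FILE ORDER_FILE   (hypothetical helper name)
sort_by_rank() {
  data_file=$1
  order_file=$2
  tmp=$(mktemp) || return 1
  tr ',' '\n' <"$order_file" >"$tmp"      # one name per line
  tr ',' '\n' <"$data_file" |
    awk -F/ '
      # First input (names): remember the position of each name
      NR == FNR { rank[$1] = FNR; next }
      # Second input (entries on stdin): prefix rank and input position
      $1 in rank { print rank[$1], FNR, $0 }
    ' "$tmp" - |
    sort -n -k1,1 -k2,2 |                 # by rank, then by original position
    cut -d' ' -f3- |                      # drop the two helper columns
    paste -s -d, -                        # back to one comma-separated line
  rm -f "$tmp"
}
```

Called as sort_by_rank parse.txt refer.txt >output.txt, this keeps entries that share a name in their original relative order, because the input position is used as the second sort key.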

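# 1) index data file on required field
mkdir -p ./person_data
while read -r data
do
    entry=${data##* }   # drop a leading "name " column if the line has one
    key=${entry%%/*}    # text before the first "/"; alt. `cut -d'/' -f1`
    printf '%s\n' "$entry" >>./person_data/"$key"
done <person_data.txt

# 2) run batch job (refer_data.txt is assumed to hold one name per line)
while read -r key
do
    cat ./person_data/"$key"   # emit the entries indexed under this name
done <refer_data.txt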