Bash: convert all number abbreviations in a text file to numeric values

bash awk sed

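I want to convert all number abbreviations in a text file (such as 1K, 100K, 1M, etc.) into plain numeric values (such as 1000, 100000, 1000000, etc.).

For example, if I have the following text file: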

1.3K apples
87.9K oranges
156K mangos
541.7K carrots
1.8M potatoes

I want to convert it in bash to the following:

1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes

The command I am using replaces matching number-abbreviation strings with their full numeric values, like so:

sed -e 's/1K/1000/g' -e 's/1M/1000000/g' text-file.txt

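My problem is that I cannot find and replace all possible number abbreviations when they vary. I would like this to work at least for abbreviations with up to one decimal place.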
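To see why fixed-string replacements fall short, here is a small sketch that runs the same two sed expressions on a few made-up lines. The function name naive_expand and the sample lines are illustrative only and not part of the question.

```shell
# Hypothetical helper wrapping the two fixed-string substitutions from the question.
naive_expand() {
  sed -e 's/1K/1000/g' -e 's/1M/1000000/g'
}

# Only the literal strings "1K" and "1M" are replaced:
#   "1K apples"   -> "1000 apples"    (correct)
#   "1.3K apples" -> unchanged        (it does not contain "1K")
#   "21.1K pears" -> "21.1000 pears"  (wrong: "1K" matched inside "21.1K")
printf '%s\n' '1K apples' '1.3K apples' '21.1K pears' | naive_expand
```

Any approach that works in general has to treat the number and its suffix separately, which is what the answers do.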

Could you please try the following, written and tested with the shown samples in GNU awk:

awk '
{
  if(sub(/[kK]$/,"",$1)){
    $1*=1000
  }
  if(sub(/[mM]$/,"",$1)){
    $1*=1000000
  }
}
1
' Input_file

Explanation: a detailed explanation of the above.

awk '                     ##Starting awk program from here.
{
  if(sub(/[kK]$/,"",$1)){ ##Checking condition if 1st field ends with k/K then do following. Substituting k/K in first field with NULL here.
    $1*=1000              ##Multiplying 1000 with current 1st field value here.
  }
  if(sub(/[mM]$/,"",$1)){ ##Checking condition if 1st field ends with m/M then do following. Substituting m/M in first field with NULL here.
    $1*=1000000          ##Multiplying 1000000 with current 1st field value here.
  }
}
1                         ##1 will print current line here.
' Input_file              ##Mentioning Input_file name here.

The output will be as follows:

1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes

Another awk variant:

awk '{q=substr($1,length($1));
$1*=(q=="M"?1000000:(q=="K"?1000:1))}1' file

1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes

A global replacement is needed when a line contains more than one string to convert.
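A minimal sketch of such a global conversion, assuming every abbreviation is a whole whitespace-separated field. The helper name expand_all_fields is made up for illustration. It loops over all fields instead of touching only $1:

```shell
# Hypothetical helper: convert every field that looks like 1K, 1.3K, 2M, ...
expand_all_fields() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^[0-9]+(\.[0-9]+)?[KkMm]$/) {
        suffix = toupper(substr($i, length($i)))   # last character: K or M
        number = substr($i, 1, length($i) - 1)     # everything before the suffix
        $i = number * (suffix == "M" ? 1000000 : 1000)
      }
    }
    print
  }' "$@"
}

printf '%s\n' '1.3K apples and 1.8M potatoes' | expand_all_fields
```

Assigning to a field makes awk rebuild the line with single spaces between fields, so runs of whitespace in the input are not preserved.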

Using GNU coreutils, don't reinvent the wheel:

$ numfmt --from=si <file
1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes
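By default numfmt stops with an error when the field it is asked to convert is not a number. If the file may contain such lines, a sketch along these lines could help, assuming a coreutils version whose numfmt supports the --invalid option. The wrapper name expand_si and the sample lines are illustrative only.

```shell
# Hypothetical wrapper; requires GNU coreutils numfmt.
# --invalid=ignore prints fields that are not valid numbers unchanged
# instead of aborting on them.
expand_si() {
  numfmt --from=si --invalid=ignore
}

printf '%s\n' '1.3K apples' 'n/a pears' '1.8M potatoes' | expand_si
```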

A more programmatic way: based on this, you can create a list of all possible conversion factors and perform the multiplication when needed:
awk 'BEGIN{f["K"]=1000; f["M"]=1000000}
     match($1,/[a-zA-Z]+/){$1 *= f[substr($1,RSTART,RLENGTH)]}
     1' file
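One caveat of the lookup table as written: a suffix that is not in the table looks up an empty value, so the field is multiplied by zero. The following is a sketch of a guarded variant, not the answer's own code. The helper name expand_with_table and the extra G entry are assumptions added for illustration, and sprintf("%.0f") is used so that large results print as plain integers (it also rounds to a whole number).

```shell
# Hypothetical helper extending the lookup-table idea with a guard.
expand_with_table() {
  awk 'BEGIN { f["K"] = 1000; f["M"] = 1000000; f["G"] = 1000000000 }
       match($1, /[A-Za-z]+$/) {
         suffix = toupper(substr($1, RSTART, RLENGTH))
         # Convert only when digits precede the suffix and the suffix is known;
         # anything else is left untouched.
         if (RSTART > 1 && (suffix in f))
           $1 = sprintf("%.0f", substr($1, 1, RSTART - 1) * f[suffix])
       }
       1' "$@"
}

printf '%s\n' '1.3K apples' '2G grains' '5X unknowns' | expand_with_table
```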

Given:

$ cat file
1.3K apples
87.9K oranges
156K mangos
541.7K carrots
1.8M potatoes

Pure Bash (with sed and bc), just for giggles:

while read -r x y
do
    new_x=$(echo "$x" | sed -E 's/^([[:digit:].]*)[kK]/\1\*1000/; s/^([[:digit:].]*)[mM]/\1\*1000000/' | bc)
    # bc prints e.g. 1300.0 for 1.3*1000, so format with %.0f rather than %d
    printf "%'.0f %s\n" "$new_x" "$y"
done <file

1,300 apples
87,900 oranges
156,000 mangos
541,700 carrots
1,800,000 potatoes

Another option is to use only bash and a pattern with capture groups that captures either M or K. If the pattern matches, test which one was captured, set the multiplier, and use bc:

while IFS= read -r line
do
  if [[ $line =~ ^([[:digit:]]+(\.[[:digit:]]+)?)([MK])( .*)$ ]];then
    echo "$(bc <<< "${BASH_REMATCH[1]} * $([ ${BASH_REMATCH[3]} == "K" ] && echo "1000" || echo "1000000") / 1")${BASH_REMATCH[4]}"
  fi
done < text-file.txt

1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes

With GNU awk for gensub():

$ awk '
    BEGIN { mult[""]=1; mult["k"]=1000; mult["m"]=1000000 }
    { $1 *= mult[gensub(/[^[:alpha:]]/,"","g",tolower($1))] }
1' file
1300 apples
87900 oranges
156000 mangos
541700 carrots
1800000 potatoes

This might work for you (GNU sed):

sed -E '1{x;s/^/K00M00000/;x}
        s/(^|[^0-9.])([0-9]+)([KM])/\1\2.0\3/gi
        :a;G;s/([0-9])(\.([0-9]))?([KM])(.*)\n.*\4(0*).*/\1\3\6\5/i;ta
        P;d' file

Create a lookup and store it in the hold space. The lookup assumes one decimal digit, so integers are first normalized to one decimal place (156K becomes 156.0K). Append the lookup to each line and use pattern matching to replace the key in the lookup with its value. Finally, print the line when no further matches are found.
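The bash regex loop above calls bc for the arithmetic and prints nothing for lines that do not match. As a sketch of an alternative, the conversion can also be done with string padding alone, since multiplying by 1000 or 1000000 only moves the decimal point. This assumes bash (for [[ =~ ]] and BASH_REMATCH); the function name expand_line is hypothetical, and digits beyond the suffix's precision are simply cut off.

```shell
# Hypothetical helper: expand a leading 1K / 1.3K / 2M style number without bc.
expand_line() {
  local line=$1 int frac rest zeros
  if [[ $line =~ ^([0-9]+)(\.([0-9]+))?([KkMm])(.*)$ ]]; then
    int=${BASH_REMATCH[1]}
    frac=${BASH_REMATCH[3]}
    rest=${BASH_REMATCH[5]}
    case ${BASH_REMATCH[4]} in
      [Kk]) zeros=000 ;;
      [Mm]) zeros=000000 ;;
    esac
    frac=${frac}${zeros}        # right-pad the fraction with zeros...
    frac=${frac:0:${#zeros}}    # ...and keep as many digits as the suffix adds
    # 10# forces base 10 so a leading zero is not read as octal
    printf '%s%s\n' "$((10#${int}${frac}))" "$rest"
  else
    printf '%s\n' "$line"       # pass non-matching lines through unchanged
  fi
}

while IFS= read -r line; do
  expand_line "$line"
done <<'EOF'
1.3K apples
156K mangos
1.8M potatoes
no number here
EOF
```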