Awk, grep, cut and remove \n from a file

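I am working with an input file that contains a list of user IDs, one per line. In a bash script, I run a while loop over the input file, execute an ldapsearch query for each ID, and use grep -E to filter for the results I need. The resulting output file currently has the following format (/mountpoint/out_file_1.out)

However, the desired output should look like this

user_id1;myORG_RESname1
user_id1;myORG_RESname2
user_id2;myORG_RESname2
user_id2;myORG_RESname3

So far I have tried to get the desired output with grep and cut. Here is the exact command I ran on the first result file above: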

grep -E '(^uid=|myORG_RESname1|myORG_RESname2|myORG_RESname3)' /mountpoint/out_file_1.out | cut -d, -f1 >&5

This produces the second output (/mountpoint/out_file_2.out)

I then run another grep with cut:

grep -E 'LDAPresource|uid=' /mountpoint/out_file_2.out | cut -d= -f2 >&6

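which finally produces this output (/mountpoint/out_file_3.out):

This is almost what I need. The last output I generate still needs the newlines removed and the user ID repeated for every resource name found, as described in the desired output (/mountpoint/final_output.out):

Using:

tr '\n' ';' < input_file > output_file

does not give the desired result.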
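
To illustrate why tr alone falls short, here is a minimal sketch. The sample file is hypothetical (the real contents of /mountpoint/out_file_3.out are not shown); it assumes each uid line is followed by its two resource lines. Because tr translates every newline, the whole file collapses into a single line and the uid is never repeated per resource.

```shell
#!/bin/bash
# Hypothetical sample standing in for /mountpoint/out_file_3.out
# (assumed layout: a uid line followed by its two resource lines).
sample=$(mktemp)
printf '%s\n' user_id1 myORG_RESname1 myORG_RESname2 \
              user_id2 myORG_RESname2 myORG_RESname3 > "$sample"

# tr translates every newline, so all six lines are joined into one.
flattened=$(tr '\n' ';' < "$sample")
printf '%s\n' "$flattened"

rm -f "$sample"
```

The result is one long semicolon-separated line, which is why a tool that can remember the uid between lines (sed or awk) is needed.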

Is there any way to achieve this? Any help is much appreciated.

Edit:

Here is the actual bash script I am running, for reference:

#!/bin/bash

# assign file descriptor for input fd
exec 3< /mountpoint/userlist
# assign file descriptor for output fd unfiltered
exec 4> /mountpoint/out_file_1.out
# assign file descriptor for output fd filtered
exec 5> /mountpoint/out_file_2.out
# assign file descriptor for output fd final
exec 6> /mountpoint/out_file_3.out

while IFS= read -ru 3 LINE; do
    ldapsearch -h IPADDR -D "uid=admin,cn=Users,ou=Department,dc=myDC" -w somepwd "(uid=$LINE)" LDAPresource >&4
    grep -E '(^uid=|Resource1|Resource2|Resource3)' /mountpoint/out_file_1.out | cut -d, -f1 >&5
    grep -E 'TAMresource|uid=' /mountpoint/out_file_2.out | cut -d= -f2 >&6
    #tr '\n' ';' < input_filename > file
done
# close fd #3 inputfile
exec 3<&-
# close fd #4 & 5 outputfiles
exec 4>&-
exec 5>&-
# exit with 0 success status
exit 0

With your shown samples, please try the following. Written and tested with the shown samples in GNU awk:

awk '
match($0,/uid=[^,]*/){
  val1=substr($0,RSTART+4,RLENGTH-4)
  next
}
{
  val=""
  while($0){
    match($0,/LDAPresource=[^ ]*/)
    val=(val?val OFS:"")(val1 ";" substr($0,RSTART+13,RLENGTH-13))
    $0=substr($0,RSTART+RLENGTH)
  }
  print val
}' Input_file

Explanation: a detailed explanation of the above.

awk '                                 ##Starting awk program from here.
match($0,/uid=[^,]*/){                ##Using match function to match regex uid= till comma comes in current line.
  val1=substr($0,RSTART+4,RLENGTH-4)  ##Creating val1 variable which has sub string of matched regex of above.
  next                                ##next will skip all further statements from here.
}
{
  val=""                              ##Nullifying val variable here.
  while($0){                          ##Running loop till current line value is not null.
    match($0,/LDAPresource=[^ ]*/)    ##using match to match regex from string LDAPresource= till space comes.
    val=(val?val OFS:"")(val1 ";" substr($0,RSTART+13,RLENGTH-13))  ##Creating val which has val1 ; and sub string of above matched regex.
    $0=substr($0,RSTART+RLENGTH)      ##Saving rest of line in current line.
  }
  print val                           ##Printing val here.
}' Input_file                         ##Mentioning Input_file name here.
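
As a small illustration of how match() drives the substr() calls above, the sketch below prints RSTART and RLENGTH for one line. The input line is hypothetical, since the real ldapsearch output is not shown in the question.

```shell
#!/bin/bash
# Hypothetical input line; the real ldapsearch output was not shown.
line='uid=user_id1,cn=Users,ou=Department,dc=myDC'

# match() sets RSTART (1-based start of the match) and RLENGTH (its length).
# Skipping the 4 characters of "uid=" leaves only the value.
report=$(printf '%s\n' "$line" | awk '
match($0, /uid=[^,]*/) {
  print RSTART, RLENGTH, substr($0, RSTART + 4, RLENGTH - 4)
}')
printf '%s\n' "$report"
```

The same offset logic applies to LDAPresource=, whose prefix is 13 characters long.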

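The specification of the transformation to be performed is unclear. It appears that you want to process the lines in pairs, taking the uid attribute given on the first line of each pair and the two LDAPresource attributes given on the second line of each pair, and combining them into two lines, each containing one id;resource pair.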
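
To make that reading of the requirement concrete, here is a minimal sketch in plain bash. The input lines are hypothetical (the real file contents are not shown), and the layout of a uid line followed by a line of space-separated LDAPresource= attributes is an assumption.

```shell
#!/bin/bash
# Hypothetical two-line records; the real file contents were not shown.
input='uid=user_id1,cn=Users,ou=Department,dc=myDC
LDAPresource=myORG_RESname1 LDAPresource=myORG_RESname2
uid=user_id2,cn=Users,ou=Department,dc=myDC
LDAPresource=myORG_RESname2 LDAPresource=myORG_RESname3'

pairs=$(
  # Read the uid line and the resource line together, one pair per pass.
  while IFS= read -r uid_line && IFS= read -r res_line; do
    uid=${uid_line#uid=}   # drop the leading "uid="
    uid=${uid%%,*}         # drop everything from the first comma on
    for res in $res_line; do   # unquoted on purpose: split on whitespace
      printf '%s;%s\n' "$uid" "${res#LDAPresource=}"
    done
  done <<< "$input"
)
printf '%s\n' "$pairs"
```

Unlike a fixed two-resource approach, this loop emits one line per resource however many appear on the second line, which may or may not match the real data.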

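First off, I would not use grep or cut for this; sed or awk would be a more appropriate tool. I am more of a sed person than an awk person, but I am confident that a fairly simple awk script could do the job in one pass. With sed, I would use two commands: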

First, to go from the input to the third output, like so:

sed 's/^[^=]*=//; s/,.*//; n; s/LDAPresource=//g; s/ \{1,\}/\n/'

Second, to combine the resulting groups of three lines into the desired output:
```
sed 's/$/;/; h; N; x; N; H; x; s/;\n/;/g'
```

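You can join them in a pipeline to form a single command (though I would certainly recommend writing a script for this rather than typing it all at the command line):

sed 's/^[^=]*=//; s/,.*//; n; s/LDAPresource=//g; s/ \{1,\}/\n/' /mountpoint/out_file_1.out |
  sed 's/$/;/; h; N; x; N; H; x; s/;\n/;/g'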

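Explanation

Each of the given sed commands specifies a semicolon-delimited sequence of steps, which is executed in a cycle until the input is exhausted.

Here is the first in multi-line form, with comments: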

# The next line of input is implicitly read into sed's pattern space, sans trailing newline

# Replace the leading substring up to the first '=' with nothing (that is, delete it)
s/^[^=]*=//

# Replace the substring from the first comma to the end of the line with nothing.
# This leaves just the uid value.
s/,.*//

# Print the contents of the pattern space followed by a newline (supposes that the
# -n command line option has not been given) and replace the contents of the pattern
# space with the next line of input.
n

# Replace all substrings 'LDAPresource=' in the pattern space with nothing
s/LDAPresource=//g

# Replace the first (and only) run of one or more consecutive space characters with a newline
s/ \{1,\}/\n/

# The remaining contents of the pattern space and a trailing newline are printed at this point
# (assuming no '-n' option) and the cycle repeats.
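
For a quick check of this first stage, the sketch below runs it on hypothetical input (the real file contents are not shown, so the line layout is an assumption). Note that \n in the replacement text is a GNU sed feature; other sed implementations may need a literal newline instead.

```shell
#!/bin/bash
# Hypothetical two-line records; the real file contents were not shown.
input='uid=user_id1,cn=Users,ou=Department,dc=myDC
LDAPresource=myORG_RESname1 LDAPresource=myORG_RESname2'

# Stage one: reduce the uid line to its value, then split the
# resource line into one resource per line (GNU sed assumed).
stage1=$(printf '%s\n' "$input" |
  sed 's/^[^=]*=//; s/,.*//; n; s/LDAPresource=//g; s/ \{1,\}/\n/')
printf '%s\n' "$stage1"
```

Each input pair becomes three output lines: the uid followed by its two resources.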

The second is:

# The next line of input is implicitly read into sed's pattern space sans trailing newline

# Substitute a semicolon (;) for the zero-length space at the end of the line (that
# is, append a semicolon).
s/$/;/

# Copy the contents of the pattern space into the hold space.  Both spaces then contain
# the uid plus a semicolon
h

# Append a newline followed by the next line of input (sans trailing newline) to the
# pattern space
N

# Swap the contents of the pattern and hold spaces.
x

# Append a newline followed by the next line of input (sans trailing newline) to the
# pattern space
N

# Append a newline followed by the contents of the pattern space to the hold space.
# After this, the contents of the hold space have the form
# <uid>;<newline><resource1><newline><uid>;<newline><resource2>
H

# Swap the pattern and hold spaces
x

# Replace each (semicolon, newline) pair with just a semicolon.  This completes
# joining the uid and resource pairs into semicolon-(only-)delimited form,
# leaving a newline between each pair
s/;\n/;/g

# The remaining contents of the pattern space and a trailing newline are printed at this
# point (assuming no '-n' option) and the cycle repeats.
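
The second stage can be tried on its own as well. The three-line groups below are a hypothetical stand-in for the first stage's output; GNU sed is assumed.

```shell
#!/bin/bash
# Hypothetical stage-one output: a uid followed by its two resources.
stage1='user_id1
myORG_RESname1
myORG_RESname2
user_id2
myORG_RESname2
myORG_RESname3'

# Stage two: use the hold space to repeat the uid in front of each resource.
joined=$(printf '%s\n' "$stage1" |
  sed 's/$/;/; h; N; x; N; H; x; s/;\n/;/g')
printf '%s\n' "$joined"
```

Because the script consumes exactly three lines per cycle, it relies on every uid having exactly two resources.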

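Sample input and output are helpful, but by themselves they leave us guessing at the actual specification of the transformation you want to perform.

Please add sample input for /mountpoint/out_file_1.out, and the output you want for that sample input, to your question (not in comments).

This really works very well.

@z0d1ac, you are welcome. Cheers and happy learning. I have also added a detailed explanation to my post.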

$ awk -F'[=,[:space:]]+' -v OFS=',' 'NR%2{uid=$2; next} {print uid, $2 ORS uid, $4}' file
user_id1,myORG_RESname1
user_id1,myORG_RESname2
user_id2,myORG_RESname2
user_id2,myORG_RESname3
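
The output above is comma-separated, whereas the question asks for semicolons. A hedged variant follows: it sets OFS to ';' and, as a portability assumption, swaps [:space:] for a literal space and tab so it also runs on awk builds without POSIX character classes. The input is hypothetical, since the real file contents are not shown.

```shell
#!/bin/bash
# Hypothetical two-line records; the real file contents were not shown.
input='uid=user_id1,cn=Users,ou=Department,dc=myDC
LDAPresource=myORG_RESname1 LDAPresource=myORG_RESname2
uid=user_id2,cn=Users,ou=Department,dc=myDC
LDAPresource=myORG_RESname2 LDAPresource=myORG_RESname3'

# Odd lines: remember the uid. Even lines: print the uid with each resource.
result=$(printf '%s\n' "$input" |
  awk -F'[=, \t]+' -v OFS=';' 'NR%2{uid=$2; next} {print uid, $2 ORS uid, $4}')
printf '%s\n' "$result"
```

Like the original one-liner, this assumes exactly two resources on every even-numbered line.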