Python: extract data from a txt file and get a concise output

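I need to extract some information from a .txt file and get a concise, one-line output. The output should look like this:
Display 1 - VMware SVGA 3D - 1600 x 900 x 32 bit @ 60 Hz - Primary Device

The text file contains the following information: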

ws_diag 5.3.0 build-1427931
Device \\.\DISPLAY1
   Desc = "VMware SVGA 3D"
   Mode = 1555 x 794 x 32-bit @ 60Hz
   Bounds = 0,0  1555,794
   Flags = PRIMARY_DEVICE, ATTACHED_TO_DESKTOP
Device \\.\DISPLAY2
   Desc = "VMware SVGA 3D"
   Flags = 0x00000000
Device \\.\DISPLAYV1
   Desc = "RDPDD Chained DD"
   Flags = MIRRORING_DRIVER, TS_COMPATIBLE
Device \\.\DISPLAYV2
   Desc = "RDP Encoder Mirror Driver"
   Flags = MIRRORING_DRIVER, TS_COMPATIBLE
Device \\.\DISPLAYV3
   Desc = "RDP Reflector Display Driver"
   Flags = MIRRORING_DRIVER, TS_COMPATIBLE
monitor-info.txt (END) 
This is what I have done so far:

import sys
file = open(monitor-info.txt[1])
while 1:
    line = file.readline()
    tpl = line.split(":")
    if tpl[0] == "Desc":
        var = tpl[0]
    if tpl[1] == "Mode":
        print var, tpl[1]
    if tpl[2] == "Flag":
        var = tpl[2]
    print var
       if not line:
        break
I also tried awk:

awk -F: '/^Device/{v=$2}/^Desc/{print v $2}/^Mode/{print v$3}/^Flags/{print v$4}' output_file.txt

Just for fun: I think your first awk attempt was not far off. You set the field separator to : (with -F:), whereas it should be =
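
The effect of the separator is easy to check in Python as well. A minimal sketch, using one line copied from the input file above (the variable names are only for illustration): splitting on : leaves the line in one piece, while splitting on = gives the field name and its value.

```python
line = '   Desc = "VMware SVGA 3D"'

# There is no ":" in the line, so the whole line comes back as a single field.
by_colon = line.split(":")

# Splitting once on "=" separates the field name from its value.
name, _, value = line.partition("=")
name = name.strip()
value = value.strip().strip('"')

print(by_colon)
print(name, "->", value)
```

This is also why indexing tpl[1] after split(":") cannot work on these lines: the list has only one element.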

Maybe you can try:

awk 'BEGIN{FS="="; OFS=" - "; desc=""}function display(){print dev, desc, flags}/Device/{if(desc!="") display(); desc=""; flags=""; dev=$0; gsub("Dev.*PLAY", "Display ", dev)}/Desc/{desc=$2}/Flags/{flags=$2}END{display()}'
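What it does:

  • At the start, set the field separator to = and the output field separator to " - " (for formatting).
  • Define a function display to print one line, since it is called in two places.
  • If the line contains Device, print the previous device (if any), store the device id, and reset the other variables.
  • If the line contains Desc (or Flags), store the second field in the corresponding variable.
  • At the end of the file, print the last device.
All of this yields: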

Display 1 - "VMware SVGA 3D" - PRIMARY_DEVICE, ATTACHED_TO_DESKTOP
Display 2 - "VMware SVGA 3D" - 0x00000000
Display V1 - "RDPDD Chained DD" - MIRRORING_DRIVER, TS_COMPATIBLE
Display V2 - "RDP Encoder Mirror Driver" - MIRRORING_DRIVER, TS_COMPATIBLE
Display V3 - "RDP Reflector Display Driver" - MIRRORING_DRIVER, TS_COMPATIBLE

The awk syntax is a bit obscure, but very compact...
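
For comparison, the same line-by-line logic can be sketched in Python, the language the question started with. This is only a sketch under assumptions: summarize, format_device and SAMPLE are names made up for illustration, the Mode field is included even though the awk one-liner above does not print it, and the flag names are printed exactly as they appear in the file.

```python
SAMPLE = r"""ws_diag 5.3.0 build-1427931
Device \\.\DISPLAY1
   Desc = "VMware SVGA 3D"
   Mode = 1555 x 794 x 32-bit @ 60Hz
   Bounds = 0,0  1555,794
   Flags = PRIMARY_DEVICE, ATTACHED_TO_DESKTOP
Device \\.\DISPLAY2
   Desc = "VMware SVGA 3D"
   Flags = 0x00000000
"""


def format_device(fields):
    """Join the fields we care about, skipping any that are missing."""
    wanted = ("Device", "Desc", "Mode", "Flags")
    return " - ".join(fields[k] for k in wanted if k in fields)


def summarize(lines):
    """Yield one summary string per Device block."""
    current = None  # fields of the device being read; None before the first one
    for raw in lines:
        line = raw.strip()
        if line.startswith("Device "):
            if current is not None:
                yield format_device(current)  # flush the previous device
            # Keep what follows the last backslash, e.g. DISPLAY1 -> "Display 1"
            name = line.split("\\")[-1]
            current = {"Device": "Display " + name[len("DISPLAY"):]}
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip().strip('"')
    if current is not None:
        yield format_device(current)  # flush the last device


rows = list(summarize(SAMPLE.splitlines()))
for row in rows:
    print(row)
```

To run it on a real file, the lines of an opened file object can be passed to summarize in place of SAMPLE.splitlines().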

Using GNU awk:

gawk -F'\n' -v RS='Device \\\\\\\\.\\\\' '
    NF > 2 { # ignore the extraneous very first line
      delete dict # delete dictionary from previous record
      dict["Device"] = $1 # store device name
      for (i=2;i<NF;++i) { # store other fields in dict.
        split($i, tkns, / = /) # split into field name (e.g., "Desc") and value 
          # clean up strings (remove leading spaces from field name, remove
          # double quotes from value, and store in dictionary.
        dict[gensub(/^ +/, "", "", tkns[1])] = gensub(/"/, "", "g", tkns[2])
      }
        # Output desired fields, using the dictionary.
      printf "%s - %s - %s - %s\n", dict["Device"], dict["Desc"], dict["Mode"], dict["Flags"]
    }
  ' file
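
The multi-line record idea carries over to Python as well. A rough sketch rather than a translation of the gawk script: parse_records, format_record and SAMPLE are hypothetical names, and re.split stands in for the RS setting.

```python
import re

SAMPLE = r"""ws_diag 5.3.0 build-1427931
Device \\.\DISPLAY1
   Desc = "VMware SVGA 3D"
   Mode = 1555 x 794 x 32-bit @ 60Hz
   Bounds = 0,0  1555,794
   Flags = PRIMARY_DEVICE, ATTACHED_TO_DESKTOP
Device \\.\DISPLAY2
   Desc = "VMware SVGA 3D"
   Flags = 0x00000000
"""


def parse_records(text):
    """Return one dict per Device record found in text."""
    # Split on the literal prefix "Device \\.\" at the start of a line; the
    # first chunk is the header line before the first device, so drop it.
    chunks = re.split(r"^Device \\\\\.\\", text, flags=re.MULTILINE)[1:]
    records = []
    for chunk in chunks:
        lines = chunk.splitlines()
        if not lines:
            continue
        record = {"Device": lines[0].strip()}
        for line in lines[1:]:
            key, sep, value = line.partition(" = ")
            if sep:  # only lines of the form "name = value"
                record[key.strip()] = value.strip().replace('"', "")
        records.append(record)
    return records


def format_record(record):
    """Mirror the printf in the gawk script: missing fields print as empty."""
    keys = ("Device", "Desc", "Mode", "Flags")
    return " - ".join(record.get(k, "") for k in keys)


records = parse_records(SAMPLE)
for record in records:
    print(format_record(record))
```

Keeping each record as a dict means a field that is absent, such as Mode on the second display, simply prints as an empty column instead of shifting the others.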
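This is not a site where people write code for you. You should make an attempt first, and then you can ask more specific questions.

I tried awk -F: '/^Device/{v=$2}/^Desc/{print v $2}/^Mode/{print v$3}/^Flags/{print v$4}' output_file.txt to print "Display 1 - VMware SVGA 3D - 1600 x 900 x 32 bit @ 60 Hz - Primary Device" from the txt file.

The multi-line record is a very nice trick! By the way, thanks for improving my answer.

It throws an error saying "awk: illegal statement input record number 19, file monitor_info.txt source line number 1", and it does not seem to pull the Mode field... I tried a simple awk with "/Desc/ /Mode/ /Flag/" and the file name, but had trouble with the formatting. It gives Desc = "VMware SVGA 3D" Mode = 1555 x 794 x 32-bit @ 60Hz Flags = PRIMARY_DEVICE, ATTACHED_TO_DESKTOP Desc = "VMware SVGA 3D" Flags = 0x00000000 Desc = "RDPDD Chained DD" Flags = MIRRORING_DRIVER, TS_COMPATIBLE. Any suggestions?

@user3731311 I am not at all surprised that the script does not pull the Mode: I simply forgot it... I will edit my post shortly to fix that. Regarding the error, can you tell me what line 19 of your input file is (if it exists)?

Thanks everyone, I finally got it working, but it still throws some errors. I do get the output in a different file: awk 'BEGIN{FS="="; OFS=" - "; desc=""}function display(){print dev, desc, mode, flags}/Device/{if(desc!="") display(); desc=""; flags=""; dev=$0; gsub("Dev.*PLAY", "Display ", dev)}/Desc/{desc=$2}/Mode/{mode=$2}/Flags/{flags=$2}END{display}' monitor_info.txt | cat > outputfile.txt This is my input file: ws_diag 5.3.0 build-1427931 Device \\.\DISPLAY1 Desc = "VMware SVGA 3D" Mode = 1555 x 794 x 32-bit @ 60Hz Bounds = 0,0 1555,794 Flags = PRIMARY_DEVICE, ATTACHED_TO_DESKTOP Device \\.\DISPLAY2 Desc = "VMware SVGA 3D" Flags = 0x00000000 Device \\.\DISPLAYV1s Desc = "RDPDD Chained DD" Flags = MIRRORING_DRIVER, TS_COMPATIBLE Device \\.\DISPLAYV2 Desc = "RDP Encoder Mirror Driver" Flags = MIRRORING_DRIVER, TS_COMPATIBLE Device \\.\DISPLAYV3 Desc = "RDP Reflector Display Driver" Flags = MIRRORING_DRIVER, TS_COMPATIBLE

@user3731311 I cannot reproduce any error when I type this data into a text file. I assume there are control characters at the end of the file. Can you show me the last few lines of a dump of the file (od -xc, hexdump, ...)?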
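
If od or hexdump is not at hand, the same check can be sketched in Python. This is an illustrative sketch only: tail_bytes and control_characters are hypothetical helper names, and which bytes count as suspicious (everything below 0x20 except tab and newline, plus 0x7F) is an assumption.

```python
def tail_bytes(path, count=64):
    """Return the last `count` bytes of the file at `path`."""
    with open(path, "rb") as fh:
        data = fh.read()
    return data[-count:]


def control_characters(data):
    """List (offset, byte value) pairs for control bytes other than tab and newline."""
    allowed = {0x09, 0x0A}
    return [
        (offset, byte)
        for offset, byte in enumerate(data)
        if (byte < 0x20 or byte == 0x7F) and byte not in allowed
    ]


# A carriage return and a Ctrl-Z (0x1A) at the end of some made-up data.
found = control_characters(b"Flags = 0\r\n\x1a")
print(found)
```

Calling control_characters(tail_bytes("monitor_info.txt")) would then report any stray bytes near the end of the file named in the comments above.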