Python: extracting text from a text file and writing it in a different format


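Hi, I am trying to extract some lines of text from a file generated by a program and write them to another text file in a different format, using Python.

Here is what I have so far: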

import os
import glob



path="D:\Programming\Python\Examples\Home\GainWizard\MassLynx\VxWorks\TargetRegistryFiles"
os.chdir(path)
print os.getcwd()
print os.listdir(path)


filelist = os.listdir(os.getcwd())
filelist = filter(lambda x: not os.path.isdir(x), filelist)
newest = max(filelist, key=lambda x: os.stat(x).st_mtime)

print newest
f = open(newest,'r')

data = f.readlines()
print data
This reads all of the text into a list.

What I have is:

Autotune Ion Energy:Fixed Ion Energy 1,2.000000,Autotune Ion Energy:Fixed Ion Energy      2,2.000000,Autotune Ion Energy:MS1-Neg Opt,0.3,Autotune Ion Energy:MS1-Pos Opt,-0.2,Autotune Ion Energy:MS2-Neg Opt,0.4,Autotune Ion Energy:MS2-Pos Opt,0.6,Autotune Ion Energy:MSMS Mode Fixed Ion Energy 1,0.500000,Autotune Ion Energy:MSMS Mode Fixed Ion Energy 2,2.000000,Autotune Ion Energy:OptimumValuesSet,true,Debug:Use old bunching method,true,Detector Gain Negative:High Gain,368.861012,Detector Gain Negative:Low Gain,73.523644,Detector Gain Negative:a,1.865677e-021,Detector Gain Negative:b,8.441605,Detector Gain Postitve:High Gain,613.662847,Detector Gain Postitve:Low Gain,124.065398,Detector Gain Postitve:a,4.973557e-021,Detector Gain Postitve:b,8.367407,DivertValve:ValveZone,0,Engineers Settings:MS1 DC Balance -,0.300000,Engineers Settings:MS1 DC Polarity,1,Engineers Settings:MS1 High Mass Position,174.000000,Engineers Settings:MS1 High Mass Resolution,1801.000000,Engineers Settings:MS1 Low Mass Position,519.000000,Engineers Settings:MS1 Low Mass Resolution,511.000000,Engineers Settings:MS1 Resolution Linearity,873.000000,Engineers Settings:MS2 DC Balance -,-0.200000,Engineers Settings:MS2 DC Polarity,0,Engineers Settings:MS2 High Mass Position,190.000000,Engineers Settings:MS2 High Mass Resolution,1744.000000,Engineers Settings:MS2 Low Mass Position,519.000000,Engineers Settings:MS2 Low Mass Resolution,514.000000,Engineers Settings:MS2 Resolution Linearity,857.000000,Engineers Settings:PIC MS Scan CE,4.000000,Engineers Settings:PIC Threshold Calc Scan Delay,3,Engineers Settings:PIC decreasing data points,3,Engineers Settings:PIC nonDefault Scan Speed,5000.000000,Engineers Settings:PMT Type,Hamamatsu,Engineers Settings:RF Offset Negative,0.000000,Engineers Settings:RF Offset Positive,0.000000,Failure:Gas failed state,OK,Failure:Leak detected state,Tripped,Fluidics:AcknowledgeCountThreshold,5,Fluidics:ActiveReservoir,2,Fluidics:Aspirate Rate,1000,Fluidics:Draw 
Rate,1000,Fluidics:Fill Volume,250,Fluidics:Flow Rate,10,Fluidics:Flow State,Waste,Fluidics:Inject-Flow Rate,400,Fluidics:Inject-MethodType,4,Fluidics:Inject-Pump Time1,5,Fluidics:Inject-Pump Time2,6,Fluidics:Inject-Pump Time3,10,Fluidics:Max Flow Rate,1500,Fluidics:Pending Active TimeOut,10,Fluidics:Pending Complete TimeOut,1200,Fluidics:Pending Response TimeOut,10,Fluidics:Power Cycle Delay,3.000000,Fluidics:Precompression Dispense Rate,300,Fluidics:Precompression Dispense Volume,30,Fluidics:Precompression Enable,TRUE,Fluidics:Precompression Max Fill Volume,280,Fluidics:Purge Delay Length,1,Fluidics:Refill Wait Time,60.000000,Fluidics:Sample Purge Count,0,Fluidics:Wash Purge Count,1,Instrument:Collision gas status,off,Instrument:EPC Version,Feb 15 2012,Instrument:Serial Number,QCA331,Instrument:Unique Name,,Ion Energy Settings:Fixed Ion Energy 1,3.000000,Ion Energy Settings:Fixed Ion Energy 2,3.000000,Maintenance Counters:DAYS_SINCE_LAST_SERVICE_THRESHOLD,0,Maintenance Counters:OPERATE_SWITCHES,28,Maintenance Counters:OPERATE_SWITCHES_THRESHOLD,0,Maintenance Counters:OPERATE_TIME,141233,Maintenance Counters:OPERATE_TIME_THRESHOLD,0,Maintenance Counters:POLARITY_SWITCHES,187,Maintenance Counters:POLARITY_SWITCHES_THRESHOLD,0,Maintenance Counters:VACUUM_TIME,763973,Maintenance Counters:VACUUM_TIME_THRESHOLD,0,Protective Actions:ENABLE_DIVERT_TO_WASTE,1,Scan Parameters:Interchannel Delay,0.020000,Scan Parameters:Interscan Delay,0.020000,Scan Parameters:Manual Mode,true,Scan Parameters:Polarity Switching Interscan Delay,0.020000,Scan Parameters:Scan Speed Options,1000\,2000\,5000\,10000,Scan speed adjust::DefaultsVersionLevel,2,Scan speed adjust:HIGH_SCALE_MASS_ADJUST_MS1_SETTING,-60.000000,Scan speed adjust:HIGH_SCALE_MASS_ADJUST_MS2_SETTING,-32.000000,Scan speed adjust:ION_ENERGY_1_RAMP_SETTING,2.000000,Scan speed adjust:ION_ENERGY_2_RAMP_SETTING,2.000000,Scan speed adjust:LINEARITY_ADJUST_MS1_SETTING,0.000000,Scan speed 
adjust:LINEARITY_ADJUST_MS2_SETTING,0.000000,Scan speed adjust:LOW_MASS_RESOLUTION_MS1_SETTING,10.000000,Scan speed adjust:LOW_MASS_RESOLUTION_MS2_SETTING,20.000000,Scan speed adjust:LOW_SCALE_MASS_ADJUST_MS1_SETTING,-15.000000,Scan speed adjust:LOW_SCALE_MASS_ADJUST_MS2_SETTING,-15.000000,Scan speed adjust:MS1_ION_ENERGY_SETTING,1.000000,Scan speed adjust:MS1_ION_ENERGY_WRITE_SETTING,1.000000,Scan speed adjust:MS2_ION_ENERGY_SETTING,0.700000,Scan speed adjust:MS2_ION_ENERGY_WRITE_SETTING,0.700000,Scan speed adjust:RESOLUTION_ADJUST_MS1_SETTING,-15.000000,Scan speed adjust:RESOLUTION_ADJUST_MS2_SETTING,0.000000
What I need is:

START_TARGET_REGISTRY
Detector Gain Negative:a,1.087668e-021
Detector Gain Negative:b,8.536190
Detector Gain Negative:High Gain,392.233021 
Detector Gain Negative:Low Gain,76.782164
Detector Gain Postitve:a,4.061385e-021 
Detector Gain Postitve:b,8.398445
Detector Gain Postitve:High Gain,610.368775
Detector Gain Postitve:Low Gain,122.669833
END_TARGET_REGISTRY

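Thanks

A few things are not quite clear, such as whether you need more parameters than the "Detector Gain" ones, or where the numbers come from (they do not appear in your example input).

However, this may get you where you need to be: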

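from collections import OrderedDict

# `data` must be a single string here, e.g. data = f.read().
# f.readlines() returns a list, which has no split() method.
D = OrderedDict()
for field in data.split(','):
    if ':' in field:
        # A CATEGORY:KEY field; remember it until its value arrives.
        k = field.strip()
    else:
        # The value that belongs to the most recent CATEGORY:KEY field.
        D[k] = field.strip()

with open(r"C:\temp\detector_gain.txt", 'w') as outfile:
    print("START_TARGET_REGISTRY", file=outfile)
    for k, v in D.items():
        if "Detector Gain" in k:
            print(k, v, sep=',', file=outfile)
    print("END_TARGET_REGISTRY", file=outfile)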
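Since the data appears to be formatted as CATEGORY_1:KEY_1,VALUE_1,CATEGORY_2:KEY_2,VALUE_2,..., we use the split method to break the data into fields at each comma.

We then loop over the fields, looking for a : character, which tells us that we are reading a CATEGORY:KEY field.

Once we have a CATEGORY:KEY field, we know that the next field will be the associated value, so we add the pair to a Python dictionary that maps keys to values. I chose an OrderedDict in case the order of the fields matters.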
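The parsing steps above can be tried on a short sample string. This is only a sketch: the function name parse_registry and the shortened sample are illustrative and not part of the original answer, and it assumes every value directly follows its CATEGORY:KEY field.

```python
from collections import OrderedDict


def parse_registry(text):
    """Map each CATEGORY:KEY field to the value that follows it."""
    result = OrderedDict()
    key = None
    for field in text.split(','):
        field = field.strip()
        if ':' in field:
            key = field          # a CATEGORY:KEY field
        elif key is not None:
            result[key] = field  # the value belonging to the last key seen
    return result


# Shortened sample taken from the data in the question.
sample = ("Detector Gain Negative:a,1.865677e-021,"
          "Detector Gain Negative:b,8.441605,"
          "DivertValve:ValveZone,0")
parsed = parse_registry(sample)
for key, value in parsed.items():
    print(key, value, sep=',')
```

One limitation worth knowing: a value that itself contains escaped commas, such as the 1000\,2000\,5000\,10000 entry in the sample data, is split into several fields, and only the last piece is kept for that key.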

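Finally, we read through the dictionary we built, looking for the "Detector Gain" fields, and print them to an output file. You can see how we use a context manager (the with statement) to open the file for writing.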
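As a rough illustration of the writing step, the sketch below filters a small dictionary and writes the matching entries between the two marker lines. The helper name write_detector_gain, the temporary output location, and the three sample entries are assumptions made for the example; the spelling "Postitve" is copied from the data in the question.

```python
import os
import tempfile


def write_detector_gain(settings, out_path):
    """Write only the 'Detector Gain' entries, wrapped in START/END markers."""
    # The with statement closes the file even if an error occurs while writing.
    with open(out_path, 'w') as outfile:
        print("START_TARGET_REGISTRY", file=outfile)
        for key, value in settings.items():
            if "Detector Gain" in key:
                print(key, value, sep=',', file=outfile)
        print("END_TARGET_REGISTRY", file=outfile)


settings = {
    "Detector Gain Negative:a": "1.865677e-021",
    "DivertValve:ValveZone": "0",
    "Detector Gain Postitve:b": "8.367407",
}
out_path = os.path.join(tempfile.gettempdir(), "detector_gain_example.txt")
write_detector_gain(settings, out_path)

# Read the file back to check what was written.
with open(out_path) as infile:
    written = infile.read()
print(written)
```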


If you are using Python 2, you can also use from __future__ import print_function.

As an aside: using latest = max(filelist, key=os.path.getmtime) might be more efficient (or at least more readable).
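A minimal sketch of that suggestion follows. The function name newest_file and the throwaway directory are illustrative only; the sketch builds full paths so that it does not depend on the current working directory, and it assumes the directory contains at least one regular file (max raises ValueError on an empty list).

```python
import os
import tempfile
import time


def newest_file(directory):
    """Return the path of the most recently modified regular file."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = [p for p in paths if os.path.isfile(p)]
    return max(files, key=os.path.getmtime)


# Demonstration in a throwaway directory containing two files.
with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.txt")
    new_path = os.path.join(tmp, "new.txt")
    for path in (old_path, new_path):
        with open(path, 'w') as f:
            f.write("x")
    # Set the modification times explicitly so the result does not depend on timing.
    now = time.time()
    os.utime(old_path, (now - 100, now - 100))
    os.utime(new_path, (now, now))
    latest = newest_file(tmp)
    print(os.path.basename(latest))
```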
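Thanks for the reply. Don't worry about that part of the code; I will tidy it up once I can work out how to extract the information I need. I am new to Python, so any help is much appreciated.

On another note: it is better to use forward slashes (/) in paths, even on Windows. Or escape the backslashes. Or use raw strings.

Hi Bo102010, I tried this tonight and it works well. Can you tell me how it works, and how to write the output to a file? Thanks for your help.

I simplified the code and added some comments. If this works for you, you can accept the answer.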
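The path advice in the comments can be checked with a few string comparisons. The paths below are shortened examples and nothing is read from disk; the point is only how Python interprets the string literals.

```python
# Three ways to write the same Windows path safely.
raw = r"D:\Programming\Python\Examples"        # raw string: backslashes kept as-is
escaped = "D:\\Programming\\Python\\Examples"  # each backslash escaped
forward = "D:/Programming/Python/Examples"     # forward slashes are accepted on Windows

print(raw == escaped)

# Why it matters: in a normal string, "\t" is a tab character.
risky = "C:\temp"
safe = r"C:\temp"
print(len(risky), len(safe))
print("\t" in risky)
```

The path in the question happens to work because none of its backslash sequences is a recognised escape, but a folder name starting with t or n would silently change the string.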