How do I open a .ProcSpec file in Python?

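I have a .ProcSpec file from the SpectraSuite software (processed spectrum, OOI binary format), and I tried to open it with

f = open(r'C:\Users\path/file.ProcSpec', 'rb')
file_content = f.read()
f.close()

but the file content is data like "b'PK\x03\x04\x14\x00\x08\x00\x08\x007t\x…".
How can I open this kind of file as tabular data?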

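Method 1

In SpectraSuite you can open these files and export them again with the file type set to tab-delimited. You should then be able to import them using the csv library.

See the csv module documentation for how to open such files.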
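As a rough sketch of the csv route: the exact layout of the exported file is an assumption here, namely one wavelength/intensity pair per line separated by a tab, possibly surrounded by header or marker lines that are not numeric. `read_tab_delimited` is a hypothetical helper name, not part of any library.

```python
import csv

def read_tab_delimited(path):
    """Return (wavelengths, intensities) from a tab-delimited text export.

    Assumption: data rows hold exactly two numeric columns; any other
    line (headers, markers, blank lines) is skipped.
    """
    wavelengths, intensities = [], []
    # errors="replace" keeps stray non-decodable bytes from aborting the read
    with open(path, newline="", errors="replace") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if len(row) != 2:
                continue
            try:
                x, y = float(row[0]), float(row[1])
            except ValueError:
                continue  # not a numeric data row
            wavelengths.append(x)
            intensities.append(y)
    return wavelengths, intensities
```

Skipping non-numeric rows rather than counting header lines keeps the sketch independent of how many metadata lines the export contains.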

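Method 2: directly from Python

There is Matlab code that does what you need.

From the Matlab code we can see that the OOI format is a zipped XML file. It also appears to contain some characters that the XML parser cannot parse.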
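The `PK\x03\x04` bytes at the start of the file content in the question are the ZIP local file header signature, which is consistent with this. A minimal sketch, using only the standard `zipfile` module, for checking a file and reading its XML members into memory without unpacking to disk. `read_xml_members` is a hypothetical helper name, and skipping `OOISignatures.xml` rests on the assumption that it holds no spectral data.

```python
import zipfile

def read_xml_members(path):
    """Return {member name: raw bytes} for each XML file in the archive."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path!r} is not a ZIP archive")
    members = {}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            # Assumption: the signature file carries no spectrum data
            if name.lower().endswith(".xml") and "OOISignatures" not in name:
                members[name] = zf.read(name)
    return members
```

The returned bytes can then be cleaned of unparseable characters and handed to `xml.etree.ElementTree.fromstring`.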

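Knowing this, the file can be opened with the following code. Note that this is a quick implementation with no error handling or thorough testing.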

import glob
import os
import shutil
import tempfile
import xml.etree.ElementTree as ET

import numpy as np


def parseNodeText(node):
    """Convert a node's text to bool, int or float when possible, else return it unchanged."""
    nodeText = node.text
    # bool
    if nodeText in ('true', 'false'):
        return nodeText == 'true'
    # int
    try:
        return int(nodeText)
    except (TypeError, ValueError):
        pass
    # float
    try:
        return float(nodeText)
    except (TypeError, ValueError):
        pass
    # text (or None for an empty node)
    return nodeText


def extractArrayFromNode(node):
    """Collect the child values of a node into a numpy array."""
    arr = [parseNodeText(val) for val in node]
    # 'double' values are stored as float32 here, integers as int32
    if 'double' in node[0].tag:
        dt = np.float32
    elif 'int' in node[0].tag:
        dt = np.int32
    else:
        dt = np.float32
    return np.array(arr, dtype=dt)


def readProcSpecFile(filePath):
    """Read a .ProcSpec file and return a list of spectrum dicts."""
    result = []
    # Unpack into a throwaway directory that is removed automatically
    with tempfile.TemporaryDirectory() as tmpdir:
        # The file is a ZIP archive; the format is given explicitly because
        # the .ProcSpec extension is not recognised as an archive
        shutil.unpack_archive(filePath, tmpdir, format='zip')

        for f in glob.glob(os.path.join(tmpdir, '*.xml')):
            if 'OOISignatures.xml' in f:
                continue

            with open(f, 'rb') as fi:
                s = fi.read()

            # Replace the bytes that the XML parser rejects
            badChars = [b'\xa0', b'\x89', b'\x80']
            for c in badChars:
                s = s.replace(c, b' ')

            root = ET.fromstring(s)

            for spectra in root[0]:
                spect = {}
                for node in spectra:
                    t = node.tag
                    if t in ('pixelValues', 'channelWavelengths'):
                        spect[t] = extractArrayFromNode(node)
                    elif t == 'acquisitionTime':
                        spect[t] = {val.tag: parseNodeText(val) for val in node}
                    elif t in ('certificates', 'channelCoefficients'):
                        pass  # not needed for the spectrum itself
                    else:
                        spect[t] = parseNodeText(node)
                result.append(spect)
    return result

Here is an example of how to use it:

iFile = r'your\path\file.ProcSpec'
spects = readProcSpecFile(iFile)
spect = spects[0]
print(spect.keys())
wavelength = spect['channelWavelengths']
pixelValues = spect['pixelValues']

This is the output:

dict_keys(['saturated', 'integrationTime', 'strobeEnabled', 'strobeDelay', 'pixelValues', 'acquisitionTime', 'boxcarWidth', 'scansToAverage', 'correctForElectricalDark', 'correctForNonLinearity', 'correctForStrayLight', 'rotationEnabled', 'userName', 'channelWavelengths', 'channelNumber', 'channelStabilityScanEnabled', 'channelExternalTriggerEnabled', 'laserWavelength', 'interlock', 'numberOfPixels', 'numberOfDarkPixels', 'spectrometerSerialNumber', 'spectrometerFirmwareVersion', 'spectrometerClass', 'spectrometerPlugins', 'spectrometerNumberOfChannels', 'spectrometerMaximumIntensity', 'spectrometerMinimumIntegrationTime', 'spectrometerMaximumIntegrationTime', 'spectrometerIntegrationTimeStep', 'spectrometerIntegrationTimeBase', 'spectrometerNumberOfPixels', 'spectrometerNumberOfDarkPixels'])

In [2]:wavelength
Out[2]:array([ 199.51251,  199.98462,  200.4567 , ..., 1116.266  , 1116.6813 ,
       1117.0964 ], dtype=float32)

In [3]:pixelValues
Out[3]:array([64000. ,  1958.1,  1958.1, ...,  1957.1,  1957.1,  1957.5],
      dtype=float32)

[Figure: plot of the wavelength array]
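Since the question asks for tabular data, and many files may need converting, one possible sketch writes each spectrum to a two-column CSV file. It uses only the standard library. `spectrum_to_csv` and `convert_folder` are hypothetical helper names, and `reader` is assumed to be a function such as `readProcSpecFile` above that returns a list of dicts with `channelWavelengths` and `pixelValues` entries.

```python
import csv
from pathlib import Path

def spectrum_to_csv(spect, out_path):
    """Write one spectrum dict as wavelength/intensity rows."""
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["wavelength", "pixelValue"])
        for w, v in zip(spect["channelWavelengths"], spect["pixelValues"]):
            # float() also accepts numpy scalars
            writer.writerow([float(w), float(v)])

def convert_folder(folder, reader):
    """Convert every *.ProcSpec file in `folder`; return the CSV paths written."""
    written = []
    for path in sorted(Path(folder).glob("*.ProcSpec")):
        spects = reader(str(path))
        for i, spect in enumerate(spects):
            # One CSV per spectrum, named after the source file
            out_path = path.with_name(f"{path.stem}_{i}.csv")
            spectrum_to_csv(spect, out_path)
            written.append(out_path)
    return written
```

A call such as `convert_folder(r'your\path', readProcSpecFile)` would then process a whole directory in one go. Note that the glob pattern is case-sensitive on some platforms.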


Yes, I found these methods too. The problem is that I have hundreds of files of this type, and I don't want to open each one manually in the software and save it again as a tab-delimited file; that would take far too long. I'm not familiar with Matlab, so I can't do anything with the other solution. Thanks for your help anyway.

@RehimAlizadeh See the update.

It appears that shutil.unpack_archive cannot recognize a .procspec file as an archive, but it worked after I changed the file extension to .zip. ET.parse still gives me the following error:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 4300, column 39

@RehimAlizadeh See the update. I have completely revised my answer. I hope this fully covers your question.