Python: Merge multiple text files in a folder into a DataFrame using string operations
I have 82 text files in a folder, each of which looks like this:
Comment:
Version: 2.3 [1.5.7248]
File Name: C:\Users\RS-8800sn1320\Documents\SpectralEvolution\SR-8800_19A1320\2021_Feb_17\SR-8800_SN19A1320_IRRAD_Swansea_00001.sed
Instrument: SR-8800_SN19A1320 [3]
Detectors: 512,256,256
Measurement: DIRECT_ENERGY
Date: 02/17/2021,02/17/2021
Time: 15:45:35.20,15:45:35.20
Temperature (C): 33.87,8.88,-5.57,33.87,8.88,-5.57
Battery Voltage: 7.70,7.70
Averages: 10,10
Integration: 7,50,30,7,50,30
Dark Mode: AUTO,AUTO
Foreoptic: FIBR15 {RADIANCE}, FIBR15 {RADIANCE}
Radiometric Calibration: RADIANCE
Units: W/m^2/sr/nm
Wavelength Range: 350,2500
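From all the files in the folder, I want to build a DataFrame whose index holds the words before the colon (:) (these are the same for every file), with the remaining columns holding the rest of each line as strings.
This is as far as I have got: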
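import glob
import pandas as pd

path = r"mypath"
all_files = glob.glob(path + "/*.txt")
meta_df = []
for filename in all_files:
    # read only the first 24 lines (the header block) of each file
    df = pd.read_fwf(filename, header=None, nrows=24)
    df['Metadata'] = df[0].str.split(':').str[0]
    df.set_index('Metadata', inplace=True)
    meta_df.append(df[0])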
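But this just creates a DataFrame holding a list, and I can't cut off the part after the colon.
You can load the data manually with str.split and then feed it into a DataFrame. For example: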
import glob
import pandas as pd

path = r'.'
all_files = glob.glob(path + "/*.txt")

all_data = []
for filename in all_files:
    with open(filename, 'r') as f_in:
        row = {}
        for line in f_in:
            # split at the first colon only, so times and paths stay intact
            line = line.strip().split(':', maxsplit=1)
            if len(line) != 2:
                continue  # skip lines without a colon
            row[line[0]] = line[1].strip()
        if row:
            all_data.append(row)

# one row per file, transposed so the keys become the index
df = pd.DataFrame(all_data).T
print(df)
Prints:
                         0                                                 1
Comment
Version                  2.3 [1.5.7248]                                    2.4
File Name                C:\Users\RS-8800sn1320\Documents\SpectralEvol...  C:\Something other
Instrument               SR-8800_SN19A1320 [3]                             SR-XXX
Detectors                512,256,256                                       512,256,2564
Measurement              DIRECT_ENERGY                                     DIRECT_ENERGY
Date                     02/17/2021,02/17/2021                             02/17/2021,02/17/2021
Time                     15:45:35.20,15:45:35.20                           15:45:35.20,15:45:35.20
Temperature (C)          33.87,8.88,-5.57,33.87,8.88,-5.57                 33.87,8.88,-5.57,33.87,8.88,-5.57
Battery Voltage          7.70,7.70                                         7.70,7.70
Averages                 10,10                                             10,10
Integration              7,50,30,7,50,30                                   7,50,30,7,50,30
Dark Mode                AUTO,AUTO                                         AUTO,AUTO
Foreoptic                FIBR15 {RADIANCE}, FIBR15 {RADIANCE}              FIBR15 {RADIANCE}, FIBR15 {RADIANCE}
Radiometric Calibration  RADIANCE                                          RADIANCE
Units                    W/m^2/sr/nm                                       W/m^2/sr/nm
Wavelength Range         350,2500                                          350,2500
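With 82 files, the column labels 0, 1, ... make it hard to tell which column came from which file. As a possible variation (a sketch, not part of the code above), the same splitting idea can be wrapped in two helpers that label each column with the file name. The names parse_metadata and metadata_frame are made up for illustration, and using the file stem as the column label is an assumption about what you want.

```python
from pathlib import Path

import pandas as pd


def parse_metadata(text):
    """Return {key: value} for every 'key: value' line in text (hypothetical helper)."""
    row = {}
    for line in text.splitlines():
        # partition splits at the first colon only, so values that
        # contain colons (times, Windows paths) are kept whole
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            row[key.strip()] = value.strip()
    return row


def metadata_frame(folder, pattern="*.txt"):
    """Build a DataFrame with one row per key and one column per file (hypothetical helper)."""
    rows = {
        path.stem: parse_metadata(path.read_text())
        for path in sorted(Path(folder).glob(pattern))
    }
    # dict of dicts: outer keys become columns, inner keys become the index
    return pd.DataFrame(rows)
```

Calling metadata_frame(r"mypath") should then give one column per file, and a key that is missing from a file shows up as NaN in that file's column.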