Extracting lines from a txt file (oddly delimited) and writing them to a new file using Python


python gps


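I have .txt files (one per image) formatted as shown below. However, the delimiter used in the files is very strange. I don't know how to extract the information I'm interested in.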

ExifTool Version Number         : 10.20
File Name                       : R0010023.tiff
Directory                       : C:/gtag/wf1313
File Size                       : 46 MB
File Modification Date/Time     : 2016:07:07 20:57:38+01:00
File Access Date/Time           : 2016:07:07 20:57:38+01:00
File Creation Date/Time         : 2016:07:04 21:18:17+01:00
File Permissions                : rw-rw-rw-
File Type                       : TIFF
File Type Extension             : tif
MIME Type                       : image/tiff
Exif Byte Order                 : Little-endian (Intel, II)
Image Width                     : 4928
Image Height                    : 3264
Bits Per Sample                 : 8 8 8
Compression                     : PackBits
Photometric Interpretation      : RGB
Image Description               : 
Make                            : RICOH IMAGING COMPANY, LTD.
Camera Model Name               : GR II
Strip Offsets                   : (Binary data 558 bytes, use -b option to extract)
Orientation                     : Horizontal (normal)
Samples Per Pixel               : 3
Rows Per Strip                  : 51
Strip Byte Counts               : (Binary data 447 bytes, use -b option to extract)
X Resolution                    : 72
Y Resolution                    : 72
Planar Configuration            : Chunky
Resolution Unit                 : inches
Software                        : GR Firmware Ver 01.02
Modify Date                     : 2016:06:21 13:09:52
XMP Toolkit                     : Image::ExifTool 10.20
Compressed Bits Per Pixel       : 3.2
Flash Fired                     : False
Flash Function                  : False
Flash Red Eye Mode              : False
Flash Return                    : No return detection
Interoperability Index          : R98 - DCF basic file (sRGB)
Y Cb Cr Positioning             : Centered
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Copyright                       : 
Exposure Time                   : 1/1250
F Number                        : 6.3
ISO                             : 100
Sensitivity Type                : Standard Output Sensitivity
Exif Version                    : 0230
Date/Time Original              : 2016:06:21 13:09:52
Create Date                     : 2016:06:21 13:09:52
Components Configuration        : Y, Cb, Cr, -
Aperture Value                  : 6.3
Brightness Value                : 8.6
Exposure Compensation           : 0
Max Aperture Value              : 2.8
Metering Mode                   : Multi-segment
Light Source                    : Shade
Maker Note Type                 : Rdc
Firmware Version                : 1.02
Recording Format                : JPEG
Exposure Program                : Manual
Drive Mode                      : Single-frame
White Balance                   : Shade
White Balance Fine Tune         : 0 0
Focus Mode                      : Manual
Auto Bracketing                 : Off
Macro Mode                      : Off
Flash Mode                      : Off
Flash Exposure Comp             : 0
Manual Flash Output             : Full
Full Press Snap                 : Off
Dynamic Range Expansion         : Off
Noise Reduction                 : Weak
Image Effects                   : Standard
Vignetting                      : Off
Toning Effect                   : Off
Hue Adjust                      : Off
Focal Length                    : 18.3 mm
AF Area X Position 1            : 632
AF Area Y Position 1            : 418
AF Area X Position              : 2435
AF Area Y Position              : 1610
AF Status                       : In Focus
AF Area Mode                    : Auto
Sensor Width                    : 4928
Sensor Height                   : 3264
Cropped Image Width             : 4928
Cropped Image Height            : 3264
Wide Adapter                    : Not Attached
Color Temp Kelvin               : 0
Crop Mode 35mm                  : Off
ND Filter                       : Off
WB Bracket Shot Number          : 0
User Comment                    : 
Flashpix Version                : 0100
Color Space                     : sRGB
Exif Image Width                : 4928
Exif Image Height               : 3264
Exposure Mode                   : Manual
Focal Length In 35mm Format     : 28 mm
Scene Capture Type              : Standard
Contrast                        : Normal
Saturation                      : Normal
Sharpness                       : Normal
Owner Name                      : 
Serial Number                   : (00000000)14100511
Lens Info                       : 18.3mm f/2.8
Lens Make                       : RICOH IMAGING COMPANY, LTD.
Lens Model                      : GR LENS
GPS Version ID                  : 2.3.0.0
GPS Latitude Ref                : xxxx
GPS Longitude Ref               : xxxx
GPS Altitude Ref                : Above Sea Level
GPS Time Stamp                  : 12:09:52
GPS Img Direction Ref           : True North
GPS Img Direction               : 228.21
GPS Date Stamp                  : 2016:06:21
GPS Pitch                       : 0.79
GPS Roll                        : 0.41
PrintIM Version                 : 0300
Aperture                        : 6.3
Flash                           : Off, Did not fire
GPS Altitude                    : 91.7 m Above Sea Level
GPS Date/Time                   : 2016:06:21 12:09:52Z
GPS Latitude                    : xx deg xx' x.xx" N
GPS Longitude                   : x deg x' xx.xx" W
GPS Position                    : xx deg xx' x.xx" N, x deg x' xx.xx" W
Image Size                      : 4928x3264
Megapixels                      : 16.1
Scale Factor To 35 mm Equivalent: 1.5
Shutter Speed                   : 1/1250
Circle Of Confusion             : 0.020 mm
Field Of View                   : 65.5 deg
Focal Length                    : 18.3 mm (35 mm equivalent: 28.0 mm)
Hyperfocal Distance             : 2.71 m
Light Value                     : 15.6

If I try the following, I get these results:

sfile = open("R001.txt", "r")
sfile.readline(1)

'E'

sfile.readline(2)

'xi'

sfile.readline(3)

'fTo'

sfile.readline(4)

'ol V'

sfile.readline(5)

'ersio'

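And so on. Can someone tell me how to handle this type of file?

I am interested in extracting a few lines:

File Name, GPS Longitude, GPS Latitude, etc.

I would greatly appreciate any help.

Regards, Joel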

Edit/Update

Thank you very much for your comments! I really appreciate it.

I now have the following:

import glob
file_list = glob.glob("*.txt")

for file_ in file_list:
    saved_lines = []
    sfile = open(file_, "r")
    lines = sfile.readlines() #array of all lines
    for line in lines:
        for text in ['File Name', 'GPS Longitude', 'GPS Latitude', 'GPS Altitude', 'GPS Img Direction', 'GPS Pitch', 'GPS Roll']:
            if text in line:
                saved_lines.append(line)
    parsed = "".join(saved_lines) #reassemble the file
    with open("parsed.txt", "a") as ofile: #write your output
        ofile.write(parsed)

dict={}
sfile = open("R0010022.txt", "r")
list = sfile.readlines()
for i in list:
    dict[i.split(':')[0]] = ''.join(i.split(':')[1:])

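The challenge I'm facing now is that I need to get the data into the following format (so that I can import it into the program I want to use).

So, the data above, one line per image.

Creating a dictionary as above is a good first step (I think). However, it is hard to look things up in the dictionary, because each key has a different amount of whitespace after its name, i.e. File Name-----------------: ... etc.

Is there a way to look up a key while ignoring the whitespace?

If I could do that, it should be possible to group the values for each image and then write each group to a separate line in a .csv or .txt file.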
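One possible way to handle the padded keys and the one-line-per-image output is sketched below. It assumes dictionaries built like the one in the update above (keys and values still carrying their padding and newline). The column list `FIELDS` and the helper names `normalize` and `write_rows` are illustrative assumptions, since the exact target format is not shown here.

```python
import csv

# Columns to export, one per label of interest (an assumed selection and order).
FIELDS = ["File Name", "GPS Latitude", "GPS Longitude", "GPS Altitude",
          "GPS Img Direction", "GPS Pitch", "GPS Roll"]

def normalize(raw):
    """Strip padding from the keys and values of a dict built as in the question."""
    return {key.strip(): value.strip() for key, value in raw.items()}

def write_rows(raw_dicts, csv_path, fields=FIELDS):
    """Write one CSV row per image; labels missing from a dict become empty cells."""
    with open(csv_path, "w", newline="") as ofile:
        # extrasaction="ignore" drops every label that is not listed in `fields`.
        writer = csv.DictWriter(ofile, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for raw in raw_dicts:
            writer.writerow(normalize(raw))
```

After `normalize`, a lookup such as `info["File Name"]` works without the trailing spaces. Writing to a `.csv` name also keeps the output from being picked up by a later `glob.glob("*.txt")` run.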

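When you call readline(1), you get 1 character of the first line. When you call readline(2), you get the next 2 characters of that line, and so on: the numeric argument is the maximum number of characters to read, and each call continues from where the previous one stopped. When you reach the end of the line, the next call continues on the second line.
Calling readline() without any arguments gives you the whole line.
If you need multiple lines, you can use readlines(), which returns a list of strings containing all the lines in the text file. You can then extract them just as you would from a normal list.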
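A small sketch of the difference, using an in-memory `io.StringIO` object as a stand-in for the real file (the two sample lines are copied from the listing in the question; the variable names are only illustrative):

```python
import io

# Stand-in for open("R001.txt", "r"); StringIO behaves like a text file.
sample = (
    "ExifTool Version Number         : 10.20\n"
    "File Name                       : R0010023.tiff\n"
)
sfile = io.StringIO(sample)

first = sfile.readline(1)      # at most 1 character of the current line
second = sfile.readline(2)     # the next 2 characters, continuing where we stopped
rest = sfile.readline()        # no argument: the remainder of the current line
next_line = sfile.readline()   # the whole second line, including the trailing "\n"

sfile.seek(0)                  # rewind to the start of the data
all_lines = sfile.readlines()  # a list with one string per line
```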

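As Scott Hunter (indirectly) pointed out, you are not reading the file line by line. If you check the documentation for readline, you will see that calling it with no arguments reads a whole line, while calling it with a numeric argument reads that many bytes (characters, for a file opened in text mode), not lines.
So sfile.readline(1) reads the first byte of the file ('E') and advances the file pointer to that point. sfile.readline(2) then reads the next two bytes ('xi') starting from that point, and so on from there.
Instead, you probably want to do something like this:

saved_lines = []
with open("R001.txt", "r") as sfile:
    lines = sfile.readlines()  # list of all lines
for line in lines:
    for text in ['File Name', 'GPS Longitude', 'GPS Latitude']:
        if text in line:
            saved_lines.append(line)
parsed = "".join(saved_lines)  # reassemble the selected lines
with open("R001_PARSED.txt", "w") as ofile:  # write your output
    ofile.write(parsed)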
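One thing to be aware of with the substring test above: 'GPS Longitude' also occurs inside the 'GPS Longitude Ref' line, so both lines are kept. If only exact labels are wanted, a variant that compares the text before the first colon could look like the sketch below (the names `WANTED` and `select_lines` are made up for illustration):

```python
# Labels to keep; compared exactly, after stripping the padding.
WANTED = {"File Name", "GPS Longitude", "GPS Latitude"}

def select_lines(lines, wanted=WANTED):
    """Keep lines whose label (text before the first colon) is exactly in `wanted`."""
    kept = []
    for line in lines:
        label, sep, _value = line.partition(":")
        if sep and label.strip() in wanted:
            kept.append(line)
    return kept

sample_lines = [
    "File Name                       : R0010023.tiff\n",
    "GPS Longitude Ref               : xxxx\n",
    "GPS Longitude                   : x deg x' xx.xx\" W\n",
]
selected = select_lines(sample_lines)  # the "Ref" line is dropped
```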


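It is delimited by newlines.
Try
lines = sfile.readlines()
Then, to get the file name, you should convert it to a dictionary. So loop over each item in the list you made:

data = {}
for i in lines:
    # everything before the first colon is the key; rejoin the rest with ':' so values such as timestamps keep their colons
    data[i.split(':')[0]] = ':'.join(i.split(':')[1:])

Or, if you always know which line the file name is on, just use
lines[X]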
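A slightly more defensive sketch of the same dictionary idea is shown below. It splits only at the first colon with `str.partition` and strips the padding on both sides, so keys can be looked up by their plain label. The function name `parse_exif_lines` and the sample lines are illustrative assumptions.

```python
def parse_exif_lines(lines):
    """Turn 'Label      : value' lines into a dict with padding-free keys and values."""
    info = {}
    for line in lines:
        key, sep, value = line.partition(":")  # split at the first colon only
        if not sep:
            continue  # skip lines that contain no colon at all
        info[key.strip()] = value.strip()
    return info

sample_lines = [
    "File Name                       : R0010023.tiff\n",
    "GPS Date/Time                   : 2016:06:21 12:09:52Z\n",
    "Focal Length                    : 18.3 mm\n",
    "Focal Length                    : 18.3 mm (35 mm equivalent: 28.0 mm)\n",
]
info = parse_exif_lines(sample_lines)
file_name = info["File Name"]
```

Note that a label appearing twice in the file (such as Focal Length in the listing above) ends up with the value from its last occurrence.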
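Try the following. Open your file:

File = open("R001.txt", "r")
Then read the data into a list:

lstGPSData = File.readlines()
Then use the split function to break up each line; if you want to access the individual parts, you can use indexes:

for data in lstGPSData:
    lstValues = data.split(":")
    title = str(lstValues[0])
    value = str(lstValues[1])
Then, if you were to put a comma (,) at the end of each line, you could change the split argument to:

lstValues = data.split(",")
That way you would not need to access them separately.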
Don't you see the pattern here? Have you tried calling readline without an argument?
Do you know what the encoding of that file is?
@Scott Hunter If I call 'readline()' I get the following: 'ExifTool Version Number         : 10.20\n'
@Xander I'm not sure how to check the encoding of the file. Could you explain how to do that?
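On the encoding question, one rough first check is to look at the first few raw bytes of the file for a byte-order mark (BOM). The sketch below only recognizes files that start with a BOM and says nothing about files without one. The function name `guess_encoding_from_bom` is made up for illustration.

```python
import codecs

# Byte-order marks and the codec name Python uses for each.
# UTF-32 is tested before UTF-16 because their little-endian marks share a prefix.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_encoding_from_bom(first_bytes):
    """Return a codec name if the data starts with a known BOM, else None."""
    for bom, name in BOMS:
        if first_bytes.startswith(bom):
            return name
    return None
```

It could be used by opening the file in binary mode, for example `open("R001.txt", "rb")`, and passing the result of `read(4)` to the function. A result of `None` only means that no BOM was found. Since `readline()` already returns readable text here, the default encoding appears to be working for this file.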