Python: extracting data from text files into an output file

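I have a lot of files whose names are just numbers (starting at 1 and going up to some maximum). Each of these files resembles the others in its "tags" (ObjectID =, X =, Y =, and so on), but the values after those tags differ from file to file.

I wanted to save myself from manually copying and pasting the data from one file to another, so I wrote a small script in Python (since I have a little experience with Python).

Here is the full script: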

import os

BASE_DIRECTORY = 'C:\Users\Tom\Desktop\TheServer\scriptfiles\Objects'
output_file = open('output.txt', 'w')
output = {}
file_list = []

for (dirpath, dirnames, filenames) in os.walk(BASE_DIRECTORY):
    for f in filenames:
        if 'txt' in str(f):
            e = os.path.join(str(dirpath), str(f))
            file_list.append(e)

for f in file_list:
    print f
    txtfile = open(f, 'r')
    output[f] = []
    for line in txtfile:
        if 'ObjectID =' in line:
            output[f].append(line)
        elif 'X =' in line:
            output[f].append(line)
        elif 'Y =' in line:
            output[f].append(line)
tabs = []
for tab in output:
    tabs.append(tab)

tabs.sort()
for tab in tabs:
    for row in output[tab]:
        output_file.write(row + '')

Now, everything works fine, and the output file looks like this:

ObjectID = 1216
X = -1480.500610
Y = 2610.885742
ObjectID = 970
X = -1517.210693
Y = 2522.842285
ObjectID = 3802
X = -1512.156616
Y = 2521.116210
etc.

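But that is not what I want (each value on a new line). I need it to do the following for each file:

Read the file

Remove the tags in front of the values

Format a single line that will contain these values in the output file (say I want it to look like: "(1216, -1480.5006102522.842285)")

Write that line to the output file

Repeat this for every file

Any help?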

In your loop, keep track of whether you are currently inside a record:

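records = []
in_record = False
object_id, x, y = None, None, None  # object_id avoids shadowing the built-in id()
for line in txtfile:
    if not in_record:
        # wait for the line that starts a record
        if 'ObjectID =' in line:
            in_record = True
            # keep the text after '=' without surrounding spaces or the newline
            object_id = line.split('=', 1)[1].strip()
    elif 'X =' in line:
        x = line.split('=', 1)[1].strip()
    elif 'Y =' in line:
        y = line.split('=', 1)[1].strip()
        # Y is the last field of a record: store the record and start over
        records.append((object_id, x, y))
        in_record = False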

You will then have a list of tuples, which you can easily write out using the appropriate module.
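The answer does not say which module it means. As a minimal sketch, assuming it is the standard-library csv module, the tuples could be written like this (Python 3 syntax). The sample `records` list and the `write_records` helper are hypothetical names for illustration, and an in-memory buffer stands in for the real output file.

```python
import csv
import io

# Hypothetical sample data shaped like the list of (ObjectID, X, Y) tuples.
records = [
    ("1216", "-1480.500610", "2610.885742"),
    ("970", "-1517.210693", "2522.842285"),
]


def write_records(rows, stream):
    """Write one comma-separated line per (object_id, x, y) tuple."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerows(rows)


# With a real file, open("output.txt", "w", newline="") would replace the buffer.
buffer = io.StringIO()
write_records(records, buffer)
csv_text = buffer.getvalue()
print(csv_text, end="")
```

The values stay strings here, so they are written exactly as they appeared in the source files.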

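Here is what you need. I did not have enough time to write the code that appends the results to a new file, so it just prints them instead, but you get the idea.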

import os.path

path = "path"

# number of files in your folder
num_files = len([f for f in os.listdir(path)
                if os.path.isfile(os.path.join(path, f))])

# returns the desired output for a given file
def file_head_ext(file_path, file_num):
    with open(os.path.join(file_path, file_num)) as myfile:
        # the first three lines hold ObjectID, X and Y
        head = [next(myfile).split("=") for x in range(3)]
        # keep the part after '=' without spaces or the trailing newline
        formatted_head = [elm[1].strip() for elm in head]
    return ",".join(formatted_head)


# files are named 1..num_files, so the range has to end at num_files + 1
for filnum in range(1, num_files + 1):
    print(file_head_ext(path, str(filnum)))
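The answer above only prints each line. A minimal sketch of the missing step, writing one line per file to an output file, might look like the following (Python 3 syntax). The helper names `read_values` and `write_summary`, the parenthesised line format, and the throwaway sample files are assumptions made for illustration, not part of the original answer.

```python
import os
import tempfile


def read_values(file_path, count=3):
    """Return the text after '=' from the first `count` lines of a file."""
    with open(file_path) as source:
        return [next(source).split("=", 1)[1].strip() for _ in range(count)]


def write_summary(folder, num_files, output_path):
    """Write one '(id,x,y)' line for each file named 1..num_files."""
    with open(output_path, "w") as out:
        for file_num in range(1, num_files + 1):
            values = read_values(os.path.join(folder, str(file_num)))
            out.write("(" + ",".join(values) + ")\n")


# Demo on temporary files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    samples = {
        "1": "ObjectID = 1216\nX = -1480.500610\nY = 2610.885742\n",
        "2": "ObjectID = 970\nX = -1517.210693\nY = 2522.842285\n",
    }
    for name, text in samples.items():
        with open(os.path.join(folder, name), "w") as handle:
            handle.write(text)

    output_path = os.path.join(folder, "output.txt")
    write_summary(folder, len(samples), output_path)
    with open(output_path) as result:
        summary = result.read()

print(summary, end="")
```

Passing the file count explicitly keeps the output file from being counted as an input if it lives in the same folder.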

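Here is a version of the loop that builds the content.
I rewrote it so that the ObjectID, X and Y values end up on the same line.

It looks like this is what you are trying to do: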

TAGS = ('ObjectID =', 'X =', 'Y =')

for f in file_list:
    print(f)
    output[f] = []
    values = []  # values collected for the record being read
    with open(f, 'r') as txtfile:
        for line in txtfile:
            for tag in TAGS:
                if tag in line:
                    pos = line.rfind(tag) + len(tag)
                    rest = line[pos:]
                    # Here you set the delimiter after the value. split() with
                    # no argument splits on whitespace; it can be "," instead
                    numbers = rest.split()
                    if len(numbers) > 0:
                        values.append(numbers[0])
                    if tag == 'Y =':
                        # Y closes a record: store ObjectID, X and Y as one line
                        output[f].append(','.join(values) + '\n')
                        values = []
                    break

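Note that you need to know which delimiter character separates the name you are searching for from the actual value you want to extract from the line:

ObjectID=

Hope this helps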
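To illustrate the point about the delimiter, here is a small sketch (Python 3 syntax) that splits a line at its first "=" with `str.partition`. The helper name `split_tag` and the sample lines are assumptions for illustration only.

```python
def split_tag(line, delimiter="="):
    """Split 'Name = value' into (name, value); return None without a delimiter."""
    name, found, value = line.partition(delimiter)
    if not found:
        return None
    return name.strip(), value.strip()


sample_lines = ["ObjectID = 1216\n", "X = -1480.500610\n", "etc.\n"]
parsed = [split_tag(line) for line in sample_lines]
print(parsed)
```

Lines that do not contain the delimiter come back as None, so they can be skipped without raising an error.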

data = open('sam.txt', 'r').read()

>>> print data
ObjectID = 1216
X = -1480.500610
Y = 2610.885742
ObjectID = 970
X = -1517.210693
Y = 2522.842285
ObjectID = 3802
X = -1512.156616
Y = 2521.116210
>>>

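Now let's do some string replacement :)

>>> data = data.replace('ObjectID =', '').replace('\nX = ', ',').replace('\nY = ', ',')
>>> print data
 1216,-1480.500610,2610.885742
 970,-1517.210693,2522.842285
 3802,-1512.156616,2521.116210

Can you paste some sample lines from the files you need to read? And the same for the output lines.

I added the code where you append the values into a single line. For now it does not write anything to the file.

@M.Rox You need to write records to the file.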
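As a follow-up to the replace() chain shown earlier, the sketch below (Python 3 syntax) also removes the space after "ObjectID =" and wraps each line in parentheses, which is roughly the format the question asks for. The inline sample string stands in for the contents of sam.txt, and the variable names are assumptions for illustration.

```python
# Sample text standing in for the contents of sam.txt.
data = (
    "ObjectID = 1216\nX = -1480.500610\nY = 2610.885742\n"
    "ObjectID = 970\nX = -1517.210693\nY = 2522.842285\n"
)

# Strip the tags; including the trailing space avoids a leading blank per line.
flattened = (
    data.replace("ObjectID = ", "")
        .replace("\nX = ", ",")
        .replace("\nY = ", ",")
)

# Wrap every non-empty line in parentheses.
formatted = ["(" + row + ")" for row in flattened.splitlines() if row]
print("\n".join(formatted))
```

This only works when every record follows the exact "ObjectID, X, Y" order and spacing shown in the sample.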