How do I convert a fixed-width txt file to CSV with Python?

Tags: python, csv, text-files

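I searched online but couldn't find an answer to my problem. I need to convert a txt file to a CSV file. I've worked out how to do this with delimiters, but this txt file has no delimiters or header row, so I have to split it by fixed column widths. The file has millions of records. The column widths are 10, 60, 60, 30, 2, 3, 6, 9, 5, 6, 12, 12, 5, 3, 12, 12, 5, 3, 5, 3.

The two challenges I'm facing are:
1. Converting the file to a CSV using the fixed widths listed above.
2. Inserting a header row.

The data looks like this:

0000000626ISOCKE BBBB ZZZZZ TW DARTMOUTH 10 FDSAF DR DARTMOUTH CASN 7H44DR SAAB -11.111111 22.2222222 000 -33.333333 44.4444444 000
0000000627ISOCKE FFFF TTTTT TW HALIFAX 3367 FDSAF RD HALIFAX CASN 8C5ASE SAAB -55.555555 66.6666666 000 -77.777777 88.8888888 000
0000000628ISOCKE RE CHARLOTTETOWN 449 UYRNT ECSARW RD CHARLOTTETOWN CAPE CSE8HR SAAB -99.999999 11.1111111 000 -22.222222 33.3333333 000

Rip apart each fixed-width line and strip the pieces where appropriate.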

Use slice objects:

>>> widths = 1,2,3
>>> slices = []
>>> offset = 0
>>> for w in widths:
...     slices.append(slice(offset, offset + w))
...     offset += w
...
>>> slices
[slice(0, 1, None), slice(1, 3, None), slice(3, 6, None)]
>>> pieces = ["abcdef"[s] for s in slices]
>>> pieces
['a', 'bc', 'def']
>>>
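Applied to the question's layout, the same idea might look like the sketch below. The helper names `make_slices` and `split_record` are made up for illustration; the widths are the ones listed in the question, and stripping every field is an assumption that may not suit columns where padding is significant.

```python
from itertools import accumulate

def make_slices(widths):
    """Turn a sequence of column widths into slice objects."""
    ends = list(accumulate(widths))   # running totals: where each column ends
    starts = [0] + ends[:-1]          # each column starts where the previous one ended
    return [slice(start, end) for start, end in zip(starts, ends)]

# Column widths listed in the question.
WIDTHS = (10, 60, 60, 30, 2, 3, 6, 9, 5, 6, 12, 12, 5, 3, 12, 12, 5, 3, 5, 3)
SLICES = make_slices(WIDTHS)          # built once, reused for every line

def split_record(line, slices=SLICES):
    """Cut one fixed-width record into fields with surrounding whitespace stripped."""
    return [line[s].strip() for s in slices]
```

Building the slices once up front keeps the per-line work down to a single list comprehension, which matters when the file has millions of records.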

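In case anyone is still looking for a solution, I wrote a small Python script for this (it is listed at the end of this page). It is easy to use as long as you have Python 3.5.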


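Personally, I've found that the FixedWidth module works very well.

It takes some setup, but you can describe each field as a string or a number (with a specified precision), as left-aligned or right-aligned, with its own padding character, and so on.

It is very powerful, especially if you need to parse more than one type of file: you just supply the expected field descriptions and it does everything else.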
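To illustrate the "describe the fields, let the parser do the rest" idea without guessing at the FixedWidth module's actual API, here is a standard-library-only sketch. The `Field` tuple, the `LAYOUT` list, the `parse_record` function and the three example columns are all hypothetical; consult the module's own documentation for its real configuration format.

```python
from collections import namedtuple
from decimal import Decimal

# Hypothetical field description: name, width, kind ("string" or "decimal")
# and the padding character to strip. This is NOT the FixedWidth module's API.
Field = namedtuple("Field", "name width kind pad")

# Example layout made up for illustration; it is not the question's layout.
LAYOUT = [
    Field("record_id", 10, "string", " "),
    Field("name", 8, "string", " "),
    Field("latitude", 12, "decimal", " "),
]

def parse_record(line, layout):
    """Parse one fixed-width line into a dict keyed by field name."""
    record = {}
    position = 0
    for field in layout:
        raw = line[position:position + field.width].strip(field.pad)
        position += field.width
        if field.kind == "decimal" and raw:
            record[field.name] = Decimal(raw)   # keep the exact digits from the file
        else:
            record[field.name] = raw
    return record
```

Supporting a second file type then only means writing a second layout list; the parsing function stays the same.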

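In the for loop, why not just write `for line in rf` instead of `for line in rf.readlines()`?
That's the code I saw online. But would it make a difference for my problem?
About `readlines`: keeping a list of millions of strings in memory when you don't need it could cause other problems.
Thanks for the reply. I'm a newbie though. Could you give me an example?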
>>> import struct
>>> struct.unpack('6s5s3s', b'000001foo  123')
(b'000001', b'foo  ', b'123')
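For text lines read from a file in Python 3, a `struct`-based splitter could be wrapped roughly as follows. The three-column format and the `unpack_line` name are assumptions for illustration. Note that `struct` counts bytes, not characters, and requires the buffer to be exactly the declared size, so this sketch pads short lines and truncates long ones.

```python
import struct

# Hypothetical three-column layout: widths of 6, 5 and 3 bytes.
RECORD = struct.Struct("6s5s3s")

def unpack_line(line, encoding="ascii"):
    """Split one text line into stripped string fields using struct."""
    data = line.rstrip("\r\n").encode(encoding)
    # struct needs exactly RECORD.size bytes: pad with spaces, then truncate.
    data = data.ljust(RECORD.size)[:RECORD.size]
    return [field.decode(encoding).strip() for field in RECORD.unpack(data)]
```

With a single-byte encoding such as ASCII the byte widths and character widths coincide; with a multi-byte encoding they may not, in which case slicing the decoded string is the safer choice.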
Set up `slices` once at the start of your code. Then, for each line in the file, do `fields = [line[s] for s in slices]`.
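Putting the pieces from this thread together, a complete conversion that also covers the header requirement might look like this sketch. The `convert` function and the placeholder header names `col1` to `col20` are assumptions; replace them with the real column names. The file is read one line at a time rather than with `readlines()`, and `csv.writer` takes care of quoting any field that happens to contain a comma.

```python
import csv
from itertools import accumulate

# Column widths listed in the question.
WIDTHS = (10, 60, 60, 30, 2, 3, 6, 9, 5, 6, 12, 12, 5, 3, 12, 12, 5, 3, 5, 3)
# Placeholder header names (col1..col20); the real names are not given in the question.
HEADER = ["col{}".format(i) for i in range(1, len(WIDTHS) + 1)]

def convert(src_path, dst_path, widths=WIDTHS, header=HEADER):
    """Stream a fixed-width file into a CSV file; return the number of records."""
    ends = list(accumulate(widths))
    slices = [slice(start, end) for start, end in zip([0] + ends[:-1], ends)]
    count = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)              # challenge 2: insert the header row
        for line in src:                     # lazy iteration, one line in memory at a time
            line = line.rstrip("\r\n")       # keep the line ending out of the last field
            writer.writerow([line[s].strip() for s in slices])
            count += 1
    return count
```

A call such as `convert("input.txt", "output.csv")` would then write the header followed by one CSV row per input line; both file names here are placeholders.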
Thanks, man, your code helped me with another requirement :)
"""
This script will convert Fixed width File into Delimiter File, tried on Python 3.5 only
Sample run: (Order of argument doesnt matter)
python ConvertFixedToDelimiter.py -i SrcFile.txt -o TrgFile.txt -c Config.txt -d "|"
Inputs are as follows
1. Input FIle - Mandatory(Argument -i) - File which has fixed Width data in it
2. Config File - Optional (Argument -c, if not provided will look for Config.txt file on same path, if not present script will not run)
    Should have format as
    FieldName,fieldLength
    eg:
    FirstName,10
    SecondName,8
    Address,30
    etc:
3. Output File - Optional (Argument -o, if not provided will be used as InputFIleName plus Delimited.txt)
4. Delimiter - Optional (Argument -d, if not provided default value is "|" (pipe))
"""
from collections import OrderedDict
import argparse
from argparse import ArgumentParser
import os.path
import sys


def slices(s, args):
    position = 0
    for length in args:
        length = int(length)
        yield s[position:position + length]
        position += length

def extant_file(x):
    """
    'Type' for argparse - checks that file exists but does not open.
    """
    if not os.path.exists(x):
        # Argparse uses the ArgumentTypeError to give a rejection message like:
        # error: argument input: x does not exist
        raise argparse.ArgumentTypeError("{0} does not exist".format(x))
    return x





parser = ArgumentParser(description="Please provide your Inputs as -i InputFile -o OutPutFile -c ConfigFile")
parser.add_argument("-i", dest="InputFile", required=True,    help="Provide your Input file name here, if file is on different path than where this script resides then provide full path of the file", metavar="FILE", type=extant_file)
parser.add_argument("-o", dest="OutputFile", required=False,    help="Provide your Output file name here, if file is on different path than where this script resides then provide full path of the file", metavar="FILE")
parser.add_argument("-c", dest="ConfigFile", required=False,   help="Provide your Config file name here,File should have value as fieldName,fieldLength. if file is on different path than where this script resides then provide full path of the file", metavar="FILE",type=extant_file)
parser.add_argument("-d", dest="Delimiter", required=False,   help="Provide the delimiter string you want",metavar="STRING", default="|")

args = parser.parse_args()

# Input file is mandatory
InputFile = args.InputFile
# Delimiter defaults to "|"
DELIMITER = args.Delimiter

# Output file: fall back to the input file name plus "Delimited.txt"
if args.OutputFile is None:
    OutputFile = str(InputFile) + "Delimited.txt"
    print ("Setting output file as " + OutputFile)
else:
    OutputFile = args.OutputFile

#Config file check
if args.ConfigFile is None:
    if not os.path.exists("Config.txt"):
        print ("There is no Config File provided exiting the script")
        sys.exit()
    else:
        ConfigFile = "Config.txt"
        print ("Taking Config.txt file on this path as Default Config File")
else:
    ConfigFile = args.ConfigFile

fieldNames = []
fieldLength = []
myvars = OrderedDict()


with open(ConfigFile) as myfile:
    for line in myfile:
        if not line.strip():
            continue  # skip blank lines, which would otherwise crash int()
        name, var = line.partition(",")[::2]
        myvars[name.strip()] = int(var)
for key,value in myvars.items():
    fieldNames.append(key)
    fieldLength.append(value)

with open(OutputFile, 'w') as f1:
    fieldNames = DELIMITER.join(map(str, fieldNames))
    f1.write(fieldNames + "\n")
    with open(InputFile, 'r') as f:
        for line in f:
            # Drop the line ending so it cannot end up inside a field
            rec = list(slices(line.rstrip("\r\n"), fieldLength))
            myLine = DELIMITER.join(map(str, rec))
            f1.write(myLine + "\n")