Specifying different types for a Python list in advance


list python-3.x


Apologies if this has been asked before, but I am very new to Python. I have a file containing data records similar to the following:

K; 0; 710; 85; 2; 2013:12:04:13:11:36.291; 0.0000; 1;1009.3000; 0;
K; 0; 710; 85; 3; 2013:12:04:13:11:36.291; 0.0000; 1;1009.3000; 0;
K; 17; 718; 86; 1; 2013:12:04:13:11:36.198; 995.6880; 4; 0.0000; 0; 0.0000; 280; 0.0000; 576; 0.0000; 904;
K; 17; 718; 86; 2; 2013:12:04:13:11:36.198; 0.0000; 4;1484.0000; 0;1484.0000; 280;1484.0000; 576;1481.6000; 904;

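The records vary in length, but I am only interested in the first eight items of each record. The items in each record are separated by a ';' character and a varying number of space characters. When I read the file, I want to assign each line to a list, but I also want the items in the list to have the correct types, for example str, int, int, int, int, datetime, float, int, and so on. At the moment I use the following code: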

def file_extract(pathfile):  
    file = open(pathfile)  
    contents = file.read()  
    # remove spaces and split data based on ';' and \n  
    data_list = [lines.replace(" ","").split(";") for lines in contents.split("\n")]  
    for line in data_list:  
        if line[0] == "K":  
            listraw=line[:9]  
            listraw[1]=int(line[1])  
            listraw[2]=int(line[2])  
            # continue setting types in the listraw[] etc. etc.

Unfortunately, when I read each record from the file contents into a list, every item in the list ends up as a string value, like this: "K", "0", "710", "85", "2", "2013:12:04:13:11:36.291", ...

I then have to go through every item in the list to set the type I want. Is there a more elegant way to set the individual types in the list?

You can put the data types into a list and then use zip to match them with the fields. Something like this:

import datetime

# write a parser for the timepoints
def dateparser(string):
    # guessed the dateformat
    return datetime.datetime.strptime(string, '%Y:%m:%d:%H:%M:%S.%f')

# From your code `if line[0] == 'K'` I assume that 'K' is a key for the
# datatypes in the corresponding row.

# For every rowtype you define the datatypes here, where datatype
# is equivalent to a parser. Just make sure it accepts a string and returns the
# type you need.
# I guessed the types here so it works with your example.
parsers = {'K': [str,int,int,int,int,dateparser,float,int,float]}

# the example content
contents = """K; 0; 710; 85; 2; 2013:12:04:13:11:36.291; 0.0000; 1;1009.3000; 0;
K; 0; 710; 85; 3; 2013:12:04:13:11:36.291; 0.0000; 1;1009.3000; 0;
K; 17; 718; 86; 1; 2013:12:04:13:11:36.198; 995.6880; 4; 0.0000; 0; 0.0000; 280; 0.0000; 576; 0.0000; 904;
K; 17; 718; 86; 2; 2013:12:04:13:11:36.198; 0.0000; 4;1484.0000; 0;1484.0000; 280;1484.0000; 576;1481.6000; 904; """

data = []
# the right way for doing this with a file would be:
# with open(filepath, 'r') as f:
#     for line in f:
for line in contents.split('\n'):
    # skip empty lines
    if not line.strip():
        continue

    # first split then strip, feels safer this way...
    fields = [f.strip() for f in line.split(';')]

    # select the parserlist from our dict
    parser_list = parsers[fields[0]]

    # Now match the fields with their parsers, it will automatically stop
    # when there is no parser left. This means if you have 8 parsers only 8
    # fields will be evaluated and the rest is ignored.
    # Comes in handy when the lengths of your row types differ.
    # However, this also works the other way around. If there
    # are fewer fields than parsers, the last parsers will be
    # ignored. If you don't want this to happen you have to
    # make sure that len(fields) >= len(parser_list)
    data.append([parser(field) for parser, field in zip(parser_list, fields)])

for row in data:
    print(row)

Prints:

['K', 0, 710, 85, 2, datetime.datetime(2013, 12, 4, 13, 11, 36, 291000), 0.0, 1, 1009.3]
['K', 0, 710, 85, 3, datetime.datetime(2013, 12, 4, 13, 11, 36, 291000), 0.0, 1, 1009.3]
['K', 17, 718, 86, 1, datetime.datetime(2013, 12, 4, 13, 11, 36, 198000), 995.688, 4, 0.0]
['K', 17, 718, 86, 2, datetime.datetime(2013, 12, 4, 13, 11, 36, 198000), 0.0, 4, 1484.0]
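
The comments in the code above point out that zip stops silently at the shorter of the two sequences, so a row with too few fields would lose its last parsers without any warning. Below is a minimal sketch of how the same idea might be wrapped in functions that read from a file and reject short rows. The names `parse_record`, `read_records` and `PARSERS`, the eight-parser list (matching the "first eight items" in the question) and the choice to raise `ValueError` are assumptions made for illustration, not part of the original answer.

```python
import datetime


def dateparser(string):
    # Same guessed timestamp format as in the answer above.
    return datetime.datetime.strptime(string, '%Y:%m:%d:%H:%M:%S.%f')


# One parser per field, keyed by the record type found in the first field.
# Eight entries here, so only the first eight fields of a row are converted.
PARSERS = {'K': [str, int, int, int, int, dateparser, float, int]}


def parse_record(line, parsers=PARSERS):
    """Convert one ';'-separated line into a list of typed values."""
    fields = [f.strip() for f in line.split(';')]
    parser_list = parsers[fields[0]]
    # zip() would silently drop parsers if the row is too short, so check first.
    if len(fields) < len(parser_list):
        raise ValueError('expected at least %d fields, got %d'
                         % (len(parser_list), len(fields)))
    return [parser(field) for parser, field in zip(parser_list, fields)]


def read_records(path, parsers=PARSERS):
    """Yield one typed list per non-empty line of the file at `path`."""
    with open(path, 'r') as f:
        for line in f:
            if line.strip():
                yield parse_record(line, parsers)
```

With this sketch, `list(read_records(pathfile))` would give one typed list per record. A row whose first field is not a key of `PARSERS` raises `KeyError`, which may or may not be the behaviour you want.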

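No, Python cannot tell from your data what type it should be. You simply have to convert it to whatever type is appropriate.

OK, thanks. In that case, does Python always assign unknown values the string type?

Whatever type they come in as is the type they keep. They are strings because you read them from a file.

This works for me, but should the line if line.strip().isempty(): be if not line.strip():?

@JC_RMB continue skips the rest of the loop body and moves on to the next iteration. If the line is empty you want to skip it, so everything is fine.

Yes, I understand that it checks for and skips empty lines. However, I was wondering about the .isempty function in the original code. Is it a built-in Python function? My IDE does not seem to recognize it.

@JC_RMB You are right... I only saw the not and completely missed that you had removed my isempty. I have to admit I don't know why, but I was 100% sure strings had this method.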
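
The two points from the comments can be checked with a short sketch: text read from a file arrives as str until it is converted explicitly, and str has no isempty method, so an empty line is usually detected through its truthiness. Here `io.StringIO` stands in for a real open file, and the variable names are hypothetical, chosen only for this illustration.

```python
import io

# io.StringIO stands in for an open text file here (an assumption
# for illustration); the second line contains only whitespace.
fake_file = io.StringIO("K; 0; 710; 85;\n   \n")

lines = list(fake_file)
fields = [f.strip() for f in lines[0].split(';')]

# Every field is a str until it is converted explicitly.
all_strings = all(isinstance(f, str) for f in fields)
converted = int(fields[1])

# str has no isempty() method; an empty string is simply falsy,
# which is why `if not line.strip():` detects blank lines.
has_isempty = hasattr(str, 'isempty')
blank_line_is_skipped = not lines[1].strip()
```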