How can I parse certain .csv rows with Python? (sample file included)

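I know the basics of parsing a .csv file and putting certain rows into a list and/or a dictionary, but this one I can't crack.

The file starts with 9 lines of general information, such as

  • Customer name
  • Invoice number
  • Invoice date
  • …and so on

followed by a detailed list of products and prices. What I want to do is:

  • From the first 9 lines, get "Invoice number", "Issue date", "Due date" and "Amount due"
  • From the remaining lines, get only "Description" and "Amount"

and put both into dictionaries. I will then write this data to a MySQL database. Can anyone suggest how I can start adding entries to a dictionary after this "header" (line 9)?

Thanks.

When I try this code: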

    import csv
    
    with open('test.csv') as csvfile:
        readCSV = csv.reader(csvfile, delimiter=",")
        for row in readCSV:
            print(row[0])
    
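I get this in the terminal:

    Bill ID
    Invoice number
    Issue date
    Due date
    Invoice currency
    Subtotal
    VAT (0%)
    Amount due
    Traceback (most recent call last):
      File "xlwings_test.py", line 7, in <module>
        print(row[0])
    IndexError: list index out of range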


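I suggest reading the general information separately and then parsing the remaining lines with the csv module, feeding them in as a string through StringIO. For the first part I build a header_attributes dictionary; the rest is read with a csv.DictReader instance.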

    import csv

    try:
        from StringIO import StringIO  # Python 2
    except ImportError:
        from io import StringIO  # Python 3

    CLIENT_PROPERTY_LINE_COUNT = 10

    header_attributes = {}

    with open("test.csv") as f:
        # Header lines are comma separated in the format: Property, Value.
        # The length check ignores blank lines and lines that do not split
        # into exactly two parts.
        for i in range(CLIENT_PROPERTY_LINE_COUNT):
            splitted_line = f.readline().rsplit(",", 2)

            if len(splitted_line) == 2:
                property_name, property_value = splitted_line
                header_attributes[property_name.strip()] = property_value.strip()

        # Everything after the header lines is the detail table.
        account_data = f.read()

    print(header_attributes)

    # Wrap the remaining text in a file-like object so csv can parse it.
    account_data_memory_file = StringIO(account_data)
    account_reader = csv.DictReader(account_data_memory_file)

    for account in account_reader:
        print(account['Units'], account['Amount'])
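
On Python 3 the same header-then-table idea can be sketched without the intermediate string, by letting `itertools.islice` consume the header lines and handing the same file handle to `csv.DictReader`. The sample data, the `parse_invoice` helper and the header line count below are assumptions made for illustration; the real test.csv may be laid out differently.

```python
import csv
import io
from itertools import islice

# Hypothetical sample standing in for test.csv; the real file layout may differ.
SAMPLE = """\
Invoice number,1001
Issue date,2017-01-05
Due date,2017-02-05
Amount due,150.00

Description,Units,Amount
Widget,2,100.00
Gadget,1,50.00
"""

HEADER_LINE_COUNT = 5  # assumed number of lines before the column-header row


def parse_invoice(handle, header_line_count=HEADER_LINE_COUNT):
    """Return (header_attributes, detail_rows) read from one open handle."""
    header_attributes = {}
    # islice consumes exactly the first N lines and leaves the rest unread.
    for fields in csv.reader(islice(handle, header_line_count)):
        if len(fields) >= 2:  # blank lines parse to [] and are skipped
            header_attributes[fields[0].strip()] = fields[1].strip()
    # DictReader takes the next non-blank line as the column names.
    detail_rows = list(csv.DictReader(handle))
    return header_attributes, detail_rows


header_attributes, detail_rows = parse_invoice(io.StringIO(SAMPLE))
print(header_attributes)
for row in detail_rows:
    print(row["Description"], row["Amount"])
```

With a real file, `open("test.csv", newline="")` would take the place of `io.StringIO(SAMPLE)`.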
    
You can use the csv module and its reader object:

    import csv

    dict1 = {}
    dict2 = {}

    # Python 2.7 needs binary mode here; on Python 3 use open("test.csv", newline="").
    with open("test.csv", "rb") as f:
        reader = csv.reader(f, delimiter="\t")
        for i, line in enumerate(reader):
            # The length checks skip blank or short rows, which would
            # otherwise raise IndexError.
            if i in [3, 4, 5, 9] and len(line) > 1:
                # Invoice number, Issue date, Due date or Amount due
                prop_name = line[0]
                prop_value = line[1]
                dict1[prop_name] = prop_value
            elif i > 11 and len(line) > 5:
                # Fetch other information like 'description' and 'amount'
                print("Description: " + line[5])
                print("Amount: " + line[-1])
                dict2[line[5]] = line[-1]

    print(dict1)
    print(dict2)
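
Hard-coded row indices break as soon as the file gains or loses a line. One possible alternative, which is a different technique from the index-based code above, is to pick the header fields by their label in the first column and to treat the column-header row as the start of the item list. This is a minimal Python 3 sketch: the sample data, the `WANTED` set and the `split_invoice` helper are hypothetical, and it assumes a comma-separated file whose item rows start with the description and end with the amount.

```python
import csv
import io

# Hypothetical comma-separated sample; the real test.csv may use other columns.
SAMPLE = """\
Bill to,ACME Ltd
Invoice number,1001
Issue date,2017-01-05
Due date,2017-02-05

Amount due,150.00

Description,Amount
Widget,100.00
Gadget,50.00
"""

WANTED = {"Invoice number", "Issue date", "Due date", "Amount due"}


def split_invoice(lines):
    """Build dict1 (selected header fields) and dict2 (description -> amount)."""
    dict1, dict2 = {}, {}
    in_details = False
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # blank lines come back as [] and would raise IndexError
        label = row[0].strip()
        if in_details:
            dict2[label] = row[-1].strip()
        elif label == "Description":
            in_details = True  # the column-header row marks the start of the items
        elif label in WANTED:
            dict1[label] = row[1].strip()
    return dict1, dict2


dict1, dict2 = split_invoice(io.StringIO(SAMPLE))
print(dict1)
print(dict2)
```

Keying dict2 by description follows the code above; if two items can share a description, a list of dictionaries would avoid overwriting entries.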
    

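The simplest solution is to split the specific lines on commas into a list, and to read the amount and description counting from the end of the list rather than from the start. You are probably getting the error because the file contains blank lines, which cannot be split into fields. Here is the code: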

    general_info = dict()
    rest_of_file_list = []

    # The raw lines are split by hand, so the csv module is not needed here.
    with open('test.csv', 'r') as f:
        for row_counter, row in enumerate(f):
            if row_counter == 2:
                # invoice row
                general_info['Invoice number'] = row.split(',')[1].rstrip()
            elif row_counter == 3:
                # issue date row
                general_info['Issue date'] = row.split(',')[1].rstrip()
            elif row_counter == 4:
                # due date row
                general_info['Due date'] = row.split(',')[1].rstrip()
            elif row_counter == 8:
                # amount due row
                general_info['Amount due'] = row.split(',')[1].rstrip()
            elif row_counter > 10:
                # skip blank lines; the last item and the 4th item from the
                # end of the list are the amount and the description
                if row and not row.isspace():
                    item = dict()
                    lista = row.split(',')

                    item['Description'] = lista[-4].rstrip()
                    item['Amount'] = lista[-1].rstrip()
                    rest_of_file_list.append(item)

    print(general_info)
    print(rest_of_file_list)
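
The blank-line explanation can be checked in isolation. This small Python 3 sketch uses a made-up two-column sample to show that `csv.reader` returns an empty list for a blank line, which is what makes `row[0]` fail, and that testing the row before indexing avoids the error.

```python
import csv
import io

# Hypothetical two-column sample with a blank line in the middle.
SAMPLE = "Invoice number,1001\n\nAmount due,150.00\n"

rows = list(csv.reader(io.StringIO(SAMPLE)))
print(rows)

# An empty list is falsy, so "if row" filters out the blank line
# before row[0] is evaluated.
first_columns = [row[0] for row in rows if row]
print(first_columns)
```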
    

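Sorry, I forgot to mention: I'm using Python 2.7. I want to ignore the general information in the first 9 lines, except that I need 4 of those 9 lines.

@Alex Starbuck I have updated my answer, you can check it out.

How do I store this in dictionaries? Dict1 {key: value pairs from lines 3, 4, 5, 9}, Dict2 {key: value pairs from line 11 to the end of the file, only 'Description' and 'Amount'}

@AlexStarbuck - I have edited the answer. Please take a look.

I still get the same error:

    Traceback (most recent call last):
      File "xlwings_test.py", line 11, in <module>
        dict1[line[0]] = line[1] # Invoice number
    IndexError: list index out of range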