Python: matching specific values from separate CSV files and concatenating rows


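I'm trying to write a Python 3.x program that looks at two CSV files.
First, I need to split the car CSV into two lists, expensive cars and cheap cars, and assign the relevant rows to each list.

Then I plan to take an indicator from the indicator CSV, check whether it is specific to expensive cars, cheap cars, or both, and concatenate values from that indicator row with a matching row from the car CSV.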

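  • CAR CSV - lists details for several cars

    "Car code","Price","Alias","Groups"
    "Car1","100","Blue Car","Cheap Cars"
    "Car2","900","Gold Car","Expensive Cars"
    "Car3","150","Red Car","Cheap Cars"
    "Car4","999","Platinum Car","Expensive Cars"
    "Car5","122","Brown Car","Cheap Cars"
    "Car6","500","Pink Car","Cheap Cars","Expensive Cars"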

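  • INDICATOR CSV - lists the faults/errors that can occur on a car

    "Indicator_Field","Indicator_Value","Desc","Groups"
    "Fault","Rusting","Bodywork is rusting","Cheap Cars; Expensive Cars"
    "Error","Window Winder Stopped","Manual window winder broken","Cheap Cars"
    "Fault","V12 Issues","V12 Engine Problems","Expensive Cars"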

  • Concatenation example

    {"code":"Car1","Fault":"Rusting"}
    {"code":"Car2","Fault":"V12 Issues"}
    {"code":"Car5","Error":"Window Winder Stopped"}
    {"code":"Car1","Error":"Window Winder Stopped"}

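  • So far, all I've managed to do is pick a random line from each CSV and concatenate specific values from the two. The problem is that every indicator now gets matched with every car, regardless of group.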

    from sys import argv
    from time import sleep
    import random
    import csv
    import heapq
    
    script, indicator_file, car_file = argv
    ind = ''
    car = ''
    
    def indicatorDefinition(i):
        with open(i) as file:
            reader = csv.DictReader(file)
            random_line, = heapq.nlargest(1, reader, key=lambda L: random.random())
            global ind
            ind = random_line['Indicator_Field']+'":"'+random_line['Indicator_Value']+'"'
    
    def carDefinition(n):
        with open(n) as file:
            reader = csv.DictReader(file)
            random_line, = heapq.nlargest(1, reader, key=lambda L: random.random())
            # Assign to the module-level `car` that counter() prints
            global car
            car = '"code":"'+random_line['Car code']+'","'
    
    
    
    
    def counter():
        count = 0
        while count < 6:
            carDefinition(car_file)
            indicatorDefinition(indicator_file)
            print("{"+car+ind+"}")
            sleep(random.randint(1,10))
            count += 1
    
    
    counter()
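
A side note on the script above: it builds each output line by quoting string fragments by hand, which is easy to get wrong. A minimal sketch of the same formatting using `json.dumps` from the standard library follows. It assumes the column names `Car code`, `Indicator_Field` and `Indicator_Value` from the CSV headers shown earlier; `format_pair` is a hypothetical helper name used only for illustration, and the sample rows are hard-coded rather than read from a file.

```python
import json


def format_pair(car_row, indicator_row):
    """Build one output line from a car row and an indicator row.

    Both arguments are plain dicts, such as the rows yielded by
    csv.DictReader. The column names are assumed from the sample CSVs.
    """
    record = {
        "code": car_row["Car code"],
        indicator_row["Indicator_Field"]: indicator_row["Indicator_Value"],
    }
    # Compact separators reproduce the no-space style of the example lines.
    return json.dumps(record, separators=(",", ":"))


line = format_pair(
    {"Car code": "Car1", "Price": "100"},
    {"Indicator_Field": "Fault", "Indicator_Value": "Rusting"},
)
print(line)
```

This only changes how a line is formatted; it does not address matching indicators to car groups.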
    
    
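Something like this, maybe? I'm assuming the Groups field should be split rather than used as-is. Updated, following the comments, to read all of a car's groups.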

    import csv
    import io
    import random
    import collections
    import itertools
    
    # Data included in source for clarity
    
    car_csv = """
    "Car code","Price","Alias","Groups"
    "Car1","100","Blue Car","Cheap Cars"
    "Car2","900","Gold Car","Expensive Cars"
    "Car3","150","Red Car","Cheap Cars"
    "Car4","999","Platinum Car","Expensive Cars"
    "Car5","122","Brown Car","Cheap Cars"
    "Car6","500","Pink Car","Cheap Cars","Expensive Cars"
    "TestCarWithThreeGroups","500","Pink Car","Cheap Cars","Expensive Cars","Broken Cars"
    """.lstrip()
    
    indicator_csv = """
    "Indicator_Field","Indicator_Value","Desc","Groups"
    "Fault","Rusting","Bodywork is rusting","Cheap Cars; Expensive Cars"
    "Error","Window Winder Stopped","Manual window winder broken","Cheap Cars"
    "Fault","V12 Issues","V12 Engine Problems","Expensive Cars"
    """.lstrip()
    
    # Read data (replace io.StringIO(...) with a file object to read from a file)
    indicator_data = list(csv.DictReader(io.StringIO(indicator_csv)))
    
    # Note `restkey` to capture all of the additional columns into `OtherGroups`.
    car_data = list(csv.DictReader(io.StringIO(car_csv), restkey='OtherGroups'))
    
    # Pre-massage the car data so `Groups` is always a set of groups, and `OtherGroups` is no longer there:
    for car in car_data:
        car['Groups'] = {car['Groups']} | set(car.pop('OtherGroups', ()))
    
    
    # Create a mapping of groups <-> indicators
    indicators_by_group = collections.defaultdict(list)
    for indicator in indicator_data:
        for group in indicator['Groups'].split('; '):
            indicators_by_group[group].append(indicator)
    
    
    def generate_car():
        car = random.choice(car_data)
        # Concatenate all indicators based on the groups of the car
        available_indicators = list(itertools.chain(
            *(indicators_by_group.get(group, []) for group in car['Groups'])
        ))
        # Choose a random indicator -- this will crash if there are no available indicators
        # for the car.
        indicator = random.choice(available_indicators)
        # Generate output datum
        return {
            'code': car['Car code'],
            indicator['Indicator_Field']: indicator['Indicator_Value'],
        }
    
    
    for x in range(10):
        print(generate_car())
    

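Example output (the selection is random, so results vary between runs):

    {'code': 'Car1', 'Error': 'Window Winder Stopped'}
    {'code': 'Car6', 'Error': 'Window Winder Stopped'}
    {'code': 'Car2', 'Fault': 'Rusting'}
    {'code': 'Car6', 'Error': 'Window Winder Stopped'}
    {'code': 'TestCarWithThreeGroups', 'Error': 'Window Winder Stopped'}
    {'code': 'Car5', 'Fault': 'Rusting'}
    {'code': 'Car2', 'Fault': 'Rusting'}
    {'code': 'TestCarWithThreeGroups', 'Fault': 'V12 Issues'}
    {'code': 'Car6', 'Fault': 'Rusting'}
    {'code': 'TestCarWithThreeGroups', 'Error': 'Window Winder Stopped'}

  • Would you be open to using pandas?

  • @AntonvBR Yes, if it would help.

  • That's perfect, but after looking further into Cars.csv, the example I gave was incorrect :( It seems the groups are not separated by a ; as they are in the indicator file; the cars CSV puts the second group into a new value, "group2".

  • If you update your original post with the actual data format, I can make the relevant changes to the code.

  • Changes made; note how the two car groups are separated for Car6 in the car CSV. Thanks for your help, by the way.

  • Thank you so much!! This works really well. I think my main problem is that I'm so new to this that I don't know all the different ways you can approach things.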
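
If every valid car/indicator combination is wanted rather than a random sample, the same group matching can be done with set intersection. The following is a minimal sketch under stated assumptions, not the answerer's code: it reuses the `restkey='OtherGroups'` idea from the answer, takes a shortened copy of the sample data, and the helper names `car_groups`, `indicator_groups` and `matching_pairs` are hypothetical.

```python
import csv
import io

# Shortened copies of the sample data from the question.
CAR_CSV = '''"Car code","Price","Alias","Groups"
"Car1","100","Blue Car","Cheap Cars"
"Car2","900","Gold Car","Expensive Cars"
"Car6","500","Pink Car","Cheap Cars","Expensive Cars"
'''

INDICATOR_CSV = '''"Indicator_Field","Indicator_Value","Desc","Groups"
"Fault","Rusting","Bodywork is rusting","Cheap Cars; Expensive Cars"
"Error","Window Winder Stopped","Manual window winder broken","Cheap Cars"
"Fault","V12 Issues","V12 Engine Problems","Expensive Cars"
'''


def car_groups(row):
    # Extra columns beyond the header are collected under 'OtherGroups'
    # by DictReader(restkey=...); the key is absent when there are none.
    return {row["Groups"], *row.get("OtherGroups", [])}


def indicator_groups(row):
    # Indicator groups share one field, separated by ';'.
    return {group.strip() for group in row["Groups"].split(";")}


def matching_pairs(car_rows, indicator_rows):
    """Return one dict per (car, indicator) pair sharing at least one group."""
    pairs = []
    for car in car_rows:
        groups = car_groups(car)
        for indicator in indicator_rows:
            if groups & indicator_groups(indicator):
                pairs.append({
                    "code": car["Car code"],
                    indicator["Indicator_Field"]: indicator["Indicator_Value"],
                })
    return pairs


cars = list(csv.DictReader(io.StringIO(CAR_CSV), restkey="OtherGroups"))
indicators = list(csv.DictReader(io.StringIO(INDICATOR_CSV)))
pairs = matching_pairs(cars, indicators)
for pair in pairs:
    print(pair)
```

Because the intersection is tested once per indicator, an indicator that belongs to several of a car's groups (such as Rusting for Car6) appears only once for that car. Calling `random.choice(pairs)` on the result would give a random valid pair.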