Python 3.x: compare values in an Excel column and replace them in a list of dictionaries


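I have a requirement where I need to take a specific column from an Excel file, compare its exact strings against a list of dictionaries, and replace the value when a match is found.

For example, the sample Excel sheet looks like this (test.xlsx, with one worksheet/tab named "tab1"):-

and so on

My current dictionary looks like this:-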

test_dict=[{'col1':'A1','K2':['H1','H2'],'K3':['1.1.1.1:1111','0.0.0.0:0000']},{'col1':'B2','K2':['H4','H5'],'K3':['2.2.2.2:2222','8.8.8.8:8888']}]
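and so on

So what I need is to take col2 from Excel, compare it with the test_dict entries, and if a match is found, replace it with col3.

My expected dictionary output needs to be:-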

test_dict=[{'col1':'A1','K2':['H1','H2'],'K3':['1.1.1.1:1111','2.2.2.2:1111','3.3.3.3:1111','4.4.4.4:1111','0.0.0.0:0000']},{'col1':'B2','K2':['H4','H5'],'K3':['2.2.2.2:2222','5.5.5.5:5555']}]

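I am new to Python and need some advice on how to implement this. I tried converting the Excel data into another dictionary and then comparing it with the existing one and replacing values, but could not get the result I wanted. Please help and advise.

I've included some sample code below that may work for you, depending on the type and shape of your Excel input data.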

test_list = [{'col1': 'A1',
          'K2': ['H1', 'H2'],
          'K3': ['1.1.1.1:1111','0.0.0.0:0000']},
          {'col1': 'B2',
           'K2': ['H4', 'H5'],
           'K3': ['2.2.2.2:2222','8.8.8.8:8888']}
           ]

example_data = [['A1', '1.1.1.1:1111', '1.1.1.1:1111,2.2.2.2:1111,3.3.3.3:1111,4.4.4.4:1111'],
                ['B2',  '2.2.2.2:2222', '2.2.2.2:2222,5.5.5.5:5555,6.6.6.6:6666']
                ]

def compare_K3(test_list, col2_data, col3_data):
    """This function assumes that relevant data is always in K3"""
    for i, row_dict in enumerate(test_list):  # i will be list index, row_dict will be col_dict
        data_list = row_dict.get('K3', [])  # empty list if 'K3' is not in the dict
        for ref_item in data_list:  # will be '1.1.1.1:1111' or similar
            if col2_data == ref_item:
                k3_data = test_list[i]['K3']
                col3_data_items = col3_data.split(',')
                for new_item in col3_data_items:
                    if new_item not in k3_data:
                        k3_data.append(new_item)
                # the sorting line below is optional, added to improve readability
                # if order is unimportant, test_list[i]['K3'] will be updated without this line since
                # k3_data and test_list[i]['K3'] are the same object
                test_list[i]['K3'] = sorted(k3_data)
                # change return to break if there are multiple instances of col2 data to change
                return

def compare_general(test_list, col2_data, col3_data):
    """This function does not make any assumptions about the relevant data"""
    for i, row_dict in enumerate(test_list):  # i will be list index, row_dict will be col_dict
        for col_key, data_list_or_str in row_dict.items(): 
            if isinstance(data_list_or_str, list):
                data_list = data_list_or_str
                for ref_item in data_list:  # will be '1.1.1.1:1111' or similar
                    if col2_data == ref_item:
                        test_list[i][col_key] = col3_data
                        # change return to break if there are multiple instances
                        # of col2 data to change
                        return
            else:  # not a list
                data_str = data_list_or_str
                if col2_data == data_str:
                    test_list[i][col_key] = col3_data
                    # change return to break if there are multiple instances of 
                    # col2 data to change
                    return

def sheet_processor(test_list, sheet_data):
    """Sheet data is a two dimensional list of rows and sublists of columns"""
    for row in sheet_data:
        col2_data = row[1]
        col3_data = row[2]
        compare_K3(test_list, col2_data, col3_data)

I renamed your test_dict to test_list, since it is actually a list of dictionaries.

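It isn't clear from your question how you get the data from Excel into Python, so for simplicity I made it a two-dimensional list of rows and columns. If your sheet data is a dictionary or some other structure, the sheet_processor function can be modified to suit.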
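If the rows arrive as dictionaries rather than lists (for example from `csv.DictReader`), a small adapter can turn them into the row format used above. This is only a sketch: the helper name `rows_from_dicts` and the column names `col1`, `col2` and `col3` are assumptions for illustration, not something taken from the question.

```python
def rows_from_dicts(dict_rows, columns=("col1", "col2", "col3")):
    """Convert a list of row dictionaries into a list of row lists.

    The column names are hypothetical; adjust them to the sheet's headers.
    """
    return [[row[name] for name in columns] for row in dict_rows]


# Hypothetical rows, shaped like one dictionary per spreadsheet row.
dict_rows = [
    {"col1": "A1", "col2": "1.1.1.1:1111", "col3": "1.1.1.1:1111,2.2.2.2:1111"},
    {"col1": "B2", "col2": "2.2.2.2:2222", "col3": "2.2.2.2:2222,5.5.5.5:5555"},
]

# sheet_data is now a two-dimensional list: one sublist per row.
sheet_data = rows_from_dicts(dict_rows)
```

The resulting `sheet_data` has the same shape as `example_data`, so it can be handed to `sheet_processor` unchanged.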

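I also assumed that the data to compare against is in 'K3'. If it could be in another entry, we can loop over every entry instead; compare_general does that, although it is a bit verbose. Note that compare_general overwrites the matched value with the whole col3 string rather than merging the new items into the existing list.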
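For the case where the match may sit under any key and the existing items should be kept, something along these lines might work. It is a sketch only: `merge_any_key` is a hypothetical name, and it assumes col3 is a comma-separated string as in the example data.

```python
def merge_any_key(test_list, col2_data, col3_data):
    """Merge col3 items into the first list (under any key) containing col2_data.

    Returns True if a match was found, False otherwise.
    """
    # Split the comma-separated col3 string, ignoring stray whitespace.
    new_items = [item.strip() for item in col3_data.split(",") if item.strip()]
    for row_dict in test_list:
        for value in row_dict.values():
            # Only list values are searched; plain strings such as col1 are skipped.
            if isinstance(value, list) and col2_data in value:
                for item in new_items:
                    if item not in value:
                        value.append(item)  # mutates the list inside row_dict
                return True
    return False


records = [{"col1": "A1", "K2": ["H1", "H2"], "K3": ["1.1.1.1:1111", "0.0.0.0:0000"]}]
found = merge_any_key(records, "H2", "H2, H3")
```

Here the match is found under 'K2', so 'H3' is appended there and 'K3' is left alone.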

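Finally, if a single row of input needs to change several dictionaries, change the 'return' statements to 'break'. That stops the scan of the current dictionary only, so I believe the outer loop will then carry on to the remaining dictionaries.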
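To make the difference concrete, here is a small sketch of a variant that updates every dictionary containing the col2 value instead of stopping at the first one. The function name `merge_all_matches` and the sample records are made up for illustration.

```python
def merge_all_matches(test_list, col2_data, col3_data, key="K3"):
    """Merge col3 items into every dictionary whose `key` list contains col2_data.

    Returns the number of dictionaries that were updated.
    """
    new_items = col3_data.split(",")
    updated = 0
    for row_dict in test_list:
        current = row_dict.get(key, [])  # empty list if the key is missing
        if col2_data in current:
            for item in new_items:
                if item not in current:
                    current.append(item)
            updated += 1  # no return here, so later dictionaries are checked too
    return updated


# Hypothetical data in which the same value appears in two dictionaries.
records = [
    {"col1": "A1", "K3": ["9.9.9.9:9999"]},
    {"col1": "B2", "K3": ["9.9.9.9:9999", "8.8.8.8:8888"]},
]
count = merge_all_matches(records, "9.9.9.9:9999", "9.9.9.9:9999,7.7.7.7:7777")
```

Both dictionaries receive '7.7.7.7:7777', whereas a version that returns on the first match would only touch the first one.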

To run the example:

In [1]: so.test_list
Out[1]:
[{'col1': 'A1', 'K2': ['H1', 'H2'], 'K3': ['1.1.1.1:1111', '0.0.0.0:0000']},
 {'col1': 'B2', 'K2': ['H4', 'H5'], 'K3': ['2.2.2.2:2222', '8.8.8.8:8888']}]

In [2]: sheet_processor(so.test_list, so.example_data)

In [3]: so.test_list
Out[3]:
[{'col1': 'A1',
  'K2': ['H1', 'H2'],
  'K3': ['0.0.0.0:0000', '1.1.1.1:1111', '2.2.2.2:1111', '3.3.3.3:1111', '4.4.4.4:1111']},
 {'col1': 'B2',
  'K2': ['H4', 'H5'],
  'K3': ['2.2.2.2:2222', '5.5.5.5:5555', '6.6.6.6:6666', '8.8.8.8:8888']}]

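
How are you currently connecting Python to the Excel file? There are a few different ways, e.g. via COM or via a module.

Hi Vector, to open the Excel file in Python I use the pandas module to query it, like --> efile=pd.read_excel('C:\\text.xlsx',sheet_name='tab1').values.tolist(). That gave me the same output as the example.

Thank you very much @VectorVictor for your help and reply :). Both work fine. One thing: it replaces the entire K3 value with the new value instead of replacing only the matched item, i.e. the 0.0.0.0 and 8.8.8.8 values get overwritten as well. The expected o/p is below. Can you suggest: --> [{'col1':'A1','K2':['H1','H2'],'K3':['1.1.1.1:1111','2.2.2.2:1111','3.3.3.3:1111','4.4.4.4:1111','0.0.0.0:0000']},{'col1':'B2','K2':['H4','H5'],'K3':['2.2.2.2:2222','5.5.5.5:5555']}]

I've edited the code so that it keeps the existing data in col3.

Thanks, much appreciated.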
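
The expected output quoted in the comments keeps the new items at the position of the matched entry, with '0.0.0.0:0000' left at the end, which a sorted merge does not reproduce. One possible way to get that ordering is to splice the col3 items in where the match was found. This is a sketch under that reading of the expected output; `splice_match` is a hypothetical helper name.

```python
def splice_match(items, col2_data, col3_data):
    """Return a new list in which col2_data is replaced by the col3 items.

    Items other than the match keep their relative order; duplicates are dropped.
    """
    new_items = col3_data.split(",")
    result = []
    for item in items:
        # Expand the matched entry into the col3 items; keep other entries as-is.
        candidates = new_items if item == col2_data else [item]
        for candidate in candidates:
            if candidate not in result:
                result.append(candidate)
    return result


k3 = ["1.1.1.1:1111", "0.0.0.0:0000"]
spliced = splice_match(
    k3, "1.1.1.1:1111", "1.1.1.1:1111,2.2.2.2:1111,3.3.3.3:1111,4.4.4.4:1111"
)
```

Because the function returns a new list, the caller would assign it back, for example `row_dict['K3'] = splice_match(row_dict['K3'], col2_data, col3_data)`.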