Python 3.x: Compare values in an Excel column and replace them in a list of dictionaries
I have a requirement where I need to take a specific column from an Excel file, compare each exact string against a list of dictionaries, and replace it if a match is found.
For example, I have a sample Excel workbook, test.xlsx, with one sheet/tab named "tab1".
My current dictionary looks like this:
test_dict=[{'col1':'A1','K2':['H1','H2'],'K3':['1.1.1.1:1111','0.0.0.0:0000']},{'col1':'B2','K2':['H4','H5'],'K3':['2.2.2.2:2222','8.8.8.8:8888']}]
and so on.
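So what I need is to take col2 from the Excel sheet, compare it with the test_dict dictionaries, and if a match is found, replace it with col3.
My expected dictionary output needs to be:
test_dict=[{'col1':'A1','K2':['H1','H2'],'K3':['1.1.1.1:1111','2.2.2.2:1111','3.3.3.3:1111','4.4.4.4:1111','0.0.0.0:0000']},{'col1':'B2','K2':['H4','H5'],'K3':['2.2.2.2:2222','5.5.5.5:5555', ...]}]
I'm new to Python and need some suggestions on how to achieve this. I tried converting the Excel data into another dictionary, then comparing it with the existing dictionary and replacing, but could not get the result. Please help and advise.

I have listed some sample code below that may work for you, depending on the type and shape of your input Excel data.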
test_list = [{'col1': 'A1',
              'K2': ['H1', 'H2'],
              'K3': ['1.1.1.1:1111', '0.0.0.0:0000']},
             {'col1': 'B2',
              'K2': ['H4', 'H5'],
              'K3': ['2.2.2.2:2222', '8.8.8.8:8888']}
             ]

example_data = [['A1', '1.1.1.1:1111', '1.1.1.1:1111,2.2.2.2:1111,3.3.3.3:1111,4.4.4.4:1111'],
                ['B2', '2.2.2.2:2222', '2.2.2.2:2222,5.5.5.5:5555,6.6.6.6:6666']
                ]


def compare_K3(test_list, col2_data, col3_data):
    """This function assumes that relevant data is always in K3"""
    for i, row_dict in enumerate(test_list):  # i will be list index, row_dict will be col_dict
        data_list = row_dict.get('K3', [])  # empty list if 'K3' is not in the dict
        for ref_item in data_list:  # will be '1.1.1.1:1111' or similar
            if col2_data == ref_item:
                k3_data = test_list[i]['K3']
                col3_data_items = col3_data.split(',')
                for new_item in col3_data_items:
                    if new_item not in k3_data:
                        k3_data.append(new_item)
                # the sorting line below is optional, added to improve readability
                # if order is unimportant, test_list[i]['K3'] will be updated without this line since
                # k3_data and test_list[i]['K3'] are the same object
                test_list[i]['K3'] = sorted(k3_data)
                # change return to break if there are multiple instances of col2 data to change
                return


def compare_general(test_list, col2_data, col3_data):
    """This function does not make any assumptions about the relevant data"""
    for i, row_dict in enumerate(test_list):  # i will be list index, row_dict will be col_dict
        for col_key, data_list_or_str in row_dict.items():
            if isinstance(data_list_or_str, list):
                data_list = data_list_or_str
                for ref_item in data_list:  # will be '1.1.1.1:1111' or similar
                    if col2_data == ref_item:
                        test_list[i][col_key] = col3_data
                        # change return to break if there are multiple instances
                        # of col2 data to change
                        return
            else:  # not a list
                data_str = data_list_or_str
                if col2_data == data_str:
                    test_list[i][col_key] = col3_data
                    # change return to break if there are multiple instances of
                    # col2 data to change
                    return


def sheet_processor(test_list, sheet_data):
    """Sheet data is a two dimensional list of rows and sublists of columns"""
    for row in sheet_data:
        col2_data = row[1]
        col3_data = row[2]
        compare_K3(test_list, col2_data, col3_data)
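I have renamed your test_dict to test_list, since it is actually a list of dictionaries.
It is not clear from your question how you get the data from Excel into Python, so for simplicity I represented it as a two-dimensional list of rows and columns. If the sheet data is a dictionary or something else, the function sheet_processor can be modified accordingly.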
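If the rows arrive as dictionaries rather than lists (for example, the shape that pandas' `DataFrame.to_dict('records')` returns), a minimal sketch of an adapted processor might look like the following. The names `merge_into_k3` and `sheet_processor_records`, and the column keys 'col2' and 'col3', are assumptions made for illustration and are not part of the code above.

```python
def merge_into_k3(test_list, col2_data, col3_data):
    """Hypothetical helper: same idea as compare_K3, but without sorting."""
    for row_dict in test_list:
        k3_data = row_dict.get('K3', [])  # empty list if 'K3' is missing
        if col2_data in k3_data:
            for new_item in col3_data.split(','):
                if new_item not in k3_data:
                    k3_data.append(new_item)
            return  # stop at the first dictionary that matches


def sheet_processor_records(test_list, records, match_key='col2', new_key='col3'):
    """Process rows given as dicts; the key names are assumed, adjust to your sheet."""
    for record in records:
        merge_into_k3(test_list, record[match_key], record[new_key])


test_list = [
    {'col1': 'A1', 'K2': ['H1', 'H2'], 'K3': ['1.1.1.1:1111', '0.0.0.0:0000']},
]
records = [
    {'col1': 'A1', 'col2': '1.1.1.1:1111', 'col3': '1.1.1.1:1111,2.2.2.2:1111'},
]
sheet_processor_records(test_list, records)
print(test_list[0]['K3'])
```

Only the way each row is indexed changes; the comparison logic stays the same.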
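I also assumed that the relevant comparison data is in 'K3'. If it could be in another entry, you can use compare_general, which loops over every entry, although it is a bit more verbose. Note that compare_general overwrites the matched entry with the col3 string instead of merging into it.
Finally, if a single input row needs to change multiple col3 entries, change the 'return' statement to 'break'. I think that should let the outer loop continue.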
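To illustrate the difference between the two statements, here is a small sketch under the assumption that the same value can appear in the 'K3' list of more than one dictionary. The function name `update_all_matches` and the sample addresses are made up for this example.

```python
def update_all_matches(test_list, col2_data, col3_data):
    """Hypothetical variant of compare_K3 that updates every matching dictionary."""
    updated = 0
    for row_dict in test_list:
        k3_data = row_dict.get('K3', [])
        for ref_item in k3_data:
            if col2_data == ref_item:
                for new_item in col3_data.split(','):
                    if new_item not in k3_data:
                        k3_data.append(new_item)
                updated += 1
                # break leaves only the inner loop, so the outer loop
                # moves on to the next dictionary; return would stop here
                break
    return updated


sample = [
    {'col1': 'A1', 'K3': ['9.9.9.9:9999']},
    {'col1': 'B2', 'K3': ['9.9.9.9:9999', '8.8.8.8:8888']},
]
count = update_all_matches(sample, '9.9.9.9:9999', '9.9.9.9:9999,7.7.7.7:7777')
print(count, sample)
```

With `return` in place of `break`, only the first dictionary would be updated.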
To run the example:
In [1]: so.test_list
Out[1]:
[{'col1': 'A1', 'K2': ['H1', 'H2'], 'K3': ['1.1.1.1:1111', '0.0.0.0:0000']},
 {'col1': 'B2', 'K2': ['H4', 'H5'], 'K3': ['2.2.2.2:2222', '8.8.8.8:8888']}]
In [2]: so.sheet_processor(so.test_list, so.example_data)
In [3]: so.test_list
Out[3]:
[{'col1': 'A1',
  'K2': ['H1', 'H2'],
  'K3': ['0.0.0.0:0000', '1.1.1.1:1111', '2.2.2.2:1111', '3.3.3.3:1111', '4.4.4.4:1111']},
 {'col1': 'B2',
  'K2': ['H4', 'H5'],
  'K3': ['2.2.2.2:2222', '5.5.5.5:5555', '6.6.6.6:6666', '8.8.8.8:8888']}]
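How are you currently connecting Python to the Excel file? There are several different ways, for example via COM or via a module.

Hi Vector, I open the Excel file in Python with the pandas module, like this: efile = pd.read_excel('C:\\text.xlsx', sheet_name='tab1').values.tolist(). That gives me the same output as your example.

Thanks a lot @VectorVictor for the help and reply :). Both functions work fine. One thing: it replaces the entire K3 value with the new value instead of replacing only the matched item, i.e. the 0.0.0.0 and 8.8.8.8 values get overwritten as well. The expected o/p would be as below. Can you suggest: [{'col1':'A1','K2':['H1','H2'],'K3':['1.1.1.1:1111','2.2.2.2:1111','3.3.3.3:1111','4.4.4.4:1111','0.0.0.0:0000']},{'col1':'B2','K2':['H4','H5'],'K3':['2.2.2.2:2222','5.5.5.5:5555', ...]}]

I have edited the code so that it keeps the existing data in col3.

Thanks, I really appreciate your help.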
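The expected output quoted in the comments keeps the untouched entries in their original positions rather than sorting them. A minimal sketch of that variant, assuming the matched item should be replaced in place by the comma-separated col3 items, could look like this. The function name `splice_replacement` and its `key` parameter are hypothetical.

```python
def splice_replacement(test_list, col2_data, col3_data, key='K3'):
    """Replace the matching entry in place with the col3 items, keeping the rest."""
    new_items = col3_data.split(',')
    for row_dict in test_list:
        values = row_dict.get(key, [])
        if col2_data in values:
            pos = values.index(col2_data)
            # Drop existing items that are about to be inserted, to avoid duplicates.
            before = [v for v in values[:pos] if v not in new_items]
            after = [v for v in values[pos + 1:] if v not in new_items]
            row_dict[key] = before + new_items + after
            return True  # stop at the first dictionary that matches
    return False


data = [
    {'col1': 'A1', 'K2': ['H1', 'H2'], 'K3': ['1.1.1.1:1111', '0.0.0.0:0000']},
    {'col1': 'B2', 'K2': ['H4', 'H5'], 'K3': ['2.2.2.2:2222', '8.8.8.8:8888']},
]
rows = [
    ['A1', '1.1.1.1:1111', '1.1.1.1:1111,2.2.2.2:1111,3.3.3.3:1111,4.4.4.4:1111'],
    ['B2', '2.2.2.2:2222', '2.2.2.2:2222,5.5.5.5:5555,6.6.6.6:6666'],
]
for row in rows:
    splice_replacement(data, row[1], row[2])
print(data)
```

Because nothing is sorted, '0.0.0.0:0000' and '8.8.8.8:8888' stay at the end of their lists.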