Python: Merging YAML files with overriding values in list elements


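I want to merge two YAML files that contain list elements: (A) and (B) should be merged into a new file (C).

I want to overwrite the existing attribute values of list entries in (A) if they are also defined in (B).

I want to add new attributes to a list entry if they are not defined in (A) but are defined in (B).

I also want to add new list entries from (B) if they do not exist in (A).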

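YAML file A:

list:
  - id: 1
    name: "name-from-A"
  - id: 2
    name: "name-from-A"

YAML file B:

list:
  - id: 1
    name: "name-from-B"
  - id: 2
    title: "title-from-B"
  - id: 3
    name: "name-from-B"
    title: "title-from-B"

The merged YAML file (C) that I would like to generate:

list:
  - id: 1
    name: "name-from-B"
  - id: 2
    name: "name-from-A"
    title: "title-from-B"
  - id: 3
    name: "name-from-B"
    title: "title-from-B"
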
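I need this functionality in a Bash script, but Python is available in the environment.

Is there a standalone YAML processor (such as yq) that can do this?

How could I implement something like this in a Python script?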

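You can do this with the ruamel.yaml Python package.

If Python is already installed, run the following command in a terminal:

pip install ruamel.yaml

Python code, adapted from another source (tested, works well):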

import ruamel.yaml

yaml = ruamel.yaml.YAML()

# Load the YAML files.
# Note the roles: test1.yaml supplies the overriding values (file B in the
# question) and test2.yaml is the base that gets updated (file A).
with open('/test1.yaml') as fp:
    data = yaml.load(fp)
with open('/test2.yaml') as fp:
    data1 = yaml.load(fp)

# ids from test1.yaml that were merged into an existing entry
merged = set()

# Merge the 'list' entries of test1.yaml into the matching entries of test2.yaml
for i in data1['list']:
    for j in data['list']:
        # same 'id': values from test1.yaml win
        if i['id'] == j['id']:
            i.update(j)
            merged.add(i['id'])

# Append the entries of test1.yaml whose id does not exist in test2.yaml
for j in data['list']:
    if j['id'] not in merged:
        data1['list'].append(j)

# Write the merged YAML to a new file
with open('/merged.yaml', 'w') as yaml_file:
    yaml.dump(data1, yaml_file)

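You can merge the YAML files passed on the command line:

import sys
import yaml  # PyYAML


def merge_dict(m_list, s):
    # Update the entry with the same id, or append s as a new entry
    for m in m_list:
        if m['id'] == s['id']:
            m.update(**s)
            return
    m_list.append(s)


merged_list = []
for f in sys.argv[1:]:
    with open(f) as s:
        for source in yaml.safe_load(s)['list']:
            merge_dict(merged_list, source)
print(yaml.dump({'list': merged_list}), end='')

Result: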

list:
- id: 1
  name: name-from-B
- id: 2
  name: name-from-A
  title: title-from-B
- id: 3
  name: name-from-B
  title: title-from-B
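
Both answers above compare list entries pairwise. For longer lists, an index keyed by id avoids the nested scan. The following is only a minimal sketch on already-parsed data (plain dicts and lists, no YAML library); the function name merge_by_id and the variables file_a and file_b are hypothetical, and it assumes every entry carries a unique 'id' within its file.

```python
def merge_by_id(base_items, override_items, key="id"):
    """Merge two lists of dicts, matching entries on `key`.

    Values from override_items win; unmatched override entries are appended.
    The input lists are not modified.
    """
    merged = {}  # dicts keep insertion order (Python 3.7+)
    for item in base_items:
        merged[item[key]] = dict(item)
    for item in override_items:
        merged.setdefault(item[key], {}).update(item)
    return list(merged.values())


# The 'list' entries of files A and B from the question
file_a = [
    {"id": 1, "name": "name-from-A"},
    {"id": 2, "name": "name-from-A"},
]
file_b = [
    {"id": 1, "name": "name-from-B"},
    {"id": 2, "title": "title-from-B"},
    {"id": 3, "name": "name-from-B", "title": "title-from-B"},
]

result = merge_by_id(file_a, file_b)
for item in result:
    print(item)
```

Entries keep the order of the base list, and entries that exist only in the overriding list follow at the end, which matches the desired file (C).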
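
Based on these answers (thanks, everyone), I created a solution that handles, in a fairly generic way, all the merge features I need at the moment (I need to use it on many different kinds of Kubernetes descriptors).

It is based on ruamel.yaml.

It handles multi-level lists, and it merges list elements not just by index but by proper item identity.

It is more complex than I had hoped (it traverses the YAML tree).

The script and its core methods: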

import ruamel.yaml
from ruamel.yaml.comments import CommentedMap, CommentedSeq


#
# Merges a node from B with its pair in A
#
# If the node exists in both A and B, it will merge
# all children in sync
#
# If the node only exists in A, it will do nothing.
#
# If the node only exists in B, it will add it to A and stop
#
# attrPath DOES NOT include attrName
#
def mergeAttribute(parentNodeA, nodeA, nodeB, attrName, attrPath):

    # If both are None, there is nothing to merge
    if (nodeA is None) and (nodeB is None):
        return

    # If NodeA is None but NodeB has value, we simply set it in A
    if (nodeA is None) and (parentNodeA is not None):
        parentNodeA[attrName] = nodeB
        return

    if attrPath == '':
        attrPath = attrName
    else:
        attrPath = attrPath + '.' + attrName

    if isinstance(nodeB, CommentedSeq):

        # The attribute is a list, we need to merge specially
        mergeList(nodeA, nodeB, attrPath)

    elif isinstance(nodeB, CommentedMap):

        # A simple object to be merged
        mergeObject(nodeA, nodeB, attrPath)

    else:
        # Primitive type, simply overwrites
        parentNodeA[attrName] = nodeB


#
# Lists object attributes and merges the attribute values if possible
#
def mergeObject(nodeA, nodeB, attrPath):

    for attrName in nodeB:

        subNodeA = None
        if attrName in nodeA:
            subNodeA = nodeA[attrName]

        subNodeB = None
        if attrName in nodeB:
            subNodeB = nodeB[attrName]

        mergeAttribute(nodeA, subNodeA, subNodeB, attrName, attrPath)


#
# Merges two lists by properly identifying each item in both lists
# (using the merge-directives).
#
# If an item of listB is identified in listA, it will be merged onto the item
# of listA
#
def mergeList(listA, listB, attrPath):

    # Iterating the list from B
    for itemInB in listB:

        itemInA = findItemInList(listA, itemInB, attrPath)

        if itemInA is None:
            listA.append(itemInB)
            continue

        # Present in both, we need to merge them
        mergeObject(itemInA, itemInB, attrPath)


#
# Finds an item in the list by using the appropriate ID field defined for that
# attribute-path.
#
# If there is no id attribute defined for the list, it returns None
#
def findItemInList(listA, itemB, attrPath):

    if attrPath not in listsWithId:
        # No id field defined for the list, only "dumb" merging is possible
        return None

    # Finding out the name of the id attribute in the list items
    idAttrName = listsWithId[attrPath]

    idB = None
    if idAttrName is not None:
        idB = itemB[idAttrName]

    # Looking for the item by its ID
    for itemA in listA:

        idA = None
        if idAttrName is not None:
            idA = itemA[idAttrName]

        if idA == idB:
            return itemA

    return None

# ------------------------------------------------------------------------------


yaml = ruamel.yaml.YAML()

# Load the merge directives
with open('merge-directives.yaml') as fp:
    mergeDirectives = yaml.load(fp)

listsWithId = mergeDirectives['lists-with-id']

# Load the yaml files
with open('a.yaml') as fp:
    dataA = yaml.load(fp)

with open('b.yaml') as fp:
    dataB = yaml.load(fp)

mergeObject(dataA, dataB, '')

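# create a new file with the merged yaml
# (the original used the Python 2 builtin file(), which no longer exists in Python 3)
with open('c.yaml', 'w') as fp:
    yaml.dump(dataA, fp)
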
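The helper configuration file (merge-directives.yaml) specifies which attribute identifies the elements of each list, including nested lists.

For the data structure in the original question, only the 'list: "id"' entry is needed, but I included a few other keys to demonstrate the usage.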

#
# Lists that contain identifiable elements.
#
# Each sub-key is a property path denoting the list element in the YAML 
# data structure.
#
# The value is the name of the attribute in the list element that
# identifies the list element so that pairing can be made.
#
lists-with-id:
    list: "id"
    list.sub-list: "id"
    a.listAttrShared: "name"
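
To make the path convention concrete: the keys under lists-with-id are dotted attribute paths, built the same way mergeAttribute builds attrPath in the script. List items themselves do not add a path segment, so a sub-list inside an item of list is addressed as list.sub-list. The sketch below only illustrates the lookup; child_path and id_field_for are hypothetical helper names that do not appear in the script.

```python
# Same mapping as the lists-with-id section, written as a Python dict.
lists_with_id = {
    "list": "id",
    "list.sub-list": "id",
    "a.listAttrShared": "name",
}


def child_path(parent_path, attr_name):
    """Build the dotted path: the attribute name is appended to the parent path."""
    return attr_name if parent_path == "" else parent_path + "." + attr_name


def id_field_for(path):
    """Return the identifying attribute for the list at `path`, or None."""
    return lists_with_id.get(path)


top = child_path("", "list")
nested = child_path(top, "sub-list")
unlisted = child_path(top, "sub-list-with-no-identification")

print(top, id_field_for(top))
print(nested, id_field_for(nested))
print(unlisted, id_field_for(unlisted))
```

A path with no entry yields None, and in that case the script falls back to simply appending the items from B.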

It has not been tested extensively yet, but here are two test files that are more thorough than the ones in the original question.

a.yaml:

a:
    attrShared: value-from-a
    listAttrShared:
        - name: a1
        - name: a2
    attrOfAOnly: value-from-a
list:
    - id: 1
      name: "name-from-A"
      sub-list:
          - id: s1
            name: "name-from-A"
            comments: "doesn't exist in B, so left untouched"
          - id: s2
            name: "name-from-A"
      sub-list-with-no-identification:
          - "comment 1"
          - "comment 2"
    - id: 2
      name: "name-from-A"

b.yaml:

a:
    attrShared: value-from-b
    listAttrShared:
        - name: b1
        - name: b2
    attrOfBOnly: value-from-b
list:
    - id: 1
      name: "name-from-B"
      sub-list:
          - id: s2
            name: "name-from-B"
            title: "title-from-B"
            comments: "overwrites name in A with name in B + adds title from B"
          - id: s3
            name: "name-from-B"
            comments: "only exists in B so added to A's list"
      sub-list-with-no-identification:
          - "comment 3"
          - "comment 4"
    - id: 2
      title: "title-from-B"
    - id: 3
      name: "name-from-B"
      title: "title-from-B"
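
To check what the merge should produce for these two test files without installing ruamel.yaml, the same traversal can be sketched on plain dicts and lists. This is a simplified approximation, not the script above: merge_maps and merge_lists are hypothetical names, the data is typed in by hand instead of being loaded from YAML, and the comments attributes are left out for brevity.

```python
LISTS_WITH_ID = {"list": "id", "list.sub-list": "id", "a.listAttrShared": "name"}


def merge_maps(a, b, path):
    """Merge mapping b into mapping a in place."""
    for key, b_val in b.items():
        sub_path = key if path == "" else path + "." + key
        a_val = a.get(key)
        if a_val is None:
            a[key] = b_val                   # only in B: copy over
        elif isinstance(b_val, list):
            merge_lists(a_val, b_val, sub_path)
        elif isinstance(b_val, dict):
            merge_maps(a_val, b_val, sub_path)
        else:
            a[key] = b_val                   # scalar: B overwrites A


def merge_lists(list_a, list_b, path):
    """Merge list_b into list_a, pairing items by the id configured for path."""
    id_key = LISTS_WITH_ID.get(path)
    for item_b in list_b:
        match = None
        if id_key is not None:
            match = next((x for x in list_a if x[id_key] == item_b[id_key]), None)
        if match is None:
            list_a.append(item_b)            # unknown item or no id: append
        else:
            merge_maps(match, item_b, path)  # items add no path segment


data_a = {
    "a": {
        "attrShared": "value-from-a",
        "listAttrShared": [{"name": "a1"}, {"name": "a2"}],
        "attrOfAOnly": "value-from-a",
    },
    "list": [
        {
            "id": 1,
            "name": "name-from-A",
            "sub-list": [
                {"id": "s1", "name": "name-from-A"},
                {"id": "s2", "name": "name-from-A"},
            ],
            "sub-list-with-no-identification": ["comment 1", "comment 2"],
        },
        {"id": 2, "name": "name-from-A"},
    ],
}
data_b = {
    "a": {
        "attrShared": "value-from-b",
        "listAttrShared": [{"name": "b1"}, {"name": "b2"}],
        "attrOfBOnly": "value-from-b",
    },
    "list": [
        {
            "id": 1,
            "name": "name-from-B",
            "sub-list": [
                {"id": "s2", "name": "name-from-B", "title": "title-from-B"},
                {"id": "s3", "name": "name-from-B"},
            ],
            "sub-list-with-no-identification": ["comment 3", "comment 4"],
        },
        {"id": 2, "title": "title-from-B"},
        {"id": 3, "name": "name-from-B", "title": "title-from-B"},
    ],
}

merge_maps(data_a, data_b, "")
print(data_a["a"])
for item in data_a["list"]:
    print(item)
```

In this sketch, lists without a configured id (sub-list-with-no-identification) end up as the items of A followed by the items of B, and a.listAttrShared grows to four entries because none of the names match.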

What have you tried so far? Show us some code!