Python: comparing config data text against default data text


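I am trying to work out how to compare the data in two text files and print the values that do not match to a new document or to the output.

The goals of the program are:

Let the user compare the data in a file containing many rows of data against a default file that holds the correct values.
Compare multiple sets of differing data that share the same parameters against a default list of data with those same parameters.

Example: suppose I have the following text document containing these parameters and data. We will call it Config.txt: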

<231931844151>
Bird = 3
Cat = 4
Dog = 5
Bat = 10
Tiger = 11
Fish = 16

<92103884812>
Bird = 4
Cat = 40
Dog = 10
Bat = Null
Tiger = 19
Fish = 24

etc. etc.

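Summary: I want to compare two text documents containing data/parameters. One document contains many sets of data with the same parameters, while the other contains only a single set of data with those same parameters. I need to compare the parameters and print the ones that do not match the defaults. How can I do this in Python?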

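Edit: OK, thanks to @Maria's code I think I am almost there. Now I just need to work out how to compare the dictionaries with the list and print the differences. Here is an example of what I am trying to do: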

for i in range(len(setNames)):
    print(setNames[i])
    for k in setData[i]:
        if k in dataDefault:
            print(dataDefault)

Obviously the print lines are only there to see whether it works, but I am not sure this is the right approach.

Why not just use those dicts and loop over them to compare?

for key in outdict:
    # report only the entries whose value differs from the default
    if defdict.get(key) != outdict[key]:
        print(outdict[key])
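
To make the comparison concrete, here is a minimal sketch of the dictionary loop above wrapped in a function. The name `find_mismatches` and the sample default values are made up for illustration; the question does not show the default file, so treat them as assumptions.

```python
def find_mismatches(outdict, defdict):
    """Return {key: (actual, default)} for every key whose value differs.

    A key missing from one side is reported with None on that side.
    """
    mismatches = {}
    for key in sorted(set(outdict) | set(defdict)):
        actual = outdict.get(key)
        default = defdict.get(key)
        if actual != default:
            mismatches[key] = (actual, default)
    return mismatches


# Hypothetical default values, not taken from the question.
defdict = {"Bird": "3", "Cat": "4", "Dog": "5"}
outdict = {"Bird": "4", "Cat": "4", "Dog": "10"}

for key, (actual, default) in find_mismatches(outdict, defdict).items():
    print(f"{key}: got {actual}, expected {default}")
```

Returning the mismatches instead of printing them inside the loop leaves the caller free to print them, write them to a file, or count them.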

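Here is sample code for parsing the file into separate dictionaries. It works by looking for the group separator (a blank line). setNames[i] is the name of the parameter set stored in the dictionary at setData[i]. Alternatively, you could create an object that holds a string name member and a dictionary data member, and keep a list of those objects. How you do the comparison and output it is up to you; this code just echoes the input file back to the command line in a slightly different format.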

# The function you wrote
def make_dict(data):
    # key: first whitespace-separated token (the parameter name); value: the whole line
    return dict((line.split(None, 1)[0], line) for line in data)

# open the file in text mode and read the lines into a list of strings
with open("Config.txt", "r") as f:
    dataConfig = f.read().splitlines()

# get rid of trailing '', as they cause problems and are unnecessary
while dataConfig and dataConfig[-1] == '':
    dataConfig.pop()

# find the indexes of all the ''. Each is one index past the end of a set of parameters
setEnds = []
index = 0
while '' in dataConfig[index:]:
    setEnds.append(dataConfig.index('', index))
    index = setEnds[-1] + 1

# separate out your input into separate dictionaries, and keep track of the name of each dictionary
setNames = []
setData = []

i = 0
j = 0
while j < len(setEnds):
    setNames.append(dataConfig[i])
    setData.append(make_dict(dataConfig[i+1:setEnds[j]]))
    i = setEnds[j] + 1
    j += 1

# handle the last set, which runs to the end of the list (this also covers a file with a single set and no blank lines).
# Alternatively you could add len(dataConfig) to the end of setEnds and you wouldn't need this
if i < len(dataConfig):
    setNames.append(dataConfig[i])
    setData.append(make_dict(dataConfig[i+1:]))

# regurgitate the input to prove it worked the way you wanted.
for i in range(len(setNames)):
    print(setNames[i])
    for k in setData[i]:
        print("\t" + k + ": " + setData[i][k])
    print("")
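
The alternative mentioned above, one object per parameter set with a name member and a data dictionary, might look like the sketch below. `ParamSet` and `parse_sets` are hypothetical names, and splitting each line on `=` is an assumption about the file format based on the Config.txt sample.

```python
from dataclasses import dataclass, field


@dataclass
class ParamSet:
    name: str                                  # header line of the set
    data: dict = field(default_factory=dict)   # parameter name -> value string


def parse_sets(lines):
    """Group lines into ParamSet objects, using blank lines as separators."""
    sets = []
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            current = None                     # a blank line closes the current set
        elif current is None:
            current = ParamSet(name=line)      # first line of a group is its name
            sets.append(current)
        else:
            # assumed format: "Name = value"
            key, _, value = line.partition("=")
            current.data[key.strip()] = value.strip()
    return sets


sample = ["<231931844151>", "Bird = 3", "Cat = 4", "",
          "<92103884812>", "Bird = 4", "Cat = 40"]
for s in parse_sets(sample):
    print(s.name, s.data)
```

Because the parser tracks whether it is inside a set, it does not need the separate pass that collects blank-line indexes, and repeated blank lines between sets are harmless.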


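Can you update the data example so that it shows the problem and explains which entries should and should not match?

@scytale I edited the original post to show which parameters do not match and why.

OK, so how is your code behaving incorrectly?

@JeanP If the config data file always separates these entries with blank lines, you should parse the file so that they are kept apart. Build a data structure that stores the label for a set of values together with a dictionary of those values. While the file is not empty, read everything up to the next blank line and parse it accordingly.

@JeanP Don't use splitlines; just call read() and then process the input yourself (and be mindful of the file size when calling read with no argument). Try passing \n\n or \r\n\r\n as the separator. Optionally, process each line in dataConfig one at a time until you reach a blank entry (just \n or \r\n).

OK, I think I understand this. Can I take these setData[i] and compare them against the default file? As far as I can tell, setData[i] holds the values of the parameters (cat, dog, bat, fish, and tiger).

Yes, setData[i] is the dict for setNames[i], so in your example, for the corresponding setNames[i], setData[i] would be {'Bird': 3, 'Cat': 4, 'Dog': 5, 'Bat': 10, 'Tiger': 11, 'Fish': 16}, and you can use it like an ordinary dictionary.

OK, so my next steps are: 1. Get setData[i] from the default text. 2. Compare the two setData[i] dictionaries and return whatever differs or does not match. 3. Print those results to an output text. Is that a good approach, or do you have other ideas?

You don't need to store the default text in those lists. You can create a separate dictionary for the default data. You will then need to compare each setData against defaultData, and you can output the result however you like. It looks like you already have the general idea, though.

What is a simple way to compare the defaultData list against the setData[i] dictionaries?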
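
As one possible answer to that last question, the steps discussed in the comments (split the text on blank lines, compare each set against the defaults, collect the differences for output) can be sketched as follows. `parse_blocks`, `report`, the default values, and the `mismatches.txt` file name are all hypothetical, and the `Name = value` line format is assumed from the Config.txt sample.

```python
def parse_blocks(text):
    """Split text on blank lines; return a list of (name, {param: value})."""
    text = text.replace("\r\n", "\n")          # normalise Windows line endings
    result = []
    for block in text.split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue                           # skip empty blocks from extra blank lines
        params = {}
        for ln in lines[1:]:
            key, _, value = ln.partition("=")
            params[key.strip()] = value.strip()
        result.append((lines[0], params))
    return result


def report(config_text, default_params):
    """Build one report line for every value that differs from its default."""
    out = []
    for name, params in parse_blocks(config_text):
        for key, default in default_params.items():
            actual = params.get(key)
            if actual != default:
                out.append(f"{name} {key}: {actual} (default {default})")
    return out


config_text = "<231931844151>\nBird = 3\nCat = 4\n\n<92103884812>\nBird = 4\nCat = 40\n"
default_params = {"Bird": "3", "Cat": "4"}     # hypothetical defaults

lines = report(config_text, default_params)
print("\n".join(lines))
# To save the report instead, write the lines to a file of your choosing:
# with open("mismatches.txt", "w") as out:
#     out.write("\n".join(lines) + "\n")
```

Values are compared as strings here, so `4` and `4.0` would count as a mismatch; convert them to numbers first if that matters for your data.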
