Python: find missing lines in a file

I have a .txt file of more than 7000 lines, each holding a description and the path to an image, in order. For example:

abnormal /Users/alex/Documents/X-ray-classification/data/images/1.png
abnormal /Users/alex/Documents/X-ray-classification/data/images/2.png
normal /Users/alex/Documents/X-ray-classification/data/images/3.png
normal /Users/alex/Documents/X-ray-classification/data/images/4.png
A few lines are missing, and I would like to find them automatically. Intuitively I wrote:

f = open("data.txt", 'r')
lines = f.readlines()
num = 1
for line in lines:
    if num in line:
        continue
    else:
        print (line)
    num+=1
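But of course it doesn't work, because the lines are strings and num is an integer. Is there an elegant way to solve this? Maybe with a regular expression? Thanks in advance.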

You can try the following:

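lines = ["abnormal /Users/alex/Documents/X-ray-classification/data/images/1.png","normal /Users/alex/Documents/X-ray-classification/data/images/3.png","normal /Users/alex/Documents/X-ray-classification/data/images/4.png"]
maxvalue = 4  # or any other maximum value
missing = []
i = 0  # index of the next line that has not been matched yet
for num in range(1, maxvalue + 1):
    # The bounds check avoids an IndexError when the last numbers are missing.
    # Note: this is a plain substring test, so "1" also matches "10.png".
    if i >= len(lines) or str(num) not in lines[i]:
        missing.append(num)
    else:
        i += 1

print(missing)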
Or, if you want to check for lines that end with XXX.png:

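lines = ["abnormal /Users/alex/Documents/X-ray-classification/data/images/1.png","normal /Users/alex/Documents/X-ray-classification/data/images/3.png","normal /Users/alex/Documents/X-ray-classification/data/images/4.png"]
maxvalue = 4  # or any other maximum value
missing = []
i = 0  # index of the next line that has not been matched yet
for num in range(1, maxvalue + 1):
    # Include the leading "/" so that "1.png" does not match "11.png";
    # the bounds check avoids an IndexError when the last numbers are missing.
    if i >= len(lines) or not lines[i].endswith("/" + str(num) + ".png"):
        missing.append(num)
    else:
        i += 1

print(missing)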
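
If the order of the lines cannot be relied on, a different technique from the two loops above may be worth considering: collect the numbers that are present into a set and subtract them from the full expected range. This is only a sketch. The helper name find_missing_numbers and the regular expression are assumptions made for illustration, and the pattern assumes every relevant line ends with <number>.png.

```python
import re

# Hypothetical helper, not part of the original answer.
# Assumes each relevant line ends with "<number>.png".
FILENAME_NUMBER = re.compile(r"(\d+)\.png\s*$")


def find_missing_numbers(lines, maxvalue):
    """Return the sorted numbers in 1..maxvalue that have no matching line."""
    present = set()
    for line in lines:
        match = FILENAME_NUMBER.search(line)
        if match:
            present.add(int(match.group(1)))
    # Set difference: everything expected minus everything found.
    return sorted(set(range(1, maxvalue + 1)) - present)


lines = [
    "abnormal /Users/alex/Documents/X-ray-classification/data/images/1.png",
    "normal /Users/alex/Documents/X-ray-classification/data/images/3.png",
    "normal /Users/alex/Documents/X-ray-classification/data/images/4.png",
]
print(find_missing_numbers(lines, 5))  # [2, 5]
```

Unlike the index-based loops, this version does not depend on the lines being sorted, at the cost of holding all the numbers in memory, which should be unproblematic for a file of about 7000 lines.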

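The following should hopefully work. It extracts the number from the filename and checks whether it is more than 1 higher than the previous number; if so, it works out all the "in-between" numbers and prints them. It has to print numbers (and rebuild the filenames from them afterwards) because, during the iteration, line never contains the name of a missing file.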

import re

# Set this to the first number in the series -1
lastnum = 0

with open("data.txt", 'r') as f:
    for line in f:
        # Pick the number out of the filename at the end of the line
        match = re.search(r'(\d+)\.png\s*$', line)
        if match is None:
            continue  # skip blank or malformed lines
        num = int(match.group(1))
        if num - lastnum > 1:
            for i in range(lastnum + 1, num):
                print("Missing: {}.png".format(i))
        lastnum = num

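The main advantage of this approach is that, as long as the files are sorted in the list, it can handle a series that starts at a number other than 1, and it also reports several consecutive missing numbers in the sequence.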
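
To illustrate both properties, here is a minimal sketch of the same gap-detection idea wrapped in a generator so it can be tried on an in-memory list. The function name missing_in_sequence and the sample paths are hypothetical, and the sketch assumes each line ends with <number>.png. Unlike the snippet above, it starts from the first number it sees instead of a manually set starting value.

```python
import re


def missing_in_sequence(lines):
    """Yield the numbers skipped between consecutive, sorted entries."""
    lastnum = None  # no previous number seen yet
    for line in lines:
        match = re.search(r"(\d+)\.png\s*$", line)
        if match is None:
            continue  # skip blank or unrelated lines
        num = int(match.group(1))
        if lastnum is not None:
            # Every number strictly between the two neighbours is missing.
            yield from range(lastnum + 1, num)
        lastnum = num


# Hypothetical sample data: starts at 7 and has two gaps.
sample = [
    "normal /data/images/7.png\n",
    "normal /data/images/8.png\n",
    "abnormal /data/images/12.png\n",
    "abnormal /data/images/14.png\n",
]
print(list(missing_in_sequence(sample)))  # [9, 10, 11, 13]
```

Because it is a generator, the same function could be handed an open file object directly, so the file need not be read into memory first.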

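Use str(num) for the string comparison?
Thanks, but there is an error in both versions: TypeError: 'bool' object is not iterable, in the same line 5
@AlexNikitin Thanks for the report, I fixed it and it should work now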