Python: hashing files identified by their first 4 bytes

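I'm trying to write a Python script that searches my current directory, identifies JPG files by their header, and then hashes those files. I'm all over the place with it. Any suggestions would be appreciated.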

from os import listdir, getcwd
from os.path import isfile, join, normpath, basename
import hashlib

jpgHead = b'\xff\xd8\xff\xe0'

def get_files():
    current_path = normpath(getcwd())
    return [join(current_path, f) for f in listdir(current_path) if 
isfile(join(current_path, f))] 

def checkJPG():
    checkJPG=checkJPG.read(4)
    if checkJPG==jpgHead
    get_hashes()

def get_hashes():
    files = checkJPG()
    list_of_hashes = []
    for each_file in files:
        hash_md5 = hashlib.md5()
        with open(each_file, "rb") as f: 
        list_of_hashes.append('Filename: {}\tHash: 
        {}\n'.format(basename(each_file), hash_md5.hexdigest()))
        return list_of_hashes

def write_jpgHashes():
    hashes=get_hashes()
    with open('list_of_hashes.txt', 'w') as f:
        for md5_hash in hashes:
        f.write(md5_hash)


if __name__ == '__main__':

write_jpgHashes()

I modified some of the functions; give this a try:

from os import listdir, getcwd
from os.path import isfile, join, normpath, basename
import hashlib

jpgHead = b'\xff\xd8\xff\xe0'

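def get_files(path=None):
    # Resolve the working directory at call time; a getcwd() default would be
    # evaluated only once, when the function is defined.
    current_path = normpath(path or getcwd())
    return [join(current_path, f) for f in listdir(current_path) if isfile(join(current_path, f))]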

def checkJPG(path):
    with open(path, 'rb') as f : 
        header = f.read(4)
    return header == jpgHead

def get_hashes():
    list_of_hashes = []
    for each_file in get_files() :
        if checkJPG(each_file) : 
            list_of_hashes.append('Filename: {}\tHash: {}\n'.format(each_file, md5hf(each_file)))
    return list_of_hashes

def md5hf(path): 
    #return hashlib.md5(open(path, "rb").read()).hexdigest()  ## you can use this line for small files ##  
    hash_md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda : f.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()

def write_jpgHashes():
    hashes=get_hashes()
    with open('list_of_hashes.txt', 'w') as f:
        for md5_hash in hashes:
            f.write(md5_hash)

if __name__ == '__main__':
    write_jpgHashes()
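
Notes:

Fixed some syntax and indentation errors.
Changed checkJPG to return a boolean.
Added the MD5 file hash to list_of_hashes in get_hashes.
Added the md5hf function to compute the MD5 checksum.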
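
The header check in the answer matches only files that start with FF D8 FF E0 (JFIF). JPEGs that carry Exif data typically start with FF D8 FF E1 instead, so they would be skipped. A minimal sketch of a looser check, assuming that comparing only the first three bytes is selective enough for your files; `JPEG_PREFIX`, `looks_like_jpeg`, `md5_of_file`, and `hash_jpegs` are hypothetical names, not part of the answer's code:

```python
import hashlib
import os

# Assumption: a JPEG starts with the SOI marker FF D8 followed by FF,
# while the fourth byte varies (E0 for JFIF, E1 for Exif, and so on).
JPEG_PREFIX = b"\xff\xd8\xff"


def looks_like_jpeg(path):
    """Return True if the file at `path` starts with the JPEG prefix."""
    with open(path, "rb") as f:
        return f.read(len(JPEG_PREFIX)) == JPEG_PREFIX


def md5_of_file(path, chunk_size=4096):
    """Hash the file in chunks so large files are not loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_jpegs(directory):
    """Map each JPEG file name in `directory` to its MD5 hex digest."""
    result = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and looks_like_jpeg(path):
            result[name] = md5_of_file(path)
    return result
```

Calling `hash_jpegs(os.getcwd())` would cover the current directory, as in the question. Returning a dictionary rather than preformatted strings leaves the output format to the caller.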
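
Please see the guidance on how to ask. You should provide a minimal, complete, and verifiable example so that others can reproduce your problem, and ask about a specific issue rather than posting source code and asking how to write it.

There are several problems with this code. Several blocks need to be indented, one if statement is missing its colon, get_files is defined but never called, checkJPG returns None while get_hashes expects it to return the files, and the two functions call each other in an infinite recursion. Also, checkJPG=checkJPG.read(4) is odd, and you never actually hash any file data.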
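
The last point in the comment can be checked directly: a `hashlib.md5()` object that never receives `update()` returns the digest of empty input, so the question's loop would have reported the same hash for every file. A small sketch; `unfed_digest` and `fed_digest` are illustrative names only:

```python
import hashlib

# A hash object that is never fed data yields the MD5 of the empty byte string.
unfed_digest = hashlib.md5().hexdigest()
print(unfed_digest)  # d41d8cd98f00b204e9800998ecf8427e

# Passing the data through update() is what makes the digest depend on it.
hasher = hashlib.md5()
hasher.update(b"example bytes standing in for file contents")
fed_digest = hasher.hexdigest()
print(fed_digest != unfed_digest)  # True
```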