Python: hashing files identified by their first 4 bytes
I'm trying to write a Python script that searches my current directory, identifies JPGs by their header, and then hashes those files. I'm all over the place. Any suggestions would be greatly appreciated.
from os import listdir, getcwd
from os.path import isfile, join, normpath, basename
import hashlib

jpgHead = b'\xff\xd8\xff\xe0'

def get_files():
    current_path = normpath(getcwd())
    return [join(current_path, f) for f in listdir(current_path) if
            isfile(join(current_path, f))]

def checkJPG():
    checkJPG=checkJPG.read(4)
    if checkJPG==jpgHead
        get_hashes()

def get_hashes():
    files = checkJPG()
    list_of_hashes = []
    for each_file in files:
        hash_md5 = hashlib.md5()
        with open(each_file, "rb") as f:
            list_of_hashes.append('Filename: {}\tHash: {}\n'.format(basename(each_file), hash_md5.hexdigest()))
    return list_of_hashes

def write_jpgHashes():
    hashes=get_hashes()
    with open('list_of_hashes.txt', 'w') as f:
        for md5_hash in hashes:
            f.write(md5_hash)

if __name__ == '__main__':
    write_jpgHashes()
I modified a few of the functions; give this a try:
from os import listdir, getcwd
from os.path import isfile, join, normpath, basename
import hashlib

jpgHead = b'\xff\xd8\xff\xe0'

def get_files(path = getcwd()):
    current_path = normpath(path)
    return [join(current_path, f) for f in listdir(current_path) if isfile(join(current_path, f))]

def checkJPG(path):
    with open(path, 'rb') as f:
        header = f.read(4)
    return header == jpgHead

def get_hashes():
    list_of_hashes = []
    for each_file in get_files():
        if checkJPG(each_file):
            list_of_hashes.append('Filename: {}\tHash: {}\n'.format(each_file, md5hf(each_file)))
    return list_of_hashes

def md5hf(path):
    #return hashlib.md5(open(path, "rb").read()).hexdigest() ## you can use this line for small files ##
    hash_md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()

def write_jpgHashes():
    hashes=get_hashes()
    with open('list_of_hashes.txt', 'w') as f:
        for md5_hash in hashes:
            f.write(md5_hash)

if __name__ == '__main__':
    write_jpgHashes()
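Notes:
Fixed some syntax and indentation errors
Changed checkJPG to return a boolean
Added the MD5 file hash to list_of_hashes in get_hashes
Added the md5hf function to compute the MD5 checksum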
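One caveat about the 4-byte jpgHead check: b'\xff\xd8\xff\xe0' matches JPEG files whose first segment is APP0 (the JFIF layout), but files that start with an APP1 segment, as Exif files from cameras typically do, begin with b'\xff\xd8\xff\xe1' and would be skipped. Below is a minimal sketch of a looser check that compares only the first three bytes. The names JPEG_PREFIX, looks_like_jpeg, and demo are hypothetical and not part of the code above, and the sample files are fake headers written only for illustration.

```python
import os
import tempfile

# Assumption: match only the SOI marker (FF D8) plus the leading FF of the
# next marker, instead of the full 4-byte JFIF prefix used in jpgHead.
JPEG_PREFIX = b"\xff\xd8\xff"


def looks_like_jpeg(path):
    """Return True if the file starts with the 3-byte JPEG prefix."""
    with open(path, "rb") as f:
        return f.read(len(JPEG_PREFIX)) == JPEG_PREFIX


def demo():
    """Write three tiny fake files and report which ones pass the check."""
    samples = {
        "jfif.jpg": b"\xff\xd8\xff\xe0" + b"\x00" * 8,  # JFIF-style start
        "exif.jpg": b"\xff\xd8\xff\xe1" + b"\x00" * 8,  # Exif-style start
        "notes.txt": b"plain text",
    }
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, data in samples.items():
            path = os.path.join(tmp, name)
            with open(path, "wb") as f:
                f.write(data)
            results[name] = looks_like_jpeg(path)
    return results


if __name__ == "__main__":
    print(demo())
```

If you only want JFIF files, the original 4-byte comparison is fine; the 3-byte version is just a broader filter and still only inspects the header, so it does not validate the rest of the file.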
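Please review how to ask a good question. You should provide a minimal, complete, and verifiable example so that others can reproduce your problem, and ask about one specific issue rather than posting your whole source code and asking how to write it.

There are several problems with this code. Several blocks need to be indented, one if statement has no colon, get_files is defined but never called, checkJPG returns None while get_hashes expects it to return the files, and the two functions call each other in an infinitely recursive loop. Also, checkJPG=checkJPG.read(4) is strange, and you never actually hash any file data.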
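To illustrate the last point in that comment: a hashlib.md5() object that never receives an update() call always returns the digest of empty input, so every file in the original script would have been reported with the same hash. The sketch below shows that, and also that feeding data in 4096-byte chunks (as md5hf does) gives the same digest as hashing it in one call. The names EMPTY_MD5 and md5_of_chunks are hypothetical helpers for this illustration, and the payload is made-up bytes rather than a real image.

```python
import hashlib

# Digest produced when update() is never called (the MD5 of empty input).
EMPTY_MD5 = hashlib.md5().hexdigest()


def md5_of_chunks(data, chunk_size=4096):
    """Hash `data` piece by piece, the way a file would be read in chunks."""
    h = hashlib.md5()
    for start in range(0, len(data), chunk_size):
        h.update(data[start:start + chunk_size])
    return h.hexdigest()


if __name__ == "__main__":
    payload = b"\xff\xd8\xff\xe0" + b"example" * 2000  # made-up bytes
    print(EMPTY_MD5)
    # Chunked and one-shot hashing agree on the same input.
    print(md5_of_chunks(payload) == hashlib.md5(payload).hexdigest())
```

Chunked reading mainly matters for large files, since it avoids loading the whole file into memory at once.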