Calculating a directory's size using Python?
Before I re-invent this particular wheel, has anybody got a nice routine for calculating the size of a directory using Python? It would be very nice if the routine formatted the size nicely in MB/GB etc.
This walks all sub-directories, summing file sizes:
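import os

def get_size(start_path='.'):
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(start_path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            # skip if it is a symbolic link
            if not os.path.islink(fp):
                total_size += os.path.getsize(fp)
    return total_size

print(get_size(), 'bytes')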
And a one-liner for fun (it does not include sub-directories); the result is in bytes:
import os
nbytes = sum(d.stat().st_size for d in os.scandir('.') if d.is_file())
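2018 update
If you use Python 3.4 or earlier, you may consider using the more efficient walk method provided by the third-party scandir package. In Python 3.5 and later, this package has been incorporated into the standard library, and os.walk has received the corresponding increase in performance.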
2019 update
Recently I've been using pathlib more and more. Here is a pathlib solution:
from pathlib import Path

root_directory = Path('.')
sum(f.stat().st_size for f in root_directory.glob('**/*') if f.is_file())
monknut's answer is good, but it fails on a broken symlink, so you also have to check whether the path really exists:
if os.path.exists(fp):
    total_size += os.stat(fp).st_size
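Folded into the walk-based function, that check might look like the sketch below. The function name get_size_existing is made up for this illustration; only the os.path.exists test comes from the answer above.

```python
import os

def get_size_existing(start_path='.'):
    """Sum file sizes under start_path, skipping dangling symlinks."""
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(start_path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            # os.path.exists() follows symlinks, so it is False for a link
            # whose target is gone; os.stat() would raise on such a path.
            if os.path.exists(fp):
                total_size += os.stat(fp).st_size
    return total_size
```

Note that a symlink pointing at an existing file is still followed here, so its target's size is counted.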
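To get the size of one file, there is os.path.getsize().
It reports the size in bytes. A recursive function (one that recursively sums the sizes of all subfolders and their files) can return exactly the same number of bytes as running "du -sb ." on Linux (where "." means the current folder).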
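A minimal sketch of such a recursive function, assuming that the size of each directory entry itself should be counted as well, as du does. The name get_folder_size is illustrative, and the sketch has not been checked against du for trees containing hard links or symlinks.

```python
import os

def get_folder_size(folder):
    """Recursively sum the apparent size of folder and everything below it."""
    # Start with the size of the directory entry itself.
    total_size = os.path.getsize(folder)
    for item in os.listdir(folder):
        item_path = os.path.join(folder, item)
        if os.path.isfile(item_path):
            total_size += os.path.getsize(item_path)
        elif os.path.isdir(item_path):
            # Caution: isdir() follows symlinks, so a link back to a
            # parent directory would recurse without end.
            total_size += get_folder_size(item_path)
    return total_size
```

Calling get_folder_size('.') returns a plain integer number of bytes.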
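The accepted answer doesn't take hard or soft links into account, and would count those files twice. You'd want to keep track of which inodes you've seen, and not add the size for those files again.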
import os

def get_size(start_path='.'):
    total_size = 0
    seen = {}  # inode numbers that have already been counted
    for dirpath, dirnames, filenames in os.walk(start_path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            try:
                stat = os.stat(fp)
            except OSError:
                continue  # e.g. a broken symlink
            try:
                seen[stat.st_ino]
            except KeyError:
                seen[stat.st_ino] = True
            else:
                continue  # inode already counted
            total_size += stat.st_size
    return total_size

print(get_size())
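You could do something like this:
import subprocess
size = subprocess.getoutput('du -sh /path/').split()[0]
In this case I haven't tested the result before returning it; if you want, you can check it with subprocess.getstatusoutput. (On Python 2 both functions lived in the commands module, which Python 3 removed.)
Chris's answer is good, but it could be made more idiomatic by using a set to check for seen inodes, which also avoids using an exception for control flow: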
import os

def directory_size(path):
    total_size = 0
    seen = set()
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            try:
                stat = os.stat(fp)
            except OSError:
                continue
            if stat.st_ino in seen:
                continue
            seen.add(stat.st_ino)
            total_size += stat.st_size
    return total_size  # size in bytes
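Inode numbers are only unique within a single filesystem, so a tree that spans mount points could in principle see two different files with the same st_ino. A hedged variant of the function above keys the set on the (st_dev, st_ino) pair instead. The name directory_size_multi_fs is made up for this sketch.

```python
import os

def directory_size_multi_fs(path):
    """Sum file sizes under path, counting each (device, inode) pair once."""
    total_size = 0
    seen = set()
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            try:
                st = os.stat(fp)
            except OSError:
                continue  # unreadable entry or dangling symlink
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # hard link (or symlink) to a file already counted
            seen.add(key)
            total_size += st.st_size
    return total_size
```

With a file and a hard link to it in the same directory, the size is counted once.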
Recursive one-liner:
import os

def getFolderSize(p):
    from functools import partial
    prepend = partial(os.path.join, p)
    return sum([(os.path.getsize(f) if os.path.isfile(f) else getFolderSize(f)) for f in map(prepend, os.listdir(p))])
This script tells you which file is the biggest under the CWD, and also which folder it is in. It works for me on Windows 8 with the Python 3.3.3 shell.
import os

folder = os.getcwd()
number = 0
string = ""

for root, dirs, files in os.walk(folder):
    for file in files:
        pathname = os.path.join(root, file)
        ## print(pathname)
        ## print(os.path.getsize(pathname) / 1024 / 1024)
        if number < os.path.getsize(pathname):
            number = os.path.getsize(pathname)
            string = pathname

## print()

print(string)
print()
print(number)
print("Number in bytes")
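Some of the approaches suggested so far implement recursion, others employ a shell, and others do not produce neatly formatted results. When your code is a one-off for Linux platforms, you can get the usual formatting, recursion included, as a one-liner. Except for the print in the last line, it works with current versions of python2 and python3: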
du.py
-----
#!/usr/bin/python3
import subprocess

def du(path):
    """disk usage in human readable format (e.g. '2,1GB')"""
    return subprocess.check_output(['du', '-sh', path]).split()[0].decode('utf-8')

if __name__ == "__main__":
    print(du('.'))
It is simple, efficient, and works for files and multi-level directories:
$ chmod 750 du.py
$ ./du.py
2,9M
As for the second part of the question:
def human(size):
    B = "B"
    KB = "KB"
    MB = "MB"
    GB = "GB"
    TB = "TB"
    UNITS = [B, KB, MB, GB, TB]
    HUMANFMT = "%f %s"
    HUMANRADIX = 1024.

    for u in UNITS[:-1]:
        if size < HUMANRADIX:
            return HUMANFMT % (size, u)
        size /= HUMANRADIX

    return HUMANFMT % (size, UNITS[-1])
A one-liner, you say...
Here is a one-liner:
sum([sum(map(lambda fname: os.path.getsize(os.path.join(directory, fname)), files)) for directory, folders, files in os.walk(path)])
Although I would probably split it out, and it performs no checks.
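Split out, with a couple of basic checks added, the same walk might look like the sketch below. The name tree_size and the choice of which errors to tolerate are assumptions made for illustration, not part of the original answer.

```python
import os

def tree_size(path):
    """Total size in bytes of the files found under path."""
    if not os.path.isdir(path):
        raise NotADirectoryError(path)
    total = 0
    for directory, folders, files in os.walk(path):
        for fname in files:
            full_path = os.path.join(directory, fname)
            try:
                total += os.path.getsize(full_path)
            except OSError:
                # The file vanished, is a dangling symlink, or is unreadable.
                pass
    return total
```

Raising on a bad start path keeps a typo in the path from silently reporting a size of 0.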
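To convert the result to KB, work the unit conversion into it.
The following script prints the directory size of all sub-directories of the specified directory. It also tries to benefit, where possible, from caching the calls of a recursive function. If the argument is omitted, the script works in the current directory. The output is sorted by directory size from largest to smallest, so you can adapt it to your needs. PS: I used recipe 578019 to show directory sizes in a human-friendly format.
Edit: moved null_decorator above, as user2233949 recommended.
Python 3.5+ recursive folder size using os.scandir: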
import os

def folder_size(path='.'):
    total = 0
    for entry in os.scandir(path):
        if entry.is_file():
            total += entry.stat().st_size
        elif entry.is_dir():
            total += folder_size(entry.path)
    return total
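The per-sub-directory report described earlier (sizes sorted from largest to smallest) could be sketched on top of os.scandir as follows. This is a guess at the shape of such a script, without the caching it mentions. The names tree_bytes and subdir_sizes are hypothetical, and unlike folder_size above this version does not follow symlinks.

```python
import os

def tree_bytes(path='.'):
    """Recursive size in bytes of the files under path (symlinks not followed)."""
    total = 0
    for entry in os.scandir(path):
        if entry.is_file(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
        elif entry.is_dir(follow_symlinks=False):
            total += tree_bytes(entry.path)
    return total

def subdir_sizes(path='.'):
    """Return (size, name) pairs for the immediate sub-directories, biggest first."""
    sizes = [(tree_bytes(entry.path), entry.name)
             for entry in os.scandir(path)
             if entry.is_dir(follow_symlinks=False)]
    return sorted(sizes, reverse=True)
```

Looping over subdir_sizes('.') and printing each pair gives a report similar to the one described.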
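A little late to the party, but here it is in one line, provided that you have glob2 and humanize installed. Note that in Python 3, the default iglob has a recursive mode. How to modify the code for Python 3 is left as a trivial exercise for the reader.
>>> import os
>>> from humanize import naturalsize
>>> from glob2 import iglob
>>> naturalsize(sum(os.path.getsize(x) for x in iglob('/var/**')))
'546.2 MB'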
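Admittedly, this is kind of hackish and only works on Unix/Linux.
It matches du -sb . because in effect it is a Python wrapper that runs the du -sb . command through the shell.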
import subprocess

def system_command(cmd):
    """Function executes cmd parameter as a bash command."""
    p = subprocess.Popen(cmd,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         shell=True)
    stdout, stderr = p.communicate()
    return stdout, stderr

size = int(system_command('du -sb . ')[0].split()[0])
I am using Python 2.7.13 with the scandir package, and here is my one-liner recursive function to get the total size of a folder:
from scandir import scandir

def getTotFldrSize(path):
    return sum([s.stat(follow_symlinks=False).st_size for s in scandir(path) if s.is_file(follow_symlinks=False)]) + \
        sum([getTotFldrSize(s.path) for s in scandir(path) if s.is_dir(follow_symlinks=False)])

>>> print getTotFldrSize('.')
1203245680
Using the sh library, whose du module does the job:
pip install sh
import sh
print( sh.du("-s", ".") )
91154728 .
If you want to pass an asterisk, use glob.
To convert the values into a human-readable form, use humanize:
pip install humanize
import humanize
print( humanize.naturalsize( 91157384 ) )
91.2 MB
When the size of a sub-directory is computed, it should update its parent's folder size, and this goes on until it reaches the root parent.
import os

def folder_size(path):
    parent = {}  # path to parent path mapper
    folder_size = {}  # storing the size of directories
    folder = os.path.realpath(path)

    for root, _, filenames in os.walk(folder):
        if root == folder:
            parent[root] = -1  # the root folder will not have any parent
            folder_size[root] = 0.0  # initializing the size to 0
        elif root not in parent:
            immediate_parent_path = os.path.dirname(root)  # extract the immediate parent of the subdirectory
            parent[root] = immediate_parent_path  # store the parent of the subdirectory
            folder_size[root] = 0.0  # initialize the size to 0

        total_size = 0
        for filename in filenames:
            filepath = os.path.join(root, filename)
            total_size += os.stat(filepath).st_size  # computing the size of the files under the directory
        folder_size[root] = total_size  # store the updated size

        temp_path = root  # for subdirectories, we need to update the size of the parent till the root parent
        while parent[temp_path] != -1:
            folder_size[parent[temp_path]] += total_size
            temp_path = parent[temp_path]

    return folder_size[folder]/1000000.0
tree -h --du /path/to/dir # files and dirs
tree -h -d --du /path/to/dir # dirs only
import subprocess
import os

#
# get folder size
#
def get_size(self, path):
    if os.path.exists(path) and path != '/':
        cmd = str(subprocess.check_output(['sudo', 'du', '-s', path])).\
            replace('b\'', '').replace('\'', '').split('\\t')[0]
        return float(cmd) / 1000000
    elif os.path.exists(path) and path == '/':
        cmd = str(subprocess.getoutput(['sudo du -s /'])). \
            replace('b\'', '').replace('\'', '').split('\n')
        val = cmd[len(cmd) - 1].replace('/', '').replace(' ', '')
        return float(val) / 1000000
    else:
        raise ValueError
import win32com.client as com

def get_folder_size(path):
    try:
        fso = com.Dispatch("Scripting.FileSystemObject")
        folder = fso.GetFolder(path)
        size = str(round(folder.Size / 1048576))
        print("Size: " + size + " MB")
    except Exception as e:
        print("Error --> " + str(e))
import os
import stat

size = 0
path_ = ""

def calculate(path=os.environ["SYSTEMROOT"]):
    global size, path_
    size = 0
    path_ = path

    for x, y, z in os.walk(path):
        for i in z:
            size += os.path.getsize(x + os.sep + i)

def cevir(x):
    global path_
    print(path_, x, "Byte")
    print(path_, x/1024, "Kilobyte")
    print(path_, x/1048576, "Megabyte")
    print(path_, x/1073741824, "Gigabyte")

# raw string, so that "\U" is not parsed as a Unicode escape
calculate(r"C:\Users\Jundullah\Desktop")
cevir(size)

Output:
C:\Users\Jundullah\Desktop 87874712211 Byte
C:\Users\Jundullah\Desktop 85815148.64355469 Kilobyte
C:\Users\Jundullah\Desktop 83803.85609722137 Megabyte
C:\Users\Jundullah\Desktop 81.83970321994275 Gigabyte
import os
import glob
print(sum(os.path.getsize(f) for f in glob.glob('**', recursive=True) if os.path.isfile(f))/(1024*1024))
import os

def du(path):
    if os.path.islink(path):
        return (os.lstat(path).st_size, 0)
    if os.path.isfile(path):
        st = os.lstat(path)
        return (st.st_size, st.st_blocks * 512)
    apparent_total_bytes = 0
    total_bytes = 0
    have = []
    for dirpath, dirnames, filenames in os.walk(path):
        apparent_total_bytes += os.lstat(dirpath).st_size
        total_bytes += os.lstat(dirpath).st_blocks * 512
        for f in filenames:
            fp = os.path.join(dirpath, f)
            if os.path.islink(fp):
                apparent_total_bytes += os.lstat(fp).st_size
                continue
            st = os.lstat(fp)
            if st.st_ino in have:
                continue  # skip hardlinks which were already counted
            have.append(st.st_ino)
            apparent_total_bytes += st.st_size
            total_bytes += st.st_blocks * 512
        for d in dirnames:
            dp = os.path.join(dirpath, d)
            if os.path.islink(dp):
                apparent_total_bytes += os.lstat(dp).st_size
    return (apparent_total_bytes, total_bytes)

>>> du('/lib')
(236425839, 244363264)

$ du -sb /lib
236425839 /lib
$ du -sB1 /lib
244363264 /lib
def humanized_size(num, suffix='B', si=False):
    if si:
        units = ['','K','M','G','T','P','E','Z']
        last_unit = 'Y'
        div = 1000.0
    else:
        units = ['','Ki','Mi','Gi','Ti','Pi','Ei','Zi']
        last_unit = 'Yi'
        div = 1024.0
    for unit in units:
        if abs(num) < div:
            return "%3.1f%s%s" % (num, unit, suffix)
        num /= div
    return "%.1f%s%s" % (num, last_unit, suffix)

>>> humanized_size(236425839)
'225.5MiB'
>>> humanized_size(236425839, si=True)
'236.4MB'
>>> humanized_size(236425839, si=True, suffix='')
'236.4M'
from pathlib import Path
sum([f.stat().st_size for f in Path("path").glob("**/*")])
from pathlib import Path

def get_size(path: str) -> int:
    return sum(p.stat().st_size for p in Path(path).rglob('*'))

In [6]: get_size('/etc/not-exist-path')
Out[6]: 0
In [7]: get_size('.')
Out[7]: 12038689
In [8]: def filesize(size: int) -> str:
   ...:     for unit in ("B", "K", "M", "G"):
   ...:         if size < 1024:
   ...:             break
   ...:         size /= 1024
   ...:     return f"{size:.1f}{unit}"
   ...:
In [9]: filesize(get_size('.'))
Out[9]: '11.5M'
import os

def get_size(path = os.getcwd()):
    print("Calculating Size: ",path)
    total_size = 0
    #if path is directory--
    if os.path.isdir(path):
        print("Path type : Directory/Folder")
        for dirpath, dirnames, filenames in os.walk(path):
            for f in filenames:
                fp = os.path.join(dirpath, f)
                # skip if it is symbolic link
                if not os.path.islink(fp):
                    total_size += os.path.getsize(fp)
    #if path is a file---
    elif os.path.isfile(path):
        print("Path type : File")
        total_size=os.path.getsize(path)
    else:
        print("Path Type : Special File (Socket, FIFO, Device File)" )
        total_size=0
    bytesize=total_size
    print(bytesize, 'bytes')
    print(bytesize/(1024), 'kilobytes')
    print(bytesize/(1024*1024), 'megabytes')
    print(bytesize/(1024*1024*1024), 'gigabytes')
    return total_size

x=get_size("/content/examples")