Machine learning: weights extracted from an NN model are larger than the model itself


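I am trying to extract the weights from a .pb TensorFlow model and store them in a text file. The text file ends up larger than the model itself. Why does this happen? Thanks in advance.

Code used to extract the weights: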

import operator
from functools import reduce

import tensorflow as tf
from tensorflow.python.platform import gfile
from tensorflow.python.framework import tensor_util

PB_PATH = 'quantized_graph_resnet.pb'

# Load the frozen graph and keep every Const node (these hold the weights).
with tf.Session() as sess:
    with gfile.FastGFile(PB_PATH, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')
        wts = [n for n in graph_def.node if n.op == 'Const']


def check(l):
    """Return True if l still contains a nested list."""
    return any(isinstance(item, list) for item in l)


weightsFreq = {}  # weight value -> number of occurrences
with open('weights.txt', 'w') as f:
    for n in wts:
        print("Name of the node - %s" % n.name)
        l = tensor_util.MakeNdarray(n.attr['value'].tensor).tolist()
        if isinstance(l, float):
            continue  # scalar float constants are skipped
        if isinstance(l, int):
            l = [l]  # treat a scalar integer as a one-element list
        # Flatten nested lists one level at a time.
        while check(l):
            l = reduce(operator.concat, l)
        for item in l:
            # Each weight is written as decimal digits followed by a space.
            f.write('%d ' % item)
            weightsFreq[item] = weightsFreq.get(item, 0) + 1

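A text file is a very inefficient way to store a large number of decimal values. It uses one byte for every digit of every number, plus a separator, whereas a binary file uses a fixed-size representation (4 bytes per number for a single-precision float).

That is why the text file is much larger than the binary file.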
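
To make the size difference concrete, here is a minimal sketch that compares the two encodings. It assumes NumPy is available and uses randomly generated float32 values as a stand-in for real weights; the helper name `text_and_binary_sizes` and the `%.8g` formatting are illustrative choices, not part of the question's code.

```python
import numpy as np


def text_and_binary_sizes(values):
    """Return (size as space-separated decimal text, size as raw float32), in bytes."""
    arr = np.asarray(values, dtype=np.float32)
    # Text form: up to 8 significant digits per value, separated by single spaces.
    text = ' '.join('%.8g' % v for v in arr)
    # Binary form: exactly 4 bytes per float32 value.
    return len(text.encode('ascii')), arr.nbytes


# Hypothetical stand-in for extracted weights (not taken from a real model).
rng = np.random.default_rng(0)
weights = rng.standard_normal(100_000)

text_size, binary_size = text_and_binary_sizes(weights)
print("text bytes:  ", text_size)
print("binary bytes:", binary_size)  # 4 bytes * 100000 values = 400000
```

The binary size depends only on the number of values and their dtype, while the text size grows with the number of digits printed for each value.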

Even when I use a binary format, the size is still not smaller than the model. I have tried that as well.