RuntimeError when trying to apply dynamic quantization to a transformer model
I am trying to apply dynamic quantization (quantizing both weights and activations) to a PyTorch pretrained model from the HuggingFace library. I have referred to this and found dynamic quantization to be the most suitable approach; I will be running the quantized model on a CPU.

Link to the HuggingFace model

Torch version: 1.6.0 (installed via pip)

Pretrained model:
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext")
model = AutoModel.from_pretrained("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext")
Dynamic quantization
quantized_model = torch.quantization.quantize_dynamic(
model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
)
print(quantized_model)
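Before quantizing, it may help to verify which quantized backends the installed torch build actually supports; here is a minimal diagnostic sketch, assuming the standard torch 1.6 quantization API:

import torch

# Engines compiled into this build; an empty list (or only 'none')
# means no quantized backend is available on this platform.
print(torch.backends.quantized.supported_engines)

# The engine currently selected for quantized ops.
print(torch.backends.quantized.engine)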
Error
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-7-df2355c17e0b> in <module>
1 quantized_model = torch.quantization.quantize_dynamic(
----> 2 model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
3 )
4
5 print(quantized_model)
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in quantize_dynamic(model, qconfig_spec, dtype, mapping, inplace)
283 model.eval()
284 propagate_qconfig_(model, qconfig_spec)
--> 285 convert(model, mapping, inplace=True)
286 _remove_qconfig(model)
287 return model
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
364 if type(mod) not in SWAPPABLE_MODULES:
365 convert(mod, mapping, inplace=True)
--> 366 reassign[name] = swap_module(mod, mapping)
367
368 for key, value in reassign.items():
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in swap_module(mod, mapping)
393 )
394 device = next(iter(devices)) if len(devices) > 0 else None
--> 395 new_mod = mapping[type(mod)].from_float(mod)
396 if device:
397 new_mod.to(device)
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/dynamic/modules/linear.py in from_float(cls, mod)
101 else:
102 raise RuntimeError('Unsupported dtype specified for dynamic quantized Linear!')
--> 103 qlinear = Linear(mod.in_features, mod.out_features, dtype=dtype)
104 qlinear.set_weight_bias(qweight, mod.bias)
105 return qlinear
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/dynamic/modules/linear.py in __init__(self, in_features, out_features, bias_, dtype)
33
34 def __init__(self, in_features, out_features, bias_=True, dtype=torch.qint8):
---> 35 super(Linear, self).__init__(in_features, out_features, bias_, dtype=dtype)
36 # We don't muck around with buffers or attributes or anything here
37 # to keep the module simple. *everything* is simply a Python attribute.
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/modules/linear.py in __init__(self, in_features, out_features, bias_, dtype)
150 raise RuntimeError('Unsupported dtype specified for quantized Linear!')
151
--> 152 self._packed_params = LinearPackedParams(dtype)
153 self._packed_params.set_weight_bias(qweight, bias)
154 self.scale = 1.0
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/modules/linear.py in __init__(self, dtype)
18 elif self.dtype == torch.float16:
19 wq = torch.zeros([1, 1], dtype=torch.float)
---> 20 self.set_weight_bias(wq, None)
21
22 @torch.jit.export
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/modules/linear.py in set_weight_bias(self, weight, bias)
24 # type: (torch.Tensor, Optional[torch.Tensor]) -> None
25 if self.dtype == torch.qint8:
---> 26 self._packed_params = torch.ops.quantized.linear_prepack(weight, bias)
27 elif self.dtype == torch.float16:
28 self._packed_params = torch.ops.quantized.linear_prepack_fp16(weight, bias)
RuntimeError: Didn't find engine for operation quantized::linear_prepack NoQEngine
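The final RuntimeError (NoQEngine) indicates that no quantized engine is selected in this torch build, so quantized::linear_prepack has no backend to dispatch to. A possible workaround, assuming the build was compiled with fbgemm (x86) or qnnpack (ARM) support, is to select the engine explicitly before calling quantize_dynamic; a minimal sketch:

import torch

# Select a quantized backend explicitly. This only succeeds if the
# engine appears in torch.backends.quantized.supported_engines.
torch.backends.quantized.engine = 'fbgemm'  # use 'qnnpack' on ARM CPUs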