RuntimeError when trying to apply dynamic quantization to a transformer model
I am trying to apply dynamic quantization (quantizing both weights and activations) to a PyTorch pretrained model from the HuggingFace library. I have referred to this and found dynamic quantization to be the most suitable approach; I will be running the quantized model on a CPU.

Link to the HuggingFace model

Torch version: 1.6.0 (installed via pip)

Pretrained model:
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext")
model = AutoModel.from_pretrained("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext")
Dynamic quantization
quantized_model = torch.quantization.quantize_dynamic(
model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
)
print(quantized_model)
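Before quantizing, it may help to verify which quantized backends the installed torch build actually supports; here is a minimal diagnostic sketch, assuming the standard torch 1.6 quantization API:

import torch

# Engines compiled into this build; an empty list (or only 'none')
# means no quantized backend is available on this platform.
print(torch.backends.quantized.supported_engines)

# The engine currently selected for quantized ops.
print(torch.backends.quantized.engine)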
Error
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-7-df2355c17e0b> in <module>
1 quantized_model = torch.quantization.quantize_dynamic(
----> 2 model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
3 )
4
5 print(quantized_model)
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in quantize_dynamic(model, qconfig_spec, dtype, mapping, inplace)
283 model.eval()
284 propagate_qconfig_(model, qconfig_spec)
--> 285 convert(model, mapping, inplace=True)
286 _remove_qconfig(model)
287 return model
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
363 for name, mod in module.named_children():
364 if type(mod) not in SWAPPABLE_MODULES:
--> 365 convert(mod, mapping, inplace=True)
366 reassign[name] = swap_module(mod, mapping)
367
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in convert(module, mapping, inplace)
364 if type(mod) not in SWAPPABLE_MODULES:
365 convert(mod, mapping, inplace=True)
--> 366 reassign[name] = swap_module(mod, mapping)
367
368 for key, value in reassign.items():
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/quantization/quantize.py in swap_module(mod, mapping)
393 )
394 device = next(iter(devices)) if len(devices) > 0 else None
--> 395 new_mod = mapping[type(mod)].from_float(mod)
396 if device:
397 new_mod.to(device)
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/dynamic/modules/linear.py in from_float(cls, mod)
101 else:
102 raise RuntimeError('Unsupported dtype specified for dynamic quantized Linear!')
--> 103 qlinear = Linear(mod.in_features, mod.out_features, dtype=dtype)
104 qlinear.set_weight_bias(qweight, mod.bias)
105 return qlinear
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/dynamic/modules/linear.py in __init__(self, in_features, out_features, bias_, dtype)
33
34 def __init__(self, in_features, out_features, bias_=True, dtype=torch.qint8):
---> 35 super(Linear, self).__init__(in_features, out_features, bias_, dtype=dtype)
36 # We don't muck around with buffers or attributes or anything here
37 # to keep the module simple. *everything* is simply a Python attribute.
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/modules/linear.py in __init__(self, in_features, out_features, bias_, dtype)
150 raise RuntimeError('Unsupported dtype specified for quantized Linear!')
151
--> 152 self._packed_params = LinearPackedParams(dtype)
153 self._packed_params.set_weight_bias(qweight, bias)
154 self.scale = 1.0
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/modules/linear.py in __init__(self, dtype)
18 elif self.dtype == torch.float16:
19 wq = torch.zeros([1, 1], dtype=torch.float)
---> 20 self.set_weight_bias(wq, None)
21
22 @torch.jit.export
~/.virtualenvs/python3/lib64/python3.6/site-packages/torch/nn/quantized/modules/linear.py in set_weight_bias(self, weight, bias)
24 # type: (torch.Tensor, Optional[torch.Tensor]) -> None
25 if self.dtype == torch.qint8:
---> 26 self._packed_params = torch.ops.quantized.linear_prepack(weight, bias)
27 elif self.dtype == torch.float16:
28 self._packed_params = torch.ops.quantized.linear_prepack_fp16(weight, bias)
RuntimeError: Didn't find engine for operation quantized::linear_prepack NoQEngine
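The final RuntimeError (NoQEngine) indicates that no quantized engine is selected in this torch build, so quantized::linear_prepack has no backend to dispatch to. A possible workaround, assuming the build was compiled with fbgemm (x86) or qnnpack (ARM) support, is to select the engine explicitly before calling quantize_dynamic; a minimal sketch:

import torch

# Select a quantized backend explicitly. This only succeeds if the
# engine appears in torch.backends.quantized.supported_engines.
torch.backends.quantized.engine = 'fbgemm'  # use 'qnnpack' on ARM CPUs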