Python: How to fine-tune BERT on SQuAD 2.0
I'm really new to BERT, and I want to fine-tune the BERT base model on Google Colab. Basically, I set up the runtime with a GPU, downloaded the data, and tried to call python run_squad.py.
!git clone https://github.com/google-research/bert.git
!wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
# unzip the file
!unzip uncased_L-12_H-768_A-12.zip
import tensorflow as tf
# Get the GPU device name.
device_name = tf.test.gpu_device_name()
# The device name should look like the following:
if device_name == '/device:GPU:0':
    print('Found GPU at: {}'.format(device_name))
else:
    raise SystemError('GPU device not found')
import torch
# If there's a GPU available...
if torch.cuda.is_available():
    # Tell PyTorch to use the GPU.
    device = torch.device("cuda")
    print('There are %d GPU(s) available.' % torch.cuda.device_count())
    print('We will use the GPU:', torch.cuda.get_device_name(0))
# If not...
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")
!pip install transformers
!pip install wget
import wget
import os
print('Downloading dataset...')
# The URL for the SQuAD 2.0 training set (a JSON file).
url = 'https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json'
# Download the file (if we haven't already)
if not os.path.exists('./train-v2.0.json'):
    wget.download(url, './train-v2.0.json')
# The URL for the SQuAD 2.0 dev set (a JSON file).
url = 'https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json'
# Download the file (if we haven't already)
if not os.path.exists('./dev-v2.0.json'):
    wget.download(url, './dev-v2.0.json')
print('Done')
# Unzip the BERT repository archive (if we haven't already)
if not os.path.exists('./bert-master/'):
    !unzip bert-master.zip
!pip install tensorflow-gpu==1.15.0
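The code above basically makes sure that I have the GPU set up, have the required dependencies installed, and have downloaded the SQuAD 2.0 data. The next step is to call run_squad.py, and this is where I get lost. Here is my file location
I ran this cell with python3 and got an error. I think I configured the paths correctly, so why is bert_config.json still reported as missing?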
WARNING:tensorflow:From /content/bert-master/optimization.py:87: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
WARNING:tensorflow:From bert-master/run_squad.py:1283: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
WARNING:tensorflow:From bert-master/run_squad.py:1127: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
W0519 08:02:09.023542 140027033761664 module_wrapper.py:139] From bert-master/run_squad.py:1127: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
WARNING:tensorflow:From bert-master/run_squad.py:1127: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
W0519 08:02:09.023784 140027033761664 module_wrapper.py:139] From bert-master/run_squad.py:1127: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
WARNING:tensorflow:From /content/bert-master/modeling.py:93: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
W0519 08:02:09.024012 140027033761664 module_wrapper.py:139] From /content/bert-master/modeling.py:93: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
Traceback (most recent call last):
File "bert-master/run_squad.py", line 1283, in <module>
tf.app.run()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "bert-master/run_squad.py", line 1129, in main
bert_config = modeling.BertConfig.from_json_file(FLAGS.bert_config_file)
File "/content/bert-master/modeling.py", line 94, in from_json_file
text = reader.read()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/lib/io/file_io.py", line 122, in read
self._preread_check()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/lib/io/file_io.py", line 84, in _preread_check
compat.as_bytes(self.__name), 1024 * 512)
tensorflow.python.framework.errors_impl.NotFoundError: /bert_config.json; No such file or directory
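The path in the last line of the traceback is /bert_config.json, with nothing in front of the slash. That suggests a shell variable such as $BERT_BASE_DIR expanded to an empty string when the command ran. The sketch below only imitates that behaviour in plain Python; shell_expand is a hypothetical helper written for illustration, not part of BERT or TensorFlow.

```python
import re

def shell_expand(text, env):
    """Imitate POSIX shell expansion of $NAME: unset variables become ''."""
    return re.sub(r"\$(\w+)", lambda m: env.get(m.group(1), ""), text)

flag_value = "$BERT_BASE_DIR/bert_config.json"

# Variable not set: the directory part vanishes and only the file name is left.
print(shell_expand(flag_value, {}))
# -> /bert_config.json

# Variable set (directory name assumed from the unzipped checkpoint archive).
print(shell_expand(flag_value, {"BERT_BASE_DIR": "./uncased_L-12_H-768_A-12"}))
# -> ./uncased_L-12_H-768_A-12/bert_config.json
```

If the first output matches the path in the error, the paths themselves may be fine and the variable is simply not visible to the process that runs run_squad.py.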
After export BERT_BASE_DIR=.…, can you post the output of ls $BERT_BASE_DIR/vocab.txt?
I think it has to be export BERT_BASE_DIR=./uncased_L-12_H-768_A-12
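One way to act on the two comments above, as a minimal sketch: in a notebook, each line starting with ! typically runs in its own subshell, so a variable exported there is usually gone by the next ! line. Setting it through os.environ in Python makes it visible to later subprocesses. The directory name is assumed from the archive unzipped in the question, and missing_files is a hypothetical helper.

```python
import os

# Assumption: the checkpoint archive was unzipped into the working directory.
BERT_BASE_DIR = "./uncased_L-12_H-768_A-12"

# Files that run_squad.py reads from the checkpoint directory.
REQUIRED = ("bert_config.json", "vocab.txt", "bert_model.ckpt.index")

def missing_files(base_dir, required=REQUIRED):
    """Return the required file names that are not present in base_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(base_dir, name))]

# Make the variable visible to processes started later from this notebook.
os.environ["BERT_BASE_DIR"] = BERT_BASE_DIR

missing = missing_files(BERT_BASE_DIR)
if missing:
    print("Missing in {}: {}".format(BERT_BASE_DIR, ", ".join(missing)))
else:
    print("All checkpoint files found in", BERT_BASE_DIR)
```

If the check reports missing files, the unzip step probably wrote to a different directory than the one the variable points to.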
Take a look at this: it has Google Colab notebooks for binary, multi-class, and multi-label text classification that run on a GPU.
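Following the BERT_BASE_DIR suggestion from the comments, the call to run_squad.py can be assembled in Python so that no shell variable is involved at all. This is only a sketch: the script path is taken from the traceback, the flags are the ones run_squad.py defines, the hyperparameter values are illustrative rather than tuned, and build_run_squad_command and the ./squad_output directory are hypothetical names.

```python
import shlex

def build_run_squad_command(bert_base_dir, data_dir=".", output_dir="./squad_output"):
    """Build the argument list for run_squad.py with fully spelled-out paths."""
    return [
        "python", "bert-master/run_squad.py",
        "--vocab_file={}/vocab.txt".format(bert_base_dir),
        "--bert_config_file={}/bert_config.json".format(bert_base_dir),
        "--init_checkpoint={}/bert_model.ckpt".format(bert_base_dir),
        "--do_train=True",
        "--train_file={}/train-v2.0.json".format(data_dir),
        "--do_predict=True",
        "--predict_file={}/dev-v2.0.json".format(data_dir),
        # Illustrative values; adjust to the memory of the available GPU.
        "--train_batch_size=12",
        "--learning_rate=3e-5",
        "--num_train_epochs=2.0",
        "--max_seq_length=384",
        "--doc_stride=128",
        "--output_dir={}".format(output_dir),
        # Needed for SQuAD 2.0, which contains unanswerable questions.
        "--version_2_with_negative=True",
    ]

command = build_run_squad_command("./uncased_L-12_H-768_A-12")
# Print a copy-pasteable command line for inspection before running it.
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list can be passed to subprocess.run(command, check=True), or the printed line can be pasted after a ! in a notebook cell.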