Python: Loading (restoring) a TensorFlow checkpoint fails when running run_squad.py to fine-tune the Google BERT model (official TensorFlow pre-trained model)

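I am new to deep learning and NLP, and I am trying to get started with the pre-trained Google BERT model. Since I intend to build a QA system with BERT, I decided to start with the SQuAD fine-tuning task.

I followed the instructions in the project's README.md.

I ran the following commands: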

export BERT_BASE_DIR=/home/bert/Dev/venv/uncased_L-12_H-768_A-12/
export SQUAD_DIR=/home/bert/Dev/venv/squad
python run_squad.py \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --do_train=True \
  --train_file=$SQUAD_DIR/train-v1.1.json \
  --do_predict=True \
  --predict_file=$SQUAD_DIR/dev-v1.1.json \
  --train_batch_size=12 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=384 \
  --doc_stride=128 \
  --output_dir=/tmp/squad_base/
After a few minutes (when training started), I got:

a lot of output omitted
INFO:tensorflow:start_position: 53
INFO:tensorflow:end_position: 54
INFO:tensorflow:answer: february 1848
INFO:tensorflow:***** Running training *****
INFO:tensorflow:  Num orig examples = 87599
INFO:tensorflow:  Num split examples = 88641
INFO:tensorflow:  Batch size = 12
INFO:tensorflow:  Num steps = 14599
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Running train on CPU
INFO:tensorflow:*** Features ***
INFO:tensorflow:  name = end_positions, shape = (12,)
INFO:tensorflow:  name = input_ids, shape = (12, 384)
INFO:tensorflow:  name = input_mask, shape = (12, 384)
INFO:tensorflow:  name = segment_ids, shape = (12, 384)
INFO:tensorflow:  name = start_positions, shape = (12,)
INFO:tensorflow:  name = unique_ids, shape = (12,)
INFO:tensorflow:Error recorded from training_loop: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /home/bert/Dev/venv/uncased_L-12_H-768_A-12//bert_model.ckpt
INFO:tensorflow:training_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "run_squad.py", line 1283, in <module>
    tf.app.run()
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "run_squad.py", line 1215, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2400, in train
    rendezvous.raise_errors()
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
    saving_listeners=saving_listeners
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 356, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 1211, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2186, in _call_model_fn
    features, labels, mode, config)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 1169, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2470, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1250, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1524, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "run_squad.py", line 623, in model_fn
    ) = modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
  File "/home/bert/Dev/venv/bert/modeling.py", line 330, in get_assignment_map_from_checkpoint
    init_vars = tf.train.list_variables(init_checkpoint)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/training/checkpoint_utils.py", line 95, in list_variables
    reader = load_checkpoint(ckpt_dir_or_file)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/training/checkpoint_utils.py", line 64, in load_checkpoint
    return pywrap_tensorflow.NewCheckpointReader(filename)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 314, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "/home/bert/Dev/venv/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 526, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /home/bert/Dev/venv/uncased_L-12_H-768_A-12//bert_model.ckpt

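I am running Ubuntu 16.04 LTS with an NVIDIA GTX 1080 Ti (CUDA 9.0), using the Anaconda Python 3.5 distribution, with tensorflow-gpu 1.11.0 in a virtual environment.

I expected the code to run smoothly and start training (fine-tuning), since it is the official code and I placed the files as instructed.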

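I am answering my own question.

I just solved the problem by removing the trailing slash (/) from $BERT_BASE_DIR, so the variable changed from
'/home/bert/Dev/venv/uncased_L-12_H-768_A-12/'
to
'/home/bert/Dev/venv/uncased_L-12_H-768_A-12'

As a result, the prefix "/home/bert/Dev/venv/uncased_L-12_H-768_A-12//bert_model.ckpt" no longer contains a double slash.

It seems that the checkpoint restore function in TensorFlow treats a single slash and a double slash as different, whereas I believe bash interprets them as the same.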
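One way to avoid the double slash regardless of how the base directory is written is to normalize the path before passing it on. The following is a minimal sketch, assuming the paths are assembled in Python rather than in the shell; `checkpoint_prefix` is a hypothetical helper name and is not part of the BERT code:

```python
import posixpath


def checkpoint_prefix(base_dir, name="bert_model.ckpt"):
    """Join base_dir and name, collapsing redundant slashes (hypothetical helper)."""
    # posixpath.join inserts a separator only when base_dir lacks a trailing one;
    # normpath then collapses any repeated slashes that remain.
    return posixpath.normpath(posixpath.join(base_dir, name))


with_slash = checkpoint_prefix("/home/bert/Dev/venv/uncased_L-12_H-768_A-12/")
without_slash = checkpoint_prefix("/home/bert/Dev/venv/uncased_L-12_H-768_A-12")
print(with_slash)
print(with_slash == without_slash)
```

When staying in the shell, the parameter expansion `${BERT_BASE_DIR%/}` strips a single trailing slash and should have the same effect.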

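Comments:

I read through the README and don't see an immediate problem with your call to the training script. Could this be an environment problem? Just a suggestion.

@NathanMcCoy Hey, I just solved the problem by removing the trailing slash ("/") from $BERT_BASE_DIR, so the variable changed from "/home/bert/Dev/venv/uncased_L-12_H-768_A-12/" to "/home/bert/Dev/venv/uncased_L-12_H-768_A-12". The prefix "/home/bert/Dev/venv/uncased_L-12_H-768_A-12//bert_model.ckpt" therefore no longer contains a double slash. It worked! I just don't understand why this slash makes a difference, since a shell interprets single and double slashes in a path the same way.

Oh, good to know!
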
(venv) bert@bert-System-Product-Name:~/Dev/venv/uncased_L-12_H-768_A-12$ pwd
/home/bert/Dev/venv/uncased_L-12_H-768_A-12
(venv) bert@bert-System-Product-Name:~/Dev/venv/uncased_L-12_H-768_A-12$ ls
bert_config.json  bert_model.ckpt.data-00000-of-00001  bert_model.ckpt.index  bert_model.ckpt.meta  vocab.txt
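
The listing shows that `bert_model.ckpt` is a prefix shared by several files rather than a file itself. A pre-flight check along the following lines can confirm that a prefix matches files before starting a long run. This is only a sketch using the standard library; `find_checkpoint_files` is a hypothetical helper, and because it goes through the operating system's path handling it checks only that the files are present and would not by itself reproduce the double-slash failure described above:

```python
import glob
import os
import tempfile


def find_checkpoint_files(prefix):
    """Return the sorted files named prefix + '.<suffix>' (hypothetical helper)."""
    # glob.escape keeps special characters in the prefix from being read as wildcards.
    return sorted(glob.glob(glob.escape(prefix) + ".*"))


# Demo on a throwaway directory that mimics the listing above.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("bert_config.json", "bert_model.ckpt.data-00000-of-00001",
                 "bert_model.ckpt.index", "bert_model.ckpt.meta", "vocab.txt"):
        open(os.path.join(tmp, name), "w").close()
    prefix = os.path.join(tmp, "bert_model.ckpt")
    found = [os.path.basename(p) for p in find_checkpoint_files(prefix)]
    print(found)
```

An empty result means nothing matches the prefix, which is worth checking before launching the training script.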