Error setting up a roBERTa model in a Colab notebook

deep-learning nlp


I get an error when loading the vocab and merges txt files into the tokenizer for a TensorFlow roBERTa model. The error output is shown below.

Error:

Exception                                 Traceback (most recent call last)
<ipython-input-9-5dab9f2389e4> in <module>()
      1 MAX_LEN = 96
----> 2 tokenizer = tokenizers.ByteLevelBPETokenizer(vocab_file='vocab_roberta_base.json',merges_file='merges_roberta_base.txt')
      3 sentiment_id = {'positive': 1313, 'negative': 2430, 'neutral': 7974}

/usr/local/lib/python3.6/dist-packages/tokenizers/implementations/byte_level_bpe.py in __init__(self, vocab_file, merges_file, add_prefix_space, lowercase, dropout, unicode_normalizer, continuing_subword_prefix, end_of_word_suffix)
     31                     dropout=dropout,
     32                     continuing_subword_prefix=continuing_subword_prefix or "",
---> 33                     end_of_word_suffix=end_of_word_suffix or "",
     34                 )
     35             )

Exception: expected ident at line 1 column 2

from transformers import RobertaTokenizer
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
tokenizer.save_vocabulary('.')
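Whichever files end up being passed to the tokenizer, it may help to check them before constructing it. The message "expected ident at line 1 column 2" reads like a JSON parse failure, which would suggest the vocab file on disk is not valid JSON (for example, an incomplete or wrong download); that interpretation is an assumption, not something the traceback confirms. The sketch below uses only the standard library, and `check_tokenizer_files` is a hypothetical helper name, not part of `tokenizers` or `transformers`.

```python
import json
from pathlib import Path


def check_tokenizer_files(vocab_path, merges_path):
    """Return a list of problems found in the vocab/merges files (empty if none).

    Hypothetical helper for illustration; this is only a rough sanity check.
    """
    problems = []

    # The vocab file is expected to be a JSON object mapping tokens to ids.
    try:
        vocab = json.loads(Path(vocab_path).read_text(encoding="utf-8"))
        if not isinstance(vocab, dict):
            problems.append(f"{vocab_path}: top-level JSON value is not an object")
    except (OSError, UnicodeDecodeError, json.JSONDecodeError) as exc:
        problems.append(f"{vocab_path}: {exc}")

    # The merges file is expected to be plain text with one space-separated
    # pair per line; lines starting with '#' (such as a version header) are skipped.
    try:
        lines = Path(merges_path).read_text(encoding="utf-8").splitlines()
        pairs = [ln for ln in lines if ln and not ln.startswith("#")]
        if not pairs or any(len(ln.split(" ")) != 2 for ln in pairs):
            problems.append(f"{merges_path}: does not look like a BPE merges file")
    except (OSError, UnicodeDecodeError) as exc:
        problems.append(f"{merges_path}: {exc}")

    return problems
```

Calling `check_tokenizer_files('vocab_roberta_base.json', 'merges_roberta_base.txt')` in the notebook and printing the result would show whether either file is the likely culprit; an empty list means both files passed these basic checks.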
