Deep learning: Error when setting up a roBERTa model in a Colab notebook

Tags: deep-learning, nlp, tensorflow2.0

I get an error when the tokenizer designed for TensorFlow roBERTa loads the vocab and merges txt files. The full error output is shown below.

Code:

Error:

Exception                                 Traceback (most recent call last)
<ipython-input-9-5dab9f2389e4> in <module>()
      1 MAX_LEN = 96
----> 2 tokenizer = tokenizers.ByteLevelBPETokenizer(vocab_file='vocab_roberta_base.json',merges_file='merges_roberta_base.txt')
      3 sentiment_id = {'positive': 1313, 'negative': 2430, 'neutral': 7974}

/usr/local/lib/python3.6/dist-packages/tokenizers/implementations/byte_level_bpe.py in __init__(self, vocab_file, merges_file, add_prefix_space, lowercase, dropout, unicode_normalizer, continuing_subword_prefix, end_of_word_suffix)
     31                     dropout=dropout,
     32                     continuing_subword_prefix=continuing_subword_prefix or "",
---> 33                     end_of_word_suffix=end_of_word_suffix or "",
     34                 )
     35             )

Exception: expected ident at line 1 column 2

from transformers import *; tokenizer = RobertaTokenizer.from_pretrained('roberta-base'); tokenizer.save_vocabulary('.')
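The exception text ("expected ident at line 1 column 2") reads like a JSON parser error. That suggests, as an assumption the traceback does not confirm, that vocab_roberta_base.json is not a valid JSON object, for example because the upload or download produced something other than the real vocabulary file. The sketch below uses only the Python standard library to check this before the tokenizer is constructed. The helper name check_vocab_json is hypothetical and not part of tokenizers or transformers.

```python
import json
from pathlib import Path


def check_vocab_json(path):
    """Return (ok, message) saying whether `path` holds a JSON object (token -> id)."""
    file = Path(path)
    if not file.is_file():
        return False, f"{path}: file not found"
    try:
        with file.open(encoding="utf-8") as handle:
            vocab = json.load(handle)
    except (json.JSONDecodeError, UnicodeDecodeError) as err:
        # Show the first bytes so an HTML page or an error message is easy to spot.
        head = file.read_bytes()[:80]
        return False, f"{path}: not valid JSON ({err}); file starts with {head!r}"
    if not isinstance(vocab, dict):
        return False, f"{path}: expected a JSON object, got {type(vocab).__name__}"
    return True, f"{path}: JSON object with {len(vocab)} entries"


if __name__ == "__main__":
    # File name taken from the traceback above.
    ok, message = check_vocab_json("vocab_roberta_base.json")
    print(message)
```

If the check fails, regenerating the files with save_vocabulary as in the comment above and pointing vocab_file and merges_file at the files it writes is one thing to try.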