Python: reading and writing tokenized and POS-tagged words in new files


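I have a txt file that contains a news article (which I believe is stored as a list). I want to tokenize the words, POS-tag them, and save the results to their respective files.

I am using the nltk library and running the commands below.

For some reason the code runs, but the output files are empty. If I run only this, it works fine: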

with open(news_file) as f1, open(token_file, "w") as f2, open(tagged_file, "w") as f3:
 f2.writelines(('\n'.join(wt(words)) for words in f1.readlines()))

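The new file then lists each word of the news article on its own line.

With the code below, I run into a problem at the line tokenized = ' '.join(word_tokenize(tagged)), which raises TypeError: expected string or bytes-like object. I have also tried str.join, to no avail.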

with open(news_file) as f1, open(token_file, "w") as f2, open(tagged_file, "w") as f3:
    tagged = pos_tag(f1.readlines())
    tokenized = ' '.join(word_tokenize(tagged))
    for token_words in tokenized:
        print(' '.join(token_words), file=f2)
    for tag_words in tagged:
        print(' '.join(tag_words), file=f3)
#f2.writelines(('\n'.join(wt(words)) for words in f1.readlines()))

Any help would be much appreciated.

Thanks :)
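One likely reading of the error: in the second snippet, pos_tag receives the list of raw lines from f1.readlines(), and its result (a list of (word, tag) tuples) is then passed to word_tokenize, which expects a single string. The exception is raised before anything is written, which would also explain the empty files, since opening with "w" has already truncated them. Below is a minimal sketch of the reverse order (tokenize each line first, then tag the tokens). The function name write_tokens_and_tags and the choice to pass the tokenizer and tagger in as arguments are illustrative assumptions, not part of the original post.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type aliases, used only to document the expected callables.
Tokenizer = Callable[[str], List[str]]
Tagger = Callable[[Sequence[str]], List[Tuple[str, str]]]


def write_tokens_and_tags(news_file: str, token_file: str, tagged_file: str,
                          tokenize: Tokenizer, tag: Tagger) -> None:
    """Tokenize each line, tag the resulting tokens, and write both to files."""
    with open(news_file, encoding="utf-8") as f1, \
         open(token_file, "w", encoding="utf-8") as f2, \
         open(tagged_file, "w", encoding="utf-8") as f3:
        for line in f1:
            tokens = tokenize(line)        # str -> list of str
            if not tokens:
                continue                   # skip blank lines
            tagged = tag(tokens)           # list of str -> list of (word, tag)
            for token in tokens:
                print(token, file=f2)      # one token per line
            for word, pos in tagged:
                print(word, pos, file=f3)  # "word TAG" per line


# Assuming NLTK is installed and its tokenizer and tagger data are downloaded:
#   from nltk import pos_tag, word_tokenize
#   write_tokens_and_tags(news_file, token_file, tagged_file,
#                         word_tokenize, pos_tag)
```

Passing the tokenizer and tagger as arguments keeps the file handling testable without NLTK; in the asker's setting they would simply be word_tokenize and pos_tag.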

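Don't save and load text files when you can pickle the data. Use pickle.

Does using pickle mean saving and loading files locally on my machine? I ask because I have 2500 news articles in a single txt document.

Pickle is your friend, especially for small data.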
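To make the pickle suggestion concrete, here is a minimal sketch of saving a tagged list to a local file and loading it back. The helper names save_tagged and load_tagged and the sample data are made up for illustration. Note that pickle files are binary and Python-specific, and should only be loaded from sources you trust.

```python
import pickle
from typing import List, Tuple


def save_tagged(tagged: List[Tuple[str, str]], path: str) -> None:
    """Serialize the tagged tokens to a local binary file."""
    with open(path, "wb") as f:
        pickle.dump(tagged, f)


def load_tagged(path: str) -> List[Tuple[str, str]]:
    """Read the tagged tokens back as the same Python objects."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Made-up sample in the shape a POS tagger returns: (word, tag) pairs.
sample = [("Stocks", "NNS"), ("rose", "VBD")]
```

Calling save_tagged(sample, "tagged.pkl") followed by load_tagged("tagged.pkl") returns a list of tuples equal to the original, so no parsing of a text file is needed on reload.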