Issue with Japanese txt file and wordcloud output in Python
I based this on amueller's word_cloud and a few other examples. Here is what I have:
#!c:/Python27/python.exe
# coding: UTF-8
from os import path
from wordcloud import WordCloud
import MeCab as mc

d = path.dirname("C:\Users\BobLeponge\Desktop\jpn\JPNTEXT.txt")
text = open(path.join(d, 'JPNTEXT.txt')).read()
text = text.decode("utf-8")

def mecab_analysis(text):
    t = mc.Tagger('-Ochasen -d/usr/local/Cellar/mecab/0.996/lib/mecab/dic/mecab-ipadic-neologd/')
    enc_text = text.encode('utf-8')
    node = t.parseToNode(enc_text)
    output = []
    while(node):
        if node.surface != "":
            word_type = node.feature.split(",")[0]
            if word_type in ["形容詞", "動詞","名詞", "副詞"]:
                output.append(node.surface)
        node = node.next
        if node is None:
            break
    return output

def create_wordcloud(text):
    fpath = "C:\WINDOWS\Fonts\NotoSansMonoCJKjp-Regular.otf"
    stop_words = [ u'てる', u'いる', u'なる', u'れる', u'する', u'ある', u'こと', u'これ', u'さん', u'して', \
                   u'くれる', u'やる', u'くださる', u'そう', u'せる', u'した', u'思う', \
                   u'それ', u'ここ', u'ちゃん', u'くん', u'', u'て',u'に',u'を',u'は',u'の', u'が', u'と', u'た', u'し', u'で', \
                   u'ない', u'も', u'な', u'い', u'か', u'ので', u'よう', u'']
    wordcloud = WordCloud(background_color="white",font_path=fpath, width=900, height=500, \
                          stopwords=set(stop_words)).generate(text)

import matplotlib.pyplot as plt
wordcloud = WordCloud(background_color="white", width=900, height=500).generate(text)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
When I run it, this image pops up.
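I'm still a beginner in Python, so I'm not sure what exactly is going wrong. I checked the encoding of the text file and it is UTF-8, and # coding: UTF-8 is at the top of the code, so I thought it would work. What am I doing wrong?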
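Two things in the script may be worth checking, and the sketch below illustrates both. First, the # coding: UTF-8 line only declares the encoding of the source file itself; it does not change how JPNTEXT.txt is read. Second, as posted, mecab_analysis and create_wordcloud are defined but never called, so the cloud that is actually displayed comes from the last WordCloud(...) call, which receives the raw text and no font_path. If the image shows empty boxes instead of characters (an assumption, since the image is not available here), a default font without Japanese glyphs is a plausible cause. The helper names read_utf8, join_tokens and render_cloud are hypothetical, and the tokens are assumed to be unicode strings.

```python
# -*- coding: utf-8 -*-
# Sketch only. The coding line above applies to the literals in THIS file,
# not to any text file the script opens later.
import io
import os
import tempfile


def read_utf8(file_path):
    """Read a text file as UTF-8 and return unicode text (hypothetical helper)."""
    with io.open(file_path, encoding="utf-8") as handle:
        return handle.read()


def join_tokens(tokens):
    """Join pre-tokenized words with spaces so they can be split and counted."""
    return u" ".join(tokens)


def render_cloud(words, font_path):
    """Build a cloud from tokenized words, using a font that has Japanese glyphs."""
    # Imported lazily so the rest of this sketch runs without wordcloud installed.
    from wordcloud import WordCloud
    cloud = WordCloud(background_color="white", font_path=font_path,
                      width=900, height=500)
    return cloud.generate(join_tokens(words))


# Round-trip a small Japanese sample through a temporary file to show that
# the decoding is decided by the explicit encoding argument of io.open.
sample = u"猫が好きです"
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with io.open(sample_path, "w", encoding="utf-8") as handle:
    handle.write(sample)

text = read_utf8(sample_path)
joined = join_tokens([u"猫", u"好き"])

# Clean up the temporary file and directory.
os.remove(sample_path)
os.rmdir(tmp_dir)
```

In the posted script, the equivalent would be to pass the output of mecab_analysis(text) and the fpath value to something like render_cloud, and to display that result rather than a second WordCloud built without font_path. Under Python 2.7 the surfaces returned by MeCab are byte strings, so they would likely need .decode("utf-8") before being joined.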