Stanford NLP for Python [python, stanford-nlp, sentiment-analysis]


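All I want to do is find the sentiment (positive/negative/neutral) of any given string. While researching, I came across Stanford NLP. Unfortunately, it is written in Java. Any ideas on how I can make it work from Python?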

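TextBlob is an excellent package for sentiment analysis written in Python. See its documentation for details. The sentiment of any given sentence is computed by inspecting its words and their corresponding sentiment scores. You can install it with: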

$ pip install -U textblob
$ python -m textblob.download_corpora

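The first command installs the latest version of textblob into your (virtualenv) system: because you pass -U, pip upgrades the package to the latest available version. The second command downloads all the data it needs, the corpora.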
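
I am facing a similar situation. Most of my projects are in Python, but the most important part is in Java. Luckily, it is quite easy to learn how to use the Stanford CoreNLP jar.
Here is one of my scripts; you can download the jars and run it.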
import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations.SentimentAnnotatedTree;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.ArrayCoreMap;
import edu.stanford.nlp.util.CoreMap;

public class Simple_NLP {
    static StanfordCoreNLP pipeline;

    public static void init() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
        pipeline = new StanfordCoreNLP(props);
    }

    public static String findSentiment(String tweet) {
        String SentiReturn = "";
        String[] SentiClass ={"very negative", "negative", "neutral", "positive", "very positive"};

        //Sentiment is an integer, ranging from 0 to 4. 
        //0 is very negative, 1 negative, 2 neutral, 3 positive and 4 very positive.
        int sentiment = 2;

        if (tweet != null && tweet.length() > 0) {
            Annotation annotation = pipeline.process(tweet);

            List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
            if (sentences != null && sentences.size() > 0) {

                // CoreMap already exposes get(), so no cast to ArrayCoreMap is needed.
                CoreMap sentence = sentences.get(0);
                Tree tree = sentence.get(SentimentAnnotatedTree.class);
                sentiment = RNNCoreAnnotations.getPredictedClass(tree);
                SentiReturn = SentiClass[sentiment];
            }
        }
        return SentiReturn;
    }

}

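I am facing the same problem. One possible solution is a wrapper that uses Py4j, as pointed out by @roopalgarg.
From the repository's description:
This repo provides a Python interface for calling Stanford's CoreNLP Java package (current as of v3.5.1). It uses py4j to interact with the JVM; as such, in order to run a script like scripts/runGateway.py, you must first compile and run the Java classes that create the JVM gateway.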
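A rough sketch of what the Python side of such a Py4j setup can look like. It assumes a Java gateway process is already running and that its entry point exposes a findSentiment(String) method like the one in the Java class above. Both that entry point and the helper name sentiment_via_gateway are assumptions made for illustration, not the repository's documented API.

```python
def sentiment_via_gateway(text, gateway=None):
    """Ask a JVM-side entry point for the sentiment label of `text`.

    Assumption: the gateway's entry point has a findSentiment(String) method
    (hypothetical; adapt it to whatever your Java class actually exposes).
    """
    if gateway is None:
        # Imported lazily so the function can also be exercised with a stub.
        from py4j.java_gateway import JavaGateway
        gateway = JavaGateway()  # connects to a gateway server on the default port
    return gateway.entry_point.findSentiment(text)
```

Passing the gateway in as a parameter keeps the JVM connection reusable across calls and makes the function easy to try out without Java installed.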
Use py-corenlp
Download Stanford CoreNLP
The latest version at this time (2020-05-25) is 4.0.0:
wget https://nlp.stanford.edu/software/stanford-corenlp-4.0.0.zip https://nlp.stanford.edu/software/stanford-corenlp-4.0.0-models-english.jar

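If you do not have wget, you probably have curl, which can fetch the same two URLs.
If all else fails, use a browser ;-)
Install the package: unzip the archive and move the models jar into the unpacked directory.
Start the server with java from that directory.
Notes:
timeout is in milliseconds; I set it to 10 seconds when starting the server. You should increase it if you pass huge blobs to the server.
There are more options; you can list them with --help.
-mx5g should allocate enough memory, but YMMV, and you may need to modify the option if your box is underpowered.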
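For those who prefer to launch the server from Python, here is a minimal sketch. The flags -mx, -port and -timeout are the ones discussed in the notes, and edu.stanford.nlp.pipeline.StanfordCoreNLPServer is the server's main class. The directory name stanford-corenlp-4.0.0 and the two helper names are assumptions, so adjust them to your setup.

```python
import subprocess


def corenlp_server_command(port=9000, timeout_ms=10000, heap="5g"):
    """Build the argument list for launching the CoreNLP server."""
    return [
        "java", f"-mx{heap}",
        "-cp", "*",  # Java itself expands "*" to every jar in the working directory
        "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
        "-port", str(port),
        "-timeout", str(timeout_ms),
    ]


def start_corenlp_server(corenlp_dir="stanford-corenlp-4.0.0", **options):
    """Start the server as a child process.

    corenlp_dir is an assumed location: the directory the zip was unpacked to.
    """
    return subprocess.Popen(corenlp_server_command(**options), cwd=corenlp_dir)
```

Calling terminate() on the returned process object stops the server again.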
Install the Python package
The standard package
pip install pycorenlp

does not work with Python 3.9, so you need to run
pip install git+https://github.com/sam-s/py-corenlp.git

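Use it
The client script is the listing near the end of this page that begins with "from pycorenlp import StanfordCoreNLP".
You will get: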
0: 'I love you .': 3 Positive
1: 'I hate him .': 1 Negative
2: 'You are nice .': 3 Positive
3: 'He is dumb': 1 Negative

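Notes
You pass the whole text to the server, and the server splits it into sentences. It also splits the sentences into tokens.
The sentiment is ascribed to each sentence, not to the whole text. The sentimentValue figures across sentences can be used to estimate the sentiment of the whole text, for example by averaging them.
The average sentiment of a sentence is between Neutral (2) and Negative (1). The range is from VeryNegative (0) to VeryPositive (4), and the extremes appear to be quite rare.
You can stop the server either by typing Ctrl-C in the terminal where you started it or with the shell command kill $(lsof -ti tcp:9000). 9000 is the default port; you can change it with the -port option when starting the server.
If you get timeout errors, increase the timeout (in milliseconds) in the server or the client.
sentiment is just one annotator; there are many more, and you can request several, separated by commas: 'annotators': 'sentiment,lemma'.
Beware that the sentiment model is somewhat idiosyncratic.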
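To illustrate the note about estimating the sentiment of a whole text, here is a small sketch that averages the per-sentence values. It assumes the response has already been parsed into a dict with a "sentences" list whose items carry a "sentimentValue" field, as in the client script. The helper name mean_sentiment is made up for this example.

```python
def mean_sentiment(response):
    """Average the per-sentence sentimentValue (0..4) of a CoreNLP JSON response.

    Returns None when the response contains no sentences.
    """
    # sentimentValue arrives as a string in the JSON output, hence int().
    values = [int(s["sentimentValue"]) for s in response.get("sentences", [])]
    if not values:
        return None
    return sum(values) / len(values)


# Shaped like the four-sentence output shown earlier (values 3, 1, 3, 1).
example = {"sentences": [
    {"sentimentValue": "3", "sentiment": "Positive"},
    {"sentimentValue": "1", "sentiment": "Negative"},
    {"sentimentValue": "3", "sentiment": "Positive"},
    {"sentimentValue": "1", "sentiment": "Negative"},
]}
print(mean_sentiment(example))  # 2.0, i.e. Neutral overall
```

A plain mean treats every sentence equally. Weighting by sentence length is one possible refinement, but that is a design choice rather than anything the server prescribes.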
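PS. I cannot believe that I added a 9th answer, but I guess I had to, since none of the existing answers helped me (some of the 8 previous answers have since been deleted, and others have been converted to comments).

Use the stanfordcorenlp Python library
stanfordcorenlp is a very good wrapper on top of Stanford CoreNLP that lets you use it from Python.
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip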

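Usage: see the "Simple usage" listing near the end of this page.

I recommend using the TextBlob library. A sample implementation looks like this: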
from textblob import TextBlob
def sentiment(message):
    # create TextBlob object of passed tweet text
    analysis = TextBlob(message)
    # set sentiment
    return (analysis.sentiment.polarity)
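
The function above returns a polarity score between -1 and 1, while the question asks for a positive/negative/neutral label. A minimal sketch of one way to map the score to a label follows. The 0.05 threshold and both helper names are arbitrary choices for illustration, not part of TextBlob.

```python
def label_from_polarity(polarity, threshold=0.05):
    """Map a polarity score in [-1, 1] to a coarse sentiment label."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def sentiment_label(message, threshold=0.05):
    """Label a message using TextBlob's polarity score."""
    # Imported lazily; requires `pip install textblob`.
    from textblob import TextBlob
    return label_from_polarity(TextBlob(message).sentiment.polarity, threshold)
```

Widening the threshold makes the neutral band larger, which may suit short or noisy texts such as tweets.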

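There is a very recent development on this issue: you can now use the stanfordnlp package in Python.
From its description:
A native Python implementation of Stanford NLP tools.
Recently, Stanford released a new neural network (NN) based m...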
from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')
res = nlp.annotate("I love you. I hate him. You are nice. He is dumb",
                   properties={
                       'annotators': 'sentiment',
                       'outputFormat': 'json',
                       'timeout': 1000,
                   })
for s in res["sentences"]:
    print("%d: '%s': %s %s" % (
        s["index"],
        " ".join([t["word"] for t in s["tokens"]]),
        s["sentimentValue"], s["sentiment"]))

# Simple usage
from stanfordcorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('/Users/name/stanford-corenlp-full-2018-10-05')

sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'
print('Tokenize:', nlp.word_tokenize(sentence))
print('Part of Speech:', nlp.pos_tag(sentence))
print('Named Entities:', nlp.ner(sentence))
print('Constituency Parsing:', nlp.parse(sentence))
print('Dependency Parsing:', nlp.dependency_parse(sentence))

nlp.close() # Do not forget to close! The backend server consumes a lot of memory.

pip install stanfordnlp

import stanfordnlp

stanfordnlp.download('en')   # This downloads the English models for the neural pipeline
nlp = stanfordnlp.Pipeline() # This sets up a default neural pipeline in English
doc = nlp("Barack Obama was born in Hawaii.  He was elected president in 2008.")
doc.sentences[0].print_dependencies()