CoreNLP API for N-grams with positions

nlp stanford-nlp

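Does CoreNLP have an API for getting n-grams together with their positions?

For example, I have the string "I have the best car". If I use mingrams=1 and maxgrams=2, I should get the output shown below. I know about StringUtils and its ngram function, but how do I get the positions?

(I,0)
(I have,0)
(have,1)
(have the,1)
(the,2)
(the best,2)

and so on, based on the string I pass in.

Any help is much appreciated.

Thanks

I didn't see anything useful for this. Here is some sample code to help you: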

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NGramPositionExample {

    // Returns every n-gram of minSize..maxSize tokens. The last element of each
    // inner list is the index of the n-gram's first token, stored as a string.
    public static List<List<String>> getNGramsPositions(List<String> items, int minSize, int maxSize) {
        List<List<String>> ngrams = new ArrayList<List<String>>();
        int listSize = items.size();
        for (int i = 0; i < listSize; ++i) {
            for (int ngramSize = minSize; ngramSize <= maxSize; ++ngramSize) {
                // Skip n-grams that would run past the end of the token list.
                if (i + ngramSize <= listSize) {
                    List<String> ngram = new ArrayList<String>();
                    for (int j = i; j < i + ngramSize; ++j) {
                        ngram.add(items.get(j));
                    }
                    ngram.add(Integer.toString(i));
                    ngrams.add(ngram);
                }
            }
        }
        return ngrams;
    }

    public static void main(String[] args) {
        String testString = "I have the best car";
        // Simple whitespace tokenization; prints lines such as [I, have, 0].
        List<String> tokens = Arrays.asList(testString.split(" "));
        List<List<String>> ngramsAndPositions = getNGramsPositions(tokens, 1, 2);
        for (List<String> np : ngramsAndPositions) {
            System.out.println(Arrays.toString(np.toArray()));
        }
    }
}
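
The sample above stores the position as the last string in each list. If you would rather get pairs shaped like the (I have,0) output in the question, one possible variation is sketched below. The names NGramPositions, PositionedNGram and positionedNGrams are made up for this illustration and are not part of CoreNLP. The sketch assumes whitespace tokenization and minSize >= 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a CoreNLP class.
public class NGramPositions {

    /** An n-gram joined by single spaces, plus the index of its first token. */
    public static final class PositionedNGram {
        public final String text;
        public final int position;

        PositionedNGram(String text, int position) {
            this.text = text;
            this.position = position;
        }

        @Override
        public String toString() {
            return "(" + text + "," + position + ")";
        }
    }

    // Assumes minSize >= 1. For each start index, emits the n-grams of
    // minSize..maxSize tokens that fit inside the token list.
    public static List<PositionedNGram> positionedNGrams(List<String> tokens, int minSize, int maxSize) {
        List<PositionedNGram> result = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                String text = String.join(" ", tokens.subList(start, start + size));
                result.add(new PositionedNGram(text, start));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("I have the best car".split(" "));
        // Prints one pair per line, starting with (I,0) and (I have,0).
        for (PositionedNGram ngram : positionedNGrams(tokens, 1, 2)) {
            System.out.println(ngram);
        }
    }
}
```

Keeping the position as an int means callers do not have to parse it back out of the token list, and a token that happens to be a number cannot be confused with the position.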

I just spent some time rewriting the code above in Scala. The output is as follows:

NgramInfo(I,0)
NgramInfo(I have,0)
NgramInfo(have,1)
NgramInfo(have the,1)
NgramInfo(the,2)
NgramInfo(the best,2)
NgramInfo(best,3)
NgramInfo(best car,3)
NgramInfo(car,4)

Here is the method together with its case class:
import scala.collection.mutable.ListBuffer

def getNgramPositions(items: List[String], minSize: Int, maxSize: Int): List[NgramInfo] = {
  val ngramList = new ListBuffer[NgramInfo]
  for (i <- items.indices) {
    // `to` is inclusive, so n-grams of exactly maxSize tokens are produced too.
    for (ngramSize <- minSize to maxSize) {
      // Skip n-grams that would run past the end of the token list.
      if (i + ngramSize <= items.size) {
        // slice(i, i + ngramSize) takes exactly ngramSize tokens starting at i.
        ngramList += NgramInfo(items.slice(i, i + ngramSize).mkString(" "), i)
      }
    }
  }
  ngramList.toList
}

case class NgramInfo(term: String, termPosition: Int) extends Serializable
