Java: How do I train an Italian model with OpenNLP on Hadoop?

Tags: java, hadoop, nlp, opennlp, linguistics

I want to implement a natural language processing algorithm for Italian on Hadoop.

I have two questions:

1. How do I find a stemming algorithm for Italian?
2. How do I integrate it into Hadoop?

Here is my code:

String pathSent = ...tagged sentences...;
String pathChunk = ....chunked train path....;
File fileSent = new File(pathSent);
File fileChunk = new File(pathChunk);

InputStream inSent = new FileInputStream(fileSent);
InputStream inChunk = new FileInputStream(fileChunk);

// Train the POS tagger on the tagged-sentence corpus (cutoff 3, 3 iterations)
POSModel posModel = POSTaggerME.train("it",
        new WordTagSampleStream(new InputStreamReader(inSent)),
        ModelType.MAXENT, null, null, 3, 3);

// Train the chunker on the chunked corpus (cutoff 1, 1 iteration)
ObjectStream<String> stringStream = new PlainTextByLineStream(new InputStreamReader(inChunk));
ObjectStream<ChunkSample> chunkStream = new ChunkSampleStream(stringStream);
ChunkerModel chunkModel = ChunkerME.train("it", chunkStream, 1, 1);

this.tagger = new POSTaggerME(posModel);
this.chunker = new ChunkerME(chunkModel);

inSent.close();
inChunk.close();
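On the first question: OpenNLP 1.8+ ships a Snowball stemmer with an Italian algorithm (older 1.5.x releases do not include one). A minimal sketch, assuming OpenNLP 1.8+; the class name ItalianStemDemo is made up here, and the printed stems are illustrative:

import opennlp.tools.stemmer.snowball.SnowballStemmer;
import opennlp.tools.stemmer.snowball.SnowballStemmer.ALGORITHM;

public class ItalianStemDemo {
    public static void main(String[] args) {
        // ALGORITHM.ITALIAN selects the Snowball rules for Italian
        SnowballStemmer stemmer = new SnowballStemmer(ALGORITHM.ITALIAN);
        System.out.println(stemmer.stem("andare"));   // e.g. "andar"
        System.out.println(stemmer.stem("bambini"));  // e.g. "bambin"
    }
}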

You need grammatically tagged sentences, for example:

"io voglio andare a casa"

io, sostantivo
volere, verbo
andare, verbo
a, preposizione semplice
casa, oggetto
Once you have tagged the sentences, you can train OpenNLP on them.
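For reference, a minimal sketch of the training format that the WordTagSampleStream in the question expects: one sentence per line, each token joined to its tag by an underscore (the tag names below are illustrative, not a fixed Italian tagset):

io_PRON voglio_VERB andare_VERB a_PREP casa_NOUN
domani_ADV vado_VERB al_PREP mare_NOUN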

Then create a custom map/reduce job on Hadoop:

public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {

    // your code here
  }
}
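For illustration, a hedged sketch of what that map body could do with the trained tagger: count POS tags over the input corpus. The class name PosTagMap, the local model file name it-pos.bin, and the DistributedCache assumption are all made up for this sketch, not part of the original answer.

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.tokenize.WhitespaceTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PosTagMap extends Mapper<LongWritable, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();
  private POSTaggerME posTagger;

  @Override
  protected void setup(Context context) throws IOException {
    // Assumes the trained model was shipped to each node, e.g. via the
    // DistributedCache, under the (hypothetical) local name it-pos.bin
    try (InputStream in = new FileInputStream("it-pos.bin")) {
      posTagger = new POSTaggerME(new POSModel(in));
    }
  }

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Tokenize the line on whitespace, tag it, and emit (tag, 1) pairs
    String[] tokens = WhitespaceTokenizer.INSTANCE.tokenize(value.toString());
    String[] tags = posTagger.tag(tokens);
    for (String tag : tags) {
      word.set(tag);
      context.write(word, one);
    }
  }
}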
public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

  @Override
  protected void reduce(Text key, Iterable<IntWritable> values,
      Context context) throws IOException, InterruptedException {

    // your reduce here
  }
}
public static void main(String[] args) throws Exception {
  Configuration conf = new Configuration();

  Job job = new Job(conf, "opennlp");
  job.setJarByClass(CustomOpenNLP.class);

  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);

  job.setMapperClass(Map.class);
  job.setReducerClass(Reduce.class);

  job.setInputFormatClass(TextInputFormat.class);
  job.setOutputFormatClass(TextOutputFormat.class);

  FileInputFormat.addInputPath(job, new Path(args[0]));
  FileOutputFormat.setOutputPath(job, new Path(args[1]));

  job.waitForCompletion(true);
}
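Package the classes into a jar and launch the job with the standard hadoop jar command (the jar name below is illustrative):

hadoop jar opennlp-job.jar CustomOpenNLP /path/to/input /path/to/output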