How do I read the Alignment type from BerkeleyAligner? - Java
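I downloaded BerkeleyAligner from http://code.google.com/p/berkeleyaligner/ and added the project to my build path in Eclipse. With the code below, I can extract the alignment for each sentence pair that I read from sourceFile and targetFile.
After aligning, how do I read the alignment out of BerkeleyAligner's Alignment type?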
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;

import edu.berkeley.nlp.wa.mt.Alignment;
import edu.berkeley.nlp.wa.mt.SentencePair;
import edu.berkeley.nlp.wordAlignment.combine.WordAlignerCombined;

public static void main(String[] args) throws IOException {
    BufferedReader brSrc = new BufferedReader(new FileReader("sourceFile"));
    BufferedReader brTrg = new BufferedReader(new FileReader("targetFile"));
    int sentCounter = 0;
    String currentSrcLine;
    while ((currentSrcLine = brSrc.readLine()) != null) {
        String currentTrgLine = brTrg.readLine();
        // Reads into BerkeleyAligner SentencePair format.
        // params is defined elsewhere in my code and is not shown here.
        SentencePair src2trg = new SentencePair(sentCounter, params.get("source"),
                Arrays.asList(currentSrcLine.split(" ")), Arrays.asList(currentTrgLine.split(" ")));
        // Generate Alignment type from SentencePair.
        // The aligner's setup is not shown here; it must be initialized before this call.
        WordAlignerCombined aligner;
        Alignment alignedPair = aligner.alignSentencePair(src2trg);
        // How do I print out the Alignment???
        sentCounter++;
    }
}
e.g. source file:
this is the first line in the textfile.
that is the second line.
foo bar likes to eat bar foo.
e.g. target file:
Dies ist die erste Textzeile in der Datei.
das ist die zweite Zeile.
foo bar gerne bar foo essen.
To print the alignment in GIZA format, there is a method:
public void writeGIZA(PrintWriter out, int idx)
The GIZA output is built from these format strings:
"# sentence pair (%d) source length %d target length %d alignment score : 0\n"
"NULL ({ %s })"
" %s ({ %s })" (englishSentence.get(i), StrUtils.join(alignments))
idx is just the sentence pair id, and out is where you want the output printed.
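To make the format above concrete, here is a minimal standalone sketch that rebuilds a GIZA-style record from the three format strings quoted in the answer. It does not use BerkeleyAligner at all: the class name GizaFormatSketch, the method formatGiza, the representation of links as {sourceIndex, targetIndex} pairs, and the 1-based positions in the output are all assumptions made for illustration, and the library's real writeGIZA may emit additional lines. With the library on the classpath, the call inside the question's loop would presumably be alignedPair.writeGIZA(out, sentCounter), where out is a java.io.PrintWriter (for example one wrapping System.out) that is flushed afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class GizaFormatSketch {

    /**
     * Formats one aligned sentence pair with the three format strings quoted above.
     * Each link is {sourceIndex, targetIndex}, both 0-based (an assumption of this sketch).
     */
    public static String formatGiza(int idx, List<String> source, List<String> target,
                                    List<int[]> links) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(
                "# sentence pair (%d) source length %d target length %d alignment score : 0\n",
                idx, source.size(), target.size()));

        // Target positions that no source word links to are listed under NULL.
        boolean[] linked = new boolean[target.size()];
        for (int[] link : links) {
            linked[link[1]] = true;
        }
        StringJoiner unaligned = new StringJoiner(" ");
        for (int j = 0; j < target.size(); j++) {
            if (!linked[j]) {
                unaligned.add(Integer.toString(j + 1)); // 1-based positions (assumed)
            }
        }
        sb.append(String.format("NULL ({ %s })", unaligned));

        // One " word ({ positions })" group per source word.
        for (int i = 0; i < source.size(); i++) {
            StringJoiner aligned = new StringJoiner(" ");
            for (int[] link : links) {
                if (link[0] == i) {
                    aligned.add(Integer.toString(link[1] + 1));
                }
            }
            sb.append(String.format(" %s ({ %s })", source.get(i), aligned));
        }
        return sb.append('\n').toString();
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("that", "is", "the", "second", "line.");
        List<String> target = Arrays.asList("das", "ist", "die", "zweite", "Zeile.");
        // Hypothetical one-to-one links, made up for illustration only.
        List<int[]> links = new ArrayList<>();
        for (int i = 0; i < source.size(); i++) {
            links.add(new int[] {i, i});
        }
        System.out.print(formatGiza(1, source, target, links));
    }
}
```

Returning a String rather than writing to a PrintWriter keeps the sketch easy to check by eye; wrapping the result in out.print(...) gives the same shape as the method signature in the answer.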
After some searching online, here are some hints: http://code.google.com/p/tdx-nlp/source/browse/trunk/pa2/java/src/cs224n/assignments/WordAlignmentTester.java?r=67 . But I am still figuring out how to call it.