Java CMUSphinx never recognizes any words in an audio file
Sphinx doesn't seem to recognize or process the audio file: it accepts the audio stream and returns an empty result (SpeechResult). I don't think there is anything wrong with the audio files I'm using, since I have tried several and none of them work. Does anyone have an audio file that they know works? Does anything stand out that could cause the stream to produce no transcription?
import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.SpeechResult;
import edu.cmu.sphinx.api.StreamSpeechRecognizer;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class AutomaticSpeechRecognition {
    public static void main(String[] args) throws IOException {
        // Use the default en-us models bundled in sphinx4-data.
        Configuration configuration = new Configuration();
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
        configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.dmp");
        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);

        // Earlier attempts:
        //recognizer.startRecognition(new FileInputStream("E:/1video/hello-5.mp3"));
        //is = AutomaticSpeechRecognition.class.getResourceAsStream("/edu/cmu/sphinx/demo/aligner/10001-90210-01803.wav");
        File file = new File("E:/1video/bargain_not.wav");
        try (InputStream is = new FileInputStream(file)) {
            recognizer.startRecognition(is);
            SpeechResult result;
            // getResult() returns null once the end of the stream is reached.
            while ((result = recognizer.getResult()) != null) {
                System.out.println(result.getResult());
                System.out.println(result.getHypothesis());
                System.out.println(result.getWords());
            }
            /*for (WordResult wordResult : result.getWords())
            {
                System.out.println(wordResult);
            }*/
            recognizer.stopRecognition();
        }
    }
}
Here is the output from running it. Nothing appears to fail:
09:31:13.430 INFO unitManager CI Unit: *+NSN+
09:31:13.433 INFO unitManager CI Unit: *+SPN+
09:31:13.433 INFO unitManager CI Unit: AA
09:31:13.434 INFO unitManager CI Unit: AE
09:31:13.434 INFO unitManager CI Unit: AH
09:31:13.434 INFO unitManager CI Unit: AO
09:31:13.434 INFO unitManager CI Unit: AW
09:31:13.434 INFO unitManager CI Unit: AY
09:31:13.434 INFO unitManager CI Unit: B
09:31:13.434 INFO unitManager CI Unit: CH
09:31:13.434 INFO unitManager CI Unit: D
09:31:13.434 INFO unitManager CI Unit: DH
09:31:13.434 INFO unitManager CI Unit: EH
09:31:13.435 INFO unitManager CI Unit: ER
09:31:13.435 INFO unitManager CI Unit: EY
09:31:13.435 INFO unitManager CI Unit: F
09:31:13.435 INFO unitManager CI Unit: G
09:31:13.435 INFO unitManager CI Unit: HH
09:31:13.435 INFO unitManager CI Unit: IH
09:31:13.435 INFO unitManager CI Unit: IY
09:31:13.435 INFO unitManager CI Unit: JH
09:31:13.435 INFO unitManager CI Unit: K
09:31:13.435 INFO unitManager CI Unit: L
09:31:13.435 INFO unitManager CI Unit: M
09:31:13.436 INFO unitManager CI Unit: N
09:31:13.436 INFO unitManager CI Unit: NG
09:31:13.436 INFO unitManager CI Unit: OW
09:31:13.436 INFO unitManager CI Unit: OY
09:31:13.436 INFO unitManager CI Unit: P
09:31:13.436 INFO unitManager CI Unit: R
09:31:13.436 INFO unitManager CI Unit: S
09:31:13.436 INFO unitManager CI Unit: SH
09:31:13.436 INFO unitManager CI Unit: T
09:31:13.436 INFO unitManager CI Unit: TH
09:31:13.436 INFO unitManager CI Unit: UH
09:31:13.437 INFO unitManager CI Unit: UW
09:31:13.437 INFO unitManager CI Unit: V
09:31:13.437 INFO unitManager CI Unit: W
09:31:13.437 INFO unitManager CI Unit: Y
09:31:13.437 INFO unitManager CI Unit: Z
09:31:13.437 INFO unitManager CI Unit: ZH
09:31:14.014 INFO autoCepstrum Cepstrum component auto-configured as follows: autoCepstrum {MelFrequencyFilterBank, Denoise, DiscreteCosineTransform2, Lifter}
09:31:14.030 INFO dictionary Loading dictionary from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict
09:31:14.132 INFO dictionary Loading filler dictionary from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/en-us/noisedict
09:31:14.132 INFO acousticModelLoader Loading tied-state acoustic model from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/en-us
09:31:14.133 INFO acousticModelLoader Pool means Entries: 16128
09:31:14.133 INFO acousticModelLoader Pool variances Entries: 16128
09:31:14.133 INFO acousticModelLoader Pool transition_matrices Entries: 42
09:31:14.133 INFO acousticModelLoader Pool senones Entries: 5126
09:31:14.133 INFO acousticModelLoader Gaussian weights: mixture_weights. Entries: 15378
09:31:14.133 INFO acousticModelLoader Pool senones Entries: 5126
09:31:14.133 INFO acousticModelLoader Context Independent Unit Entries: 42
09:31:14.133 INFO acousticModelLoader HMM Manager: 137095 hmms
09:31:14.134 INFO acousticModel CompositeSenoneSequences: 0
09:31:14.134 INFO largeTrigramModel Loading n-gram language model from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/en-us.lm.dmp
09:31:14.807 INFO largeTrigramModel 1-grams: 19794
09:31:14.807 INFO largeTrigramModel 2-grams: 1377200
09:31:14.807 INFO largeTrigramModel 3-grams: 3178194
09:31:15.582 INFO lexTreeLinguist Max CI Units 43
09:31:15.583 INFO lexTreeLinguist Unit table size 79507
09:31:15.585 INFO speedTracker # ----------------------------- Timers----------------------------------------
09:31:15.585 INFO speedTracker # Name Count CurTime MinTime MaxTime AvgTime TotTime
09:31:15.586 INFO speedTracker Load Dictionary 1 0.1020s 0.1020s 0.1020s 0.1020s 0.1020s
09:31:15.586 INFO speedTracker Load LM 1 0.6730s 0.6730s 0.6730s 0.6730s 0.6730s
09:31:15.586 INFO speedTracker Compile 1 0.7760s 0.7760s 0.7760s 0.7760s 0.7760s
09:31:15.586 INFO speedTracker Load AM 1 1.5450s 1.5450s 1.5450s 1.5450s 1.5450s
09:31:15.608 INFO speedTracker This Time Audio: 1.94s Proc: 0.01s Speed: 0.00 X real time
09:31:15.608 INFO speedTracker Total Time Audio: 1.94s Proc: 0.01s 0.00 X real time
09:31:15.609 INFO memoryTracker Mem Total: 454.75 Mb Free: 262.35 Mb
09:31:15.609 INFO memoryTracker Used: This: 192.40 Mb Avg: 192.40 Mb Max: 192.40 Mb
09:31:15.610 INFO largeTrigramModel LM Cache Size: 0 Hits: 0 Misses: 0
<s> </s>
As Nickolay Shmyrev said, the file must be a 16 kHz, 16-bit, mono MS WAV. Such a file can be recorded with Audacity.
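To check whether a file meets that requirement before handing it to Sphinx, a small sketch like the one below may help. It uses only the standard javax.sound.sampled API. The class name WavFormatCheck and its method names are made up for illustration, and the signed little-endian PCM check is an assumption about what a standard MS WAV file contains. The convert method is a best-effort attempt: Java Sound does not support every conversion on every JVM, and it throws IllegalArgumentException when it cannot perform one.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;

// Hypothetical helper, not part of Sphinx4.
final class WavFormatCheck {

    /** True if the format is 16 kHz, 16-bit, mono, signed little-endian PCM. */
    public static boolean isSphinxCompatible(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian();
    }

    /**
     * Tries to rewrite src as a 16 kHz, 16-bit, mono WAV file at dst.
     * Throws IllegalArgumentException if this JVM cannot do the conversion.
     */
    public static void convert(File src, File dst) throws Exception {
        AudioFormat target = new AudioFormat(16000f, 16, 1, true, false);
        try (AudioInputStream in = AudioSystem.getAudioInputStream(src);
             AudioInputStream out = AudioSystem.getAudioInputStream(target, in)) {
            AudioSystem.write(out, AudioFileFormat.Type.WAVE, dst);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new File(args[0]))) {
            AudioFormat format = in.getFormat();
            System.out.println(format);
            System.out.println(isSphinxCompatible(format)
                    ? "Format matches 16 kHz 16-bit mono PCM"
                    : "Format differs; convert the file before recognition");
        }
    }
}
```

Running it against the WAV file from the question prints the detected format, which makes a sample-rate or channel mismatch easy to spot. An MP3 file, like the one in the commented-out line of the question, will most likely be rejected with UnsupportedAudioFileException, since the standard Java Sound readers do not decode MP3.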