Java Android: Finding the fundamental frequency of audio input
I have been trying for some time to find the best way to calculate, in real time, the fundamental frequency of samples captured with AudioRecord. I have looked at a few examples here on SO; those questions helped me the most, but I still don't fully understand how they find the fundamental frequency. So what I am looking for is a more detailed explanation of what I need to do to find the fundamental frequency, given a buffer of samples.
So, I create an AudioRecord:
micData = new AudioRecord(audioSource, sampleRate, channel, encoding, bufferSize);
data = new short[bufferSize];
Then I start listening:
micData.startRecording();
sample = micData.read(data,0,bufferSize);
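I know how to create an array of complex numbers, but I don't know exactly which values I should use to build those complex numbers, or which method returns the peak frequency.

Reading your question, I see that you are not yet sure you want to use FFTs. That's good, because I don't recommend using the FFT alone. Stay in the time domain and use autocorrelation or AMDF; if you want more accurate results, use the FFT only as an additional component.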
// Requires: import java.util.Arrays;
public double getPitchInSampleRange(AudioSamples as, int start, int end) throws Exception {
    // If your sound is a musical note or a voice, limit the search range:
    // the pitch won't be above 4500 Hz or below 20 Hz.
    int nLowPeriodInSamples = (int) as.getSamplingRate() / 4500;
    int nHiPeriodInSamples = (int) as.getSamplingRate() / 20;
    // I get my sample values from my AudioSamples class. You can get them from wherever you want.
    double[] samples = Arrays.copyOfRange((as.getSamplesChannelSegregated()[0]), start, end);
    if (samples.length < nHiPeriodInSamples) throw new Exception("Not enough samples");
    // We only test lags between the two limits, so we need one result slot per candidate lag.
    double[] results = new double[nHiPeriodInSamples - nLowPeriodInSamples];
    // Now iterate over the time lag.
    for (int period = nLowPeriodInSamples; period < nHiPeriodInSamples; period++) {
        double sum = 0;
        // Autocorrelation multiplies the original signal with the time-lagged signal.
        for (int i = 0; i < samples.length - period; i++) {
            sum += samples[i] * samples[i + period];
        }
        // Find the average value of the sum
        double mean = sum / (double) samples.length;
        // and store it as the result for this time lag.
        // Subtract nLowPeriodInSamples so that the index starts from 0.
        results[period - nLowPeriodInSamples] = mean;
    }
    // The mean is highest when the time lag equals the period of the signal, because in that case
    // most positive values are multiplied with other positive values and most negative values with
    // other negative values, so the products are positive and the sum is a large positive number.
    // At half a period, by contrast, autocorrelation multiplies negative values with positive ones,
    // the products are negative and the sum is low.
    // Start below any possible mean. Double.MIN_VALUE is the smallest positive double, so it is not suitable here.
    double fBestValue = Double.NEGATIVE_INFINITY;
    int nBestIndex = -1; // the index is the time lag (offset by nLowPeriodInSamples)
    // So:
    // the autocorrelation is highest at the period of the signal,
    // and the period of the signal can be converted to a frequency.
    for (int i = 0; i < results.length; i++) {
        if (results[i] > fBestValue) {
            nBestIndex = i;
            fBestValue = results[i];
        }
    }
    // Convert the period in samples to a frequency and you have the fundamental frequency of the sound.
    double res = (double) as.getSamplingRate() / (nBestIndex + nLowPeriodInSamples);
    return res;
}
The method above is my Java code for computing the fundamental frequency. I wrote the comments because you said you still don't understand the process.
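The AudioSamples class used by the answer is not shown, so the following is only a minimal sketch of one way to get a double[] out of the short[] buffer from the question. It assumes 16-bit mono PCM (ENCODING_PCM_16BIT), and the class name PcmConverter and the method toDoubles are hypothetical, introduced here for illustration only.

```java
class PcmConverter {
    /**
     * Converts the first count 16-bit PCM samples to doubles in [-1.0, 1.0).
     * count is meant to be the value returned by AudioRecord.read(short[], int, int);
     * that call can return a negative error code, which is rejected here.
     */
    static double[] toDoubles(short[] pcm, int count) {
        if (count < 0 || count > pcm.length) {
            throw new IllegalArgumentException("invalid sample count: " + count);
        }
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            // A short ranges from -32768 to 32767, so dividing by 32768 normalizes it.
            out[i] = pcm[i] / 32768.0;
        }
        return out;
    }
}
```

With the variables from the question, this would be called as PcmConverter.toDoubles(data, sample). The scaling is optional for the autocorrelation approach, since only the position of the maximum matters, not its absolute size.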
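What you also need to know is that octave errors are common with the autocorrelation method, especially when there is noise in the signal. In my experience, piano or guitar sounds are not a problem; the errors are rare. But the human voice might be...

What exactly don't you understand? Pitch period estimation is a big research topic. What are you recording? What are your accuracy requirements?

Sorry, what I was not clear about is how you get the fundamental frequency from the FFT array (the peak frequency, which I think will be the dominant frequency, so that I can determine which note is being played).

Google "pitch detection" or "pitch estimation". There are many research papers on the various techniques.

You should add the Java tag to your question, since it is Android-related. Otherwise the code blocks won't get syntax highlighting...

If you insist on using the FFT, what you need to do is:
1. Fill the real part of the complex array with the sample values.
2. Call your FFT method; afterwards you also have the imaginary parts.
3. Compute the power at each point as Re^2 + Im^2 (if you want it in dB, additionally take 10*log10() of that value).
4. Your result array is now the spectrum, and SamplingFreq/numOfSamples gives the spacing between frequency points. For index i in the result array, the frequency is i*SamplingFreq/numOfSamples (index 0 is the DC component).

Thank you very much, this is very helpful. In your code, can't nLowPeriodInSamples and nHiPeriodInSamples be set directly to 4500 and 20? And for the double array (samples), I can use the recording, right? Also, what is nBestIndex?

No... 20 and 4500 are frequencies in Hz. Since autocorrelation is a time-domain method, you need to convert the frequency limits into time limits, expressed as periods in samples.

Sorry, that was obvious. Also, for the double array (samples), can I use the AudioRecord object as it is, or do I need to convert it to an array?

As you wish... You only need access to the sample values, multiply them, and then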
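The four FFT steps listed in the comments above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: to stay self-contained it uses a naive O(N^2) DFT as a stand-in for a real FFT routine, and the names SpectrumPeak and peakFrequency are hypothetical. Note also that the strongest spectral peak is not always the fundamental, since a harmonic can be stronger, which is one reason the answer advises against relying on the FFT alone.

```java
class SpectrumPeak {
    /**
     * Returns the frequency in Hz of the strongest non-DC bin of a real signal.
     * Uses a naive DFT as a stand-in for an FFT; fine for a sketch, slow for large buffers.
     */
    static double peakFrequency(double[] samples, double sampleRate) {
        int n = samples.length;
        if (n < 4) {
            throw new IllegalArgumentException("need at least 4 samples");
        }
        int bestBin = 1;
        double bestPower = -1.0;
        // For real input only bins below n/2 are independent; bin 0 (DC) is skipped.
        for (int k = 1; k < n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle); // real part of bin k
                im -= samples[t] * Math.sin(angle); // imaginary part of bin k
            }
            double power = re * re + im * im;       // step 3: Re^2 + Im^2
            if (power > bestPower) {
                bestPower = power;
                bestBin = k;
            }
        }
        // Step 4: bin index times the bin spacing sampleRate / n.
        return bestBin * sampleRate / n;
    }
}
```

The resolution is limited to the bin spacing sampleRate / n, so at 44100 Hz with 4096 samples neighbouring bins are about 10.8 Hz apart, which is coarse for telling low notes apart.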