Java:如何获取音频输入的当前频率?


I want to analyze the current frequency of the microphone input so that I can make my LEDs light up in sync with the music. I know how to capture sound from the microphone, but I don't understand the FFT, which I see suggested in most answers when searching for a way to get the frequency.

I want to test whether the current volume at a particular frequency is greater than a set value. The code should look something like this:

 if (frequency > value) {
   LEDs on
 } else {
   LEDs off
 }
My question is how to implement the FFT in Java. For better understanding, here is a link to a YouTube video that shows very well what I am trying to achieve.

The whole code:

import javax.sound.sampled.*;
import java.io.IOException;

public class Music {

    static AudioFormat format;
    static DataLine.Info info;

    public static void input() {
        format = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 4, 44100, false);

        try {
            info = new DataLine.Info(TargetDataLine.class, format);
            final TargetDataLine targetLine = (TargetDataLine) AudioSystem.getLine(info);
            targetLine.open();

            final AudioInputStream audioStream = new AudioInputStream(targetLine);

            final byte[] buf = new byte[256];

            Thread targetThread = new Thread() {
                public void run() {
                    targetLine.start();
                    try {
                        // in a real implementation, check how many bytes were read
                        audioStream.read(buf);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            };

            targetThread.start();
        } catch (LineUnavailableException e) {
            e.printStackTrace();
        }
    }

}

EDIT: I tried using JavaFX's AudioSpectrumListener with a MediaPlayer, and it works very well as long as I use a .mp3 file. The problem is that I have to use a byte array in which I store the microphone input. I asked another question about that part.

Using the JavaFFT class, you can do the following:

import javax.sound.sampled.*;

public class AudioLED {

    private static final float NORMALIZATION_FACTOR_2_BYTES = Short.MAX_VALUE + 1.0f;

    public static void main(final String[] args) throws Exception {
        // use only 1 channel, to make this easier
        final AudioFormat format = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 1, 2, 44100, false);
        final DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        final TargetDataLine targetLine = (TargetDataLine) AudioSystem.getLine(info);
        targetLine.open();
        targetLine.start();
        final AudioInputStream audioStream = new AudioInputStream(targetLine);

        final byte[] buf = new byte[256]; // <--- increase this for higher frequency resolution
        final int numberOfSamples = buf.length / format.getFrameSize();
        final JavaFFT fft = new JavaFFT(numberOfSamples);
        while (true) {
            // in real impl, don't just ignore how many bytes you read
            audioStream.read(buf);
            // the stream represents each sample as two bytes -> decode
            final float[] samples = decode(buf, format);
            final float[][] transformed = fft.transform(samples);
            final float[] realPart = transformed[0];
            final float[] imaginaryPart = transformed[1];
            final double[] magnitudes = toMagnitudes(realPart, imaginaryPart);

            // do something with magnitudes...
        }
    }

    private static float[] decode(final byte[] buf, final AudioFormat format) {
        final float[] fbuf = new float[buf.length / format.getFrameSize()];
        for (int pos = 0; pos < buf.length; pos += format.getFrameSize()) {
            final int sample = format.isBigEndian()
                    ? byteToIntBigEndian(buf, pos, format.getFrameSize())
                    : byteToIntLittleEndian(buf, pos, format.getFrameSize());
            // normalize to [-1,1] (not strictly necessary, but makes things easier)
            fbuf[pos / format.getFrameSize()] = sample / NORMALIZATION_FACTOR_2_BYTES;
        }
        return fbuf;
    }

    private static double[] toMagnitudes(final float[] realPart, final float[] imaginaryPart) {
        final double[] powers = new double[realPart.length / 2];
        for (int i = 0; i < powers.length; i++) {
            powers[i] = Math.sqrt(realPart[i] * realPart[i] + imaginaryPart[i] * imaginaryPart[i]);
        }
        return powers;
    }

    private static int byteToIntLittleEndian(final byte[] buf, final int offset, final int bytesPerSample) {
        int sample = 0;
        for (int byteIndex = 0; byteIndex < bytesPerSample; byteIndex++) {
            final int aByte = buf[offset + byteIndex] & 0xff;
            sample += aByte << 8 * (byteIndex);
        }
        return sample;
    }

    private static int byteToIntBigEndian(final byte[] buf, final int offset, final int bytesPerSample) {
        int sample = 0;
        for (int byteIndex = 0; byteIndex < bytesPerSample; byteIndex++) {
            final int aByte = buf[offset + byteIndex] & 0xff;
            sample += aByte << (8 * (bytesPerSample - byteIndex - 1));
        }
        return sample;
    }

}
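To connect this back to the original `if (frequency > value)` idea: each FFT bin covers a slice of the spectrum, so a target frequency maps to a bin index, and you compare that bin's magnitude against a threshold. Here is a minimal sketch, assuming the `magnitudes` array produced above, a 44100 Hz sample rate, and the 128-sample FFT size from the code (the class and method names are my own, not part of JavaFFT):

```java
public class FrequencyGate {

    // Index of the FFT bin closest to targetHz, for an fftSize-point FFT
    // taken at sampleRate Hz (bin i covers roughly i * sampleRate / fftSize Hz).
    static int binFor(double targetHz, float sampleRate, int fftSize) {
        return (int) Math.round(targetHz * fftSize / sampleRate);
    }

    // true if the magnitude at targetHz exceeds the threshold -> "LEDs on"
    static boolean ledsOn(double[] magnitudes, double targetHz,
                          float sampleRate, int fftSize, double threshold) {
        int bin = binFor(targetHz, sampleRate, fftSize);
        return bin >= 0 && bin < magnitudes.length && magnitudes[bin] > threshold;
    }
}
```

With a 128-sample buffer at 44100 Hz, each bin is about 344 Hz wide, which is why the comment in the code above suggests increasing the buffer size for finer frequency resolution.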
I think hendrik has laid out the basic plan, but I hear your pain about understanding the process of getting there.

I'm assuming you are getting a byte array via the TargetDataLine, which returns bytes. Converting the bytes to floats takes some manipulation and depends on the AudioFormat. A typical format has 44100 frames per second with 16-bit encoding (two bytes forming a single data point) and stereo, which means four bytes make up a single frame consisting of a left and a right value.

Example code showing how to read and process the incoming stream of individual bytes can be found in the Java audio tutorials. Scroll down to the first "code snippet" in the section "Reading Sound Files". The key point for converting the incoming data to floats occurs at the spot marked as follows:

// Here, do something useful with the audio data that's 
// now in the audioBytes array...
At that point you can take two bytes (assuming 16-bit encoding) and append them into a single short, then scale the value to a normalized float (ranging from -1 to 1). There are several StackOverflow questions that show the algorithm for this conversion.
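As a sketch of that conversion, assuming 16-bit signed PCM in little-endian byte order (the low byte arrives first), appending the two bytes into a short and dividing by 32768 gives a float in [-1, 1):

```java
public class SampleDecoder {

    // Combine two little-endian bytes into a signed 16-bit sample,
    // then scale to a float in the range [-1, 1).
    static float toFloat(byte lo, byte hi) {
        // hi is sign-extended when promoted to int, which preserves the sign bit;
        // the low byte must be masked to avoid its own sign extension
        short sample = (short) ((hi << 8) | (lo & 0xff));
        return sample / 32768f; // Short.MAX_VALUE + 1
    }
}
```

For big-endian data (as `byteToIntBigEndian` above handles), the roles of the two bytes are simply swapped.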

You may also have to go through a process of editing the example code so that it reads from a TargetDataLine rather than from an AudioInputStream (as in the example), but I think there are StackOverflow questions that can help with that, too, if it poses a problem.

For the method hendrik recommends, I suspect that using the transform method with a single float[] as input will suffice. But I haven't dug into the details or tried running it myself. (It looks promising. I suspect a search might also turn up other FFT libraries with more complete documentation; I vaguely recall one from MIT. Technically, I'm probably only a few steps ahead of you here.)

In any case, at the spot where the conversion occurs above, you can add the converted values to your input array until it is full, and then call the transform() method in that iteration.

Interpreting the output of the method is probably best done on a separate thread. I'm thinking: hand off the results of the FFT call, or the transform() call itself, via some form of loose coupling. (Are you familiar with that term and with multithreaded coding?)
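One common form of loose coupling in Java is a BlockingQueue between the capture/FFT thread and an analysis thread: the producer puts each magnitude array on the queue, and the consumer blocks until one arrives. A minimal sketch (the array contents here are placeholders, not real FFT output):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AnalysisPipeline {

    public static void main(String[] args) throws InterruptedException {
        // bounded queue so a slow consumer applies back-pressure to the producer
        BlockingQueue<double[]> queue = new ArrayBlockingQueue<>(8);

        // Analysis thread: blocks on take() until a magnitude array arrives.
        Thread analyzer = new Thread(() -> {
            try {
                while (true) {
                    double[] magnitudes = queue.take();
                    // interpret magnitudes here: pick peaks, drive LEDs, etc.
                    System.out.println("received " + magnitudes.length + " bins");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        analyzer.setDaemon(true);
        analyzer.start();

        // The capture/FFT loop would put() each result as it is produced.
        queue.put(new double[]{0.1, 0.2, 0.3});
    }
}
```

The two threads never call each other directly; they only share the queue, which is the loose coupling referred to above.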

Important insights into how Java encodes sound, and into sound formats in general, can be found in the tutorial that precedes the one linked above.


Another great resource, if you want a better understanding of how to interpret FFT results, is free to download: "

While the other answers provide lots of useful information and explain the concepts involved well, if you want a quick way to get a working solution in Java, jAudio provides an FFT class that does everything for you. All the relevant functions of this class can be found there.

In this case, the imaginary input can be ignored (since an audio signal is real-valued only), so the required input is just an array of samples (of type
double
). For example, if your samples are 16-bit integers, you can easily convert from
short
samples to
double
like this:
short shortSample = ...
double sample = (double) shortSample / Short.MAX_VALUE;
For a fully working snippet, have a look at the code below, adapted from that class:

double[] samples = getSamples(NUMBER_OF_SAMPLES); // implement this function to get samples from your source

FFT fft = new FFT(samples, null, false, false); // optionally set last parameter to true if you want Hamming window

double[] magnitudes = fft.getMagnitudeSpectrum();
double[] bins = fft.getBinLabels(sampleRate); // the sample rate is required to compute the frequency bins

// get the loudest occurring frequency within typical human hearing range
int maxIndex = 0;
double max = Double.NEGATIVE_INFINITY;
for (int i = 1; i < magnitudes.length; i++) {
  // ignore frequencies outside human hearing range
  if (bins[i] < 20 || bins[i] > 20000) {
    continue;
  }
  if (magnitudes[i] > max) {
    maxIndex = i;
    max = magnitudes[i];
  }
}

// loudest frequency of all previous samples now easy to obtain
double frequency = bins[maxIndex];
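If you'd rather not depend on jAudio for the bin labels, the frequency of bin i in an N-point FFT is simply i * sampleRate / N. Here is a sketch of the same loudest-bin search using that formula directly (the class and method names are my own, and the magnitude values in the usage below are made up):

```java
public class PeakFinder {

    // Frequency (Hz) represented by FFT bin i, for an n-point FFT at sampleRate Hz.
    static double binFrequency(int i, double sampleRate, int n) {
        return i * sampleRate / n;
    }

    // Index of the loudest bin whose frequency lies within [minHz, maxHz],
    // or -1 if no bin falls in that range. Bin 0 (DC) is skipped.
    static int loudestBin(double[] magnitudes, int n, double sampleRate,
                          double minHz, double maxHz) {
        int maxIndex = -1;
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 1; i < magnitudes.length; i++) {
            double hz = binFrequency(i, sampleRate, n);
            if (hz < minHz || hz > maxHz) {
                continue; // outside the range of interest
            }
            if (magnitudes[i] > max) {
                maxIndex = i;
                max = magnitudes[i];
            }
        }
        return maxIndex;
    }
}
```

The loop mirrors the jAudio-based snippet above: skip bins outside human hearing (20 Hz to 20000 Hz) and keep the index of the largest magnitude.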
What kind of "audio input" do you have? For example, do you have a list of voltages sampled at a given frequency?