Java emits a harsh screeching noise when playing audio as a raw stream

I. Background

I am trying to build an application that helps match subtitles to an audio waveform very accurately, at the verse level, the word level, and even the character level.

The audio will be Sanskrit chants (yoga, rituals, etc.). These are very long compound words [example: aṅganyā-sokta-mātaro-bījam is traditionally one unbroken word, hyphenated here only to aid reading].

The input text/subtitles may be roughly synchronized at the sentence/verse level, but will certainly not be synchronized at the word level. The application should be able to find the silence points in the audio waveform in order to guess the start and end of each word (or even of each letter/consonant/vowel within a word), thereby enabling word-level (or even letter/consonant/vowel-level) playback of the chanting.
- Perhaps I need to close the line between write cycles? That sounds simple enough, I can try it.
- But I am also wondering whether this overall approach is itself correct? Any hints, guides, suggestions, and links would be very helpful.
- Also, I have just hard-coded the sample rate etc. (44100 Hz etc.). Are these good settings to use as default presets, or should they depend on the input format?
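On the silence-point detection mentioned above: a minimal, hypothetical sketch (pure standard library; it assumes the audio has already been decoded to 16-bit signed little-endian mono PCM, which is what the code below pipes out of ffmpeg) could compute windowed RMS energy and flag windows below a threshold:

```java
import java.util.ArrayList;
import java.util.List;

public class SilenceDetector {
    /**
     * Returns the start indices (in samples) of fixed-size windows whose RMS
     * energy falls below the given threshold (0.0 .. 1.0 of full scale).
     * Assumes 16-bit signed little-endian mono PCM bytes.
     */
    static List<Integer> silentWindows(byte[] pcm, int windowSamples, double threshold) {
        List<Integer> silent = new ArrayList<>();
        int totalSamples = pcm.length / 2;
        for (int start = 0; start + windowSamples <= totalSamples; start += windowSamples) {
            double sumSquares = 0;
            for (int i = start; i < start + windowSamples; i++) {
                // little-endian: low byte first (unsigned), high byte carries the sign
                int sample = (pcm[2 * i] & 0xFF) | (pcm[2 * i + 1] << 8);
                double normalized = sample / 32768.0;
                sumSquares += normalized * normalized;
            }
            double rms = Math.sqrt(sumSquares / windowSamples);
            if (rms < threshold) {
                silent.add(start);
            }
        }
        return silent;
    }

    public static void main(String[] args) {
        // Synthetic check: 1000 loud sine samples followed by 1000 zero samples.
        byte[] pcm = new byte[4000];
        for (int i = 0; i < 1000; i++) {
            short s = (short) (10000 * Math.sin(2 * Math.PI * i / 100.0));
            pcm[2 * i] = (byte) (s & 0xFF);
            pcm[2 * i + 1] = (byte) (s >> 8);
        }
        // samples 1000..1999 stay zero (silence)
        System.out.println(silentWindows(pcm, 500, 0.01)); // silent windows start at 1000 and 1500
    }
}
```

Mapping window starts back to seconds is then just `start / sampleRate`. The window size and threshold here are illustrative guesses; chanting with sustained vowels would likely need tuning of both.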
import com.github.kokorin.jaffree.StreamType;
import com.github.kokorin.jaffree.ffmpeg.FFmpeg;
import com.github.kokorin.jaffree.ffmpeg.FFmpegProgress;
import com.github.kokorin.jaffree.ffmpeg.FFmpegResult;
import com.github.kokorin.jaffree.ffmpeg.NullOutput;
import com.github.kokorin.jaffree.ffmpeg.PipeOutput;
import com.github.kokorin.jaffree.ffmpeg.ProgressListener;
import com.github.kokorin.jaffree.ffprobe.Stream;
import com.github.kokorin.jaffree.ffmpeg.UrlInput;
import com.github.kokorin.jaffree.ffprobe.FFprobe;
import com.github.kokorin.jaffree.ffprobe.FFprobeResult;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.SourceDataLine;
public class FFMpegToRaw {
Path BIN = Paths.get("f:\\utilities\\ffmpeg-20190413-0ad0533-win64-static\\bin");
String VIDEO_MP4 = "f:\\org\\TEMPLE\\DeviMahatmyamRecitationAudio\\03_01_Devi Kavacham.mp3";
FFprobe ffprobe;
FFmpeg ffmpeg;
public void basicCheck() throws Exception {
if (BIN != null) {
ffprobe = FFprobe.atPath(BIN);
} else {
ffprobe = FFprobe.atPath();
}
FFprobeResult result = ffprobe
.setShowStreams(true)
.setInput(VIDEO_MP4)
.execute();
for (Stream stream : result.getStreams()) {
System.out.println("Stream " + stream.getIndex()
+ " type " + stream.getCodecType()
+ " duration " + stream.getDuration(TimeUnit.SECONDS));
}
if (BIN != null) {
ffmpeg = FFmpeg.atPath(BIN);
} else {
ffmpeg = FFmpeg.atPath();
}
//Sometimes ffprobe can't show exact duration, use ffmpeg transcoding to NULL output to get it
final AtomicLong durationMillis = new AtomicLong();
FFmpegResult fFmpegResult = ffmpeg
.addInput(
UrlInput.fromUrl(VIDEO_MP4)
)
.addOutput(new NullOutput())
.setProgressListener(new ProgressListener() {
@Override
public void onProgress(FFmpegProgress progress) {
durationMillis.set(progress.getTimeMillis());
}
})
.execute();
System.out.println("audio size - "+fFmpegResult.getAudioSize());
System.out.println("Exact duration: " + durationMillis.get() + " milliseconds");
}
public void toRawAndPlay() throws Exception {
ProgressListener listener = new ProgressListener() {
@Override
public void onProgress(FFmpegProgress progress) {
System.out.println(progress.getFrame());
}
};
// code derived from : https://stackoverflow.com/questions/32873596/play-raw-pcm-audio-received-in-udp-packets
int sampleRate = 44100;//24000;//Hz
int sampleSize = 16;//Bits
int channels = 1;
boolean signed = true;
boolean bigEnd = false;
String format = "s16be"; //"f32le"
//https://trac.ffmpeg.org/wiki/audio types
final AudioFormat af = new AudioFormat(sampleRate, sampleSize, channels, signed, bigEnd);
final DataLine.Info info = new DataLine.Info(SourceDataLine.class, af);
final SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
line.open(af, 4096); // format , buffer size
line.start();
OutputStream destination = new OutputStream() {
@Override public void write(int b) throws IOException {
throw new UnsupportedOperationException("Nobody uses this.");
}
@Override public void write(byte[] b, int off, int len) throws IOException {
String o = new String(b);
boolean showString = false;
System.out.println("New output ("+ len
+ ", off="+off + ") -> "+(showString?o:""));
// output wave form repeatedly
if(len%2!=0) {
len -= 1;
System.out.println("");
}
line.write(b, off, len);
System.out.println("done round");
}
};
// src : http://blog.wudilabs.org/entry/c3d357ed/?lang=en-US
FFmpegResult result = FFmpeg.atPath(BIN).
addInput(UrlInput.fromPath(Paths.get(VIDEO_MP4))).
addOutput(PipeOutput.pumpTo(destination).
disableStream(StreamType.VIDEO). //.addArgument("-vn")
setFrameRate(sampleRate). //.addArguments("-ar", sampleRate)
addArguments("-ac", "1").
setFormat(format) //.addArguments("-f", format)
).
setProgressListener(listener).
execute();
// shut down audio
line.drain();
line.stop();
line.close();
System.out.println("result = "+result.toString());
}
public static void main(String[] args) throws Exception {
FFMpegToRaw raw = new FFMpegToRaw();
raw.basicCheck();
raw.toRawAndPlay();
}
}
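As an aside on the `len % 2 != 0` handling in the `OutputStream` above: simply dropping the trailing byte shifts every subsequent sample by one byte, which by itself can produce noise. One hedged possibility (a sketch, not the accepted fix, which follows below) is to carry the leftover byte over to the next write so the audio line only ever sees whole 16-bit frames:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/**
 * Sketch: instead of discarding the trailing byte of an odd-length write,
 * buffer it and prepend it to the next write, so the downstream consumer
 * (e.g. a SourceDataLine) always receives whole frames.
 */
public class FrameAlignedOutputStream extends OutputStream {
    private final OutputStream sink; // e.g. an adapter around SourceDataLine.write(...)
    private final int frameSize;     // bytes per frame, e.g. 2 for 16-bit mono
    private byte[] leftover = new byte[0];

    public FrameAlignedOutputStream(OutputStream sink, int frameSize) {
        this.sink = sink;
        this.frameSize = frameSize;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] { (byte) b }, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // combine any leftover bytes from the previous call with the new data
        byte[] all = new byte[leftover.length + len];
        System.arraycopy(leftover, 0, all, 0, leftover.length);
        System.arraycopy(b, off, all, leftover.length, len);
        // forward only whole frames; keep the remainder for the next call
        int aligned = (all.length / frameSize) * frameSize;
        sink.write(all, 0, aligned);
        leftover = Arrays.copyOfRange(all, aligned, all.length);
    }
}
```

The `FrameAlignedOutputStream` name and the wrapping approach are my own illustration; the answer below takes a different route entirely.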
Thank you.

Answer:

I suspect your screeching comes from half-filled buffers handed to the audio system. As pointed out in the comments above, I would use something like FFSampledSP (if on macOS or Windows) and then code like the following, which is more Java-style. Just make sure the complete FFSampledSP jar is on your path and you are good to go.
import javax.sound.sampled.*;
import java.io.File;
import java.io.IOException;
public class PlayerDemo {
/**
* Derive a PCM format.
*/
private static AudioFormat toSignedPCM(final AudioFormat format) {
final int sampleSizeInBits = format.getSampleSizeInBits() <= 0 ? 16 : format.getSampleSizeInBits();
final int channels = format.getChannels() <= 0 ? 2 : format.getChannels();
final float sampleRate = format.getSampleRate() <= 0 ? 44100f : format.getSampleRate();
return new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
sampleRate,
sampleSizeInBits,
channels,
(sampleSizeInBits > 0 && channels > 0) ? (sampleSizeInBits/8)*channels : AudioSystem.NOT_SPECIFIED,
sampleRate,
format.isBigEndian()
);
}
public static void main(final String[] args) throws IOException, UnsupportedAudioFileException, LineUnavailableException {
final File audioFile = new File(args[0]);
// open mp3 or whatever
final Long durationInMicroseconds = (Long)AudioSystem.getAudioFileFormat(audioFile).getProperty("duration");
// how long is the file, use AudioFileFormat properties
System.out.println("Duration in microseconds (not millis!): " + durationInMicroseconds);
// open the mp3 stream (not yet decoded)
final AudioInputStream mp3In = AudioSystem.getAudioInputStream(audioFile);
// derive a suitable PCM format that can be played by the AudioSystem
final AudioFormat desiredFormat = toSignedPCM(mp3In.getFormat());
// ask the AudioSystem for a source line for playback
// that corresponds to the derived PCM format
final SourceDataLine line = AudioSystem.getSourceDataLine(desiredFormat);
// now play, typically in separate thread
new Thread(() -> {
final byte[] buf = new byte[4096];
int justRead;
// convert to raw PCM samples with the AudioSystem
try (final AudioInputStream rawIn = AudioSystem.getAudioInputStream(desiredFormat, mp3In)) {
line.open();
line.start();
while ((justRead = rawIn.read(buf)) >= 0) {
// only write bytes we really read, not more!
line.write(buf, 0, justRead);
final long microsecondPosition = line.getMicrosecondPosition();
System.out.println("Current position in microseconds: " + microsecondPosition);
}
} catch (IOException | LineUnavailableException e) {
e.printStackTrace();
} finally {
line.drain();
line.stop();
}
}).start();
}
}
The regular Java API does not allow seeking to arbitrary positions. However, FFSampledSP contains an extension: a seek() method. To use it, simply cast the rawIn from the example above to FFAudioInputStream and call seek() with a time and a TimeUnit.

If you are on macOS or Windows, you may want to consider using FFSampledSP to make this more elegant.

Comments:

- @hendrik - do you have a link to any sample code? That would help. Thanks for your comment.
- Simplify by reading a file with known audio (e.g. a 100 Hz tone) and confirm your code works by printing the raw audio curve (just the points on the curve) in PCM format, so you can see the audio data points go up and down along a sine curve... this will let you confirm your code is solid.
- @ScottStensland - thanks for the comment. I can hear the audio: it plays OK, then comes the screech, then the next cycle plays OK, then the screech again. Still working through the problem.
- Thanks, let me try this. I will report back. ... It plays fine; now let me try to understand the code. How do I seek to and play audio from an arbitrary point in time? As I explained, I need to synchronize audio and subtitles very accurately, so the user should be able to do exactly that. I kept precise seeking out of the scope of the current question, but how would one do even basic seeking? That matters a lot. Many thanks.
- Aside: I will probably now need to graph the audio data to find the silence points, or it could be done computationally. But that would be a different topic, not this question.
- No, you are quite right. You cannot get the waveform with a MediaPlayer. You need to convert the bytes to sample values, i.e. int or float values, and then plot them. See whether the example code for the conversion helps; otherwise, please post a new question.
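The byte-to-sample conversion mentioned in the last comment can be sketched as follows (an illustrative example for 16-bit signed little-endian PCM; plotting the resulting floats is left to whatever charting tool you prefer):

```java
public class PcmToFloat {
    /** Converts 16-bit signed little-endian PCM bytes to floats in [-1.0, 1.0). */
    static float[] toFloats(byte[] pcm) {
        float[] samples = new float[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            // low byte is unsigned, high byte carries the sign
            int s = (pcm[2 * i] & 0xFF) | (pcm[2 * i + 1] << 8);
            samples[i] = s / 32768f;
        }
        return samples;
    }

    public static void main(String[] args) {
        // three samples: -32768 (full negative), 0, 16384 (half positive)
        byte[] pcm = { 0x00, (byte) 0x80, 0x00, 0x00, 0x00, 0x40 };
        float[] f = toFloats(pcm);
        System.out.printf("%.2f %.2f %.2f%n", f[0], f[1], f[2]); // -1.00 0.00 0.50
    }
}
```

For big-endian data (e.g. the `s16be` format in the question's code) the byte roles would be swapped: the first byte would carry the sign and the second would be masked with `0xFF`.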