Audio mp4 atom: how to discriminate the audio codec? Is it AAC or MP3?


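I am developing an MP4 container parser, but I am going crazy trying to identify the audio codec of the streams. I used QtAtomViewer and AtomicParsley, but when I reach the atom:

trak->mdia->minf->stbl->stsd

I always get "mp4a", even when the MP4 file has an MP3 stream.

Should I be looking for a ".mp3" fourcc?

I am attaching two different MP4 structures. MP4 container with an AAC audio stream: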

     Atom trak @ 716882 of size: 2960, ends @ 719842
     Atom tkhd @ 716890 of size: 92, ends @ 716982
     Atom mdia @ 716982 of size: 2860, ends @ 719842
         Atom mdhd @ 716990 of size: 32, ends @ 717022
         Atom hdlr @ 717022 of size: 33, ends @ 717055
         Atom minf @ 717055 of size: 2787, ends @ 719842
             Atom dinf @ 717063 of size: 36, ends @ 717099
                 Atom dref @ 717071 of size: 28, ends @ 717099
             Atom stbl @ 717099 of size: 2727, ends @ 719826
                 Atom stts @ 717107 of size: 24, ends @ 717131
                 Atom stsz @ 717131 of size: 1268, ends @ 718399
                 Atom stsc @ 718399 of size: 40, ends @ 718439
                 Atom stco @ 718439 of size: 32, ends @ 718471
                 Atom stss @ 718471 of size: 1264, ends @ 719735
                 Atom stsd @ 719735 of size: 91, ends @ 719826
                     Atom mp4a @ 719751 of size: 75, ends @ 719826
                         Atom esds @ 719787 of size: 39, ends @ 719826
             Atom smhd @ 719826 of size: 16, ends @ 719842
Atom trak @ 1663835 of size: 4844, ends @ 1668679
     Atom tkhd @ 1663843 of size: 92, ends @ 1663935
     Atom mdia @ 1663935 of size: 4744, ends @ 1668679
         Atom mdhd @ 1663943 of size: 32, ends @ 1663975
         Atom hdlr @ 1663975 of size: 45, ends @ 1664020
         Atom minf @ 1664020 of size: 4659, ends @ 1668679
             Atom smhd @ 1664028 of size: 16, ends @ 1664044
             Atom dinf @ 1664044 of size: 36, ends @ 1664080
                 Atom dref @ 1664052 of size: 28, ends @ 1664080
             Atom stbl @ 1664080 of size: 4599, ends @ 1668679
                 Atom stsd @ 1664088 of size: 87, ends @ 1664175
                     Atom mp4a @ 1664104 of size: 71, ends @ 1664175
                         Atom esds @ 1664140 of size: 35, ends @ 1664175
                 Atom stts @ 1664175 of size: 24, ends @ 1664199
                 Atom stsc @ 1664199 of size: 28, ends @ 1664227
                 Atom stsz @ 1664227 of size: 2228, ends @ 1666455
                 Atom stco @ 1666455 of size: 2224, ends @ 1668679
MP4 container with an MP3 audio stream:

     Atom trak @ 716882 of size: 2960, ends @ 719842
     Atom tkhd @ 716890 of size: 92, ends @ 716982
     Atom mdia @ 716982 of size: 2860, ends @ 719842
         Atom mdhd @ 716990 of size: 32, ends @ 717022
         Atom hdlr @ 717022 of size: 33, ends @ 717055
         Atom minf @ 717055 of size: 2787, ends @ 719842
             Atom dinf @ 717063 of size: 36, ends @ 717099
                 Atom dref @ 717071 of size: 28, ends @ 717099
             Atom stbl @ 717099 of size: 2727, ends @ 719826
                 Atom stts @ 717107 of size: 24, ends @ 717131
                 Atom stsz @ 717131 of size: 1268, ends @ 718399
                 Atom stsc @ 718399 of size: 40, ends @ 718439
                 Atom stco @ 718439 of size: 32, ends @ 718471
                 Atom stss @ 718471 of size: 1264, ends @ 719735
                 Atom stsd @ 719735 of size: 91, ends @ 719826
                     Atom mp4a @ 719751 of size: 75, ends @ 719826
                         Atom esds @ 719787 of size: 39, ends @ 719826
             Atom smhd @ 719826 of size: 16, ends @ 719842
Atom trak @ 1663835 of size: 4844, ends @ 1668679
     Atom tkhd @ 1663843 of size: 92, ends @ 1663935
     Atom mdia @ 1663935 of size: 4744, ends @ 1668679
         Atom mdhd @ 1663943 of size: 32, ends @ 1663975
         Atom hdlr @ 1663975 of size: 45, ends @ 1664020
         Atom minf @ 1664020 of size: 4659, ends @ 1668679
             Atom smhd @ 1664028 of size: 16, ends @ 1664044
             Atom dinf @ 1664044 of size: 36, ends @ 1664080
                 Atom dref @ 1664052 of size: 28, ends @ 1664080
             Atom stbl @ 1664080 of size: 4599, ends @ 1668679
                 Atom stsd @ 1664088 of size: 87, ends @ 1664175
                     Atom mp4a @ 1664104 of size: 71, ends @ 1664175
                         Atom esds @ 1664140 of size: 35, ends @ 1664175
                 Atom stts @ 1664175 of size: 24, ends @ 1664199
                 Atom stsc @ 1664199 of size: 28, ends @ 1664227
                 Atom stsz @ 1664227 of size: 2228, ends @ 1666455
                 Atom stco @ 1666455 of size: 2224, ends @ 1668679
Thanks

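UPDATE:

I found a way to solve my problem: looking at the AtomicParsley code, I found that it is possible to get codec information about the stream atom (mp4a) by reading the 11th byte of the esds (Elementary Stream Descriptor) atom.

This is how I work now:

If the 11th byte is 0x40, I assume the stream is AAC; otherwise, if I read 0x69, I assume the stream is MP3.

I don't like these "empirical" solutions, so I am looking for a more correct way, but the only reference I found is incomplete.

Does anyone know where I can get a more detailed specification of the MP4 container?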
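The path trak->mdia->minf->stbl->stsd from the question can be walked with a few lines of Python. The following is only a minimal sketch, not the approach used by QtAtomViewer or AtomicParsley: the function names `iter_boxes`, `find_box` and `stsd_formats` are hypothetical, and it assumes the whole trak payload is already in memory. It returns the fourcc of each sample entry, which is where "mp4a" shows up for both AAC and MP3 streams.

```python
import struct

def iter_boxes(buf, start, end):
    """Yield (type, payload_start, box_end) for each atom in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:                      # 64-bit size follows the type
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:                    # atom extends to the end of its parent
            size = end - pos
        if size < header or pos + size > end:
            break                          # malformed atom: stop rather than guess
        yield box_type, pos + header, pos + size
        pos += size

def find_box(buf, start, end, path):
    """Follow a list of fourccs downwards; return (payload_start, end) or None."""
    for box_type, p_start, p_end in iter_boxes(buf, start, end):
        if box_type == path[0]:
            if len(path) == 1:
                return p_start, p_end
            found = find_box(buf, p_start, p_end, path[1:])
            if found:
                return found
    return None

def stsd_formats(buf, trak_start, trak_end):
    """Return the fourcc of every sample entry in the track's stsd atom."""
    found = find_box(buf, trak_start, trak_end,
                     [b"mdia", b"minf", b"stbl", b"stsd"])
    if not found:
        return []
    start, end = found
    # stsd payload: version/flags (4 bytes) + entry count (4 bytes), then entries
    return [t for t, _, _ in iter_boxes(buf, start + 8, end)]
```

Calling `stsd_formats(trak_payload, 0, len(trak_payload))` on the contents of a trak atom would give `[b"mp4a"]` for both files in the question, which is why the fourcc alone is not enough.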

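In the "esds" atom, several fields are relevant to determining the codec in use. The objectTypeIndication (the 11th byte in your solution) is the first byte of the decoder configuration descriptor inside the esds atom. This field should indicate the codec in use, but several entries are shared by more than one codec. The MP4RA maintains a list of the registered values. The following are relevant here:

  • 0x40 - MPEG-4 Audio
  • 0x6B - MPEG-1 Audio (MPEG-1 Layers 1, 2 and 3)
  • 0x69 - MPEG-2 Backward Compatible Audio (MPEG-2 Layers 1, 2 and 3)
  • 0x67 - MPEG-2 AAC LC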
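0x6B and 0x69 denote MPEG-1 and MPEG-2 Layers 1, 2 and 3 respectively. 0x67 denotes MPEG-2 AAC LC, but it is often not used in favor of 0x40 (0x66 and 0x68 are also MPEG-2 AAC profiles and are seen even less often). 0x40 denotes MPEG-4 Audio. MPEG-4 Audio is generally thought of as AAC, but there is a whole framework of audio codecs that can go in MPEG-4 Audio, including AAC, BSAC, ALS, CELP and something called MP3On4. MP3On4 is an MP3 variant with some new header information for multichannel audio.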
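The values above can be captured in a small lookup table. This is a minimal sketch in Python; the names `OTI_NAMES`, `describe_oti` and `is_mpeg_layer_audio` are made up for illustration, and the profile names given for 0x66 (Main) and 0x68 (SSR) are my reading of the MP4RA registry rather than something stated in this answer, so treat them as an assumption.

```python
# objectTypeIndication values discussed above (not the full MP4RA list).
OTI_NAMES = {
    0x40: "MPEG-4 Audio",
    0x66: "MPEG-2 AAC Main",   # assumption: profile name not given in the answer
    0x67: "MPEG-2 AAC LC",
    0x68: "MPEG-2 AAC SSR",    # assumption: profile name not given in the answer
    0x69: "MPEG-2 Audio (Layers 1, 2 and 3)",
    0x6B: "MPEG-1 Audio (Layers 1, 2 and 3)",
}

def describe_oti(oti):
    """Return a human-readable name for an objectTypeIndication byte."""
    return OTI_NAMES.get(oti, "unknown (0x%02X)" % oti)

def is_mpeg_layer_audio(oti):
    """True for the MPEG-1/MPEG-2 Layer 1/2/3 family, which includes MP3."""
    return oti in (0x69, 0x6B)
```

Note that 0x69 and 0x6B only say "Layer 1, 2 or 3", so a parser that needs to tell MP3 from MP2 would still have to look at the frame headers in the stream itself.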

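By looking at the AudioSpecificConfig, we can find out the actual audio format inside MPEG-4 Audio. This is the global header for the decoder; it is carried in the "esds" atom as the payload of the decoder-specific info descriptor (tag 0x05), so its exact byte offset depends on how the descriptor lengths are encoded. The AudioSpecificConfig starts with a 5-bit AudioObjectType. The complete list can be found on the Multimedia Wiki (that list is linked from the "MPEG-4 Audio" article you mention), but here are the useful values: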

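  • 00 - Null
  • 01 - AAC Main (a deprecated AAC profile from MPEG-2)
  • 02 - AAC LC or backwards-compatible HE-AAC (most real-world AAC falls into one of these cases)
  • 03 - AAC Scalable Sample Rate (rarely used)
  • 04 - AAC LTP (a replacement for AAC Main, rarely used)
  • 05 - HE-AAC explicitly signaled (not backwards compatible)
  • 22 - ER BSAC (a Korean broadcast codec)
  • 23 - Low Delay AAC
  • 29 - HE-AACv2 explicitly signaled (in one draft this was MP3On4)
  • 31 - ESCAPE (read 6 more bits, add 32)
  • 32 - MP3on4 Layer 1
  • 33 - MP3on4 Layer 2
  • 34 - MP3on4 Layer 3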
If you are not worried about the "MP3On4" MP3 variants or other odd MPEG-4 Audio codecs, then just use the objectTypeIndication.
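For the MPEG-4 Audio case, reading the AudioObjectType including the escape value 31 can be sketched as follows. The function name `audio_object_type` is hypothetical, and the code assumes the AudioSpecificConfig bytes have already been extracted from the esds atom.

```python
def audio_object_type(asc):
    """Return the AudioObjectType from the start of an AudioSpecificConfig."""
    if not asc:
        raise ValueError("empty AudioSpecificConfig")
    # Treat the first two bytes as one 16-bit big-endian number.
    # A one-byte input is zero-padded, which is enough for the 5-bit case.
    bits = int.from_bytes(asc[:2].ljust(2, b"\x00"), "big")
    aot = bits >> 11                      # top 5 bits
    if aot == 31:                         # escape: read 6 more bits, add 32
        aot = 32 + ((bits >> 5) & 0x3F)
    return aot
```

For example, the config bytes 12 08 give object type 2 (AAC LC), while F8 40 would decode through the escape path to 34 (MP3on4 Layer 3).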


In the MPEG specifications, these details are spread across ISO/IEC 14496-1, -12, -14 and -3. Of these, only 14496-12 is freely available.

The format of the esds atom [1] is defined as:

Size 32-bit
Type 32-bit 'esds'
Version: 8-bit, zero.
Flags: 24-bit field, zero.
Elementary Stream Descriptor
The Elementary Stream Descriptor is defined in the relevant MPEG-4 document [2].
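The fixed part of that layout (size, type, version, flags) can be read with `struct`. This is a rough sketch only: `parse_esds_header` is a hypothetical helper, and `EXAMPLE_ESDS` is the 51-byte example atom quoted in this thread.

```python
import struct

# The 51-byte example esds atom quoted in this thread.
EXAMPLE_ESDS = bytes.fromhex(
    "00000033 65736473 00000000 03808080"
    "22000100 04808080 14401500 00000001"
    "FC170001 FC170580 80800212 08068080"
    "800102"
)

def parse_esds_header(atom):
    """Split an esds atom into (size, version, flags, descriptor_bytes)."""
    if len(atom) < 12:
        raise ValueError("too short for an esds atom")
    size, box_type, version_flags = struct.unpack(">I4sI", atom[:12])
    if box_type != b"esds":
        raise ValueError("not an esds atom: %r" % box_type)
    version = version_flags >> 24          # 8-bit version
    flags = version_flags & 0xFFFFFF       # 24-bit flags
    return size, version, flags, atom[12:size]
```

The returned descriptor bytes start with the tag of the Elementary Stream Descriptor, which is where the codec-related fields live.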

Looking at a typical esds in an MP4A file:

00000033 65736473 00000000 03808080  
22000100 04808080 14401500 00000001
FC170001 FC170580 80800212 08068080
800102
i.e.

00000033 65736473 = ISO Atom "esds" of length 0x33
00000000 = Version/Flags field (0), meaning tagged Elementary Stream Descriptor follows
03808080 = TAG(3) = Object Descriptor ([2])
22       = length of this OD (which includes the next 2 tags)
  0001   = ES_ID = 1
      00 = flags etc = 0
04808080 = TAG(4) = ES Descriptor ([2]) embedded in above OD
14       = length of this ESD
  40     = MPEG4 Audio (see table for valid types here)
    15   = stream type(6bits)=5 audio, flags(2bits)=1
000000   = 24bit buffer size
0001FC17 = max bitrate (130,071 bps)
0001FC17 = avg bitrate
05808080 = TAG(5) = ASC ([2],[3]) embedded in above OD
02       = length
1208     = ASC (AOT=2 AAC-LC, freq=4 => 44100 Hz, chan=1 => single channel, flen0 => 1024 samples)
06808080 = TAG(6)
01       = length
02       = data
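The bit arithmetic behind the ASC line (1208) can be checked with a short Python sketch. The function name `decode_simple_asc` is hypothetical, and the sketch deliberately handles only the simple two-byte layout shown above: it rejects the escaped object type (31) and the explicit sample rate (index 15) instead of parsing them.

```python
# Sampling frequency table indexed by the 4-bit samplingFrequencyIndex.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]

def decode_simple_asc(asc):
    """Decode the first 14 bits of a basic AAC AudioSpecificConfig."""
    if len(asc) < 2:
        raise ValueError("need at least 2 bytes")
    bits = int.from_bytes(asc[:2], "big")
    aot = bits >> 11                       # 5 bits: audio object type
    freq_index = (bits >> 7) & 0x0F        # 4 bits: sampling frequency index
    channel_config = (bits >> 3) & 0x0F    # 4 bits: channel configuration
    frame_length_flag = (bits >> 2) & 0x01 # 1 bit: 0 => 1024, 1 => 960 samples
    if aot == 31 or freq_index >= len(SAMPLE_RATES):
        raise ValueError("escaped object type or explicit rate not handled")
    return {
        "audio_object_type": aot,
        "sample_rate": SAMPLE_RATES[freq_index],
        "channel_configuration": channel_config,
        "frame_length": 960 if frame_length_flag else 1024,
    }
```

Calling `decode_simple_asc(bytes.fromhex("1208"))` reproduces the annotation above: object type 2, 44100 Hz, channel configuration 1 and 1024 samples per frame.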
References:

  • [1]
  • [2] I believe the tags are defined in MPEG-4 Part 1 (Systems)
  • [3] ASC is AudioSpecificConfig, see

The field splitting in Andy Henson's answer is wrong: the "0x80" bytes are not part of the tag but part of the length, and together they form a Varint32 length.

00000033 65736473 = ISO Atom "esds" of length 0x33
00000000 = Version/Flags field (0), meaning tagged Elementary Stream Descriptor follows
03       = TAG(3) = ES_Descriptor ([2])
80808022 = length = 34 [4] of this descriptor (which includes the next 3 tags)
  0001   = ES_ID = 1
      00 = flags etc = 0
04       = TAG(4) = DecoderConfigDescriptor ([2]) embedded in the ES_Descriptor above
80808014 = length of this descriptor (20, which includes TAG(5))
  40     = MPEG4 Audio (see table for valid types here)
    15   = stream type(6bits)=5 audio, flags(2bits)=1
000000   = 24bit buffer size
0001FC17 = max bitrate (130,071 bps)
0001FC17 = avg bitrate
05       = TAG(5) = DecoderSpecificInfo carrying the ASC ([2],[3]), embedded in the DecoderConfigDescriptor above
80808002 = length
1208     = ASC (AOT=2 AAC-LC, freq=4 => 44100 Hz, chan=1 => single channel, flen0 => 1024 samples)
06       = TAG(6) = SLConfigDescriptor
80808001 = length
02       = data

The length encoding is documented in MPEG-4 Systems (ISO 14496-1), Annex E.1.
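Putting the variable-length encoding together with the tag layout, a parser that does not depend on fixed byte offsets could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the input is the esds content after the 4-byte version/flags field, and the handling of the optional ES_Descriptor fields follows my reading of ISO 14496-1 rather than anything stated in this thread.

```python
def read_descriptor_length(buf, pos):
    """Decode a 1 to 4 byte length: 7 bits per byte, high bit means 'more'."""
    length = 0
    for _ in range(4):
        byte = buf[pos]
        pos += 1
        length = (length << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    return length, pos

def parse_esds_payload(payload):
    """Return (objectTypeIndication, AudioSpecificConfig bytes)."""
    pos = 0
    if payload[pos] != 0x03:
        raise ValueError("expected tag 0x03")
    _, pos = read_descriptor_length(payload, pos + 1)
    flags = payload[pos + 2]
    pos += 3                               # ES_ID (2 bytes) + flags (1 byte)
    if flags & 0x80:                       # assumption: dependsOn_ES_ID present
        pos += 2
    if flags & 0x40:                       # assumption: URL length + string
        pos += 1 + payload[pos]
    if flags & 0x20:                       # assumption: OCR_ES_ID present
        pos += 2
    if payload[pos] != 0x04:
        raise ValueError("expected tag 0x04")
    dcd_len, pos = read_descriptor_length(payload, pos + 1)
    dcd_end = pos + dcd_len
    oti = payload[pos]
    pos += 13      # OTI (1) + stream type (1) + buffer (3) + max/avg bitrate (8)
    asc = b""
    if pos < dcd_end and payload[pos] == 0x05:
        asc_len, pos = read_descriptor_length(payload, pos + 1)
        asc = payload[pos:pos + asc_len]
    return oti, asc
```

Because the lengths are decoded rather than assumed, the same code should work whether a muxer writes 80 80 80 22 or just 22 for the same length, which is exactly the case where a fixed "11th byte" rule breaks.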

For clarity, you should refer to HE-AAC by its real name, SBR, and put "(also known as HE-AAC)" in parentheses: #5 SBR, ISO/IEC 14496-3 Subpart 4.