Python TensorFlow audio recognition error: 'Header mismatch: Expected RIFF but found'

python python-3.x tensorflow audio

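I'm new to TensorFlow and I'm trying to read some audio files for genre recognition. I'm following the guide on the TensorFlow website on this topic, but I keep running into two errors. The first concerns these lines: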

train_ds = train_ds.cache().prefetch(AUTOTUNE)
val_ds = val_ds.cache().prefetch(AUTOTUNE)

They produce the following error:

The calling iterator did not fully read the dataset being cached.
In order to avoid unexpected truncation of the dataset, the partially cached contents of
the dataset will be discarded.
This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
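For reference, a minimal sketch of the ordering this message describes. It uses a toy `tf.data.Dataset.range(10)` in place of the real spectrogram dataset, which is an assumption for illustration only. One plausible trigger in the code further down is the `train_ds.take(1)` loop, which reads only one batch from a dataset that has already been cached.

```python
import tensorflow as tf

# Toy stand-in for the real spectrogram dataset (an assumption for illustration).
ds = tf.data.Dataset.range(10)

# cache() before take(): iterating this stops before the cache is filled,
# so TensorFlow may log the "did not fully read the dataset being cached"
# message and discard the partial cache.
partial_read = ds.cache().take(3)

# take() before cache(): the cached dataset is only three elements long,
# so every pass reads it to the end and the cache can be completed.
full_read = ds.take(3).cache()

first_pass = list(full_read.as_numpy_iterator())
second_pass = list(full_read.as_numpy_iterator())  # served from the cache
```

Both pipelines yield the same elements. The difference is only whether the cache can be completed, so the message is usually a warning about wasted work rather than a failure.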

The other concerns these lines:

norm_layer = preprocessing.Normalization()
norm_layer.adapt(spectrogram_ds.map(lambda x, _: x))

They produce the following error:

Header mismatch: Expected RIFF but found
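A possible way to narrow this down, assuming the message means that one of the files handed to `tf.audio.decode_wav` does not begin with a RIFF header (for example a stray `.DS_Store` inside a genre folder, or audio in another format with a `.wav` name). The helper names `looks_like_wav` and `find_non_wav_files` are hypothetical, and the sketch assumes the one-folder-per-genre layout described in the code below.

```python
from pathlib import Path


def looks_like_wav(path):
    """Return True if the file starts with a RIFF header tagged as WAVE."""
    with open(path, "rb") as f:
        header = f.read(12)
    # A RIFF/WAVE file starts with b"RIFF", a 4-byte chunk size, then b"WAVE".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def find_non_wav_files(root):
    """List files one level below root (root/<genre>/<file>) that fail the check."""
    bad = []
    for genre_dir in Path(root).iterdir():
        if not genre_dir.is_dir():
            continue
        for file_path in genre_dir.iterdir():
            if file_path.is_file() and not looks_like_wav(file_path):
                bad.append(str(file_path))
    return sorted(bad)

# Usage sketch: print(find_non_wav_files('path/to/Songs'))
```

Any path this reports would be a candidate for exclusion from the file list before it is passed to the dataset.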

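Here is what I have so far. I've added comments marking where error 1 and error 2 occur, and where I suspect the source of the problem is: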

import os
import pathlib

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import tensorflow as tf

from tensorflow.keras.layers.experimental import preprocessing
from tensorflow.keras import layers
from tensorflow.keras import models
from IPython import display

seed = 42
tf.random.set_seed(seed)
np.random.seed(seed)

#Get audio
def decode_audio(audio_binary):
  audio, _ = tf.audio.decode_wav(audio_binary)
  return tf.squeeze(audio, axis=-1)

#Get label
def get_label(file_path):
  parts = tf.strings.split(file_path, os.path.sep)
  return parts[-2]

#Get both
def get_waveform_and_label(file_path):
  label = get_label(file_path)
  audio_binary = tf.io.read_file(file_path)
  waveform = decode_audio(audio_binary)
  return waveform, label

def get_spectrogram(waveform):
  # Here is where I suspect the source of my error to be
  zero_padding = tf.zeros([1600000] - tf.shape(waveform), dtype=tf.float32)

  waveform = tf.cast(waveform, tf.float32)
  equal_length = tf.concat([waveform, zero_padding], 0)
  spectrogram = tf.signal.stft(
      equal_length, frame_length=2047, frame_step=2048)

  spectrogram = tf.abs(spectrogram)

  return spectrogram

def get_spectrogram_and_label_id(audio, label):
  spectrogram = get_spectrogram(audio)
  spectrogram = tf.expand_dims(spectrogram, -1)
  label_id = tf.argmax(label == genres)
  return spectrogram, label_id

def preprocess_dataset(files):
  files_ds = tf.data.Dataset.from_tensor_slices(files)
  output_ds = files_ds.map(get_waveform_and_label, num_parallel_calls=AUTOTUNE)
  output_ds = output_ds.map(
      get_spectrogram_and_label_id,  num_parallel_calls=AUTOTUNE)
  return output_ds

#Work with data
#Get directory of songs. This directory holds folders of songs in mono channel .wav files separated by genre
#Each song is 30 seconds long and uses 16-bit samples
data_dir = pathlib.Path('path/to/Songs')

#Get list of genres
genres = np.array(tf.io.gfile.listdir(str(data_dir)))
genres = genres[genres != '.DS_Store']

#Get list and randomized songs
filenames = tf.io.gfile.glob(str(data_dir) + '/*/*')
filenames = tf.random.shuffle(filenames)
print(len(filenames)) #prints 3936


#Separating songs
train_files = filenames[:2748]
val_files = filenames[2748: 2748 + 393]
test_files = filenames[-393:]

AUTOTUNE = tf.data.AUTOTUNE
files_ds = tf.data.Dataset.from_tensor_slices(train_files)
waveform_ds = files_ds.map(get_waveform_and_label, num_parallel_calls=AUTOTUNE)

spectrogram_ds = waveform_ds.map(
    get_spectrogram_and_label_id, num_parallel_calls=AUTOTUNE)

for waveform, label in spectrogram_ds.take(1):
  print(waveform.shape) #prints (781, 1025, 1)

train_ds = spectrogram_ds
val_ds = preprocess_dataset(val_files)
test_ds = preprocess_dataset(test_files)

batch_size = 64
train_ds = train_ds.batch(batch_size)
val_ds = val_ds.batch(batch_size)

#Error 1
train_ds = train_ds.cache().prefetch(AUTOTUNE)
val_ds = val_ds.cache().prefetch(AUTOTUNE)

num_labels = len(genres)


for spectrogram, _ in train_ds.take(1):
    input_shape = spectrogram.shape
    print('Input shape:', input_shape)



#Error 2
norm_layer = preprocessing.Normalization()
norm_layer.adapt(spectrogram_ds.map(lambda x, _: x))

I'd really appreciate any input. Let me know if you need anything else.
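As a sanity check on the spectrogram shape printed in the code above, here is a small sketch of the frame arithmetic. It assumes `tf.signal.stft` is left at its defaults: no end padding, and an FFT length equal to the smallest power of two that is at least `frame_length`. The function name `stft_output_shape` is hypothetical.

```python
def stft_output_shape(num_samples, frame_length, frame_step, fft_length=None):
    """Return (num_frames, num_bins) for an STFT without end padding."""
    if fft_length is None:
        # Smallest power of two that is >= frame_length.
        fft_length = 1 << (frame_length - 1).bit_length()
    if num_samples < frame_length:
        num_frames = 0
    else:
        num_frames = 1 + (num_samples - frame_length) // frame_step
    # A real-valued FFT keeps only the non-redundant half of the spectrum.
    num_bins = fft_length // 2 + 1
    return num_frames, num_bins


# Values taken from get_spectrogram: 1600000 padded samples,
# frame_length=2047, frame_step=2048.
shape = stft_output_shape(1600000, 2047, 2048)
```

With these values the result is `(781, 1025)`, which agrees with the printed `(781, 1025, 1)` once the channel axis is added. This suggests the padding and STFT settings behave as intended for waveforms no longer than 1600000 samples.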