Node.js: "WAV header indicates an unsupported format" with the Google Cloud Speech-to-Text API

I am trying to first upload my WAV file to a bucket (the upload succeeds) and then use that URI for transcription with the Google Cloud Speech-to-Text API, but it throws an error suggesting that the config object I am providing may be wrong:

(node:15728) UnhandledPromiseRejectionWarning: Error: 3 INVALID_ARGUMENT: WAV header indicates an unsupported format.
    at Object.callErrorFromStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\call.js:31:26)
    at Object.onReceiveStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\client.js:176:52)
    at Object.onReceiveStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\client-interceptors.js:342:141)
    at Object.onReceiveStatus (C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\client-interceptors.js:305:181)
    at C:\Users\Talha\Desktop\transcription backend\node_modules\@grpc\grpc-js\build\src\call-stream.js:124:78
    at processTicksAndRejections (internal/process/task_queues.js:75:11)
(node:15728) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:15728) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate
the Node.js process with a non-zero exit code.
The code I wrote is:

const path = require("path");
const _ = require("lodash");
const { Storage } = require("@google-cloud/storage");
const speech = require("@google-cloud/speech");

const filePath = "i_think_arthur.wav"; // WAV file

// Google Cloud storage
const bucketName = "<bucket name>"; // Must exist in your Cloud Storage
const keyFilename = "<path to service account key>";

const speechClient = new speech.SpeechClient({ keyFilename });

const uploadToGcs = async () => {
  const storage = new Storage({
    projectId: "<my project id>",
    keyFilename,
  });

  const bucket = storage.bucket(bucketName);
  const fileName = path.basename(filePath);

  await bucket.upload(filePath);

  return `gs://${bucketName}/${fileName}`;
};

// Upload to Cloud Storage first, then detect speech in the audio file
uploadToGcs()
  .then(async (gcsUri) => {
    const audio = {
      uri: gcsUri,
    };

    const config = {
      encoding: "OGG_OPUS",
      sampleRateHertz: 48000,
      // encoding: "LINEAR16",
      languageCode: "en-US",
      audioChannelCount: 2,
      enableSeparateRecognitionPerChannel: true,
    };

    const request = {
      audio,
      config,
    };

    speechClient
      .longRunningRecognize(request)
      .then((data) => {
        const operation = data[0];

        // The following Promise represents the final result of the job
        return operation.promise();
      })
      .then((data) => {
        const results = _.get(data[0], "results", []);
        const transcription = results
          .map((result) => result.alternatives[0].transcript)
          .join("\n");
        console.log(`Transcription: ${transcription}`);
      });
  })
  .catch((err) => {
    console.error("ERROR:", err);
  });

I would appreciate any help with this issue, thanks.

I went through almost the same thing.

I was trying to mix samples in order to recognize a specific voice. What I did was mix the audio samples with the OpenShot video editor and then convert the .mp4 file to WAV.

Specifically, the following settings on the converter website worked with the default settings of the Google Cloud script (a rough command-line equivalent of the conversion is sketched right after the list):

In the advanced settings:

  • Sample rate: 16000 Hz
  • Channels: 1
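
If the converter website isn't handy, a rough command-line equivalent of that conversion can be run from Node. This is only a sketch: it assumes ffmpeg is installed and on the PATH, and the file names are just examples.

const { execFileSync } = require("child_process");

// Re-encode to 16 kHz, mono, 16-bit PCM WAV
execFileSync("ffmpeg", [
  "-i", "i_think_arthur.mp4", // source file exported from the editor (example name)
  "-ar", "16000",             // sample rate: 16000 Hz
  "-ac", "1",                 // channels: 1 (mono)
  "-c:a", "pcm_s16le",        // 16-bit signed PCM
  "i_think_arthur_16k.wav",
]);
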
Now you have an audio file that you can use with Google Cloud Speech-to-Text.
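
For completeness, here is a minimal recognition sketch that matches those settings (LINEAR16, 16000 Hz, one channel). The bucket and object names are placeholders, not the ones from the question.

const speech = require("@google-cloud/speech");
const speechClient = new speech.SpeechClient();

(async () => {
  const request = {
    audio: { uri: "gs://<bucket name>/i_think_arthur_16k.wav" },
    config: {
      encoding: "LINEAR16",   // 16-bit PCM; for WAV this can also be omitted,
                              // because the header already declares the format
      sampleRateHertz: 16000, // must match the actual file
      audioChannelCount: 1,   // mono after the conversion above
      languageCode: "en-US",
    },
  };

  const [operation] = await speechClient.longRunningRecognize(request);
  const [response] = await operation.promise();

  const transcription = response.results
    .map((result) => result.alternatives[0].transcript)
    .join("\n");
  console.log(`Transcription: ${transcription}`);
})().catch(console.error);

The main point is that encoding, sampleRateHertz and audioChannelCount have to describe the file that is actually in the bucket (or, for WAV, the first two can simply be left out); a mismatch between the declared config and the WAV header is the kind of thing this error tends to point at.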