How do I get 'pause' markers for a given audio file using Google Speech-to-Text?


Below is the output from Google Speech-to-Text for an audio file:

results {
  alternatives {
    transcript: "extremely grateful for the "
    confidence: 0.911402702331543
    words {
      start_time {
      }
      end_time {
        nanos: 600000000
      }
      word: "extremely"
    }
    words {
      start_time {
        nanos: 600000000
      }
      end_time {
        nanos: 900000000
      }
      word: "grateful"
    }
    words {
      start_time {
        nanos: 900000000
      }
      end_time {
        seconds: 1
        nanos: 100000000
      }
      word: "for"
    }
    words {
      start_time {
        seconds: 1
        nanos: 100000000
      }
      end_time {
        seconds: 1
        nanos: 300000000
      }
      word: "the"
    }
    words {
      start_time {
        seconds: 1
        nanos: 300000000
      }
}

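I want to know the pauses between words. In the output above, the word "extremely" starts at 0 and ends at nanos: 600000000. The next word, "grateful", starts at nanos: 600000000, exactly where the previous word ends. But when we speak, there are gaps between words. I want timestamps for the duration of those gaps.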
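To make the problem concrete, here is a minimal sketch that computes the gap between each pair of adjacent words from the word offsets. The `WORDS` list is copied by hand from the output above; the names `to_nanos` and `gaps_between_words` are only illustrative and are not part of any Google library.

```python
NANOS_PER_SECOND = 1_000_000_000

# (word, (start_seconds, start_nanos), (end_seconds, end_nanos)),
# transcribed by hand from the output shown above.
WORDS = [
    ("extremely", (0, 0), (0, 600_000_000)),
    ("grateful", (0, 600_000_000), (0, 900_000_000)),
    ("for", (0, 900_000_000), (1, 100_000_000)),
    ("the", (1, 100_000_000), (1, 300_000_000)),
]


def to_nanos(offset):
    """Convert a (seconds, nanos) pair into a single nanosecond count."""
    seconds, nanos = offset
    return seconds * NANOS_PER_SECOND + nanos


def gaps_between_words(words):
    """Return (previous_word, next_word, gap_in_seconds) for each adjacent pair."""
    gaps = []
    for (prev_word, _, prev_end), (next_word, next_start, _) in zip(words, words[1:]):
        gap_nanos = to_nanos(next_start) - to_nanos(prev_end)
        gaps.append((prev_word, next_word, gap_nanos / NANOS_PER_SECOND))
    return gaps


if __name__ == "__main__":
    for prev_word, next_word, gap in gaps_between_words(WORDS):
        print(f"{prev_word} -> {next_word}: {gap:.1f} s")
```

With the offsets shown above, every gap comes out as zero, because each word's start time equals the previous word's end time.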

Is there a way to get this information from Google Speech-to-Text?
If not, please suggest alternatives that achieve the same goal.
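One alternative that does not depend on the recognizer at all is to look for quiet stretches in the audio itself. The following is only a rough sketch of that idea: it assumes the audio has already been decoded into a sequence of mono 16-bit PCM sample values, and the function name `find_silences`, the `threshold` of 500, and the `min_silence_s` of 0.1 are all hypothetical choices that would need tuning for real recordings.

```python
def find_silences(samples, sample_rate, threshold=500, min_silence_s=0.1):
    """Return (start_s, end_s) spans where every |sample| is below threshold
    for at least min_silence_s seconds.

    Assumes `samples` is a sequence of mono PCM amplitudes (an assumption,
    not something the recognizer output provides).
    """
    silences = []
    run_start = None  # index where the current quiet run began, if any
    # Iterate one step past the end so a trailing quiet run is also closed.
    for i in range(len(samples) + 1):
        quiet = i < len(samples) and abs(samples[i]) < threshold
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            start_s = run_start / sample_rate
            end_s = i / sample_rate
            if end_s - start_s >= min_silence_s:
                silences.append((start_s, end_s))
            run_start = None
    return silences


if __name__ == "__main__":
    # Toy signal at 10 samples per second: loud, then quiet, then loud again.
    toy = [1000] * 5 + [0] * 3 + [1000] * 2
    print(find_silences(toy, sample_rate=10, min_silence_s=0.2))
```

The detected spans could then be compared with the word offsets to see which word boundaries fall inside a pause. Per-sample thresholding is crude; averaging energy over short frames would likely be more robust on noisy audio.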