
JavaScript: formatting strings from an array of objects into a single string


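I'm using Google's Speech-to-Text API to transcribe audio files. It can identify speakers, which is really cool, but I'm having trouble with the way it formats the information. Their documentation has a recommendation for separating speakers.

My goal is a single string in which each speaker's words are on their own line, like this: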

Speaker1: Hello Tom
Speaker2: Howdy
Speaker1: How was your weekend
If I send an audio file for transcription, I get back something like this:

wordsObjects =
[
  {
    startTime: { seconds: '1'},
    endTime: { seconds: '1'},
    word: 'Hello',
    speakerTag: 1
  },
  {
    startTime: { seconds: '2'},
    endTime: { seconds: '2'},
    word: 'Tom',
    speakerTag: 1
  },
]
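Of course, there is one object per word; I've only trimmed the list to save space. Anything Tom says in this example should be tagged with speakerTag: 2.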

This is the closest I've gotten so far:

  const unformattedTranscript = wordsObjects.map((currentWord, idx, arr) => {
    if (arr[idx + 1]) {
      if (currentWord.speakerTag === arr[idx + 1].speakerTag) {
        return [currentWord.word, arr[idx + 1].word];
      } else {
        return ["SPEAKER CHANGE"];
      }
    }
  });

  const formattedTranscript = unformattedTranscript.reduce(
    (acc, wordArr, idx, arr) => {
      if (arr[idx + 1]) {
        if (wordArr[wordArr.length - 1] === arr[idx + 1][0]) {
          wordArr.pop();
          acc.push(wordArr.concat(arr[idx + 1]));
        } else {
          acc.push(["\n"]);
        }
      }
      return acc;
    },
    []
  );
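This solution doesn't work if a speaker says more than two words in a row. At this point I've thoroughly confused myself, so I'd love a nudge in the right direction.

Thanks in advance for any advice.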

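I think you're overcomplicating things. You can simply iterate over the array of words and keep track of the current speaker tag. Whenever the current word's speaker tag differs from the tracked one, start a new line; otherwise, append the current word to the current line. Here's an example: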

const stringifyDialog = (words) => {
    let currSpeakerTag // number | undefined
    let lines = [] // Array<[number, string]>, where number is speaker tag and string is the line

    for (let {speakerTag, word} of words) {
        if (speakerTag !== currSpeakerTag) {
            currSpeakerTag = speakerTag
            lines.push([speakerTag, word])
        } else {
            lines[lines.length - 1][1] += ` ${word}`
        }
    }

    return lines.map(([speakerTag, line]) => `Speaker${speakerTag}: ${line}`).join('\n')
}
This will produce:

"Speaker1: Hello Tom
Speaker2: Howdy
Speaker1: How was your weekend"

This is how I'd do it with a reducer:

  const formattedTranscript = wordsObjects.reduce((accumulator, currentValue) => {

    // check if same speaker (continue on the same line)
    if(accumulator.length > 0)
    {
        const lastItem = accumulator[accumulator.length -1];
        if(lastItem.speakerTag === currentValue.speakerTag) {
          lastItem.text += " " + currentValue.word;
          return accumulator;
        }
    }

    // new line (new speaker)
    accumulator.push({
        speakerTag: currentValue.speakerTag, 
        text: currentValue.word 
    });

    return accumulator;
}, []);
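
Note that the reducer above returns an array of { speakerTag, text } objects rather than the single string the question asks for. A minimal sketch of the remaining step, assuming the "SpeakerN: ..." line format from the question; the helper name toTranscript and the sample data are made up for illustration:

```javascript
// Hypothetical helper (not part of the original answer): joins grouped
// lines into one newline-separated transcript string.
const toTranscript = (lines) =>
  lines.map(({ speakerTag, text }) => `Speaker${speakerTag}: ${text}`).join("\n");

// Sample data in the shape the reducer accumulates.
const groupedLines = [
  { speakerTag: 1, text: "Hello Tom" },
  { speakerTag: 2, text: "Howdy" },
  { speakerTag: 1, text: "How was your weekend" },
];

const transcript = toTranscript(groupedLines);
console.log(transcript);
```

In practice, formattedTranscript from the reducer would be passed in place of groupedLines.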
You could add a generator function. Chunk the items for as long as the speaker tag stays the same, then convert each chunk into a line.

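function* chunkWhile(iterable, fn) {
  const iterator = iterable[Symbol.iterator]();
  let { done, value: valueA } = iterator.next();
  if (done) return; // empty input: nothing to yield

  let chunk = Array.of(valueA);
  for (const valueB of iterator) {
    if (fn(valueA, valueB)) {
      // same group as the previous item: extend the current chunk
      chunk.push(valueB);
    } else {
      // group boundary: emit the finished chunk and start a new one
      yield chunk;
      chunk = Array.of(valueB);
    }
    valueA = valueB;
  }
  yield chunk;
}

const wordsObjects = [
  { word: 'Hello', speakerTag: 1 },
  { word: 'Tom', speakerTag: 1 },
  { word: 'Howdy', speakerTag: 2 },
  { word: 'How', speakerTag: 1 },
  { word: 'was', speakerTag: 1 },
  { word: 'your', speakerTag: 1 },
  { word: 'weekend', speakerTag: 1 },
];

const chunkGenerator = chunkWhile(
  wordsObjects,
  (a, b) => a.speakerTag === b.speakerTag,
);

let string = "";
for (const chunk of chunkGenerator) {
  const speakerTag = chunk[0].speakerTag;
  const words = chunk.map(({ word }) => word).join(" ");
  string += `Speaker${speakerTag}: ${words}\n`;
}

console.log(string);

Thanks for clearing that up for me, and even for writing out the code for the solution. It's much more straightforward than what I had in mind.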