How to convert speech to text in iOS

Tags: ios, objective-c, swift, speech-recognition, speech-to-text

As far as I know, the native Apple frameworks have no API for converting speech to text, and we have to use a third-party framework to achieve this, which has many drawbacks, such as the user having to use its microphone in order to convert speech to text.

While I can find plenty of information about converting text to speech, I cannot find much about the other direction.

I couldn't find any clear information on this, and most of what I did find was inconclusive.

It would be great if someone could shed some light on this.

Here is the complete code for this, using the Speech framework (SFSpeechRecognizer, available on iOS 10 and later):

import UIKit
import Speech

public class ViewController: UIViewController, SFSpeechRecognizerDelegate {
    // MARK: Properties

    private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!

    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?

    private var recognitionTask: SFSpeechRecognitionTask?

    private let audioEngine = AVAudioEngine()

    @IBOutlet var textView : UITextView!

    @IBOutlet var recordButton : UIButton!

    // MARK: UIViewController

    public override func viewDidLoad() {
        super.viewDidLoad()

        // Disable the record buttons until authorization has been granted.
        recordButton.isEnabled = false
    }

    override public func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)

        speechRecognizer.delegate = self

        SFSpeechRecognizer.requestAuthorization { authStatus in
            /*
                The callback may not be called on the main thread. Add an
                operation to the main queue to update the record button's state.
            */
            OperationQueue.main.addOperation {
                switch authStatus {
                    case .authorized:
                        self.recordButton.isEnabled = true

                    case .denied:
                        self.recordButton.isEnabled = false
                        self.recordButton.setTitle("User denied access to speech recognition", for: .disabled)

                    case .restricted:
                        self.recordButton.isEnabled = false
                        self.recordButton.setTitle("Speech recognition restricted on this device", for: .disabled)

                    case .notDetermined:
                        self.recordButton.isEnabled = false
                        self.recordButton.setTitle("Speech recognition not yet authorized", for: .disabled)
                }
            }
        }
    }

    private func startRecording() throws {

        // Cancel the previous task if it's running.
        if let recognitionTask = recognitionTask {
            recognitionTask.cancel()
            self.recognitionTask = nil
        }

        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setMode(AVAudioSessionModeMeasurement)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)

        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()

        guard let inputNode = audioEngine.inputNode else { fatalError("Audio engine has no input node") }
        guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object") }

        // Configure request so that results are returned before audio recording is finished
        recognitionRequest.shouldReportPartialResults = true

        // A recognition task represents a speech recognition session.
        // We keep a reference to the task so that it can be cancelled.
        recognitionTask = speechRecognizer.recognitionTask(with: recognitionRequest) { result, error in
            var isFinal = false

            if let result = result {
                self.textView.text = result.bestTranscription.formattedString
                isFinal = result.isFinal
            }

            if error != nil || isFinal {
                self.audioEngine.stop()
                inputNode.removeTap(onBus: 0)

                self.recognitionRequest = nil
                self.recognitionTask = nil

                self.recordButton.isEnabled = true
                self.recordButton.setTitle("Start Recording", for: [])
            }
        }

        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            self.recognitionRequest?.append(buffer)
        }

        audioEngine.prepare()

        try audioEngine.start()

        textView.text = "(Go ahead, I'm listening)"
    }

    // MARK: SFSpeechRecognizerDelegate

    public func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
        if available {
            recordButton.isEnabled = true
            recordButton.setTitle("Start Recording", for: [])
        } else {
            recordButton.isEnabled = false
            recordButton.setTitle("Recognition not available", for: .disabled)
        }
    }

    // MARK: Interface Builder actions

    @IBAction func recordButtonTapped() {
        if audioEngine.isRunning {
            audioEngine.stop()
            recognitionRequest?.endAudio()
            recordButton.isEnabled = false
            recordButton.setTitle("Stopping", for: .disabled)
        } else {
            try! startRecording()
            recordButton.setTitle("Stop recording", for: [])
        }
    }
}
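
Two practical notes on the Swift sample above. First, on iOS 10 and later the app's Info.plist must contain usage-description strings for both NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription, otherwise requesting authorization or starting the microphone will crash the app. Second, the sample is written against the Swift 3 / iOS 10 SDK (AVAudioSessionCategoryRecord, setActive(_:with:), an optional inputNode); on current SDKs those symbols have been renamed. A minimal sketch of the same audio-session setup against a recent SDK (assuming Swift 5; adjust to your deployment target):

    import AVFoundation

    // Same configuration as in startRecording(), written with the renamed APIs:
    // the string constants are now enum cases and setActive(_:with:) is now
    // setActive(_:options:). inputNode is non-optional on recent SDKs, so the
    // guard around it is no longer needed.
    func configureRecordingSession() throws {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement)
        try session.setActive(true, options: .notifyOthersOnDeactivation)
    }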

For Objective-C, I wrote a speech converter class to convert speech to text.

Step 1: Create the speech converter class
  • Create a new Cocoa class and subclass it from NSObject
  • Name it something like ATSpeechRecognizer
  • In ATSpeechRecognizer.h:

    #import <Foundation/Foundation.h>
    #import <Speech/Speech.h>
    #import <AVFoundation/AVFoundation.h>
    
    typedef NS_ENUM(NSInteger, ATSpeechRecognizerState) {
        ATSpeechRecognizerStateRunning,
        ATSpeechRecognizerStateStopped
    };
    
    @protocol ATSpeechDelegate<NSObject>
    @required
    /*This method relays parsed text from Speech to the delegate responder class*/
    -(void)convertedSpeechToText:(NSString *) parsedText;
    /*This method relays change in Speech recognition ability to delegate responder class*/
    -(void) speechRecAvailabilityChanged:(BOOL) status;
    /*This method relays error messages to delegate responder class*/
    -(void) sendErrorInfoToViewController:(NSString *) errorMessage;
    @optional
    /*This method relays info regarding whether speech rec is running or stopped to the delegate responder class. State will be either ATSpeechRecognizerStateRunning or ATSpeechRecognizerStateStopped. You may or may not implement this method*/
    -(void) changeStateIndicator:(ATSpeechRecognizerState) state;
    @end
    
    @interface ATSpeechRecognizer : NSObject <SFSpeechRecognizerDelegate>
    
    + (ATSpeechRecognizer *)sharedObject;
    
    /*Delegate to communicate with requesting VCs*/
    @property (weak, nonatomic) id<ATSpeechDelegate> delegate;
    
    /*Class Methods*/
    -(void) toggleRecording;
    -(void) activateSpeechRecognizerWithLocaleIdentifier:(NSString *) localeIdentifier andBlock:(void (^)(BOOL isAuthorized))successBlock;
    @end
    
That's it. Now you can use this class anywhere, in any project where you want to convert speech to text. If you are unsure how it works, be sure to read the comments in the code.
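
If you want a sense of what such a wrapper has to do internally, here is a compact sketch of the same idea written in Swift. The class and method names are illustrative only, not the actual ATSpeechRecognizer implementation; the flow mirrors the Swift sample earlier: request authorization once, then toggle an AVAudioEngine tap that feeds an SFSpeechAudioBufferRecognitionRequest.

    import Foundation
    import Speech
    import AVFoundation

    // Illustrative sketch only -- not the original ATSpeechRecognizer code.
    // A singleton that owns the audio engine and recognition task and reports
    // recognized text through a callback, similar to toggleRecording.
    final class SpeechToTextEngine {
        static let shared = SpeechToTextEngine()

        private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
        private let audioEngine = AVAudioEngine()
        private var request: SFSpeechAudioBufferRecognitionRequest?
        private var task: SFSpeechRecognitionTask?

        /// Ask for permission once; report on the main queue whether recognition may be used.
        func activate(_ completion: @escaping (Bool) -> Void) {
            SFSpeechRecognizer.requestAuthorization { status in
                OperationQueue.main.addOperation { completion(status == .authorized) }
            }
        }

        /// Start listening if idle, stop if already running.
        func toggle(onText: @escaping (String) -> Void) throws {
            if audioEngine.isRunning {
                audioEngine.stop()
                request?.endAudio()
                return
            }
            task?.cancel()
            task = nil

            let session = AVAudioSession.sharedInstance()
            try session.setCategory(.record, mode: .measurement)
            try session.setActive(true, options: .notifyOthersOnDeactivation)

            let request = SFSpeechAudioBufferRecognitionRequest()
            request.shouldReportPartialResults = true
            self.request = request

            let inputNode = audioEngine.inputNode
            inputNode.installTap(onBus: 0, bufferSize: 1024,
                                 format: inputNode.outputFormat(forBus: 0)) { buffer, _ in
                request.append(buffer)
            }

            task = recognizer.recognitionTask(with: request) { [weak self] result, error in
                if let result = result { onText(result.bestTranscription.formattedString) }
                if error != nil || result?.isFinal == true {
                    self?.audioEngine.stop()
                    inputNode.removeTap(onBus: 0)
                    self?.request = nil
                    self?.task = nil
                }
            }

            audioEngine.prepare()
            try audioEngine.start()
        }
    }

The Objective-C class presumably follows the same structure, with the ATSpeechDelegate methods playing the role of the closures here.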


Step 2: Set up ATSpeechRecognizer in your view controller
Import ATSpeechRecognizer into the view controller and adopt ATSpeechDelegate (the text view and state label outlets used by the delegate methods below are declared here as well). Then point the shared recognizer at the view controller early on, for example by assigning [ATSpeechRecognizer sharedObject].delegate = self in viewDidLoad and calling activateSpeechRecognizerWithLocaleIdentifier:andBlock: to request authorization:

    #import "ATSpeechRecognizer.h"
    @interface ViewController : UIViewController <ATSpeechDelegate>{
        BOOL isRecAllowed;
    }
    
Now implement the delegate methods:

    #pragma mark - Speech Recog Delegates
    
    -(void) convertedSpeechToText:(NSString *)parsedText{
        if(parsedText!=nil){
            _txtView.text = parsedText; //You got Text from voice. Use it as you want
        }
        
    }
    
    -(void) speechRecAvailabilityChanged:(BOOL)status{
        isRecAllowed = status; //Status of Conversion ability has changed. Use Status flag to allow/stop operations
    }
    
    -(void) changeStateIndicator:(ATSpeechRecognizerState) state{
        if(state==ATSpeechRecognizerStateStopped){
            //Speech Recognizer is Stopped
            _lblState.text = @"Stopped";
            
        }
        else{
            //Speech Recognizer is running
            _lblState.text = @"Running";
        }
        _txtView.text = @"";
    }
    
    -(void) sendErrorInfoToViewController:(NSString *)errorMessage{
        [self showPopUpForErrorMessage:errorMessage]; /*Some error occurred. Show it to the user*/
    }
    
To start converting speech to text:

    - (IBAction)btnRecordTapped:(id)sender {
        if(!isRecAllowed){
            [self showPopUpForErrorMessage:@"Speech recognition is either not authorized or available for this device. Please authorize the operation or upgrade to latest iOS. If you have done all this, check your internet connectivity"];
        }
        else{
            [[ATSpeechRecognizer sharedObject] toggleRecording]; /*If speech Recognizer is running, it will turn it off. if it is off, it will set it on*/
            
            /*
             If you want to do it manually, use the startAudioEngine and stopAudioEngine methods to explicitly perform those operations instead of toggleRecording
             
             */
        }
        
    }
    

That's it. All the further explanation you need is in the code comments. Contact me if you need anything explained further.

Comments on this answer:

  • Show your code. You can use the Speech framework; also take a look at a similar tutorial.
  • This link might help you; it also has working code on GitHub.
  • ATSpeechRecognizer... is it supported on iOS 8?
  • @Shiva ATSpeechRecognizer is just what I named the class (Aegon Targaryen is my alias on the internet, at least). The engine behind it is SFSpeechRecognizer, which is available on iOS 10 and later.
  • Thanks man, that makes sense... Can you tell me how to embed this in the app so that it listens for the user's voice commands? I plan to define around 20 different voice commands, and when the user speaks I should be able to act on them accordingly. I could keep this in the app delegate, but if it keeps running, will that cause memory issues or other problems? Can you give me a rough idea of how to approach this?
  • @Shiva You will get the text in the convertedSpeechToText:(NSString *)parsedText method. Use the parsed text to determine whether it is one of the 20 predefined commands and act accordingly.
  • @Shiva Don't keep it running continuously; that will drain your data bandwidth and memory. Instead, use a button to take input from the user when they want to issue a voice command.
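
For the roughly 20 predefined commands discussed in the comments, the matching step can be kept very small. A hypothetical Swift sketch (the command phrases and handleTranscription are made up for illustration; feed it result.bestTranscription.formattedString from the Swift sample, or the parsedText delivered to convertedSpeechToText: in the Objective-C version):

    import Foundation

    // Hypothetical command set -- replace with your own phrases.
    let commands: Set<String> = ["open settings", "start recording", "stop recording"]

    // Called with each transcription update. Partial results arrive repeatedly,
    // so in practice you may want to match only on the final result.
    func handleTranscription(_ transcription: String) {
        let spoken = transcription.lowercased()
            .trimmingCharacters(in: .whitespacesAndNewlines)
        if let match = commands.first(where: { spoken.contains($0) }) {
            print("Matched command: \(match)")
            // Dispatch to whatever action is associated with `match`.
        }
    }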