How to stop a Java spell checker from correcting repeated words

Tags: java, algorithm, if-statement, spell-checking

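I have implemented a program that does the following:

  • Scans all the words from a web page into a String (using jsoup)
  • Filters out all the HTML markup and code
  • Feeds these words to a spell checker that offers suggestions
  • The spell checker loads a dictionary.txt file into an array and compares the String input against the words in the dictionary

My current problem is that when the input contains the same word more than once, for example "teh program is teh worst", the code prints: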

    You entered 'teh', did you mean 'the'?
    You entered 'teh', did you mean 'the'?
    
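Sometimes a website repeats several words over and over, and the output gets messy.

Ideally each word would be printed along with the number of times it was misspelled, but limiting each word to a single printout would be enough.

My program has several methods and two classes, but the spell-checking method is below.

Note: the original code contains some "if" statements that remove punctuation, but I removed them for clarity.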

    static boolean suggestWord;
    
    public static String checkWord(String wordToCheck) {
        String wordCheck;
        String word = wordToCheck.toLowerCase();
    
        if ((wordCheck = (String) dictionary.get(word)) != null) {
            suggestWord = false; // no need to ask for suggestion for a correct
                                    // word.
            return wordCheck;
        }
    
        // If after all of these checks a word could not be corrected, return as
        // a misspelled word.
        return word;
    }
    
Temporary edit: as requested, here is the full code.

Class 1:

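    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Hashtable;
    import java.util.Scanner;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.safety.Whitelist;

    public class ParseCleanCheck {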
    
            static Hashtable<String, String> dictionary;// To store all the  words of the
            // dictionary
            static boolean suggestWord;// To indicate whether the word is spelled
                                        // correctly or not.
    
            static Scanner urlInput = new Scanner(System.in);
            public static String cleanString;
            public static String url = "";
            public static boolean correct = true;
    
    
            /**
             * PARSER METHOD
             */
            public static void PageScanner() throws IOException {
                System.out.println("Pick an english website to scan.");
    
                // This do-while loop allows the user to try again after a mistake
                do {
                    try {
                        System.out.println("Enter a URL, starting with http://");
                        url = urlInput.nextLine();
                        // This creates a document out of the HTML on the web page
                        Document doc = Jsoup.connect(url).get();
                        // This converts the document into a string to be cleaned
                        String htmlToClean = doc.toString();
                        cleanString = Jsoup.clean(htmlToClean, Whitelist.none());
    
    
                        correct = false;
                    } catch (Exception e) {
                        System.out.println("Incorrect format for a URL. Please try again.");
                    }
                } while (correct);
            }
    
            /**
             * SPELL CHECKER METHOD
             */
            public static void SpellChecker() throws IOException {
                dictionary = new Hashtable<String, String>();
                System.out.println("Searching for spelling errors ... ");
    
                try {
                    // Read and store the words of the dictionary
                    BufferedReader dictReader = new BufferedReader(new FileReader("dictionary.txt"));
    
                    while (dictReader.ready()) {
                        String dictInput = dictReader.readLine();
                        String[] dict = dictInput.split("\\s"); // create an array of
                                                                // dictionary words
    
                        for (int i = 0; i < dict.length; i++) {
                            // key and value are identical
                            dictionary.put(dict[i], dict[i]);
                        }
                    }
                    dictReader.close();
                    String user_text = "";
    
                    // Initializing a spelling suggestion object based on probability
                    SuggestSpelling suggest = new SuggestSpelling("wordprobabilityDatabase.txt");
    
                    // get user input for correction
                    {
    
                        user_text = cleanString;
                        String[] words = user_text.split(" ");
    
                        int error = 0;
    
                        for (String word : words) {
                            if(!dictionary.contains(word)) {
                                checkWord(word);
    
    
                                dictionary.put(word, word);
                            }
                            suggestWord = true;
                            String outputWord = checkWord(word);
    
                            if (suggestWord) {
                                System.out.println("Suggestions for " + word + " are:  " + suggest.correct(outputWord) + "\n");
                                error++;
                            }
                        }
    
                        if (error == 0) {
                            System.out.println("No mistakes found");
                        }
                    }
    
                } catch (IOException e) {
                    e.printStackTrace();
                    System.exit(-1);
                }
            }
    
            /**
             * METHOD TO SPELL CHECK THE WORDS IN A STRING. IS USED IN SPELL CHECKER
             * METHOD THROUGH THE "WORD" STRING
             */
    
            public static String checkWord(String wordToCheck) {
                String wordCheck;
                String word = wordToCheck.toLowerCase();
    
                if ((wordCheck = (String) dictionary.get(word)) != null) {
                    suggestWord = false; // no need to ask for suggestion for a correct
                                         // word.
                    return wordCheck;
                }
    
                // If after all of these checks a word could not be corrected, return as
                // a misspelled word.
                return word;
            }
    }
    

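There is also a second class (SuggestSpelling.java) that contains a probability calculator, but it is not relevant here unless you plan to run the code yourself.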

Use a HashSet to detect duplicates:

    Set<String> wordSet = new HashSet<>();

    String[] words = user_text.split(" "); // split the input text into words
    for (String word : words) {
        if (!wordSet.contains(word)) {
            checkWord(word);
            // do stuff
            wordSet.add(word);
        }
    }

Edit:
    
    // ....
    {
    
        user_text = cleanString;
        String[] words = user_text.split(" ");
        Set<String> wordSet = new HashSet<>();
    
        int error = 0;
    
        for (String word : words) {
            // wordSet is a separate data structure. It is only used for duplicate checking; don't mix it with the dictionary
            if(!wordSet.contains(word)) {
    
                // put all your logic here
    
                wordSet.add(word);
            }
        }
    
        if (error == 0) {
            System.out.println("No mistakes found");
        }
    }
    // ....
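
The question also asks whether each misspelled word could be printed together with the number of times it occurs. One possible way to get that is to swap the HashSet for a map from word to count. The following is only a sketch under some assumptions: the class name `MisspellingCounter` and the method `countMisspellings` are hypothetical, a small in-memory set stands in for `dictionary.txt`, and the call to `suggest.correct(...)` is left out because `SuggestSpelling` is not shown in the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class MisspellingCounter {

    /**
     * Returns each word that is missing from the dictionary, mapped to the
     * number of times it occurs in the text, in order of first appearance.
     */
    public static Map<String, Integer> countMisspellings(String text, Set<String> dictionary) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.split("\\s+")) {
            String word = token.toLowerCase();
            if (word.isEmpty() || dictionary.contains(word)) {
                continue; // skip empty tokens and correctly spelled words
            }
            // merge() stores 1 the first time a word is seen, otherwise adds 1
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Tiny in-memory dictionary, standing in for dictionary.txt (assumption)
        Set<String> dictionary = new HashSet<>(Arrays.asList("the", "program", "is", "worst"));

        Map<String, Integer> counts = countMisspellings("teh program is teh worst", dictionary);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Each misspelled word is reported once, with its occurrence count
            System.out.println("You entered '" + entry.getKey() + "' "
                    + entry.getValue() + " time(s)");
        }
    }
}
```

With the sample sentence this prints a single line for 'teh' with a count of 2. As in the answer above, the counting map is kept separate from the dictionary, so misspelled words are never added to the set of known-good words.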