Java: why does TF-IDF give only one result?
Tags: java, tf-idf

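Hi, this is my code for calculating term frequency and TF-IDF. The first piece of code calculates the term frequency of a given string for each file. The second piece of code is supposed to use those values to calculate the TF-IDF for each file, but I only get one value. It should give a TF-IDF value for every document.

Example output for the term frequency:

The word inputted is "is"

| File = abc0.txt |
is ---> Word count = | 2 |  Total count = | 150 |  Term Frequency = | 0.0133 |

The word inputted is "is"

| File = abc1.txt |
is ---> Word count = | 0 |  Total count = | 9 |  Term Frequency = | 0.0000 |

TF-IDF

is --> This number of files that contain the term  7

is --> IDF 0.1962946357308887

is --> TFIDF 0.0028607962606519654

The println statement that you assume is repeated for each file is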

    // Calculating term frequency
    System.out.println("Please enter the required word  :");
    Scanner scan = new Scanner(System.in);
    String word = scan.nextLine();

    String[] array = word.split(" ");
    int filename = 11;
    String[] fileName = new String[filename];
    int a = 0;
    int totalCount = 0;
    int wordCount = 0;


    for (a = 0; a < filename; a++) {

        try {
            System.out.println("The word inputted is " + word);
            File file = new File(
                    "C:\\Users\\user\\fypworkspace\\TextRenderer\\abc" + a
                            + ".txt");
            System.out.println(" _________________");

            System.out.print("| File = abc" + a + ".txt | \t\t \n");

            for (int i = 0; i < array.length; i++) {

                totalCount = 0;
                wordCount = 0;

                Scanner s = new Scanner(file);
                {
                    while (s.hasNext()) {
                        totalCount++;
                        if (s.next().equals(array[i]))
                            wordCount++;

                    }

                    System.out.print(array[i] + " ---> Word count =  "
                            + "\t\t " + "|" + wordCount + "|");
                    System.out.print("  Total count = " + "\t\t " + "|"
                            + totalCount + "|");
                    System.out.printf("  Term Frequency =  | %8.4f |",
                            (double) wordCount / totalCount);

                    System.out.println("\t ");

                }
            }
        } catch (FileNotFoundException e) {
            System.out.println("File is not found");

        }

    }

    // Calculating TF-IDF
    System.out.println("Please enter the required word  :");
    Scanner scan2 = new Scanner(System.in);
    String word2 = scan2.nextLine();
    String[] array2 = word2.split(" ");
    int numofDoc;

    for (int b = 0; b < array2.length; b++) {

        numofDoc = 0;

        for (int i = 0; i < filename; i++) {

            try {

                BufferedReader in = new BufferedReader(new FileReader(
                        "C:\\Users\\user\\fypworkspace\\TextRenderer\\abc"
                                + i + ".txt"));

                int matchedWord = 0;

                Scanner s2 = new Scanner(in);

                {

                    while (s2.hasNext()) {
                        if (s2.next().equals(array2[b]))
                            matchedWord++;
                    }

                }
                if (matchedWord > 0)
                    numofDoc++;

            } catch (IOException e) {
                System.out.println("File not found.");
            }

        }
        System.out.println(array2[b]
                + " --> This number of files that contain the term  "
                + numofDoc);
        double inverseTF = Math.log10((float) numDoc / numofDoc);
        System.out.println(array2[b] + " --> IDF " +  inverseTF );
        double TFIDF = (((double) wordCount / totalCount) * inverseTF );
        System.out.println(array2[b] + " --> TFIDF " + TFIDF);
    }
}

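But it is contained in a single loop only:

    double TFIDF = (((double) wordCount / totalCount) * inverseTF);
    System.out.println(array2[b] + " --> TFIDF " + TFIDF);

That loop is

    for (int b = 0; b < array2.length; b++)

which iterates over the entered words, not over the files. If you want to print this line per file, you must surround the statement with another loop over all files.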
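To make the dependence on the file explicit, the quantities computed by the code in the question can be written out as follows. The symbols t (word), d (file), N (number of files) and n_t (number of files containing t) are notation introduced here, not names from the code. The base-10 logarithm matches the Math.log10 call in the question.

```latex
\mathrm{tf}(t, d) = \frac{\text{occurrences of } t \text{ in } d}{\text{total words in } d},
\qquad
\mathrm{idf}(t) = \log_{10}\frac{N}{n_t},
\qquad
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
```

Only idf(t) is the same for every file. tf(t, d) differs from file to file, so one TF-IDF value is needed per word/file pair. The printed IDF is consistent with N = 11 and n_t = 7, since log10(11/7) ≈ 0.1963. Dividing the printed TF-IDF by that IDF gives about 0.0146, which is not the 0.0133 reported for abc0.txt. Presumably it is whatever wordCount / totalCount was left over from the last file the first loop processed.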
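Since this is homework, I will not include the final code, but here is another hint: the TFIDF calculation also uses the variables wordCount and totalCount. Those values are different for every file name/word pair. So you need to save them not just once but once per file name/word pair, or recompute them inside the last loop.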
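For readers who want to see the shape of the hint above, here is a minimal sketch under stated assumptions, not the asker's final program. It works on in-memory strings instead of the abc0.txt ... files so that it runs anywhere. The class name TfIdfSketch, its method names, and the choice to report an IDF of 0 when no document contains the word are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; none of these names come from the question.
final class TfIdfSketch {

    /** Counts how many whitespace-separated tokens of text equal term exactly. */
    static int countTerm(String text, String term) {
        int count = 0;
        for (String token : text.trim().split("\\s+")) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    /** Counts all whitespace-separated tokens of text. */
    static int countTokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Returns one TF-IDF value per document (document name -> value). */
    static Map<String, Double> tfIdfPerDocument(Map<String, String> docs, String term) {
        // First pass: in how many documents does the term occur at all?
        int docsWithTerm = 0;
        for (String text : docs.values()) {
            if (countTerm(text, term) > 0) {
                docsWithTerm++;
            }
        }
        // Assumption: use 0 as the IDF when no document contains the term,
        // which avoids dividing by zero.
        double idf = docsWithTerm == 0
                ? 0.0
                : Math.log10((double) docs.size() / docsWithTerm);

        // Second pass: the term frequency is recomputed for each document,
        // so every document gets its own TF-IDF.
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            int total = countTokens(entry.getValue());
            double tf = total == 0
                    ? 0.0
                    : (double) countTerm(entry.getValue(), term) / total;
            result.put(entry.getKey(), tf * idf);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample documents standing in for the files on disk.
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("abc0.txt", "this is what it is");
        docs.put("abc1.txt", "nothing to see here");
        docs.put("abc2.txt", "is it");

        for (Map.Entry<String, Double> e : tfIdfPerDocument(docs, "is").entrySet()) {
            System.out.println(e.getKey() + " --> TFIDF " + e.getValue());
        }
    }
}
```

The point of the sketch is the second loop: the TF-IDF line sits inside a loop over all documents, and the two counts are recomputed there rather than reused from an earlier loop.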
The part that prints the TF-IDF needs to be moved into the for loop that iterates over all files, i.e.:

    for (int b = 0; b < array2.length; b++) {

        System.out.println(array2[b]
                + " --> This number of files that contain the term  "
                + numofDoc);
        double inverseTF = Math.log10((float) numDoc / numofDoc);
        System.out.println(array2[b] + " --> IDF " + inverseTF);
        double TFIDF = (((double) wordCount / totalCount) * inverseTF);
        System.out.println(array2[b] + " --> TFIDF " + TFIDF);
    }

Apart from the actual answer (the one Howard gave), you should pay more attention to naming. Using variables named "filename" and "fileName" (one of which is an int) is very confusing.