Unable to set the character encoding in java.util.Scanner

I use Apache Tika to get the encoding of a file:

            FileInputStream fis = new FileInputStream(my_file);
            final AutoDetectReader detector = new AutoDetectReader(fis);
            fis.close();
            System.out.println("Encoding:" + detector.getCharset().toString());
I then use a Scanner to read the values from the file:

                Scanner scanner = new Scanner(my_file, detector.getCharset().toString());
                Map<String, String> values = new HashMap<>();
                String line, key = null, value = null;
                while (scanner.hasNextLine()) {
                    line = scanner.nextLine();
                    if (line.contains(":")) {
                        if (key != null) {
                            values.put(key, value.trim());
                            key = null;
                            value = null;
                        }
                        int indexOfColon = line.indexOf(":");
                        key = line.substring(0, indexOfColon);
                        value = line.substring(indexOfColon + 1);
                    } else {
                        value += " " + line;
                    }
                }
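
Independent of the encoding question, two details of the loop above stand out: the last key/value pair is never stored once the loop ends, and a continuation line that appears before the first key is appended to a null value. The following is a minimal sketch of the same parsing logic with those two cases handled, not the original poster's code. The class name `KeyValueParser` and its method names are hypothetical. It reads through `Files.readAllLines` with an explicit `Charset` instead of `Scanner`, which throws an `IOException` on malformed input rather than silently returning nothing.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyValueParser {

    /** Parses "key: value" lines; a line without a colon continues the previous value. */
    public static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        String key = null;
        StringBuilder value = new StringBuilder();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon >= 0) {
                if (key != null) {
                    values.put(key, value.toString().trim()); // store the finished pair
                }
                key = line.substring(0, colon);
                value.setLength(0);
                value.append(line, colon + 1, line.length());
            } else if (key != null) {
                value.append(' ').append(line); // continuation of the current value
            }
            // lines before the first key are skipped
        }
        if (key != null) {
            values.put(key, value.toString().trim()); // do not lose the last pair
        }
        return values;
    }

    /** Reads the file with an explicit charset; malformed bytes raise an IOException. */
    public static Map<String, String> read(Path file, Charset charset) throws IOException {
        return parse(Files.readAllLines(file, charset));
    }
}
```

Assuming `my_file` is a `java.io.File` as in the question, a call could look like `KeyValueParser.read(my_file.toPath(), detector.getCharset())`.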

I would try reading characters instead of lines, using something like the following:

ByteArrayOutputStream line = new ByteArrayOutputStream();
Scanner scanner = new Scanner(my_file);
final int newline = '\n'; // line terminator to stop at

// Caveat: nextInt() parses whitespace-separated integer tokens, not raw bytes,
// so this loop only runs while the next token in the file is an integer.
while (scanner.hasNextInt()) {
    int c = 0;
    // read every line
    while (c != newline) {
        c = scanner.nextInt();
        line.write((byte) c);
    }
    byte[] array = line.toByteArray();
    String output = new String(array, "Windows-1252"); // decode with an explicit charset

    // We have a string here, do your logic

    line.reset();
}

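This approach is ugly, but it uses new String, which lets you specify a particular encoding. I have not tested or run this code at all, but at least it will show whether you are reading anything correctly.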
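
The idea in this answer (collect the raw bytes of a line, then decode them with `new String` and an explicit charset) can be sketched without `Scanner`, since `Scanner` exposes tokens rather than bytes. The version below is a hedged illustration, not the answerer's code. The class name `ByteLineDecoder` is hypothetical, it assumes the whole file fits in memory, and splitting on the byte `0x0A` is only valid for ASCII-compatible encodings such as Windows-1252, ISO-8859-1, or UTF-8 (not UTF-16).

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class ByteLineDecoder {

    /** Splits raw bytes at '\n' and decodes each line with the given charset. */
    public static List<String> decodeLines(byte[] data, Charset charset) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') {
                lines.add(decode(data, start, i, charset));
                start = i + 1;
            }
        }
        if (start < data.length) {
            // final line without a trailing newline
            lines.add(decode(data, start, data.length, charset));
        }
        return lines;
    }

    /** Decodes data[from, to), dropping a trailing '\r' from Windows line endings. */
    private static String decode(byte[] data, int from, int to, Charset charset) {
        if (to > from && data[to - 1] == '\r') {
            to--;
        }
        return new String(data, from, to - from, charset);
    }
}
```

Assuming `my_file` is a `java.io.File`, the bytes could come from `Files.readAllBytes(my_file.toPath())`, and the charset from `Charset.forName("windows-1252")` or from the detector in the question.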

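It has the same effect: the string is empty. I also tried: Scanner scanner = new Scanner(new FileInputStream(my_file), detector.getCharset().toString());
Ah, that's a pity! Does it have hasNextLine()?
No, but I replaced it with the hasNextLine() method.
I see, but I suggest checking whether the file has any content and whether .nextInt() works.
The file has content; I can open it in a text editor.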
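
The thread does not establish why the Scanner returned nothing. One thing that may be worth checking, offered as a guess and not a diagnosis: `Scanner` does not throw I/O errors from `hasNextLine()` or `nextLine()`. It treats them as end of input and exposes the most recent one through `ioException()`. The sketch below shows that check on an in-memory byte array; the class name `ScannerErrorCheck` and the method `swallowedError` are hypothetical, and the channel-based constructor is used only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.util.Scanner;

public class ScannerErrorCheck {

    /** Drains a Scanner over the bytes and returns the I/O error it swallowed, or null. */
    public static IOException swallowedError(byte[] data, String charsetName) {
        ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(data));
        try (Scanner scanner = new Scanner(channel, charsetName)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
            }
            // Non-null when the underlying reader failed, for example on bytes
            // that are malformed in the requested charset.
            return scanner.ioException();
        }
    }
}
```

With a file-based Scanner the same check is a call to `scanner.ioException()` after the read loop; a non-null result would point to a read or decoding failure instead of an empty file.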