Java: fetching a URL with the correct encoding

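I want to download the source of many web pages, write it to a file, and print it in the NetBeans console. I am having trouble with the encoding. First, have a look at my code: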

import java.io.*;
import java.net.URL;
import java.nio.charset.Charset;

public static final void foo(URL url, Charset encoding, String file) throws IOException {
    // try-with-resources flushes and closes both streams, even if reading fails
    try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), encoding));
         BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), encoding))) {
        String readLine;
        while ((readLine = in.readLine()) != null) {
            System.out.println(readLine + "\n");
            out.write(readLine + "\n");
        }
    }
}
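
I am testing this on two foreign websites (one Czech, one Thai).

I tried Charset.forName("UTF-8"), which seems to work for the Thai page but not for the Czech one. Both the console and the file contain question marks, e.g. �.

I also tried ISO-8859-2, which saves the file correctly, but the console shows small rectangles instead of the letters ž and š.

Is there a universal solution for multilingual websites (Czech, Japanese, Thai, and so on) that lets me save the text correctly to a file, print it to the console, or store it in a variable?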

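The problem is that there is no ultimate encoding. The most advanced encoding at the moment is probably UTF-8, but every site is still free to decide which encoding it uses. There is a fairly good article, well worth reading, that describes the basic problems of character encoding and a global solution to them.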
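
To illustrate why no single charset fits every page, here is a minimal sketch (not part of the original answer) that encodes a Czech letter as ISO-8859-2 and then decodes the same bytes as UTF-8. The class name `DecodeMismatchDemo` and the helpers `recode` and `codePoints` are made up for this example, and it assumes the JVM ships the ISO-8859-2 charset, which standard JDKs do but the Java specification does not guarantee.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DecodeMismatchDemo {

    /** Encodes text with one charset, then decodes the resulting bytes with another. */
    static String recode(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    /** Renders a string as code points so the result does not depend on the console. */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-2 is available on this JVM.
        Charset latin2 = Charset.forName("ISO-8859-2");
        String z = "\u017E"; // the Czech letter z with caron

        // Same charset on both sides: the letter survives.
        System.out.println(codePoints(recode(z, latin2, latin2)));                 // U+017E

        // ISO-8859-2 bytes read as UTF-8: the byte is malformed and gets replaced.
        System.out.println(codePoints(recode(z, latin2, StandardCharsets.UTF_8))); // U+FFFD
    }
}
```

The second line is the replacement character U+FFFD, which is the � symbol reported in the question when the Czech page is read as UTF-8.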

So the best solution is to get the HTML page's encoding like this:

public static final void foo(URL url, String file) throws IOException {
  // No charset is passed here, so the reader decodes with the default charset
  InputStreamReader isr = new InputStreamReader(url.openStream());
  // getEncoding() reports the name of the charset this reader is using
  String encoding = isr.getEncoding(); //if you actually need it, which I don't suppose
  try (BufferedReader in = new BufferedReader(isr);
       BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), encoding))) {
    String readLine;
    while ((readLine = in.readLine()) != null) {
      System.out.println(readLine + "\n");
      out.write(readLine + "\n");
    }
  }
}

This should work fine.

Hmm, okay, then I really don't know what to do. Can you tell me which website it fails on?
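
One possible refinement, offered as a sketch rather than as the answerer's method: `InputStreamReader.getEncoding()` reports the charset the reader itself decodes with, so when no charset is passed it returns the default charset, not the one the page declares. A server can declare the page's charset in the HTTP `Content-Type` header, which `URLConnection.getContentType()` exposes. The class `DeclaredCharset` and its methods `fromContentType` and `open` are hypothetical names for this example. The sketch assumes the header carries a `charset=` parameter and does not look at `<meta>` tags inside the HTML.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DeclaredCharset {

    /** Returns the charset named in a Content-Type value, or the fallback if none is usable. */
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the "charset=" parameter name
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback
                    return fallback;
                }
            }
        }
        return fallback;
    }

    /** Opens a reader that decodes with the charset declared by the server (UTF-8 if none). */
    static BufferedReader open(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        Charset cs = fromContentType(conn.getContentType(), StandardCharsets.UTF_8);
        return new BufferedReader(new InputStreamReader(conn.getInputStream(), cs));
    }
}
```

With a reader obtained from `open(url)`, the text is decoded once with the page's own charset, and the output file can then be written in a single charset such as UTF-8 regardless of the source page. Whether the console displays the characters correctly is a separate matter that depends on the console's own encoding and font.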