Guava Resources.readLines() for Zip/Gzip files


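I've found Resources.readLines() and Files.readLines() very helpful for simplifying my code.
The problem is that I often read gzip-compressed txt files, or txt files inside zip archives, from URLs (HTTP and FTP).

Is there a way to use Guava's methods to read from these URLs too? Or is that only possible with Java's GZIPInputStream/ZipInputStream?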

You can create your own ByteSources:

For GZip:

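import com.google.common.io.ByteSource;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class GzippedByteSource extends ByteSource {
  private final ByteSource source;
  public GzippedByteSource(ByteSource gzippedSource) { source = gzippedSource; }
  @Override public InputStream openStream() throws IOException {
    // Wrap the raw stream so the bytes are decompressed as they are read.
    return new GZIPInputStream(source.openStream());
  }
}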

Then use it:

Charset charset = ... ;
new GzippedByteSource(Resources.asByteSource(url)).toCharSource(charset).readLines();
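
For comparison, here is a minimal plain-JDK sketch of the same idea without Guava, which is roughly what the question's "only GZIPInputStream" alternative would look like. The class name GzipLines and the gzip() helper are made up for illustration; the demo compresses a string in memory so it runs without a network connection. For a real URL you would presumably pass url.openStream() as the first argument.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the name GzipLines is not from the original answer.
public class GzipLines {

    // Reads all lines from a gzip-compressed stream and closes the stream afterwards.
    public static List<String> readLines(InputStream gzipped, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // Declaring the raw stream first ensures it is closed even if the gzip header is invalid.
        try (InputStream raw = gzipped;
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(new GZIPInputStream(raw), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Demo helper: gzip-compresses a string in memory.
    public static byte[] gzip(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(charset));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = gzip("first\nsecond\n", StandardCharsets.UTF_8);
        List<String> lines = readLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        System.out.println(lines); // prints [first, second]
    }
}
```

The ByteSource version above is shorter at the call site and composes with the rest of Guava's I/O utilities; this sketch only shows that the decompression step itself is a single wrapper stream.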

Below is an implementation for Zip. It assumes that you read only one entry.

public static class ZipEntryByteSource extends ByteSource {
  private final ByteSource source;
  private final String entryName;
  public ZipEntryByteSource(ByteSource zipSource, String entryName) {
    this.source = zipSource;
    this.entryName = entryName;
  }
  @Override public InputStream openStream() throws IOException {
    final ZipInputStream in = new ZipInputStream(source.openStream());
    while (true) {
      final ZipEntry entry = in.getNextEntry();
      if (entry == null) {
        in.close();
        throw new IOException("No entry named " + entryName);
      } else if (entry.getName().equals(this.entryName)) {
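        // Expose only the matched entry; ZipInputStream reports EOF at the entry's end.
        return new InputStream() {
          @Override
          public int read() throws IOException {
            return in.read();
          }

          // Delegate bulk reads too; the inherited version would call read() once per byte.
          @Override
          public int read(byte[] b, int off, int len) throws IOException {
            return in.read(b, off, len);
          }

          @Override
          public void close() throws IOException {
            in.closeEntry();
            in.close();
          }
        };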
      } else {
        in.closeEntry();
      }
    }
  }
}

You can use it like this:

Charset charset = ... ;
String entryName = ... ; // Name of the entry inside the zip file.
new ZipEntryByteSource(Resources.asByteSource(url), entryName).toCharSource(charset).readLines();
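
If you'd rather not depend on Guava for this, the single-entry lookup can be sketched with the JDK alone. This is only an illustration under my own assumptions: the class name ZipEntryLines, the zip() demo helper and the choice of FileNotFoundException for a missing entry are not from the answer above. The demo builds a tiny archive in memory instead of fetching a URL.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class; the name ZipEntryLines is not from the original answer.
public class ZipEntryLines {

    // Reads the lines of the single entry called entryName, then closes the stream.
    public static List<String> readLines(InputStream zipped, String entryName, Charset charset)
            throws IOException {
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (!entry.getName().equals(entryName)) {
                    continue; // getNextEntry() skips the rest of this entry for us
                }
                // ZipInputStream signals end-of-stream at the end of the current entry.
                BufferedReader reader = new BufferedReader(new InputStreamReader(zip, charset));
                List<String> lines = new ArrayList<>();
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
                return lines;
            }
        }
        throw new FileNotFoundException("No entry named " + entryName);
    }

    // Demo helper: builds a small zip archive in memory from name -> content pairs.
    public static byte[] zip(Map<String, String> entries, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> e : entries.entrySet()) {
                out.putNextEntry(new ZipEntry(e.getKey()));
                out.write(e.getValue().getBytes(charset));
                out.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("a.txt", "alpha\n");
        entries.put("b.txt", "beta\ngamma\n");
        byte[] data = zip(entries, StandardCharsets.UTF_8);
        System.out.println(
                readLines(new ByteArrayInputStream(data), "b.txt", StandardCharsets.UTF_8));
        // prints [beta, gamma]
    }
}
```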

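As Olivier Grégoire said, in order to use Guava's readLines functions you can create the necessary ByteSources for whatever compression scheme you need.

For zip archives, though, while it can be done, I don't think it's worth it. It is easier to write your own readLines method that iterates over the zip entries and reads each entry's lines itself. Here is a class that demonstrates how to read and print the lines of a URL pointing to a zip archive:
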
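import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ReadLinesOfZippedUrl {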
    public static List<String> readLines(String urlStr, Charset charset) {
        List<String> retVal = new LinkedList<>();
        try (ZipInputStream zipInputStream = new ZipInputStream(new URL(urlStr).openStream())) {
            for (ZipEntry zipEntry = zipInputStream.getNextEntry(); zipEntry != null; zipEntry = zipInputStream.getNextEntry()) {
                // don't close this reader or you'll close the underlying zip stream
                BufferedReader reader = new BufferedReader(new InputStreamReader(zipInputStream, charset));
                retVal.addAll(reader.lines().collect(Collectors.toList())); // slurp all the lines from one entry
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return retVal;
    }

    public static void main(String[] args) {
        String urlStr = "http://central.maven.org/maven2/com/google/guava/guava/18.0/guava-18.0-sources.jar";
        Charset charset = StandardCharsets.UTF_8;
        List<String> lines = readLines(urlStr, charset);
        lines.forEach(System.out::println);
    }
}
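
Note that the readLines method above merges the lines of every entry into one list. If you need to know which file each line came from, one possible variant (just a sketch, with class and method names I made up) keeps a separate list per entry and skips directory entries. The demo uses an in-memory archive rather than a URL so that it runs offline.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical variant; the class and method names are illustrative only.
public class ReadLinesPerZipEntry {

    // Returns the lines of every non-directory entry, keyed by entry name in archive order.
    public static Map<String, List<String>> readLinesByEntry(InputStream zipped, Charset charset)
            throws IOException {
        Map<String, List<String>> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Not closed on purpose: closing the reader would close the zip stream.
                BufferedReader reader = new BufferedReader(new InputStreamReader(zip, charset));
                result.put(entry.getName(), reader.lines().collect(Collectors.toList()));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            out.putNextEntry(new ZipEntry("a.txt"));
            out.write("one\ntwo\n".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
            out.putNextEntry(new ZipEntry("b.txt"));
            out.write("three\n".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        Map<String, List<String>> lines = readLinesByEntry(
                new ByteArrayInputStream(bytes.toByteArray()), StandardCharsets.UTF_8);
        System.out.println(lines); // prints {a.txt=[one, two], b.txt=[three]}
    }
}
```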

If you're using Java 8, you can use BufferedReader#lines().

Ping! I've added a ByteSource for Zip to my answer.

GzipInputStream should be GZIPInputStream.