Java: compressing a directory to tar.gz with Commons Compress


I'm having trouble creating a tar.gz of a directory with the Commons Compress library. My directory structure looks like this:

parent/
    child/
        file1.raw
        fileN.raw
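I compress it with the code below. It runs without any exceptions. However, when I decompress the tar.gz I get a single file named "childDirToCompress". Its size is correct, so the files have clearly been appended to one another during the tar step. The desired output is a directory. I can't figure out what I'm doing wrong. Can any Commons Compress expert set me on the right path?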

public void CreateTarGZ() throws CompressorException, FileNotFoundException, ArchiveException, IOException {
            File f = new File("parent");
            File f2 = new File("parent/childDirToCompress");

            File outFile = new File(f2.getAbsolutePath() + ".tar.gz");
            if(!outFile.exists()){
                outFile.createNewFile();
            }
            FileOutputStream fos = new FileOutputStream(outFile);

            TarArchiveOutputStream taos = new TarArchiveOutputStream(new GZIPOutputStream(new BufferedOutputStream(fos)));
            taos.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_STAR); 
            taos.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
            addFilesToCompression(taos, f2, ".");
            taos.close();

        }

        private static void addFilesToCompression(TarArchiveOutputStream taos, File file, String dir) throws IOException{
            taos.putArchiveEntry(new TarArchiveEntry(file, dir));

            if (file.isFile()) {
                BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
                IOUtils.copy(bis, taos);
                taos.closeArchiveEntry();
                bis.close();
            }

            else if(file.isDirectory()) {
                taos.closeArchiveEntry();
                for (File childFile : file.listFiles()) {
                    addFilesToCompression(taos, childFile, file.getName());

                }
            }
        }

I haven't worked out exactly what was going wrong, but after digging through Google's cache I found a working example. Sorry for the tumbleweed!

public void CreateTarGZ()
    throws FileNotFoundException, IOException
{
    try {
        System.out.println(new File(".").getAbsolutePath());
        dirPath = "parent/childDirToCompress/";
        tarGzPath = "archive.tar.gz";
        fOut = new FileOutputStream(new File(tarGzPath));
        bOut = new BufferedOutputStream(fOut);
        gzOut = new GzipCompressorOutputStream(bOut);
        tOut = new TarArchiveOutputStream(gzOut);
        addFileToTarGz(tOut, dirPath, "");
    } finally {
        tOut.finish();
        tOut.close();
        gzOut.close();
        bOut.close();
        fOut.close();
    }
}

private void addFileToTarGz(TarArchiveOutputStream tOut, String path, String base)
    throws IOException
{
    File f = new File(path);
    System.out.println(f.exists());
    String entryName = base + f.getName();
    TarArchiveEntry tarEntry = new TarArchiveEntry(f, entryName);
    tOut.putArchiveEntry(tarEntry);

    if (f.isFile()) {
        IOUtils.copy(new FileInputStream(f), tOut);
        tOut.closeArchiveEntry();
    } else {
        tOut.closeArchiveEntry();
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                System.out.println(child.getName());
                addFileToTarGz(tOut, child.getAbsolutePath(), entryName + "/");
            }
        }
    }
}
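
A possible explanation for the single file in the question, as far as I can tell: the second argument of TarArchiveEntry(File, String) is the name stored in the archive, and the question's recursion passes the parent directory's name for every child, so all files end up under the same entry name. The sketch below uses only the JDK (no Commons Compress) to trace the names each scheme would produce. EntryNameTrace and its method names are made up for illustration.

```java
import java.io.File;
import java.util.List;

final class EntryNameTrace {

    // Mirrors the question's recursion: the entry name is the 'dir' argument,
    // and each child receives its parent's name instead of its own path.
    static void namesAsInQuestion(File file, String dir, List<String> out) {
        out.add(dir);
        File[] children = file.listFiles(); // null for regular files
        if (children != null) {
            for (File child : children) {
                namesAsInQuestion(child, file.getName(), out);
            }
        }
    }

    // Mirrors the working example: entry name = parent's entry name + own name.
    static void namesWithOwnName(File file, String base, List<String> out) {
        String entryName = base + file.getName();
        out.add(entryName);
        File[] children = file.listFiles(); // null for regular files
        if (children != null) {
            for (File child : children) {
                namesWithOwnName(child, entryName + "/", out);
            }
        }
    }
}
```

For the layout in the question, namesAsInQuestion(child, ".", out) yields ".", "childDirToCompress", "childDirToCompress", while namesWithOwnName(child, "", out) yields "childDirToCompress" plus "childDirToCompress/file1.raw" and "childDirToCompress/fileN.raw" (children in whatever order listFiles returns them).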

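I followed this solution and it worked until I processed a larger set of files; it then crashed at random after 15000-16000 files. The following line leaks file handles: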

IOUtils.copy(new FileInputStream(f), tOut);
The code crashed with a "Too many open files" error at the OS level. The following small change fixed the problem:

FileInputStream in = new FileInputStream(f);
IOUtils.copy(in, tOut);
in.close();
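
The same fix can also be written with try-with-resources, which closes the stream even when the copy throws. A minimal sketch using only the JDK; SafeCopy and copyInto are hypothetical names, and transferTo assumes Java 9 or later:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class SafeCopy {

    // Copies the file's bytes into 'out' and returns the number of bytes copied.
    // The input stream is closed when the try block exits, normally or not;
    // 'out' is left open so the caller can keep writing to it.
    static long copyInto(Path file, OutputStream out) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return in.transferTo(out); // Java 9+
        }
    }
}
```

Inside addFileToTarGz this would stand in for the IOUtils.copy line, for example SafeCopy.copyInto(f.toPath(), tOut).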

I ended up doing the following:

public URL createTarGzip() throws IOException {
    Path inputDirectoryPath = ...
    File outputFile = new File("/path/to/filename.tar.gz");

    try (FileOutputStream fileOutputStream = new FileOutputStream(outputFile);
            BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
            GzipCompressorOutputStream gzipOutputStream = new GzipCompressorOutputStream(bufferedOutputStream);
            TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(gzipOutputStream)) {

        tarArchiveOutputStream.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX);
        tarArchiveOutputStream.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

        List<File> files = new ArrayList<>(FileUtils.listFiles(
                inputDirectoryPath,
                new RegexFileFilter("^(.*?)"),
                DirectoryFileFilter.DIRECTORY
        ));

        for (int i = 0; i < files.size(); i++) {
            File currentFile = files.get(i);

            String relativeFilePath = new File(inputDirectoryPath.toUri()).toURI().relativize(
                    new File(currentFile.getAbsolutePath()).toURI()).getPath();

            TarArchiveEntry tarEntry = new TarArchiveEntry(currentFile, relativeFilePath);
            tarEntry.setSize(currentFile.length());

            tarArchiveOutputStream.putArchiveEntry(tarEntry);
            // Stream the file into the archive and close the input afterwards
            try (FileInputStream in = new FileInputStream(currentFile)) {
                IOUtils.copy(in, tarArchiveOutputStream);
            }
            tarArchiveOutputStream.closeArchiveEntry();
        }
        tarArchiveOutputStream.close();
        return outputFile.toURI().toURL();
    }
}

This handles some edge cases that come up in the other solutions.

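I had to make some adjustments to @merrick's solution to get it to work, related to the path handling. Perhaps that is down to the latest Maven dependencies. The currently accepted solution did not work for me.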

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import org.apache.commons.io.filefilter.DirectoryFileFilter;
import org.apache.commons.io.filefilter.RegexFileFilter;

public class TAR {

    public static void CreateTarGZ(String inputDirectoryPath, String outputPath) throws IOException {

        File inputFile = new File(inputDirectoryPath);
        File outputFile = new File(outputPath);

        try (FileOutputStream fileOutputStream = new FileOutputStream(outputFile);
                BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
                GzipCompressorOutputStream gzipOutputStream = new GzipCompressorOutputStream(bufferedOutputStream);
                TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(gzipOutputStream)) {

            tarArchiveOutputStream.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX);
            tarArchiveOutputStream.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

            List<File> files = new ArrayList<>(FileUtils.listFiles(
                    inputFile,
                    new RegexFileFilter("^(.*?)"),
                    DirectoryFileFilter.DIRECTORY
            ));

            for (int i = 0; i < files.size(); i++) {
                File currentFile = files.get(i);

                String relativeFilePath = inputFile.toURI().relativize(
                        new File(currentFile.getAbsolutePath()).toURI()).getPath();

                TarArchiveEntry tarEntry = new TarArchiveEntry(currentFile, relativeFilePath);
                tarEntry.setSize(currentFile.length());

                tarArchiveOutputStream.putArchiveEntry(tarEntry);
                // Stream the file into the archive and close the input afterwards
                try (FileInputStream in = new FileInputStream(currentFile)) {
                    IOUtils.copy(in, tarArchiveOutputStream);
                }
                tarArchiveOutputStream.closeArchiveEntry();
            }
            tarArchiveOutputStream.close();
        }
    }
}
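
To see what the relativize call produces as an entry name, here is a small JDK-only sketch. RelativeNames is a hypothetical helper, not part of the answer. It assumes root is an existing directory, because File.toURI() only appends the trailing slash for directories that exist.

```java
import java.io.File;

final class RelativeNames {

    // Same idea as the answer: relativize the file's URI against the root's URI.
    static String viaUri(File root, File file) {
        return root.toURI().relativize(file.toURI()).getPath();
    }

    // Alternative via java.nio.file.Path; platform separators are mapped to '/'.
    static String viaPath(File root, File file) {
        return root.toPath().relativize(file.toPath()).toString()
                .replace(File.separatorChar, '/');
    }
}
```

For a file at root/sub/a.txt both methods return "sub/a.txt", which is the name the archive entry would carry.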
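Maven

<dependencies>
    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.6</version>
    </dependency>
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-compress</artifactId>
        <version>1.18</version>
    </dependency>
</dependencies>
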
Something I use (via the Files.walk API); you can chain the calls as gzip(tar(yourFile)).

See the Apache Commons Compress and file walker example below.

This example creates a tar.gz from a directory:

public static void createTarGzipFolder(Path source) throws IOException {

        if (!Files.isDirectory(source)) {
            throw new IOException("Please provide a directory.");
        }

        // use the folder name as the tar.gz file name
        String tarFileName = source.getFileName().toString() + ".tar.gz";

        try (OutputStream fOut = Files.newOutputStream(Paths.get(tarFileName));
             BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
             GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
             TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {

            Files.walkFileTree(source, new SimpleFileVisitor<>() {

                @Override
                public FileVisitResult visitFile(Path file,
                                            BasicFileAttributes attributes) {

                    // only copy files, no symbolic links
                    if (attributes.isSymbolicLink()) {
                        return FileVisitResult.CONTINUE;
                    }

                    // get filename
                    Path targetFile = source.relativize(file);

                    try {
                        TarArchiveEntry tarEntry = new TarArchiveEntry(
                                file.toFile(), targetFile.toString());

                        tOut.putArchiveEntry(tarEntry);

                        Files.copy(file, tOut);

                        tOut.closeArchiveEntry();

                        System.out.printf("file : %s%n", file);

                    } catch (IOException e) {
                        System.err.printf("Unable to tar.gz : %s%n%s%n", file, e);
                    }

                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException exc) {
                    System.err.printf("Unable to tar.gz : %s%n%s%n", file, exc);
                    return FileVisitResult.CONTINUE;
                }

            });

            tOut.finish();
        }

    }
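
One thing to be aware of, as far as I can tell: the visitor above only overrides visitFile, so directories themselves never get an entry and empty directories would be missing from the archive. Below is a JDK-only sketch of collecting names for both files and directories, under the assumption that directory entries should end in "/". TreeEntryNames is a hypothetical helper and does not write any archive.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

final class TreeEntryNames {

    // Walks 'source' and returns archive-style names relative to it:
    // directories end in "/", regular files do not, symlinks are skipped.
    static List<String> list(Path source) throws IOException {
        List<String> names = new ArrayList<>();
        Files.walkFileTree(source, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                if (!dir.equals(source)) { // the root itself gets no entry
                    names.add(toEntryName(source, dir) + "/");
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    names.add(toEntryName(source, file));
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return names;
    }

    // Relative path with '/' separators regardless of platform.
    static String toEntryName(Path source, Path p) {
        return source.relativize(p).toString().replace(File.separatorChar, '/');
    }
}
```

Each returned name could then be passed as the second argument of a TarArchiveEntry, with file content copied only for the names that do not end in "/".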
For future reference, the gzip and tar helpers for the gzip(tar(yourFile)) chain mentioned above:

    public static File gzip(File fileToCompress) throws IOException {
    
        final File gzipFile = new File(fileToCompress.toPath().getParent().toFile(),
                fileToCompress.getName() + ".gz");
    
        final byte[] buffer = new byte[1024];
    
        try (FileInputStream in = new FileInputStream(fileToCompress);
                GZIPOutputStream out = new GZIPOutputStream(
                        new FileOutputStream(gzipFile))) {
    
            int len;
            while ((len = in.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
        }
    
        return gzipFile;
    }
    
    public static File tar(File folderToCompress) throws IOException, ArchiveException {
    
        final File tarFile = Files.createTempFile(null, ".tar").toFile();
    
        try (TarArchiveOutputStream out = (TarArchiveOutputStream) new ArchiveStreamFactory()
                .createArchiveOutputStream(ArchiveStreamFactory.TAR,
                        new FileOutputStream(tarFile))) {
    
            out.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
    
            Files.walk(folderToCompress.toPath()) //
                    .forEach(source -> {
    
                        if (source.toFile().isFile()) {
                            final String relatifSourcePath = StringUtils.substringAfter(
                                    source.toString(), folderToCompress.getPath());
    
                            final TarArchiveEntry entry = new TarArchiveEntry(
                                    source.toFile(), relatifSourcePath);
    
                            try (InputStream in = new FileInputStream(source.toFile())){
                                out.putArchiveEntry(entry);
    
                                IOUtils.copy(in, out);
    
                                out.closeArchiveEntry();
                            }
                            catch (IOException e) {
                                // Handle this better than below...
                                throw new RuntimeException(e);
                            }
                        }
                    });
    
        }
    
        return tarFile;
    }
    
    public static void decompressTarGzipFile(Path source, Path target)
            throws IOException {
    
            if (Files.notExists(source)) {
                throw new IOException("File doesn't exist!");
            }
    
            try (InputStream fi = Files.newInputStream(source);
                 BufferedInputStream bi = new BufferedInputStream(fi);
                 GzipCompressorInputStream gzi = new GzipCompressorInputStream(bi);
                 TarArchiveInputStream ti = new TarArchiveInputStream(gzi)) {
    
                ArchiveEntry entry;
                while ((entry = ti.getNextEntry()) != null) {
    
                    Path newPath = zipSlipProtect(entry, target);
    
                    if (entry.isDirectory()) {
                        Files.createDirectories(newPath);
                    } else {
    
                        // check parent folder again
                        Path parent = newPath.getParent();
                        if (parent != null) {
                            if (Files.notExists(parent)) {
                                Files.createDirectories(parent);
                            }
                        }
    
                        // copy TarArchiveInputStream to Path newPath
                        Files.copy(ti, newPath, StandardCopyOption.REPLACE_EXISTING);
    
                    }
                }
            }
        }
    
        private static Path zipSlipProtect(ArchiveEntry entry, Path targetDir)
            throws IOException {
    
            Path targetDirResolved = targetDir.resolve(entry.getName());
    
            Path normalizePath = targetDirResolved.normalize();
    
            if (!normalizePath.startsWith(targetDir)) {
                throw new IOException("Bad entry: " + entry.getName());
            }
    
            return normalizePath;
        }
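
A note on the check above: startsWith compares path elements, so it may reject legitimate entries if targetDir itself is relative to "." or contains "..". Here is a hedged variant that normalizes both sides first, written against a plain entry name so it runs without Commons Compress. ZipSlipCheck and resolveInside are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Path;

final class ZipSlipCheck {

    // Resolves 'entryName' under 'targetDir' and rejects names that would
    // escape it (for example "../evil.txt"). Both sides are made absolute
    // and normalized before the comparison.
    static Path resolveInside(Path targetDir, String entryName) throws IOException {
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IOException("Bad entry: " + entryName);
        }
        return resolved;
    }
}
```

In the decompress loop this would be called as ZipSlipCheck.resolveInside(target, entry.getName()) in place of zipSlipProtect(entry, target).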