Java: compressing a directory to tar.gz with Commons Compress
I'm running into a problem using the Commons Compress library to create a tar.gz of a directory. I have a directory structure like this:
parent/
    child/
        file1.raw
        fileN.raw
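I use the code below to compress it. It runs without any exceptions. However, when I extract the tar.gz, I get a single file named "childDirToCompress". Its size is correct, so the files have clearly been appended to one another during tarring. The desired output is a directory. I can't figure out what I'm doing wrong. Can any Commons Compress expert set me on the right path?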
public void CreateTarGZ() throws CompressorException, FileNotFoundException, ArchiveException, IOException {
    File f = new File("parent");
    File f2 = new File("parent/childDirToCompress");
    File outFile = new File(f2.getAbsolutePath() + ".tar.gz");
    if (!outFile.exists()) {
        outFile.createNewFile();
    }
    FileOutputStream fos = new FileOutputStream(outFile);
    TarArchiveOutputStream taos = new TarArchiveOutputStream(new GZIPOutputStream(new BufferedOutputStream(fos)));
    taos.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_STAR);
    taos.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
    addFilesToCompression(taos, f2, ".");
    taos.close();
}

private static void addFilesToCompression(TarArchiveOutputStream taos, File file, String dir) throws IOException {
    taos.putArchiveEntry(new TarArchiveEntry(file, dir));
    if (file.isFile()) {
        BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
        IOUtils.copy(bis, taos);
        taos.closeArchiveEntry();
        bis.close();
    }
    else if (file.isDirectory()) {
        taos.closeArchiveEntry();
        for (File childFile : file.listFiles()) {
            addFilesToCompression(taos, childFile, file.getName());
        }
    }
}
I never worked out exactly what was going wrong, but a trawl through Google's cache turned up a working example. Sorry for the tumbleweed!
public void CreateTarGZ()
        throws FileNotFoundException, IOException
{
    System.out.println(new File(".").getAbsolutePath());
    String dirPath = "parent/childDirToCompress/";
    String tarGzPath = "archive.tar.gz";
    // try-with-resources closes the streams in reverse order, even if an exception is thrown
    try (FileOutputStream fOut = new FileOutputStream(new File(tarGzPath));
         BufferedOutputStream bOut = new BufferedOutputStream(fOut);
         GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(bOut);
         TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {
        addFileToTarGz(tOut, dirPath, "");
        tOut.finish();
    }
}

private void addFileToTarGz(TarArchiveOutputStream tOut, String path, String base)
        throws IOException
{
    File f = new File(path);
    System.out.println(f.exists());
    // Entry name is the parent's entry name plus this file's own name
    String entryName = base + f.getName();
    TarArchiveEntry tarEntry = new TarArchiveEntry(f, entryName);
    tOut.putArchiveEntry(tarEntry);
    if (f.isFile()) {
        IOUtils.copy(new FileInputStream(f), tOut);
        tOut.closeArchiveEntry();
    } else {
        tOut.closeArchiveEntry();
        File[] children = f.listFiles();
        if (children != null) {
            for (File child : children) {
                System.out.println(child.getName());
                addFileToTarGz(tOut, child.getAbsolutePath(), entryName + "/");
            }
        }
    }
}
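As far as I can tell, what makes this version produce a proper directory tree is the naming: each entry is called base + f.getName(), and a directory passes entryName + "/" down as the base for its children. Here is a minimal JDK-only sketch of just that naming scheme. EntryNames and collect are hypothetical names, nothing is written to an archive, and the sort is only there to make the result predictable.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EntryNames {
    // Collects the entry names that the recursive walk would assign,
    // using the same "base + name" rule as addFileToTarGz.
    static List<String> collect(File f, String base) {
        List<String> names = new ArrayList<>();
        String entryName = base + f.getName();
        names.add(entryName);
        File[] children = f.listFiles(); // null when f is not a directory
        if (children != null) {
            Arrays.sort(children); // deterministic order, for illustration only
            for (File child : children) {
                names.addAll(collect(child, entryName + "/"));
            }
        }
        return names;
    }
}
```

Assuming parent/child exists on disk with the two files from the question, collect(new File("parent/child"), "") would yield child, child/file1.raw and child/fileN.raw.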
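I followed this solution and it worked until I processed a larger set of files, when it crashed at random after processing 15,000-16,000 files. The following line leaks file handles:
IOUtils.copy(new FileInputStream(f), tOut);
The code crashed at the OS level with a "Too many open files" error.
The following small change fixed the problem: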
FileInputStream in = new FileInputStream(f);
IOUtils.copy(in, tOut);
in.close();
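A slightly more robust variant of the same fix, assuming Java 9 or later for InputStream.transferTo: try-with-resources closes the stream even when the copy throws, whereas the in.close() above is skipped in that case. SafeCopy and copyInto are hypothetical names, and the sketch uses only the JDK rather than IOUtils.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class SafeCopy {
    // Copies one file into an already-open output stream.
    // The input stream is closed on both the normal and the exceptional path.
    static long copyInto(Path file, OutputStream out) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return in.transferTo(out); // returns the number of bytes copied
        }
    }
}
```

The output stream is deliberately left open, since in the tar case it is the shared TarArchiveOutputStream that later entries still need.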
I ended up doing the following:
public URL createTarGzip() throws IOException {
    Path inputDirectoryPath = ...
    File outputFile = new File("/path/to/filename.tar.gz");

    try (FileOutputStream fileOutputStream = new FileOutputStream(outputFile);
         BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
         GzipCompressorOutputStream gzipOutputStream = new GzipCompressorOutputStream(bufferedOutputStream);
         TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(gzipOutputStream)) {

        tarArchiveOutputStream.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX);
        tarArchiveOutputStream.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

        // FileUtils.listFiles takes a File, so convert the Path; the filters select every file, recursively
        List<File> files = new ArrayList<>(FileUtils.listFiles(
                inputDirectoryPath.toFile(),
                new RegexFileFilter("^(.*?)"),
                DirectoryFileFilter.DIRECTORY
        ));

        for (int i = 0; i < files.size(); i++) {
            File currentFile = files.get(i);
            // Entry name relative to the input directory
            String relativeFilePath = new File(inputDirectoryPath.toUri()).toURI().relativize(
                    new File(currentFile.getAbsolutePath()).toURI()).getPath();

            TarArchiveEntry tarEntry = new TarArchiveEntry(currentFile, relativeFilePath);
            tarEntry.setSize(currentFile.length());

            tarArchiveOutputStream.putArchiveEntry(tarEntry);
            // Close each input stream after copying to avoid leaking file handles
            try (FileInputStream in = new FileInputStream(currentFile)) {
                IOUtils.copy(in, tarArchiveOutputStream);
            }
            tarArchiveOutputStream.closeArchiveEntry();
        }
        tarArchiveOutputStream.close();
        return outputFile.toURI().toURL();
    }
}
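This resolves some of the edge cases that come up in the other solutions.
I had to make a few adjustments to @merrick's solution to get it working with the path. Perhaps it is down to the latest Maven dependencies. The currently accepted solution didn't work for me.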
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import org.apache.commons.io.filefilter.DirectoryFileFilter;
import org.apache.commons.io.filefilter.RegexFileFilter;

public class TAR {

    public static void CreateTarGZ(String inputDirectoryPath, String outputPath) throws IOException {
        File inputFile = new File(inputDirectoryPath);
        File outputFile = new File(outputPath);

        try (FileOutputStream fileOutputStream = new FileOutputStream(outputFile);
             BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
             GzipCompressorOutputStream gzipOutputStream = new GzipCompressorOutputStream(bufferedOutputStream);
             TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(gzipOutputStream)) {

            tarArchiveOutputStream.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX);
            tarArchiveOutputStream.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

            // The filters select every file under inputFile, recursively
            List<File> files = new ArrayList<>(FileUtils.listFiles(
                    inputFile,
                    new RegexFileFilter("^(.*?)"),
                    DirectoryFileFilter.DIRECTORY
            ));

            for (int i = 0; i < files.size(); i++) {
                File currentFile = files.get(i);
                // Entry name relative to the input directory
                String relativeFilePath = inputFile.toURI().relativize(
                        new File(currentFile.getAbsolutePath()).toURI()).getPath();

                TarArchiveEntry tarEntry = new TarArchiveEntry(currentFile, relativeFilePath);
                tarEntry.setSize(currentFile.length());

                tarArchiveOutputStream.putArchiveEntry(tarEntry);
                // Close each input stream after copying to avoid leaking file handles
                try (FileInputStream in = new FileInputStream(currentFile)) {
                    IOUtils.copy(in, tarArchiveOutputStream);
                }
                tarArchiveOutputStream.closeArchiveEntry();
            }
            tarArchiveOutputStream.close();
        }
    }
}
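Maven:
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.6</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-compress</artifactId>
    <version>1.18</version>
</dependency>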
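The relative entry names in CreateTarGZ come from URI.relativize. A tiny JDK-only sketch of that one step, in case it helps to see it in isolation (UriRelativize and relativeName are hypothetical names):

```java
import java.io.File;

final class UriRelativize {
    // Name of 'file' relative to 'baseDir', with '/' separators on every platform.
    // If 'file' is not under 'baseDir', relativize returns the file's URI unchanged,
    // so the result is then the file's absolute path.
    static String relativeName(File baseDir, File file) {
        return baseDir.toURI().relativize(file.toURI()).getPath();
    }
}
```

Because the result goes through a URI, it uses forward slashes regardless of the operating system, which is the separator tar entry names use.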
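Something I use (via the Files.walk API) lets you chain the calls as gzip(tar(yourFile)); those two helpers appear further down.
See the Apache Commons Compress and file walker example below. This example compresses a directory to tar.gz: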
public static void createTarGzipFolder(Path source) throws IOException {

    if (!Files.isDirectory(source)) {
        throw new IOException("Please provide a directory.");
    }

    // get folder name as zip file name
    String tarFileName = source.getFileName().toString() + ".tar.gz";

    try (OutputStream fOut = Files.newOutputStream(Paths.get(tarFileName));
         BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
         GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
         TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {

        Files.walkFileTree(source, new SimpleFileVisitor<>() {

            @Override
            public FileVisitResult visitFile(Path file,
                                             BasicFileAttributes attributes) {

                // only copy files, no symbolic links
                if (attributes.isSymbolicLink()) {
                    return FileVisitResult.CONTINUE;
                }

                // get filename
                Path targetFile = source.relativize(file);

                try {
                    TarArchiveEntry tarEntry = new TarArchiveEntry(
                            file.toFile(), targetFile.toString());

                    tOut.putArchiveEntry(tarEntry);
                    Files.copy(file, tOut);
                    tOut.closeArchiveEntry();

                    System.out.printf("file : %s%n", file);
                } catch (IOException e) {
                    System.err.printf("Unable to tar.gz : %s%n%s%n", file, e);
                }

                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                System.err.printf("Unable to tar.gz : %s%n%s%n", file, exc);
                return FileVisitResult.CONTINUE;
            }
        });

        tOut.finish();
    }
}
For future reference, the gzip and tar helpers, chained as gzip(tar(yourFile)):
public static File gzip(File fileToCompress) throws IOException {

    final File gzipFile = new File(fileToCompress.toPath().getParent().toFile(),
            fileToCompress.getName() + ".gz");

    final byte[] buffer = new byte[1024];

    try (FileInputStream in = new FileInputStream(fileToCompress);
         GZIPOutputStream out = new GZIPOutputStream(
                 new FileOutputStream(gzipFile))) {

        int len;
        while ((len = in.read(buffer)) > 0) {
            out.write(buffer, 0, len);
        }
    }

    return gzipFile;
}

public static File tar(File folderToCompress) throws IOException, ArchiveException {

    final File tarFile = Files.createTempFile(null, ".tar").toFile();

    try (TarArchiveOutputStream out = (TarArchiveOutputStream) new ArchiveStreamFactory()
            .createArchiveOutputStream(ArchiveStreamFactory.TAR,
                    new FileOutputStream(tarFile))) {

        out.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

        Files.walk(folderToCompress.toPath()) //
                .forEach(source -> {

                    if (source.toFile().isFile()) {
                        final String relatifSourcePath = StringUtils.substringAfter(
                                source.toString(), folderToCompress.getPath());

                        final TarArchiveEntry entry = new TarArchiveEntry(
                                source.toFile(), relatifSourcePath);

                        try (InputStream in = new FileInputStream(source.toFile())) {
                            out.putArchiveEntry(entry);
                            IOUtils.copy(in, out);
                            out.closeArchiveEntry();
                        }
                        catch (IOException e) {
                            // Handle this better than below...
                            throw new RuntimeException(e);
                        }
                    }
                });
    }

    return tarFile;
}
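Two JDK details are worth knowing when walking a tree the way tar() does: the stream returned by Files.walk keeps directory handles open until it is closed, so its Javadoc recommends try-with-resources, and Path.relativize can produce the entry name without string manipulation. A small sketch under those assumptions (WalkRelative and relativeFiles are hypothetical names; it only lists names and does not write an archive):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WalkRelative {
    // Returns every regular file under root as a path relative to root,
    // with '/' separators, sorted for a predictable order.
    static List<String> relativeFiles(Path root) throws IOException {
        // Closing the stream releases the directory handles held by the walk
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .map(p -> root.relativize(p).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Each returned name could then be passed as the second argument of the TarArchiveEntry constructor in place of relatifSourcePath.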
public static void decompressTarGzipFile(Path source, Path target)
        throws IOException {

    if (Files.notExists(source)) {
        throw new IOException("File doesn't exist!");
    }

    try (InputStream fi = Files.newInputStream(source);
         BufferedInputStream bi = new BufferedInputStream(fi);
         GzipCompressorInputStream gzi = new GzipCompressorInputStream(bi);
         TarArchiveInputStream ti = new TarArchiveInputStream(gzi)) {

        ArchiveEntry entry;
        while ((entry = ti.getNextEntry()) != null) {

            // reject entries that would be written outside the target directory
            Path newPath = zipSlipProtect(entry, target);

            if (entry.isDirectory()) {
                Files.createDirectories(newPath);
            } else {

                // check parent folder again
                Path parent = newPath.getParent();
                if (parent != null) {
                    if (Files.notExists(parent)) {
                        Files.createDirectories(parent);
                    }
                }

                // copy TarArchiveInputStream to Path newPath
                Files.copy(ti, newPath, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}

private static Path zipSlipProtect(ArchiveEntry entry, Path targetDir)
        throws IOException {

    Path targetDirResolved = targetDir.resolve(entry.getName());

    // normalize() collapses ".." segments so the prefix check below is meaningful
    Path normalizePath = targetDirResolved.normalize();

    if (!normalizePath.startsWith(targetDir)) {
        throw new IOException("Bad entry: " + entry.getName());
    }

    return normalizePath;
}
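The check in zipSlipProtect relies on normalize() collapsing any ".." segments before the startsWith comparison. Below is a stand-alone sketch of the same idea that takes the entry name as a String so it runs without the library. SlipCheck and resolveSafely are hypothetical names, and making the target directory absolute and normalized first is my own addition, not part of the code above.

```java
import java.io.IOException;
import java.nio.file.Path;

final class SlipCheck {
    // Resolves entryName inside targetDir and rejects names that would
    // escape it, for example "../evil.txt" or an absolute path.
    static Path resolveSafely(String entryName, Path targetDir) throws IOException {
        // Assumption: compare against an absolute, normalized base directory
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IOException("Bad entry: " + entryName);
        }
        return resolved;
    }
}
```

Path.startsWith compares whole path elements rather than characters, so a sibling directory such as "out-other" is not mistaken for being inside "out".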