How do I extract all nested tar.gz and zip archives into one directory in PHP?

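I need to extract a tar.gz file in PHP. The file contains many JSON files, tar.gz files, zip files, and subdirectories. I only need to move the JSON files into the /Dataset/processing directory, and then keep extracting the nested tar.gz and zip archives to get all the JSON files out of them. Those archives can also contain nested folders/directories.

The structure looks like this: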

origin.tar.gz
 ├───sub1.tar.gz
 │   ├───sub2.tar.gz
 │   ├───├───a.json
 │   ├───├───├───├───├───├───...(unknown depth)
 │   ├───b.json
 │   ├───c.json
 ├───sub3.zip
 │   ├───sub4.tar.gz
 │   ├───├───d.json
 │   ├───├───├───├───├───├───...(unknown depth)
 │   ├───e.json
 │   ├───f.json
 ├───subdirectory
 │   ├───g.json
 ├───h.json
 ├───i.json
 |   ..........
 |   ..........
 |   ..........
 |   many of them
Once it has been extracted, /Dataset should look like this:

Dataset/processing
 ├───a.json
 ├───b.json
 ├───c.json
 ├───d.json
 ├───e.json
 ├───f.json
 ├───g.json
 ├───h.json
 ├───i.json
 |   ..........
 |   ..........
 |   ..........
 |   many of them
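I know how to extract a tar.gz in PHP with PharData, but that only works one level deep. I was wondering whether some kind of recursion could make this work for multiple levels of nesting.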

$phar = new PharData('origin.tar.gz');
$phar->extractTo('/full/path'); // extract all files in the tar.gz
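I made some improvements to my code and tried the following. It works for multiple levels of nested archives, but fails when there are directories (folders or nested folders) that also contain JSON files. Can someone help me extract those as well?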

<?php

$path = './';

// Extraction of compressed file
function fun($path) {    
    $array = scandir($path); 
    for ($i = 0; $i < count($array); $i++) {
        if($i == 0 OR $i == 1){continue;}
        else {
            $item = $array[$i];
            $fileExt = explode('.', $item);

            // Getting the extension of the file
            $fileActualExt = strtolower(end($fileExt));
            if(($fileActualExt == 'gz') or ($fileActualExt == 'zip')){
                $pathnew = $path.$item; // Dataset ./data1.tar.gz
                $phar = new PharData($pathnew);
                // Moving the files
                $phar->extractTo($path);
                // Del the files
                unlink($pathnew);
                $i=0;
            }
        }
        $array = scandir($path);


    }
}
fun($path);

// Move only the json to ./dataset(I will add it later)
?>


Thanks in advance.

As a first step, extract the tar.gz file as you described:

$phar = new PharData('origin.tar.gz');
$phar->extractTo('/full/path'); // extract all files in the tar.gz
Then read the directory recursively and copy every JSON file into the destination directory. Here is my commented code:

$dirPath='./';       // the root path of your very first extraction of your tar.gz

recursion_readdir($dirPath,1);


function recursion_readdir($dirPath,$Deep=0){
    $resDir=opendir($dirPath);
    // compare against false so that an entry named "0" does not end the loop
    while(($basename=readdir($resDir))!==false){
        if($basename=='.' OR $basename=='..'){
            continue;
        }
        //current file path
        $path=rtrim($dirPath,'/').'/'.$basename;
        if(is_dir($path)){
            //it is a directory, so go one level deeper
            recursion_readdir($path,$Deep+1);
        }else if(strtolower(pathinfo($basename,PATHINFO_EXTENSION))=='json'){
            //it is a JSON file: copy it to your destination path (./dest must already exist)
            copy($path, './dest/' . $basename);
        }else if(strstr($basename,'tar')){
            //it is a tar.gz file: extract it in place (pass the full path, not just the basename)
            $phar = new PharData($path);
            $phar->extractTo($dirPath, null, true);
        }
    }
    closedir($resDir);
}
function forChar($char='-',$times=0){
  $result='';
  for($i=0;$i<$times;$i++){
     $result.=$char;
  }
  return $result;
}
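
For JSON files that sit inside subdirectories at any depth, one option is to let PHP's SPL iterators do the directory walking. The following is only a minimal sketch and not part of the answer above: the function name collectJsonFiles and the destination path are hypothetical, and it assumes that flattening by file name is acceptable, so two files with the same name overwrite each other.

```php
<?php
// Hypothetical helper: copies every *.json file found anywhere under $root
// into $dest and returns the sorted list of copied file names.
function collectJsonFiles(string $root, string $dest): array
{
    if (!is_dir($dest) && !mkdir($dest, 0777, true)) {
        throw new RuntimeException("Cannot create destination directory: $dest");
    }
    $destReal = realpath($dest);
    $copied = [];

    // SKIP_DOTS drops the "." and ".." entries; the default iteration mode
    // of RecursiveIteratorIterator descends into every subdirectory.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        if (!$file->isFile() || strtolower($file->getExtension()) !== 'json') {
            continue;
        }
        // Skip files that already live in the destination directory
        // (relevant when $dest is located inside $root).
        if (dirname($file->getRealPath()) === $destReal) {
            continue;
        }
        $target = $destReal . DIRECTORY_SEPARATOR . $file->getFilename();
        if (copy($file->getPathname(), $target)) {
            $copied[] = $file->getFilename();
        }
    }

    sort($copied);
    return $copied;
}
```

A call such as collectJsonFiles('./', './dest') would be run after all archives have been unpacked. Replacing copy() with rename() moves the files instead of copying them.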
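I solved it after doing a bit of research. This is what fixed the problem.

There are 3 functions:

  • recursiveScanProtected(): extracts all the compressed files
  • scanJSON(): scans for JSON files and moves them to the processing folder
  • delete_files(): deletes everything except the processing folder (which contains the JSON files) and index.php in the root directory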


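Welcome to Stack Overflow. Please add your own code to your question. You need to show at least the research you have done yourself to solve this problem.

I am new to Stack Overflow. Thank you very much for the encouragement. I have added the research I did and the effort I put into getting the required solution. I am still stuck on the problem I mentioned above. Any help would be appreciated.

Thanks for your effort and time, but it does not work if there are directories (folders that also contain JSON files). I have also shown my approach. I would appreciate it if you could help me.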
<?php

// Root directory
$path = './';

// Directory where I want to extract the JSON files
$path_json = $path.'processing/';


// Function to extract all the compressed files
function recursiveScanProtected($dir) {
    if($dir != '') {
        $dir = rtrim($dir, '/');
        $tree = glob($dir . '/*');
        if (is_array($tree)) {
            for ($i = 0; $i < count($tree); $i++) {
                $file = $tree[$i];
                if (is_dir($file)) {
                    recursiveScanProtected($file); // Recursive call if directory
                } elseif (is_file($file)) {

                    $item = $file;
                    $fileExt = explode('.', $item); 
                    // Getting the extension of the file
                    $fileActualExt = strtolower(end($fileExt));
                    // Check if the file is a zip or a tar.gz
                    if(($fileActualExt == 'gz') or ($fileActualExt == 'zip')){

                        // Open the archive and extract it into a numbered subdirectory - Overwriting true
                        $target = $dir . '/' . $i;
                        $phar = new PharData($item);
                        $phar->extractTo($target . '/', null, true);

                        // Del the compressed file
                        unlink($item);

                        recursiveScanProtected($target); // Recursive call
                    }

                }
            }
        }
    }
}
recursiveScanProtected($path);


// Move the JSON files to processing
function scanJSON($dir, $path_json) {
    if($dir != '') {
        $tree = glob(rtrim($dir, '/') . '/*');
        if (is_array($tree)) {
            foreach($tree as $file) {
                if (is_dir($file)) {
                    // Do not scan processing recursively, but all other directories should be scanned
                    if($file != './processing'){
                        scanJSON($file, $path_json);
                    }
                } elseif (is_file($file)) {

                    $ext = pathinfo($file);

                    // Files without an extension have no 'extension' key, so default to ''
                    if(strtolower($ext['extension'] ?? '') == 'json'){
                        // Move the JSON files to processing
                        rename($file, $path_json.$ext['basename']);
                    }
                }
            }
        }
    }
}

scanJSON($path, $path_json);

/* 
 * php delete function that deals with directories recursively
 * It deletes everything except ./processing and ./index.php
 */
function delete_files($target) {

    if(is_dir($target)){
        $files = glob( $target . '*', GLOB_MARK ); //GLOB_MARK adds a slash to directories returned
        foreach( $files as $file ){
            if($file == './processing/' || $file == './index.php'){
                continue;
            } else{
                delete_files( $file );
            }
        }
        if($target != './'){
            rmdir( $target );
        }
    } elseif(is_file($target)) {
        unlink( $target );  
    }
}

delete_files($path);
?>
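
The extraction step can also be written as a loop that rescans the tree until no archives are left, which handles archives at unknown depth without relying on the order of a single directory listing. The following is a rough sketch under stated assumptions, not the code from the answer above: the function name extractNestedArchives, the ".extracted" directory suffix, and the list of recognised extensions are all hypothetical choices. It relies on PharData being able to read tar and zip archives, and on the zlib extension being available for tar.gz files.

```php
<?php
// Hypothetical helper: unpacks every archive under $root, including archives
// that only appear after an outer archive has been unpacked.
// Returns the number of archives that were extracted.
function extractNestedArchives(string $root): int
{
    $extracted = 0;

    do {
        // Pass 1: collect the archives that currently exist under $root.
        $archives = [];
        $iterator = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
        );
        foreach ($iterator as $file) {
            if ($file->isFile()
                && preg_match('/\.(tar\.gz|tgz|tar|zip)$/i', $file->getFilename())) {
                $archives[] = $file->getPathname();
            }
        }

        // Pass 2: unpack each archive into its own directory next to it,
        // then delete the archive so it is not processed again.
        foreach ($archives as $archive) {
            $target = $archive . '.extracted'; // assumed naming scheme
            if (!is_dir($target)) {
                mkdir($target, 0777, true);
            }
            $phar = new PharData($archive);
            $phar->extractTo($target, null, true); // true = overwrite existing files
            unset($phar);
            unlink($archive);
            $extracted++;
        }
    } while ($archives !== []); // stop once a full scan finds no archives

    return $extracted;
}
```

Giving each archive its own target directory keeps files with the same name in different archives apart until the JSON files are moved to the processing folder. A corrupt archive makes PharData throw an exception, which this sketch deliberately does not catch.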