
Remove duplicate lines in a .txt file using PHP


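I have multiple txt files in a directory. The text files all contain the same header. I am reading all of the txt files and writing them out into a single file.

Because every individual file contains the same header, the header is inserted into the new merged file once per input file. How can I remove all of the headers from the merged file and keep just one of them at the top?

I have been looking at the sort command in Unix:

sort filename | uniq
This command works, but it also removes all other duplicate data. Is there a way to remove only the specific string "this is a header" while keeping one copy at the top?

Current code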

$header = array( "XX-XXXXXXXXX-XXXXXXX-X        XXXXXXXXXXXX" );


$files = glob( "/path/to/folder/*.txt" );

$output_file = "newfile_".date( "YmdHis" ).".txt";

$out = fopen( $output_file, "w" );

foreach( $header as $inputHeader ) {

    fwrite( $out, $inputHeader );
}

    foreach( $files as $file ) {

        $in = fopen( $file, "r" );

            while ( $line = fgets( $in ) ) {

                if( $header !== $line ) {

                    fwrite( $out, $line );

                }

            }

        fclose( $in );

     }

fclose( $out );
Lines that are repeated multiple times

Try writing the header once when you start writing, then check for the header as you read each line.

//cache our header lines
$header = "Header line";

$files = glob( "/path/to/files*.txt" );

//print_r($files);

$output_file = "newfile".date( "YmdHis" ).".txt";

$out = fopen( $output_file, "w" );

//input the header line at the top of our new file

//terminate the header with a newline so the first data line is not glued to it
fwrite( $out, $header . PHP_EOL );



foreach( $files as $file ) {

    $in = fopen( $file, "r" );

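        //read until fgets() returns false, so a final line such as "0" does not end the loop early
        while ( ( $line = fgets( $in ) ) !== false ) {
            //header check, don't output header lines to new file
            //strip whitespace from both sides so spacing and the trailing newline cannot break the match
            if( preg_replace('/\s+/', '', $header) !== preg_replace('/\s+/', '', $line) ){
                 fwrite( $out, $line );
            }
        }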
        }

    fclose( $in );
}

fclose( $out );

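After the new file has been created, the following line loads it with duplicate lines removed. Note that it drops every repeated line, not only the header, and the result still has to be written back (for example with file_put_contents) to change the file.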

$lines = array_unique(file("your_file.txt"));
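
Since array_unique removes every repeated line, here is a minimal sketch of a variant that only removes repeats of the header and leaves other duplicate data alone. The function name dropRepeatedHeader and the file name in the usage comment are made up for illustration; they do not come from the question or the answers. It assumes the merged file fits in memory and that the header should match regardless of spacing.

```php
<?php
// Hypothetical helper (not from the thread): keeps the first occurrence
// of the header line and drops every later one. All other lines,
// including duplicated data lines, are kept in their original order.
function dropRepeatedHeader(array $lines, string $header): array
{
    // Compare with all whitespace removed so spacing and line endings
    // do not prevent a match.
    $normalize = function (string $s): string {
        return preg_replace('/\s+/', '', $s);
    };

    $headerKey = $normalize($header);
    $seen = false;
    $result = [];

    foreach ($lines as $line) {
        if ($normalize($line) === $headerKey) {
            if ($seen) {
                continue; // repeated header: skip it
            }
            $seen = true; // first header: keep it
        }
        $result[] = $line;
    }

    return $result;
}

// Usage (the file name is a placeholder):
// $lines = file("your_file.txt");
// file_put_contents("your_file.txt", dropRepeatedHeader($lines, "This is a header"));
```

The first header stays wherever it first appears, so it ends up at the top only if the merged file already starts with it.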

If the files only have one header:

$header_exist = false;

foreach($files as $file) {

  $in = fopen($file, "r");

  while($line = fgets($in)) {
    if(strpos($line, "This is a header") === false) {
      fwrite($out, $line);
    }
    else {
      if($header_exist === false) {
        $header_exist = true;
        fwrite($out, $line);
      }
    }
  }
  fclose($in);
}

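So I solved this with help from @WillParky93. My files contained 4 different headers, all of them duplicated, so I compare each line against all four using logical operators.

Final code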

//the headers that were in the file with duplicates
$header1 = "DD-LLDRHD045-UHSTAYL-MR        LOCKFMDLA111";
$header2 = "DD-LLDRHD045-UHSTAYL-MR        LOCKFMDLA222";
$header3 = "DD-LLDRHD045-UHSTAYL-MR        LOCKFMDLA333";
$header4 = "DD-LLDRHD045-UHSTAYL-MR        LOCKFMDLA444";

//get all the files to be merged
$files = glob( "/PATH/TO/FILES/*.txt" );

//set the output filename
$output_file = "NewFile".date( "YmdHis" ).".txt";

//open the output file
$out = fopen( $output_file, "w" );

    //loop through the files to be merged
    foreach( $files as $file ) {

        //open each file
        $in = fopen( $file, "r" );

            //while each line in each file
            while ( $line = fgets( $in ) ) {

                //if the current line is not equal to header1, header2, header3 or header4
                if( preg_replace('/\s+/', '', $line ) !=
                    preg_replace('/\s+/', '', $header1 )&&
                    preg_replace('/\s+/', '', $line ) !=
                    preg_replace('/\s+/', '', $header2 )&&
                    preg_replace('/\s+/', '', $line ) !=
                    preg_replace('/\s+/', '', $header3 )&&
                    preg_replace('/\s+/', '', $line ) !=
                    preg_replace('/\s+/', '', $header4 ) ) {

                       //write that line to the output file
                       fwrite( $out, $line );

                       //echo $line."\n";

                }else{
                       //write blank line to the file
                       fwrite( $out, "\n" );

                  }

            }
        //close the file
        fclose( $in );

     }
//close the output file
fclose( $out );

//get the contents of the output file
$header1 .= file_get_contents( $output_file );

//add the header to the top of the output file
file_put_contents( $output_file, $header1 );

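"This command works, but it also removes all other duplicate data" - exactly, that is what the command is supposed to do. "Is there a way to remove only the specific string "this is a header" while keeping one copy at the top?" Sure. You answered your own question: uniq can do that as well.

After trying this, it stops outputting the file. Am I doing something wrong?

What is the output if you put error_reporting(E_ALL); ini_set("display_errors", 1); at the top of the page?

It doesn't give any errors. I changed $header from what you entered to the string I'm looking for. Is that right?

Is $header still an array? Can you post your new $header? (Leave out personal data if you have to.)

Yes, $header is still an array. However, I noticed that when I run the script it gets stuck in a loop somewhere, before the fclose and after the foreach statement.

Glad to see you have solved the problem! Now we should reconsider using an array, since we know the headers do not all match. It will help readability and give you more flexibility for adding or removing headers in the future.

@WillParky93 I'll do that. Thanks for your help.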
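
The array suggestion from the last comments could look roughly like the sketch below. It is not code from the thread: the function names normalizeLine and mergeWithoutHeaders are made up, and it assumes two things the final code does differently, namely that every header in the array is written once at the top and that header lines found in the input files are dropped outright instead of being replaced by blank lines.

```php
<?php
// Hypothetical helper names; a sketch of the array-based approach.

// Strip all whitespace so headers match regardless of spacing or line endings.
function normalizeLine(string $line): string
{
    return preg_replace('/\s+/', '', $line);
}

// Merge $files into $outputFile. Each header is written once at the top,
// and any header line found inside an input file is skipped.
function mergeWithoutHeaders(array $files, array $headers, string $outputFile): void
{
    // Pre-compute the normalized headers once instead of once per line.
    $headerKeys = array_map('normalizeLine', $headers);

    $out = fopen($outputFile, "w");
    if ($out === false) {
        throw new RuntimeException("Cannot open $outputFile for writing");
    }

    foreach ($headers as $header) {
        fwrite($out, $header . PHP_EOL);
    }

    foreach ($files as $file) {
        $in = fopen($file, "r");
        if ($in === false) {
            continue; // skip files that cannot be read
        }
        // Compare against false explicitly so a line like "0" is not lost.
        while (($line = fgets($in)) !== false) {
            if (!in_array(normalizeLine($line), $headerKeys, true)) {
                fwrite($out, $line);
            }
        }
        fclose($in);
    }

    fclose($out);
}

// Usage (paths and header text are placeholders):
// mergeWithoutHeaders(glob("/path/to/folder/*.txt"), ["This is a header"], "newfile.txt");
```

Adding or removing a header then only means editing the array passed in, with no change to the condition inside the loop.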