
PHP: How do I convert this text into the required array format and export it as CSV?

Tags: php, data-extraction, pdftotext

I extracted this text from a PDF using the pdftotext tool.

Please see the text structure below:

stage    title1    title2  title3  title4
I        value1    value2  value3  
II                         value5  value6

stage    Other1      Other2     Other3     Other4
I        otherval1   otherval2  otherval3  otherval4
Now I want to export this text as CSV with the proper columns and headers, or build an array like this:

[
  "category" => "title1",
  "score"    => "value1",
],
[
  "category" => "title2",
  "score"    => "value2",
],
[
  "category" => "title3",
  "score"    => "value3"
],
// unable to do this
[
  "category" => "title3",
  "score"    => "value5"
],
[
  "category" => "title4",
  "score"    => "value6",
],

.
.
// so on
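Now the problem is:

  • Column values in the stage I and stage II rows are optional, but every column has a value in at least one of the two rows
  • The stage II row is optional; it may or may not be present
  • If the stage II row is present, it contains at least one column value
The problem I am facing is how to map: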

  • value5 to title3
  • value6 to title4
Here is my parser code (PHP):

$rows = explode("\n", $pdfExtractedText);
$rows = array_values(array_filter($rows));

$categories = array_values(array_filter(explode(" ", $rows[7])));
$stage1Scores = array_values(array_filter(explode(" ", $rows[8])));
$stage2Scores = array_values(array_filter(explode(" ", $rows[9])));
var_dump($categories);
var_dump($stage1Scores);
var_dump($stage2Scores);
Output:

// categories
array:13 [
  0 => "stage"
  1 => "title1"
  2 => "title2"
  3 => "title3"
  4 => "title4"
]

//values - Index preserved so that I can map with categories
array:14 [
  0 => "I"
  1 => "value1"
  2 => "value2"
  3 => "value3"
  4 => "value4"
]

// index not preserved :(
array:2 [
  0 => "II"
  1 => "value5",
  2 => "value6"
]
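
The last dump loses its positions because explode(" ") followed by array_filter() throws away the runs of spaces that indicate which column a value sits in. One possible workaround, sketched here under the assumption that the extracted text keeps its column alignment (for example when pdftotext is run with its -layout option) and that every value starts at or after its header's first character, is to slice each row at the header's character offsets. The helper names headerOffsets and sliceByOffsets are hypothetical, and the sample rows are rebuilt with sprintf only to make the column widths explicit.

```php
<?php
// Hypothetical helper: byte offset at which each header word starts.
function headerOffsets(string $headerLine): array
{
    preg_match_all('/\S+/', $headerLine, $matches, PREG_OFFSET_CAPTURE);
    return array_column($matches[0], 1);
}

// Hypothetical helper: cut a fixed-width line at the given start offsets.
function sliceByOffsets(string $line, array $starts): array
{
    $cells = [];
    $count = count($starts);
    foreach ($starts as $i => $from) {
        // The last column runs to the end of the line.
        $cell = ($i + 1 < $count)
            ? substr($line, $from, $starts[$i + 1] - $from)
            : substr($line, $from);
        // Cast guards against substr() returning false on short lines (PHP 7).
        $cells[] = trim((string) $cell);
    }
    return $cells;
}

// Sample rows from the question, rebuilt with explicit column widths.
$format = '%-9s%-10s%-8s%-8s%s';
$header = sprintf($format, 'stage', 'title1', 'title2', 'title3', 'title4');
$stage1 = sprintf($format, 'I', 'value1', 'value2', 'value3', '');
$stage2 = sprintf($format, 'II', '', '', 'value5', 'value6');

$starts       = headerOffsets($header);
$categories   = sliceByOffsets($header, $starts);
$stage1Scores = sliceByOffsets($stage1, $starts);
$stage2Scores = sliceByOffsets($stage2, $starts);

var_dump($stage2Scores); // empty strings keep value5 and value6 at indexes 3 and 4
```

The offsets are byte offsets, so this sketch assumes single-byte text; multibyte headers or values that drift left of their header would need a different column-detection rule.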
Then try this:

$csv = "";

// One CSV line per row: the header first, then each stage row
$csv .= implode(",", $categories) . PHP_EOL;
$csv .= implode(",", $stage1Scores) . PHP_EOL;
$csv .= implode(",", $stage2Scores) . PHP_EOL;

Then write it to a file.
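
For the file-writing step, PHP's built-in fputcsv() may be a safer choice than implode(), because it quotes fields that contain commas or quotes. The following is a minimal sketch, assuming the rows are already arrays aligned with the header (an empty string marks a missing cell); the file name scores.csv and the temp-directory location are hypothetical.

```php
<?php
// Rows aligned to the header; empty strings mark missing cells.
$rows = [
    ['stage', 'title1', 'title2', 'title3', 'title4'],
    ['I', 'value1', 'value2', 'value3', ''],
    ['II', '', '', 'value5', 'value6'],
];

// Hypothetical output path for this sketch.
$path = sys_get_temp_dir() . '/scores.csv';

$handle = fopen($path, 'w');
if ($handle === false) {
    throw new RuntimeException("Cannot open $path for writing");
}

foreach ($rows as $row) {
    // Separator, enclosure and escape character are passed explicitly.
    fputcsv($handle, $row, ',', '"', '\\');
}

fclose($handle);
```

Empty cells come out as empty fields, so value5 stays under title3 as long as the input arrays are aligned.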

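Do you want a way to parse the output into an array, or do you just want to push it to CSV format?

@Hudson CSV is fine too, as long as it keeps the required headers.

I have answered the question below; if you need any changes, just leave a comment under the answer and I can correct anything.

Don't you think this will print value5 under title1, when I need value5 under title3?

Have you already mapped the keys to the values? If they are not filled in automatically, replace the empty columns with null values.

I am looking for a solution to exactly that problem: I cannot map the keys to the values correctly. Please check my PHP parser code and the structure of the extracted text.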
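
Once every row is aligned with the header, the category/score array requested in the question could be built roughly as follows. This is only a sketch: it assumes each row has already been split into cells that line up with the header, with an empty string for a missing value, and the helper name toRecords is hypothetical.

```php
<?php
// Hypothetical helper: pair every non-empty cell with its column header.
function toRecords(array $categories, array $stageRows): array
{
    $records = [];
    foreach ($stageRows as $row) {
        foreach ($categories as $index => $category) {
            if ($index === 0) {
                continue; // column 0 holds the stage label, not a score
            }
            $score = $row[$index] ?? '';
            if ($score === '') {
                continue; // optional value that is absent in this row
            }
            $records[] = ['category' => $category, 'score' => $score];
        }
    }
    return $records;
}

$categories = ['stage', 'title1', 'title2', 'title3', 'title4'];
$stageRows = [
    ['I', 'value1', 'value2', 'value3', ''],
    ['II', '', '', 'value5', 'value6'], // the stage II row may be absent
];

$records = toRecords($categories, $stageRows);
print_r($records);
```

Because the stage rows are passed as a list, a missing stage II row simply means a one-element $stageRows array.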