How to extract specific text from a PDF using PHP

php html pdf

I need to store each candidate's name and ID in a MySQL table. I have already extracted the text with pdfparser:

<?php

// Include the Composer autoloader if not already done.
// A forward slash works on Windows as well as on other systems.
require_once 'vendor/autoload.php';

// Parse pdf file and build necessary objects.
$parser = new  \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('C:\Desktop\Data\ApplicationForm.pdf');

$text = $pdf->getText();
echo $text;

?>

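At the moment this only displays the extracted text. What I need is to pull the name and ID out of that page (the page full of extracted text that appears when the script above runs). Using "View page source", I found where the values I need are located.

The ID appears at:

tr 1115*15, td.line-number 31*15 and td.line-content 1084*15, line-number value = 12

The name appears at:

tr 1115*15, td.line-number 31*15 and td.line-content 1084*15, line-number value = 13

This is where I am stuck, because I do not know how to get at this information. Please help.

I have multiple PDF files, and in all of them the information I need is in the same place (by "the same place" I mean line-number value = 13, tr 1115*15, td.line-number 31*15 and td.line-content 1084*15). I just need a way to solve this.

If anything is unclear, I will clarify and improve the question.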

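I needed to extract the candidate's name and ID from the PDF. After extracting the text with pdfparser, I used PHP to download the resulting HTML page as a text file: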

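<?php
$filename = 'filename.txt';
// Ask the browser to save the output as a file instead of displaying it.
header('Content-Disposition: attachment; filename="' . $filename . '"');
// "text" alone is not a valid MIME type; plain text is text/plain.
header('Content-Type: text/plain');
// ... the rest of your file
?>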
<?php

// Include Composer autoloader if not already done.
include 'C:\Users\Downloads\pdfparser-master (1)\pdfparser-master\vendor\autoload.php';

// Parse pdf file and build necessary objects.
$parser = new  \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('C:\Users\Desktop\Data\ApplicationForm (3).pdf');

$text = $pdf->getText();
echo $text;


?>

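I did it this way because the information I need is on lines 12 and 13 of the view-source page, and that holds for all of my PDFs. After downloading the HTML page as a text file, I use the following code to extract the text I need from the downloaded file and store it in the database: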

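<?php

// Read the downloaded file into an array of lines.
// file() is zero-based, so index 12 is the 13th line of the file;
// adjust the indices if your values sit on different lines.
$source = file("filename.txt", FILE_IGNORE_NEW_LINES);
if ($source === false || !isset($source[12], $source[13])) {
    die("Could not read the expected lines from filename.txt");
}

$number = trim($source[12]);
$name = trim($source[13]);
// URL-encode the name so spaces and special characters do not break the links.
$gslink = "https://www.google.co.in/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=google+scholar+" . urlencode($name);
$dblplink = "https://www.google.co.in/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=dblp+" . urlencode($name);
$servername = "127.0.0.1";
$username = "root";
$password = "";
$dbname = "mydb";
// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);
// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
}
// Use a prepared statement so the extracted text cannot break or inject into the SQL.
$stmt = $conn->prepare("INSERT INTO faculty (candidate_no, candidate_name, gs_link, dblp_link) VALUES (?, ?, ?, ?)");
$stmt->bind_param("ssss", $number, $name, $gslink, $dblplink);
if ($stmt->execute()) {
    echo "New record created successfully";
} else {
    echo "Error: " . $stmt->error;
}

$stmt->close();
$conn->close();
?>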
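
The download step may not be strictly necessary. The line numbers shown by "View page source" should correspond to the newline-separated lines of whatever the script echoes, so the same lines can probably be picked straight from the string returned by `getText()`. Below is a minimal sketch of that idea. `pickLines` is a hypothetical helper (it is not part of pdfparser), the sample text is made up and stands in for `$pdf->getText()`, and the line numbers 3 and 5 are placeholders for whichever lines hold the ID and name in your files.

```php
<?php
declare(strict_types=1);

/**
 * Return the requested 1-based lines of $text, trimmed.
 * Hypothetical helper for illustration; not part of pdfparser.
 */
function pickLines(string $text, array $lineNumbers): array
{
    // \R matches any newline sequence (\n, \r\n or \r).
    $lines = preg_split('/\R/', $text);
    $picked = [];
    foreach ($lineNumbers as $n) {
        // View-source numbering starts at 1, PHP arrays at 0.
        // Lines that do not exist become null instead of raising a warning.
        $picked[$n] = isset($lines[$n - 1]) ? trim($lines[$n - 1]) : null;
    }
    return $picked;
}

// Stand-in for $pdf->getText(); the content is invented for this example.
$text = "Application Form\nCandidate No\n A1234 \nName\nJane Doe\n";

$fields = pickLines($text, [3, 5]);
echo $fields[3], ' / ', $fields[5], PHP_EOL;
```

With real data, the two values would then be bound to the prepared `INSERT` statement in the same way as `$number` and `$name` above. Checking for `null` before inserting guards against a PDF whose text is shorter than expected.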