PHP: How to extract the news content of a page, like Safari's "Reader mode"


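Safari has a "Reader mode" that strips everything from a web page except the article text.

Now I need to fetch a site's HTML source and then use PHP to extract the actual news content, the way Safari's "Reader mode" does.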

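Can you help me?

Someone pointed out that just posting a link to another post isn't very helpful, so I'm updating this answer. Since then I have started using a PHP port of Arc90's Readability, and it works very well.

Here is the link to the PHP port of Readability.js:

Here is a simple implementation example: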

$url = 'http://';
$html = file_get_contents($url);

if (function_exists('tidy_parse_string')) {
    $tidy = tidy_parse_string($html, array(), 'UTF8');
    $tidy->cleanRepair();
    $html = $tidy->value;
}

// give it to Readability
$readability = new Readability($html, $url);
// echo $readability->html;
// echo htmlspecialchars($tidy($readability->html, true));

// print debug output?
// useful to compare against Arc90's original JS version -
// simply click the bookmarklet with FireBug's console window open
$readability->debug = false;
// convert links to footnotes?
$readability->convertLinksToFootnotes = false;

$readability->lightClean = false;
// $readability->revertForcedParagraphElements = false;

// process it
$result = $readability->init();
// store reference to dom content processed by Readability
$content = $readability->getContent();

echo '<h1>'.$readability->getTitle()->textContent.'</h1>';
echo $content->innerHTML;

$url = 'http://';
//$html = file_get_contents($url);
$html = getData($url);

if (function_exists('tidy_parse_string')) {
    $tidy = tidy_parse_string($html, array(), 'UTF8');
    $tidy->cleanRepair();
    $html = $tidy->value;
}

$readability = new Readability($html, $url);

//...
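The snippet above calls getData(), but the post never shows that function. One plausible reading is that it is a small cURL wrapper used in place of file_get_contents(), for example to follow redirects and set a timeout. The version below is only a sketch under that assumption: the function body, the timeout parameter, and the user-agent string are all hypothetical and not taken from the original answer.

```php
<?php
// Hypothetical helper: the original post calls getData() without defining it.
// This sketch assumes it simply downloads the page body with cURL.
function getData(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException("Could not initialise cURL for $url");
    }

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,            // follow HTTP redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_ENCODING       => '',              // accept any compression cURL supports
        // Made-up user agent; some sites reject requests that send none.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ]);

    $body  = curl_exec($ch);
    $error = curl_error($ch);

    if ($body === false) {
        throw new RuntimeException("Fetching $url failed: $error");
    }

    return $body;
}
```

Unlike file_get_contents(), this version throws on failure, so the calling code can tell an empty page from a failed request before handing the HTML to Readability.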

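It is based on an algorithm that tries to identify the main content section of a website. There is no definitive standard for this; you have to experiment and implement it yourself.

Please add the important parts of the solution. If the link goes dead, your answer will be lost.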
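To make the idea of "an algorithm that tries to identify the main content section" more concrete, here is a deliberately small sketch of one such heuristic. It is not the Readability algorithm, and the function name extractMainText(), the 25-character threshold, and the list of discarded tags are assumptions chosen for illustration. It scores each container by the amount of non-link text in its direct child paragraphs and returns the paragraphs of the highest-scoring container.

```php
<?php
// Simplified illustration only; NOT Readability's actual algorithm.
function extractMainText(string $html): string
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration prefix hints to the parser that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Remove elements that normally never contain article text.
    $junk = $xpath->query('//script|//style|//nav|//header|//footer|//aside|//form');
    foreach (iterator_to_array($junk) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    $bestParagraphs = [];
    $bestScore = 0.0;

    foreach ($xpath->query('//article|//section|//div|//td|//body') as $candidate) {
        $score = 0.0;
        $paragraphs = [];

        // Only direct child paragraphs count toward this container's score.
        foreach ($xpath->query('./p', $candidate) as $p) {
            $text = trim(preg_replace('/\s+/u', ' ', $p->textContent));
            $length = strlen($text); // byte length is enough for a relative score
            if ($length < 25) {
                continue; // too short to be body text
            }

            // Paragraphs made mostly of links (menus, "related" lists) score low.
            $linkLength = 0;
            foreach ($xpath->query('.//a', $p) as $a) {
                $linkLength += strlen(trim($a->textContent));
            }
            $score += $length * (1.0 - min(1.0, $linkLength / $length));
            $paragraphs[] = $text;
        }

        if ($score > $bestScore) {
            $bestScore = $score;
            $bestParagraphs = $paragraphs;
        }
    }

    return implode("\n\n", $bestParagraphs);
}
```

A heuristic this simple will miss articles whose paragraphs are wrapped in extra containers or marked up without p tags, which is the kind of case the comment above means by having to experiment yourself.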