How do I implement Mozilla readability.js on my website?

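(readability.js is used to create a reader view of a web page.)

How can I implement readability.js on this test page? The problem is that readability.js removes elements of this site that I want to keep, and keeps elements that should be removed. I hope someone can help me. Thanks a lot. Is there any documentation on how to use readability.js?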

<html><head>
<title>Reader View shows only the browser in reader view</title>
    <script src="https://raw.githack.com/mozilla/readability/master/Readability.js"></script>
</head>
<body>
Everything outside the main div tag vanishes in Reader View<br>
<img class="no-print" src="http://dummyimage.com/1024x100/000/ffffff&text=This+banner+should+vanish+in+print+view">
<div>
   <h1>H1 tags outside of a p tag are hidden in reader view</h1>
   <img class="no-print" src="http://dummyimage.com/1024x100/000/ffffff&text=This+banner+is resized+in+print+view">
   <p>
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789 123456
</p>
</div>
</body>
    <script>
    var article = new Readability(document).parse();
    </script>
</html>

Source of the test page:

OK

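    // Replace the container's content with the parsed article: title first, then the extracted HTML.
    // Assumes the page has an element with id="body" (for example <body id="body">).
    document.getElementById("body").innerHTML = "<font face='Calibri' size='4'><h1>" + article.title + "</h1>" + article.content + "</font>";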
Have you tried this?

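From their GitHub page:

"Readability's parse() works by modifying the DOM. This removes some elements in the web page. You could avoid this by passing the clone of the document object while creating a Readability object."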


You can create a copy of the DOM object so that parse() modifies the copy rather than the live DOM.
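
A minimal sketch of the cloning approach, assuming Readability.js is already loaded on the page. The helper name parseWithoutMutating is hypothetical (it is not part of the library), and the constructor is passed in as a parameter only so the function can be tried outside a browser:

```javascript
// Sketch: parse a deep copy so the live page is left untouched.
// `parseWithoutMutating` is a hypothetical helper, not part of Readability.
function parseWithoutMutating(doc, ReadabilityCtor) {
  // cloneNode(true) deep-copies the document, including all descendants.
  const clone = doc.cloneNode(true);
  // parse() mutates only the clone; it returns null if no article is found.
  return new ReadabilityCtor(clone).parse();
}

// In the browser, with Readability.js loaded via a <script> tag:
//   const article = parseWithoutMutating(document, Readability);
```

Because only the clone is modified, the original page keeps its banners and headings, and the returned article can be displayed wherever you choose.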

You can use DOMPurify and Readability together, as they mention in the documentation:

import { Readability } from '@mozilla/readability'
import DOMPurify from 'dompurify';

function readable(doc) {
  const reader = new Readability(doc)
  const article = reader.parse()
  return article
}

let cloneDoc = document.cloneNode(true)
let parsed = readable(cloneDoc)
const markup = DOMPurify.sanitize(parsed.content)
markup will be the HTML string of the readable content. Try console.log(parsed) to see the available properties.
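
Note that parse() can return null when Readability does not find article content, in which case parsed.content would throw. One way to guard against that is sketched below. The helper names escapeHtml and articleToHtml and the fallback message are hypothetical, and sanitize stands for any function that maps an HTML string to a safe HTML string, such as DOMPurify.sanitize:

```javascript
// Hypothetical helper: escape plain text before placing it inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: turn a Readability result into an HTML string.
function articleToHtml(article, sanitize) {
  // parse() returns null when Readability cannot find article content.
  if (!article) {
    return '<p>No readable content found.</p>';
  }
  // The title is treated as plain text and escaped; the content is HTML and is sanitized.
  const title = article.title ? '<h1>' + escapeHtml(article.title) + '</h1>' : '';
  return title + sanitize(article.content || '');
}

// Possible usage in the browser, continuing from the snippet above:
//   document.body.innerHTML = articleToHtml(parsed, DOMPurify.sanitize);
```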
