Ruby 处理Jekyll内容，将任何文章标题的第一个出现替换为具有该标题的文章的超链接我想做什么_Ruby_Plugins_Jekyll_Hook_Nokogiri

Ruby 处理Jekyll内容，将任何文章标题的第一个出现替换为具有该标题的文章的超链接我想做什么

ruby plugins jekyll

Ruby 处理Jekyll内容，将任何文章标题的第一个出现替换为具有该标题的文章的超链接我想做什么,ruby,plugins,jekyll,hook,nokogiri,Ruby,Plugins,Jekyll,Hook,Nokogiri,我正在构建一个JekyllRuby插件，它将用一个链接到同名帖子URL的超链接替换帖子副本文本内容中第一次出现的任何单词我遇到的问题我已经实现了这一点，但我无法解决过程中的两个问题\u words方法：如何仅在文章的主要内容副本文本中搜索文章标题，而不在文章或目录（也在主要文章副本文本之前生成）之前搜索元标记？我无法让Nokigiri使用这个工具，尽管它似乎是这里的首选工具如果帖子的URL不在post.data['URL']，它在哪里还有，有没有更高效、更清洁的方法当前代码可以工作，

我正在构建一个JekyllRuby插件，它将用一个链接到同名帖子URL的超链接替换帖子副本文本内容中第一次出现的任何单词

我遇到的问题我已经实现了这一点，但我无法解决

过程中的两个问题\u words

方法：

如何仅在文章的主要内容副本文本中搜索文章标题，而不在文章或目录（也在主要文章副本文本之前生成）之前搜索元标记？我无法让Nokigiri使用这个工具，尽管它似乎是这里的首选工具

如果帖子的URL不在

post.data['URL']

，它在哪里

还有，有没有更高效、更清洁的方法

当前代码可以工作，但将替换第一次出现的代码，即使它是HTML属性的值，如锚或元标记

示例结果我们有一个包含3篇文章的博客：

爱好
食物
自行车

在“嗜好”帖子正文中，我们有一个句子，每个单词第一次出现在帖子中，如下所示：

I love mountain biking and bicycles in general.

插件将处理该句子并将其输出为：

I love mountain biking and <a href="https://example.com/link/to/bicycles/">bicycles</a> in general.

总体而言，我喜欢山地自行车。

我的当前代码（更新1）

##_插件/hyperlink_first_word_occurance.rb
需要“杰基尔”
需要“uri”
杰基尔模块
#用文章的标题超链接替换内容中每个文章标题的首次出现
模块HyperlinkFirstWordOccurance
POST\u CONTENT\u CLASS=“页面内容”
BODY_START_TAG=“这段代码看起来很熟悉：）我建议您查看Rspecs测试文件以针对您的问题进行测试：
我会尽力回答你的问题，很抱歉，在写这篇文章的时候，我无法亲自测试这些问题
如何只在文章的主要内容副本文本中搜索文章标题，而不是在文章或目录（也在主要文章副本文本之前生成）之前搜索元标记？我无法使用Nokigiri实现这一点，尽管这似乎是这里的选择工具。
您的要求如下：
1） 忽略
标记之外的内容
这似乎已经在process\u html（）
方法中实现。此方法说明了唯一的进程body\u content
，它应该可以按原样工作。您有测试吗？您如何调试它？相同的字符串拆分在我的插件中工作。也就是说，只处理body内的内容
2） 忽略目录（TOC）中的内容。
我建议您通过进一步拆分body\u content
变量来扩展process\u html（）
方法。在TOC的开始和结束标记之间搜索内容（按id、css类等），并将其排除，然后将其添加回process\u words
字符串之前或之后的位置
3） 是否使用Nokigiri插件？
这个插件非常适合解析html。我认为你正在解析字符串，然后创建html。所以香草Ruby和URI插件应该足够了。如果你愿意，你仍然可以使用它，但它不会比在Ruby中拆分字符串更快
如果一篇文章的URL不在post.data['URL']，它在哪里？
我认为你应该有一个方法来获取所有帖子标题，然后将“单词”与数组匹配。你可以从文档本身doc.site.posts
获取所有帖子集合，然后foreach post返回标题。过程\u words（）
方法可以检查每个作品是否与数组中的项目匹配。但如果标题由多个单词组成，该怎么办
还有，有没有更高效、更清洁的方法
到目前为止还不错。我将首先解决问题，然后重构速度和编码标准
我再次建议您使用测试来帮助您实现这一点
如果我能提供更多帮助，请告诉我：）lol你是怎么找到它的！？你是遇到过它还是刚刚遇到了Jekyll/ruby on Stack？PS：谢谢你提供了一个良好的起点/代码库！非常欢迎你，Andre。我们是一个社区！我几天前开始在Twitter上关注@jekyllbot，这是在订阅源上出现的。：）
# _plugins/hyperlink_first_word_occurance.rb
require "jekyll"
require 'uri'


module Jekyll

    # Replace the first occurance of each post title in the content with the post's title hyperlink
    module HyperlinkFirstWordOccurance
        POST_CONTENT_CLASS = "page__content"
        BODY_START_TAG = "<body"
        ASIDE_START_TAG = "<aside"
        OPENING_BODY_TAG_REGEX = %r!<body(.*)>\s*!
        CLOSING_ASIDE_TAG_REGEX = %r!</aside(.*)>\s*!

        class << self
            # Public: Processes the content and updates the 
            # first occurance of each word that also has a post
            # of the same title, into a hyperlink.
            #
            # content - the document or page to be processes.
            def process(content)
                @title = content.data['title']
                @posts = content.site.posts

                content.output = if content.output.include? BODY_START_TAG
                                    process_html(content)
                                else
                                    process_words(content.output)
                                end
            end


            # Public: Determines if the content should be processed.
            #
            # doc - the document being processes.
            def processable?(doc)
                (doc.is_a?(Jekyll::Page) || doc.write?) &&
                    doc.output_ext == ".html" || (doc.permalink&.end_with?("/"))
            end


            private

            # Private: Processes html content which has a body opening tag.
            #
            # content - html to be processes.
            def process_html(content)
            content.output = if content.output.include? ASIDE_START_TAG
                    head, opener, tail = content.output.partition(CLOSING_ASIDE_TAG_REGEX)
                            else
                    head, opener, tail = content.output.partition(POST_CONTENT_CLASS)
                            end
                body_content, *rest = tail.partition("</body>")

                processed_markup = process_words(body_content)

                content.output = String.new(head) << opener << processed_markup << rest.join
            end

            # Private: Processes each word of the content and makes
            # the first occurance of each word that also has a post
            # of the same title, into a hyperlink.
            #
            # html = the html which includes all the content.
            def process_words(html)
                page_content = html
                @posts.docs.each do |post|
                    post_title = post.data['title'] || post.name
                    post_title_lowercase = post_title.downcase
                    if post_title != @title
                        if page_content.include?(" " + post_title_lowercase + " ") ||
                            page_content.include?(post_title_lowercase + " ") ||
                            page_content.include?(post_title_lowercase + ",") ||
                            page_content.include?(post_title_lowercase + ".")
                            page_content = page_content.sub(post_title_lowercase, "<a href=\"#{ post.url }\">#{ post_title.downcase }</a>")
                        elsif page_content.include?(" " + post_title + " ") ||
                            page_content.include?(post_title + " ") ||
                            page_content.include?(post_title + ",") ||
                            page_content.include?(post_title + ".")
                            page_content = page_content.sub(post_title, "<a href=\"#{ post.data['url'] }\">#{ post_title }</a>")
                        end
                    end
                end
                page_content
            end
        end
    end
end


Jekyll::Hooks.register %i[posts pages], :post_render do |doc|
  # code to call after Jekyll renders a post
  Jekyll::HyperlinkFirstWordOccurance.process(doc) if Jekyll::HyperlinkFirstWordOccurance.processable?(doc)
end