Access markup-free post and page content from Jekyll plugins

Question

I’m working on a plugin to parse all posts and gather them into a JSON file to be consumed by a search mechanism. How can I access just the text of the post, with no markup? I’m currently accessing site.posts, then e.g. page.content in loops. This returns the content of the post, but includes newline markers (\n) and Markdown syntax.

I saw another question in which someone wanted to get Markdown processed content in a Jekyll tag plugin, but my case is different: I don't want any markup at all, just the plain text of the post, with no formatting applied.

Below is the key def from my current implementation.

def generate(site)
  # Opening with 'w' already truncates the file, so no explicit truncate is needed
  target = File.open('js/searchcontent.js', 'w')
  target.puts('var tipuesearch = {"pages": [')

  # Build one JSON object per post, then join with commas so there is
  # no trailing comma and an empty post list does not raise
  entries = site.posts.map do |page|
    TipuePage.new(
      page.data['title'],
      "#{page.data['tags']} #{page.data['categories']}",
      page.url,
      page.content
    ).to_json
  end
  target.puts(entries.join(",\n"))

  target.puts(']};')
  target.close
end

David Jacquel · Accepted Answer · 2015-02-23T15:33:36.223


Maybe you can try this:

{{ page.content | strip_html | strip_newlines }}

Edit: Obviously I misunderstood your question.

But you can still use Liquid filters inside a plugin by adding include Liquid::StandardFilters to your class.

You can then call strip_html and strip_newlines as ordinary Ruby methods in your plugin.

edited Feb 23 '15 at 15:33

answered Feb 23 '15 at 06:25


Those are Liquid filters, which are not accessible from a plugin Ruby script. I need to do this entirely within this plugin file. – Tohuw Feb 23 '15 at 14:32
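If depending on Liquid from within the plugin is not wanted, a similar effect can be approximated in plain Ruby with a few regular expressions. This is only a rough sketch, assuming the input is already rendered HTML. The PlainText module and its from_html method are hypothetical names, and regexes are not a substitute for a real HTML parser.

```ruby
# Hypothetical module; names are illustrative, not part of Jekyll or Liquid.
module PlainText
  # Script, style, and comment blocks are dropped together with their contents.
  BLOCKS = Regexp.union(%r{<script.*?</script>}m, /<!--.*?-->/m, %r{<style.*?</style>}m)
  # Any remaining tag is dropped, keeping the text between tags.
  TAGS = /<.*?>/m

  # Strips HTML blocks and tags, then collapses every run of whitespace
  # (newlines included) into a single space.
  def self.from_html(html)
    html.to_s.gsub(BLOCKS, '').gsub(TAGS, '').gsub(/\s+/, ' ').strip
  end
end

puts PlainText.from_html("<h1>Title</h1>\n<p>Some <strong>bold</strong> text.</p>\n")
```

Collapsing whitespace to single spaces, instead of deleting newlines outright, keeps words from neighbouring lines separated, which tends to suit search indexing.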
