I’m working on a plugin to parse all posts and gather them into a JSON file to be consumed by a search mechanism. How can I access just the text of the post, with no markup? I’m currently looping over site.posts and reading page.content for each post. This returns the content of the post, but it still includes newline characters (\n) and Markdown syntax.
I saw another question in which someone wanted to get Markdown processed content in a Jekyll tag plugin, but my case is different: I don't want any markup at all, just the plain text of the post, with no formatting applied.
Below is the key def from my current implementation.
def generate(site)
  target = File.open('js/searchcontent.js', 'w')
  target.truncate(target.size)
  target.puts('var tipuesearch = {"pages": [')
  all_but_last, last = site.posts[0..-2], site.posts.last
  # Process all posts but the last one
  all_but_last.each do |page|
    tp_page = TipuePage.new(
      page.data['title'],
      "#{page.data['tags']} #{page.data['categories']}",
      page.url,
      page.content
    )
    target.puts(tp_page.to_json + ',')
  end
  # Do the last post
  tp_page = TipuePage.new(
    last.data['title'],
    "#{last.data['tags']} #{last.data['categories']}",
    last.url,
    last.content
  )
  target.puts(tp_page.to_json)
  target.puts(']};')
  target.close
end
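To make the goal concrete, here is a rough sketch of the kind of output I’m after. It assumes the Markdown has already been rendered to HTML somewhere earlier (that step is not shown), and strip_markup is a hypothetical helper name of my own, not something Jekyll provides. It uses only the Ruby standard library, and the regex-based tag stripping is a simplification that will not cope with every HTML edge case.

```ruby
require 'cgi'

# Hypothetical helper (not part of Jekyll): reduce rendered HTML to plain text.
# Assumption: the input is HTML that was already generated from the Markdown source.
def strip_markup(html)
  # Drop <script> and <style> elements together with their bodies
  text = html.gsub(%r{<(script|style)\b[^>]*>.*?</\1>}mi, ' ')
  # Replace every remaining tag with a space so adjacent words stay separated
  text = text.gsub(/<[^>]+>/, ' ')
  # Decode entities such as &amp; and &lt; after the tags are gone
  text = CGI.unescapeHTML(text)
  # Collapse newlines and runs of whitespace into single spaces
  text.gsub(/\s+/, ' ').strip
end

puts strip_markup("<h1>Hello</h1>\n<p>Some <em>bold</em> &amp; plain text.</p>")
# prints: Hello Some bold & plain text.
```

Something like this would replace page.content in the TipuePage.new calls above, provided there is a way to get at the rendered HTML from inside generate.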