We use the PDF searcher (a NuGet package) in one of our Umbraco applications. The PDF search results do not look entirely correct.

The top two PDFs in the results contain the search term, but the third, fourth and remaining PDFs do not. I am not sure why PDFs that do not contain the search term are included in the results.

Can anyone explain how the Umbraco PDF searcher works and how it ranks the result items?

Is there any way to remove PDFs that do not contain the search term at all from the search results?

Shraddha G

1 Answer

Download LUKE (https://code.google.com/archive/p/luke/), a tool that lets you look inside Lucene indexes and see exactly what has been indexed.

You can get Umbraco Examine to output the raw Lucene query string it is using by calling .ToString() on the criteria object. Paste that string into LUKE to run the search, and you will see useful details such as the matched terms and the ranking.
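As a rough sketch of that step, assuming the Examine version that ships with Umbraco 7: the searcher name "PDFSearcher" and the field name "FileTextContent" below are assumptions, so substitute whatever your ExamineSettings.config and index actually use. The helper class and method names are made up for illustration.

```csharp
using System.Diagnostics;
using Examine;
using Examine.SearchCriteria;

// Hypothetical helper, not part of Umbraco or Examine.
public static class PdfSearchDebug
{
    public static ISearchResults Search(string term)
    {
        // Assumed provider name; check ExamineSettings.config for yours.
        var searcher = ExamineManager.Instance.SearchProviderCollection["PDFSearcher"];

        // Assumed field name for the text extracted from each PDF.
        ISearchCriteria query = searcher.CreateSearchCriteria()
            .Field("FileTextContent", term)
            .Compile();

        // Writes the criteria, including the raw Lucene query, to the debug output.
        Debug.WriteLine(query.ToString());

        ISearchResults results = searcher.Search(query);
        foreach (SearchResult result in results)
        {
            // The score is what the results are ranked by.
            Debug.WriteLine(result.Id + " scored " + result.Score);
        }

        return results;
    }
}
```

Copy the Lucene query portion of the logged string into LUKE's search tab and compare its hits and scores with what your page shows.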

:)

Tim