I studied Lucene search queries and searched the internet for how to do this, but I couldn't find a method that works, and my own attempts did not return what I want.

Basically, I have a field in my database that holds IDs concatenated with commas. These fields are Umbraco document properties.

For instance, let's say I have these entries with these fields:

Entry 1: relatedContents: 500,700

Entry 2: relatedContents: 500

My search query looks for fields that contain the value 500. As of now, it only returns Entry 2, but when I use a wildcard term with the value 500*, it returns both of them. That would be fine, but the problem appears when I search for something that is not at the beginning of the value.

When I search for 700, it doesn't return Entry 1, and wildcard searches in Lucene don't allow the * to be at the beginning of the search term.

It looks like my query only matches values that are exactly equal to the search term. If there were a way to write a query that works, by analogy, like using .Contains() to find a substring in a string, I think it would solve this problem.

halfer

2 Answers

Leading wildcards are disabled in Lucene by default, by design (Reference).

If your website is not too complicated and you are sure performance is not an issue, you can enable leading wildcards with enableLeadingWildcards="true" by creating your own custom searcher instead of using the default one in Umbraco Examine:

Define the custom searcher in the settings:

<add name="CustomSearchSearcher" 
       type="MyNamespace.MyUmbracoExamineSearcher, MyNamespace"
       analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"
       enableLeadingWildcards="true"/>

Use RawQuery when you want to search:

var searchProvider = ExamineManager.Instance.SearchProviderCollection["CustomSearchSearcher"];
var searchCriteria = searchProvider.CreateSearchCriteria();
searchProvider.Search(searchCriteria.RawQuery("relatedContents:*700*"));
Thang Pham
  • I'm currently using a custom searcher provider to override Lucene for search terms that have accents. I tested that search the other day, field: * term *, and it wasn't working. I think that was because I had enableLeadingWildcards turned off at the time, since I was trying different possibilities to work it out. I did what you said and it worked! Thanks for the help and sorry for the inconvenience; everything I found by googling wasn't very clear about this. – Victor Santos Jul 28 '17 at 06:44
  • OK Victor. Just one thing to add: when you configure a new custom searcher provider, please remember to trigger an index rebuild by republishing at least one node in Umbraco or by executing this: `ExamineManager.Instance.IndexProviderCollection[indexToRebuild].RebuildIndex();` – Thang Pham Jul 28 '17 at 07:21

I don't think the accepted answer is going to solve your issue. You should investigate the analyser you are using.

If you use KeywordAnalyzer, the string is indexed as it is, with the comma, and you will have to use the *. If you use StandardAnalyzer, the string is split into separate terms for your field, so 500 or 700 should find your node.
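
The difference between the two indexing behaviours described above can be illustrated with a minimal sketch in plain C#. It does not call Lucene or Examine at all; it only mimics the idea of storing the value as one whole term versus storing one term per comma-separated ID. The class and method names are hypothetical, and real analyzers apply more rules than a simple comma split, so treat this as an illustration of the matching logic rather than of any analyzer's actual output.

```csharp
using System;
using System.Linq;

public static class TermMatchingSketch
{
    // Mimics keyword-style indexing: the whole value becomes a single term.
    public static string[] AsSingleTerm(string value) => new[] { value };

    // Mimics an analyzer that emits one term per comma-separated ID
    // (an assumption for illustration, not a specific Lucene analyzer).
    public static string[] SplitOnComma(string value) =>
        value.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
             .Select(t => t.Trim())
             .ToArray();

    // A plain term query matches only when a stored term equals
    // the search term exactly; there is no substring matching.
    public static bool Matches(string[] terms, string searchTerm) =>
        terms.Contains(searchTerm);

    public static void Main()
    {
        const string entry1 = "500,700";

        // One whole term "500,700": searching for 700 finds nothing.
        Console.WriteLine(Matches(AsSingleTerm(entry1), "700")); // False

        // Two terms "500" and "700": searching for 700 matches.
        Console.WriteLine(Matches(SplitOnComma(entry1), "700")); // True
    }
}
```

With one term per ID, an exact term query is enough and no leading wildcard is needed.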

If you require a KeywordAnalyzer for your index, you can specify a different analyser for that field specifically. For this you will have to work directly with Lucene, not Examine, and use the PerFieldAnalyzerWrapper.

Mario Lopez