Questions tagged [lucene]

The term "Lucene" refers to the open-source Java full-text search engine library, and also to the entire ecosystem that grew around it, including lucene.net, solr, elasticsearch and zend-search-lucene.

The term "Lucene" refers to the open-source Java full-text search engine library, and also to the entire ecosystem that grew around it, including lucene.net, solr, and elasticsearch. "Lucene" may also be used to refer to top-level projects like Nutch and Tika, which were once sub-projects of Lucene.

Use the "Lucene" tag if any of the following applies:

  • The question is about the Java library
  • The question is about a port of the library, but would make sense to people who know the Java library (many Lucene.NET questions match these criteria).
  • The question is so general it doesn't apply to a specific implementation (example).

References:

Basic Demo:

A basic "getting started" demo showing how to build and query an index is provided as part of the official documentation:

Basic Demo documentation (this link is for Lucene v8.7.0; newer versions may be available)

Links to the demo's source files are provided in the above documentation.

The source code can also be found here on GitHub.
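
To give a feel for what "building and querying an index" means before opening the official demo, here is a toy sketch in plain Java. It does not use the Lucene API at all: the class name `ToyInvertedIndex` and its methods `add` and `search` are hypothetical, invented for this illustration. It only shows the general idea of an inverted index, which maps each term to the documents that contain it, under the assumption of simple lowercase tokenization and AND semantics for multi-term queries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy illustration only; this is NOT how Lucene is implemented or used.
public class ToyInvertedIndex {
    // term -> sorted set of ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    /** Adds a document to the index and returns its id. */
    public int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Returns the ids of documents containing every query term (AND semantics). */
    public List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : tokenize(query)) {
            TreeSet<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term: start from its postings
            } else {
                result.retainAll(docs);         // later terms: intersect
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    // Lowercases the text and splits it on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("Lucene is a full-text search library");
        index.add("Solr is built on Lucene");
        index.add("Luke explores an index");
        System.out.println(index.search("lucene"));
        System.out.println(index.search("lucene solr"));
    }
}
```

The real library adds analyzers, scoring, on-disk segments and much more; the official demo linked above is the place to see the actual API.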

Luke - a Lucene GUI Client:

Luke is a GUI client application that can be used to explore your Lucene indexes. Recent versions of Luke are provided as part of each Lucene binary release, which can be downloaded from here.

After downloading the binary release, unzip it, and go to the luke directory. Launch the client using the provided luke.bat or luke.sh scripts.

11633 questions
3
votes
1 answer

Using Lucene just as an inverted index

Lucene has great support for incremental indexing, which is normally a pain when developing an IR system from scratch. I would like to know if I can use low-level Lucene APIs to use it only as an inverted index, i.e., storage for inverted lists,…
Felipe Hummel
3
votes
2 answers

How do I select the top term buckets based on a rescore function in Elasticsearch

Consider the following query for Elasticsearch 5.6: { "size": 0, "query": { "match_all": {} }, "rescore": [ { "window_size": 10000, "query": { "rescore_query": { "function_score": { …
LaserJesus
3
votes
1 answer

Solr partial document index update

I'm using Solr and Solr:Cell plugin to index and search rich text documents and metadata. DEFINITION: solr_document = tuple(rich_text_document, metadata1, metadata2) I want to reindex some solr_documents when metadata changes, but only the parts…
clyfe
3
votes
1 answer

ElasticSearch: preserve_position_increments not working

According to the docs https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html preserve_position_increments=false is supposed to make consecutive keywords in a string searchable. But for me it's not working.…
Phil
3
votes
3 answers

What are the best practices for combining analyzers in Lucene?

I have a situation where I'm using a StandardAnalyzer in Lucene to index text strings as follows: public void indexText(String suffix, boolean includeStopWords) { StandardAnalyzer analyzer = null; if (includeStopWords) { …
mr morgan
3
votes
2 answers

Full Text Search with multiple index and complex requirements

We are building an application which will require us to index data for each of our users so that we can provide full text search on their data. Here are some notable things about the application: A) The data for every user is totally unrelated to…
Shrinath
3
votes
1 answer

Asynchronously build Hibernate Search index to ensure no downtime.

We are using Hibernate Search (Lucene engine) to enable fuzzy search for some text data that we store in a SQL Server database and that is consumed by a search service written in Java 8. The data source for the search is a table with moderate edit/update…
3
votes
3 answers

Solr requests time out during index update. Perhaps replication a possible solution?

We are running a Solr installation (a standard Jetty environment; we just added some fields to the schema). The index is about 80k documents of average size (probably 20 fields with about 100 characters each). The problem is that from…
The Surrican
3
votes
1 answer

is Lucene + classic query search syntax same as using AND

A Lucene query of the form field1:+"term1" field2:+"term2" seems to be equivalent to field1:"term1" OR field2:"term2". I expected it to be equivalent to field1:"term1" AND field2:"term2" (i.e., for my particular query on my database query 1 and…
Paul Taylor
3
votes
3 answers

Why did they create the concept of "schema.xml" in Solr?

Lucene does searching and indexing, all by taking "coding"... Why doesn't Solr do the same? Why do we need a schema.xml? What's its importance? Is there a way to avoid placing all the fields we want into a schema.xml? (I guess dynamic fields are…
Shrinath
3
votes
2 answers

Querying against a comma separated list of IDs with Examine and Lucene.Net?

I am using Examine for Umbraco (which is built on top of Lucene.net) to do my search. I am quite sure my problem is Lucene-related. One of my fields contains a list of comma-separated IDs. How do I query this field in the right way? E.g., I have a…
ThomasD
3
votes
2 answers

Lucene 6.2.1 How to get all field names or search across all fields without knowing their names

I'm new to Lucene and I would like to know if there is a way to search through all possible fields in multiple documents without knowing their names or... another approach: to get all field names (version 6.2.1). For instance: How to get all names…
Rays
3
votes
1 answer

Limit couchdb-lucene results by key / specific field? map?

I have a pretty straight-forward question. I am using couchdb-lucene to search the full text of my documents. My documents each have the following fields: _id _rev docID (the unique ID of the document from our system) title (title of the…
3
votes
2 answers

How do I store the lucene index in a database?

This is my sample code: MysqlDataSource dataSource = new MysqlDataSource(); dataSource.setUser("root"); dataSource.setPassword("ncl"); dataSource.setDatabaseName("userdb"); dataSource.setEmulateLocators(true); //This is important because we are…
devesh
3
votes
4 answers

Solr 100% Write Availability During Optimize

So here's my dilemma... I'm running a realtime search index with Solr, indexing about 6M documents per day. The documents expire after about 7 days. So every day, I add 6M documents, and delete 6M documents. Unfortunately, I need to run "optimize"…
devinfoley