76

I'm new to Apache Solr. Even after reading the documentation, I'm finding it difficult to clearly understand the functionality and use of the multiValued field type property.

How does Solr internally treat/handle a field that is marked as multiValued?

What is the difference in indexing in Solr between a field that is multiValued and one that is not?

Can somebody explain with a good example?

The documentation says:

multiValued=true|false

True if this field may contain multiple values per document, i.e. if it can appear multiple times in a document

Mikael Engver
Gnanam

3 Answers

77

A multivalued field is useful when a field can hold more than one value per document. An easy example is tags: a document can have several tags that all need to be indexed. If the tags field is declared multiValued, the Solr response returns a list for it instead of a single string value. One point to note is that you need to submit one field element per value of the tags, like:

<field name="tags">tag1</field>
<field name="tags">tag2</field>
...
<field name="tags">tagn</field>
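
As a minimal sketch of the repeated-field pattern, the following Python snippet builds an XML update message with one field element per tag, using only the standard library. The helper name `build_add_xml` and the sample document are assumptions made for illustration; sending the message to Solr is left out.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Build an <add><doc>...</doc></add> message from a dict (hypothetical helper).

    A list value is written as one <field> element per item, which is how
    the values of a multiValued field are submitted in the XML format.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(item)
    return ET.tostring(add, encoding="unicode")


# Sample document: the id and tag values are made up for the example.
xml_message = build_add_xml({"id": "1", "tags": ["tag1", "tag2", "tag3"]})
print(xml_message)
```

The single-valued id produces one field element, while the tags list produces three elements that share the same name.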

Once all the values are indexed, you can search or filter results by any value, e.g. you can find all documents with tag1 using a query like

q=tags:tag1

or use the tags to restrict the results with a filter query like

q=query&fq=tags:tag1
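
A small sketch of how a client might assemble these two query strings, assuming the parameters are simply URL-encoded and appended to a select request. The function name `build_select_query` is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_select_query(user_query, tag=None):
    """Return an encoded query string with q and, optionally, an fq on tags."""
    params = [("q", user_query)]
    if tag is not None:
        # The filter restricts the result set to documents carrying this tag.
        params.append(("fq", f"tags:{tag}"))
    return urlencode(params)


main_query = build_select_query("tags:tag1")
filtered_query = build_select_query("query", tag="tag1")
print(main_query)
print(filtered_query)
```

Note that `urlencode` percent-encodes the colon, so the strings look slightly different from the hand-written ones while carrying the same parameters.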
Steve Chambers
Umar
    What is the difference/advantage between doing `search or filter results by any value`? In this case, what difference does it make to search with *tags:tag1* in `q` or in `fq`? – Gnanam Apr 27 '11 at 08:07
  • 1
    Each value can be a string, and you can exact-match against a set of strings. In the case of a single-valued field you can either have tokenized words or the entire string. Another use is to store values that are lists, as I mentioned in the case of tags; they can also be numbers, like a list of numeric values. – Umar Apr 27 '11 at 18:48
  • 6
    @Gnanam: Filtered queries are cached and do not affect the score. Their main purpose is to create a fixed "superset" of documents, which then can be searched. Example: The user enters a query and the application applies additional constraints, for example to only search the documents the user owns. In this case the application would send the constraint "only given user" as `fq` and actual search query as `q`. – Daniel Rikowski Oct 17 '11 at 11:19
  • What if you do not know the values of tags. For example, when indexing a collection of papers, you want to set "keyword" as a multiValue field, but you do not know all the values! – fanchyna Nov 25 '14 at 15:27
16

multiValued defines in the schema whether the field is allowed to have more than one value.

For instance:
if I have a field called id which is declared multiValued=false, indexing a document such as this:

doc {
  id : [ 1, 2]
  ...
}

would cause an exception to be thrown in the indexing thread, and the document would not be indexed (schema validation will fail).

On the other hand, if I do have multiple values for a field, I would want to set multiValued=true in order to guarantee that indexing is done correctly, for example:

doc {
  id : 1
  keywords: [ hello, world ]
  ...
}

In this case you would define "keywords" as a multiValued field.
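
The rule can be mimicked with a toy check in Python. This is only an illustration of the behaviour described here, not Solr's actual validation code; the `SCHEMA` mapping, the `validate` function, and the error message are all assumptions.

```python
# Toy schema: field name -> whether the field is declared multiValued.
# The names mirror the two example documents; nothing is read from a real schema.
SCHEMA = {"id": False, "keywords": True}


def validate(doc, schema=SCHEMA):
    """Raise ValueError if a single-valued field is given several values."""
    for name, value in doc.items():
        if isinstance(value, list) and len(value) > 1 and not schema[name]:
            raise ValueError(f"multiple values for non-multiValued field {name!r}")
    return True


# Accepted: only the multiValued field carries a list.
accepted = validate({"id": 1, "keywords": ["hello", "world"]})

# Rejected: id is single-valued but receives two values.
rejected = None
try:
    validate({"id": [1, 2]})
except ValueError as err:
    rejected = str(err)

print(accepted, rejected)
```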

Asaf
  • 1
    Let me know whether I've understood this correctly. For example, if I try to index data directly from a database using `DataImportHandler` and the type of one of my database fields, *tag*, is `VARCHAR[]` (varchar array), then it would make sense to map this *tag* field to a multiValued field in the Solr schema. Am I correct in my understanding? – Gnanam Apr 27 '11 at 07:58
  • 1
    late to the game here, but I would generally say yes... but never say never and never say always – markgiaconia Sep 10 '14 at 21:50
12

I use multivalued fields only with copyFields. Think of it this way: every field is single-valued unless it is the destination of a copyField. For example, I have the following fields:

<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="string" indexed="true" stored="true"/>
<field name="subject" type="string" indexed="true" stored="true"/>
<field name="location" type="string" indexed="true" stored="true"/>

I want to query only one field, yet still be able to search all 4 fields above, so we need to use copyField. First create a new field called 'all', then copy everything into 'all':

<field name="all" type="text" indexed="true" stored="true" multiValued="true"/>
<copyField source="*" dest="all"/>

Now the field 'all' needs to be multiValued, because it receives one value from each of the source fields.
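
To see why, here is a toy Python model of the copy step. It is not how Solr implements copyField internally; the function `apply_copy_field` and the sample values are made up for the illustration.

```python
def apply_copy_field(doc, dest="all"):
    """Toy model of <copyField source="*" dest="all"/> (hypothetical helper).

    Every value of every other field is appended to the destination, so the
    destination ends up holding several values.
    """
    copied = []
    for name, value in doc.items():
        if name == dest:
            continue
        copied.extend(value if isinstance(value, list) else [value])
    result = dict(doc)
    result[dest] = copied
    return result


# Sample document with the four single-valued fields; the values are invented.
doc = {"id": "42", "name": "Solr notes", "subject": "search", "location": "Berlin"}
indexed = apply_copy_field(doc)
print(indexed["all"])
```

Each of the four source fields contributes one value, so 'all' holds four values for a single document.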

waynet