lucene-commits mailing list archives

From "Hoss Man (Confluence)" <>
Subject [CONF] Apache Solr Reference Guide > Solr Glossary
Date Fri, 27 Sep 2013 01:15:00 GMT
Space: Apache Solr Reference Guide (
Page: Solr Glossary (

Change Comment:
fix link anchors

Edited by Hoss Man:
Where possible, terms are linked to relevant parts of the Solr Reference Guide for more information.

*Jump to a letter:*

[A|#A] [B|#B] [C|#C] [D|#D] [E|#E] [F|#F] G H [I|#I] J K [L|#L] [M|#M] [N|#N] [O|#O] P [Q|#Q]
[R|#R] [S|#S] [T|#T] U V [W|#W] X Y [Z|#Z]

h2. A

h6. [Atomic updates|Updating Parts of Documents#Atomic Updates]
An approach to updating only one or more fields of a document, instead of reindexing the entire document.
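
As a rough sketch of what such a partial update can look like, the Python below builds a JSON update body using the {{set}} and {{inc}} modifiers from Solr's atomic update syntax. The document id, the field names ({{price}}, {{popularity}}) and the helper {{build_atomic_update}} are hypothetical, chosen only for illustration; they are not part of any Solr client.

```python
import json

def build_atomic_update(doc_id, set_fields=None, inc_fields=None):
    """Build one atomic-update document: only the listed fields change."""
    doc = {"id": doc_id}
    for name, value in (set_fields or {}).items():
        doc[name] = {"set": value}   # replace the field's value
    for name, amount in (inc_fields or {}).items():
        doc[name] = {"inc": amount}  # add to a numeric field
    return doc

# Hypothetical document id and field names, for illustration only.
update = build_atomic_update(
    "book1", set_fields={"price": 99}, inc_fields={"popularity": 20}
)
payload = json.dumps([update])  # an update request carries a list of documents
print(payload)
```

Fields that are not named in the update are left as they were, which is the point of the approach.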

h2. B

h6. Boolean operators
These control the inclusion or exclusion of keywords in a query by using operators such as
AND, OR, and NOT.
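
One way to picture the three operators is as set operations over the documents that match each keyword. The sketch below assumes a tiny hand-written mapping from term to document ids; the data and the helper {{docs_with}} are made up for illustration and say nothing about how Solr evaluates queries internally.

```python
# Hypothetical term -> matching document ids, for illustration only.
index = {
    "solr": {1, 2},
    "lucene": {2, 3},
    "zookeeper": {3},
}

def docs_with(term):
    """Return the ids of documents containing `term` (empty set if none)."""
    return index.get(term, set())

both = docs_with("solr") & docs_with("lucene")         # solr AND lucene
either = docs_with("solr") | docs_with("lucene")       # solr OR lucene
only_solr = docs_with("solr") - docs_with("lucene")    # solr NOT lucene
print(both, either, only_solr)
```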

h2. C

h6. Cluster
In Solr, a cluster is a set of Solr nodes managed as a unit. A cluster may contain many cores,
collections, shards, and/or replicas. See also [#SolrCloud].

h6. Collection
In Solr, one or more documents grouped together in a single logical index. A collection must
have a single schema, but can be spread across multiple cores.

In [#ZooKeeper], a group of cores managed together as part of a SolrCloud installation. 

h6. Commit
To make document changes permanent in the index. Added documents, for example, become
searchable after a _commit_.

h6. Core
An individual Solr instance (represents a logical index). Multiple cores can run on a single
node. See also [#SolrCloud].

h6. Core reload
To re-initialize Solr after changes to {{schema.xml}}, {{solrconfig.xml}} or other configuration files.

h2. D

h6. Distributed search
A search in which queries are processed across more than one [shard|#Shard].

h6. Document
One or more Fields and their values that are considered related for indexing. See also [Field|#Field].

h2. E

h6. Ensemble
A [#ZooKeeper] term to indicate multiple ZooKeeper instances running simultaneously.

h2. F

h6. Facet
The arrangement of search results into categories based on indexed terms.
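
Conceptually, faceting on a field amounts to counting how many matching documents carry each value of that field. The following is a minimal sketch over an in-memory list of dictionaries; the documents, the {{category}} field and the function {{facet_counts}} are assumptions for illustration, not Solr's implementation.

```python
from collections import Counter

def facet_counts(docs, field):
    """Count how many documents carry each value of `field`."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is not None:   # documents without the field are skipped
            counts[value] += 1
    return counts

# Hypothetical search results, for illustration only.
results = [
    {"id": 1, "category": "books"},
    {"id": 2, "category": "music"},
    {"id": 3, "category": "books"},
    {"id": 4},
]
counts = facet_counts(results, "category")
print(counts.most_common())
```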

h6. Field
The content to be indexed/searched along with metadata defining how the content should be
processed by Solr.

h2. I

h6. Inverse document frequency (IDF)
A measure of the general importance of a term. It is calculated as the total number of Documents
divided by the number of Documents in the collection in which a particular word occurs. See []
and [] for more
info on TF-IDF based scoring and Lucene scoring in particular. See also [#Term frequency].
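
The ratio described above can be worked through in a few lines. This sketch follows the glossary's wording (total documents divided by documents containing the term) and also shows a log-damped variant, which is common in TF-IDF scoring; it does not reproduce the exact formula Lucene uses. The sample documents and function names are hypothetical.

```python
import math

def document_frequency(term, docs):
    """Number of documents (each a set of terms) that contain `term`."""
    return sum(1 for doc in docs if term in doc)

def idf_ratio(term, docs):
    """Total documents divided by documents containing the term."""
    df = document_frequency(term, docs)
    if df == 0:
        raise ValueError(f"term {term!r} does not occur in any document")
    return len(docs) / df

# Hypothetical collection: each document is reduced to its set of terms.
docs = [
    {"solr", "search"},
    {"solr", "index"},
    {"lucene", "index"},
    {"zookeeper"},
]
common = idf_ratio("solr", docs)       # occurs in 2 of 4 documents
rare = idf_ratio("zookeeper", docs)    # occurs in 1 of 4 documents
damped = math.log(rare)                # log-damped variant of the ratio
print(common, rare, damped)
```

A rarer term gets a larger value, which is why it counts as more "important" when scoring.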

h6. Inverted index
A way of creating a searchable index that lists every word and the documents that contain
those words, similar to an index in the back of a book which lists words and the pages on
which they can be found.  When performing keyword searches, this method is considered more
efficient than the alternative, which would be to create a list of documents paired with every
word used in each document. Since users search using terms they expect to be in documents,
finding the term before the document saves processing resources and time.
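
To make the book-index analogy concrete, here is a minimal sketch that builds a word-to-documents mapping from a handful of strings. Tokenizing by lowercasing and splitting on whitespace is a simplifying assumption, and {{build_inverted_index}} is a hypothetical helper rather than anything from Lucene or Solr.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to the set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

# Hypothetical documents, for illustration only.
docs = {
    1: "Solr is a search server",
    2: "Lucene is a search library",
}
index = build_inverted_index(docs)
print(sorted(index["search"]))  # looking up a word yields its documents directly
```

A keyword lookup is then a single dictionary access, rather than a scan over every document's word list.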

h2. L

h6. Leader
The main replica for each shard, which routes document adds, updates, or deletes to the other
replicas of the same shard. See also [#SolrCloud].

h2. M

h6. Metadata
Literally, _data about data_.  Metadata is information about a document, such as its title,
author, or location.

h2. N

h6. Natural language query
A search that is entered as a user would normally speak or write, as in, "What is aspirin?"

h6. Node
A JVM instance running Solr. Also known as a Solr server.

h2. O

h6. [Optimistic concurrency|Updating Parts of Documents#Optimistic Concurrency]
Also known as "optimistic locking", this is an approach that allows for updates to documents
currently in the index while retaining locking or version control.
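
The general idea can be sketched with an in-memory store that rejects a write when the caller's expected version no longer matches the stored one. The classes below are hypothetical and only illustrate the pattern; they are not Solr's actual mechanism.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the document."""

class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, fields)

    def get(self, doc_id):
        return self._docs[doc_id]

    def put(self, doc_id, fields, expected_version=None):
        """Store `fields`; if `expected_version` is given it must match."""
        current_version = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(fields))
        return new_version

store = VersionedStore()
v1 = store.put("doc1", {"title": "first"})
v2 = store.put("doc1", {"title": "second"}, expected_version=v1)  # succeeds
try:
    store.put("doc1", {"title": "stale"}, expected_version=v1)    # v1 is outdated
    conflict = False
except VersionConflict:
    conflict = True
print(v1, v2, conflict)
```

No lock is held between reading and writing; a conflicting writer simply finds out at write time and can retry.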

h6. Overseer
The name of the SolrCloud process that coordinates the cluster. It keeps track of existing
nodes and shards, and assigns shards to nodes. See also [#SolrCloud].

h2. Q

h6. Query parser
A query parser processes the terms entered by a user.

h2. R

h6. Recall
The ability of a search engine to retrieve _all_ of the possible matches to a user's query.
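
Recall is commonly expressed as a number: the fraction of all relevant documents that the search actually returned. The sketch below computes that fraction for made-up document ids; the function name and data are illustrative only.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that appear in the results."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when nothing is relevant")
    return len(set(retrieved) & relevant) / len(relevant)

# Hypothetical ids: the engine returned 1, 2, 3; documents 2, 3, 4, 5 were relevant.
score = recall(retrieved=[1, 2, 3], relevant=[2, 3, 4, 5])
print(score)
```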

h6. Relevance
The appropriateness of a document to the search conducted by the user.

h6. Replica
A copy of a shard or single logical index, for use in failover or load balancing. 

h6. [Replication|solr:Index Replication]
A method of copying a master index from one server to one or more "slave" or "child" servers.

h6. [RequestHandler|solr:RequestHandlers and SearchComponents in SolrConfig]
Logic and configuration parameters that tell Solr how to handle incoming "requests", whether
the requests are to return search results, to index documents, or to handle other custom situations.

h2. S

h6. [SearchComponent|solr:RequestHandlers and SearchComponents in SolrConfig]
Logic and configuration parameters used by request handlers to process query requests. Examples
of search components include faceting, highlighting, and "more like this" functionality.

h6. Shard
In SolrCloud, a logical section of a single collection. This may be spread across multiple
nodes. See also [#SolrCloud].

h6. [SolrCloud]
Umbrella term for a suite of functionality in Solr which allows managing a cluster of Solr
servers for scalability, fault tolerance, and high availability.

h6. Solr Schema (schema.xml)
The Apache Solr index schema. The schema defines the fields to be indexed and the type of
each field (text, integers, etc.). The schema is stored in schema.xml and is located in the
Solr home conf directory.

h6. SolrConfig (solrconfig.xml)
The Apache Solr configuration file. Defines indexing options, RequestHandlers, highlighting,
spellchecking and various other configurations. The file, solrconfig.xml, is located in the
Solr home conf directory.

h6. Spell Check
The ability to suggest alternative spellings of search terms to a user, as a check against
spelling errors causing few or zero results. 

h6. Stopwords
Generally, words that have little meaning to a user's search but which may have been entered
as part of a [natural language|#Naturallanguagequery] query. Stopwords are generally very
short pronouns, conjunctions and prepositions (such as "the", "with", or "and").
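
A minimal sketch of stopword removal, using the three example words from the definition as the stopword list. The list, the whitespace tokenization and the function {{remove_stopwords}} are simplifying assumptions for illustration.

```python
# Tiny stopword list taken from the examples in the definition above.
STOPWORDS = {"the", "with", "and"}

def remove_stopwords(query, stopwords=STOPWORDS):
    """Lowercase the query, split on whitespace, and drop stopwords."""
    return [word for word in query.lower().split() if word not in stopwords]

kept = remove_stopwords("The cat with the hat and gloves")
print(kept)
```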

h6. [solr:Suggester]
Functionality in Solr that provides the ability to suggest possible query terms to users as
they type.

h6. Synonyms
Synonyms generally are terms which are near to each other in meaning and may substitute for
one another. In a search engine implementation, synonyms may be abbreviations as well as words,
or terms that are not consistently hyphenated. Examples of synonyms in this context would
be "Inc." and "Incorporated" or "iPod" and "i-pod".
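
One simple way to apply synonyms is to expand a term into every member of its synonym group before matching. The sketch below uses the two example pairs from the definition; the group list and the function {{expand_term}} are hypothetical.

```python
# Synonym groups taken from the examples in the definition above (lowercased).
SYNONYMS = [
    {"inc.", "incorporated"},
    {"ipod", "i-pod"},
]

def expand_term(term, groups=SYNONYMS):
    """Return every term considered equivalent to `term`, itself included."""
    term = term.lower()
    for group in groups:
        if term in group:
            return set(group)
    return {term}  # no synonyms known: the term stands alone

expanded = expand_term("iPod")
print(sorted(expanded))
```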

h2. T

h6. Term frequency
The number of times a word occurs in a given document. See []
and [] for more info on TF-IDF based scoring
and Lucene scoring in particular.
See also [#Inverse document frequency (IDF)].
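
Counting term frequency for a single document can be sketched as follows. Lowercasing and splitting on whitespace stand in for real text analysis, and the sample text is made up.

```python
from collections import Counter

def term_frequencies(text):
    """Count how many times each word occurs in one document."""
    return Counter(text.lower().split())

tf = term_frequencies("to be or not to be")
print(tf["to"], tf["be"], tf["or"])
```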

h6. Transaction log
An append-only log of write operations maintained by each node. This log is only required
with SolrCloud implementations and is created and managed automatically by Solr.

h2. W

h6. Wildcard
A wildcard allows a substitution of one or more letters of a word to account for possible
variations in spelling or tenses.
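
As an illustration, the sketch below matches words against patterns where {{?}} stands for exactly one character and {{*}} for any run of characters. It relies on Python's shell-style {{fnmatch}} module as a stand-in; the word list and the helper {{match_wildcard}} are hypothetical, and this is not how Solr evaluates wildcard queries.

```python
from fnmatch import fnmatchcase

def match_wildcard(pattern, words):
    """Return the words that match a shell-style wildcard pattern."""
    return [word for word in words if fnmatchcase(word, pattern)]

# Hypothetical indexed words, for illustration only.
words = ["test", "text", "tests", "run", "running"]
single = match_wildcard("te?t", words)   # '?' matches exactly one character
multi = match_wildcard("run*", words)    # '*' matches zero or more characters
print(single, multi)
```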

h2. Z

h6. ZooKeeper
Also known as [Apache ZooKeeper|]. The system used by SolrCloud
to keep track of configuration files and node names for a cluster. A ZooKeeper cluster is
used as the central configuration store for the cluster, a coordinator for operations requiring
distributed synchronization, and the system of record for cluster topology. See also [#SolrCloud].

