lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "SchemaXml" by MeljeanLegaspi
Date Sun, 12 Dec 2010 02:57:08 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SchemaXml" page has been changed by MeljeanLegaspi.
The comment on this change is: Replaced <fieldtypes> with <types> in the Miscellaneous
Settings section of the document.
http://wiki.apache.org/solr/SchemaXml?action=diff&rev1=40&rev2=41

--------------------------------------------------

  <<TableOfContents>>
  
  == Data Types ==
- 
  The `<types>` section allows you to define a list of `<fieldtype>` declarations
you wish to use in your schema, along with the underlying Solr class that should be used for
that type, as well as the default options you want for fields that use that type.
  
  Any subclass of [[http://lucene.apache.org/solr/docs/api/org/apache/solr/schema/FieldType.html|FieldType]]
may be used as a field type class, using either its full package name, or the "solr" alias
if it is in the default Solr package.  For common numeric types (integer, float, etc.) there
are multiple implementations provided, depending on your needs.  Please see SolrPlugins for
information on how to ensure that your own custom Field Types can be loaded into Solr.
  
-   Common options that field types can have are...
+  . Common options that field types can have are...
- 
-    * `sortMissingLast=true|false`
+   * `sortMissingLast=true|false`
-    * `sortMissingFirst=true|false`
+   * `sortMissingFirst=true|false`
-    * `indexed=true|false`
+   * `indexed=true|false`
-    * `stored=true|false`
+   * `stored=true|false`
-    * `multiValued=true|false`
+   * `multiValued=true|false`
-    * `omitNorms=true|false` 
+   * `omitNorms=true|false`
-    * `omitTermFreqAndPositions=true|false` <!> [[Solr1.4]]
+   * `omitTermFreqAndPositions=true|false` <!> [[Solr1.4]]
-    * `positionIncrementGap=N`
+   * `positionIncrementGap=N`
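  
  As a rough sketch of how these options appear on a declaration (the type names `string` and `sint` are illustrative, in the style of the example schema shipped with Solr):
  
  {{{
  <types>
    <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldtype name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
  </types>
  }}}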
  
  `TextField`s can also support Analyzers with highly configurable [[AnalyzersTokenizersTokenFilters|Tokenizers
and Token Filters]].
  
@@ -31, +29 @@

  
  Field types that store text (`TextField`, `StrField`) support compression of stored contents:
  
-    * `compressed=true|false`
+  * `compressed=true|false`
-    * `compressThreshold=<integer>`
+  * `compressThreshold=<integer>`
  
  `compressThreshold` is the minimum length required for text compression to be invoked. 
This applies only if `compressed=true`; a common pattern is to set `compressThreshold` on
the field type definition, and turn compression on and off in the individual field definitions.
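  
  That pattern might look like the following sketch, where the type name `text_long`, the field names, and the threshold of 1000 are all hypothetical:
  
  {{{
  <fieldtype name="text_long" class="solr.TextField" compressThreshold="1000"/>
  
  <field name="body" type="text_long" indexed="true" stored="true" compressed="true"/>
  <field name="summary" type="text_long" indexed="true" stored="true" compressed="false"/>
  }}}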
  
  === Poly Field Types ===
- /!\ Solr1.5 /!\
- See https://issues.apache.org/jira/browse/SOLR-1131.  This discusses uncommitted code.
+ /!\ Solr1.5 /!\ See https://issues.apache.org/jira/browse/SOLR-1131.  This discusses uncommitted
code.
  
  Some !FieldTypes can be "poly" field types.  A Poly !FieldType is one that can potentially
create multiple Fields per "declared" field.  The primary example in Solr is the PointType.
 Depending on the dimension specified, one or more Fields will be created.  For example:
+ 
  {{{
  <fieldType name="location" class="solr.PointType" dimension="2" subFieldTypes="double"/>
  }}}
- 
  This declares a !FieldType that can be used to represent a point in 2 dimensions (e.g. a lat/lon pair).
 The subFieldTypes value tells Solr what the underlying representation will be for the values
in the field, in this case a !FieldType called "double".
  
  Thus, a Field declaration like:
+ 
  {{{
  <field name="store" type="location" indexed="true" stored="true"/>
  }}}
+ can be indexed like:
  
- can be indexed like:
  {{{
  <add>
  <doc>
@@ -60, +58 @@

  </doc>
  </add>
  }}}
- 
  Under the hood, Solr will create two fields (using dynamic fields) to store the information.
  
  == Fields ==
- 
- The `<fields>` section is where you list the individual `<field>` declarations
you wish to use in your documents.  Each `<field>` has a `name` that you will use to
reference it when adding documents or executing searches, and an associated `type` which identifies
the name of the fieldtype you wish to use for this field. There are various field options
that apply to a field. These can be set in the field type declarations, and can also be overridden
at an individual field's declaration. 
+ The `<fields>` section is where you list the individual `<field>` declarations
you wish to use in your documents.  Each `<field>` has a `name` that you will use to
reference it when adding documents or executing searches, and an associated `type` which identifies
the name of the fieldtype you wish to use for this field. There are various field options
that apply to a field. These can be set in the field type declarations, and can also be overridden
at an individual field's declaration.
  
  === Common field options ===
+ Common options that fields can have are...
  
- Common options that fields can have are...
   * `default`
    * The default value for this field if none is provided while adding documents
   * `indexed=true|false`
@@ -87, +83 @@

   * `omitTermFreqAndPositions=true|false` <!> [[Solr1.4]]
    * If set, omit term frequencies, positions and payloads from postings for this field. This can
be a performance boost for fields that don't require that information, and it reduces the storage
space required for the index. Queries that rely on positions (phrase queries, for example) and are issued on a field with
this option will silently fail to find documents.
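  
  For illustration, a `<fields>` section using a few of these options could look like this (field and type names are assumptions, and each `type` must match a fieldtype declared elsewhere in the schema):
  
  {{{
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="inStock" type="boolean" indexed="true" stored="true" default="true"/>
    <field name="category" type="string" indexed="true" stored="true" omitTermFreqAndPositions="true"/>
  </fields>
  }}}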
  
- 
  See also FieldOptionsByUseCase, which discusses how these options should be set in various
circumstances. See SolrPerformanceFactors for how different options can affect Solr performance.
  
  === Dynamic fields ===
- 
- One of the powerful features of Lucene is that you don't have to pre-define every field
when you first create your index.  Even though Solr provides strong datatyping for fields,
it still preserves that flexibility using "Dynamic Fields".  Using `<dynamicField>`
declarations, you can create field rules that Solr will use to understand what datatype should
be used whenever it is given a field name that is not explicitly defined, but matches a prefix
or suffix used in a dynamicField.  
+ One of the powerful features of Lucene is that you don't have to pre-define every field
when you first create your index.  Even though Solr provides strong datatyping for fields,
it still preserves that flexibility using "Dynamic Fields".  Using `<dynamicField>`
declarations, you can create field rules that Solr will use to understand what datatype should
be used whenever it is given a field name that is not explicitly defined, but matches a prefix
or suffix used in a dynamicField.
  
  For example, the following dynamic field declaration tells Solr that whenever it sees a field
name ending in "_i" which is not an explicitly defined field, it should dynamically create
an integer field with that name:
  
  {{{
      <dynamicField name="*_i"  type="integer"  indexed="true"  stored="true"/>
  }}}
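  
  With that rule in place, a document could carry a field such as `quantity_i` without any matching `<field>` declaration. A hypothetical update message (it assumes the schema also defines an `id` field, and both values are made up for illustration):
  
  {{{
  <add>
    <doc>
      <field name="id">SKU-1</field>
      <field name="quantity_i">12</field>
    </doc>
  </add>
  }}}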
- 
  === Indexing same data in multiple fields ===
- 
  Note that, with textual data, it will often make sense to take what's logically speaking
a single field (e.g. product name) and index it into several different Solr fields, each with
different field options and/or analyzers.
  
  As an example, if I had a field with a list of authors, such as:
  
-   ''Schildt, Herbert; Wolpert, Lewis; Davies, P.''
+  . ''Schildt, Herbert; Wolpert, Lewis; Davies, P.''
-   
+ 
- I might want to index the same data differently in three different fields (perhaps using
the Solr [[SchemaXml#Copy Fields|copyField]] directive):
+ I might want to index the same data differently in three different fields (perhaps using
the Solr [[#Copy_Fields|copyField]] directive):
+ 
-   * For searching: Tokenized, case-folded, punctuation-stripped:
+  * For searching: Tokenized, case-folded, punctuation-stripped:
-       schildt / herbert / wolpert / lewis / davies / p
+   . schildt / herbert / wolpert / lewis / davies / p
-   * For sorting: Untokenized, case-folded, punctuation-stripped:
+  * For sorting: Untokenized, case-folded, punctuation-stripped:
-       schildt herbert wolpert lewis davies p
+   . schildt herbert wolpert lewis davies p
-   * For faceting: Primary author only, using a `solr.StrField`:
+  * For faceting: Primary author only, using a `solr.StrField`:
-       Schildt, Herbert
+   . Schildt, Herbert
  
  (See also SolrFacetingOverview.)
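  
  A minimal sketch of the searching and sorting variants, assuming fieldtypes named `text` and `alphaOnlySort` (as in the Solr example schema) are declared under `<types>`; the field names are hypothetical:
  
  {{{
  <field name="authors" type="text" indexed="true" stored="true"/>
  <field name="authors_sort" type="alphaOnlySort" indexed="true" stored="false"/>
  
  <copyField source="authors" dest="authors_sort"/>
  }}}
  
  Because copyField passes along the original text unchanged, reducing the value to the primary author for a faceting field would have to happen before the document is sent to Solr.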
  
  === Expert field options ===
+ The storage of Lucene term vectors can be triggered using the following field options:
  
- The storage of Lucene term vectors can be triggered using the following field options:
-    * `termVectors=true|false`
+  * `termVectors=true|false`
-    * `termPositions=true|false`
+  * `termPositions=true|false`
-    * `termOffsets=true|false`
+  * `termOffsets=true|false`
  
  These options can be used to accelerate highlighting and other ancillary functionality,
but impose a substantial cost in terms of index size.  They are ''not'' necessary for typical
uses of Solr (phrase queries, etc., do not require these settings to be present).
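  
  For example, a field used for highlighting might enable all three (the field and type names here are placeholders):
  
  {{{
  <field name="content" type="text" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/>
  }}}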
  
  == Miscellaneous Settings ==
- 
- In addition to the `<fieldtypes>` and `<fields>` sections of the schema, there
are several other declarations that can appear in your schema....
+ In addition to the `<types>` and `<fields>` sections of the schema, there are
several other declarations that can appear in your schema.
  
  === The Unique Key Field ===
- 
  The `<uniqueKey>` declaration can be used to inform Solr that there is a field in
your index which should be unique for all documents.  If a document is added that contains
the same value for this field as an existing document, the old document will be deleted.
  
  It is not mandatory for a schema to have a uniqueKey field.
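  
  A typical declaration, assuming the schema defines a field named `id`:
  
  {{{
  <uniqueKey>id</uniqueKey>
  }}}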
  
  === The Default Search Field ===
- 
  The `<defaultSearchField>` is used by Solr when parsing queries to identify which
field name should be searched in queries where an explicit field name has not been used.
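  
  For instance, assuming a catch-all field named `text` exists in the schema:
  
  {{{
  <defaultSearchField>text</defaultSearchField>
  }}}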
  
  /!\ :TODO: /!\ check whether this option is also used by the DisMaxRequestHandler and not
only by the StandardRequestHandler
  
  === Default query parser operator ===
- 
  The default operator used by Solr's query parser ([[http://lucene.apache.org/solr/docs/api/org/apache/solr/search/SolrQueryParser.html|SolrQueryParser]])
can be configured with `<solrQueryParser defaultOperator="AND|OR"/>`.  The default operator
is "OR" if unspecified.
  
+ <<Anchor(copyField)>>
  
- <<Anchor(copyField)>>
  === Copy Fields ===
- 
  Any number of `<copyField>` declarations can be included in your schema, to instruct
Solr that you want it to duplicate any data it sees in the "source" field of documents that
are added to the index, in the "dest" field of that document.  You are responsible for ensuring
that the datatypes of the fields are compatible. The original text is sent from the "source"
field to the "dest" field, before any configured analyzers for the originating or destination
field are invoked.
  
+ This is provided as a convenient way to ensure that data is put into several fields, without
needing to include the data in the update command multiple times. The maxChars property may
be used in a copyField declaration.   This simply limits the number of characters copied.
 For example:
- This is provided as a convenient way to ensure that data is put into several fields, without
needing to include the data in the update command multiple times.
- The maxChars property may be used in a copyField declaration.   This simply limits the number
of characters copied.  For example:
  
  {{{
   <copyField source="body" dest="teaser" maxChars="300"/>
  }}}
+ A common requirement is to copy or merge all input fields into a single solr field. This
can be done as follows:-
  
- A common requirement is to copy or merge all input fields into a single solr field. This
can be done as follows:-
  {{{
   <copyField source="*" dest="text"/>
  }}}
- 
- 
- 
  === Similarity ===
  A `<similarity>` declaration can be used to specify the subclass of Similarity that
you want Solr to use when dealing with your index.  If no Similarity class is specified, the
Lucene !DefaultSimilarity is used.  Please see SolrPlugins for information on how to ensure
that your own custom Similarity can be loaded into Solr.
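  
  A sketch of such a declaration; the class name below is hypothetical and stands in for any Similarity subclass available on Solr's classpath:
  
  {{{
  <similarity class="com.example.MyCustomSimilarity"/>
  }}}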
  
