lucene-solr-commits mailing list archives

From Apache Wiki <>
Subject [Solr Wiki] Update of "TermVectorComponent" by GrantIngersoll
Date Sat, 03 Jul 2010 00:52:40 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "TermVectorComponent" page has been changed by GrantIngersoll.


  = Introduction =
  <!> Solr 1.4 <!>
  The Term Vector Component (TVC) is a !SearchComponent designed to return information about
documents that is stored when the termVectors attribute is set on a field:
  <field name="features" type="text" indexed="true" stored="true" multiValued="true" termVectors="true"
termPositions="true" termOffsets="true"/>
  For each document, the TVC can return the term vector, term frequency, inverse document
frequency, position, and offset information.  As with most components, it has a number of
options, which are outlined in the samples below.
  = Sample Usage =
  All examples are based on using the Solr example.
  == Enabling the TVC ==
  === Changes required in solrconfig.xml ===
  You need to enable the TermVectorComponent in your Solr configuration:
  <searchComponent name="tvComponent" class="org.apache.solr.handler.component.TermVectorComponent"/>
  A RequestHandler configuration using this component could look like this:
  <requestHandler name="tvrh" class="org.apache.solr.handler.component.SearchHandler">
- 	<lst name="defaults">
+         <lst name="defaults">
- 		<bool name="tv">true</bool>
+                 <bool name="tv">true</bool>
- 	</lst>
+         </lst>
- 	<arr name="last-components">
+         <arr name="last-components">
- 		<str>tvComponent</str>
- 	</arr>
+                 <str>tvComponent</str>
+         </arr>
  </requestHandler>
  === HTTP Requests ===
  In the example, the component is associated with a request handler named tvrh, but you can
associate it with any !RequestHandler.  To turn on the component for a request, add the {{{tv=true}}}
parameter (or add it to your !RequestHandler defaults configuration).
- Example output:
- See TermVectorComponentExampleEnabled.
+ Example output: See TermVectorComponentExampleEnabled.
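As a rough illustration of the request described above, the following Python sketch builds a URL that turns the component on. The host, port, and path are assumptions based on the default Solr example setup, and selecting the handler by name through the qt parameter is likewise an assumption; the helper name term_vector_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed location of the Solr example server; adjust for your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def term_vector_url(query, handler="tvrh"):
    """Build a request URL that turns the Term Vector Component on."""
    params = [
        ("q", query),
        ("qt", handler),  # assumption: the handler is selected by name via qt
        ("tv", "true"),   # enables the component for this request
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


print(term_vector_url("solr"))
```

If tv=true is already set in the handler defaults, as in the configuration above, the tv entry is redundant but harmless.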
  == Options ==
   * - Return document term frequency info per term in the document.
@@ -58, +49 @@

   * tv.tf_idf - Calculates tf*idf for each term.  Requires the parameters and tv.df
to be "true". This can be expensive. (not shown in example output)
  Alternatively, a shortcut that turns all options on is:
   * tv.all=true
  Example output: See TermVectorComponentExampleOptions.
  For schema requirements, see FieldOptionsByUseCase.
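A minimal Python sketch of how the option flags might be assembled into request parameters, either individually or through the tv.all shortcut. The function name term_vector_params and its arguments are hypothetical; only the tv, tv.all, and tv.<option> parameter names come from the page.

```python
from urllib.parse import urlencode


def term_vector_params(query, options=(), all_options=False):
    """Return query parameters for a term vector request.

    options: option names without the "tv." prefix, e.g. ("df", "tf_idf").
    all_options: if True, send tv.all=true instead of individual flags.
    """
    params = [("q", query), ("tv", "true")]
    if all_options:
        params.append(("tv.all", "true"))
    else:
        params.extend(("tv." + name, "true") for name in options)
    return params


print(urlencode(term_vector_params("solr", all_options=True)))
# q=solr&tv=true&tv.all=true
```

Passing individual options keeps responses smaller; tv.all is convenient for exploration but also turns on the more expensive calculations.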
+ === Per Field Options ===
+ With, it is now possible to specify per
field options, similar to the way per field options work in faceting, as in
+  * - Turns on Term Frequency for the fieldName specified.
+  * Similar for all the other options above
+ If you do not specify per field options but still specify a field, it will assume the general
  == Other Options ==
   * tv.fl - List of fields to get TV information from.  Optional.  If not specified, the
fl parameter is used.
+   * As of, If the field does not exist,
an exception is thrown
   * tv.docIds - List of Lucene document ids (not the Solr Unique Key) to get term vectors
+ == Warnings ==
+ If a requested field does not support the options specified, warnings are returned indicating
that the field does not support that option.  There are three types of warnings:
+  1. noTermVector - The field does not store term vectors
+  1. noPositions - The field does not store positions
+  1. noOffsets - The field does not store offsets
+ Each of these items is a List of Strings containing the names of the fields that do not support
the option specified.
  == SolrJ ==
  Neither the SolrQuery class nor the QueryResponse class offers specific method calls to set
TermVectorComponent parameters or get the "termVectors" output. However, there is a patch
for it: [[|SOLR-949]].
  == History ==
