lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "TermsComponent" by GrantIngersoll
Date Fri, 28 Nov 2008 13:27:24 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by GrantIngersoll:
http://wiki.apache.org/solr/TermsComponent

New page:
= Introduction =

The !TermsComponent !SearchComponent is a simple plugin that provides access to Lucene's term
dictionary (the !TermEnum).  This is useful for auto-suggest and other features that
operate at the term level rather than the search or document level.  Currently, the !TermsComponent
provides only !TermEnum access, not !TermDocs or position information.  This kind of lookup
should be very fast.

See http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/index/TermEnum.html

See http://lucene.apache.org/java/2_4_0/fileformats.html for what Lucene's file formats look
like.


= How it Works =

To use the !TermsComponent, pass request parameters that select which part of the !TermEnum
is returned.  The supported parameters are defined in the org.apache.solr.common.params.TermsParams
class.  These params are:

 * terms={true|false} - Turns on the !TermsComponent.
 * terms.fl={FIELD NAME} - Required.  The name of the field to get the terms from.
 * terms.lower={The lower bound term} - Optional.  The term to start at.  If not specified, the empty string is used, meaning start at the beginning of the field.
 * terms.upper={The upper bound term} - The term to stop at.  One of terms.upper, terms.rows, or rows must be set.
 * terms.upr.incl={true|false} - Optional.  Include the upper bound term in the result set.  Default is false.
 * terms.lwr.incl={true|false} - Optional.  Include the lower bound term in the result set.  Default is true.
 * terms.rows={integer} - The number of results to return.  One of terms.upper, terms.rows, or rows must be set.  If terms.rows is not specified, rows (CommonParams.ROWS) is used; if that is not specified either, the default is 10.
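
As a rough illustration of how these parameters combine into a request, here is a minimal Python sketch.  The helper name build_terms_url is hypothetical, and the base URL is simply the one used in the examples on this page; only the parameter names come from the list above.

```python
from urllib.parse import urlencode

# Assumption: the handler URL used by the examples on this page.
BASE_URL = "http://localhost:8983/solr/autoSuggest"


def build_terms_url(field, lower=None, upper=None, rows=None,
                    lower_incl=None, upper_incl=None, base_url=BASE_URL):
    """Build a TermsComponent request URL (hypothetical helper, not part of Solr)."""
    # terms and terms.fl are always sent; everything else is optional.
    params = [("terms", "true"), ("terms.fl", field)]
    if lower is not None:
        params.append(("terms.lower", lower))
    if upper is not None:
        params.append(("terms.upper", upper))
    if lower_incl is not None:
        params.append(("terms.lwr.incl", "true" if lower_incl else "false"))
    if upper_incl is not None:
        params.append(("terms.upr.incl", "true" if upper_incl else "false"))
    if rows is not None:
        params.append(("terms.rows", str(rows)))
    # urlencode escapes any characters in the terms that are unsafe in a URL.
    return base_url + "?" + urlencode(params)


print(build_terms_url("name", lower="a", upper="b", rows=2))
```

Parameters left as None are omitted from the URL, so the server-side defaults described above apply.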

The output is a list of the terms and their document frequency values.  Again, see http://lucene.apache.org/java/2_4_0/api/core/org/apache/lucene/index/TermEnum.html

= Examples =

The following examples use the Solr tutorial example located in the <Solr>/example directory.

== Simple ==
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name
}}}

Get back the first ten terms in the name field. 

Results:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="0">5</str>
 <str name="1">15</str>
 <str name="11">5</str>
 <str name="120">5</str>
 <str name="133">5</str>
 <str name="184">15</str>
 <str name="19">5</str>
 <str name="1900">5</str>
 <str name="2">15</str>
 <str name="20">5</str>
</lst>
</response>
}}}
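
A client would typically turn this response into (term, document frequency) pairs.  The following is a minimal Python sketch using only the standard library; the function name parse_terms is hypothetical, and the sample is a shortened copy of the response above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response shown above, as bytes off the wire.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="0">5</str>
 <str name="1">15</str>
 <str name="11">5</str>
</lst>
</response>
"""


def parse_terms(xml_bytes):
    """Return a list of (term, document_frequency) tuples (hypothetical helper)."""
    root = ET.fromstring(xml_bytes)
    # The terms live in the <lst name="terms"> element directly under <response>.
    terms = root.find("lst[@name='terms']")
    if terms is None:
        return []
    # Each <str> carries the term in its name attribute and the frequency as text.
    return [(el.get("name"), int(el.text)) for el in terms.findall("str")]


print(parse_terms(SAMPLE))  # [('0', 5), ('1', 15), ('11', 5)]
```

The order of the returned list follows the order of the elements in the response.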

== Lower ==

URL: 
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&indent=true
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="a">8</str>
 <str name="adata">5</str>
 <str name="all">5</str>
 <str name="allinon">5</str>
 <str name="amber">1</str>
 <str name="appl">5</str>
 <str name="asus">5</str>
 <str name="ata">5</str>
 <str name="ati">5</str>
 <str name="b">5</str>
</lst>
</response>
}}}

== Lower, Upper ==

URL:
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&terms.upper=b&indent=true
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="a">8</str>
 <str name="adata">5</str>
 <str name="all">5</str>
 <str name="allinon">5</str>
 <str name="amber">1</str>
 <str name="appl">5</str>
 <str name="asus">5</str>
 <str name="ata">5</str>
 <str name="ati">5</str>
</lst>
</response>
}}}

Notice that "b" is dropped: the upper bound is excluded by default (terms.upr.incl defaults to false).
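
The effect of the bounds and the two inclusion flags can be modelled on a plain sorted list.  The Python sketch below is only a model of the behaviour described on this page, not Solr's implementation; select_terms is a hypothetical name, and the term list is copied from the "Lower" example output.

```python
from bisect import bisect_left, bisect_right

# Terms from the "Lower" example output, already in sorted order.
TERMS = ["a", "adata", "all", "allinon", "amber",
         "appl", "asus", "ata", "ati", "b"]


def select_terms(sorted_terms, lower="", upper=None,
                 lower_incl=True, upper_incl=False, rows=10):
    """Model of the range selection (defaults mirror the parameter list)."""
    # Inclusive lower bound: start at the first term >= lower.
    # Exclusive lower bound: start at the first term > lower.
    start = (bisect_left if lower_incl else bisect_right)(sorted_terms, lower)
    if upper is None:
        end = len(sorted_terms)
    else:
        # Exclusive upper bound: stop before the first term >= upper.
        # Inclusive upper bound: stop before the first term > upper.
        end = (bisect_right if upper_incl else bisect_left)(sorted_terms, upper)
    return sorted_terms[start:end][:rows]


print(select_terms(TERMS, "a", "b"))                    # "a" .. "ati", no "b"
print(select_terms(TERMS, "a", "b", upper_incl=True))   # also includes "b"
print(select_terms(TERMS, "a", "b", lower_incl=False))  # starts at "adata"
```

With the defaults, the first call yields the same nine terms as the result above.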

== Exclusive of Lower Bound ==

URL:
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&terms.upper=b&terms.lwr.incl=false&indent=true
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">2</int>
</lst>
<lst name="terms">
 <str name="adata">5</str>
 <str name="all">5</str>
 <str name="allinon">5</str>
 <str name="amber">1</str>
 <str name="appl">5</str>
 <str name="asus">5</str>
 <str name="ata">5</str>
 <str name="ati">5</str>
</lst>
</response>
}}}


== Rows == 

URL:
{{{
http://localhost:8983/solr/autoSuggest?terms=true&terms.fl=name&terms.lower=a&terms.upper=b&indent=true&terms.rows=2
}}}

Result:
{{{
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
</lst>
<lst name="terms">
 <str name="a">8</str>
 <str name="adata">5</str>
</lst>
</response>
}}}
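
A lower bound, an upper bound, and terms.rows together give roughly the shape an auto-suggest lookup might take.  The Python sketch below derives the bounds from a typed prefix; it is an assumption about how a client could use the component, not something the component does itself.  The name suggest_params is hypothetical, and the approach assumes simple prefixes whose last character can be incremented and that terms are ordered by character code.

```python
def suggest_params(prefix, field="name", rows=5):
    """Request parameters covering every term that starts with prefix.

    Hypothetical helper: the upper bound is the prefix with its last
    character incremented, so the range [lower, upper) holds exactly the
    terms beginning with prefix (assuming character-code ordering).
    """
    if not prefix:
        raise ValueError("prefix must be non-empty")
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return {
        "terms": "true",
        "terms.fl": field,
        "terms.lower": prefix,   # inclusive by default
        "terms.upper": upper,    # exclusive by default
        "terms.rows": str(rows),
    }


print(suggest_params("a", rows=2))
```

For the prefix "a" this yields terms.lower=a and terms.upper=b, the same bounds used in the request above.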
