incubator-couchdb-user mailing list archives

From Rory Franklin <r...@chillibean.tv>
Subject Re: couchdb-lucene indexing issues
Date Sat, 03 Sep 2011 16:30:06 GMT
Excellent, that was the simple mistake I was making! I thought the standard analyzer would break it up into tokens at the underscores.

Rory

Sent from my iPhone

On 3 Sep 2011, at 17:12, Robert Newson <rnewson@apache.org> wrote:

> " For instance, searching for the term "wonderland" should return back
> a document where there is a field with the value
> "some_wonderland_example" but it doesn't."
> 
> It shouldn't and doesn't. :)
> 
> 'some_wonderland_example' is a single token when tokenized by the
> default StandardAnalyzer. If instead you specify "analyzer":"simple",
> you will find that it is 3 tokens, and your search should work.
> 
> B.
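
To illustrate the tokenization point above, here is a rough JavaScript sketch. It is only an approximation for ASCII input: the real analyzers are Lucene Java classes, and the function names `simpleTokens` and `standardLikeTokens` are made up for this example. The second function models nothing more than the behaviour described in the reply, namely that an underscore does not end a token.

```javascript
// Approximation only: Lucene's analyzers are Java classes with full Unicode
// handling. These helpers are hypothetical and cover plain ASCII text.

// Roughly what "analyzer":"simple" does: cut at every non-letter, lowercase.
function simpleTokens(text) {
  return text.toLowerCase().split(/[^a-z]+/).filter(function (t) {
    return t.length > 0;
  });
}

// Roughly the behaviour described for the default analyzer in this thread:
// letters, digits and underscores all stay inside one token.
function standardLikeTokens(text) {
  return text.toLowerCase().split(/[^a-z0-9_]+/).filter(function (t) {
    return t.length > 0;
  });
}

var value = 'some_wonderland_example';

console.log(simpleTokens(value));       // [ 'some', 'wonderland', 'example' ]
console.log(standardLikeTokens(value)); // [ 'some_wonderland_example' ]
```

A query for "wonderland" can only match the first token list, which is why the search fails with the default analyzer and succeeds with the simple one.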
> 
> On 3 September 2011 16:06, Rory Franklin <rory@chillibean.tv> wrote:
>> I'm using couchdb-lucene to index a list of fields (user defined) in a
>> document using the following design document:
>> 
>> {
>>   "_id": "_design/foo",
>>   "_rev": "16-dcd0d39369c35b3d74ceef13a388826f",
>>   "fulltext": {
>>     "by_metadata": {
>>       "index": "function(doc) {
>>         var ret = new Document();
>>         if (doc['type'] == 'CSAsset' && doc['deleted'] != true) {
>>           for (var i in doc.metadata) {
>>             if (doc.metadata[i]['key'] == 'Title') {
>>               ret.add(doc.metadata[i]['value'].toLowerCase(), {'field': 'sort_title', 'store': 'yes', 'index': 'not_analyzed'});
>>             }
>>             ret.add(doc.metadata[i]['value'], {'field': doc.metadata[i]['key'].toLowerCase()});
>>             ret.add(doc.metadata[i]['value']);
>>           }
>>           for (var i in doc.partitions) {
>>             ret.add(doc.partitions[i].partition_id, {'field': 'partition'});
>>             ret.add(doc.partitions[i].partition_id);
>>           }
>>           ret.add(doc['created_at'], {'field': 'sort_created_at', 'store': 'yes', 'index': 'not_analyzed'});
>>           return ret;
>>         } else {
>>           return null;
>>         }
>>       }"
>>     }
>>   }
>> }
>> 
>> 
>> 
>> (I've formatted the definition so that it's not all on one line for readability here)
>> 
>> However, when using the by_metadata view it doesn't appear to be breaking the
>> values up when there are underscores. For instance, searching for the term
>> "wonderland" should return back a document where there is a field with the
>> value "some_wonderland_example" but it doesn't. It returns the document if I
>> search for the full term.
>> 
>> I'm just wondering whether I'm defining the index incorrectly? (of course,
>> feel free to point out if I'm doing anything else glaringly obviously wrong too!)
>> 
>> 
>> 
>> Rory
>> 
>> 
>> 
>> 
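
For reference, the fix suggested in the reply could look something like the sketch below. It assumes the `"analyzer"` key sits next to `"index"` inside the `by_metadata` definition, which is how the reply's `"analyzer":"simple"` reads. The index function is a trimmed-down version of the one in the original message, and the names `indexFn` and `designDoc` are only for illustration. `Document` is supplied by couchdb-lucene when it runs the function, so the function is only converted to a string here and never called.

```javascript
// Trimmed-down index function based on the one in the thread.
// `Document` is provided by couchdb-lucene at indexing time, not by this script.
function indexFn(doc) {
  var ret = new Document();
  if (doc['type'] == 'CSAsset' && doc['deleted'] != true) {
    for (var i in doc.metadata) {
      // Index each metadata value under its own field and in the default field.
      ret.add(doc.metadata[i]['value'], {'field': doc.metadata[i]['key'].toLowerCase()});
      ret.add(doc.metadata[i]['value']);
    }
    return ret;
  } else {
    return null;
  }
}

// Design document with the analyzer set explicitly (placement is an assumption,
// see the paragraph above). When updating an existing design document, the
// current "_rev" has to be included as well.
var designDoc = {
  _id: '_design/foo',
  fulltext: {
    by_metadata: {
      analyzer: 'simple',
      index: indexFn.toString()
    }
  }
};

// Print the JSON body that would be saved to the database.
console.log(JSON.stringify(designDoc, null, 2));
```

Building the design document from a real function and calling `toString()` avoids hand-escaping the function source into a one-line JSON string.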
