couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Integrated Full Text Indexing and Reporting Re: CouchDB 0.9 and 1.0
Date Sat, 12 Jul 2008 21:24:38 GMT

On Jul 11, 2008, at 22:29 , Damien Katz wrote:

> CouchDB needs integrated full-text indexing support. We should be
> able to support multiple full text engines, but our reference  
> implementation will be Apache Lucene.
>
> Initially (I'm hoping for 0.9.0) we should be able to index all
> documents and their attachments (for types that Lucene can index
> anyway) and return queries against that index via. Jan has begun
> this work and I think someone has this mostly working now somewhere,
> but it's not in trunk?

We have a patch that improves the API here:
https://issues.apache.org/jira/browse/COUCHDB-74
and there is the
http://svn.apache.org/repos/asf/incubator/couchdb/branches/lucene-search/
branch that this patch should be applied to. Further work should
continue there. At this point the only difference between trunk and
the branch is the addition of the /db/_search API call. The branch
might also need to be brought up to date with trunk. It has no current
maintainer, although Paul Davis voiced interest in pushing this
forward. There were also attempts at adding other search engines, but
they never surfaced. If I remember correctly, most work stopped on the
problem that views cannot be searched without expanding the view
server.


> By 1.0, we should also do view intersections with full text
> results. At query time, CouchDB gets back a list of matching
> documents and then finds the emitted view rows from those documents,
> and returns them sorted by relevance score. This will require some
> enhancements to the internal view API, but the data and required
> index (view keys by doc id) already exist to make this efficient.

I opened a bug report for this.
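
To make the intersection described in the quote concrete, here is a
rough Python sketch. It is only an illustration under stated
assumptions: the function name, the (doc id, score) hit tuples and the
dict standing in for the by-doc-id view index are all hypothetical and
are not CouchDB's internal view API or the Lucene interface.

```python
from typing import Any


def intersect_view_with_fulltext(
    hits: list[tuple[str, float]],
    rows_by_doc_id: dict[str, list[tuple[Any, Any]]],
) -> list[dict[str, Any]]:
    """Return the emitted view rows of the matching documents, best score first.

    hits           -- hypothetical full-text result: (doc_id, relevance_score) pairs
    rows_by_doc_id -- hypothetical stand-in for the view's by-doc-id index:
                      doc_id -> list of (key, value) rows that document emitted
    """
    results = []
    # Highest relevance first; sorted() is stable, so ties keep the hit order.
    for doc_id, score in sorted(hits, key=lambda hit: hit[1], reverse=True):
        # A matching document may have emitted zero, one, or many view rows.
        for key, value in rows_by_doc_id.get(doc_id, []):
            results.append(
                {"id": doc_id, "key": key, "value": value, "score": score}
            )
    return results
```

Documents that match the full-text query but emitted nothing into the
view simply drop out, and the lookup cost depends on the number of
hits rather than on the size of the view.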


--

Since I started the work on Lucene, I am, by the usual open source
convention, somewhat responsible for keeping it alive. But I'd rather
not be, at least for the Java side of things. If somebody (heya Paul,
still in?) wants to take this over, that'd be mighty cool.


Cheers
Jan
--

> Perhaps not initially, but eventually the integration of the
> fulltext engine will be as proper CouchDB HTTP and daemon plug-ins
> (once those APIs are established).
>
> On Jul 2, 2008, at 3:08 AM, Jan Lehnardt wrote:
>
>> Hello everybody,
>> this thread is meant to collect missing work items (features and
>> bugs) for our 1.0 release and a discussion about how to split
>> them up between 0.9 and 1.0.
>>
>> Take it away: Damien.
>>
>> Cheers
>> Jan
>> --
>
>

