couchdb-user mailing list archives

From Robert Newson <robert.new...@gmail.com>
Subject Re: Full text search - is it coming? If yes, approx when.
Date Mon, 28 Mar 2011 14:30:22 GMT
I am a CouchDB committer and author of couchdb-lucene. :)

B.

On 28 March 2011 10:44, Andrew Stuart (SuperCoders)
<andrew.stuart@supercoders.com.au> wrote:
> Hi Robert
>
> "there are no publicly known plans to build a native full-text indexing
> feature for CouchDB."
>
> I don't know who is who around here as yet - are you commenting from inside
> knowledge or as an end user/developer?
>
> Thanks
>
>
> On 28/03/2011, at 8:24 PM, Robert Newson wrote:
>
> I have to dispute "There does not seem to be much understanding that
> this could be a killer feature."
>
> Obviously full-text search is a killer feature, but it's trivially
> available now via couchdb-lucene or elasticsearch.
>
> What people are asking for is native full-text search which, to me, is
> essentially asking for an Erlang port of Lucene. We'd love this, but
> it's a huge amount of work. Continually asking others to do
> significant amounts of work is also wearying.
>
> To replace a Lucene-based solution and match its quality and breadth
> is a huge chunk of work and is only necessary to satisfy people who,
> for various reasons, don't want to use Java.
>
> To answer the original post, there are no publicly known plans to
> build a native full-text indexing feature for CouchDB.
>
> B.
>
> On 28 March 2011 10:15, Olafur Arason <olafura@olafura.com> wrote:
>>
>> There does not seem to be much understanding that this could be a killer
>> feature. People are now relying on Lucene which monitors the _changes
>> feed.
>>
>> Cloudant has done its own implementation which, as far as I gather from
>> the information they have published, makes a view out of all your words.
>> They recommend a Java view because you can then reuse the lexer from
>> Lucene. Then I think they are reusing the reader of the view to make
>> their query. Their query interface has a syntax similar to Lucene's.
>> They are still working on this and I think they don't have that much
>> incentive to open-source it right away. But they have open-sourced their
>> technology in the past, like BigCouch, so I think it's more a
>> matter of when rather than if.
>>
>> I think this is a good solution for full-text search. But I think
>> the Java view does not have direct access to the data, so it could be
>> slow. But Cloudant does clustering on view generation, so that helps.
>>
>> But there is also a general problem with the current view system where
>> search technology could be used.
>>
>> Views are really good at sorting, but people are using them to
>> do key matches, which they are not designed for. The startkey and
>> endkey parameters select sorted ranges and are not good for matching,
>> even though most resources online point to them for that.
>>
>> For example when you do:
>> startkey = ["key11", "key21"]
>> endkey = ["key19", "key21"]
>>
>> You get ["key11","key22"], ["key11", "key23"] ... ["key12","key21"],
>> ["key12","key22"]...
>> which makes sense when looking up sorted ranges, but not when using it
>> to match keys. You can do a range lookup, but only on the
>> last key and never on two keys. So this would work:
>>
>> startkey = ["key21", "key11"]
>> endkey = ["key21", "key19"]
>>
>> The current view interface could be augmented to accept queries,
>> which would make views much more powerful than they currently are,
>> with the keys used just for sorting and for selecting which values you
>> want shown, which is what they are designed to do and do really well.
>>
>> This would be a killer feature and could use the new infrastructure
>> from Cloudant search.
>>
>> And don't tell me the Elastic or Lucene interface could do anything
>> close to this :)
>>
>> Regards,
>> Olafur Arason
>>
>> On Mon, Mar 28, 2011 at 04:31, Andrew Stuart (SuperCoders)
>> <andrew.stuart@supercoders.com.au> wrote:
>>>
>>> It would be good to know if full text search is coming as a core feature
>>> and, if yes, approximately when - does anyone know?
>>>
>>> Even an approximate timeframe would be good.
>>>
>>> thanks
>>>
>>
>
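
The key-range behaviour described in the quoted message can be sketched
in a few lines of Python. This is only an illustration, not CouchDB code:
the rows are made-up data, key_range is a hypothetical helper, and Python's
list comparison stands in for CouchDB's view collation (the two agree for
same-case ASCII strings like these, which is an assumption of the sketch).

```python
from bisect import bisect_left, bisect_right
from itertools import product

# Hypothetical view keys: every [first, second] combination, kept sorted
# the way a view index keeps its keys sorted.
firsts = [f"key1{i}" for i in range(1, 10)]   # key11 .. key19
seconds = [f"key2{i}" for i in range(1, 4)]   # key21 .. key23
rows = sorted([a, b] for a, b in product(firsts, seconds))

# The same data emitted with the key parts swapped: [second, first].
swapped_rows = sorted([b, a] for a, b in rows)


def key_range(sorted_rows, startkey, endkey):
    """Return every row with startkey <= row <= endkey (inclusive range)."""
    lo = bisect_left(sorted_rows, startkey)
    hi = bisect_right(sorted_rows, endkey)
    return sorted_rows[lo:hi]


# Range on the first key part: the second part is NOT pinned to "key21",
# because the range is a contiguous slice of the sorted index.
wide = key_range(rows, ["key11", "key21"], ["key19", "key21"])

# Fixed first part, range on the last part: this behaves like a match.
narrow = key_range(swapped_rows, ["key21", "key11"], ["key21", "key19"])

print(len(wide), len(narrow))
```

The first query returns a contiguous slice of the index, so rows such as
["key11", "key22"] are included; the second returns only rows whose first
part is "key21".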
