couchdb-user mailing list archives

From Benoit Chesneau <>
Subject Re: Full text search - is it coming? If yes, approx when.
Date Mon, 28 Mar 2011 15:59:33 GMT
It would be cool to have a NIF integration of Apache Lucy. It might solve
the problem.

- benoit

On Mon, Mar 28, 2011 at 5:17 PM, Olafur Arason <> wrote:
> I love the power of Lucene, but it's not needed for many use cases,
> and it can even be gutted, as Cloudant is doing with their search,
> which uses the lexer from Lucene.
> But most of the time people need quick-and-dirty search, and
> even search integrated with views. For that you would maybe have
> a really simple lexer, and have it built in. If people need more
> power they would use Lucene.
> It's like using a Ferrari to go to the store: it's cool but overkill.
> Hope you keep up the good work; couchdb-lucene is really easy
> to use.
> Regards,
> Olafur Arason
> PS I was talking to an NLP expert and I realize that there is so
> much to searching, especially doing it right, that I think nobody
> will be able to re-implement Lucene anytime soon.
> On Mon, Mar 28, 2011 at 14:30, Robert Newson <> wrote:
>> I am a CouchDB committer and author of couchdb-lucene. :)
>> B.
>> On 28 March 2011 10:44, Andrew Stuart (SuperCoders)
>> <> wrote:
>>> Hi Robert
>>> "there are no publicly known plans to build a native full-text indexing
>>> feature for CouchDB."
>>> I don't know who is who around here as yet - are you commenting from inside
>>> knowledge or as an end user/developer?
>>> Thanks
>>> On 28/03/2011, at 8:24 PM, Robert Newson wrote:
>>> I have to dispute "There does not seem to be much understanding that
>>> this could be a killer feature."
>>> Obviously full-text search is a killer feature, but it's trivially
>>> available now via couchdb-lucene or elasticsearch.
>>> What people are asking for is native full-text search which, to me, is
>>> essentially asking for an Erlang port of Lucene. We'd love this, but
>>> it's a huge amount of work. Continually asking others to do
>>> significant amounts of work is also wearying.
>>> To replace a Lucene-based solution and match its quality and breadth
>>> is a huge chunk of work and is only necessary to satisfy people who,
>>> for various reasons, don't want to use Java.
>>> To answer the original post, there are no publicly known plans to
>>> build a native full-text indexing feature for CouchDB.
>>> B.
>>> On 28 March 2011 10:15, Olafur Arason <> wrote:
>>>> There does not seem to be much understanding that this could be a killer
>>>> feature. People are now relying on Lucene which monitors the _changes
>>>> feed.
>>>> Cloudant has done its own implementation which, as far as I can gather
>>>> from the information they have published, makes a view out of all your
>>>> words. They recommend a Java view because you can then reuse the lexer
>>>> from Lucene. Then I think they are reusing the view reader to run
>>>> their queries. Their query interface has a syntax similar to Lucene's.
>>>> They are still working on this and I think they don't have that much
>>>> incentive to open-source it right away. But they have open-sourced
>>>> their technology in the past, like BigCouch, so I think it's more a
>>>> matter of when rather than if.
>>>> I think this is a good solution for full-text search. But I don't think
>>>> the Java view has direct access to the data, so it could be
>>>> slow. Cloudant does cluster view generation, though, so that helps.
>>>> But there is also a general problem with the current view system where
>>>> search technology could be used.
>>>> Views are really good at sorting, but people are using them to
>>>> do key matches, which they are not designed for. The startkey and
>>>> endkey parameters select sorted ranges and are not good for matching,
>>>> which is what most resources online point to.
>>>> For example, when you do:
>>>> startkey = ["key11", "key21"]
>>>> endkey = ["key19", "key21"]
>>>> you get ["key11","key22"], ["key11", "key23"] ... ["key12","key21"],
>>>> ["key12","key22"]...
>>>> which makes sense for looking up sorted ranges but not for
>>>> matching keys. You can do a range match, but only on the
>>>> last key and never on two keys. So this would work:
>>>> startkey = ["key21", "key11"]
>>>> endkey = ["key21", "key19"]
>>>> The current view interface could be augmented to accept queries,
>>>> which would make views much more powerful than they currently are,
>>>> while the keys are used just for sorting and for selecting which values
>>>> you want shown, which is what they are designed to do and do really well.
>>>> This would be a killer feature and could use the new infrastructure
>>>> from Cloudant search.
>>>> And don't tell me the Elastic or Lucene interface could do anything
>>>> close to this :)
>>>> Regards,
>>>> Olafur Arason
>>>> On Mon, Mar 28, 2011 at 04:31, Andrew Stuart (SuperCoders)
>>>> <> wrote:
>>>>> It would be good to know if full text search is coming as a core feature
>>>>> and if yes, approximately when - does anyone know?
>>>>> Even an approximate timeframe would be good.
>>>>> thanks
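
The composite-key range behaviour described earlier in the thread can be reproduced with a small sketch. This is not CouchDB code: it assumes that, for simple ASCII string keys like these, CouchDB's array collation agrees with Python's element-by-element list comparison. The key values and the `range_query` helper are hypothetical and exist only for illustration.

```python
# Hypothetical view keys of the form [first, second]; values are omitted.
keys = sorted(
    [a, b]
    for a in ("key11", "key12", "key19", "key21")
    for b in ("key11", "key15", "key21", "key22")
)


def range_query(keys, startkey, endkey):
    """Return keys with startkey <= key <= endkey.

    Lists compare element by element, which is assumed here to mirror
    how a view orders array keys made of plain ASCII strings.
    """
    return [k for k in keys if startkey <= k <= endkey]


# Range on the FIRST element: rows whose second element is not "key21"
# fall inside the sorted range and are returned as well.
loose = range_query(keys, ["key11", "key21"], ["key19", "key21"])

# Fixed first element, range on the LAST element: only the rows with
# first == "key21" and second between "key11" and "key19" are returned.
tight = range_query(keys, ["key21", "key11"], ["key21", "key19"])

if __name__ == "__main__":
    print("loose:", loose)
    print("tight:", tight)
```

In the first query the second element only constrains rows at the two edges of the range, so everything sorted between the endpoints comes back. In the second query the first element is pinned, so the range applies to the last element only, which is the one case where a range lookup behaves like a match.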