couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Lazy Fulltext Search
Date Sun, 13 Apr 2008 21:00:00 GMT

On Apr 12, 2008, at 12:06, Søren Hilmer wrote:
> Hi
>
> Have you read Chris' response about letting the view engine call the
> indexer, as it has the information needed for the indexer? As I
> understand the idea, it will essentially keep the fulltext indexer and
> the views in sync.
>
> I like this idea and I believe the code for the indexer would be
> much simpler and more efficient.
>
> Also as the shift goes towards indexing views and not documents, it  
> makes
> sense that it is the View engine that triggers the indexer, right?

The only problem here is that views are updated when they are  
queried, not when documents are added. So you could end up with a  
lot of unindexed data because your view hasn't been queried. That  
can be worked around, but I don't think it makes things any easier :)
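
To illustrate the point about unindexed data piling up, here is a minimal sketch of an index that, like a view, only catches up when it is queried. Everything in it (the class LazyFulltextIndex, its methods, the in-memory document log) is hypothetical and stands in for the real database and indexer; it is not CouchDB code.

```python
class LazyFulltextIndex:
    """Toy index that catches up only when it is queried (hypothetical)."""

    def __init__(self):
        self.docs = []          # append-only log of (doc_id, text); stands in for the database
        self.index = {}         # word -> set of doc_ids
        self.indexed_upto = 0   # number of log entries already indexed

    def add_document(self, doc_id, text):
        # Writing a document does not touch the index.
        self.docs.append((doc_id, text))

    def backlog(self):
        # Number of documents written but not yet indexed.
        return len(self.docs) - self.indexed_upto

    def search(self, word):
        # Catch up on everything written since the last query, then answer.
        for doc_id, text in self.docs[self.indexed_upto:]:
            for token in text.lower().split():
                self.index.setdefault(token, set()).add(doc_id)
        self.indexed_upto = len(self.docs)
        return sorted(self.index.get(word.lower(), set()))
```

In this sketch the backlog grows with every add_document call and drops to zero only inside search, so the first query after a long quiet period pays for all the indexing work at once.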

The design of the update notification is intentionally simple. We  
expect the clients (the Indexer in this case) to be smart. We believe  
that this keeps the server code more robust.
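
As a rough sketch of what a "smart" client could look like, the function below collapses a burst of notifications into the set of databases that need re-indexing. It assumes, and this is only an assumption for illustration, that each notification arrives as one JSON object per line with "type" and "db" fields; the function name dirty_databases is hypothetical.

```python
import json


def dirty_databases(lines):
    """Collapse notification lines into the set of databases to re-index.

    Assumed message shape (not confirmed by this thread):
    {"type": "updated", "db": "<name>"}, one JSON object per line.
    """
    dirty = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            note = json.loads(line)
        except ValueError:
            continue  # ignore malformed input rather than crash the indexer
        if isinstance(note, dict) and note.get("type") == "updated" and "db" in note:
            dirty.add(note["db"])
    return dirty
```

The server only has to say "something changed"; deduplicating notifications, tolerating bad input, and deciding when to poll for the actual changes all stay on the client side.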


> I have to study the View engine, if I am to provide any code for  
> this, though
> (provided consensus blows in this direction).
>
> Have fun
>   Søren
> On Friday 11 April 2008 13:26, Jan Lehnardt wrote:
>> On Apr 11, 2008, at 08:55, Søren Hilmer wrote:
>>> Hi Jan
>>>
>>> It certainly would simplify configuration, although the
>>> DbUpdateNotificationProcess setting ought to be retained, as it is
>>> potentially useful for other stuff than indexing (can you have more
>>> than one of these set up?)
>>
>> No, the update searcher will stay! :-)
>>
>>> I am also worried about response times for searching; potentially
>>> the indexing can take considerable time. With the current approach
>>> indexing can be done during off-peak hours and only searching is
>>> done at prime time.
>>
>> Right, if you want to be conservative with resources, you might want
>> to go with my approach at the expense of possibly higher response
>> times the first time things are searched for (as it is with views).
>> I just wanted to make available my idea that fulltext indexing could
>> be modelled after how views work, in case this is useful for a
>> specific scenario.
>>
>> Cheers
>> Jan
>> --
>>
>>> Have fun
>>> Søren
>>> --
>>> Søren Hilmer, M.Sc., M.Crypt.
>>> wideTrail            Phone: +45 25481225
>>> Pilevænget 41        Email: sh@widetrail.dk
>>> DK-8961  Allingåbro  Web: www.widetrail.dk
>>>
>>> On Thu, April 10, 2008 23:32, Jan Lehnardt wrote:
>>>> Heya,
>>>> while thinking more about the fulltext implementation, I began to
>>>> wonder why we don't model it after the view engine.
>>>>
>>>> At the moment, we have an Indexer waiting for update notifications
>>>> and
>>>> polling CouchDB for changes and a separate mechanism to register a
>>>> fulltext query Searcher, that looks up things in the index.
>>>>
>>>> My proposed architectural change would be to trigger the Indexer  
>>>> from
>>>> the Searcher module when a request comes in, just like views work.
>>>> This would delay the creation of fulltext indexes until they are
>>>> actually needed.
>>>>
>>>> The possible drawback, though, is that when building the fulltext
>>>> index is rather slow, old-style pre-calculation might be more
>>>> feasible. Views deal with that by requiring frequent requests
>>>> (possibly cron-ed).
>>>>
>>>> This is not a proposal or anything, just a thought I wanted to  
>>>> share
>>>> with those who work on fulltext integration.
>>>>
>>>> If you have any input on this, please let us know ;)
>>>>
>>>> Cheers
>>>> Jan
>>>> --
>
> -- 
> Søren Hilmer, M.Sc., M.Crypt.
> wideTrail			Phone:	+45 25481225
> Pilevænget 41		Email:	sh@widetrail.dk
> DK-8961  Allingåbro	Web:	www.widetrail.dk
>

