lucene-dev mailing list archives

From "Robert Muir" <rcm...@gmail.com>
Subject Re: [jira] Commented: (LUCENE-1473) Implement Externalizable in main top level searcher classes
Date Thu, 04 Dec 2008 07:03:21 GMT
On Thu, Dec 4, 2008 at 1:24 AM, John Wang <john.wang@gmail.com> wrote:

> Nice!
> Some questions:
>
> 1) one index?
>
no, but two individual ones today were around 100M docs

> 2) how big is your document? e.g. how many terms etc.
>
last one built has over 4M terms

> 3) are you serving(searching) the docs in realtime?
>
i don't understand this question, but searching is slower if i am indexing on
a disk that's also being searched.

>
> 4) search speed?
>
usually subsecond (or close) after some warmup. while this might seem slow,
it's fast compared to the competition, trust me.

>
> I'd love to learn more about your architecture.
>
i hate to say you would be disappointed, but there's nothing fancy. probably
why it works...

>
> -John
>
>
> On Wed, Dec 3, 2008 at 10:13 PM, Robert Muir <rcmuir@gmail.com> wrote:
>
>> sorry gotta speak up on this. i indexed 300m docs today. I'm using an out
>> of box jar.
>>
>> yeah i have some special subclasses but if i thought any of this stuff was
>> general enough to be useful to others i'd submit it. I'm just happy to have
>> something scalable that i can customize to my peculiarities.
>>
>> so i think i fit in your 10% and im not stressing on either scalability or
>> api.
>>
>> thanks,
>> robert
>>
>>
>> On Thu, Dec 4, 2008 at 12:36 AM, John Wang <john.wang@gmail.com> wrote:
>>
>>> Grant:
>>>         I am sorry that I disagree with some points:
>>>
>>> 1) "I think it's a sign that Lucene is pretty stable." - While lucene is
>>> a great project, especially with 2.x releases, great improvements are made,
>>> but do we really have a clear picture on how lucene is being used and
>>> deployed. While lucene works great running as a vanilla search library, when
>>> pushed to limits, one needs to "hack" into lucene to make certain things
>>> work. If 90% of the user base use it to build small indexes and using the
>>> vanilla api, and the other 10% is really stressing both on the scalability
>>> and api side and are running into issues, would you still say: "running well
>>> for 90% of the users, therefore it is stable or extensible"? I think it is
>>> unfair to the project itself to be measured by the vanilla use-case. I have
>>> done a couple of large deployments, e.g. >30 million documents indexed and
>>> searched in realtime, and I really had to do some tweaking.
>>>
>>>
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>>
>
>


-- 
Robert Muir
rcmuir@gmail.com
