lucene-java-user mailing list archives

From Michael Masters
Subject Re: Realtime & distributed
Date Sat, 10 Oct 2009 04:09:02 GMT
Hi Jake,

Zoie looks like a really cool project. I'd like to learn more about
the distributed part of the setup. Any way you could describe that
here or on the wiki?


On Thu, Oct 8, 2009 at 9:24 PM, Jake Mannix wrote:
> On Thu, Oct 8, 2009 at 7:00 PM, Angel, Eric wrote:
>> Does anyone have any recommendations?  I've looked at Katta, but it doesn't
>> seem to support realtime searching.  It also uses hdfs, which I've heard can
>> be slow.  I'm looking to serve 40gb of indexes and support about 1 million
>> updates per day.
> Hi Eric,
>  As I mentioned in my response to Jason, we at LinkedIn serve our roughly
> 50 million document profile index on a real-time distributed setup (we're
> serving facets in real-time also), serving tens of millions of queries a day
> at 1-10ms latency per node, based on the open source zoie project (built
> here at LinkedIn) :
>  Zoie doesn't handle the distributed part of the setup; it's just the
> real-time side.  Distribution is done pretty straightforwardly in our case
> though: N shards each getting a different contiguous slice of the user base,
> each replicated K times, and all N*K nodes get indexing events distributed
> by a message queue independently.
>  If you have any questions about zoie, let me know.  The documentation
> could get filled in a little further, and it doesn't touch on the distributed
> side of things, so feel free to ping me.
>  -jake
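
For readers following the thread, the sharding scheme Jake describes (N shards, each holding a contiguous slice of the user base and replicated K times) might be sketched roughly as below. This is only an illustration under stated assumptions, not LinkedIn's or zoie's actual code: the class name ContiguousShardRouter, the node numbering (shard * K + replica), and the idea that user IDs fall in a known range [0, maxUserId) are all hypothetical. It also assumes that an indexing event for a user only needs to be applied by the K replicas of that user's shard.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical router: maps a user ID to one of N contiguous shards,
 * each replicated K times. Names and node numbering are assumptions.
 */
final class ContiguousShardRouter {
    private final long maxUserId;   // exclusive upper bound on user IDs (assumed known)
    private final int numShards;    // N
    private final int numReplicas;  // K

    ContiguousShardRouter(long maxUserId, int numShards, int numReplicas) {
        if (maxUserId <= 0 || numShards <= 0 || numReplicas <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        this.maxUserId = maxUserId;
        this.numShards = numShards;
        this.numReplicas = numReplicas;
    }

    /** Returns the shard index in [0, N) whose contiguous ID slice contains userId. */
    int shardFor(long userId) {
        if (userId < 0 || userId >= maxUserId) {
            throw new IllegalArgumentException("userId out of range: " + userId);
        }
        // Ceiling division so that N slices cover the whole ID range.
        long sliceSize = (maxUserId + numShards - 1) / numShards;
        return (int) (userId / sliceSize);
    }

    /** Returns the node IDs of all K replicas that should apply an update for userId. */
    List<Integer> nodesForUpdate(long userId) {
        int shard = shardFor(userId);
        List<Integer> nodes = new ArrayList<>();
        for (int replica = 0; replica < numReplicas; replica++) {
            nodes.add(shard * numReplicas + replica);
        }
        return nodes;
    }

    /** Total number of nodes in the cluster, N * K. */
    int totalNodes() {
        return numShards * numReplicas;
    }
}
```

In a setup like the one described, each node would consume the message queue on its own and could use a check like shardFor to keep only the events that belong to its slice, so replicas never need to coordinate with each other.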
