lucene-dev mailing list archives

From Grant Ingersoll
Subject Re: Hadoop RPC for distributed Lucene
Date Fri, 11 Jul 2008 13:13:39 GMT
I believe there is a subproject over at Hadoop for doing distributed
stuff w/ Lucene, but I am not sure whether they are doing the search
side or only indexing.  I was always under the impression that it was
too slow for the search side, as I don't think Nutch even uses it for
the search side of the equation, but I don't know if that is still the
case.

On Jul 10, 2008, at 10:16 PM, Jason Rutherglen wrote:

> Has anyone taken a look at using Hadoop RPC for enabling distributed
> Lucene?  I am thinking it would implement the Searchable interface
> and use serialization to be compatible with the current RMI
> version.  That somewhat defeats the purpose of using Hadoop RPC and
> serialization; however, Hadoop RPC scales far beyond what RMI can at
> the networking level.  RMI uses a thread per socket and
> reportedly has latency issues.  Hadoop RPC uses NIO and is proven to
> scale to thousands of servers.  Serialization unfortunately must be
> used with Lucene due to the Weight, Query, and Filter classes.  There
> could be an extended version of Searchable that allows passing
> Weight, Query, and Filter classes that implement Hadoop's Writable
> interface if a user wants to bypass serialization.
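
To make the Writable idea in the quoted message concrete, here is a rough
sketch of what a query object that bypasses Java serialization might look
like.  Everything in it is an assumption for illustration: the nested
Writable interface is a local stand-in that mirrors the two methods of
Hadoop's Writable (write and readFields) so the example compiles without
Hadoop on the classpath, and TermQueryWritable, toBytes, and fromBytes are
hypothetical names, not Lucene or Hadoop classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableSketch {

    // Local stand-in with the same two methods as Hadoop's Writable
    // interface, so this sketch has no external dependencies.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical query type: a single term in a field plus a boost.
    // It is NOT a Lucene class; it only illustrates the wire format idea.
    static final class TermQueryWritable implements Writable {
        String field = "";
        String text = "";
        float boost = 1.0f;

        // A no-arg constructor lets the receiver create an empty
        // instance and then populate it through readFields.
        TermQueryWritable() {}

        TermQueryWritable(String field, String text, float boost) {
            this.field = field;
            this.text = text;
            this.boost = boost;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            // Fields are written in a fixed order with no class metadata.
            out.writeUTF(field);
            out.writeUTF(text);
            out.writeFloat(boost);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Fields must be read back in exactly the order they were written.
            field = in.readUTF();
            text = in.readUTF();
            boost = in.readFloat();
        }
    }

    // Encodes any Writable into a byte array, as an RPC layer might do
    // before putting the request on the wire.
    static byte[] toBytes(Writable w) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            w.write(out);
        }
        return buf.toByteArray();
    }

    // Decodes a TermQueryWritable from bytes produced by toBytes.
    static TermQueryWritable fromBytes(byte[] bytes) throws IOException {
        TermQueryWritable q = new TermQueryWritable();
        q.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return q;
    }

    public static void main(String[] args) throws IOException {
        TermQueryWritable sent = new TermQueryWritable("body", "hadoop", 2.0f);
        byte[] wire = toBytes(sent);
        TermQueryWritable received = fromBytes(wire);
        System.out.println(received.field + ":" + received.text
                + "^" + received.boost + " (" + wire.length + " bytes)");
    }
}
```

The point of the sketch is only that each class spells out its own fields
on the wire, which is why an extended Searchable would need Writable
versions of Weight, Query, and Filter rather than the existing ones.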
