lucene-solr-dev mailing list archives

From "Noble Paul (JIRA)" <>
Subject [jira] Commented: (SOLR-1044) Use Hadoop RPC for inter Solr communication
Date Wed, 04 Mar 2009 05:51:56 GMT


Noble Paul commented on SOLR-1044:

bq. We're using persistent HTTP connections, so socket creation overhead should not be much
of an issue.

An HTTP connection can be reused only after the current request-response exchange is complete.
If, in the meantime, the same client has to send another request to the same server, a new
connection has to be created. So the number of connections we create will be quite high if
we have a large number of nodes in distributed search.
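
To make the reuse argument concrete, here is a toy model in Java. It assumes no HTTP pipelining,
so a connection carries one request at a time. The class KeepAlivePoolModel and its methods are
invented for illustration; this is not Solr or HttpClient code, and it opens no real sockets.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a keep-alive connection pool for ONE target host.
// Assumption: no pipelining, so a connection is busy until its response arrives.
final class KeepAlivePoolModel {
    private final Deque<Integer> idle = new ArrayDeque<>();
    private int created = 0;

    // Borrow an idle connection if one exists; otherwise "open" a new one.
    int acquire() {
        Integer conn = idle.poll();
        if (conn != null) {
            return conn;
        }
        return ++created;
    }

    // Return a connection once its request-response exchange has finished.
    void release(int conn) {
        idle.push(conn);
    }

    int connectionsCreated() {
        return created;
    }

    // Sequential requests: each one completes before the next is sent.
    static int sequential(int requests) {
        KeepAlivePoolModel pool = new KeepAlivePoolModel();
        for (int i = 0; i < requests; i++) {
            int conn = pool.acquire();
            pool.release(conn);
        }
        return pool.connectionsCreated();
    }

    // Overlapping requests: all are in flight before any response arrives.
    static int overlapping(int requests) {
        KeepAlivePoolModel pool = new KeepAlivePoolModel();
        int[] inFlight = new int[requests];
        for (int i = 0; i < requests; i++) {
            inFlight[i] = pool.acquire();
        }
        for (int conn : inFlight) {
            pool.release(conn);
        }
        return pool.connectionsCreated();
    }
}
```

Under this model, KeepAlivePoolModel.sequential(5) opens one connection, while
KeepAlivePoolModel.overlapping(5) opens five, one per request in flight.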

I haven't yet seen an HTTP server serve more than around 1200 req/sec (Apache HTTPD). A
call-based server can serve 4k-5k messages/sec easily (I have yet to test Hadoop RPC). The
proliferation of a large number of frameworks around that approach is a testimony to its
superiority.

> Use Hadoop RPC for inter Solr communication
> -------------------------------------------
>                 Key: SOLR-1044
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Noble Paul
> Solr uses HTTP for distributed search. We can make it a whole lot faster if we use an
> RPC mechanism which is more lightweight/efficient.
> Hadoop RPC looks like a good candidate for this.
> The implementation should have just one protocol. It should follow Solr's idiom for
> making remote calls: a URI + params + [optional stream(s)]. The response can be a stream
> of bytes.
> To make this work we must make the SolrServer implementation pluggable in distributed
> search. Users should be able to choose between the current CommonsHttpSolrServer and a
> HadoopRpcSolrServer.
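
The single-protocol idea in the description (URI + params + optional streams in, a stream of
bytes out) could be sketched roughly as follows. Every name here (ShardTransport, EchoTransport,
DistributedSearchSketch) is hypothetical and is not an existing Solr or Hadoop API. The echo
implementation only stands in for a real HTTP or RPC transport so the sketch runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical contract: one call shape for every remote request.
interface ShardTransport {
    InputStream call(String uri, Map<String, String> params, List<InputStream> streams)
            throws IOException;
}

// Stand-in transport: echoes the request back instead of using the network.
// An HTTP-backed or RPC-backed class would implement the same interface.
final class EchoTransport implements ShardTransport {
    @Override
    public InputStream call(String uri, Map<String, String> params, List<InputStream> streams) {
        String body = uri + "?" + params + " streams=" + streams.size();
        return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
    }
}

// The caller depends only on the interface, so the transport is pluggable.
final class DistributedSearchSketch {
    private final ShardTransport transport;

    DistributedSearchSketch(ShardTransport transport) {
        this.transport = transport;
    }

    // Sends a request with no content streams and reads the whole byte response.
    String query(String uri, Map<String, String> params) {
        try (InputStream in = transport.call(uri, params, List.of())) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Swapping transports then means passing a different ShardTransport to the constructor; the
search code itself does not change.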

