lucene-dev mailing list archives

From "tom liu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1395) Integrate Katta
Date Mon, 11 Oct 2010 10:22:32 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919784#action_12919784 ]

tom liu commented on SOLR-1395:
-------------------------------

No, that's four queries:
# on solr01, url is /select?fl=id,score&...
#* Shard=solrhome02#solrhome02 
#* Shard=solrhome01#solrhome01
# on solr01, url is /select?ids=SOLR1000&fl=id,score,id&...
#* Shard=solrhome02#solrhome02
# on solr02, url is /select?ids=SOLR1000&fl=id,score&...
#* Shard=solrhome02#solrhome02 
#* Shard=solrhome01#solrhome01
# on solr02, url is /select?ids=SOLR1000&ids=SOLR1000&...
#* Shard=solrhome01#solrhome01

If the original query includes shards=*, the master Solr sends * to kattaclient.
kattaclient, or katta.Client, then selects a node such as solr01 and sends it shards=solrhome01#solrhome01,solrhome02#solrhome02.
On that middle shard, SearchHandler and QueryComponent invoke the distributed process, including methods such as
createMainQuery and createRetrieveDocs.
So, on any node, the query is split into two queries:
# the first selects the id and score
# the second selects the docs
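
The two-step split above can be sketched as follows. This is only an illustration, not code from the patch: the class and method names are hypothetical, and plain maps stand in for Solr's request parameters. The parameter names (fl, ids, shards) are the ones that appear in the URLs listed earlier.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for Solr request parameters.
class TwoPhaseQuerySketch {

    /** Phase 1: ask the shards for ids and scores only. */
    static Map<String, String> idPhaseParams(String q, String shards) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", q);
        p.put("fl", "id,score");
        p.put("shards", shards);
        return p;
    }

    /** Phase 2: fetch the documents for the ids selected in phase 1. */
    static Map<String, String> docPhaseParams(List<String> ids, String shards) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("ids", String.join(",", ids));
        p.put("shards", shards);
        return p;
    }

    public static void main(String[] args) {
        String all = "solrhome01#solrhome01,solrhome02#solrhome02";
        System.out.println(idPhaseParams("*:*", all));
        // Only the shard that holds the winning id needs the second request.
        System.out.println(docPhaseParams(Arrays.asList("SOLR1000"), "solrhome02#solrhome02"));
    }
}
```

Because every middle shard repeats this split for its own sub-shards, one user request fans out into the four queries listed above.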

I have changed the QueryComponent class as follows:
{code:title=distributedProcess|borderStyle=solid}
	// Added by tom liu
	// Determine whether distributed processing is needed
	boolean isShard = rb.req.getParams().getBool(ShardParams.IS_SHARD, false);
	// On a sub-shard, the full distributed process is not needed
	if (isShard) {
		if (rb.stage < ResponseBuilder.STAGE_PARSE_QUERY)
			return ResponseBuilder.STAGE_PARSE_QUERY;
		if (rb.stage == ResponseBuilder.STAGE_PARSE_QUERY) {
			createDistributedIdf(rb);
			return ResponseBuilder.STAGE_EXECUTE_QUERY;
		}
		if (rb.stage < ResponseBuilder.STAGE_EXECUTE_QUERY)
			return ResponseBuilder.STAGE_EXECUTE_QUERY;
		if (rb.stage == ResponseBuilder.STAGE_EXECUTE_QUERY) {
			createMainQuery(rb);
			return ResponseBuilder.STAGE_GET_FIELDS;
		}
		if (rb.stage < ResponseBuilder.STAGE_GET_FIELDS)
			return ResponseBuilder.STAGE_GET_FIELDS;
		if (rb.stage == ResponseBuilder.STAGE_GET_FIELDS) {
			return ResponseBuilder.STAGE_DONE;
		}
		return ResponseBuilder.STAGE_DONE;
	}
	// end of added code
        ...
{code} 
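
To make the stage progression in that branch easier to follow, here is a stand-alone simulation. It is a sketch under assumptions: the stage constants are stand-in values (only their ordering matters), the driver loop is a simplification of what SearchHandler does, and the two create* calls are replaced by recording their names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the isShard branch; not part of the patch.
class ShardStageSketch {
    // Stand-in values: only the relative ordering matters here.
    static final int STAGE_START = 0;
    static final int STAGE_PARSE_QUERY = 1000;
    static final int STAGE_EXECUTE_QUERY = 2000;
    static final int STAGE_GET_FIELDS = 3000;
    static final int STAGE_DONE = Integer.MAX_VALUE;

    /** Mirrors the isShard branch: returns the next stage and records what would be called. */
    static int nextStage(int stage, List<String> actions) {
        if (stage < STAGE_PARSE_QUERY) return STAGE_PARSE_QUERY;
        if (stage == STAGE_PARSE_QUERY) {
            actions.add("createDistributedIdf");
            return STAGE_EXECUTE_QUERY;
        }
        if (stage < STAGE_EXECUTE_QUERY) return STAGE_EXECUTE_QUERY;
        if (stage == STAGE_EXECUTE_QUERY) {
            actions.add("createMainQuery");
            return STAGE_GET_FIELDS;
        }
        if (stage < STAGE_GET_FIELDS) return STAGE_GET_FIELDS;
        return STAGE_DONE;
    }

    /** Simplified driver: keep advancing until the branch reports STAGE_DONE. */
    static List<String> run() {
        List<String> actions = new ArrayList<>();
        int stage = STAGE_START;
        while (stage != STAGE_DONE) {
            stage = nextStage(stage, actions);
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this simulation the sub-shard branch records createDistributedIdf and createMainQuery, and nothing is issued at STAGE_GET_FIELDS, which matches the branch above returning STAGE_DONE from that stage.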

{code:title=handleResponses|borderStyle=solid}
  if ((sreq.purpose & ShardRequest.PURPOSE_GET_TOP_IDS) != 0) {
      mergeIds(rb, sreq);
      // Added by tom liu
      // Determine whether distributed processing is needed
      boolean isShard = rb.req.getParams().getBool(ShardParams.IS_SHARD, false);
      if (isShard) {
        sreq.purpose = ShardRequest.PURPOSE_GET_FIELDS;
      }
      // end of added code
    }

    if ((sreq.purpose & ShardRequest.PURPOSE_GET_FIELDS) != 0) {
      returnFields(rb, sreq);
      return;
    }
{code} 
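
The effect of overwriting sreq.purpose can be seen in a small stand-alone sketch. The flag values below are illustrative assumptions (any two distinct bits behave the same way), and the class and method names are hypothetical. The two Solr calls are replaced by recording their names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the purpose-flag handling; not part of the patch.
class PurposeFlagSketch {
    // Illustrative bit values: assumed here, any two distinct bits would do.
    static final int PURPOSE_GET_TOP_IDS = 0x04;
    static final int PURPOSE_GET_FIELDS = 0x40;

    /** Returns the handlers that would run for a response with the given purpose. */
    static List<String> handle(int purpose, boolean isShard) {
        List<String> calls = new ArrayList<>();
        if ((purpose & PURPOSE_GET_TOP_IDS) != 0) {
            calls.add("mergeIds");
            if (isShard) {
                // On a sub-shard, re-tag the request so the next check also matches.
                purpose = PURPOSE_GET_FIELDS;
            }
        }
        if ((purpose & PURPOSE_GET_FIELDS) != 0) {
            calls.add("returnFields");
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(handle(PURPOSE_GET_TOP_IDS, false));
        System.out.println(handle(PURPOSE_GET_TOP_IDS, true));
    }
}
```

With isShard set, a single top-ids response runs both mergeIds and returnFields in one pass. Without it, only mergeIds runs.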

{code:title=createMainQuery|borderStyle=solid}
    sreq.params = new ModifiableSolrParams(rb.req.getParams());
    // TODO: base on current params or original params?

    // Added by tom liu
    // Determine whether distributed processing is needed
    boolean isShard = rb.req.getParams().getBool(ShardParams.IS_SHARD, false);
    if (isShard) {
        // isShard=true: leave the params unchanged
    } else {
        // end of added code
        // don't pass through any shards param
        sreq.params.remove(ShardParams.SHARDS);
    ...
{code} 

{code:title=returnFields|borderStyle=solid}
      boolean returnScores = (rb.getFieldFlags() & SolrIndexSearcher.GET_SCORES) != 0;

      // Changed by tom liu: loop over all shard responses
      // instead of asserting that there is exactly one
      //assert(sreq.responses.size() == 1);
      //ShardResponse srsp = sreq.responses.get(0);
      for(ShardResponse srsp : sreq.responses){
	      SolrDocumentList docs = (SolrDocumentList)srsp.getSolrResponse().getResponse().get("response");

	      String keyFieldName = rb.req.getSchema().getUniqueKeyField().getName();
      ...
{code} 


> Integrate Katta
> ---------------
>
>                 Key: SOLR-1395
>                 URL: https://issues.apache.org/jira/browse/SOLR-1395
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: Next
>
>         Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar, katta-core-0.6-dev.jar,
katta.node.properties, katta.zk.properties, log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch,
solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431.patch, SOLR-1395.patch,
SOLR-1395.patch, SOLR-1395.patch, test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We'll integrate Katta into Solr so that:
> * Distributed search uses Hadoop RPC
> * Shard/SolrCore distribution and management
> * Zookeeper based failover
> * Indexes may be built using Hadoop

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

