lucene-dev mailing list archives

From "Mathias Walter (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-1395) Integrate Katta
Date Wed, 18 Aug 2010 09:01:25 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899788#action_12899788 ]

Mathias Walter edited comment on SOLR-1395 at 8/18/10 5:00 AM:
---------------------------------------------------------------

I ported the patch to Solr 3.1 and Katta 0.6.2, except the Katta test. I also fixed some bugs.
The updated patch will be added soon.

In the meantime, I discovered a big issue. Often, a SolrKattaNode (back-end server) hosts
many shards. When a Solr front-end server starts a new query, it sends one query per shard,
all in parallel, so each back-end server receives as many queries as it hosts shards. In
contrast, a Katta/Lucene search sends just one query to each back-end server, which then
queries all the shards that server hosts.
The problem is that the Solr front-end server often does not receive all KattaResponses
from the back-end servers, so some queries time out and an exception is raised: sometimes
a {{NullPointerException}} in {{org.apache.solr.handler.component.QueryComponent.mergeIds}}
(usually at startup of the front-end server) and sometimes a {{NullPointerException}} in {{org.apache.solr.handler.component.QueryComponent.returnFields}}:

{noformat}
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Done waiting,
results = ClientResult: 0 results, 0 errors, 0/1 shards (id=6:0)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shutting down
work queue, results = ClientResult: 0 results, 0 errors, 0/1 shards (id=6:0)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.ClientResult - close()
called.
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.ClientResult - Notifying
closed listener.
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shut down
via ClientRequest.close()
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shutdown()
called (id=6)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Returning
results = ClientResult: 0 results, 0 errors, 0/1 shards (closed), took 9989 ms (id=6:0)
DEBUG 2010-08-18 10:32:25,730 [pool-3-thread-4] net.sf.katta.client.Client - broadcast(request([null,
org.apache.solr.katta.KattaRequest@180a1d7b]),
 {ibis46.gsf.de:20001=[sen-00000#sen-00000]}) took 10001 msec for ClientResult: 0 results,
0 errors, 0/1 shards (closed)
DEBUG 2010-08-18 10:32:25,730 [pool-3-thread-4] org.apache.solr.katta.KattaSearchHandler -
KattaCommComponent shard: sen-00000 results.size: 0
 WARN 2010-08-18 10:32:25,730 [pool-3-thread-4] org.apache.solr.katta.KattaSearchHandler -
Received 0 responses for query [], not 1
ERROR 2010-08-18 10:32:25,731 [pool-1-thread-1] org.apache.solr.core.SolrCore - java.lang.NullPointerException
	at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:411)
	at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:308)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:284)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
	at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
	at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1144)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)


DEBUG 2010-08-18 10:37:55,295 [pool-3-thread-9] net.sf.katta.client.Client - broadcast(request([null,
org.apache.solr.katta.KattaRequest@71ce5e7a]),
 {ibis46.gsf.de:20001=[sen-00003#sen-00003]}) took 10001 msec for ClientResult: 0 results,
0 errors, 0/1 shards (closed)
DEBUG 2010-08-18 10:37:55,295 [pool-3-thread-9] org.apache.solr.katta.KattaSearchHandler -
KattaCommComponent shard: sen-00003 results.size: 0
 WARN 2010-08-18 10:37:55,295 [pool-3-thread-9] org.apache.solr.katta.KattaSearchHandler -
Received 0 responses for query [], not 1
ERROR 2010-08-18 10:37:55,296 [918077175@qtp-87740549-8] org.apache.solr.core.SolrCore - java.lang.NullPointerException
	at org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:574)
	at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:312)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:284)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:926)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{noformat}
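
One reading of the traces above is that the merge step dereferences a shard response that never arrived ("Received 0 responses for query [], not 1"). The following is only a minimal sketch of a merge that tolerates missing responses; it is not the code of {{QueryComponent.mergeIds}}, and the class {{PartialResults}}, its method, and its parameters are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from Solr or from the SOLR-1395 patch.
final class PartialResults {

    /**
     * Merges document ids from the shards that answered.
     * A null value stands for a shard whose response did not arrive in time;
     * such shards are recorded in missingShards instead of being dereferenced.
     */
    static List<String> mergeIds(Map<String, List<String>> idsByShard,
                                 List<String> missingShards) {
        List<String> merged = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : idsByShard.entrySet()) {
            List<String> ids = entry.getValue();
            if (ids == null) {
                // No response for this shard: remember it and carry on.
                missingShards.add(entry.getKey());
                continue;
            }
            merged.addAll(ids);
        }
        return merged;
    }
}
```

With a guard like this, the caller could decide whether to return partial results or fail with a clear message naming the missing shards, rather than with a {{NullPointerException}}.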

Interestingly, the back-end servers process the queries immediately and send the results
to the front-end server:

{noformat}
 INFO 2010-08-18 10:37:45,325 [pool-13-thread-9] org.apache.solr.core.SolrCore - [sen-00003#sen-00003]
webapp=null path=/select params={start=0&
ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&
ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&q=Human&isShard=true&rows=10}
status=0 QTime=7 
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.solr.katta.SolrKattaServer
- SolrServer.request: ibis46.gsf.de:20001 shards: [sen-00003#sen-00003]
 request params: start=0&ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&q=Human&isShard=true&rows=10&shards=sen-00003%23sen-00003
 rsp: {response={numFound=4,start=0,docs=[SolrDocument[{id=pubmed:1567687:1:0, type=sentence,
lang=en, pubdate=Fri Dec 15 11:39:20 CET 2006}],
 SolrDocument[{id=pubmed:17140099:8:0, type=sentence, lang=en, pubdate=Thu Mar 01 11:40:18
CET 2007}],
 SolrDocument[{id=pubmed:12807258:6:0, type=sentence, lang=en, pubdate=Thu Jun 11 11:37:14
CEST 2009}],
 SolrDocument[{id=pubmed:11701068:3:0, type=sentence, lang=en, pubdate=Fri Apr 28 11:36:26
CEST 2006}]]},
 QueriedShards=[Ljava.lang.String;@791ef9f6}
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server
- Served: request queueTime= 8 procesingTime= 17
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server
- IPC Server Responder: responding to #30 from 146.107.217.46:58679
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server
- IPC Server Responder: responding to #30 from 146.107.217.46:58679 Wrote 386 bytes.
{noformat}

But whenever the front-end server cancels a query because of a timeout, it is always the last
KattaResponse sent that the front-end server did not recognize. I've attached a full
communication log of one failed query for both the front-end ([^front-end.log]) and the
back-end ([^back-end.log]) server.

Has anyone run into the same issue? I hope so, because the error occurs quite often. I assume
this bug is related to Hadoop RPC, but I could not find a matching Hadoop JIRA issue. I also
tried the latest release candidate of Hadoop, 0.21.0.

My idea now is to combine the parallel queries to one back-end server into a single query,
similar to the Lucene queries implemented in Katta.
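
As a rough sketch of that idea (not code from the patch), the front-end could invert the shard-to-node assignment and issue one request per back-end server. The class {{ShardGrouping}} and its methods are hypothetical, and the sketch assumes the back-end would accept a comma-separated list of shards in a single request, which the current patch may not do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Katta or the SOLR-1395 patch.
final class ShardGrouping {

    /** Inverts a shard -> node assignment into node -> shards, keeping insertion order. */
    static Map<String, List<String>> groupByNode(Map<String, String> shardToNode) {
        Map<String, List<String>> nodeToShards = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : shardToNode.entrySet()) {
            nodeToShards
                .computeIfAbsent(entry.getValue(), node -> new ArrayList<>())
                .add(entry.getKey());
        }
        return nodeToShards;
    }

    /**
     * Joins all shards of one node into a single comma-separated value.
     * Assumption: the back-end can handle several shards in one request.
     */
    static String shardsParam(List<String> shards) {
        return String.join(",", shards);
    }
}
```

For the two shards in the logs above, both hosted on ibis46.gsf.de:20001, this would yield one request for that node instead of two, so the number of in-flight RPC calls per back-end server no longer grows with the number of shards it hosts.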

> Integrate Katta
> ---------------
>
>                 Key: SOLR-1395
>                 URL: https://issues.apache.org/jira/browse/SOLR-1395
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: Next
>
>         Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar, katta-core-0.6-dev.jar,
>                      katta.node.properties, katta.zk.properties, log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch,
>                      solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, solr-1395-1431.patch, SOLR-1395.patch,
>                      SOLR-1395.patch, SOLR-1395.patch, test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We'll integrate Katta into Solr so that:
> * Distributed search uses Hadoop RPC
> * Shard/SolrCore distribution and management
> * Zookeeper based failover
> * Indexes may be built using Hadoop

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

