spark-dev mailing list archives

From Furkan KAMACI <furkankam...@gmail.com>
Subject Re: Spark Cannot Connect to HBaseClusterSingleton
Date Wed, 26 Aug 2015 15:08:22 GMT
By the way, here is the source code of GoraInputFormat.java:

https://github.com/kamaci/gora/blob/master/gora-core/src/main/java/org/apache/gora/mapreduce/GoraInputFormat.java
On 26 Aug 2015 at 18:05, "Furkan KAMACI" <furkankamaci@gmail.com> wrote:

> I'll send an e-mail to the Gora dev list too and attach my patch to the
> GSoC Jira issue you mentioned, and then we can continue there.
>
> Before I do that, I wanted to get the Spark dev community's ideas for
> solving my problem, since you may have faced this kind of problem before.
> On 26 Aug 2015 at 17:13, "Ted Yu" <yuzhihong@gmail.com> wrote:
>
>> I found GORA-386 Gora Spark Backend Support
>>
>> Should the discussion be continued there?
>>
>> Cheers
>>
>> On Wed, Aug 26, 2015 at 7:02 AM, Ted Malaska <ted.malaska@cloudera.com>
>> wrote:
>>
>>> Where is the input format class? Whenever I use the search on your
>>> GitHub, it says "We couldn’t find any issues matching 'GoraInputFormat'"
>>>
>>>
>>>
>>> On Wed, Aug 26, 2015 at 9:48 AM, Furkan KAMACI <furkankamaci@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Here is the MapReduceTestUtils.testSparkWordCount()
>>>>
>>>>
>>>> https://github.com/kamaci/gora/blob/master/gora-core/src/test/java/org/apache/gora/mapreduce/MapReduceTestUtils.java#L108
>>>>
>>>> Here is SparkWordCount
>>>>
>>>>
>>>> https://github.com/kamaci/gora/blob/8f1acc6d4ef6c192e8fc06287558b7bc7c39b040/gora-core/src/examples/java/org/apache/gora/examples/spark/SparkWordCount.java
>>>>
>>>> Lastly, here is GoraSparkEngine:
>>>>
>>>>
>>>> https://github.com/kamaci/gora/blob/master/gora-core/src/main/java/org/apache/gora/spark/GoraSparkEngine.java
>>>>
>>>> Kind Regards,
>>>> Furkan KAMACI
>>>>
>>>> On Wed, Aug 26, 2015 at 4:40 PM, Ted Malaska <ted.malaska@cloudera.com>
>>>> wrote:
>>>>
>>>>> Where can I find the code for MapReduceTestUtils.testSparkWordCount?
>>>>>
>>>>> On Wed, Aug 26, 2015 at 9:29 AM, Furkan KAMACI
>>>>> <furkankamaci@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Here is the test method I've ignored because of the Connection
>>>>>> Refused failure:
>>>>>>
>>>>>>
>>>>>> https://github.com/kamaci/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/mapreduce/TestHBaseStoreWordCount.java#L65
>>>>>>
>>>>>> I've implemented a Spark backend for Apache Gora as a GSoC project,
>>>>>> and this is the last obstacle I need to solve. Any help would be
>>>>>> welcome.
>>>>>>
>>>>>> Kind Regards,
>>>>>> Furkan KAMACI
>>>>>>
>>>>>> On Wed, Aug 26, 2015 at 3:45 PM, Ted Malaska
>>>>>> <ted.malaska@cloudera.com> wrote:
>>>>>>
>>>>>>> I've always used HBaseTestingUtility and never really had much
>>>>>>> trouble. I use that for all my unit testing between Spark and HBase.
>>>>>>>
>>>>>>> Here are some code examples if you're interested
>>>>>>>
>>>>>>> --Main HBase-Spark Module
>>>>>>> https://github.com/apache/hbase/tree/master/hbase-spark
>>>>>>>
>>>>>>> --Unit tests that cover all basic connections
>>>>>>>
>>>>>>> https://github.com/apache/hbase/blob/master/hbase-spark/src/test/scala/org/apache/hadoop/hbase/spark/HBaseContextSuite.scala
>>>>>>>
>>>>>>> --If you want to look at the old stuff before it went into HBase
>>>>>>> https://github.com/cloudera-labs/SparkOnHBase
>>>>>>>
>>>>>>> Let me know if that helps
>>>>>>>
>>>>>>> On Wed, Aug 26, 2015 at 5:40 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>>>>>
>>>>>>>> Can you log the contents of the Configuration you pass from Spark?
>>>>>>>> The output would give you some clue.
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Aug 26, 2015, at 2:30 AM, Furkan KAMACI <furkankamaci@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Ted,
>>>>>>>>
>>>>>>>> I'll check the ZooKeeper connection, but another test method, which
>>>>>>>> runs on HBase without Spark, works without any error. The HBase
>>>>>>>> version is 0.98.8-hadoop2 and I use Spark 1.3.1.
>>>>>>>>
>>>>>>>> Kind Regards,
>>>>>>>> Furkan KAMACI
>>>>>>>> On 26 Aug 2015 at 12:08, "Ted Yu" <yuzhihong@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> The connection failure was to ZooKeeper.
>>>>>>>>>
>>>>>>>>> Have you verified that localhost:2181 can serve requests?
>>>>>>>>> What version of HBase was Gora built against?
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Aug 26, 2015, at 1:50 AM, Furkan KAMACI <furkankamaci@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I start an HBase cluster for my test class. I use this helper
>>>>>>>>> class:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://github.com/apache/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/util/HBaseClusterSingleton.java
>>>>>>>>>
>>>>>>>>> and use it like this:
>>>>>>>>>
>>>>>>>>> private static final HBaseClusterSingleton cluster =
>>>>>>>>> HBaseClusterSingleton.build(1);
>>>>>>>>>
>>>>>>>>> I retrieve the configuration object as follows:
>>>>>>>>>
>>>>>>>>> cluster.getConf()
>>>>>>>>>
>>>>>>>>> and I use it in Spark as follows:
>>>>>>>>>
>>>>>>>>> sparkContext.newAPIHadoopRDD(conf, MyInputFormat.class, clazzK, clazzV);
>>>>>>>>>
>>>>>>>>> When I run my test, there should be no need to start up a separate
>>>>>>>>> HBase cluster, because Spark should connect to my dummy cluster.
>>>>>>>>> However, when I run my test method, it throws an error:
>>>>>>>>>
>>>>>>>>> 2015-08-26 01:19:59,558 INFO [Executor task launch worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
>>>>>>>>>
>>>>>>>>> 2015-08-26 01:19:59,559 WARN [Executor task launch worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
>>>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>>>>>>>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
>>>>>>>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
>>>>>>>>>
>>>>>>>>> HBase tests that do not run on Spark work well. When I check
>>>>>>>>> the logs, I see that the cluster and Spark are started up correctly:
>>>>>>>>>
>>>>>>>>> 2015-08-26 01:35:21,791 INFO [main] hdfs.MiniDFSCluster (MiniDFSCluster.java:waitActive(2055)) - Cluster is active
>>>>>>>>>
>>>>>>>>> 2015-08-26 01:35:40,334 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 56941.
>>>>>>>>>
>>>>>>>>> I realized that when I start HBase from the command line, my
>>>>>>>>> Spark test method connects to it!
>>>>>>>>>
>>>>>>>>> So, does this mean that Spark ignores the conf I passed to it?
>>>>>>>>> Any ideas about how to solve this?
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
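
The question in the thread about whether localhost:2181 can serve requests could be checked from inside the test with a plain TCP connect. The following is only a rough sketch using the JDK, not code from Gora, HBase, or Spark; the class name `PortCheck` and the method `canConnect` are hypothetical, and a successful connect only shows that something is listening on the port, not that it is a healthy ZooKeeper.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Gora, HBase, or Spark).
final class PortCheck {
    private PortCheck() {}

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    // Any I/O failure, including "Connection refused" or a timeout, yields false.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Calling `PortCheck.canConnect("localhost", 2181, 2000)` just before the Spark job runs would show whether anything is listening on the address the executor thread tries in the log above.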

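
Ted Yu's suggestion to log the contents of the Configuration passed from Spark could look roughly like the sketch below. It relies on the fact that Hadoop's `Configuration` implements `Iterable<Map.Entry<String, String>>`, so the helper needs only JDK types. The class name `ConfDump` and the method `select` are hypothetical, and which key prefixes are worth printing is an assumption; `hbase.zookeeper.` is used here because the failure in the thread is a ZooKeeper connection.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Gora, HBase, or Spark).
final class ConfDump {
    private ConfDump() {}

    // Collects "key=value" lines for every entry whose key starts with one of
    // the given prefixes, sorted so that two dumps can be compared side by side.
    static List<String> select(Iterable<Map.Entry<String, String>> conf,
                               String... prefixes) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : conf) {
            for (String prefix : prefixes) {
                if (entry.getKey().startsWith(prefix)) {
                    lines.add(entry.getKey() + "=" + entry.getValue());
                    break; // avoid adding the same entry twice
                }
            }
        }
        Collections.sort(lines);
        return lines;
    }
}
```

Printing `ConfDump.select(cluster.getConf(), "hbase.zookeeper.")` in the test, and the same call on the Configuration that the input format actually sees, would show whether the two agree on `hbase.zookeeper.quorum` and `hbase.zookeeper.property.clientPort`.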