spark-issues mailing list archives

From "Aniruddh Tiwari (JIRA)"
Subject [jira] [Created] (SPARK-5356) Write to Hbase from Spark
Date Wed, 21 Jan 2015 20:28:34 GMT
Aniruddh Tiwari created SPARK-5356:

             Summary: Write to Hbase from Spark
                 Key: SPARK-5356
             Project: Spark
          Issue Type: Question
          Components: Examples, Spark Shell
    Affects Versions: 1.1.0
         Environment: Linux
            Reporter: Aniruddh Tiwari

I am able to read from HBase in Spark, but I am not able to write rows to HBase from Spark.
I am on Cloudera 5.0 (Spark 1.1.0 and HBase 0.98.6). So far this is what I have.

I have an RDD, localData. How can I save it to HBase, and how can I use saveAsHadoopDataset?
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.mapred.JobConf
//Create RDD
val localData = sc.textFile("/home/hbase_example/antiwari/scala_code/resources/scala_load_file.txt")
val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "localhost")
val jobConfig: JobConf = new JobConf(conf, this.getClass)
jobConfig.set(TableOutputFormat.OUTPUT_TABLE, "spark_data")
/* Contents of scala_load_file.txt:
0000000001, Name01, Field1
0000000002, Name02, Field2
0000000003, Name03, Field3
0000000004, Name04, Field4
*/

I looked at many examples online. With one of them, I get the following error (maybe because I am on Spark 1.1.0 and the example is old):

scala> def convert(triple: (Int, String, String)) = {
| val p = new Put(Bytes.toBytes(triple._1))
| p.add(Bytes.toBytes("cf"),
| Bytes.toBytes("col_1"), Bytes.toBytes(triple._2))
| p.add(Bytes.toBytes("cf"),
| Bytes.toBytes("col_2"), Bytes.toBytes(triple._3))
| (new ImmutableBytesWritable, p)
| }
<console>:18: error: not found: type Put
val p = new Put(Bytes.toBytes(triple._1))
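
The "not found: type Put" error is most likely a missing import rather than a Spark version problem: the import list above does not include Put, Bytes, or ImmutableBytesWritable. The following is a minimal sketch, not a verified solution, assuming the HBase 0.98 client API, the spark-shell's predefined sc, and a table spark_data that already has a column family named cf. The helper parseLine is hypothetical and only reflects the three-field file layout shown above; the row key is kept as a String rather than an Int so the leading zeros survive.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.SparkContext._ // pair-RDD implicits; already in scope in spark-shell

// Hypothetical helper: split "0000000001, Name01, Field1" into three
// trimmed fields and drop any line that does not have exactly three.
def parseLine(line: String): Option[(String, String, String)] =
  line.split(",").map(_.trim) match {
    case Array(key, name, field) => Some((key, name, field))
    case _                       => None
  }

// Build one Put per record. Assumes column family "cf" exists in the table.
def convert(triple: (String, String, String)): (ImmutableBytesWritable, Put) = {
  val p = new Put(Bytes.toBytes(triple._1))
  p.add(Bytes.toBytes("cf"), Bytes.toBytes("col_1"), Bytes.toBytes(triple._2))
  p.add(Bytes.toBytes("cf"), Bytes.toBytes("col_2"), Bytes.toBytes(triple._3))
  (new ImmutableBytesWritable, p)
}

val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "localhost")

// saveAsHadoopDataset uses the old "mapred" API, so the JobConf must name
// both the output format and the target table.
val jobConfig = new JobConf(conf)
jobConfig.setOutputFormat(classOf[TableOutputFormat])
jobConfig.set(TableOutputFormat.OUTPUT_TABLE, "spark_data")

val localData = sc.textFile("/home/hbase_example/antiwari/scala_code/resources/scala_load_file.txt")
localData
  .flatMap(line => parseLine(line).toList)
  .map(convert)
  .saveAsHadoopDataset(jobConfig)
```

The HBase client jars must also be on the classpath of the shell and the executors; how that is arranged depends on the cluster setup and is not covered here.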
