hive-dev mailing list archives

From "Jinsong Hu" <>
Subject Re: blob handling in hive
Date Tue, 12 Oct 2010 22:49:10 GMT
Storing the blob in HBase is too costly: HBase compaction costs a lot of 
CPU. All I want to do is to be able to read the byte array out of a sequence 
file, and map that byte array to a hive column.
I can write a SerDe for this purpose.

I tried to define the data to be array<tinyint>. I then tried to write a 
custom SerDe; after I get the byte array out of the disk, I need to map it 
to that column.

  so I wrote the code:

but then how do I convert the data in the row.set() method?

I tried this:

        byte[] bContent = ev.get_content() == null ? null :
            (ev.get_content().getData() == null ? null : ev.get_content().getData());
        Byte tContent = bContent == null ? new Byte((byte) 0) : new Byte(bContent[0]);
        row.set(2, tContent);

 This works for a single byte, but doesn't work for a byte array.
Any way that I can get the byte array returned in SQL is appreciated.
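For an array<tinyint> column, Hive's standard object inspectors expect a List of the boxed primitive type rather than a raw byte[], so the array has to be boxed element by element before it goes into the row. A minimal sketch of that conversion (the ev accessor and the row slot index in the snippet above are taken from the original code; everything else here is illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class BlobColumn {

    // Box a byte[] into the List<Byte> shape that Hive's standard
    // list object inspector expects for an array<tinyint> column.
    public static List<Byte> toTinyintArray(byte[] content) {
        if (content == null) {
            return null;  // a null blob maps to a NULL column value
        }
        List<Byte> boxed = new ArrayList<Byte>(content.length);
        for (byte b : content) {
            boxed.add(b);  // autoboxes each byte to Byte
        }
        return boxed;
    }

    public static void main(String[] args) {
        List<Byte> col = toTinyintArray(new byte[] {1, 2, 3});
        System.out.println(col);  // prints [1, 2, 3]
    }
}
```

In the deserializer, the result of toTinyintArray(bContent) would then be passed to row.set(2, ...) in place of the single Byte.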


From: "Ted Yu" <>
Sent: Tuesday, October 12, 2010 2:19 PM
To: <>
Subject: Re: blob handling in hive

> One way is to store blob in HBase and use HBaseHandler to access your 
> blob.
> On Tue, Oct 12, 2010 at 2:14 PM, Jinsong Hu <> 
> wrote:
>> Hi,
>>  I am using sqoop to export data from mysql to hive. I noticed that hive
>> doesn't have a blob data type yet. Is there any way I can make hive store
>> blobs?
>> Jimmy
