hbase-user mailing list archives

From anil gupta <anilgupt...@gmail.com>
Subject Re: Is it ok to store all integers as Strings instead of byte[] in hbase?
Date Fri, 08 Jul 2016 19:17:18 GMT
Hi Mahesha,

I think it's not a good idea to store numbers/dates as strings. If you store
numbers as strings, then you won't be able to do numerical/date comparisons.
HBase is data type agnostic: it stores and compares values as raw bytes, so
string-encoded numbers sort lexicographically ("9" sorts after "10"). IMO, you
will be better off using Apache Phoenix (http://phoenix.apache.org/). Phoenix
is a SQL layer on top of HBase. It is ANSI SQL compliant.
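
To illustrate the comparison problem, here is a minimal sketch in plain Java
(no HBase dependency). It assumes that Bytes.toBytes(int) yields a 4-byte
big-endian value, which matches the \x00\x00\x00\x19 shown for 25 in the scan
output quoted below, and that bytes are compared as unsigned values,
lexicographically. The class and method names (EncodingDemo, encodeAsInt,
encodeAsString, compareBytes) are made up for this example and are not part
of any HBase API. It needs Java 9 or later for Arrays.compareUnsigned.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingDemo {
    // Fixed-width, 4-byte big-endian encoding of an int.
    // Assumption: this mirrors what Bytes.toBytes(int) produced in the thread.
    static byte[] encodeAsInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // The decimal digits of the number as UTF-8 text, e.g. 25 -> "25".
    static byte[] encodeAsString(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Unsigned lexicographic byte comparison (assumed ordering for raw bytes).
    static int compareBytes(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Renders bytes in the \xNN style used by the hbase shell for binary data.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encodeAsInt(25)));    // \x00\x00\x00\x19
        System.out.println(hex(encodeAsString(25))); // \x32\x35

        // As strings, "9" sorts AFTER "100" because '9' (0x39) > '1' (0x31).
        System.out.println(compareBytes(encodeAsString(9), encodeAsString(100)) > 0); // true

        // As fixed-width ints, 9 sorts BEFORE 100, as expected.
        System.out.println(compareBytes(encodeAsInt(9), encodeAsInt(100)) < 0);       // true
    }
}
```

Note that this plain big-endian encoding only preserves order for
non-negative ints: a negative int has its high bit set, so it sorts after
every positive value under unsigned byte comparison.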

Currently, Phoenix is officially supported by HDP, and it is also present in
Cloudera Labs.

HTH,
Anil Gupta

On Fri, Jul 8, 2016 at 5:18 AM, Dima Spivak <dspivak@cloudera.com> wrote:

> Hey Mahesha,
>
> It might be worthwhile to read through the architecture section of our ref
> guide: https://hbase.apache.org/book.html#_architecture
>
> Cheers,
>   Dima
>
> On Friday, July 8, 2016, Mahesha999 <abnave.m@gmail.com> wrote:
>
> > I am trying out some hbase code. I realised that when I insert data through
> > hbase shell using put command, then everything (both numeric and string) is
> > put as string:
> >
> > hbase(main):001:0> create 'employee', {NAME => 'f'}
> > hbase(main):003:0> put 'employee', 'ganesh','f:age',30
> > hbase(main):004:0> put 'employee', 'ganesh','f:desg','mngr'
> > hbase(main):005:0> scan 'employee'
> > ROW                   COLUMN+CELL
> > ganesh               column=f:age, timestamp=1467926618738, value=30
> > ganesh               column=f:desg, timestamp=1467926639557, value=mngr
> >
> > However when I put data using Java API, non-string stuff gets serialized as
> > byte[]:
> >
> > Cluster lNodes = new Cluster();
> > lNodes.add("digitate-VirtualBox:8090");
> > Client lClient= new Client(lNodes);
> > RemoteHTable remoteht = new RemoteHTable(lClient, "employee");
> >
> > Put lPut = new Put(Bytes.toBytes("mahesh"));
> > lPut.add(Bytes.toBytes("f"), Bytes.toBytes("age"), Bytes.toBytes(25));
> > lPut.add(Bytes.toBytes("f"), Bytes.toBytes("desg"), Bytes.toBytes("dev"));
> > remoteht.put(lPut);
> >
> > Scan in hbase shell shows age 25 of mahesh is stored as \x00\x00\x00\x19:
> >
> > hbase(main):006:0> scan 'employee'
> > ROW                   COLUMN+CELL
> > ganesh               column=f:age, timestamp=1467926618738, value=30
> > ganesh               column=f:desg, timestamp=1467926639557, value=mngr
> > mahesh               column=f:age, timestamp=1467926707712, value=\x00\x00\x00\x19
> > mahesh               column=f:desg, timestamp=1467926707712, value=dev
> >
> > *1.* Considering I will be storing only numeric and string data in hbase,
> > what benefits does it provide to store numeric data as byte[] (as in the
> > case above) or as string:
> > lPut.add(Bytes.toBytes("f"), Bytes.toBytes("age"), Bytes.toBytes("25")); //instead of toBytes(25)
> >
> > *2.* Also, why are strings stored as is and not serialized to byte[], even
> > when put using the Java API?
> >
> >
> >
> > --
> > View this message in context:
> >
> > http://apache-hbase.679495.n3.nabble.com/Is-it-ok-to-store-all-integers-as-Strings-instead-of-byte-in-hbase-tp4081100.html
> > Sent from the HBase User mailing list archive at Nabble.com.
> >
>



-- 
Thanks & Regards,
Anil Gupta
