hbase-user mailing list archives

From Upendra Yadav <upendra1...@gmail.com>
Subject Re: Is HBase is feasible for storing 4-5 MB of data as cell value
Date Tue, 25 Feb 2014 20:30:46 GMT
I have come to the same conclusion you suggest: keep them in separate files in
HDFS and store only references in HBase.

I will try packing several attachments into a single file.

Thanks a lot.


On Wed, Feb 26, 2014 at 1:45 AM, Vladimir Rodionov <vrodionov@carrieriq.com> wrote:

> Usually, it is not advisable to store such large values in HBase (to
> avoid excessive IO during compaction).
> Keep them in separate files in HDFS and store only references in HBase.
> To overcome the inherent limit on the number of files the NN can track,
> you can pack several values into a single file (you will need a separate
> process, e.g. an M/R job, to garbage-collect expired or deleted items).
>
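The packing idea above can be sketched as follows. This is a minimal illustration, not code from the thread: it uses the local filesystem through `java.nio` as a stand-in for HDFS, and the names `BlobPacker`, `BlobRef`, `pack`, and `read` are hypothetical. The point is only that each value is appended to one container file and the HBase cell would hold a small (file, offset, length) reference instead of the value.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: packs many values into one container file. */
final class BlobPacker {

    /** The small record that would be stored in HBase instead of the value. */
    static final class BlobRef {
        final String file;
        final long offset;
        final int length;

        BlobRef(String file, long offset, int length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
        }

        /** One possible cell encoding (an assumption): "file:offset:length". */
        byte[] toCellValue() {
            return (file + ":" + offset + ":" + length).getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Appends every value to the container and returns one reference per value. */
    static List<BlobRef> pack(Path container, List<byte[]> values) throws IOException {
        List<BlobRef> refs = new ArrayList<>();
        // Start at the current end of the file so earlier references stay valid.
        long offset = Files.exists(container) ? Files.size(container) : 0L;
        try (OutputStream out = Files.newOutputStream(
                container, StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (byte[] value : values) {
                out.write(value);
                refs.add(new BlobRef(container.toString(), offset, value.length));
                offset += value.length;
            }
        }
        return refs;
    }

    /** Random access: seek to the recorded offset and read exactly length bytes. */
    static byte[] read(BlobRef ref) throws IOException {
        byte[] buf = new byte[ref.length];
        try (RandomAccessFile raf = new RandomAccessFile(ref.file, "r")) {
            raf.seek(ref.offset);
            raf.readFully(buf);
        }
        return buf;
    }
}
```

Deleting a value in this layout only removes its reference; the bytes stay in the container until something rewrites it, which is why a separate garbage-collection job is needed. On a real cluster the same shape would presumably go through the Hadoop `FileSystem` API rather than `java.nio`.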
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Ted Yu [yuzhihong@gmail.com]
> Sent: Tuesday, February 25, 2014 12:02 PM
> To: user@hbase.apache.org
> Subject: Re: Is HBase is feasible for storing 4-5 MB of data as cell value
>
> Minor:
> A value of 0 also means no cap; see HTable#validatePut():
>
>     if (maxKeyValueSize > 0) {
> ...
>           if (kv.getLength() > maxKeyValueSize) {
>             throw new IllegalArgumentException("KeyValue size too large");
>           }
>
>
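To make the cap semantics concrete, here is a small self-contained sketch that mirrors the quoted check. It is not the HBase source: `KeyValueSizeCheck` and `validate` are hypothetical names, and the sizes in the usage note are just the figures mentioned in this thread. It shows that the limit applies only when the configured value is positive, so both 0 and -1 disable it.

```java
/** Hypothetical stand-in for the client-side size check; not the HBase source. */
final class KeyValueSizeCheck {

    /**
     * Rejects a KeyValue larger than the cap.
     * A cap of 0 or any negative number means "no cap", as in the quoted snippet.
     */
    static void validate(int keyValueLength, int maxKeyValueSize) {
        if (maxKeyValueSize > 0 && keyValueLength > maxKeyValueSize) {
            throw new IllegalArgumentException("KeyValue size too large");
        }
    }
}
```

With a 10 MB cap, `validate(5 * 1024 * 1024, 10 * 1024 * 1024)` returns normally and `validate(11 * 1024 * 1024, 10 * 1024 * 1024)` throws; with a cap of 0 or -1 neither call throws.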
> On Tue, Feb 25, 2014 at 11:52 AM, Ameya Kanitkar <ameya@groupon.com>
> wrote:
>
> > The only other thing I'd add is that, by default, HBase caps the size of
> > the data per column at 10 MB (I think). You can change that with this
> > setting in hbase-site.xml:
> >
> > hbase.client.keyvalue.maxsize
> >
> > -1 means no cap. You can set another number as an appropriate cap for your
> > use case.
> >
> > Ameya
> >
> >
> > On Tue, Feb 25, 2014 at 12:12 AM, shashwat shriparv <dwivedishashwat@gmail.com> wrote:
> >
> > > Yes, you can certainly use HBase for this. Some options:
> > > 1. Store the different fields of a mail in different columns of a column
> > > family, with the attachment as a byte array in another column.
> > > 2. Keep the whole message in HBase columns and, if the attachments are
> > > large enough, put them on HDFS and store a reference to them in the
> > > HBase table.
> > > 3. The schema is up to you; you can draw up a matrix of how you would
> > > store the values and decide from that.
> > >
> > >
> > > Warm Regards,
> > > Shashwat Shriparv
> > >
> > > On Tue, Feb 25, 2014 at 12:55 PM, Upendra Yadav <upendra1024@gmail.com> wrote:
> > >
> > > > I have to use HBase and have mixed types of data.
> > > >
> > > > Some items are 1-4 KB (mail headers, ...) and others are larger
> > > > than 5 MB (attachments, ...).
> > > >
> > > > We also need only random access to any of the data.
> > > >
> > > > Is HBase feasible for storing this type of data?
> > > >
> > > > What should my schema design be? Will I have to go with 2 different
> > > > tables, the 1st for the 1-4 KB items and the 2nd for big files
> > > > (because a memstore flush will also flush the other CF, and because
> > > > of the heavy random access)?
> > > >
> > > > Or is there another way?
> > > >
> > > > Thanks
> > > >
> > >
> >
>
