hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Am I crazy or that's not that much?
Date Tue, 01 Dec 2015 15:39:24 GMT
bq. current MR implementation may OOME if there are too many columns

This is related:
HBASE-14696 Support setting allowPartialResults in mapreduce Mappers

but it is not in any hbase release yet.

FYI

On Tue, Dec 1, 2015 at 7:16 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> I can not say if you are crazy or not. Only you know ;)
>
> Now, regarding the number of columns... it depends...
> If you want to store 800 000 1MB columns, that's almost 800GB for one row.
> Forget that! HBase will not split within a row, so the whole row lands in
> a single region, and you will kill your RS with a region that big. But if
> you want to store 800 000 8-byte columns, it's only about 6MB per row,
> which is totally doable in recent HBase versions.
> But think about:
> - If there is no consistency constraint, add the CQ (Column Qualifier) as
> part of the key to be able to split.
> - Regroup some values together if they are accessed together. If you always
> read 10K at a time, just put those 10K together in a single cell.
>
> Also, keep in mind that the current MR implementation may OOME if there
> are too many columns... A fix is coming, but it is not ready yet.
>
> Now, regarding column families, use them only if you need them. A very
> different access pattern or data format (JPG vs plain text, etc.) can
> justify another column family, but most of the time you can do all you
> need with a single one...
>
> HTH,
>
> JMS
>
> 2015-12-01 6:48 GMT-05:00 Marko Dinic <hacker.marko@gmail.com>:
>
> > Hi everyone,
> >
> > I'm new to HBase and I have a simple question - is 800.000 columns a lot
> > to be stored in a single column family?
> >
> > This data will mostly be processed as MR jobs.
> >
> > My guess is that it is not, since all the values are stored in a single
> > Region, so there shouldn't be a problem.
> >
> > Is there any limit to the number of columns in a column family?
> >
> > --
> > Marko Dinic
> >
>
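
For anyone who wants to check the numbers in the quoted reply, here is a rough sketch in plain Java (no HBase dependency). It reproduces the two row-size estimates and shows one possible way to fold the column qualifier into the row key so that HBase can split the data. The class name, the method names and the zero-byte separator are all hypothetical choices made for illustration, and the size estimate counts value bytes only, ignoring the per-cell key overhead.

```java
import java.nio.charset.StandardCharsets;

public class WideRowSketch {

    // Value bytes held by one row: columns * bytes per value.
    // Assumption: per-cell overhead (row key, family, qualifier,
    // timestamp) is ignored, so the real footprint is larger.
    public static long rowPayloadBytes(long columns, long bytesPerValue) {
        return columns * bytesPerValue;
    }

    // Hypothetical composite key: <rowId> 0x00 <qualifier>.
    // Each former column becomes its own row, so regions can split
    // between them. Row-level atomicity across those values is lost.
    public static byte[] compositeRowKey(String rowId, String qualifier) {
        byte[] r = rowId.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[r.length + 1 + q.length];
        System.arraycopy(r, 0, key, 0, r.length);
        key[r.length] = 0; // separator byte, assumed absent from rowId
        System.arraycopy(q, 0, key, r.length + 1, q.length);
        return key;
    }

    public static void main(String[] args) {
        long columns = 800_000L;
        // 1 MiB values: 838,860,800,000 bytes, i.e. roughly 781 GiB in one row
        System.out.println(rowPayloadBytes(columns, 1L << 20));
        // 8-byte values: 6,400,000 bytes, i.e. about 6 MB in one row
        System.out.println(rowPayloadBytes(columns, 8L));
        // One key per former column
        System.out.println(compositeRowKey("row1", "c42").length);
    }
}
```

The composite key only makes sense under the "no consistency constraint" condition mentioned above, since a Get or Put on a single row no longer covers all the values.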
