hbase-user mailing list archives

From Praveen Bysani <praveen.ii...@gmail.com>
Subject Re: Block size of HBase files
Date Mon, 13 May 2013 09:45:25 GMT
Hi,

I wanted to minimize the number of MapReduce tasks generated while
processing a job, so I set dfs.block.size to a larger value.

I don't think I have configured the HFile size in the cluster. I use Cloudera
Manager to manage my cluster, and the only configuration I can relate
to is hfile.block.cache.size, which is set to 0.25. How do I change the
HFile size?
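
For reference, here is a rough sketch of the arithmetic behind the fsck report
quoted below. It assumes that HDFS splits each file into blocks independently
(so a non-empty file always occupies at least one block) and that 1 GB means
1024^3 bytes. The helper name hdfs_block_count and the uniform file-size list
are made up for illustration; only the totals come from the fsck output.

```python
def hdfs_block_count(file_sizes, block_size):
    """Count blocks when every file is split into block_size chunks on its own.

    An empty file contributes 0 blocks; any non-empty file contributes at
    least 1, however large the configured block size is.
    """
    # -(-a // b) is integer ceiling division, avoiding float rounding.
    return sum(-(-size // block_size) for size in file_sizes)


BLOCK_SIZE = 1024 ** 3        # dfs.block.size of 1 GB (assumed binary GB)
TOTAL_BYTES = 265727504820    # "Total size" from the fsck report
TOTAL_FILES = 1459            # "Total files" from the fsck report

# Expectation if all data lived in one file: total size / block size.
naive_estimate = -(-TOTAL_BYTES // BLOCK_SIZE)

# Average file size; fsck reports the same figure as the average block size.
average_file = TOTAL_BYTES // TOTAL_FILES

# Hypothetical layout: 1459 files, each of average size (real sizes vary).
blocks = hdfs_block_count([average_file] * TOTAL_FILES, BLOCK_SIZE)

print(naive_estimate, average_file, blocks)
```

Under these assumptions the ~250 figure only holds for a single large file.
With 1459 files that are each smaller than the block size, the block count
equals the file count, which matches what fsck reports.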

On 13 May 2013 15:03, Amandeep Khurana <amansk@gmail.com> wrote:

> On Sun, May 12, 2013 at 11:40 PM, Praveen Bysani <praveen.iiith@gmail.com>
> wrote:
>
> > Hi,
> >
> > I have the dfs.block.size value set to 1 GB in my cluster configuration.
>
>
> Just out of curiosity - why do you have it set at 1GB?
>
>
> > I have around 250 GB of data stored in HBase on this cluster. But when I
> > check the number of blocks, it doesn't correspond to the block size value
> > I set. From what I understand, I should only have ~250 blocks. But instead,
> > when I ran fsck on /hbase/<table-name>, I got the following:
> >
> > Status: HEALTHY
> >  Total size:    265727504820 B
> >  Total dirs:    1682
> >  Total files:   1459
> >  Total blocks (validated):      1459 (avg. block size 182129886 B)
> >  Minimally replicated blocks:   1459 (100.0 %)
> >  Over-replicated blocks:        0 (0.0 %)
> >  Under-replicated blocks:       0 (0.0 %)
> >  Mis-replicated blocks:         0 (0.0 %)
> >  Default replication factor:    3
> >  Average block replication:     3.0
> >  Corrupt blocks:                0
> >  Missing replicas:              0 (0.0 %)
> >  Number of data-nodes:          5
> >  Number of racks:               1
> >
> > Are there any other configuration parameters that need to be set ?
>
>
> What is your HFile size set to? The HFiles that get persisted would be
> bound by that number. Thereafter each HFile would be split into blocks, the
> size of which you configure using the dfs.block.size configuration
> parameter.
>
>
> >
> > --
> > Regards,
> > Praveen Bysani
> > http://www.praveenbysani.com
> >
>



-- 
Regards,
Praveen Bysani
http://www.praveenbysani.com
