hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Does adding new columns cause compaction storm?
Date Tue, 13 Oct 2015 21:21:28 GMT
It depends ;)

If the added column triggers a flush, this flush might trigger a
compaction ;)

But it will be exactly the same with an existing column. A write does not
trigger a compaction just because it targets a new column. Any mutation
might trigger a flush and then a compaction, whatever column it is.
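
To make the point concrete, here is a toy model in Python. It is only a
sketch and not HBase code: the class ToyStore, the flush size of 4 cells and
the threshold of 3 store files are all made up for illustration, and real
HBase flushes on memstore size in bytes and selects files through a
compaction policy. It only shows that the flush and compaction counts depend
on how much is written, not on whether the column qualifier is new.

```python
class ToyStore:
    """Toy model of one store (column family). Hypothetical, NOT HBase code."""

    def __init__(self, flush_size=4, compaction_threshold=3):
        self.flush_size = flush_size                      # assumed: cells per flush
        self.compaction_threshold = compaction_threshold  # assumed: files per compaction
        self.memstore = []      # in-memory cells: (row, column, timestamp, value)
        self.store_files = []   # each flush produces one sorted "file"
        self.flushes = 0
        self.compactions = 0
        self._ts = 0

    def put(self, row, column, value):
        # Every write becomes a new cell, whether or not the column existed before.
        self._ts += 1
        self.memstore.append((row, column, self._ts, value))
        if len(self.memstore) >= self.flush_size:
            self._flush()

    def _flush(self):
        # Write the memstore out as one sorted file, then check the file count.
        self.store_files.append(sorted(self.memstore))
        self.memstore = []
        self.flushes += 1
        if len(self.store_files) >= self.compaction_threshold:
            self._compact()

    def _compact(self):
        # Merge all files into one, keeping cells sorted by (row, column, timestamp).
        merged = sorted(cell for f in self.store_files for cell in f)
        self.store_files = [merged]
        self.compactions += 1


def write_workload(columns, rows=("eventid1", "eventid2", "eventid3", "eventid4")):
    """Write one cell per row for each column name; return (flushes, compactions)."""
    store = ToyStore()
    for column in columns:
        for row in rows:
            store.put(row, column, 0)
    return store.flushes, store.compactions


# Same number of writes: six rounds into one existing column vs. six new columns.
existing = write_workload(["09-01-2015"] * 6)
new = write_workload([f"09-0{day}-2015" for day in range(1, 7)])
print(existing, new)
```

In this toy model both workloads report the same flush and compaction
counts, because only the number of cells written matters.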

HTH

JMS

2015-10-11 21:49 GMT-04:00 anil gupta <anilgupta84@gmail.com>:

> Hi Liren,
>
> In short, adding new columns will *not* trigger compaction.
>
>
> Thanks,
> Anil Gupta
>
> On Sat, Oct 10, 2015 at 9:20 PM, Liren Ding <sky.gonna.bright@gmail.com>
> wrote:
>
> > Thanks Ted. So far I don't see a direct answer in any HBase books or
> > articles. All resources say that values are ordered by rowkey:cf:column,
> > but no one explains how new columns are stored after compaction. I think
> > after compaction the store files should still organize data the same
> > way. So if a new column needs to be added to all rows regularly, the
> > compaction might have to do extra I/O work accordingly. Maybe it is
> > better for the schema design to keep old data intact instead of
> > continually adding new columns to it.
> >
> > On Sat, Oct 10, 2015 at 7:55 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Please take a look at:
> > >
> > > http://hbase.apache.org/book.html#_compaction
> > > http://hbase.apache.org/book.html#exploringcompaction.policy
> > >
> > > http://hbase.apache.org/book.html#compaction.ratiobasedcompactionpolicy.algorithm
> > >
> > > FYI
> > >
> > > On Sat, Oct 10, 2015 at 6:53 PM, Liren Ding <sky.gonna.bright@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to design a schema for time series event data. The row
> > > > key is eventId, and event data is added into new "date" columns
> > > > daily. So in a query I only need to set a filter on columns to find
> > > > all data for the specified events. The table should look like the
> > > > following:
> > > >
> > > > rowkey    |  09-01-2015  |  09-02-2015  |  ......
> > > >
> > > > eventid1     data11         data12
> > > > eventid2     data21         data22
> > > > eventid3     ......         ......
> > > > .......
> > > >
> > > > I know that during compaction the data with the same row key will be
> > > > stored together. So with this design, will new columns cause a
> > > > compaction storm? Or any other issues?
> > > > Much appreciated!
> > > >
> > >
> >
>
>
>
> --
> Thanks & Regards,
> Anil Gupta
>
