hbase-user mailing list archives

From kranthi reddy <kranthili2...@gmail.com>
Subject Re: Porting SQL DB into HBASE
Date Tue, 13 Apr 2010 05:17:29 GMT
Hi all,


@Amandeep: The main reason for porting to HBase is that it is open source.
The NGO currently pays high licensing fees for Microsoft SQL Server, so we
planned to port to HBase to save money, and also because it scales to large
datasets.

@Jonathan: The problem is that these static tables can't be combined. Each
table describes a different entity. For example, one static table might
contain information about all the counties in a country, and another might
contain information about all the doctors in the country.

That is why I don't think it is possible to combine these static tables:
they don't have any primary/foreign keys referencing one another.

The dynamic tables are pretty large (though small compared to what HBase can
support). These tables will keep growing and might contain up to 100
million entries in the future.
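
For a rough sense of scale, the figures quoted further down in this thread can
be multiplied out. This is only a back-of-the-envelope sketch: it assumes the
"less than 1000" and "more than 20,000" figures are per-table counts, and the
variable names are purely illustrative.

```python
# Table counts quoted in the thread: about 80% static, 20% dynamic of ~500.
STATIC_TABLES = 400
DYNAMIC_TABLES = 100

# Assumption: these bounds apply to each table individually.
MAX_ROWS_PER_STATIC_TABLE = 1_000     # "mostly less than 1000 entries"
MIN_ROWS_PER_DYNAMIC_TABLE = 20_000   # "more than 20,000 entries"

# Upper bound on all static rows, lower bound on all dynamic rows.
static_rows_upper = STATIC_TABLES * MAX_ROWS_PER_STATIC_TABLE
dynamic_rows_lower = DYNAMIC_TABLES * MIN_ROWS_PER_DYNAMIC_TABLE

print(f"static rows (at most):   {static_rows_upper:,}")
print(f"dynamic rows (at least): {dynamic_rows_lower:,}")
```

Under these assumptions the static data stays below half a million rows in
total, while the dynamic data starts at around two million rows.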

Thank you,
kranthi

On Tue, Apr 13, 2010 at 12:17 AM, Michael Segel
<michael_segel@hotmail.com> wrote:

>
>
> Just an idea: take a look at a hierarchical design like Pick.
> I know it's doable, but I don't know how well it will perform.
>
>
> > Date: Mon, 12 Apr 2010 14:25:48 +0530
> > Subject: Re: Porting SQL DB into HBASE
> > From: kranthili2020@gmail.com
> > To: hbase-user@hadoop.apache.org
> >
> > Hi Jonathan,
> >
> > Sorry for the late response. Missed your reply.
> >
> > The problem is, around 80% (400) of the tables are static tables and the
> > remaining 20% (100) are dynamic tables that are updated on a daily basis.
> > Denormalising these 20% of the tables is also extremely difficult, and we
> > are planning to port them directly into HBase. Denormalising these tables
> > would also lead to a lot of redundant data.
> >
> > The static tables have a few hundred entries each, mostly fewer than
> > 1000 entries (rows), whereas the dynamic tables have more than 20,000
> > entries, and each entry might be updated/modified at least once a week.
> >
> > Regards,
> > kranthi
> >
> >
> > On Wed, Mar 31, 2010 at 10:23 PM, Jonathan Gray <jgray@facebook.com> wrote:
> >
> > > Kranthi,
> > >
> > > HBase can handle a good number of tables, but think tens or maybe a
> > > hundred. If you have 500 tables you should definitely be rethinking your
> > > schema design. The issue is less about HBase being able to handle lots of
> > > tables, and much more about whether scattering your data across lots of
> > > tables will be performant at read time.
> > >
> > >
> > > 1)  Impossible to answer that question without knowing the schemas of
> > > the existing tables.
> > >
> > > 2)  Not really any relation between fault tolerance and the number of
> > > tables except potentially for recovery time but this would be the same
> > > with few, very large tables.
> > >
> > > 3)  No difference in write performance.  Read performance for simple key
> > > lookups would not be impacted, but most likely having data spread out
> > > like this will mean you'll need joins of some sort.
> > >
> > > Can you tell more about your data and queries?
> > >
> > > JG
> > >
> > > > -----Original Message-----
> > > > From: kranthi reddy [mailto:kranthili2020@gmail.com]
> > > > Sent: Wednesday, March 31, 2010 3:05 AM
> > > > To: hbase-user@hadoop.apache.org
> > > > Subject: Porting SQL DB into HBASE
> > > >
> > > > Hi all,
> > > >
> > > >         I have run into some trouble while trying to port a SQL DB to
> > > > HBase. The problem is that my SQL DB has around 500 tables and is very
> > > > badly designed. Around 45-50 tables could be denormalised into a single
> > > > table, and the remaining tables are static tables. My questions are:
> > > >
> > > > 1) Is it possible to port this DB (tables) to HBase? If so, how?
> > > > 2) How many tables can HBase support with tolerance towards failure?
> > > > 3) When so many tables are inserted, how is performance going to be
> > > > affected? Will it remain the same or degrade?
> > > >
> > > > One possible solution, I think, is using a column family for each table.
> > > > But as per my knowledge and previous experiments, I found HBase isn't
> > > > stable when there are more than 5 column families.
> > > >
> > > > Since large quantities of data are ported into the database every day,
> > > > a stable, fail-proof system is the highest priority.
> > > >
> > > > Hoping for a positive response.
> > > >
> > > > Thank you,
> > > > kranthi
> > >
> >
> >
> >
> > --
> > Kranthi Reddy. B
> > Room No : 98
> > Old Boys Hostel
> > IIIT-HYD
> >
> > -----------
> >
> > I don't know the key to success, but the key to failure is trying to
> > impress others.
>



-- 
Kranthi Reddy. B
Room No : 98
Old Boys Hostel
IIIT-HYD

-----------

I don't know the key to success, but the key to failure is trying to impress
others.
