hbase-user mailing list archives

From Jack Levin <magn...@gmail.com>
Subject Re: Millions of photos into Hbase
Date Tue, 21 Sep 2010 00:16:16 GMT
Sounds good; the only reason I ask is because of this:

There are currently two active branches of HBase:

    * 0.20 - the current stable release series, being maintained with
patches for bug fixes only. This release series does not support HDFS
durability - edits may be lost in the case of node failure.
    * 0.89 - a development release series with active feature and
stability development, not currently recommended for production use.
This release does support HDFS durability - cases in which edits are
lost are considered serious bugs.
>>>>>>

Are we talking about data loss when a datanode goes down while being
written to, or when a RegionServer goes down?

-jack
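
For context on where loss can creep in: a client edit first lands in the
RegionServer's write-ahead log (WAL), which is a file on HDFS, so durability
depends both on the WAL being used and on HDFS actually syncing appended
edits (the append-branch point Ryan makes below). A minimal client-side
sketch, assuming the 0.90-style Java client API (Put.setWriteToWAL and
friends); the table, family, and row names here are hypothetical:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DurablePutSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical table/family/row names, for illustration only.
            HTable table = new HTable(HBaseConfiguration.create(), "images");
            Put put = new Put(Bytes.toBytes("img55/example.jpg"));
            put.add(Bytes.toBytes("image"), Bytes.toBytes("data"),
                    Bytes.toBytes("...jpeg bytes..."));
            // Keep the WAL enabled (the default); disabling it trades
            // durability for speed. Even with it enabled, the edit is only
            // as safe as the underlying HDFS sync/append support, which is
            // why 0.89 plus an append-capable HDFS is the recommendation.
            put.setWriteToWAL(true);
            table.put(put);
            table.close();
        }
    }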


On Mon, Sep 20, 2010 at 4:09 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> We run 0.89 in production @ Stumbleupon.  We also employ 3 committers...
>
> As for safety, you have no choice but to run 0.89.  If you run a 0.20
> release, you will lose data.  You must be on 0.89 and
> CDH3/append-branch to achieve data durability, and there really is no
> argument around it.  If you are doing your tests with 0.20.6 now, I'd
> stop and rebase those tests onto the latest DR (developer release)
> announced on the list.
>
> -ryan
>
> On Mon, Sep 20, 2010 at 3:17 PM, Jack Levin <magnito@gmail.com> wrote:
>> Hi Stack, see inline:
>>
>> On Mon, Sep 20, 2010 at 2:42 PM, Stack <stack@duboce.net> wrote:
>>> Hey Jack:
>>>
>>> Thanks for writing.
>>>
>>> See below for some comments.
>>>
>>> On Mon, Sep 20, 2010 at 11:00 AM, Jack Levin <magnito@gmail.com> wrote:
>>>>
>>>> ImageShack gets close to two million image uploads per day, which are
>>>> usually stored on regular servers (we have about 700), as regular
>>>> files, and each server has its own host name, such as (img55).   I've
>>>> been researching how to improve our backend design in terms of data
>>>> safety and stumbled onto the HBase project.
>>>>
>>>
>>> Any other requirements other than data safety? (latency, etc).
>>
>> Latency is the second requirement.  We have some services that have a
>> very short tail and can produce a 95% cache hit rate, so I assume this
>> would really put the cache to good use.  Some other services, however,
>> have about a 25% cache hit ratio, in which case the latency should be
>> 'adequate', e.g. if it's slightly worse than getting data off raw disk,
>> then it's good enough.   Safety is supremely important, then
>> availability, then speed.
>>
>>
>>
>>>> Now, I think HBase is the most beautiful thing that has happened to
>>>> the distributed DB world :).   The idea is to store image files (about
>>>> 400KB on average) in HBase.
>>>
>>>
>>> I'd guess some images are much bigger than this.  Do you ever limit
>>> the size of images folks can upload to your service?
>>>
>>>
>>>> The setup will include the following
>>>> configuration:
>>>>
>>>> 50 servers total (2 datacenters), with 8 GB RAM, dual-core CPU, 6 x
>>>> 2TB disks each.
>>>> 3 to 5 ZooKeepers
>>>> 2 Masters (one per datacenter)
>>>> 10 to 20 Stargate REST instances (one per server, hash load-balanced)
>>>
>>> What's your frontend?  Why REST?  It might be more efficient if you
>>> could run with Thrift, given that REST base64-encodes its payload IIRC
>>> (check the src yourself).
>>
>> For insertion we use HAProxy, and balance curl PUTs across multiple REST APIs.
>> For reading, it's an nginx proxy that does Content-Type modification
>> from image/jpeg to octet-stream, and vice versa;
>> it then hits HAProxy again, which hits the balanced REST instances.
>> Why REST?  It was the simplest thing to run, given that it supports
>> HTTP.  Potentially we could rewrite something for Thrift, as long as we
>> can still use HTTP to send and receive data (has anyone written anything
>> like that, say in Python, C, or Java?)
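
For reference, a minimal sketch of that curl-style insertion path in Java,
assuming the REST gateway accepts a raw application/octet-stream body on a
single-cell PUT at /<table>/<row>/<family>:<qualifier> (as later HBase REST
gateways do, which sidesteps the base64 overhead of the XML/JSON forms);
the host, port, table, row, and column names are hypothetical:

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class StargateSketch {
        // Assumed cell address: http://<host>:8080/images/<row>/image:data
        static final String BASE = "http://stargate.example.com:8080/images/";

        // Write one image as the value of a single cell.
        static void putImage(String row, byte[] jpegBytes) throws Exception {
            HttpURLConnection conn =
                (HttpURLConnection) new URL(BASE + row + "/image:data").openConnection();
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            // Raw bytes, so nothing gets base64-inflated.
            conn.setRequestProperty("Content-Type", "application/octet-stream");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(jpegBytes);
            }
            if (conn.getResponseCode() / 100 != 2) {
                throw new RuntimeException("PUT failed: " + conn.getResponseCode());
            }
        }

        // Read the cell back; the nginx layer in front would rewrite the
        // Content-Type from octet-stream to image/jpeg for browsers.
        static byte[] getImage(String row) throws Exception {
            HttpURLConnection conn =
                (HttpURLConnection) new URL(BASE + row + "/image:data").openConnection();
            conn.setRequestProperty("Accept", "application/octet-stream");
            try (InputStream in = conn.getInputStream()) {
                return in.readAllBytes();
            }
        }

        public static void main(String[] args) throws Exception {
            putImage("img55-example.jpg", Files.readAllBytes(Paths.get(args[0])));
        }
    }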
>>
>>>
>>>> 40 to 50 RegionServers (will probably keep masters separate on dedicated
>>>> boxes).
>>>> 2 NameNode servers (one backup, highly available, will do fsimage and
>>>> edits snapshots also)
>>>>
>>>> So far I have about 13 servers running, doing about 20 insertions per
>>>> second (file sizes ranging from a few KB to 2-3MB, avg. 400KB) via the
>>>> Stargate API.  Our frontend servers receive files, and I just
>>>> fork-insert them into Stargate via HTTP (curl).
>>>> The inserts are humming along nicely, without any noticeable load on
>>>> the regionservers; so far I have inserted about 2 TB worth of images.
>>>> I have adjusted the region file size to 512MB, and the table block size
>>>> to about 400KB, trying to match the average access block size to limit
>>>> HDFS trips.
>>>
>>> As Todd suggests, I'd go up from 512MB... 1G at least.  You'll
>>> probably want to up your flush size from 64MB to 128MB or maybe 192MB.
>>
>> Yep, I will adjust to 1G.  I thought flushing was controlled as a
>> function of memstore heap, something like 40%?  Or are you talking
>> about the HDFS block size?
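
For what it's worth, those are two different knobs:
hbase.hregion.memstore.flush.size is the per-region memstore threshold (64MB
default in that era) that triggers a flush, while
hbase.regionserver.global.memstore.upperLimit is the roughly-40%-of-heap
safety valve across all regions on a regionserver; region splitting is
governed by hbase.hregion.max.filesize, and the HDFS block size is a separate
setting again. A minimal sketch of the per-table overrides, assuming the
0.90-style HTableDescriptor / HColumnDescriptor setters (names may differ
slightly in 0.20/0.89); the table and family names are hypothetical:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ImageTableSketch {
        public static void main(String[] args) throws Exception {
            HTableDescriptor desc = new HTableDescriptor("images");
            // Split regions at ~1GB rather than the era's 256MB default.
            desc.setMaxFileSize(1024L * 1024 * 1024);
            // Flush each region's memstore at 128MB instead of 64MB.
            desc.setMemStoreFlushSize(128L * 1024 * 1024);

            HColumnDescriptor family = new HColumnDescriptor(Bytes.toBytes("image"));
            // HFile block size close to the ~400KB average cell, aiming for
            // one block read per image rather than several.
            family.setBlocksize(400 * 1024);
            desc.addFamily(family);

            HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
            admin.createTable(desc);
            // Note: hbase.regionserver.global.memstore.upperLimit (the
            // ~40%-of-heap figure) is a server-side hbase-site.xml setting,
            // not a table property.
        }
    }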
>>
>>>> So far the read performance was more than adequate, and of
>>>> course write performance is nowhere near capacity.
>>>> So right now, all newly uploaded images go to HBase.  But we do plan
>>>> to insert about 170 million images (about 100 days' worth), which is
>>>> only about 64 TB, or 10% of the planned cluster size of 600TB.
>>>> The end goal is to have a storage system that provides data safety,
>>>> e.g. the system may go down but data cannot be lost.   Our front-end
>>>> servers will continue to serve images from their own file systems (we
>>>> are serving about 16 Gbit/s at peak); however, should we need to bring
>>>> any of those down for maintenance, we will redirect all traffic to
>>>> HBase (should be no more than a few hundred Mbps) while the front-end
>>>> server is repaired (for example, having its disk replaced).  After the
>>>> repairs, we quickly repopulate it with the missing files, while serving
>>>> the remaining missing ones off HBase.
>>>> All in all it should be a very interesting project, and I am hoping not
>>>> to run into any snags; however, should that happen, I am pleased to know
>>>> that such a great and vibrant tech group exists that supports and uses
>>>> HBase :).
>>>>
>>>
>>> We're definitely interested in how your project progresses.  If you are
>>> ever up in the city, you should drop by for a chat.
>>
>> Cool.  I'd like that.
>>
>>> St.Ack
>>>
>>> P.S. I'm also w/ Todd that you should move to 0.89 and blooms.
>>> P.P.S. I updated the wiki on Stargate REST:
>>> http://wiki.apache.org/hadoop/Hbase/Stargate
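
On the blooms suggestion: a row-level bloom filter lets a get skip store
files that cannot contain the requested row, which matters once each region
carries several HFiles full of image cells. A minimal sketch, assuming the
newer client API where HColumnDescriptor.setBloomFilterType exists (the
0.89-era setter may be named differently); the family name is hypothetical:

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.regionserver.BloomType;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BloomSketch {
        // Build a column-family descriptor with a row-level bloom filter so
        // point gets can skip HFiles that cannot contain the row key.
        static HColumnDescriptor imageFamilyWithBlooms() {
            HColumnDescriptor family = new HColumnDescriptor(Bytes.toBytes("image"));
            // ROWCOL blooms also exist, but ROW fits a one-cell-per-image layout.
            family.setBloomFilterType(BloomType.ROW);
            return family;
        }
    }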
>>
>> Cool, I assume if we move to that, it won't kill existing meta tables
>> and data?  E.g. is it cross-compatible?
>> Is 0.89 ready for a production environment?
>>
>> -Jack
>>
>
