hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Storing images in Hbase
Date Mon, 28 Jan 2013 18:58:14 GMT
If I were to design a large object store on HBase, I would do the following: under a threshold, store the object data itself in HBase; over the threshold, store only the object's metadata in HBase and the object data in a file in HDFS. The threshold could be a fixed byte size like 100 MB, or you could segment storage by MIME type, for example image/* into HBase and video/* into HDFS. Video objects might be as large as 5-10 GB for full-length features, depending on encoding bitrate.

HBase can pack millions or billions of small objects into much larger indexed files that can be quickly retrieved, which helps avoid namespace pressure on the HDFS NameNode. However, the HBase API cannot do positioned reads of partial byte ranges of stored objects, while the HDFS API can. So put smaller objects into HBase, and put larger objects into HDFS, where you can stream them at approximately the same rate the end user reads them and minimize server-side buffering overhead.
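
To make the split concrete, here is a minimal sketch of the write path, assuming the 0.94-era Java client API; the "objects" table, column names, HDFS path layout, and the exact 100 MB cutoff are illustrative assumptions rather than anything prescribed above:

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class HybridObjectStore {
    private static final long THRESHOLD = 100L * 1024 * 1024;  // 100 MB cutoff (illustrative)
    private static final byte[] META = Bytes.toBytes("meta");
    private static final byte[] DATA = Bytes.toBytes("data");

    private final HTable table;   // hypothetical "objects" table
    private final FileSystem fs;

    public HybridObjectStore(Configuration conf) throws IOException {
      this.table = new HTable(HBaseConfiguration.create(conf), "objects");
      this.fs = FileSystem.get(conf);
    }

    public void store(String key, String mimeType, byte[] bytes) throws IOException {
      Put put = new Put(Bytes.toBytes(key));
      put.add(META, Bytes.toBytes("mime"), Bytes.toBytes(mimeType));
      put.add(META, Bytes.toBytes("size"), Bytes.toBytes((long) bytes.length));
      if (bytes.length <= THRESHOLD && !mimeType.startsWith("video/")) {
        // Small object: the bytes live directly in the HBase row.
        put.add(DATA, Bytes.toBytes("blob"), bytes);
      } else {
        // Large object: bytes go to an HDFS file, HBase keeps only the pointer.
        Path path = new Path("/objects/" + key);  // path layout is an assumption
        FSDataOutputStream out = fs.create(path);
        try {
          out.write(bytes);
        } finally {
          out.close();
        }
        put.add(META, Bytes.toBytes("hdfs.path"), Bytes.toBytes(path.toString()));
      }
      table.put(put);
    }
  }

In practice the large objects would be streamed into the HDFS file rather than buffered in a byte[], but the routing decision stays the same.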
As Jack mentions, there is Hoop (https://github.com/cloudera/hoop) or WebHDFS (http://hadoop.apache.org/docs/stable/webhdfs.html) for accessing HDFS via a RESTful API; both will let you do positioned reads of partial byte ranges out of HDFS. On the HBase side, there is HBase's REST interface, Stargate (http://wiki.apache.org/hadoop/Hbase/Stargate). Put a cache between the HDFS and HBase services and the front end, because even with the capabilities of HBase and HDFS you should always have a caching tier between the datastore and the front end.
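
For the large-object path, the positioned read that HBase cannot do looks roughly like this through the HDFS client API, given a Hadoop Configuration conf (the path is hypothetical and would come out of the metadata row); over WebHDFS the equivalent is the offset and length query parameters on op=OPEN:

  // Read 1 MB starting at byte offset 5 MB without pulling the whole file.
  FileSystem fs = FileSystem.get(conf);
  FSDataInputStream in = fs.open(new Path("/objects/some-video"));  // hypothetical path
  byte[] chunk = new byte[1024 * 1024];
  int n = in.read(5L * 1024 * 1024, chunk, 0, chunk.length);
  in.close();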


On Sun, Jan 27, 2013 at 8:56 AM, Jack Levin <magnito@gmail.com> wrote:

> We did some experiments, open source project HOOP works well with
> interfacing to HDFS to expose REST Api interface to your file system.
>
> -Jack
>
> On Sun, Jan 27, 2013 at 7:37 AM, yiyu jia <jia.yiyu@gmail.com> wrote:
> > Hi Jack,
> >
> > Thanks so much for sharing! Do you have comments on storing video in HDFS?
> >
> > thanks and regards,
> >
> > Yiyu
> >
> > On Sat, Jan 26, 2013 at 9:56 PM, Jack Levin <magnito@gmail.com> wrote:
> >
> >> AFAIK, namenode would not like tracking 20 billion small files :)
> >>
> >> -jack
> >>
> >> On Sat, Jan 26, 2013 at 6:00 PM, S Ahmed <sahmed1020@gmail.com> wrote:
> >> > That's pretty amazing.
> >> >
> >> > What I am confused is, why did you go with hbase and not just straight into hdfs?
> >> >
> >> >
> >> >
> >> >
> >> > On Fri, Jan 25, 2013 at 2:41 AM, Jack Levin <magnito@gmail.com> wrote:
> >> >
> >> >> Two people including myself, it's fairly hands off. Took about 3 months to
> >> >> tune it right, however we did have multiple years of experience with
> >> >> datanodes and hadoop in general, so that was a good boost.
> >> >>
> >> >> We have 4 hbase clusters today, image store being largest.
> >> >> On Jan 24, 2013 2:14 PM, "S Ahmed" <sahmed1020@gmail.com> wrote:
> >> >>
> >> >> > Jack, out of curiosity, how many people manage the hbase related servers?
> >> >> >
> >> >> > Does it require constant monitoring or is it fairly hands-off now? (or a bit
> >> >> > of both, early days was getting things right/learning and now it's purring
> >> >> > along).
> >> >> >
> >> >> >
> >> >> > > On Wed, Jan 23, 2013 at 11:53 PM, Jack Levin <magnito@gmail.com> wrote:
> >> >> >
> >> >> > > Its best to keep some RAM for caching of the filesystem, besides we
> >> >> > > also run datanode which takes heap as well.
> >> >> > > Now, please keep in mind that even if you specify heap of say 5GB, if
> >> >> > > your server opens threads to communicate with other systems via RPC
> >> >> > > (which hbase does a lot), you will indeed use HEAP +
> >> >> > > Nthreads*thread*kb_size.  There is a good Sun Microsystems document
> >> >> > > about it. (I don't have the link handy).
> >> >> > >
> >> >> > > -Jack
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > On Mon, Jan 21, 2013 at 5:10 PM, Varun Sharma <varun@pinterest.com> wrote:
> >> >> > > > Thanks for the useful information. I wonder why you use only 5G heap when
> >> >> > > > you have an 8G machine? Is there a reason to not use all of it (the
> >> >> > > > DataNode typically takes a 1G of RAM)?
> >> >> > > >
> >> >> > > > On Sun, Jan 20, 2013 at 11:49 AM, Jack Levin <magnito@gmail.com> wrote:
> >> >> > > >
> >> >> > > >> I forgot to mention that I also have this setup:
> >> >> > > >>
> >> >> > > >> <property>
> >> >> > > >>   <name>hbase.hregion.memstore.flush.size</name>
> >> >> > > >>   <value>33554432</value>
> >> >> > > >>   <description>Flush more often. Default: 67108864</description>
> >> >> > > >> </property>
> >> >> > > >>
> >> >> > > >> This parameter works on per region amount, so this means if any of my
> >> >> > > >> 400 (currently) regions on a regionserver has 30MB+ in memstore, the
> >> >> > > >> hbase will flush it to disk.
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> Here are some metrics from a regionserver:
> >> >> > > >>
> >> >> > > >> requests=2, regions=370, stores=370, storefiles=1390,
> >> >> > > >> storefileIndexSize=304, memstoreSize=2233, compactionQueueSize=0,
> >> >> > > >> flushQueueSize=0, usedHeap=3516, maxHeap=4987,
> >> >> > > >> blockCacheSize=790656256, blockCacheFree=255245888,
> >> >> > > >> blockCacheCount=2436, blockCacheHitCount=218015828,
> >> >> > > >> blockCacheMissCount=13514652, blockCacheEvictedCount=2561516,
> >> >> > > >> blockCacheHitRatio=94, blockCacheHitCachingRatio=98
> >> >> > > >>
> >> >> > > >> Note, that memstore is only 2G, this particular regionserver HEAP is set
> >> >> > > >> to 5G.
> >> >> > > >>
> >> >> > > >> And last but not least, its very important to have good GC setup:
> >> >> > > >>
> >> >> > > >> export HBASE_OPTS="$HBASE_OPTS -verbose:gc -Xms5000m
> >> >> > > >> -XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails
> >> >> > > >> -XX:+PrintGCDateStamps
> >> >> > > >> -XX:+HeapDumpOnOutOfMemoryError -Xloggc:$HBASE_HOME/logs/gc-hbase.log \
> >> >> > > >> -XX:MaxTenuringThreshold=15 -XX:SurvivorRatio=8 \
> >> >> > > >> -XX:+UseParNewGC \
> >> >> > > >> -XX:NewSize=128m -XX:MaxNewSize=128m \
> >> >> > > >> -XX:-UseAdaptiveSizePolicy \
> >> >> > > >> -XX:+CMSParallelRemarkEnabled \
> >> >> > > >> -XX:-TraceClassUnloading
> >> >> > > >> "
> >> >> > > >>
> >> >> > > >> -Jack
> >> >> > > >>
> >> >> > > >> On Thu, Jan 17, 2013 at 3:29 PM, Varun Sharma <varun@pinterest.com> wrote:
> >> >> > > >> > Hey Jack,
> >> >> > > >> >
> >> >> > > >> > Thanks for the useful information. By flush size being 15%, do you mean
> >> >> > > >> > the memstore flush size? 15% would mean close to 1G, have you seen any
> >> >> > > >> > issues with flushes taking too long?
> >> >> > > >> >
> >> >> > > >> > Thanks
> >> >> > > >> > Varun
> >> >> > > >> >
> >> >> > > >> > On Sun, Jan 13, 2013 at 8:17 AM, Jack Levin <magnito@gmail.com> wrote:
> >> >> > > >> >
> >> >> > > >> >> That's right, Memstore size, not flush size is increased.  Filesize is
> >> >> > > >> >> 10G. Overall write cache is 60% of heap and read cache is 20%.  Flush size
> >> >> > > >> >> is 15%.  64 maxlogs at 128MB. One namenode server, one secondary that can
> >> >> > > >> >> be promoted.  On the way to hbase images are written to a queue, so that we
> >> >> > > >> >> can take Hbase down for maintenance and still do inserts later.  ImageShack
> >> >> > > >> >> has ‘perma cache’ servers that allows writes and serving of data even when
> >> >> > > >> >> hbase is down for hours, consider it 4th replica 😉 outside of hadoop
> >> >> > > >> >>
> >> >> > > >> >> Jack
> >> >> > > >> >>
> >> >> > > >> >> *From:* Mohit Anchlia <mohitanchlia@gmail.com>
> >> >> > > >> >> *Sent:* January 13, 2013 7:48 AM
> >> >> > > >> >> *To:* user@hbase.apache.org
> >> >> > > >> >> *Subject:* Re: Storing images in Hbase
> >> >> > > >> >>
> >> >> > > >> >> Thanks Jack for sharing this information. This definitely makes sense when
> >> >> > > >> >> using this type of caching layer. You mentioned about increasing the write
> >> >> > > >> >> cache; I am assuming you had to increase the following parameters in
> >> >> > > >> >> addition to increasing the memstore size:
> >> >> > > >> >>
> >> >> > > >> >> hbase.hregion.max.filesize
> >> >> > > >> >> hbase.hregion.memstore.flush.size
> >> >> > > >> >>
> >> >> > > >> >> On Fri, Jan 11, 2013 at 9:47 AM, Jack Levin <magnito@gmail.com> wrote:
> >> >> > > >> >>
> >> >> > > >> >> > We buffer all accesses to HBASE with Varnish SSD based caching layer.
> >> >> > > >> >> > So the impact for reads is negligible.  We have 70 node cluster, 8 GB
> >> >> > > >> >> > of RAM per node, relatively weak nodes (intel core 2 duo), with
> >> >> > > >> >> > 10-12TB per server of disks.  Inserting 600,000 images per day.  We
> >> >> > > >> >> > have relatively little of compaction activity as we made our write
> >> >> > > >> >> > cache much larger than read cache - so we don't experience region file
> >> >> > > >> >> > fragmentation as much.
> >> >> > > >> >> >
> >> >> > > >> >> > -Jack
> >> >> > > >> >> >
> >> >> > > >> >> > On Fri, Jan 11, 2013 at 9:40 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
> >> >> > > >> >> > > I think it really depends on volume of the traffic, data distribution per
> >> >> > > >> >> > > region, how and when files compaction occurs, number of nodes in the
> >> >> > > >> >> > > cluster. In my experience when it comes to blob data where you are serving
> >> >> > > >> >> > > 10s of thousand+ requests/sec writes and reads then it's very difficult to
> >> >> > > >> >> > > manage HBase without very hard operations and maintenance in play. Jack
> >> >> > > >> >> > > earlier mentioned they have 1 billion images, It would be interesting to
> >> >> > > >> >> > > know what they see in terms of compaction, no of requests per sec. I'd be
> >> >> > > >> >> > > surprised that in high volume site it can be done without any Caching layer
> >> >> > > >> >> > > on the top to alleviate IO spikes that occurs because of GC and compactions.
> >> >> > > >> >> > >
> >> >> > > >> >> > > On Fri, Jan 11, 2013 at 7:27 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
> >> >> > > >> >> > >
> >> >> > > >> >> > >> IMHO, if the image files are not too huge, Hbase can efficiently serve the
> >> >> > > >> >> > >> purpose. You can store some additional info along with the file depending
> >> >> > > >> >> > >> upon your search criteria to make the search faster. Say if you want to
> >> >> > > >> >> > >> fetch images by the type, you can store images in one column and its
> >> >> > > >> >> > >> extension in another column (jpg, tiff etc).
> >> >> > > >> >> > >>
> >> >> > > >> >> > >> BTW, what exactly is the problem which you are facing? You have written
> >> >> > > >> >> > >> "But I still cant do it"?
> >> >> > > >> >> > >>
> >> >> > > >> >> > >> Warm Regards,
> >> >> > > >> >> > >> Tariq
> >> >> > > >> >> > >> https://mtariq.jux.com/
> >> >> > > >> >> > >>
> >> >> > > >> >> > >>
> >> >> > > >> >> > >> On Fri, Jan 11, 2013 at 8:30 PM, Michael Segel <michael_segel@hotmail.com> wrote:
> >> >> > > >> >> > >>
> >> >> > > >> >> > >> > That's a viable option.
> >> >> > > >> >> > >> > HDFS reads are faster than HBase, but it would require first hitting the
> >> >> > > >> >> > >> > index in HBase which points to the file and then fetching the file.
> >> >> > > >> >> > >> > It could be faster... we found storing binary data in a sequence file and
> >> >> > > >> >> > >> > indexed on HBase to be faster than HBase, however, YMMV and HBase has been
> >> >> > > >> >> > >> > improved since we did that project....
> >> >> > > >> >> > >> >
> >> >> > > >> >> > >> >
> >> >> > > >> >> > >> > On Jan 10, 2013, at 10:56 PM, shashwat shriparv <dwivedishashwat@gmail.com> wrote:
> >> >> > > >> >> > >> >
> >> >> > > >> >> > >> > > Hi Kavish,
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > > i have a better idea for you: copy your image files to a single file on
> >> >> > > >> >> > >> > > hdfs, and if a new image comes append it to the existing image, and keep and
> >> >> > > >> >> > >> > > update the metadata and the offset to the HBase. Because if you put a bigger
> >> >> > > >> >> > >> > > image in hbase it will lead to some issue.
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > > ∞
> >> >> > > >> >> > >> > > Shashwat Shriparv
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > > On Fri, Jan 11, 2013 at 9:21 AM, lars hofhansl <larsh@apache.org> wrote:
> >> >> > > >> >> > >> > >
> >> >> > > >> >> > >> > >> Interesting. That's close to a PB if my math is correct.
> >> >> > > >> >> > >> > >> Is there a write up about this somewhere? Something that we could link
> >> >> > > >> >> > >> > >> from the HBase homepage?
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> > >> -- Lars
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> > >> ----- Original Message -----
> >> >> > > >> >> > >> > >> From: Jack Levin <magnito@gmail.com>
> >> >> > > >> >> > >> > >> To: user@hbase.apache.org
> >> >> > > >> >> > >> > >> Cc: Andrew Purtell <apurtell@apache.org>
> >> >> > > >> >> > >> > >> Sent: Thursday, January 10, 2013 9:24 AM
> >> >> > > >> >> > >> > >> Subject: Re: Storing images in Hbase
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> > >> We stored about 1 billion images into hbase with file size up to 10MB.
> >> >> > > >> >> > >> > >> Its been running for close to 2 years without issues and serves
> >> >> > > >> >> > >> > >> delivery of images for Yfrog and ImageShack.  If you have any
> >> >> > > >> >> > >> > >> questions about the setup, I would be glad to answer them.
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> > >> -Jack
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> > >> On Sun, Jan 6, 2013 at 1:09 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
> >> >> > > >> >> > >> > >>> I have done extensive testing and have found that blobs don't belong in the
> >> >> > > >> >> > >> > >>> databases but are rather best left out on the file system. Andrew outlined
> >> >> > > >> >> > >> > >>> issues that you'll face and not to mention IO issues when compaction occurs
> >> >> > > >> >> > >> > >>> over large files.
> >> >> > > >> >> > >> > >>>
> >> >> > > >> >> > >> > >>> On Sun, Jan 6, 2013 at 12:52 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >> >> > > >> >> > >> > >>>
> >> >> > > >> >> > >> > >>>> I meant this to say "a few really large values"
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>> On Sun, Jan 6, 2013 at 12:49 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>>> Consider if the split threshold is 2 GB but your one row contains 10 GB as
> >> >> > > >> >> > >> > >>>>> really large value.
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>> --
> >> >> > > >> >> > >> > >>>> Best regards,
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>>   - Andy
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >> >> > > >> >> > >> > >>>> (via Tom White)
> >> >> > > >> >> > >> > >>>>
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> > >>
> >> >> > > >> >> > >> >
> >> >> > > >> >> > >> >
> >> >> > > >> >> > >>
> >> >> > > >> >> >
> >> >> > > >> >>
> >> >> > > >>
> >> >> > >
> >> >> >
> >> >>
> >>
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
