hadoop-common-user mailing list archives

From "Andy Li" <annndy....@gmail.com>
Subject Re: long write operations and data recovery
Date Fri, 29 Feb 2008 22:37:49 GMT
What about a hot standby namenode?

A write-ahead log for crash recovery is fine, I think, for small I/O.
For large volumes, though, the write-ahead log will take up much of the
system's I/O, because it costs 2 I/Os per block (the log and the actual
data).  This falls back on how current database designs implement crash
recovery.
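
To make the "2 IO per block" point concrete, here is a minimal sketch of a
naive write-ahead log that records the full block before applying it.  It
is only an illustration under my own assumptions: the names wal_write and
demo, the file names, and the 4-byte length prefix are hypothetical, and
this is not how Hadoop writes data.

```python
import os
import tempfile


def wal_write(log_f, data_f, block: bytes) -> int:
    """Log the block durably, then apply it; return total bytes written."""
    # 1st I/O: append a length-prefixed copy of the block to the log.
    log_f.write(len(block).to_bytes(4, "big") + block)
    log_f.flush()
    os.fsync(log_f.fileno())  # the log must reach storage before the data
    # 2nd I/O: write the same bytes again, this time to the data file.
    data_f.write(block)
    return 4 + 2 * len(block)


def demo(block_size: int = 64 * 1024) -> float:
    """Return bytes written per byte of payload for one block."""
    block = b"x" * block_size
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "wal.log"), "ab") as log_f, \
             open(os.path.join(d, "data.bin"), "ab") as data_f:
            written = wal_write(log_f, data_f, block)
    return written / block_size
```

For any large block the ratio returned by demo() sits just above 2, which
is the write amplification I am worried about for large volumes.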

Another thing I don't see in the picture is how Hadoop manages file system
instructions.  Each system implements its file system differently, and I
believe that calling 'write' or 'flush' does not really flush the data to
the disk.  I am not sure whether this is inevitable and OS dependent, but I
cannot find any documentation that describes how Hadoop handles this.
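
As an illustration of the gap between 'flush' and the disk, here is a small
Python sketch (not Hadoop code; durable_write and demo are names I made up).
flush() only hands the bytes to the operating system, and an explicit
fsync is what asks the OS to push them to the storage device.  Even then,
how far the data really goes depends on the platform and on any write cache
in the drive.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> None:
    """Write payload and ask the OS to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(payload)      # bytes sit in Python's user-space buffer
        f.flush()             # bytes move to the OS page cache only
        os.fsync(f.fileno())  # request that the OS write them to the device


def demo() -> bytes:
    """Write a small file durably and read it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "block.bin")
        durable_write(path, b"hello")
        with open(path, "rb") as f:
            return f.read()
```

Without the os.fsync call the read-back would still succeed, since reads
are served from the page cache.  The difference only shows up after a
crash or power loss, which is why it is easy to miss.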

P.S. I handle HA and fail-over in my own application, but I think that
for a framework it should be transparent (or semi-transparent) to the user.


On Fri, Feb 29, 2008 at 1:54 PM, Joydeep Sen Sarma <jssarma@facebook.com> wrote:

> I would agree with Ted. You should easily be able to get 100MBps write
> throughput on a standard Netapp box (with read bandwidth left over -
> since the peak write throughput rating is more than twice that). Even
> at an average write throughput rate of 50MBps - the daily data volume
> would be (drumroll ..) 4+TB!
> So buffer to a decent box and copy stuff over ..
> -----Original Message-----
> From: Ted Dunning [mailto:tdunning@veoh.com]
> Sent: Friday, February 29, 2008 11:33 AM
> To: core-user@hadoop.apache.org
> Subject: Re: long write operations and data recovery
> Unless your volume is MUCH higher than ours, I think you can get by with a
> relatively small farm of log consolidators that collect and concatenate
> files.
> If each log line is 100 bytes after compression (that is huge really) and
> you have 10,000 events per second (also pretty danged high) then you are
> only writing 1MB/s.  If you need a day of buffering (=100,000 seconds), then
> you need 100GB of buffer storage.  These are very, very moderate
> requirements for your ingestion point.
> On 2/29/08 11:18 AM, "Steve Sapovits" <ssapovits@invitemedia.com> wrote:
> > Ted Dunning wrote:
> >
> >> In our case, we looked at the problem and decided that Hadoop wasn't
> >> feasible for our real-time needs in any case.  There were several
> >> issues,
> >>
> >> - first, of all, map-reduce itself didn't seem very plausible for
> >> real-time applications.  That left hbase and hdfs as the capabilities
> >> offered by hadoop (for real-time stuff)
> >
> > We'll be using map-reduce batch mode, so we're okay there.
> >
> >> The upshot is that we use hadoop extensively for batch operations
> >> where it really shines.  The other nice effect is that we don't have
> >> to worry all that much about HA (at least not real-time HA) since we
> >> don't do real-time with hadoop.
> >
> > What I'm struggling with is the write side of things.  We'll have a huge
> > amount of data to write that's essentially a log format.  It would seem
> > that writing that outside of HDFS then trying to batch import it would
> > be a losing battle -- that you would need the distributed nature of
> > to do very large volume writes directly and wouldn't easily be able to take
> > some other flat storage model and feed it in as a secondary step without
> > having the HDFS side start to lag behind.
> >
> > The realization is that Name Node could go down so we'll have to have a
> > backup store that might be used during temporary outages, but that
> > most of the writes would be direct HDFS updates.
> >
> > The alternative would seem to be to end up with a set of distributed files
> > without some unifying distributed file system (e.g., like lots of Apache
> > web logs on many many individual boxes) and then have to come up with
> > some way to funnel those back into HDFS.
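
The back-of-envelope figures quoted above (100 bytes per event at 10,000
events per second, a day of buffering, and 50MBps sustained for a day) can
be checked with a few lines of Python.  This is only a sketch: the function
and variable names are hypothetical, and it assumes decimal units
(1 MB = 10^6 bytes).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400; the quoted text rounds this to 100,000


def bytes_per_second(bytes_per_event: int, events_per_second: int) -> int:
    """Sustained ingest rate in bytes per second."""
    return bytes_per_event * events_per_second


def daily_volume_bytes(rate_bytes_per_second: int) -> int:
    """Bytes accumulated over one full day at a constant rate."""
    return rate_bytes_per_second * SECONDS_PER_DAY


# Log-consolidator estimate: 100 bytes/event at 10,000 events/s.
log_rate = bytes_per_second(100, 10_000)          # 1,000,000 B/s = 1 MB/s
buffer_rounded = log_rate * 100_000               # 100 GB, using the rounded day
buffer_exact = daily_volume_bytes(log_rate)       # 86.4 GB, using 86,400 s

# Filer estimate: 50 MB/s average write throughput for a whole day.
filer_daily = daily_volume_bytes(50 * 10**6)      # 4.32 TB

print(log_rate, buffer_rounded, buffer_exact, filer_daily)
```

With an exact day of 86,400 seconds the buffer comes to 86.4GB rather than
100GB, so the rounded figure errs on the safe side, and the 50MBps case
works out to 4.32TB per day, consistent with the "4+TB" quoted.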
