hbase-user mailing list archives

From Sean Bigdatafun <sean.bigdata...@gmail.com>
Subject Re: HDFS without Hadoop: Why?
Date Tue, 01 Feb 2011 02:34:17 GMT
I feel this is a great discussion, so let's think of HDFS' customers.

(1) MapReduce --- definitely a perfect fit, as Nathan has pointed out.
(2) HBase --- it seems HBase (built on Bigtable's log-structured file design)
has done a great job on this. The solution comes out of Google, so it must be
right. But would Google necessarily have chosen this approach for Bigtable had
GFS not existed in the first place? That is, could there be an alternative
'best' approach?


Anything else? I do not think HDFS is a good file system choice for
enterprise applications.

On Tue, Jan 25, 2011 at 12:37 PM, Nathan Rutman <nrutman@gmail.com> wrote:

> I have a very general question on the usefulness of HDFS for purposes other
> than running distributed compute jobs for Hadoop.  Hadoop and HDFS seem very
> popular these days, but the use of HDFS for other purposes (database
> backend, records archiving, etc) confuses me, since there are other free
> distributed filesystems out there (I personally work on Lustre), with
> significantly better general-purpose performance.
>
> So please tell me if I'm wrong about any of this.  Note I've gathered most
> of my info from documentation rather than reading the source code.
>
> As I understand it, HDFS was written specifically for Hadoop compute jobs,
> with the following design factors in mind:
>
>    - write-once-read-many (worm) access model
>    - use commodity hardware with relatively high failure rates (i.e.
>    failure is assumed)
>    - long, sequential streaming data access
>    - large files
>    - hardware/OS agnostic
>    - moving computation is cheaper than moving data
>
>
> While appropriate for processing many large-input Hadoop data-processing
> jobs, there are significant penalties to be paid when trying to use these
> design factors for more general-purpose storage:
>
>    - Commodity hardware requires data replication for safety.  The HDFS
>    implementation has three penalties: storage redundancy, network loading, and
>    blocking writes.  By default, HDFS blocks are replicated 3x: local,
>    "nearby", and "far away" to minimize the impact of data center catastrophe.
>     In addition to the obvious 3x cost for storage, the result is that every
>    data block must be written "far away" - exactly the opposite of the "Move
>    Computation to Data" mantra.  Furthermore, these over-network writes are
>    synchronous; the client write blocks until all copies are complete on disk,
>    with the longest latency path of 2 network hops plus a disk write gating the
>    overall write speed.   Note that while this would be disastrous for a
>    general-purpose filesystem, with true WORM usage it may be acceptable to
>    penalize writes this way.
>
Facebook seems to have a more cost-effective way to do replication, but I am
not sure about its MapReduce performance -- at the end of the day, there are
only two machines holding a 'proper' local copy whose map slots can host a
'cheap' (data-local) mapper.
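
For what it's worth, the 3x factor is a per-file setting rather than a hard
requirement, so the storage and network cost can be traded down where the data
is less precious. A minimal sketch using the standard client API (the NameNode
address and path are made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:8020"); // hypothetical NameNode
        FileSystem fs = FileSystem.get(conf);

        // Create a file with 2 replicas instead of the cluster default of 3.
        Path p = new Path("/data/example.log");
        FSDataOutputStream out = fs.create(p, (short) 2);
        out.writeBytes("some record\n");
        out.close();

        // Replication can also be raised or lowered after the fact.
        fs.setReplication(p, (short) 2);
        fs.close();
      }
    }

Of course that only trims the storage and network cost; the synchronous
pipeline write Nathan describes still gates the latency of every write.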


>
>    - Large block size implies fewer files.  HDFS reaches limits in the
>    tens of millions of files.
>    - Large block size wastes space for small files.  The minimum file size
>    is 1 block.
>    - There is no data caching.  When delivering large contiguous streaming
>    data, this doesn't matter.  But when the read load is random, seeky, or
>    partial, this is a missing high-impact performance feature.
>
Yes, can anyone answer this question? -- I want to ask the same question as
well.
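
As far as I can tell the client API does let you read randomly; what is
missing is any cache behind it, so every seeky read is another round trip to a
DataNode. A rough sketch of the positioned-read call (file name and offsets
are made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RandomReadSketch {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataInputStream in = fs.open(new Path("/data/example.dat"));

        // Positioned read: 4 KB starting at byte 1,000,000, without moving
        // the stream position. Nothing is cached on the client side, so a
        // random workload pays this cost on every call.
        byte[] buf = new byte[4096];
        int n = in.read(1000000L, buf, 0, buf.length);
        System.out.println("read " + n + " bytes");

        // seek() + read() works too, and pays the same uncached cost.
        in.seek(2000000L);
        in.read(buf);

        in.close();
        fs.close();
      }
    }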

>
>    - In a WORM model, changing a small part of a file requires all the
>    file data to be copied, so e.g. database record modifications would be very
>    expensive.
>
Yes, can anyone answer this question?
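
My understanding is the same: there is no in-place update, so 'modifying' one
record means streaming the whole file through a new copy and swapping it in.
A rough sketch of what that rewrite looks like (paths and record format are
made up):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RewriteSketch {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path oldFile = new Path("/db/records.txt");
        Path newFile = new Path("/db/records.txt.tmp");

        // Every byte of the file is read and rewritten just to change one line.
        BufferedReader r =
            new BufferedReader(new InputStreamReader(fs.open(oldFile)));
        FSDataOutputStream w = fs.create(newFile, true);
        String line;
        while ((line = r.readLine()) != null) {
          if (line.startsWith("key42,")) {   // the one record being changed
            line = "key42,new-value";
          }
          w.writeBytes(line + "\n");
        }
        r.close();
        w.close();

        // Swap the rewritten copy in place of the original.
        fs.delete(oldFile, false);
        fs.rename(newFile, oldFile);
        fs.close();
      }
    }

And all of that rewritten data is itself replicated over the network, so the
cost of a small update ends up being several times the size of the whole file.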

>
>    - There are no hardlinks, softlinks, or quotas.
>    - HDFS isn't directly mountable, and therefore requires a non-standard
>    API to use.  (FUSE workaround exists.)
>    - Java source code is very portable and easy to install, but not very
>    quick.
>    - Moving computation is cheaper than moving data.  But the data
>    nonetheless always has to be moved: either read off of a local hard drive or
>    read over the network into the compute node's memory.  It is not necessarily
>    the case that reading a local hard drive is faster than reading a
>    distributed (striped) file over a fast network.  Commodity network (e.g.
>    1GigE), probably yes.  But a fast (and expensive) network (e.g. 4xDDR
>    Infiniband) can deliver data significantly faster than a local commodity
>    hard drive.
>
I agree with this statement: "It is not necessarily the case that reading a
local hard drive is faster than reading a distributed (striped) file over a
fast network." -- probably over Infiniband as well as over a 10GigE network.
And this is why I feel it might not be a good strategy for HBase to tie its
design entirely to HDFS.
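
To put rough, era-appropriate numbers on it: a single commodity SATA drive
streams on the order of 100 MB/s, and 1GigE tops out around 125 MB/s, so the
two are comparable. But 10GigE is roughly 1.25 GB/s and 4x DDR Infiniband
about 2 GB/s (16 Gbit/s of data rate) -- ten to twenty local disks' worth of
sequential bandwidth -- so a file striped across several servers can easily
outrun the local drive.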



> If I'm missing other points, pro- or con-, I would appreciate hearing them.
>  Note again I'm not questioning the success of HDFS in achieving those
> stated design choices, but rather trying to understand HDFS's applicability
> to other storage domains beyond Hadoop.
>
> Thanks for your time.
>
>


-- 
--Sean
