hbase-user mailing list archives

From Amandeep Khurana <ama...@gmail.com>
Subject Re: Store Large files/images HBase
Date Mon, 19 Oct 2009 17:04:22 GMT
Comments inline.

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Mon, Oct 19, 2009 at 6:58 AM, Fred Zappert <fzappert@gmail.com> wrote:

> Does anyone want to pick up on this?
> ---------- Forwarded message ----------
> From: Luis Carlos Junges <luis.junges@gmail.com>
> Date: Mon, Oct 19, 2009 at 4:14 AM
> Subject: Store Large files/images HBase
> To: general@hadoop.apache.org
> Hi,
> I am currently doing some research on distributed databases that can be
> scaled easily in terms of storage capacity.
> The reason is to use one in the Brazilian federal project called
> "portal do aluno", which will have around 10 million kids accessing it
> monthly. The idea is to build a portal similar to Facebook/Orkut with
> the main objective of spreading knowledge among kids (6-13 years old).
> Well, now the problem:
> Those kids will generate a lot of data, including photos, videos,
> presentations, and school tasks, among others. To have a 100%
> available system and also to scale to this amount of data (the initial
> estimate is 10 TB at full use of the portal), a distributed
> storage engine seems to be the solution.
> Of the available solutions, I liked Voldemort because it seems not to
> have a SPOF (single point of failure), unlike HBase. However, HBase
> seems to integrate with more tools and sub-projects.

The HBase 0.20 release doesn't have a SPOF. Multiple master daemons can run
on different nodes, and the active master is elected from among them through
ZooKeeper.

> My question concerns storing such big items (a 2 MB photo, for example)
> in HBase. I read on blogs that HBase has high latency, which makes it
> unsuitable for serving dynamic pages. Will the performance of HBase
> decrease even more if large binary objects are stored in it?

Again, the 0.20 release has solved the high-latency problem to a great
degree. Read speeds are comparable to a MySQL database. Of course, larger
objects take more time to read.

> Another question I have is about modelling the data using the key/value
> pattern. With a relational database you just follow the recipe and it's
> done. Is there such a recipe for key/value? Currently a lot of code has
> been written against the relational database PostgreSQL, using
> Hibernate to map the objects.

The modelling will depend on the kind of queries you want to run. Post a
little more about the data you have and the queries you want to run on it,
and you can get specific tips accordingly.
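
To illustrate what "depends on the queries" can mean, here is a rough sketch, not a recommendation for your schema. It assumes the query is "latest photos for one user" and relies only on the fact that HBase keeps rows sorted by row key. The key layout, `MAX_TS`, `row_key`, and the `SortedStore` class are all hypothetical; the class is a toy in-memory stand-in for a sorted table, not the HBase client API.

```python
import bisect

# Hypothetical upper bound on millisecond timestamps, used to reverse them.
MAX_TS = 10**13


def row_key(user_id: str, ts_millis: int) -> str:
    """Composite key: user id, then reversed timestamp so newer rows sort first."""
    return f"{user_id}|{MAX_TS - ts_millis:013d}"


class SortedStore:
    """Toy stand-in for a table whose rows are kept sorted by key."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> value

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan_prefix(self, prefix, limit):
        """Return up to `limit` values whose keys start with `prefix`, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        out = []
        for key in self._keys[start:]:
            if not key.startswith(prefix) or len(out) >= limit:
                break
            out.append(self._rows[key])
        return out


store = SortedStore()
store.put(row_key("kid42", 1_000), "photo-a")
store.put(row_key("kid42", 3_000), "photo-c")
store.put(row_key("kid42", 2_000), "photo-b")
store.put(row_key("kid7", 5_000), "photo-x")

# "Latest two photos for kid42" becomes a short scan over adjacent keys.
latest_two = store.scan_prefix("kid42|", limit=2)
```

If the main query were instead "everything uploaded on a given day across all users", a key that starts with the date would probably fit better, which is why the queries need to be known first.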

> I will appreciate any comments.
> --
> "The reality of each place and each era is a collective hallucination."
> Bloom, Howard
