hbase-user mailing list archives

From Gayatri Rao <rgayat...@gmail.com>
Subject Re: Newbie question
Date Thu, 18 Nov 2010 09:55:49 GMT
Thank you for the feedback and for clearing up the confusion.

Thanks,
Gayatri

On Mon, Nov 15, 2010 at 3:59 PM, Jonathan Gray <jgray@facebook.com> wrote:

> > Thank you for the feedback. So to summarize, HBase does well with heavy
> > reads and writes. An update really writes a new version of the data, so
> > updating is okay, but handling deletes is not possible in the current
> > version of the data unless a new version of the data is written.
>
> Deletes are supported (you can delete all of a row, all of a column, or
> specific versions of columns).
>
> They are really tombstones / markers, so the data does actually still sit
> on disk for some time, but HBase will never return it back to you once it is
> marked as deleted.  In the background and over time, HBase will eventually
> evict all of the deleted data.
>
>
> > Also, I was reading some documentation to figure out if there is a way
> > to store and get column values in a sorted manner. I understand it is
> > possible to do range queries on the key (as keys are stored sorted), but
> > it looks like it's not straightforward to do the same on the column
> > values. For example, I have a set of columns, each with a name and a
> > score, and for a given key I want to retrieve the column names sorted
> > by score. From my understanding so far, this has to be handled at the
> > application end. Please let me know if I am missing something here.
>
> You're not missing something.  HBase tables are sorted by row, each row is
> sorted by columns, each column is sorted by versions.  There is no sorting
> on values.
>
> One option is to read all the values and do the sorting in the client
> (sometimes this makes sense, but if you have 1M columns it probably
> doesn't).  The other is to create more tables.  A table can be used to
> create a different index on your data (the value would now be the row
> key, so the table would be sorted by value, for example).
>
> Hope that helps.
>
> JG
>
>
>
>
> >
> > Thanks,
> > Gayatri
> >
> > On Mon, Nov 15, 2010 at 10:29 AM, Ryan Rawson <ryanobjc@gmail.com>
> > wrote:
> >
> > > That is a static snapshot of a particular version of HBase with a
> > > particular version of their code (each with various flaws, mistakes,
> > > etc, etc).
> > >
> > > At this moment, Stumbleupon uses HBase behind parts of its website,
> > > doing reads, writes, updates, and so on.  Performance is quite good,
> > > and we are very happy with HBase.
> > >
> > > -ryan
> > >
> > > On Sun, Nov 14, 2010 at 8:54 PM, Hari Sreekumar
> > > <hsreekumar@clickable.com> wrote:
> > > > Hi,
> > > >   I read the comparison from this pdf:
> > > >   http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf
> > > >
> > > > hari
> > > >
> > > >
> > > > > On Mon, Nov 15, 2010 at 4:20 AM, Jonathan Gray <jgray@facebook.com>
> > > > > wrote:
> > > >
> > > >> HBase is well-suited for a high-write workload.
> > > >>
> > > > >> Hari, I'm not sure what would be different in a database like
> > > > >> Cassandra with respect to updates and deletes?  In this regard HBase
> > > > >> and Cassandra are nearly identical (updates are really just
> > > > >> insertions of new versions, deletions are actually tombstone
> > > > >> markers... ie data is immutable once written).
> > > >>
> > > >> JG
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: Hari Sreekumar [mailto:hsreekumar@clickable.com]
> > > >> > Sent: Friday, November 12, 2010 6:21 AM
> > > >> > To: user@hbase.apache.org
> > > >> > Subject: Re: Newbie question
> > > >> >
> > > >> > Hi Gayatri,
> > > >> >
> > > > >> > I am myself quite new to hbase but from my little experience and
> > > > >> > from whatever I have read, HBase is more suitable for environments
> > > > >> > with high read and write, but very few updates and no real
> > > > >> > deletions. It is more of a write once and forget kind of database.
> > > > >> > Cassandra or MongoDB might be more suitable for your requirement
> > > > >> > imo. My advice would be to consider those as well before making
> > > > >> > any decision.
> > > >> >
> > > >> > thanks,
> > > >> > hari
> > > >> >
> > > > >> > On Fri, Nov 12, 2010 at 7:00 PM, Gayatri Rao <rgayatri1@gmail.com>
> > > > >> > wrote:
> > > >> >
> > > >> > > Hi All,
> > > >> > >
> > > > >> > > I am new to hbase. I have been reading up documentation and
> > > > >> > > studying how hbase suits our requirement.
> > > > >> > >
> > > > >> > > We want to be able to store a key and corresponding values.
> > > > >> > > However, while reading, I want to read values in sorted order,
> > > > >> > > something like the topN. It's a web-facing environment and our
> > > > >> > > requirement is write heavy; in fact, they are updates of the
> > > > >> > > already existing records (about 270K updates in an hour, though
> > > > >> > > actual data that needs to be stored in it might be much, much
> > > > >> > > more). Deletes would be in the order of a few thousands every day.
> > > > >> > >
> > > > >> > > I wanted to find out your opinions on how good hbase is for this
> > > > >> > > kind of scenario.
> > > >> > >
> > > >> > > Thanks,
> > > >> > > Gayatri
> > > >> > >
> > > >>
> > > >
> > >
>
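
To make the tombstone behaviour Jonathan describes above a little more concrete, here is a minimal sketch in Python. It is a toy in-memory model, not HBase code and not the HBase client API: the `TinyStore` class, its method names, and the rule that a marker hides every version at or before its timestamp are all assumptions made for illustration. `compact()` merely stands in for the background cleanup mentioned in the thread.

```python
TOMBSTONE = object()  # marker value standing in for a delete marker


class TinyStore:
    """Toy append-only cell store (hypothetical, not HBase internals)."""

    def __init__(self):
        # Each cell is (row, column, timestamp, value); nothing is edited in place.
        self.cells = []

    def put(self, row, column, ts, value):
        # An "update" is just another cell with a newer timestamp.
        self.cells.append((row, column, ts, value))

    def delete(self, row, column, ts):
        # A delete appends a marker; older cells stay in storage for now.
        self.cells.append((row, column, ts, TOMBSTONE))

    def _latest_tombstone(self, row, column):
        stamps = [t for r, c, t, v in self.cells
                  if (r, c) == (row, column) and v is TOMBSTONE]
        return max(stamps, default=None)

    def get(self, row, column):
        # Return the newest value not masked by a tombstone, else None.
        cutoff = self._latest_tombstone(row, column)
        live = [(t, v) for r, c, t, v in self.cells
                if (r, c) == (row, column) and v is not TOMBSTONE
                and (cutoff is None or t > cutoff)]
        return max(live, key=lambda tv: tv[0])[1] if live else None

    def compact(self):
        # Stand-in for background cleanup: drop masked cells and the markers.
        kept = []
        for r, c, t, v in self.cells:
            cutoff = self._latest_tombstone(r, c)
            if v is not TOMBSTONE and (cutoff is None or t > cutoff):
                kept.append((r, c, t, v))
        self.cells = kept
```

In this model a deleted value is never returned by `get()`, even though its cells remain in `cells` until `compact()` runs, which mirrors the "still on disk for some time, but never returned" point in the reply.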

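
The two options from the thread for getting columns back ordered by score (sort in the client, or keep a second table whose row key carries the value) can be sketched roughly as follows. This is plain Python over in-memory data, not the HBase API. The function names, the `key:score:name` key layout, the zero-padded inverted score, and the assumptions that scores are non-negative integers below 10**width and that names contain no `:` are all hypothetical choices for illustration; `sorted()` stands in for a table keeping its row keys in lexicographic order.

```python
def top_n_client_side(row, n):
    """Option 1: read every column (name -> score) and sort in the client."""
    return sorted(row.items(), key=lambda kv: kv[1], reverse=True)[:n]


def index_row_key(key, score, name, width=10):
    """Option 2: build a row key for a hypothetical index table.

    The score is inverted and zero-padded so that lexicographic order of
    the keys matches descending numeric order of the scores.
    """
    inverted = 10 ** width - 1 - score
    return f"{key}:{inverted:0{width}d}:{name}"


def build_index(key, row):
    # sorted() models the index table's row-key ordering.
    return sorted(index_row_key(key, score, name) for name, score in row.items())


def top_n_from_index(index, key, n):
    # Walk the ordered keys for one prefix and stop after n hits,
    # roughly what a prefix scan with a limit would do.
    prefix = key + ":"
    names = []
    for row_key in index:
        if row_key.startswith(prefix):
            names.append(row_key.rsplit(":", 1)[1])
            if len(names) == n:
                break
    return names


scores = {"alice": 30, "bob": 95, "carol": 70}
index = build_index("user1", scores)
print(top_n_client_side(scores, 2))
print(top_n_from_index(index, "user1", 2))
```

One design point worth keeping in mind with the index-table approach: because the score is part of the index row key, changing a score means writing a new index entry and deleting the old one, which matters for an update-heavy workload like the one described in the original question.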