hbase-user mailing list archives

From Steven Noels <stev...@outerthought.org>
Subject Re: Using external indexes in an HBase Map/Reduce job...
Date Tue, 12 Oct 2010 17:21:44 GMT
Did you have a look at Lily? A billion items will be interesting, but we
offer M/R index rebuilds (against SOLR) as well as incremental updates. You
could also take a look at the RowLog library we built to do this in a robust
way - it has no Lily dependencies.
On Tue, Oct 12, 2010 at 2:36 PM, Michael Segel <michael_segel@hotmail.com>wrote:

> Hi,
> Now I realize that most everyone is sitting in NY, while some of us can't
> leave our respective cities....
> Came across this problem and I was wondering how others solved it.
> Suppose you have a really large table with 1 billion rows of data.
> Since HBase really doesn't have any indexes built in (Don't get me started
> about the contrib/transactional stuff...), you're forced to use some sort of
> external index, or roll your own index table.
> The net result is that you end up with a list object that contains your
> result set.
> So the question is... what's the best way to feed the list object in?
> One option I thought about is writing the object to a file, using that file
> as the job input, and then controlling the splits. Not the most efficient,
> but it would work.
> Was trying to find a more 'elegant' solution, and I'm sure that anyone using
> SOLR or LUCENE or whatever... has come across this problem too.
> Any suggestions?
> Thx
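
The file-based option described above can be sketched roughly as follows:
the row keys returned by the external index (SOLR/Lucene) are partitioned
into fixed-size chunks, one chunk per map task, each of which would then
issue HBase Gets for its keys. All names here are illustrative and not part
of any HBase or Hadoop API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: partitions an external index's result set (a list of
// HBase row keys) into chunks that could each back one InputSplit/mapper.
public class RowKeySplitter {

    // Split rowKeys into consecutive chunks of at most splitSize keys each.
    public static List<List<String>> split(List<String> rowKeys, int splitSize) {
        List<List<String>> splits = new ArrayList<>();
        for (int i = 0; i < rowKeys.size(); i += splitSize) {
            // Copy the sublist so each split is independent of the source list.
            splits.add(new ArrayList<>(
                rowKeys.subList(i, Math.min(i + splitSize, rowKeys.size()))));
        }
        return splits;
    }
}
```

Each chunk could be written out as one line-oriented file (or one file per
split), so that the number of mappers, and hence the parallelism of the Get
phase, is controlled by the split size rather than by HBase region boundaries.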

Steven Noels
Open Source Content Applications
Makers of Kauri, Daisy CMS and Lily
