hbase-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: Lucene instead of HFiles?
Date Sat, 06 Oct 2012 02:38:36 GMT
Hi Lars,

Yeah, maybe.  Somewhere in the back of my head was a completely fuzzy
idea that if one were to sneak in Lucene at that low level one could
get that full-text search over HBase data that comes up periodically.
Also, I was thinking, having Lucene down there could make it possible
to get ad-hoc reports on data in HBase and one wouldn't have to figure
out the key structure ahead of time.

But I think Jacques makes a good point - there are already
ElasticSearch and Solr.  They are full-text search engines, but people
also use them for pure boolean matching, as key value stores, etc.

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Fri, Oct 5, 2012 at 5:11 AM, Lars George <lars.george@gmail.com> wrote:
> Hi Otis,
>
> My initial reaction was, "interesting idea". On second thought, though, I
> do not see how this makes more sense compared to what we have now. HFiles
> combined with Bloom filters are fast to look up anyway. Adding Lucene as
> another "Storage Engine" (getting us close to Voldemort or MySQL with
> replaceable storage backends) does not seem to add any value and, more so,
> might even have a few drawbacks. Range scans especially will suffer, as
> HFiles and their block-oriented layout plus caching make for really fast
> I/O. Lucene is for search, not xyzbytes of data transfers. And simply
> replacing the block index and Blooms with Lucene is also, I think, overkill.
> Just saying.
>
> Lars
>
> On Oct 5, 2012, at 5:34 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
>
>> Hi,
>>
>> Has anyone attempted using Lucene instead of HFiles (see
>> https://twitter.com/otisg/status/254047978174701568 )?
>>
>> Is that a completely crazy, bad, would-never-work,
>> don't-bother-trying-this-at-home, it's-too-late-go-to-sleep idea? Or
>> not?
>>
>> Thanks,
>> Otis
>> --
>> Search Analytics - http://sematext.com/search-analytics/index.html
>> Performance Monitoring - http://sematext.com/spm/index.html
>
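
Editorial sketch, not part of the original messages: the point in the quoted reply about Bloom filters, the block index, and range scans can be illustrated with a toy model. This is only an illustration under stated assumptions; it is not HBase's actual HFile format or API, and the names ToyBloom and ToySortedFile are hypothetical. It shows why a sorted, block-oriented file makes a range scan a single index lookup followed by sequential reads, while a Bloom filter lets a point lookup skip a file that cannot contain the key.

```python
import bisect
import hashlib


class ToyBloom:
    """Hypothetical, minimal Bloom filter: no false negatives, some false positives."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.field = 0  # bit set stored in a single Python int

    def _positions(self, key):
        # Derive several bit positions from salted SHA-256 digests of the key.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.field |= 1 << p

    def might_contain(self, key):
        return all((self.field >> p) & 1 for p in self._positions(key))


class ToySortedFile:
    """Hypothetical stand-in for a sorted, block-oriented file (not a real HFile)."""

    def __init__(self, rows, block_size=2):
        items = sorted(rows.items())
        # Split the sorted rows into fixed-size blocks.
        self.blocks = [items[i:i + block_size]
                       for i in range(0, len(items), block_size)]
        # Block index: the first key of every block.
        self.index = [block[0][0] for block in self.blocks]
        self.bloom = ToyBloom()
        for key, _ in items:
            self.bloom.add(key)

    def _block_for(self, key):
        # Binary search for the last block whose first key is <= key.
        return max(bisect.bisect_right(self.index, key) - 1, 0)

    def get(self, key):
        # Point lookup: the Bloom filter can rule the file out without reading a block.
        if not self.blocks or not self.bloom.might_contain(key):
            return None
        for k, v in self.blocks[self._block_for(key)]:
            if k == key:
                return v
        return None

    def scan(self, start, stop):
        # Range scan: one index lookup, then sequential reads until stop (exclusive).
        if not self.blocks:
            return
        for block in self.blocks[self._block_for(start):]:
            for k, v in block:
                if k >= stop:
                    return
                if k >= start:
                    yield k, v
```

For example, ToySortedFile({"row1": "a", "row2": "b", "row3": "c"}).scan("row2", "row4") walks the blocks in key order starting from the block that holds "row2". A search index organized around terms rather than sorted row keys would not give this access pattern for free, which is the trade-off the reply describes.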
