hbase-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: HBase and Lucene for realtime search
Date Mon, 14 Feb 2011 02:09:30 GMT
I think there's another way to look at this, and that is: what types of
queries do HBase users perform that search can enhance?  Eg, given that we
can index extremely quickly with Lucene, and with RT we can search with
near-zero latency, perhaps there are new queries that would be of
interest/use to HBase users?  Things like traditional SQL-ish queries
with multiple clauses should be possible?
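[A minimal sketch of what a multi-clause conjunctive ("AND") query reduces
to over term-sorted postings: an intersection of sorted posting lists.  The
names below are illustrative only, not any actual HBase or Lucene API.]

```java
import java.util.*;

// Hypothetical sketch: a multi-clause conjunctive query over an inverted
// index is an intersection of sorted posting lists.
class ConjunctionQuery {
    // term -> sorted array of doc ids containing that term
    static final Map<String, int[]> index = new HashMap<>();

    // Merge-style intersection of two sorted doc-id arrays.
    static int[] intersect(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) { out.add(a[i]); i++; j++; }
            else if (a[i] < b[j]) i++;
            else j++;
        }
        return out.stream().mapToInt(Integer::intValue).toArray();
    }

    // Docs matching term1 AND term2 AND ...
    static int[] query(String... terms) {
        int[] result = index.getOrDefault(terms[0], new int[0]);
        for (int t = 1; t < terms.length; t++)
            result = intersect(result, index.getOrDefault(terms[t], new int[0]));
        return result;
    }
}
```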

> I haven't seen any search sites that absolutely need transactional
> consistency.

While this is true, databases usually require this?  And so this is
somewhat of an out-of-the-box view on search, and that's why it's
perhaps better to frame it more in the context of databases, eg,
transactions, consistency, and complex queries.

On Sun, Feb 13, 2011 at 1:36 AM, Ted Dunning <tdunning@maprtech.com> wrote:
> Transactional consistency isn't going to happen if you involve more
> than one HBase row.
>
> I haven't seen any search sites that absolutely need transactional
> consistency.  What they need is that documents can be found very shortly
> after they are inserted and that crashes won't compromise that.
>
> On Sat, Feb 12, 2011 at 1:31 PM, Jason Rutherglen <
> jason.rutherglen@gmail.com> wrote:
>
>> Right, the concepts aren't that hard (write ahead log etc), however to
>> keep the data transactionally consistent with another datastore across
>> servers [I believe] is a little more difficult?  Also with RT there
>> needs to be a primary data store somewhere outside of Lucene,
>> otherwise we'd be storing the same data twice, eg, in HBase and
>> Lucene, that's inefficient.  I'm guessing it'll be easier to keep
>> Lucene indexes in parallel with HBase regions across servers, and then
>> use the Coprocessor architecture etc, to keep them in sync, on the
>> same server.  When a region is split, we'd need to also split the
>> Lucene index, this'd be the only 'new' technology that'd need to be
>> created on the Lucene side.
>>
>> I think it's advantageous to build a distributed search system that
>> mirrors the underlying data, if the search indices are on their own
>> servers, I think there's always going to be sync'ing problems?
>>
>> On Sat, Feb 12, 2011 at 1:14 PM, Ted Dunning <tdunning@maprtech.com>
>> wrote:
>> > I really think that putting update semantics into Katta would be much
>> > easier.
>> >
>> > Building the write-ahead log for the Lucene case isn't all that hard.
>> > If you follow the ZooKeeper model of having a WAL thread that writes
>> > batches of log entries you can get pretty high speed as well.  The
>> > basic idea is that update requests are put into a queue of pending log
>> > writes, but are written to the index immediately.  When the WAL thread
>> > finishes the previous tranche of log items, it comes back around and
>> > takes everything that is pending.  When it finishes a tranche of
>> > writes, it releases all of the pending updates in a batch.  If updates
>> > are not frequent, then you lose no latency.  If your updates are very
>> > high speed, then you transition seamlessly to a bandwidth-oriented
>> > scheme of large updates while latency is roughly bounded to 2-3x the
>> > original case.
>> >
>> > If you put the write-ahead log on a reliable replicated file system
>> > then, as you say, much of the complexity of write-ahead logging goes
>> > away.
>> >
>> > But this verges off topic for hbase.
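[The group-commit scheme described above can be sketched in plain Java.
This is my own illustration, not ZooKeeper's or HBase's actual code:
writers enqueue entries, and a single WAL thread repeatedly drains
everything pending into one batch, writes the batch, then releases all of
those writers at once.  Under light load a batch holds one entry, so no
latency is added; under heavy load batches grow and throughput dominates.]

```java
import java.util.*;
import java.util.concurrent.*;

// Group-commit write-ahead log sketch: one "fsync" per tranche, and every
// writer in the tranche is acknowledged together.
class GroupCommitWal {
    static class Entry {
        final String data;
        final CompletableFuture<Void> acked = new CompletableFuture<>();
        Entry(String data) { this.data = data; }
    }

    final BlockingQueue<Entry> pending = new LinkedBlockingQueue<>();
    final List<String> log = new ArrayList<>();  // stands in for the on-disk log
    volatile boolean running = true;

    // Called by writers: returns a future completed once the entry is durable.
    CompletableFuture<Void> append(String data) {
        Entry e = new Entry(data);
        pending.add(e);
        return e.acked;
    }

    // The WAL thread: one tranche per iteration.
    void run() throws InterruptedException {
        while (running || !pending.isEmpty()) {
            Entry first = pending.poll(10, TimeUnit.MILLISECONDS);
            if (first == null) continue;
            List<Entry> batch = new ArrayList<>();
            batch.add(first);
            pending.drainTo(batch);                 // take everything pending
            for (Entry e : batch) log.add(e.data);  // one batched log write
            for (Entry e : batch) e.acked.complete(null);  // release the tranche
        }
    }
}
```

[The 2-3x latency bound falls out of the loop shape: an entry arriving just
after a tranche starts waits for that tranche plus its own, never longer.]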
>> >
>> > On Sat, Feb 12, 2011 at 1:01 PM, Jason Rutherglen <
>> > jason.rutherglen@gmail.com> wrote:
>> >
>> >> So in giving this a day of breathing room, it looks like HBase loads
>> >> values as it's scanning a column?  I think that'd be a killer to some
>> >> Lucene queries, eg, we'd be loading entire/part-of posting lists just
>> >> for a linear scan of the terms dict?  Or we'd probably instead want to
>> >> place the posting list into its own column?
>> >>
>> >> Another approach would be to feed off the HLog, place updates into a
>> >> dedicated RT Lucene index (eg, outside of HBase).  With the latter
>> >> system we'd get transactional consistency, and we wouldn't need to
>> >> work so hard to force Lucene's index into HBase columns etc (which is
>> >> extremely high risk).  On being built, the indexes could be offloaded
>> >> automatically into HDFS.  This architecture would be more of a
>> >> 'parallel' to HBase Lucene index.  We'd still gain the removal of
>> >> doc-stores, we wouldn't need to worry about tacking on new HBase
>> >> specific merge policies, and we'd gain [probably most importantly] a
>> >> consistent transactional view of the data, while also being able to
>> >> query that data using con/disjunction and phrase queries, amongst
>> >> others.  A delete or update in HBase'd cascade into a Lucene delete,
>> >> and this'd be performed atomically, and vice versa.
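[The "feed off the HLog" architecture above can be sketched as follows.
This is an illustration of the idea, not the actual HLog API: every
mutation is appended to a log, the primary store applies it, and a
parallel indexer tails the same log, so an HBase delete or update cascades
into the index and the two views stay in step.]

```java
import java.util.*;

// Log-tailing sketch: the index replays the mutation log from its last
// position, so after catchUp() it agrees with the primary store.
class LogTailingIndexer {
    enum Op { PUT, DELETE }
    static class Mutation {
        final Op op; final String key, value;
        Mutation(Op op, String key, String value) { this.op = op; this.key = key; this.value = value; }
    }

    final List<Mutation> hlog = new ArrayList<>();       // write-ahead log
    final Map<String, String> primary = new HashMap<>(); // stands in for HBase
    final Map<String, String> index = new HashMap<>();   // stands in for Lucene
    int tailed = 0;                                      // index's log position

    void put(String key, String value) {
        hlog.add(new Mutation(Op.PUT, key, value));
        primary.put(key, value);
    }

    void delete(String key) {
        hlog.add(new Mutation(Op.DELETE, key, null));
        primary.remove(key);
    }

    // Tail the log: replay every mutation the index hasn't seen yet.
    void catchUp() {
        for (; tailed < hlog.size(); tailed++) {
            Mutation m = hlog.get(tailed);
            if (m.op == Op.PUT) index.put(m.key, m.value);
            else index.remove(m.key);
        }
    }
}
```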
>> >>
>> >> On Fri, Feb 11, 2011 at 7:00 PM, Ted Dunning <tdunning@maprtech.com>
>> >> wrote:
>> >> > No.  And I doubt there ever will be.
>> >> >
>> >> > That was one reason to split the larger posting vectors.  That way
>> >> > you can multi-thread the fetching and the scoring.
>> >> >
>> >> > On Fri, Feb 11, 2011 at 6:56 PM, Jason Rutherglen <
>> >> > jason.rutherglen@gmail.com> wrote:
>> >> >
>> >> >> Thanks!  In browsing the HBase code, I think it'd be optimal to
>> >> >> stream the posting/binary data directly from the underlying storage
>> >> >> (instead of loading the entire byte[]), but it doesn't look like
>> >> >> there's a way to do this (yet)?
>> >> >>
>> >> >> On Fri, Feb 11, 2011 at 6:20 PM, Ted Dunning <tdunning@maprtech.com>
>> >> >> wrote:
>> >> >> > Go for it!
>> >> >> >
>> >> >> > On Fri, Feb 11, 2011 at 4:44 PM, Jason Rutherglen <
>> >> >> > jason.rutherglen@gmail.com> wrote:
>> >> >> >
>> >> >> >> > Michi's stuff uses flexible indexing with a zero-lock
>> >> >> >> > architecture.  The speed *is* much higher.
>> >> >> >>
>> >> >> >> The speed's higher, and there isn't much Lucene left there
>> >> >> >> either, as I believe it was built specifically for the
>> >> >> >> 140-characters use case (eg, not the general use case).  I don't
>> >> >> >> think most indexes can be compressed to only exist in RAM on a
>> >> >> >> single server?  The Twitter use case isn't one that the HBase RT
>> >> >> >> search solution is useful for?
>> >> >> >>
>> >> >> >> > If you were to store entire posting vectors as values with
>> >> >> >> > terms as keys, you might be OK.  Very long posting vectors or
>> >> >> >> > add-ons could be added using a key+serial number trick.
>> >> >> >>
>> >> >> >> This sounds like the right approach to try.  Also, the Lucene
>> >> >> >> terms dict is sorted anyways, so moving the terms into HBase's
>> >> >> >> sorted keys probably makes sense.
>> >> >> >>
>> >> >> >> > For updates, speed would only be acceptable if you batch up a
>> >> >> >> > lot of updates or possibly if you build in a value append
>> >> >> >> > function as a co-processor.
>> >> >> >>
>> >> >> >> Hmm... I think the main issue would be the way Lucene implements
>> >> >> >> deletes (eg, today as a BitVector).  I think we'd keep that
>> >> >> >> functionality.  The new docs/updates would be added to the
>> >> >> >> in-RAM buffer.  I think there'd be a RAM-size-based flush as
>> >> >> >> there is today.  Where that'd be flushed to is an open question.
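[The bookkeeping described in that paragraph can be sketched minimally.
This is my own illustration, not Lucene's internals: new docs accumulate
in a RAM buffer, deletes are tracked in a bit set keyed by doc id
(BitVector-style), and the buffer is flushed to a "segment" once its
estimated size crosses a threshold.]

```java
import java.util.*;

// RAM buffer + delete-bitset sketch: addDocument() flushes automatically
// when the estimated buffer size reaches the configured threshold.
class RamBufferIndex {
    final int flushBytes;
    final List<String> ramBuffer = new ArrayList<>();
    final List<List<String>> segments = new ArrayList<>();  // flushed segments
    final BitSet deleted = new BitSet();  // per-doc delete flags
    int ramBytes = 0, nextDocId = 0;

    RamBufferIndex(int flushBytes) { this.flushBytes = flushBytes; }

    int addDocument(String doc) {
        ramBuffer.add(doc);
        ramBytes += doc.length();          // crude size estimate
        if (ramBytes >= flushBytes) flush();
        return nextDocId++;
    }

    void deleteDocument(int docId) { deleted.set(docId); }

    void flush() {
        segments.add(new ArrayList<>(ramBuffer));
        ramBuffer.clear();
        ramBytes = 0;
    }

    boolean isLive(int docId) { return !deleted.get(docId); }
}
```

[Where flush() writes to is exactly the open question above: local disk,
HDFS, or HBase itself.]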
>> >> >> >>
>> >> >> >> I think the key advantages to the RT + HBase architecture are
>> >> >> >> that the index would live alongside HBase columns, and so all
>> >> >> >> other scaling problems (especially those related to scaling RT,
>> >> >> >> such as synchronization of distributed data and updates) go
>> >> >> >> away.
>> >> >> >>
>> >> >> >> A distributed query would remain the same, eg, it'd hit N
>> >> >> >> servers?
>> >> >> >>
>> >> >> >> In addition, Lucene offers a wide variety of new query types
>> >> >> >> which HBase'd get in realtime for free.
>> >> >> >>
>> >> >> >> On Fri, Feb 11, 2011 at 4:13 PM, Ted Dunning <
>> >> >> >> tdunning@maprtech.com> wrote:
>> >> >> >> > On Fri, Feb 11, 2011 at 3:50 PM, Jason Rutherglen <
>> >> >> >> > jason.rutherglen@gmail.com> wrote:
>> >> >> >> >
>> >> >> >> >> > I can't imagine that the speed achieved by using HBase
>> >> >> >> >> > would be even within orders of magnitude of what you can do
>> >> >> >> >> > in Lucene 4 (or even 3).
>> >> >> >> >>
>> >> >> >> >> The indexing speed in Lucene hasn't changed in quite a while;
>> >> >> >> >> are you saying HBase would somehow be overloaded?  That
>> >> >> >> >> doesn't seem to jibe with the sequential writes HBase
>> >> >> >> >> performs?
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> > Michi's stuff uses flexible indexing with a zero-lock
>> >> >> >> > architecture.  The speed *is* much higher.
>> >> >> >> >
>> >> >> >> > The real problem is that HBase repeats keys.
>> >> >> >> >
>> >> >> >> > If you were to store entire posting vectors as values with
>> >> >> >> > terms as keys, you might be OK.  Very long posting vectors or
>> >> >> >> > add-ons could be added using a key+serial number trick.
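[The key+serial-number trick above can be sketched with a TreeMap standing
in for HBase's sorted key space (this is an illustration, not the real
HBase client API): a long posting vector is stored as chunks under keys
like "term#0000", "term#0001", ..., and a read is a prefix scan that
concatenates the chunks back in order.]

```java
import java.util.*;

// Chunked-postings sketch: zero-padded serial suffixes keep chunk keys
// sorted, so a prefix scan returns them in append order.
class ChunkedPostings {
    final NavigableMap<String, int[]> store = new TreeMap<>();

    private static String key(String term, int serial) {
        return String.format("%s#%04d", term, serial);  // zero-pad so keys sort
    }

    // Append a chunk under the next serial number for this term.
    void appendChunk(String term, int[] docIds) {
        int serial = store.subMap(term + "#", term + "#\uffff").size();
        store.put(key(term, serial), docIds);
    }

    // Prefix scan: merge all chunks for the term, in serial order.
    int[] read(String term) {
        List<Integer> out = new ArrayList<>();
        for (int[] chunk : store.subMap(term + "#", term + "#\uffff").values())
            for (int d : chunk) out.add(d);
        return out.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

[A region split would fall on a key boundary, so a term's chunks stay
scannable on whichever side of the split they land.]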
>> >> >> >> >
>> >> >> >> > Short queries would involve reading and merging several
>> >> >> >> > posting vectors.  In that mode, query speeds might be OK, but
>> >> >> >> > there isn't a lot of Lucene left at that point.  For updates,
>> >> >> >> > speed would only be acceptable if you batch up a lot of
>> >> >> >> > updates or possibly if you build in a value append function as
>> >> >> >> > a co-processor.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >> The speed of indexing is a function of creating segments;
>> >> >> >> >> with flexible indexing, the underlying segment files (and
>> >> >> >> >> postings) may be significantly altered from the default file
>> >> >> >> >> structures, eg, placed into HBase in various ways.  The
>> >> >> >> >> posting lists could even be split along with HBase regions?
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> > Possibly.  But if you use term + counter and post vectors of
>> >> >> >> > limited length you might be OK.
>> >> >> >> >
>> >> >> >>
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
