hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes
Date Tue, 13 May 2014 21:33:07 GMT
Breaking off hackathon thread.

The conversation around HBASE-8089 concluded with two points:
 - HBase should provide support for order-preserving encodings while not
dropping support for the existing encoding formats.
 - HBase is not in the business of schema management; that is a
responsibility left to application developers.

To handle the first point, OrderedBytes is provided. To support the
second, the DataType API is introduced. This layer sits above specific
encoding formats, which gives us a hook for plugging in different
implementations and for helper utilities to ship with HBase, such as
HBASE-10091.

Things get fuzzy around complex data types: pojos, compound rowkeys (a
special case of pojo), maps/dicts, and lists/arrays. These types are
composed of other types and have different requirements based on where in
the schema they're used. Again, by falling back on the DataType API, we
give application developers an "out" for doing what makes the most sense
for them.

For compound rowkeys, the Struct class is designed to fill this gap,
sitting between data encoding and schema expression. It gives the
application implementer, the person managing the schema, enough flexibility
to express the key encoding in terms of the component types. These
components are not limited to the simple primitives already defined; they
can be any DataType implementation. Order preservation is likely important
here.

For arrays/lists, there's no implementation yet, but you can see how it
might be done if you have a look at Struct. Order preservation may or may
not be important for arrays/lists.

The situation for maps/dicts is similar to arrays/lists. The one
complication is the case where you want to map to a column family. How can
these APIs support that case?

Pojos are a little more complicated. Struct is probably sufficient for
basic cases, but it doesn't support nice features like versioning -- these
are sacrificed in favor of order preservation. Luckily, there are plenty of
tools out there for this already: Avro, MessagePack, Protobuf, Thrift, &c.
There's no need to reinvent the wheel here. Application developers can
implement the DataType API backed by their management tool of choice. I
created HBASE-11161 and will post a patch shortly.

Specific comments about the Hackathon notes inline.

Thanks,
Nick

On Mon, May 12, 2014 at 5:01 PM, Jonathan Hsieh <jon@cloudera.com> wrote:

>
> Here's where I believe there is agreement:
> * basic memcmp numeric encodings for ints, floats/doubles
>

This is already provided by OrderedBytes and the DataType implementations
Ordered{Float32,Float64,Int8,Int16,Int32,Int64,Numeric}.
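
To make "memcmp-ordered" concrete, here is a minimal sketch for a signed
32-bit int: flip the sign bit and write the bytes big-endian, so that an
unsigned byte-wise comparison agrees with numeric order. This is an
illustration only and is not claimed to match the OrderedBytes wire format;
the class and method names (MemcmpIntSketch, encode, decode, memcmp) are
made up for the example.

```java
import java.util.Arrays;

final class MemcmpIntSketch {
    // Flip the sign bit so negative values sort before positive ones,
    // then emit the four bytes most-significant first (big-endian).
    static byte[] encode(int value) {
        int u = value ^ 0x80000000;
        return new byte[] {
            (byte) (u >>> 24), (byte) (u >>> 16), (byte) (u >>> 8), (byte) u
        };
    }

    // Reassemble the big-endian bytes and undo the sign-bit flip.
    static int decode(byte[] b) {
        int u = ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16)
              | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
        return u ^ 0x80000000;
    }

    // Unsigned lexicographic comparison, i.e. what memcmp does.
    static int memcmp(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = (a[i] & 0xff) - (b[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        int[] values = {42, -1, 0, Integer.MIN_VALUE, Integer.MAX_VALUE};
        byte[][] keys = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            keys[i] = encode(values[i]);
        }
        // Sorting the encoded bytes yields the values in numeric order.
        Arrays.sort(keys, MemcmpIntSketch::memcmp);
        for (byte[] key : keys) {
            System.out.println(decode(key));
        }
    }
}
```

Without the sign-bit flip, plain two's complement bytes would sort every
negative value after every positive one.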

> * fixed scale decimal type
>

Provided by OrderedNumeric

> * evolvability is highly desirable and thus tagged types of structs is
> desirable.  seemed like agreement for protobuf (which meets criteria)
> encoding for complex data types (records, arrays, lists with records, maps
> in a single cell)
>

Why not use protobuf directly instead of reimplementing a slight variation
of their format?

> * no protobuf complex type encodings in rowkey (rowkey is like a struct but
> memcmp is critical).
>

Agreed. Struct is provided for this purpose.

> * memcmp encodings for primitives in cells desired for phoenix (2ndary
> indices?)
>

This sounds like a Phoenix-specific decision.

> * must support nulls in compound keys.
>

Struct offers this when the component types support it.

> * this might be a separate module in hbase
>

This was discussed when HBASE-8089 was started and the consensus was to
place it into hbase-common. This can be reconsidered as necessary.

> Here are the main discussion points that need follow up (or where I'm
> not sure there was agreement):
> * for compound key encoding with nulls, do we need to distinguish null from
> ""? (phoenix emulates oracle, where they are same)
>

Null and "" are distinct in all JVM languages I'm aware of. We should not
preclude the possibility.

> * compound key encoding of string/byte[]'s (how to handle \0)
>

OrderedBytes implements a bit-shifting strategy for this.
{FixedLength,Terminated}Wrapper are provided to add flexibility. Ryan has
suggested a variation of run-length encoding as another alternative,
something we could add if there's sufficient need.
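
For readers new to the \0 problem, here is a minimal sketch of one way to
keep a terminator unambiguous inside a compound key. It uses a plain
escape-and-terminate scheme, which is neither the OrderedBytes bit-shifting
strategy nor the run-length variation mentioned above: an embedded 0x00 is
written as 0x00 0x01 and each component ends with 0x00 0x00. The class and
method names are hypothetical, and the decoder assumes well-formed input.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

final class EscapedKeySketch {
    // Concatenate components into one key. Because the terminator (0x00 0x00)
    // sorts below both an escaped zero (0x00 0x01) and any non-zero byte,
    // byte-wise order of keys follows component-wise order of the inputs.
    static byte[] encode(byte[]... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] component : components) {
            for (byte b : component) {
                out.write(b);
                if (b == 0) {
                    out.write(0x01); // escape marker after an embedded 0x00
                }
            }
            out.write(0x00);
            out.write(0x00);
        }
        return out.toByteArray();
    }

    // Split a well-formed key back into its components.
    static List<byte[]> decode(byte[] key) {
        List<byte[]> components = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        int i = 0;
        while (i < key.length) {
            byte b = key[i++];
            if (b != 0) {
                current.write(b);
                continue;
            }
            byte next = key[i++];
            if (next == 0x01) {
                current.write(0); // escaped zero: restore the literal 0x00
            } else {
                components.add(current.toByteArray()); // terminator reached
                current.reset();
            }
        }
        return components;
    }

    // Unsigned lexicographic comparison, i.e. what memcmp does.
    static int memcmp(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = (a[i] & 0xff) - (b[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] first = encode(new byte[] {'a'}, new byte[] {'z'});
        byte[] second = encode(new byte[] {'a', 0}, new byte[] {'b'});
        System.out.println("first sorts before second: " + (memcmp(first, second) < 0));
        System.out.println("components in second: " + decode(second).size());
    }
}
```

The cost is one extra byte per embedded zero plus two bytes per component,
which is the kind of overhead the alternatives above try to reduce.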

> * do we include 1 byte and 2 byte ints?
>

Following the initial commit of HBASE-8201, these were requested in
HBASE-9369.

> * how to handle encodings of sql compound type like date (are they complex
> or primitives?)
> ** updated suggestion for date encodings.
>

Dates were not mentioned in previous discussions; would be good to have!

> * Nick brings up some issues about the philosophy around
> o.a.h.h.types.DataType. As I understand it, this datatype api has
> *extensibility* as the goal of being one api that could wrap many alternate
> encodings of data for hbase.


The above date question is a perfect example of why I think it's important
that we have the DataType interface. Having the interface means an
application can implement its own types when its needs are too unique
to commit to HBase. Other applications can still use that implementation
by including the relevant application jars. They enjoy interoperability by
agreeing on the DataType implementation, not on something provided out of
the box by a particular HBase version.


> The focus of the discussion initially was around having one physical
> encoding usable by many systems for *interop*. I think the two are
> orthogonal and compatible but we'd need some examples (currently code
> inspection required) to see if they are actually compatible.  (there might
> be some primitive type tagging required with the current o.a.h.h.types
> implementation.)
>

A common set of encoding primitives is provided through the existing Bytes
and OrderedBytes implementations, facilitating interop on a basic level. My
original intention was to make OrderedBytes that single implementation. As
the work progressed, it became clear that a single means of serializing an
int was not sufficient for everyone! The DataType API extends that interop
to a level above the encoding details and allows for interop at an API
level, not just at the encoding level.
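
As a rough sketch of what interop at an API level means, the example below
puts two different int serializations behind one small interface, so the
calling code does not care which encoding is in use. The Codec interface and
both implementations are simplified, hypothetical stand-ins written for this
illustration; they are not the signatures of o.a.h.h.types.DataType.

```java
import java.nio.ByteBuffer;

final class DataTypeSketch {
    // Hypothetical, simplified stand-in for a DataType-style interface.
    interface Codec<T> {
        boolean isOrderPreserving();
        byte[] encode(T value);
        T decode(byte[] bytes);
    }

    // Plain big-endian two's complement: compact and common, but negative
    // values sort after positive ones under byte-wise comparison.
    static final class RawInt implements Codec<Integer> {
        public boolean isOrderPreserving() { return false; }
        public byte[] encode(Integer value) {
            return ByteBuffer.allocate(4).putInt(value).array();
        }
        public Integer decode(byte[] bytes) {
            return ByteBuffer.wrap(bytes).getInt();
        }
    }

    // Same width, but with the sign bit flipped so byte-wise order
    // matches numeric order.
    static final class SortableInt implements Codec<Integer> {
        public boolean isOrderPreserving() { return true; }
        public byte[] encode(Integer value) {
            return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
        }
        public Integer decode(byte[] bytes) {
            return ByteBuffer.wrap(bytes).getInt() ^ Integer.MIN_VALUE;
        }
    }

    // Caller code is written against the interface, not a byte layout.
    static <T> T roundTrip(Codec<T> codec, T value) {
        return codec.decode(codec.encode(value));
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new RawInt(), -7));
        System.out.println(roundTrip(new SortableInt(), -7));
    }
}
```

Two applications that agree on the same Codec implementation can exchange
data even though the bytes differ between implementations.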

> On Mon, May 12, 2014 at 10:32 AM, Stack <stack@duboce.net> wrote:
>
> > Below are some rough notes taken during first 20 minutes of our
> > hackathon/dev meetup last Tuesday (Your secretary had to abandon
> > note-taking for white-boarding duty and failed to pass the baton).
> >
> > There were about ~50 folks in attendance with representatives
> > from a variety of organizations.
> >
> > We started by promoting the items listed on the hackathon
> > meetup page to the white board:
> >
> > http://files.meetup.com/1350316/IMG_20140506_190458.jpg
> >
> > The list of topics up for discussions were:
> >
> > + Agreeing on types among hbase-related projects (phoenix,
> > kite, etc.).
> > + Discussion around colocation of hbase master and meta
> > and the master also doing regionserver services and whether
> > we should retain the current topology (no regions on the
> > master or backup master) for 1.0.
> > + The ongoing 'consensus' work (putting zk usage behind
> > a pluggable interface).
> > + Transactions over HBase (Continuuity work, Xiaomi
> > percolator, and VCNC's work).
> > + Netty the server
> > + HTrace
> > + Multitenancy
> >
> > We started talking types.  It took a while to get going. Below are the
> > raw notes:
> >
> > Static, dynamic or what?
> >
> > Lars Hofhansl: Is hbase a byte store? Should types be in there? If hbase
> > knew about the type then could we exploit it internally.
> >
> > Should HBase know about types? Thought is that HBase could do some
> > perf stuff if it was cognizant of types.  Maybe later do this or
> > start out w/ a few common types first.
> >
> > At least one encoding strategy, work on this first.
> >
> > Make it so we don't design ourselves into a corner.
> >
> > Talking to Enis, Hive & Pig... what do these projects want?  Plug in
> > codec?
> >
> > Kite project, tries to do a layer above byte store rather than in the fs.
> >
> > Encodings are not the same across storage engines for Kite.
> >
> > Does Parquet need to do float or integer?  But that is where it gets its
> > perf advantage.
> >
> > Parquet does not necessarily work inside hbase.
> >
> > Kite as a lib?  That phoenix could use?
> >
> > Could add a phoenix encoding to Kite?  Yes.  But let's agree and then
> > retire the avro encoding.
> >
> > Encoding is separate from schema.
> >
> > Ryan Blue: Don't want to say phoenix is on kite.  Just want to focus on
> > encoding.
> >
> > James Taylor: Phoenix should be under kite, not on top of kite.
> >
> > Ryan: move to common encoding, and a lib to serialize....
> >
> > Jon Hsieh: where this could be used... phoenix encoding for kite... then
> > all could use it writing out.
> >
> > Lars: What you doing at HP for type encoding.
> >
> > HP: Our code is in C.  All byte arrays.  We have a way of doing floats,
> > etc.
> >
> > Ryan: But you want a serializing too?
> >
> > Lars: How we start in on this?
> >
> > Ryan: I sent out a doc. Ryan then presented this doc:
> >
> >
> > https://docs.google.com/a/cloudera.com/document/d/15INOaxyifycpFxvB6xNdoj96JbOpslEywjkaoG8MURE/edit#heading=h.o1cgqtsqgqyg
> >
> > Pick just a few simple types and get agreement on these first?
> >
> > HP: We have seen in the past that this works for a while but then folks
> > figure out that their complex types are slower than they should be and
> > they start coming up w/ their own encodings as workarounds; now you are
> > back to square one.... you need to version it all.
> >
> > Was proposed that we try and unify around a typing strategy that would
> > work as the ONE way to do it in HBase with Phoenix going to try and come
> > up on it first.
> > To be continued up on the HBase dev list.
> >
> > Another proposal that had a lot of nods in favor was that we not require
> > 1.0 and 2.0 to be compatible.
> >
> > Notes ran out at this stage.
> >
> > St.Ack
> >
>
>
>
> --
> // Jonathan Hsieh (shay)
> // HBase Tech Lead, Software Engineer, Cloudera
> // jon@cloudera.com // @jmhsieh
>
