phoenix-dev mailing list archives

From Andrew Purtell <>
Subject Re: [DISCUSS] Suggestions for Phoenix from HBaseCon Asia notes
Date Tue, 28 Aug 2018 20:37:36 GMT
On Mon, Aug 27, 2018 at 11:03 AM Josh Elser <> wrote:

> 2. Can Phoenix be the de-facto schema for SQL on HBase?
> We've long asserted "if you have to ask how Phoenix serializes data, you
> shouldn't be doing it" (a nod that you have to write lots of code). What if
> we turn that on its head? Could we extract our PDataType serialization,
> composite row-key, column encoding, etc into a minimal API that folks
> with their own itches can use?
> With the growing integrations into Phoenix, we could embrace them by
> providing an API to make what they're doing easier. In the same vein, we
> cement ourselves as a cornerstone of doing it "correctly"

There have been discussions where I work suggesting this would be a
great idea. If data types, row key constructors, and other key and data
serialization concerns were a public API, connectors to Spark or other
systems could use them to generate and consume Phoenix-compatible data.
It improves the integration story all around.
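
To make the idea concrete, here is a minimal sketch of the shape such a row key API might take for a connector. The class name `RowKeySketch` and its methods are hypothetical, and the encoding shown (big-endian ints with the sign bit flipped, zero-byte-terminated UTF-8 strings) is a common order-preserving scheme assumed for illustration, not a statement of Phoenix's actual wire format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Phoenix class: illustrates what a small public
// row key serialization API could look like to an external connector.
final class RowKeySketch {
    // Assumed terminator for variable-length fields.
    private static final byte SEPARATOR = 0x00;

    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // Fixed-width int: big-endian with the sign bit flipped, so that
    // unsigned byte-wise comparison matches signed numeric order.
    RowKeySketch addInt(int value) {
        byte[] b = ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
        out.write(b, 0, b.length);
        return this;
    }

    // Variable-width string: UTF-8 bytes followed by the terminator.
    // Assumes the string itself contains no NUL character.
    RowKeySketch addString(String value) {
        byte[] b = value.getBytes(StandardCharsets.UTF_8);
        out.write(b, 0, b.length);
        out.write(SEPARATOR);
        return this;
    }

    byte[] build() {
        return out.toByteArray();
    }

    // Unsigned lexicographic comparison, the order in which HBase sorts row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }
}
```

A connector would call something like `new RowKeySketch().addString("host-1").addInt(42).build()` and write the resulting bytes as the HBase row key. The point of a shared API is that the writer and Phoenix agree on the bytes without the connector reimplementing the encoding.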

Another refactoring idea I've heard is exposing an API for generating
query plans without needing the SQL parser. A public API for
programmatically building query plans could be used by connectors to Spark
or other systems when pushing down parts of a parallelized or federated
query to Phoenix data sources, avoiding hacky SQL generation, string
mangling, and (re)parsing overhead. This more or less describes Calcite's
raison d'être. As long as Phoenix does not embed Calcite as its query
planner, which it currently does not, a public API for programmatic query
plan construction is independently useful with the current implementation.
If Phoenix were to embed Calcite as its query planner, you'd probably get
a ton of reuse among internal and external users of the Calcite APIs. I'd
think whichever option you choose would be informed by the suitability (or
not) of embedding Calcite as Phoenix's query planner, and how soon that
might be expected to be feature complete. For what it's worth. Again, this
extends the possibilities for integration.
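
As a rough illustration of programmatic plan construction, the sketch below builds a plan description from structured calls instead of a SQL string. Every type here (`ScanPlan`, `Predicate`, the builder methods) is hypothetical and exists in neither Phoenix nor Calcite; it only shows how a connector could hand over a pushed-down projection and filter without generating and reparsing SQL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical types for illustration; none of these exist in Phoenix.
final class Predicate {
    enum Op { EQ, LT, GT }

    final String column;
    final Op op;
    final Object value;

    Predicate(String column, Op op, Object value) {
        this.column = column;
        this.op = op;
        this.value = value;
    }
}

// An immutable description of a scan with a projection and filters.
final class ScanPlan {
    final String table;
    final List<String> columns;
    final List<Predicate> filters;

    private ScanPlan(String table, List<String> columns, List<Predicate> filters) {
        this.table = table;
        this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
        this.filters = Collections.unmodifiableList(new ArrayList<>(filters));
    }

    static Builder scan(String table) {
        return new Builder(table);
    }

    static final class Builder {
        private final String table;
        private final List<String> columns = new ArrayList<>();
        private final List<Predicate> filters = new ArrayList<>();

        private Builder(String table) {
            this.table = table;
        }

        // Projection pushed down by the caller.
        Builder select(String... names) {
            columns.addAll(Arrays.asList(names));
            return this;
        }

        // Filter pushed down by the caller; values stay typed, never quoted into SQL.
        Builder where(String column, Predicate.Op op, Object value) {
            filters.add(new Predicate(column, op, value));
            return this;
        }

        ScanPlan build() {
            return new ScanPlan(table, columns, filters);
        }
    }
}
```

A connector could then write `ScanPlan.scan("EVENTS").select("HOST", "TS").where("TS", Predicate.Op.GT, 1535000000000L).build()` and pass the result to the planner. Because values remain typed objects, there is no string escaping or reparsing step to get wrong.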

> 3. Better recommendations to users to not attempt certain queries.
> We definitively know that there are certain types of queries that
> Phoenix cannot support well (compared to optimal Phoenix use-cases).
> Users very commonly fall into such pitfalls on their own and this leaves
> a bad taste in their mouth (thinking that the product "stinks").
> Can we do a better job of telling the user when and why it happened?
> What would such a user-interaction model look like? Can we supplement
> the "why" with instructions of what to do differently (even if in the
> abstract)?
> 4. Phoenix-Calcite
> This was mentioned as a "nice to have". From what I understand, there
> was nothing explicitly wrong with the implementation or approach, just
> that it was a massive undertaking to continue with little immediate
> gain. Would this be a boon for us to try to continue in some form? Are
> there steps we can take that would help push us along the right path?
> Anyways, I'd love to hear everyone's thoughts. While the concerns were
> raised at HBaseCon Asia, the suggestions that accompany them here are
> largely mine ;). Feel free to break them out into their own threads if
> you think that would be better (or say that you disagree with me --
> that's cool too)!
> - Josh

Best regards,

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
