calcite-dev mailing list archives

From Maryann Xue <maryann....@gmail.com>
Subject Re: Collation meets relational algebra
Date Fri, 31 Jul 2015 03:32:07 GMT
Thanks, Julian, for taking the time to sort out all these requirements and
rethink the model!
Thank you Milinda! Really appreciate your quick response to the issue.

On Thu, Jul 30, 2015 at 4:57 PM, Julian Hyde <jhyde@apache.org> wrote:

> There are a few issues in play regarding collations (783, 784, 793; see
> links below) and they seem to be overlapping. Maryann and Milinda have been
> at odds with each other (in the politest possible way!)
>
> The cause is that they are both doing very interesting new work using
> collation:
> * Maryann is optimizing Phoenix plans to use secondary indexes. These are
> tables that are project-sort materializations of a base table, itself
> sorted.
> * Milinda is planning Samza streaming-aggregation queries. A plan can only
> be found if you know that the stream is sorted on one of the aggregation
> keys, usually a time column.
>
> I spoke with Maryann about this today. I think that logical plans should
> not have a sort order:
> * In 783 and 784, I think I was wrong to allow logical RelNodes
> (LogicalProject and LogicalAggregate) to have collations. Because they are
> logical, they are inherently un-sorted. (But they may be based on a table,
> say an ArrayTable, that does have a sort order.)
> * In 793, Maryann was right to say that we should not bake in the
> collation that a plan *happens to have* when the SQL is first translated,
> because trying to find a physical plan with the same collation restricts
> our options.
>
> But SQL ASTs should have a sort order (if the top node is an ORDER BY
> clause, or if a table referenced in the FROM clause is a stream) and
> physical RelNodes should also have a sort order.
>
> And Milinda’s logical plans need a concept similar to sorting. Maybe a
> piece of metadata that this RelNode *could be sorted by X, Y if desired*.
> Any table can, of course, be re-sorted into any order you like, but a
> stream, which is infinite, can only be re-sorted to an order that does not
> conflict with the order of the incoming data.
>
> I still need to roll up my sleeves and help these patient developers
> (especially Milinda) get something working, but I hope it helps to have a
> general direction.
>
> Julian
>
> * https://issues.apache.org/jira/browse/CALCITE-783 Infer collation of
> Project using monotonicity
> * https://issues.apache.org/jira/browse/CALCITE-784 LogicalAggregate's
> create method discards any collation traits from input
> * https://issues.apache.org/jira/browse/CALCITE-793 The compiler asks for
> unnecessary collation trait on plan with materialized view
> * https://issues.apache.org/jira/browse/CALCITE-825 Allow user to specify
> sort order of an ArrayTable
>
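
The remark in the quoted message that a table can be re-sorted into any order, while a stream can only be re-sorted to an order that does not conflict with the incoming data, can be illustrated with a small sketch. This is not Calcite's API: the function name `can_resort`, the representation of a sort order as a tuple of column names, and the rule used for "does not conflict" (one order must be a prefix of the other) are all assumptions made for illustration, and sort direction is ignored.

```python
from typing import Sequence


def can_resort(actual: Sequence[str], required: Sequence[str], is_stream: bool) -> bool:
    """Hypothetical check: could data sorted by `actual` be delivered sorted by `required`?

    Assumed rule (not taken from Calcite):
    - A finite table can always be re-sorted into any order.
    - A stream is infinite, so the required order must agree with the
      incoming order on their common prefix. Either `required` is a prefix
      of `actual` (the stream is already sorted that way), or `actual` is a
      prefix of `required` (only rows that tie on the incoming keys need
      to be buffered and ordered).
    """
    if not is_stream:
        return True
    if not required:
        # No order requested, so nothing can conflict.
        return True
    if not actual:
        # A stream with no known order cannot be sorted without reading it all.
        return False
    n = min(len(actual), len(required))
    return list(actual[:n]) == list(required[:n])


if __name__ == "__main__":
    # A table sorted on rowtime can be re-sorted on productId.
    print(can_resort(("rowtime",), ("productId",), is_stream=False))
    # A stream sorted on rowtime cannot.
    print(can_resort(("rowtime",), ("productId",), is_stream=True))
    # Refining the stream's order within equal rowtime values is allowed here.
    print(can_resort(("rowtime",), ("rowtime", "productId"), is_stream=True))
```

Under these assumptions, a streaming aggregation keyed on `rowtime` would pass the check, while one that needs the stream ordered on an unrelated column would not.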
