asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Question on language translation for Algebricks
Date Sun, 14 Feb 2016 22:51:35 GMT
PS (Sandeep): There's an important point below that you shouldn't miss if
you look at the Hivesterix code. If you find its approach puzzling, note
that it was designed to add only what was needed to run Hive queries on
Hyracks, so that it could potentially be kept in upper-level sync with
Hive itself.  As a result, it was not done as a "Hive lookalike done
right" - it was done as a "Hive lookalike that lets the existing Hive
code do as much of the initial work as possible".


On 2/14/16 2:48 PM, Yingyi Bu wrote:
> Hi Sandeep,
>
> Here is the Hivesterix codebase in the Apache source tree:
> https://github.com/apache/incubator-asterixdb-hyracks/tree/fullstack-0.2.13
>
> We maintained Hivesterix up to hyracks-0.2.13, but stopped maintaining it
> after that release. Mike has explained the reason.
>
>>> Furthermore, none of these rewrite rules seem to be SQL-specific.  Are there
>>> any SQL-specific rewrite rules which were added?
> That's exactly the motivation of the Algebricks project --- most rules that
> a typical SQL compiler implements are not SQL-specific :-)
> However, there are indeed a few Hive-specific rules that I added in order to
> get the Hive-on-Algebricks plans to work efficiently:
> https://github.com/apache/incubator-asterixdb-hyracks/tree/fullstack-0.2.13/hivesterix/hivesterix-optimizer/src/main/java/edu/uci/ics/hivesterix/optimizer/rules
>
> The Hivesterix implementation first translates a Hive-optimized MR plan
> into an Algebricks logical plan, then lets Algebricks do further
> optimizations, and finally executes the resulting Hyracks job on the Hyracks
> runtime.
>
> Best,
> Yingyi
>
>
>
> On Sun, Feb 14, 2016 at 2:26 PM, Mike Carey <dtabass@gmail.com> wrote:
>
>> Sandeep,
>>
>> Just to chime in as well:
>>
>>   - VXQuery is indeed the best example to look at, probably, to understand
>> the AsterixDB/Algebricks separation.
>>
>>   - Hivesterix was built by Yingyi Bu (who'll see this) early on - it drove
>> the separation idea, actually, but we made a decision not to try and
>> maintain it.  It was intended to provide a third/different proof of
>> separation and applicability of the approach, from a research standpoint,
>> but doesn't have additional value to offer the world (since Hive itself is
>> a moving target and Hive on Tez now provides the non-MapReduce-runtime
>> value that Hivesterix initially offered).  Yingyi would probably be happy to
>> share the code base with you if you wanted to look at it for any reason,
>> but the only things in the Apache AsterixDB (incubating) project are things
>> deemed worthy of engineering/maintenance work.
>>
>> Hope that helps too!
>>
>> Cheers,
>> Mike
>>
>>
>>
>> On 2/14/16 11:47 AM, Till Westmann wrote:
>>
>>> Hi Sandeep,
>>>
>>> Apache VXQuery, the XQuery implementation mentioned in the SoCC paper, is
>>> a separate project [1].
>>>
>>> Specifically to your questions:
>>>
>>> 1) There is no need to implement other projects that use Algebricks
>>> inside of the AsterixDB source tree (as VXQuery shows).
>>>
>>> 2) It is clearly easier to combine a Java parser and plan tree generator
>>> with Algebricks, but there's no reason why one couldn't connect to other
>>> languages (e.g. by using a text-based intermediate format between the
>>> parser and the optimizer and between the plan generator and the runtime).
>>>
>>> 3) The reason for the different sets of rules is that some are
>>> language-agnostic and some are language-specific. As you can see in Figure 2
>>> of the paper, a language implementation has to provide language-specific
>>> rules to augment the language-agnostic rules provided by Algebricks.
>>> Specifically, the rules in AsterixDB's asterix-algebra project augment
>>> the rules in Algebricks to support AsterixDB's query language AQL.
>>>
>>> Hope this helps,
>>> Till
>>>
>>> [1] http://vxquery.apache.org
>>>
>>> On 14 Feb 2016, at 11:02, Sandeep Joshi wrote:
>>>
>>>> I had some questions about the process of mapping other query languages to
>>>> Algebricks.  The Sigmod SoCC 15 paper mentions that two languages, XQuery
>>>> and HiveQL, have been mapped to Algebricks, but the implementation is
>>>> not found in either of the two repositories released under Apache.
>>>>
>>>> I found Hivesterix and Pregelix under
>>>> https://github.com/madhusudancs/hyracks/tree/master/fullstack/hivesterix
>>>>
>>>> I couldn't find the XQuery to Algebricks translator anywhere. Has this
>>>> been released ?
>>>>
>>>> What is the reason these language translators are not part of the Apache
>>>> repository ?
>>>>
>>>> The Apache repositories contain the language translators for AQL and SQL.
>>>> After comparing the implementations for Hivesterix and SQL/AQL, here are
>>>> some questions
>>>>
>>>> 1) Does one have to integrate the parser for a new language within the
>>>> Apache AsterixDB source tree, or can one build the Algebricks translator
>>>> outside the Apache tree and invoke the Hyracks job execution engine
>>>> directly, as is being done in the hivesterix implementation seen here.
>>>>
>>>>
>>>> https://github.com/madhusudancs/hyracks/blob/36bb1021b17b736aa1648bd439e1246ae419aa89/fullstack/hivesterix/hivesterix-dist/src/main/java/edu/uci/ics/hivesterix/runtime/exec/HyracksExecutionEngine.java
>>>>
>>>> 2) When a query language is converted to Algebricks, the ICompilerFactory
>>>> converts one plan tree to another by calling Visitor::visit() on each
>>>> node
>>>> of the source query.  Does this imply that the plan tree for the source
>>>> language can only be constructed in Java ?  Would it be
>>>> difficult/impossible to integrate a parser and plan tree generator which
>>>> was written in any language into Algebricks ?
>>>>
>>>> 3) In the Apache repositories, the query rewrite rules which are used
>>>> during optimization are found under two different repositories.
>>>>
>>>> One in main asterixdb repository
>>>>
>>>>
>>>> https://github.com/apache/incubator-asterixdb/tree/master/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules
>>>>
>>>> and the other in the hyracks repository
>>>>
>>>>
>>>> https://github.com/apache/incubator-asterixdb-hyracks/tree/master/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules
>>>>
>>>> Are these two sets of rules characteristically different or is this
>>>> duplication just an artifact of rapid prototyping ?
>>>>
>>>> Furthermore, none of these rewrite rules seem to be SQL-specific.  Are
>>>> there any SQL-specific rewrite rules which were added ?
>>>>
>>>> -Sandeep
>>>>
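
As a rough illustration of Till's point 2 in the thread above (connecting a
parser written in another language through a text-based intermediate format),
the sketch below reads a made-up, line-oriented plan format into a small Java
plan tree. Everything in it is an assumption for illustration: the format, the
class name TextPlanReaderSketch, and the PlanNode type are hypothetical and
are not part of Algebricks or Hyracks.

```java
public class TextPlanReaderSketch {

    /** Hypothetical plan node: one operator with an optional single input. */
    public static final class PlanNode {
        public final String operator;
        public final String argument;
        public final PlanNode input; // null for the leaf (data source)

        public PlanNode(String operator, String argument, PlanNode input) {
            this.operator = operator;
            this.argument = argument;
            this.input = input;
        }

        @Override
        public String toString() {
            String self = operator + "(" + argument + ")";
            return input == null ? self : self + " <- " + input;
        }
    }

    /**
     * Parses an assumed format: one "operator argument" pair per line,
     * leaf first, root last. Blank lines and lines starting with '#'
     * are skipped.
     */
    public static PlanNode parse(String text) {
        PlanNode current = null;
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // Split into the operator name and the rest of the line.
            String[] parts = trimmed.split("\\s+", 2);
            String argument = parts.length > 1 ? parts[1] : "";
            // Each new operator consumes the plan built so far as its input.
            current = new PlanNode(parts[0], argument, current);
        }
        if (current == null) {
            throw new IllegalArgumentException("plan text contains no operators");
        }
        return current;
    }

    public static void main(String[] args) {
        String text = "# could be emitted by a parser written in any language\n"
                + "scan orders\n"
                + "select total > 100\n"
                + "project id\n";
        System.out.println(parse(text));
        // prints: project(id) <- select(total > 100) <- scan(orders)
    }
}
```

A real integration would then walk such a tree to build an Algebricks logical
plan; that step is omitted here because it depends on the Algebricks API.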


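Till's point 3 and Yingyi's reply both describe language-agnostic rules being
augmented by a few language-specific ones. A minimal sketch of that layering
follows, assuming a toy plan represented as a list of operator names. The
RewriteRule interface, the two example rules, and the operator names are all
hypothetical and do not mirror the actual Algebricks rule interfaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RuleLayeringSketch {

    /** Hypothetical rule interface (NOT the Algebricks API). */
    public interface RewriteRule {
        /** Returns the rewritten plan, or an equal list if nothing applies. */
        List<String> apply(List<String> plan);
    }

    /** Language-agnostic example: drop a filter whose condition is always true. */
    public static final RewriteRule REMOVE_TRIVIAL_SELECT = plan -> {
        List<String> out = new ArrayList<>(plan);
        out.removeIf(op -> op.equals("select(true)"));
        return out;
    };

    /** Language-specific example: expand a made-up frontend operator. */
    public static final RewriteRule EXPAND_FRONTEND_TOPK = plan -> {
        List<String> out = new ArrayList<>();
        for (String op : plan) {
            if (op.equals("frontend-topk")) {
                out.add("order");
                out.add("limit");
            } else {
                out.add(op);
            }
        }
        return out;
    };

    /** The language implementation supplies the shared rules plus its own. */
    public static List<RewriteRule> ruleSet() {
        List<RewriteRule> rules = new ArrayList<>();
        rules.add(REMOVE_TRIVIAL_SELECT); // shared, language-agnostic layer
        rules.add(EXPAND_FRONTEND_TOPK);  // language-specific layer
        return rules;
    }

    /** Applies all rules repeatedly until a full pass changes nothing. */
    public static List<String> optimize(List<String> plan, List<RewriteRule> rules) {
        List<String> current = plan;
        boolean changed = true;
        while (changed) {
            changed = false;
            for (RewriteRule rule : rules) {
                List<String> next = rule.apply(current);
                if (!next.equals(current)) {
                    current = next;
                    changed = true;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        List<String> plan = Arrays.asList("scan", "select(true)", "frontend-topk", "write");
        System.out.println(optimize(plan, ruleSet()));
        // prints: [scan, order, limit, write]
    }
}
```

The point of the sketch is only that the optimizer loop does not care which
layer a rule came from; the language implementation decides what goes into the
combined rule set.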