hive-dev mailing list archives

From "Thiruvel Thirumoolan (JIRA)" <>
Subject [jira] [Commented] (HIVE-2439) Upgrade antlr version to 3.4
Date Mon, 03 Dec 2012 22:07:59 GMT


Thiruvel Thirumoolan commented on HIVE-2439:

Took a shot at this one on branch 0.9. Faced the same problems as Ashutosh, but I was able
to compile cleanly using antlr 3.2 and 3.3 after I cleaned up the ivy cache or removed all antlr
directories. The change in the parse tree happens with 3.3, and the release notes don't explain it.

It looks like adding a token to one of the grammar rules gives a consistent tree across 3.2
and 3.3/3.4. I have asked the antlr group for clarification and hope to get something working.
If my approach is correct, I will try to post a patch for branch 0.9 based on Ashutosh's
previous patch. Below is the email I sent, along with links to my experiments.
Any thoughts or feedback are welcome.


On 12/3/12 6:11 PM, "Thiruvel Thirumoolan" <*> wrote:

Hi Jim,

Thanks for your response. I guess the formatting was misleading; I have
reformatted it, which should help.

After some poking around, a small change to the grammar provides a
consistent tree between 3.2 and 3.3. Is this the original mistake we had?

Suspected rule:

	| KW_TABLE tableOrPartition
		-> ^(tableOrPartition)

Modified rule:

	| KW_TABLE tableOrPartition
		-> ^(TOK_TAB_OR_PART tableOrPartition)

After adding a token to the rewrite rule, I was able to see a consistent
tree. I will obviously have to change the AST parsing code in Hive, which
is a little involved. But is this the bug in the grammar? I could not find
any incompatible change in the ANTLR 3.3 release notes regarding this [1]
(only a debug-related incompatible change).

[1] -
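
To make the intent of the modified rule concrete, here is a minimal sketch in Python. It is a toy tree model, not ANTLR's API: the Node class and wrap_with_token helper are hypothetical names introduced only for illustration, and the sample subtree merely reuses token names from this thread. It assumes the point of the change is that the subtree always hangs under a fixed TOK_TAB_OR_PART root, so code walking the AST can locate it by that token.

```python
class Node:
    """Toy AST node (hypothetical; stands in for an ANTLR tree node)."""

    def __init__(self, text, children=None):
        self.text = text
        self.children = list(children or [])

    def to_string_tree(self):
        # LISP-style dump: leaves print as text, parents as (root child ...).
        if not self.children:
            return self.text
        kids = " ".join(child.to_string_tree() for child in self.children)
        return f"({self.text} {kids})"


def wrap_with_token(token_name, subtree):
    """Model of -> ^(TOKEN rule): a fixed root with the subtree as its only child."""
    return Node(token_name, [subtree])


# Hypothetical subtree standing in for whatever tableOrPartition returns.
subtree = Node("TOK_PARTVAL", [Node("DIM_1"), Node("'A'")])
wrapped = wrap_with_token("TOK_TAB_OR_PART", subtree)
print(wrapped.to_string_tree())
```

With an explicit root, the shape of the result no longer depends on how a bare rule reference is promoted to a root, which is where the versions appear to disagree.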


On 12/3/12 7:43 AM, "Jim Idle" <*> wrote:

With just a quick glance at your sample grammar, I think that your issue is
only that some of your rules are using rewrite rules ( -> ) but they are
only being used on, say, 1 out of 2 alts. IIRC, if you use rewrite rules on
one alt of a rule, you must use them on all the others too.


On Fri, Nov 30, 2012 at 10:56 PM, Thiruvel Thirumoolan <*> wrote:


I work on Apache Hive, which currently uses antlr 3.0.1. We would like to
upgrade to antlr 3.4 so it's easy to work with other Apache projects on
Hadoop that use antlr 3.4. We found that the parse tree generated from
Hive.g [1] differs between 3.0.1/3.1/3.2 and 3.3/3.4.

I have stripped down the lengthy grammar and created a smaller version
(Insert.g [2]). I have pushed a small mvn v3 project that uses ANTLR the
way Hive uses it. Here is the tree difference; the entire output is on
github. One can run "mvn test" to reproduce it.

Antlr 3.0.1/3.1/3.2:

TOK_PARTVAL( DIM_1)( 'A'))( TOK_PARTVAL( DIM_2)( 'B')))))

Antlr 3.3/3.4:


Are we missing something in the grammar or is this a bug addressed in
I am afraid we can't move to v4 as that would mean moving all other
projects to v4. Are there any workarounds that we can use with antlr
3.4 to ensure a similar tree is generated?
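
When comparing tree dumps from two ANTLR versions, it can help to locate the first token at which they diverge. The following is a rough sketch in Python. The tokenize and first_difference helpers are hypothetical, and the sample strings are made-up dumps, not the real output of either version.

```python
import re
from itertools import zip_longest


def tokenize(dump):
    """Split a LISP-style tree dump into parens, quoted literals, and names."""
    return re.findall(r"\(|\)|'[^']*'|[^\s()]+", dump)


def first_difference(dump_a, dump_b):
    """Return (index, token_a, token_b) at the first mismatch, or None if equal."""
    pairs = zip_longest(tokenize(dump_a), tokenize(dump_b))
    for index, (tok_a, tok_b) in enumerate(pairs):
        if tok_a != tok_b:
            return index, tok_a, tok_b
    return None


# Made-up dumps: the second one lacks a nesting level.
print(first_difference("(A (B c))", "(A B c)"))
```

A missing token on one side shows up as None, so a dump that is a strict prefix of the other is still reported as a difference.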

Any help is greatly appreciated.

Thank You!

[1] -
[2] -

> Upgrade antlr version to 3.4
> ----------------------------
>                 Key: HIVE-2439
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Ashutosh Chauhan
>         Attachments: hive-2439_incomplete.patch
> Upgrade antlr version to 3.4
