hive-user mailing list archives

From Ritesh Gautam <grites...@gmail.com>
Subject Re: Parsing Hive Query to get table names and column names
Date Thu, 06 Nov 2014 15:28:29 GMT
Thank You, Alok and Devopam, it finally worked.
I was able to parse the query the same way Hive parses it, by taking
guidance from the Hive codebase.

Regards,
Ritesh
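
The thread does not show the code that finally worked. As a rough sketch of
the general idea only: once a query has been parsed into a tree, table and
column names can be collected by walking that tree and picking out the nodes
that mark table and column references. The sketch below assumes the tree
labels them TOK_TABNAME and TOK_TABLE_OR_COL, as Hive's grammar does. The
Node class, AstNameCollector, and the hand-built sampleTree() are
hypothetical stand-ins so the example runs without Hive on the classpath; in
real code the tree would come from Hive's own parser in
org.apache.hadoop.hive.ql.parse, which is not used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parse-tree node (not Hive's real node class).
final class Node {
    final String text;
    final List<Node> children;

    Node(String text, Node... children) {
        this.text = text;
        this.children = Arrays.asList(children);
    }
}

final class AstNameCollector {
    // Walks the tree depth-first. For every node whose text equals tokenName,
    // joins the text of its direct children with "." and records the result,
    // so a qualified reference with children (db, tbl) becomes "db.tbl".
    static List<String> collect(Node root, String tokenName) {
        List<String> names = new ArrayList<>();
        walk(root, tokenName, names);
        return names;
    }

    private static void walk(Node node, String tokenName, List<String> out) {
        if (node.text.equals(tokenName) && !node.children.isEmpty()) {
            List<String> parts = new ArrayList<>();
            for (Node child : node.children) {
                parts.add(child.text);
            }
            out.add(String.join(".", parts));
        }
        for (Node child : node.children) {
            walk(child, tokenName, out);
        }
    }

    // Hand-built tree for "SELECT x FROM abc". The shape is an assumption
    // modelled on the kind of tree Hive's parser produces; it is not parser output.
    static Node sampleTree() {
        return new Node("TOK_QUERY",
            new Node("TOK_FROM",
                new Node("TOK_TABREF",
                    new Node("TOK_TABNAME", new Node("abc")))),
            new Node("TOK_INSERT",
                new Node("TOK_DESTINATION",
                    new Node("TOK_DIR", new Node("TOK_TMP_FILE"))),
                new Node("TOK_SELECT",
                    new Node("TOK_SELEXPR",
                        new Node("TOK_TABLE_OR_COL", new Node("x"))))));
    }

    public static void main(String[] args) {
        Node tree = sampleTree();
        System.out.println("tables:  " + collect(tree, "TOK_TABNAME"));
        System.out.println("columns: " + collect(tree, "TOK_TABLE_OR_COL"));
    }
}
```

For the sample tree this collects the table abc and the column x. A real
query needs more care than this sketch takes: aliases, subqueries, and
qualified column references would have to be handled separately.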

On Thu, Nov 6, 2014 at 1:23 PM, Alok Kumar <alokawi@gmail.com> wrote:

> btw, what error are you getting with ANTLR or HiveParser? Is a
> dependency class file missing?
>
> Thanks
> Alok
>
> On Thu, Nov 6, 2014 at 12:59 PM, Alok Kumar <alokawi@gmail.com> wrote:
>
>> As Devopam suggested, do it the right way from the start.
>>
>> PS: if the queries are written manually, you could list the tables
>> manually too. :)
>>
>> Thanks
>> Alok
>>
>> On Wed, Nov 5, 2014 at 6:44 PM, Devopam Mittra <devopam@gmail.com> wrote:
>>
>>> hi Ritesh,
>>> Please reconsider your entire design; it is easier to do that now than
>>> after it has become unmanageable.
>>>
>>> If unavoidable, please use a metadata based approach for pre-calculating
>>> and keeping the list of tables that you need to refresh prior to firing a
>>> query on them (?)
>>>
>>> Hope it helps.
>>>
>>> regards
>>> Devopam
>>>
>>>
>>> On Wed, Nov 5, 2014 at 6:20 PM, Ritesh Gautam <gritesh12@gmail.com>
>>> wrote:
>>>
>>>> hey Alok,
>>>>  I want to do this so that I can refresh the dependent tables before I
>>>> run my query, so that my query would now run on the current data.
>>>>
>>>> The queries are written manually, so the only way to do this will
>>>> be to parse the query.
>>>> Isn't Hive somewhat different from SQL? I have already tried using
>>>> JsqlParser, but in some cases it doesn't work.
>>>>
>>>> Thanks,
>>>> Regards,
>>>> Ritesh
>>>>
>>>> On Wed, Nov 5, 2014 at 6:05 PM, Alok Kumar <alokawi@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Why would you want this in the first place? (just curious)
>>>>>
>>>>> A few thoughts:
>>>>> a) Try to get it from the piece of code where these queries are being
>>>>> generated [if they are not static in the code!]; that would be the best
>>>>> place to get it.
>>>>> b) [if you don't have access to a)] Try
>>>>> http://zql.sourceforge.net/ ; it should be easier. Also check the
>>>>> licence.
>>>>>
>>>>> Thanks
>>>>> Alok
>>>>>
>>>>> On Wed, Nov 5, 2014 at 5:47 PM, Ritesh Gautam <gritesh12@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>         I am trying to parse Hive queries so that I can get the names
>>>>>> of the tables on which a query depends.
>>>>>>
>>>>>> I have tried the following:
>>>>>> 1) Downloaded the grammar and used ANTLR to generate the lexer and
>>>>>> parser, but I get errors like these when I try to build it:
>>>>>> ......
>>>>>>   symbol:   class RecognitionException
>>>>>>   location: class HiveLexer
>>>>>> HiveLexer.java:2432: error: cannot find symbol
>>>>>> public final void mKW_ESCAPED() throws RecognitionException {
>>>>>>                                        ^
>>>>>>   symbol:   class RecognitionException
>>>>>>   location: class HiveLexer
>>>>>> HiveLexer.java:2453: error: cannot find symbol
>>>>>> public final void mKW_COLLECTION() throws RecognitionException {
>>>>>>                                           ^
>>>>>>   symbol:   class RecognitionException
>>>>>>   location: class HiveLexer
>>>>>> 100 errors
>>>>>>
>>>>>> 2) I have tried using org.apache.hadoop.hive.ql.parse but I am stuck
>>>>>> at this point:
>>>>>>
>>>>>>         ANTLRStringStream input = new ANTLRStringStream("SELECT x FROM abc");
>>>>>>         HiveLexer lexer = new HiveLexer(input);
>>>>>>         TokenStream tokens = new CommonTokenStream(lexer);
>>>>>>         HiveParser parser = new HiveParser(tokens);
>>>>>>         System.out.println(parser.statement());
>>>>>>
>>>>>> *How should I proceed from here to extract the table names and column
>>>>>> names?*
>>>>>> *And is the way I am doing it correct?*
>>>>>>
>>>>>> Thank You.
>>>>>> Regards,
>>>>>> Ritesh
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Alok Kumar
>>>>> http://sharepointorange.blogspot.in/
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Devopam Mittra
>>> Life and Relations are not binary
>>>
>>
>>
>>
>>
>
>
> --
> Alok Kumar
> BigData Developer | DataRPM
> Email : alokawi@gmail.com
> http://sharepointorange.blogspot.in/
> http://www.linkedin.com/in/alokawi
>
