hive-user mailing list archives

From Adam Kawa <kawa.a...@gmail.com>
Subject Re: hive.query.string not reflecting the current query
Date Tue, 03 Dec 2013 23:41:21 GMT
Maybe you can parse the output of the EXPLAIN statement applied to your query
(https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain) or
look for another configuration property (e.g. one saying that the number of
map and reduce tasks is equal to 0, or something similar).
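
To make the EXPLAIN idea concrete, here is a minimal sketch in Python. It
assumes that a MapReduce stage appears in the EXPLAIN text as a line reading
"Map Reduce" under a "Stage: Stage-N" header, and that a query served without
a job shows only a "Fetch Operator" stage. The two sample plans are
hand-written and abridged for illustration, not captured from a real
cluster, and the function names are hypothetical. Check the output of your
own Hive version before relying on this.

```python
import re

# Assumption: each plan section starts with a header such as "Stage: Stage-1".
_STAGE_HEADER = re.compile(r"^\s*Stage:\s*(Stage-\d+)\s*$")


def map_reduce_stages(explain_text):
    """Return the names of stages whose plan contains a 'Map Reduce' line."""
    stages = []
    current = None
    for line in explain_text.splitlines():
        header = _STAGE_HEADER.match(line)
        if header:
            current = header.group(1)
        elif current is not None and line.strip() == "Map Reduce":
            stages.append(current)
            current = None  # count each stage at most once
    return stages


def spawns_map_reduce(explain_text):
    """True if the plan text lists at least one MapReduce stage."""
    return bool(map_reduce_stages(explain_text))


# Hand-written, abridged sample plans (illustrative only).
SAMPLE_FETCH_ONLY = """\
STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: 100
"""

SAMPLE_WITH_MR = """\
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
  Stage: Stage-0
    Fetch Operator
      limit: -1
"""

if __name__ == "__main__":
    print(spawns_map_reduce(SAMPLE_FETCH_ONLY))
    print(spawns_map_reduce(SAMPLE_WITH_MR))
```

Note that this check runs on the client side, before the query is submitted.
It does not by itself tell the RecordReader which query is running, so the
result would still have to be passed along somehow.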


2013/12/3 Petter von Dolwitz (Hem) <petter.von.dolwitz@gmail.com>

> Yes, it seems related. I think the query string is not refreshed when Hive
> decides to run without a MapReduce job. The problem is that I read the
> query string to apply an early filter in the record reader. Is there any
> other known way to detect that no MapReduce job is spawned, so that I
> can work around this issue?
>
> /Petter
>
> On Tuesday, 3 December 2013, Adam Kawa wrote:
>
>> Hmmm?
>>
>> Maybe it is related to the fact that a query like
>> > select * from mytable limit 100;
>> does not start any MapReduce job. It just starts a read operation from
>> HDFS (and communicates with the MetaStore to learn the schema and how
>> to parse the data using the InputFormat and SerDe).
>>
>> For example, if you run a query that has the same functionality (i.e.
>> showing all content of the table by specifying all columns in SELECT)
>> > select column1, column2, ... columnN from mytable limit 100;
>> then a map-only job will be started and maybe (?) hive.query.string will
>> contain this query.
>>
>>
>> 2013/12/3 Petter von Dolwitz (Hem) <petter.von.dolwitz@gmail.com>
>>
>>> Hi,
>>>
>>> I use Hive 0.11 on a five-machine cluster. I am reading the property
>>> hive.query.string from a custom RecordReader (used for reading external
>>> tables).
>>>
>>> If I first invoke a query like
>>>
>>> select * from mytable where mycolumn='myvalue';
>>>
>>> I get the correct query string in this property.
>>>
>>> If I then invoke
>>>
>>> select * from mytable limit 100;
>>>
>>> the property hive.query.string still contains the first query. It seems
>>> like Hive uses local mode for the second query. I don't know if that is
>>> related.
>>>
>>> Does anybody know why the query string is not updated in the second case?
>>>
>>> Thanks,
>>> Petter
>>>
>>
>>
