hive-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Predicates for 'like' and 'between' operators to custom storage handler.
Date Thu, 05 May 2016 12:58:38 GMT
Right. What is the underlying Hive table format? Is it Parquet, Avro, ORC, ...?

Also, do you store your time as raw time in the Hive table?

For example, this is how I store a timestamp in an ORC table:

TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(TransactionDate,'dd/MM/yyyy'),'yyyy-MM-dd'))
AS TransactionDate
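
For readers working from the Java side, here is a minimal sketch of the same normalisation in plain Java. It only mirrors what the expression above does (parse a dd/MM/yyyy string, emit yyyy-MM-dd). It is not Hive code, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Hive: mirrors the dd/MM/yyyy -> yyyy-MM-dd
// conversion performed by the HiveQL expression above.
class TransactionDateNormalizer {
    // Input pattern taken from the HiveQL expression.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("dd/MM/yyyy");

    // Returns the date as an ISO yyyy-MM-dd string.
    // Throws DateTimeParseException if the input does not match dd/MM/yyyy.
    static String normalize(String raw) {
        return LocalDate.parse(raw, INPUT).format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    public static void main(String[] args) {
        System.out.println(normalize("05/05/2016"));
    }
}
```

Storing the value in yyyy-MM-dd form means plain comparison operators order the dates correctly.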


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 5 May 2016 at 13:25, Amey Barve <ameybarve15@gmail.com> wrote:

> Hi,
>
> Do you have the equivalent of that operation in pure SQL?
> ---> This is my Hive query: select count(*) from u_data where unixtime
> like '%888904884%'
> The query evaluates and the results are correct. The point is that Hive does
> not pass the LIKE operator to the custom storage handler during predicate
> pushdown. I am mapping it to Hive's UDFLike class.
>
> Also, have you tried the Spark query tool with the Hive table?
> ---> *No*.
>
> I gather you are doing this through Java?
> ---> *YES*.
>
> Has anybody tried mapping operators other than '=', '!=', '<', '<=', '>'
> and '>='?
>
> Regards,
> Amey
>
> On Thu, May 5, 2016 at 5:44 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
>
>> Hi,
>>
>> Do you have the equivalent of that operation in pure SQL? Also, have you
>> tried the Spark query tool with the Hive table?
>>
>> I gather you are doing this through Java?
>>
>> Dr Mich Talebzadeh
>>
>> On 5 May 2016 at 13:10, Amey Barve <ameybarve15@gmail.com> wrote:
>>
>>> Thanks Mich,
>>>
>>> It does work, but for operators other than '=', '!=', '<', '<=', '>'
>>> and '>=', my custom storage handler code gets a null expression.
>>>
>>> final String expression = conf.get(TableScanDesc.FILTER_EXPR_CONF_STR);
>>> // expression is null for a Hive query that uses the LIKE operator
>>>
>>> Why does the above call return null for a Hive query that uses the LIKE
>>> operator? I need the LIKE operator passed to my custom storage handler
>>> for predicate pushdown.
>>>
>>> Regards,
>>> Amey
>>>
>>> On Thu, May 5, 2016 at 5:30 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
>>>
>>>> In a normal SQL query in Hive 2, the LIKE predicate works fine.
>>>> Case in point: in a table of 1 billion rows with a column random_string of
>>>> type varchar(50), I have one row that satisfies the following:
>>>>
>>>>
>>>> +-----------+------------------+------------------+-------------------+-----------------------------------------------------+-----------------+----------------+--+
>>>> | dummy.id  | dummy.clustered  | dummy.scattered  | dummy.randomised  |                 dummy.random_string                 | dummy.small_vc  | dummy.padding  |
>>>> +-----------+------------------+------------------+-------------------+-----------------------------------------------------+-----------------+----------------+--+
>>>> | 1         | 0                | 0                | 63                | rMLTDXxxqXOZnqYRJwInlGfGBTxNkAszBGEUGELqTSRnFjRGbi  |          1      | xxxxxxxxxx     |
>>>> | 2         | 0                | 1                | 926               | UEDJsfIgoYqwreSuuvjIcPZarpxMdCthpDCsgPlJfvIiylLiBS  |          2      | xxxxxxxxxx     |
>>>>
>>>> Now let us try to select that row with LIKE predicate:
>>>>
>>>> 0: jdbc:hive2://rhes564:10010/default> select count(1) from dummy where
>>>> random_string like 'rMLTDXxxqXOZnqYRJ%';
>>>>
>>>> INFO  :
>>>> Query Hive on Spark job[0] stages:
>>>> INFO  : 0
>>>> INFO  : 1
>>>> INFO  :
>>>> Status: Running (Hive on Spark job[0])
>>>>
>>>> INFO  : Completed executing
>>>> command(queryId=hduser_20160505125700_cbc415b6-91bb-4ed6-95e4-d177e12988f6);
>>>> Time taken: 153.544 seconds
>>>> INFO  : OK
>>>> +-----+--+
>>>> | c0  |
>>>> +-----+--+
>>>> | 1   |
>>>> +-----+--+
>>>> 1 row selected (153.959 seconds)
>>>>
>>>> So it does work
>>>>
>>>> HTH
>>>>
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> On 5 May 2016 at 11:53, Amey Barve <ameybarve15@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I have implemented a custom storage handler and am able to get predicates
>>>>> from Hive for the '=', '!=', '<', '<=', '>' and '>=' operators.
>>>>> But I cannot get predicates from Hive for the 'like' and 'between' operators.
>>>>>
>>>>> Here's my code:
>>>>>
>>>>> final String expression = conf.get(TableScanDesc.FILTER_EXPR_CONF_STR);
>>>>>
>>>>> Here, expression remains null for the 'like' and 'between' operators but
>>>>> is not null for the operators listed above.
>>>>>
>>>>> Does Hive not give predicates for the 'like' and 'between' operators to a
>>>>> custom storage handler?
>>>>> Is there some other mechanism to get predicates for the 'like' operator?
>>>>>
>>>>> I tested with Hive versions 1.2 and 0.14.
>>>>>
>>>>> Thanks and Regards,
>>>>> Amey
>>>>>
>>>>
>>>>
>>>
>>
>
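
On the null filter expression discussed in the quoted thread: as reported there, the query still returns correct results when nothing is pushed down, so a handler can treat a missing expression as "no pushed filter" and fall back to a full scan. The following is a minimal sketch under stated assumptions. A plain Map stands in for the job configuration so that it runs without Hive on the classpath. The key string is what I believe TableScanDesc.FILTER_EXPR_CONF_STR resolves to, so please verify it against your Hive version. The class name, method names and scan labels are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not Hive code: a Map stands in for the job configuration.
class PushedFilterLookup {
    // Assumption: the value of TableScanDesc.FILTER_EXPR_CONF_STR.
    // Check this against the Hive release you build against.
    static final String FILTER_EXPR_KEY = "hive.io.filter.expr.serialized";

    // Returns the serialized filter expression, or empty if none was pushed down.
    static Optional<String> pushedFilter(Map<String, String> conf) {
        String expr = conf.get(FILTER_EXPR_KEY);
        return (expr == null || expr.isEmpty()) ? Optional.empty() : Optional.of(expr);
    }

    // Chooses a scan strategy; with no pushed filter, fall back to reading every row.
    static String planScan(Map<String, String> conf) {
        return pushedFilter(conf).isPresent() ? "FILTERED_SCAN" : "FULL_SCAN";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(planScan(conf)); // nothing was pushed down
        conf.put(FILTER_EXPR_KEY, "serialized-expression-placeholder");
        System.out.println(planScan(conf)); // a filter is available
    }
}
```

This does not make Hive push LIKE or BETWEEN down. It only keeps the handler from failing on a null expression.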
