hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vijay <tec...@gmail.com>
Subject Re: Use RANK OVER PARTITION function in Hive 0.11
Date Wed, 17 Jul 2013 09:01:16 GMT
As the error message states: "One ore more arguments are expected," you
have to pass a column to the rank function.


On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier
<verdier.jerome66@gmail.com>wrote:

> Hi Richa,
>
> I have tried a simple query without joins, etc....
>
> SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
> mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
> default.thm_renta_rgrp_produits_n_1 mag;
>
> Unfortunately, the error is the same like previously.
>
> Error: Query returned non-zero code: 40000, cause: FAILED:
> SemanticException Failed to breakup Windowing invocations into Groups. At
> least 1 group must only depend on input columns. Also check for circular
> dependencies.
>
> Underlying error: org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException:
> One or more arguments are expected.
> SQLState:  42000
> ErrorCode: 40000
>
>
>
>
> 2013/7/17 Richa Sharma <mailtorichasharma@gmail.com>
>
>> Jerome
>>
>> I would recommend that you try Rank function with columns from just one
>> table first.
>> Once it is established that rank is working fine then add all the joins.
>>
>> I am still on Hive 0.10 so cannot test it myself.
>> However, I can find a similar issue on following link - so its possible
>> you are facing issues due to this reported bug.
>>
>> https://issues.apache.org/jira/browse/HIVE-4663
>>
>>
>> Richa
>>
>>
>> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
>> verdier.jerome66@gmail.com> wrote:
>>
>>> You can see my query below :
>>>
>>> SELECT
>>>             mag.co_magasin,
>>>             dem.id_produit                                  as
>>> id_produit_orig,
>>>             pnvente.dt_debut_commercial                     as
>>> dt_debut_commercial,
>>>             COALESCE(pnvente.id_produit,dem.id_produit)     as
>>> id_produit,
>>>             RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>>> ORDER BY pnvente.dt_debut_commercial DESC,
>>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>>
>>> FROM default.demarque_mag_jour dem
>>>
>>>           LEFT OUTER JOIN default.produit_norm pn
>>>           ON  pn.co_societe = dem.co_societe
>>>           AND pn.id_produit = dem.id_produit
>>>           LEFT OUTER JOIN default.produit_norm pnvente
>>>           ON  pnvente.co_societe = pn.co_societe
>>>           AND pnvente.co_produit_rfu = pn.co_produit_lip
>>>           INNER JOIN default.kpi_magasin mag
>>>           ON mag.id_magasin = dem.id_magasin
>>>
>>>
>>> GROUP BY
>>>             mag.co_magasin,
>>>             dem.id_produit,
>>>             pnvente.dt_debut_commercial,
>>>             COALESCE(pnvente.id_produit,dem.id_produit);
>>>
>>>
>>> 2013/7/16 Richa Sharma <mailtorichasharma@gmail.com>
>>>
>>>> Can you share query with just RANK().
>>>>
>>>> Richa
>>>>
>>>>
>>>> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
>>>> verdier.jerome66@gmail.com> wrote:
>>>>
>>>>> Hi Richa,
>>>>>
>>>>> I tried to execute the rank function alone, but the result is the same
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> 2013/7/16 Richa Sharma <mailtorichasharma@gmail.com>
>>>>>
>>>>>> Hi Jerome
>>>>>>
>>>>>>
>>>>>> I think the problem is you are trying to use MIN, SUM and RANK
>>>>>> function in a single query.
>>>>>>
>>>>>> Try to get the rank first in a query and on top of it apply these
>>>>>> aggregate functions
>>>>>>
>>>>>> Richa
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
>>>>>> verdier.jerome66@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have a problem while using RANK OVER PARTITION function with
Hive.
>>>>>>>
>>>>>>> Hive is in version 0.11 and, as we can see here :
>>>>>>> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
>>>>>>> we can now use these functions in Hive.
>>>>>>>
>>>>>>> But, when i use it, i encountered this error :
>>>>>>>
>>>>>>> FAILED: SemanticException Failed to breakup Windowing invocations
>>>>>>> into Groups. At least 1 group must only depend on input columns.
Also check
>>>>>>> for circular dependencies.
>>>>>>> Underlying error:
>>>>>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One
or more
>>>>>>> arguments are expected.
>>>>>>>
>>>>>>> Here is my script :
>>>>>>>
>>>>>>> SELECT
>>>>>>>             mag.co_magasin,
>>>>>>>             dem.id_produit                                  as
>>>>>>> id_produit_orig,
>>>>>>>             pnvente.dt_debut_commercial                     as
>>>>>>> dt_debut_commercial,
>>>>>>>             COALESCE(pnvente.id_produit,dem.id_produit)     as
>>>>>>> id_produit,
>>>>>>>             min(
>>>>>>>               CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
1 END
>>>>>>>             )                                               as
>>>>>>> flg_demarque_valide,
>>>>>>>             sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>>>>>>> CAST(dem.mt_revient_ope AS INT) END)
>>>>>>>                                                             as
>>>>>>> me_dem_con_prx_cs,
>>>>>>>             0                                               as
>>>>>>> me_dem_inc_prx_cs,
>>>>>>>             0                                               as
>>>>>>> me_dem_prov_stk_cs,
>>>>>>>             sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>>>>>>> CAST(dem.qt_demarque AS INT) END)
>>>>>>>                                                             as
>>>>>>> qt_dem_con,
>>>>>>>             0                                               as
>>>>>>> qt_dem_inc,
>>>>>>>             0                                               as
>>>>>>> qt_dem_prov_stk, -- !!!!!!!! VIRGULE
>>>>>>>             RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>>>>>>> ORDER BY pnvente.dt_debut_commercial DESC,
>>>>>>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>>>>>>           from default.calendrier cal
>>>>>>>           INNER JOIN default.demarque_mag_jour dem
>>>>>>>           ON  CASE WHEN dem.co_societe = 1 THEN 1 ELSE 2 END
=
>>>>>>> '${hiveconf:in_co_societe}' -- A modifier
>>>>>>>           AND dem.dt_jour    = cal.dt_jour
>>>>>>>           LEFT OUTER JOIN default.produit_norm pn
>>>>>>>           ON  pn.co_societe = dem.co_societe
>>>>>>>           AND pn.id_produit = dem.id_produit
>>>>>>>           LEFT OUTER JOIN default.produit_norm pnvente
>>>>>>>           ON  pnvente.co_societe = pn.co_societe
>>>>>>>           AND pnvente.co_produit_rfu = pn.co_produit_lip
>>>>>>>           AND pnvente.co_type_motif='05'
>>>>>>>           INNER JOIN default.kpi_magasin mag
>>>>>>>           ON  mag.co_societe = '${hiveconf:in_co_societe}'
>>>>>>>           AND mag.id_magasin = dem.id_magasin
>>>>>>>           WHERE cal.dt_jour = '${hiveconf:in_dt_jour}'
>>>>>>>           AND NOT (dem.co_validation IS NULL AND cal.dt_jour
>
>>>>>>> unix_timestamp()-3*60*60*24) -- A verifier
>>>>>>>           -- JYP 4.4
>>>>>>>           AND dem.co_operation_magasin IN ('13','14','32')
>>>>>>>           GROUP BY
>>>>>>>             mag.co_magasin,
>>>>>>>             dem.id_produit,
>>>>>>>             pnvente.dt_debut_commercial,
>>>>>>>             COALESCE(pnvente.id_produit,dem.id_produit)
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> --
>>>>>>> *Jérôme VERDIER*
>>>>>>> 06.72.19.17.31
>>>>>>> verdier.jerome66@gmail.com
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Jérôme VERDIER*
>>>>> 06.72.19.17.31
>>>>> verdier.jerome66@gmail.com
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Jérôme VERDIER*
>>> 06.72.19.17.31
>>> verdier.jerome66@gmail.com
>>>
>>>
>>
>
>
> --
> *Jérôme VERDIER*
> 06.72.19.17.31
> verdier.jerome66@gmail.com
>
>

Mime
View raw message