hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Richa Sharma <mailtorichasha...@gmail.com>
Subject Re: Use RANK OVER PARTITION function in Hive 0.11
Date Wed, 17 Jul 2013 06:10:37 GMT
Jerome

I would recommend that you try Rank function with columns from just one
table first.
Once it is established that rank is working fine then add all the joins.

I am still on Hive 0.10 so cannot test it myself.
However, I can find a similar issue on following link - so its possible you
are facing issues due to this reported bug.

https://issues.apache.org/jira/browse/HIVE-4663


Richa


On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier
<verdier.jerome66@gmail.com>wrote:

> You can see my query below :
>
> SELECT
>             mag.co_magasin,
>             dem.id_produit                                  as
> id_produit_orig,
>             pnvente.dt_debut_commercial                     as
> dt_debut_commercial,
>             COALESCE(pnvente.id_produit,dem.id_produit)     as id_produit,
>             RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit ORDER
> BY pnvente.dt_debut_commercial DESC,
> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>
> FROM default.demarque_mag_jour dem
>
>           LEFT OUTER JOIN default.produit_norm pn
>           ON  pn.co_societe = dem.co_societe
>           AND pn.id_produit = dem.id_produit
>           LEFT OUTER JOIN default.produit_norm pnvente
>           ON  pnvente.co_societe = pn.co_societe
>           AND pnvente.co_produit_rfu = pn.co_produit_lip
>           INNER JOIN default.kpi_magasin mag
>           ON mag.id_magasin = dem.id_magasin
>
>
> GROUP BY
>             mag.co_magasin,
>             dem.id_produit,
>             pnvente.dt_debut_commercial,
>             COALESCE(pnvente.id_produit,dem.id_produit);
>
>
> 2013/7/16 Richa Sharma <mailtorichasharma@gmail.com>
>
>> Can you share query with just RANK().
>>
>> Richa
>>
>>
>> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
>> verdier.jerome66@gmail.com> wrote:
>>
>>> Hi Richa,
>>>
>>> I tried to execute the rank function alone, but the result is the same
>>>
>>> Thanks
>>>
>>>
>>> 2013/7/16 Richa Sharma <mailtorichasharma@gmail.com>
>>>
>>>> Hi Jerome
>>>>
>>>>
>>>> I think the problem is you are trying to use MIN, SUM and RANK function
>>>> in a single query.
>>>>
>>>> Try to get the rank first in a query and on top of it apply these
>>>> aggregate functions
>>>>
>>>> Richa
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
>>>> verdier.jerome66@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have a problem while using RANK OVER PARTITION function with Hive.
>>>>>
>>>>> Hive is in version 0.11 and, as we can see here :
>>>>> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
>>>>> we can now use these functions in Hive.
>>>>>
>>>>> But, when i use it, i encountered this error :
>>>>>
>>>>> FAILED: SemanticException Failed to breakup Windowing invocations into
>>>>> Groups. At least 1 group must only depend on input columns. Also check
for
>>>>> circular dependencies.
>>>>> Underlying error:
>>>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>>>>> arguments are expected.
>>>>>
>>>>> Here is my script :
>>>>>
>>>>> SELECT
>>>>>             mag.co_magasin,
>>>>>             dem.id_produit                                  as
>>>>> id_produit_orig,
>>>>>             pnvente.dt_debut_commercial                     as
>>>>> dt_debut_commercial,
>>>>>             COALESCE(pnvente.id_produit,dem.id_produit)     as
>>>>> id_produit,
>>>>>             min(
>>>>>               CASE WHEN dem.co_validation IS NULL THEN 0 ELSE 1 END
>>>>>             )                                               as
>>>>> flg_demarque_valide,
>>>>>             sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>>>>> CAST(dem.mt_revient_ope AS INT) END)
>>>>>                                                             as
>>>>> me_dem_con_prx_cs,
>>>>>             0                                               as
>>>>> me_dem_inc_prx_cs,
>>>>>             0                                               as
>>>>> me_dem_prov_stk_cs,
>>>>>             sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>>>>> CAST(dem.qt_demarque AS INT) END)
>>>>>                                                             as
>>>>> qt_dem_con,
>>>>>             0                                               as
>>>>> qt_dem_inc,
>>>>>             0                                               as
>>>>> qt_dem_prov_stk, -- !!!!!!!! VIRGULE
>>>>>             RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>>>>> ORDER BY pnvente.dt_debut_commercial DESC,
>>>>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>>>>           from default.calendrier cal
>>>>>           INNER JOIN default.demarque_mag_jour dem
>>>>>           ON  CASE WHEN dem.co_societe = 1 THEN 1 ELSE 2 END =
>>>>> '${hiveconf:in_co_societe}' -- A modifier
>>>>>           AND dem.dt_jour    = cal.dt_jour
>>>>>           LEFT OUTER JOIN default.produit_norm pn
>>>>>           ON  pn.co_societe = dem.co_societe
>>>>>           AND pn.id_produit = dem.id_produit
>>>>>           LEFT OUTER JOIN default.produit_norm pnvente
>>>>>           ON  pnvente.co_societe = pn.co_societe
>>>>>           AND pnvente.co_produit_rfu = pn.co_produit_lip
>>>>>           AND pnvente.co_type_motif='05'
>>>>>           INNER JOIN default.kpi_magasin mag
>>>>>           ON  mag.co_societe = '${hiveconf:in_co_societe}'
>>>>>           AND mag.id_magasin = dem.id_magasin
>>>>>           WHERE cal.dt_jour = '${hiveconf:in_dt_jour}'
>>>>>           AND NOT (dem.co_validation IS NULL AND cal.dt_jour >
>>>>> unix_timestamp()-3*60*60*24) -- A verifier
>>>>>           -- JYP 4.4
>>>>>           AND dem.co_operation_magasin IN ('13','14','32')
>>>>>           GROUP BY
>>>>>             mag.co_magasin,
>>>>>             dem.id_produit,
>>>>>             pnvente.dt_debut_commercial,
>>>>>             COALESCE(pnvente.id_produit,dem.id_produit)
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> *Jérôme VERDIER*
>>>>> 06.72.19.17.31
>>>>> verdier.jerome66@gmail.com
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Jérôme VERDIER*
>>> 06.72.19.17.31
>>> verdier.jerome66@gmail.com
>>>
>>>
>>
>
>
> --
> *Jérôme VERDIER*
> 06.72.19.17.31
> verdier.jerome66@gmail.com
>
>

Mime
View raw message