hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jérôme Verdier <verdier.jerom...@gmail.com>
Subject Re: Use RANK OVER PARTITION function in Hive 0.11
Date Tue, 16 Jul 2013 13:11:46 GMT
You can see my query below :

SELECT
            mag.co_magasin,
            dem.id_produit                                  as
id_produit_orig,
            pnvente.dt_debut_commercial                     as
dt_debut_commercial,
            COALESCE(pnvente.id_produit,dem.id_produit)     as id_produit,
            RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit ORDER
BY pnvente.dt_debut_commercial DESC,
COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang

FROM default.demarque_mag_jour dem
          LEFT OUTER JOIN default.produit_norm pn
          ON  pn.co_societe = dem.co_societe
          AND pn.id_produit = dem.id_produit
          LEFT OUTER JOIN default.produit_norm pnvente
          ON  pnvente.co_societe = pn.co_societe
          AND pnvente.co_produit_rfu = pn.co_produit_lip
          INNER JOIN default.kpi_magasin mag
          ON mag.id_magasin = dem.id_magasin

GROUP BY
            mag.co_magasin,
            dem.id_produit,
            pnvente.dt_debut_commercial,
            COALESCE(pnvente.id_produit,dem.id_produit);


2013/7/16 Richa Sharma <mailtorichasharma@gmail.com>

> Can you share query with just RANK().
>
> Richa
>
>
> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
> verdier.jerome66@gmail.com> wrote:
>
>> Hi Richa,
>>
>> I tried to execute the rank function alone, but the result is the same
>>
>> Thanks
>>
>>
>> 2013/7/16 Richa Sharma <mailtorichasharma@gmail.com>
>>
>>> Hi Jerome
>>>
>>>
>>> I think the problem is you are trying to use MIN, SUM and RANK function
>>> in a single query.
>>>
>>> Try to get the rank first in a query and on top of it apply these
>>> aggregate functions
>>>
>>> Richa
>>>
>>>
>>>
>>>
>>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
>>> verdier.jerome66@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a problem while using RANK OVER PARTITION function with Hive.
>>>>
>>>> Hive is in version 0.11 and, as we can see here :
>>>> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
>>>> we can now use these functions in Hive.
>>>>
>>>> But, when i use it, i encountered this error :
>>>>
>>>> FAILED: SemanticException Failed to breakup Windowing invocations into
>>>> Groups. At least 1 group must only depend on input columns. Also check for
>>>> circular dependencies.
>>>> Underlying error:
>>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>>>> arguments are expected.
>>>>
>>>> Here is my script :
>>>>
>>>> SELECT
>>>>             mag.co_magasin,
>>>>             dem.id_produit                                  as
>>>> id_produit_orig,
>>>>             pnvente.dt_debut_commercial                     as
>>>> dt_debut_commercial,
>>>>             COALESCE(pnvente.id_produit,dem.id_produit)     as
>>>> id_produit,
>>>>             min(
>>>>               CASE WHEN dem.co_validation IS NULL THEN 0 ELSE 1 END
>>>>             )                                               as
>>>> flg_demarque_valide,
>>>>             sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>>>> CAST(dem.mt_revient_ope AS INT) END)
>>>>                                                             as
>>>> me_dem_con_prx_cs,
>>>>             0                                               as
>>>> me_dem_inc_prx_cs,
>>>>             0                                               as
>>>> me_dem_prov_stk_cs,
>>>>             sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>>>> CAST(dem.qt_demarque AS INT) END)
>>>>                                                             as
>>>> qt_dem_con,
>>>>             0                                               as
>>>> qt_dem_inc,
>>>>             0                                               as
>>>> qt_dem_prov_stk, -- !!!!!!!! VIRGULE
>>>>             RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>>>> ORDER BY pnvente.dt_debut_commercial DESC,
>>>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>>>           from default.calendrier cal
>>>>           INNER JOIN default.demarque_mag_jour dem
>>>>           ON  CASE WHEN dem.co_societe = 1 THEN 1 ELSE 2 END =
>>>> '${hiveconf:in_co_societe}' -- A modifier
>>>>           AND dem.dt_jour    = cal.dt_jour
>>>>           LEFT OUTER JOIN default.produit_norm pn
>>>>           ON  pn.co_societe = dem.co_societe
>>>>           AND pn.id_produit = dem.id_produit
>>>>           LEFT OUTER JOIN default.produit_norm pnvente
>>>>           ON  pnvente.co_societe = pn.co_societe
>>>>           AND pnvente.co_produit_rfu = pn.co_produit_lip
>>>>           AND pnvente.co_type_motif='05'
>>>>           INNER JOIN default.kpi_magasin mag
>>>>           ON  mag.co_societe = '${hiveconf:in_co_societe}'
>>>>           AND mag.id_magasin = dem.id_magasin
>>>>           WHERE cal.dt_jour = '${hiveconf:in_dt_jour}'
>>>>           AND NOT (dem.co_validation IS NULL AND cal.dt_jour >
>>>> unix_timestamp()-3*60*60*24) -- A verifier
>>>>           -- JYP 4.4
>>>>           AND dem.co_operation_magasin IN ('13','14','32')
>>>>           GROUP BY
>>>>             mag.co_magasin,
>>>>             dem.id_produit,
>>>>             pnvente.dt_debut_commercial,
>>>>             COALESCE(pnvente.id_produit,dem.id_produit)
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> *Jérôme VERDIER*
>>>> 06.72.19.17.31
>>>> verdier.jerome66@gmail.com
>>>>
>>>>
>>>
>>
>>
>> --
>> *Jérôme VERDIER*
>> 06.72.19.17.31
>> verdier.jerome66@gmail.com
>>
>>
>


-- 
*Jérôme VERDIER*
06.72.19.17.31
verdier.jerome66@gmail.com

Mime
View raw message