From: Stephen Sprague
Date: Wed, 19 Mar 2014 16:58:08 -0700
Subject: Re: computing median and percentiles
To: "user@hive.apache.org"

Not a Hive question, is it? It's more like a math question.

On Wed, Mar 19, 2014 at 1:30 PM, Seema Datar <sdatar@yahoo-inc.com> wrote:

> I understand the percentile function is supported in Hive in the latest
> versions. However, how does one calculate percentiles when the data is
> across two columns? So say -
>
> Value  Count
>
> 100    2   (so basically 100 occurred twice)
> 200    4
> 300    1
> 400    6
> 500    3
>
> I want to find out the 0.25 percentile for the value distribution. How
> can I do it using the Hive percentile function?
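To make the "math question" concrete, here is a minimal Python sketch of a percentile over (value, count) pairs. It is not taken from the thread: the function name `weighted_percentile` is hypothetical, and it assumes the convention of linear interpolation at position p * (n - 1) in the sorted, expanded data. Whether that matches the result of Hive's `percentile` function exactly is an assumption to verify against your Hive version.

```python
from bisect import bisect_right
from itertools import accumulate
import math


def weighted_percentile(pairs, p):
    """Percentile of a distribution given as (value, count) pairs.

    Assumption: linear interpolation at position p * (n - 1), where n is
    the total count, as if every value were repeated `count` times.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Sort by value and drop rows that contribute no observations.
    rows = sorted((v, c) for v, c in pairs if c > 0)
    if not rows:
        raise ValueError("no observations")
    values = [v for v, _ in rows]
    # cum[i] = number of observations with value <= values[i]
    cum = list(accumulate(c for _, c in rows))
    n = cum[-1]

    def value_at(k):
        # Value of the k-th (0-based) observation in sorted order,
        # found without expanding the counts into individual rows.
        return values[bisect_right(cum, k)]

    pos = p * (n - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    v_lo, v_hi = value_at(lo), value_at(hi)
    return v_lo + (pos - lo) * (v_hi - v_lo)


# The table from the question: value -> number of occurrences.
data = [(100, 2), (200, 4), (300, 1), (400, 6), (500, 3)]
q25 = weighted_percentile(data, 0.25)
```

With the sample table there are 16 observations, so the 0.25 position is 3.75; the observations at sorted indexes 3 and 4 are both 200, so under this convention `q25` is 200. The cumulative counts are what let the lookup work directly on the two-column form.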