hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Meagher <john.meag...@gmail.com>
Subject Re: Converting rows into dynamic colums in Hive
Date Thu, 09 Aug 2012 14:10:53 GMT
R is another programming language geared to statistical analysis.  It
has some features that make it easy to write this sort of data
manipulation.  http://www.r-project.org/

Here's a sample R function that does something similar to what you
need (note: I'm not a real R programmer, so this might be ugly, but it
works):

abCountsToMatrix <- function(csvFilename, outputmatrixfile){

   data <- read.csv(csvFilename, as.is=T, check.names=F, header=F)
   cat("Loaded", nrow(data), "rows.\n")
   idrows <- sort(unique(c(data[ , 1])))
   idcols <- sort(unique(c(data[ , 2])))

   cat("Found", length(idrows), " X ", length(idcols), ".")
   output <- array(0, c(length(idrows), length(idcols)))
   rownames(output) <- idrows
   colnames(output) <- idcols

   for(i in 1:nrow(data)){

     if (i %% 100 == 0) {
       cat("On ", i, " of ", nrow(data), "\n")
     }

     rowidx <- which(idrows == data[i, 1])
     colidx <- which(idcols == data[i, 2])

     count <- data[i,3]

     output[rowidx, colidx] <- count
   }


   write.csv(output, file=outputmatrixfile, quote=F)

}



On Wed, Aug 8, 2012 at 8:58 PM,  <richin.jain@nokia.com> wrote:
> John,
>
> What is R?
>
> -----Original Message-----
> From: ext John Meagher [mailto:john.meagher@gmail.com]
> Sent: Wednesday, August 08, 2012 4:34 PM
> To: user@hive.apache.org
> Subject: Re: Converting rows into dynamic colums in Hive
>
> I don't think having dynamic columns is possible in Hive.  I've always output from Hive
a structure like your query output and used R to convert it into a dynamic column structure.
>
>
> On Wed, Aug 8, 2012 at 3:56 PM,  <richin.jain@nokia.com> wrote:
>> Thanks Ashish, that gives an idea.
>>
>> But I am not sure about the outer select loop, I have to know all the
>> values in Beta column beforehand to do a max on each value.
>>
>> Is there a better way?
>>
>>
>>
>> Richin
>>
>>
>>
>> From: ext Ashish Thusoo [mailto:athusoo@qubole.com]
>> Sent: Tuesday, August 07, 2012 5:05 PM
>> To: user@hive.apache.org
>> Subject: Re: Converting rows into dynamic colums in Hive
>>
>>
>>
>> you should be able to do this in hive using a group by on alpha and
>> then using a combination of the max and if statement... something on
>> the following lines
>>
>> select alpha, max(abc), max(pqr), ...
>> (
>>   select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta ==
>> 'pqr', Gamma, NUL) as pqr, ....
>>   from table
>> )
>> group by alpha
>>
>> something on those lines...
>>
>> Ashish
>>
>> On Tue, Aug 7, 2012 at 1:57 PM, <richin.jain@nokia.com> wrote:
>>
>> Hi All,
>>
>>
>>
>> One of my Query output looks like-
>>
>>
>>
>> Alpha                    Beta                       Gamma
>>
>> 123                         xyz                         1.0
>>
>> 123                         abc                         0.5
>>
>> 123                         pqr                         1.3
>>
>> 123                         def                         2.1
>>
>> 456                         xyz                         0.1
>>
>> 456                         abc                         0.6
>>
>> 456                         pqr                         1.9
>>
>> 456                         def                         3.2
>>
>> 456                         lmn                        1.1
>>
>> 456                         sdf                          1.2
>>
>>
>>
>> I want the output for the data visualization purpose to look like
>> (basically taking rows from one table and making it column in another
>> table)
>>
>>
>>
>> Alpha                    xyz         abc         pqr         def         lmn
>> sdf
>>
>> 123                         1.0          0.5          1.3          2.1
>>
>> 456                         0.1          0.6          1.9          3.2
>> 1.1          1.2
>>
>>
>>
>> Question - Can it be done in Hive? If not, any suggestions.
>>
>>
>>
>> Thanks,
>>
>> Richin
>>
>>
>>
>>
>>
>>

Mime
View raw message