flink-user mailing list archives

From Yiannis Gkoufas <johngou...@gmail.com>
Subject Best strategy for calculating percentage
Date Fri, 20 Feb 2015 21:01:47 GMT
Hi there,

I have the following scenario. Each record in my files has two attributes
and one numeric value:
(attr1,attr2,val)
For each attr1, I want to calculate its percentage of the sum of val
within its attr2 group.
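
To make the target concrete, here is a minimal sketch of the calculation on
plain Scala collections rather than Flink. The sample rows and the object
name ShareSketch are made up purely for illustration:

```scala
object ShareSketch {
  // Hypothetical sample rows in the (attr1, attr2, val) layout.
  val rows: Seq[(String, String, Double)] = Seq(
    ("a", "jan", 30.0),
    ("b", "jan", 10.0),
    ("a", "feb", 5.0),
    ("b", "feb", 15.0)
  )

  // Sum of val per (attr1, attr2) pair.
  val perPair: Map[(String, String), Double] =
    rows.groupBy(r => (r._1, r._2)).map { case (k, rs) => k -> rs.map(_._3).sum }

  // Sum of val per attr2.
  val perAttr2: Map[String, Double] =
    rows.groupBy(_._2).map { case (k, rs) => k -> rs.map(_._3).sum }

  // Share of each attr1 within its attr2 group: pair sum / group sum.
  val shares: Map[(String, String), Double] =
    perPair.map { case ((a1, a2), s) => (a1, a2) -> s / perAttr2(a2) }
}
```

With this data, ("a", "jan") should come out as 30.0 / 40.0 = 0.75 and
("a", "feb") as 5.0 / 20.0 = 0.25, with one result per (attr1, attr2) pair.
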
Currently I am doing it like this:

input.map(e => e._2.toString.split(","))
  // (attr1, month, val, val): val is duplicated so it can be summed twice
  .map(e => (e(0), Utils.getMonthFromDate(e(1).toLong),
             e(3).toDouble, e(3).toDouble))
  .groupBy(0, 1)
  .sum(2)
  .groupBy(1)
  .sum(3)
  .map(e => (e._1, e._2, scala.math.BigDecimal(e._3 * 1.0 / e._4 * 1.0).toString()))

Is there a more efficient way to calculate this?

Thanks a lot!
