flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From <rimin...@sina.cn>
Subject 回复:How to Broadcast a very large model object (used in iterative scoring in recommendation system) in Flink
Date Fri, 30 Sep 2016 08:13:18 GMT
your message is very short,i can not read more.the follow is my guss,
    in flink,the dataStream is not for iterative computation,the dataSet would be more well.and
fink suggest broadcast mini data,not large.

   your can load your model data (it can be from file,or table),before main function,andassignment
to variable ,like name=yourModel.
 and the dataStream(it is a stream,unscored record,like DataStream[String] or DataStream[yourClass]),
and dataStream.map{x=>
  val score = computeScore(x,yourModel) 
}

object YourObject {

load your model 
val yourModel = ;

def main(){
   ...............
    read unscoreed record,from socket or kafka,or ....

     dataStream.map{x=>
      val score = computeScore(x,yourModel) 
    }
   ......
}
}
----- 原始邮件 -----
发件人:Anchit Jatana <development.anchit@gmail.com>
收件人:user@flink.apache.org
主题:How to Broadcast a very large model object (used in iterative scoring in recommendation
system) in Flink
日期:2016年09月30日 14点15分

Hi All,
I&#39;m building a recommendation system streaming application for which I need to broadcast
a very large model object (used in iterative scoring) among all the task managers performing
the operation parallely for the operator
I&#39;m doing an this operation in map1 of CoMapFunction. Please suggest me some way to
achieve the broadcasting of the large model variable (something similar to what Spark has
with broadcast variables).
Thank you
Regards,Anchit

Mime
View raw message