spark-dev mailing list archives

From dlwh <...@git.apache.org>
Subject [GitHub] incubator-spark pull request: [Proposal] Adding sparse data suppor...
Date Tue, 18 Feb 2014 23:57:02 GMT
Github user dlwh commented on the pull request:

    https://github.com/apache/incubator-spark/pull/575#issuecomment-35450646
  
    @mengxr thanks for doing all this! It's nice to see that the overhead in Breeze is largely
negligible compared to MTJ (and maybe even slightly better sometimes?).
    
    I'm pretty aggressive about avoiding auto-boxing in Breeze, and there are relatively few
implicit conversions. Auto-boxing definitely crops up here and there, but between codegen
and specialization, I think most of the tight loops are entirely unboxed. Spire's macros would
hopefully remove some of the bytecode bloat from the codegen. (And @VladUreche's miniboxing
might make things even better, one day.) @fommil is right that implicits really don't cause
a performance problem, especially not the way they're used in Breeze (which is to say, few
implicit conversions, lots of implicit parameters). Generic programming can be a problem,
and implicit parameters make it tempting to do more of it.
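
    To make the implicit-parameter point concrete, here is a minimal sketch. The names
(`Semiring`, `sumAll`, `ImplicitParamSketch`) are made up for illustration and are not Breeze's
API. The implicit is resolved at compile time and passed as an ordinary argument, and
`@specialized` asks the compiler to also generate primitive variants so a loop like this one
can avoid boxing.

```scala
// Hypothetical names throughout; this is an illustration, not Breeze code.
trait Semiring[@specialized(Double, Int) T] {
  def zero: T
  def plus(a: T, b: T): T
}

object Semiring {
  // Found through the implicit scope of Semiring[Double]; no implicit conversion involved.
  implicit object DoubleSemiring extends Semiring[Double] {
    def zero: Double = 0.0
    def plus(a: Double, b: Double): Double = a + b
  }
}

object ImplicitParamSketch {
  // `sr` is an implicit parameter: chosen by the compiler, then just a normal argument.
  // @specialized requests Double and Int variants of this method alongside the generic one.
  def sumAll[@specialized(Double, Int) T](xs: Array[T])(implicit sr: Semiring[T]): T = {
    var acc = sr.zero
    var i = 0
    while (i < xs.length) {
      acc = sr.plus(acc, xs(i))
      i += 1
    }
    acc
  }
}
```

    Dropping the `@specialized` annotations leaves the call site unchanged, but the generic
version then has to box each `Double`, which is the sort of cost generic code can hide.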
    
    I'm curious to know what you guys found in the past with Breeze. The only thing I ever
heard from y'all before now was someone trying to update a CSCMatrix constantly, which is
just always going to be slow. (I'm sure there must be problems. Breeze is young enough that
some code paths aren't as tested as they could be.)
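
    A toy sketch of why constant updates hurt, assuming a textbook compressed-sparse-column
layout (column pointers plus row indices and values). `ToyCsc` is a hypothetical class written
for this example, not Breeze's `CSCMatrix`. Inserting one new nonzero has to shift every later
entry, whereas building once from triplets is a sort plus a single pass.

```scala
// Toy compressed-sparse-column matrix; hypothetical, not Breeze's implementation.
final class ToyCsc(val rows: Int, val cols: Int) {
  var colPtr: Array[Int] = new Array[Int](cols + 1)
  var rowIdx: Array[Int] = Array.empty[Int]
  var values: Array[Double] = Array.empty[Double]

  // Inserting a new nonzero copies the arrays and bumps later column pointers: O(nnz).
  def insert(r: Int, c: Int, v: Double): Unit = {
    var pos = colPtr(c)
    while (pos < colPtr(c + 1) && rowIdx(pos) < r) pos += 1
    if (pos < colPtr(c + 1) && rowIdx(pos) == r) {
      values(pos) = v // entry already stored, overwrite in place
      return
    }
    rowIdx = (rowIdx.take(pos) :+ r) ++ rowIdx.drop(pos)
    values = (values.take(pos) :+ v) ++ values.drop(pos)
    var j = c + 1
    while (j <= cols) { colPtr(j) += 1; j += 1 }
  }
}

object ToyCsc {
  // Build once from (row, col, value) triplets; assumes no duplicate (row, col) pairs.
  def fromTriplets(rows: Int, cols: Int, entries: Seq[(Int, Int, Double)]): ToyCsc = {
    val m = new ToyCsc(rows, cols)
    val sorted = entries.sortBy { case (r, c, _) => (c, r) }
    m.rowIdx = sorted.map(_._1).toArray
    m.values = sorted.map(_._3).toArray
    // Count entries per column, then prefix-sum the counts into column pointers.
    sorted.foreach { case (_, c, _) => m.colPtr(c + 1) += 1 }
    var j = 0
    while (j < cols) { m.colPtr(j + 1) += m.colPtr(j); j += 1 }
    m
  }
}
```

    Both paths end up with the same arrays; the difference is that `insert` pays the shifting
cost on every call, so accumulating entries first and building once is usually the better plan.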


