hivemall-issues mailing list archives

From kottmann <...@git.apache.org>
Subject [GitHub] incubator-hivemall issue #93: [WIP][HIVEMALL-126] Maximum Entropy Model usin...
Date Wed, 02 Aug 2017 13:26:31 GMT
Github user kottmann commented on the issue:

    https://github.com/apache/incubator-hivemall/pull/93
  
    @helenahm as far as I know, the training data is stored once in memory, and each thread then keeps its own copy of the parameters.
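    
    A rough way to picture that layout is the back-of-the-envelope estimate below. It is only a sketch under stated assumptions: the function name, its arguments, and the byte sizes are hypothetical and not taken from the implementation.

```python
def estimate_training_memory_bytes(
    num_events,
    avg_features_per_event,
    num_params,
    num_threads,
    bytes_per_index=4,   # assumption: features stored as 32-bit indices
    bytes_per_param=8,   # assumption: parameters stored as 64-bit floats
):
    """Estimate memory for one shared data copy plus per-thread parameters."""
    # The training data is held once and shared by all threads.
    data_bytes = num_events * avg_features_per_event * bytes_per_index
    # Each thread keeps its own copy of the parameter vector.
    params_bytes = num_threads * num_params * bytes_per_param
    return data_bytes + params_bytes


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show how the two terms compare.
    total = estimate_training_memory_bytes(
        num_events=1000, avg_features_per_event=10, num_params=500, num_threads=4
    )
    print(total)
```

    Under these assumptions, adding threads only grows the parameter term, while the data term grows with the number of training events.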
    
    Yeah, so if you have a lot of training data, running out of memory is one symptom you will run into, but it is not the actual problem with this implementation. The actual problem is that it won't scale beyond one machine.
    
    Bottom line: if you want to use GIS training with lots of data, don't use this implementation. Training requires a certain amount of CPU time, and that time increases with the amount of training data. Even if you manage to make this run with much more data, the time it takes will be uncomfortably high.


