A while back, Karl Wettin said[1]:
> Coming from Weka's immensely bloated ARFF instances implementation,
> I would like to see a really, really, really abstract solution. So if
> possible I would prefere that collections was something introduced in
> a layer further up. That way the consumer gets to choose what solution
> is best at any given environment. JFC, raw data in a NIO-buffer, some
> sort of stream, or what not.
I think this is a critical area to settle at the design stage, before any coding starts.
I added a Design section to the main wiki page and, under it, a link to a new whiteboard
page for discussing this topic:
<http://cwiki.apache.org/confluence/display/MAHOUT/Collection(De-)Serialization>
(So far, I've only put ARFF links there.)
Karl, can you elaborate on what you think is wrong with Weka's instances implementation?
Steve
[1] <http://ml.grantingersoll.com/pipermail/ml-grantingersoll.com/2007-July/000034.html>