mahout-dev mailing list archives

From: "Steve Rowe" <sar...@odyssey.net>
Subject: (De-)serializing collections/datasets
Date: Thu, 31 Jan 2008 22:37:50 GMT

A while back, Karl Wettin said[1]:

> Coming from Weka's immensely bloated ARFF instances implementation,
> I would like to see a really, really, really abstract solution. So if
> possible I would prefere that collections was something introduced in
> a layer further up. That way the consumer gets to choose what solution
> is best at any given environment. JFC, raw data in a NIO-buffer, some
> sort of stream, or what not.

I think this is a critical area for pre-coding design.

I added a Design section to the main Wiki page and, under it, a link to a new
whiteboard page for discussing this topic:

<http://cwiki.apache.org/confluence/display/MAHOUT/Collection(De-)Serialization>

(So far, I've only put ARFF links there.)

Karl, can you elaborate on what you think is wrong with Weka's instances implementation?

Steve

[1] <http://ml.grantingersoll.com/pipermail/ml-grantingersoll.com/2007-July/000034.html>

