mahout-dev mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: What about a universal input data handling mechanism for Mahout?
Date Tue, 26 Jul 2011 09:50:32 GMT
We do have:
SequenceFilesFromCsvFilter, although it is somewhat basic
CSVVectorIterator, which takes a CSV file and produces a dense vector
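
For illustration only, the CSV-to-dense-vector step can be sketched in plain Java. This is not the code of Mahout's CSVVectorIterator: the class and method names below are hypothetical, a plain double[] stands in for a dense vector, and the sketch assumes every field is numeric with no quoting or header row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: turns numeric CSV lines into dense arrays.
final class CsvDenseVectorSketch {

    // Parses one comma-separated line into a dense array of doubles.
    // Assumes every field is numeric; an empty or non-numeric field
    // makes Double.parseDouble throw NumberFormatException.
    static double[] parseLine(String line) {
        // The -1 limit keeps trailing empty fields so column positions stay stable.
        String[] fields = line.split(",", -1);
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        return values;
    }

    // Parses many lines, skipping blank ones, and returns one array per record.
    static List<double[]> parseLines(Iterable<String> lines) {
        List<double[]> vectors = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            vectors.add(parseLine(line));
        }
        return vectors;
    }
}
```

A real converter would also need the schema information Ted mentions below (which columns are numeric, categorical, or to be ignored), which this sketch deliberately leaves out.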


On Jul 26, 2011, at 3:58 AM, Ted Dunning wrote:

> The critical design step here is to decide how to express the schema of the
> CSV file.  There is a beginning of this in the CsvRecordFactory, but I was
> never happy with the (lack of) speed.
> 
> On Tue, Jul 26, 2011 at 12:10 AM, Sebastian Schelter <ssc@apache.org> wrote:
> 
>>> 2. SequenceFile is not a file format that command line users can
>>> prepare. Is there a tool for converting CSV files into SequenceFiles?
>>>
>>
>> I don't think we have that yet, but it would be very useful imho.
>>

--------------------------
Grant Ingersoll



