systemml-dev mailing list archives

From Shirish Tatikonda <shirish.tatiko...@gmail.com>
Subject Re: Matrix Market format with metadata file
Date Tue, 16 Feb 2016 00:50:33 GMT
Btw, just to be precise: in your example "mm" file, the metadata line is
"4 3 6", but the listed non-zero values only go up to row 3. So either it
was a typo, or the 4th row contains all zeros.
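
To illustrate that check, here is a minimal Python sketch (not SystemML code;
the function names are hypothetical and made up for this example). It assumes
a plain coordinate-format file with a banner line, a "rows cols nnz" size
line, and one "i j v" entry per line. It compares the size line against the
listed entries and reports which rows have no entries at all.

```python
def parse_mm_coordinate(text):
    """Parse Matrix Market coordinate text into (rows, cols, nnz, cells)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith("%%MatrixMarket"):
        raise ValueError("missing %%MatrixMarket banner")
    # Drop the banner and any '%' comment lines; the first remaining line
    # is the size line "rows cols nnz".
    body = [ln for ln in lines[1:] if not ln.startswith("%")]
    if not body:
        raise ValueError("missing size line")
    rows, cols, nnz = (int(tok) for tok in body[0].split())
    cells = []
    for ln in body[1:]:
        i, j, v = ln.split()
        cells.append((int(i), int(j), float(v)))
    return rows, cols, nnz, cells


def check_consistency(rows, cols, nnz, cells):
    """Return a list of problems; an empty list means header and cells agree."""
    problems = []
    if len(cells) != nnz:
        problems.append(f"header says {nnz} non-zeros, found {len(cells)}")
    for i, j, _ in cells:
        if not (1 <= i <= rows and 1 <= j <= cols):
            problems.append(f"cell ({i}, {j}) outside {rows}x{cols}")
    return problems


def empty_rows(rows, cells):
    """1-based row indices with no listed entries (implicitly all zeros)."""
    seen = {i for i, _, _ in cells}
    return [r for r in range(1, rows + 1) if r not in seen]


def to_text_cells(cells):
    """Render the entries as "i j v" lines, i.e. the mm body without header."""
    return "\n".join(f"{i} {j} {v}" for i, j, v in cells)


EXAMPLE = """%%MatrixMarket matrix coordinate real general
4 3 6
1 1 1.0
1 2 2.0
1 3 3.0
3 1 7.0
3 2 8.0
3 3 9.0
"""

rows, cols, nnz, cells = parse_mm_coordinate(EXAMPLE)
print(check_consistency(rows, cols, nnz, cells))  # []
print(empty_rows(rows, cells))                    # [2, 4]
```

On the example from the thread this reports no inconsistencies (6 entries,
all within 4x3) and lists rows 2 and 4 as empty, so the file is well-formed
as long as those rows are meant to be all zeros.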



On Mon, Feb 15, 2016 at 4:26 PM, Shirish Tatikonda <
shirish.tatikonda@gmail.com> wrote:

> The "mm" and "text" formats are identical except for a couple of
> differences:
>
> 1) for "mm", the matrix metadata is included in the first two lines; for
> "text", the metadata is present in the associated .mtd file
> 2) "mm" data must be in a single file (i.e., no *part* files), whereas
> "text" data can span multiple *part* files (like any other file on HDFS).
>
> Support for "mm" was added mainly for the purpose of
> importing/exporting data in the format that R likes.
>
> Shirish
>
> On Mon, Feb 15, 2016 at 4:17 PM, Deron Eriksson <deroneriksson@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have a question with regard to text vs mm. Isn't the mm coordinate
>> format identical to the text format, except that the mm data file also
>> includes the metadata line for rows, cols, and nnzs? If so, shouldn't they
>> scale the same, since the text rows (i,j,v) correspond to the mm rows?
>>
>> If we have the following MM:
>> %%MatrixMarket matrix coordinate real general
>> 4 3 6
>> 1 1 1.0
>> 1 2 2.0
>> 1 3 3.0
>> 3 1 7.0
>> 3 2 8.0
>> 3 3 9.0
>>
>> The corresponding text format (with accompanying metadata file) is:
>> 1 1 1.0
>> 1 2 2.0
>> 1 3 3.0
>> 3 1 7.0
>> 3 2 8.0
>> 3 3 9.0
>>
>> So aren't these formats essentially the same?
>>
>> Deron
>>
>>
>> On Mon, Feb 15, 2016 at 3:56 PM, Matthias Boehm <mboehm@us.ibm.com>
>> wrote:
>>
>> > The metadata file is still useful in order to get the format. In the
>> > case of matrix market, errors will be raised if the included metadata
>> > is inconsistent. So no, we should not disallow specifying the metadata.
>> > In general, we recommend using text (textcell) instead of mm (matrix
>> > market) anyway, for scalability reasons.
>> >
>> > Regards,
>> > Matthias
>> >
>> > From: Deron Eriksson <deroneriksson@gmail.com>
>> > To: dev@systemml.incubator.apache.org
>> > Date: 02/15/2016 03:45 PM
>> > Subject: Matrix Market format with metadata file
>> > ------------------------------
>> >
>> >
>> >
>> > Hi,
>> >
>> > The Matrix Market coordinate format contains # rows, # columns, and #
>> > non-zero values as metadata near the top of a matrix data file.
>> >
>> > If I write a matrix in mm format using SystemML, no metadata file is
>> > created since the metadata is stored within the data file.
>> >
>> > However, when reading a matrix with mm format, I can supply a metadata
>> > file, even though metadata exists in the matrix data file. Is there any
>> > reason for this, or should this be disallowed since the metadata file is
>> > redundant and can cause confusion, since metadata values can then be
>> > specified in two places, which then brings up the question, "which
>> > metadata value should be used"?
>> >
>> > Deron
>> >
>> >
>> >
>>
>
>
