giraph-dev mailing list archives

From "Gianmarco De Francisci Morales (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-235) SequenceFile output format (id-value only)
Date Fri, 06 Jul 2012 11:43:35 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407915#comment-13407915 ]

Gianmarco De Francisci Morales commented on GIRAPH-235:
-------------------------------------------------------

Indeed, you are right. Here is how SequenceFileOutputFormat returns the RecordWriter.

{code}
    final SequenceFile.Writer out =
      SequenceFile.createWriter(fs, conf, file,
                                context.getOutputKeyClass(),
                                context.getOutputValueClass(),
                                compressionType,
                                codec,
                                context);

    return new RecordWriter<K, V>() {

        public void write(K key, V value)
          throws IOException {

          out.append(key, value);
        }
{code}

It takes the key and value classes from the context (getOutputKeyClass() and
getOutputValueClass()), not from the generic type parameters K and V.

I don't think I can configure the RecordWriter from its type parameters at runtime:
because of type erasure, the generic type arguments are not available at that point.
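
To illustrate the erasure point, here is a minimal sketch in plain Java with no Hadoop dependency. The names ErasureSketch and TypedWriter are hypothetical and made up for this illustration; they are not part of Hadoop or Giraph. The sketch shows that two differently parameterized instances share one runtime class, and that a component needing the concrete class has to receive it as an explicit Class token, which is roughly the role the context's output key/value class settings play above.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureSketch {

    /** Hypothetical stand-in for a writer that must know its key class at runtime. */
    static final class TypedWriter<K> {
        // Explicit class token: unlike the type argument K, it survives erasure.
        private final Class<K> keyClass;
        private final List<K> written = new ArrayList<>();

        TypedWriter(Class<K> keyClass) {
            this.keyClass = keyClass;
        }

        void append(Object key) {
            // Runtime check against the token; throws ClassCastException on a mismatch.
            written.add(keyClass.cast(key));
        }

        Class<K> getKeyClass() {
            return keyClass;
        }

        int size() {
            return written.size();
        }
    }

    /** Returns true when two differently parameterized lists share one runtime class. */
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println("same runtime class: " + sameRuntimeClass());

        TypedWriter<Long> writer = new TypedWriter<>(Long.class);
        writer.append(42L);
        System.out.println("key class: " + writer.getKeyClass().getName());

        try {
            writer.append("not a long");
        } catch (ClassCastException e) {
            System.out.println("rejected key of the wrong class");
        }
    }
}
```

In this sketch the class token is the only way TypedWriter can check its keys, which is why the classes configured in the job matter more than the type parameters declared on the output format.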

However, I am a bit confused. I would think that these lines in GraphMapper serve exactly
the purpose of configuring the job correctly:

{code}
    conf.setClass(GiraphJob.VERTEX_INDEX_CLASS,
        (Class<?>) vertexIndexType,
        WritableComparable.class);
    conf.setClass(GiraphJob.VERTEX_VALUE_CLASS,
        (Class<?>) vertexValueType,
        Writable.class);
{code}

Shouldn't it work out of the box this way?
                
> SequenceFile output format (id-value only)
> ------------------------------------------
>
>                 Key: GIRAPH-235
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-235
>             Project: Giraph
>          Issue Type: New Feature
>          Components: lib
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: GIRAPH-235.1.patch
>
>
> Create a SequenceFileOutputFormat for the cases where compression is important and we
> only want the value of the vertex (e.g. PageRank).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
