hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5266) Values Iterator should support "mark" and "reset"
Date Wed, 15 Apr 2009 13:51:15 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699195#action_12699195 ]

Devaraj Das commented on HADOOP-5266:
-------------------------------------

Some points:
1. Add a comment in IFile.Writer.close() explaining the keyClass != null check.
   Also add the clear method to the MarkableIterator interface.
2. A Counter for the number of times the values are iterated over would be nice to have.
   You can probably tighten how the firstkeybytes/firstvaluebytes are written by passing
   the Serializer the stream that backs the BackupStore, rather than copying the bytes
   into a DataOutputBuffer. Granted, this happens only for the first key/value bytes
   after mark is called, but it may be worth keeping the implementation tight if that
   doesn't complicate the code much.
3. Remove values.clear() from the ReduceValuesIterator iteration.
4. Shouldn't Task.ValuesIterator.readNextValue compute the length as
   "nextValueBytes.getLength() - nextValueBytes.getPosition()"?
5. The size of the MemoryCache in BackupStore should probably be a fraction of
   mapred.job.reduce.input.buffer.percent.
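
To make point 4 concrete, here is a minimal, self-contained sketch. It assumes
that the buffer's getLength() reports the end offset of the valid data rather
than the number of unread bytes, so the unread length has to be derived by
subtracting the current position. PositionedBuffer and remaining() are
hypothetical stand-ins written for this illustration; they are not the classes
touched by the patch.

```java
// Hypothetical stand-in for a positioned input buffer; not a Hadoop class.
class PositionedBuffer {
    private byte[] data = new byte[0];
    private int position; // index of the next unread byte
    private int end;      // index one past the last valid byte

    // Point the buffer at data[start .. start + length).
    void reset(byte[] data, int start, int length) {
        this.data = data;
        this.position = start;
        this.end = start + length;
    }

    int getPosition() { return position; }

    // Assumption: reports the end offset, not the count of unread bytes.
    int getLength() { return end; }

    // Number of bytes still unread, as suggested in point 4.
    static int remaining(PositionedBuffer buf) {
        return buf.getLength() - buf.getPosition();
    }
}
```

With a buffer reset to start at offset 4 with 3 valid bytes, getLength() alone
would report 7, while the subtraction gives the 3 bytes that actually belong
to the value.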

> Values Iterator should support "mark" and "reset"
> -------------------------------------------------
>
>                 Key: HADOOP-5266
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5266
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.21.0
>
>         Attachments: hadoop-5266-v1.patch
>
>
> Some users have expressed interest in mark-reset functionality on the values iterator.
> Users can call mark() at any point during the iteration, and a subsequent reset() should
> move the iterator back to the last value emitted when mark() was called.
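
The requested behaviour can be sketched in memory as follows. This is only an
illustration of the mark()/reset() contract, under the assumption that after
reset() the iteration resumes with the first value that followed the mark.
MarkableIteratorSketch is a hypothetical name; it buffers everything in a
list and does none of the spilling that a BackupStore would need.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical in-memory illustration; not the class from the patch.
class MarkableIteratorSketch<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final List<T> buffer = new ArrayList<>(); // values seen since mark()
    private int replayPos = 0;                        // next buffered value to replay
    private boolean marked = false;

    MarkableIteratorSketch(Iterator<T> source) {
        this.source = source;
    }

    // Remember the current point; values returned from now on are buffered.
    void mark() {
        // Drop values already consumed, keep any still waiting to be replayed.
        buffer.subList(0, replayPos).clear();
        replayPos = 0;
        marked = true;
    }

    // Rewind to the point where mark() was last called.
    void reset() {
        if (!marked) {
            throw new IllegalStateException("reset() called without mark()");
        }
        replayPos = 0;
    }

    @Override
    public boolean hasNext() {
        return replayPos < buffer.size() || source.hasNext();
    }

    @Override
    public T next() {
        if (replayPos < buffer.size()) {
            return buffer.get(replayPos++); // replay a buffered value
        }
        T value = source.next();
        if (marked) {
            buffer.add(value); // keep it so a later reset() can replay it
            replayPos++;
        }
        return value;
    }
}
```

For example, over the values 1..5, calling next(), mark(), next(), next(),
reset() and then next() again would return 2 under the assumption stated
above, since 2 was the first value read after the mark.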

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

