hadoop-common-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: retain states between mappers
Date Sat, 05 Feb 2011 16:14:47 GMT
If you know in advance that abc.csv is supposed to be "0", xyz.csv "1",
and so on, then yes, there may be an easy way. To maintain a persistent
count across jobs, you could go a step further and have a separate
service keep track of it.
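
As a rough sketch of the "known in advance" case: the job driver could fix
the numbering before submitting the job, for instance by sorting the input
file names and numbering them from 0. The class FileIndexer and the method
assignIndices below are hypothetical names made up for illustration, and
lexicographic order is only an assumption about what "the sequence" means.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: assigns each input file a
// fixed index up front so no state has to be shared between mappers.
public class FileIndexer {

    // Sorts the names lexicographically (an assumed ordering) and numbers
    // them from 0. Duplicate names keep the first index they were given.
    public static Map<String, Integer> assignIndices(Collection<String> fileNames) {
        List<String> sorted = new ArrayList<>(fileNames);
        Collections.sort(sorted);
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String name : sorted) {
            if (!index.containsKey(name)) {
                index.put(name, index.size());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = assignIndices(Arrays.asList("xyz.csv", "abc.csv"));
        System.out.println(index); // prints {abc.csv=0, xyz.csv=1}
    }
}
```

The driver could then copy this mapping into the job configuration, and
each mapper could look up the index for the file named by its own input
split, so the result does not depend on which mapper starts first.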

If not, what would you do if multiple mappers get instantiated at the
same time for a lot of files? How would you determine the ordered
count for each if they all (or some, possibly) begin at the same time?
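
To illustrate the problem, here is a small stand-alone simulation using
plain threads in place of mappers (no Hadoop involved; SharedCounterRace
and run are hypothetical names). Each "mapper" takes the next value from a
shared counter. Every file gets a unique number, but which file gets which
number depends on scheduling and may differ from run to run.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Simulation only: threads stand in for mappers that start together.
public class SharedCounterRace {

    public static Map<String, Integer> run(String... files) {
        AtomicInteger counter = new AtomicInteger(0);
        Map<String, Integer> assigned = new ConcurrentHashMap<>();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[files.length];

        for (int i = 0; i < files.length; i++) {
            final String file = files[i];
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // hold every worker until the release below
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // Atomic, so values are unique, but arrival order is not fixed.
                assigned.put(file, counter.getAndIncrement());
            });
            workers[i].start();
        }

        start.countDown(); // release all workers at once
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return assigned;
    }

    public static void main(String[] args) {
        // The file-to-number pairing printed here is not deterministic.
        System.out.println(run("abc.csv", "xyz.csv"));
    }
}
```

A shared counter therefore gives you distinct numbers, not an ordering
tied to particular files.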

On Sat, Feb 5, 2011 at 5:33 PM, ANKITBHATNAGAR <abhatnagar@vantage.com> wrote:
> Hi All,
> I am working on a task where I have to determine the count in the sequence
> and increment by one.
> My input to the job is multiple files
> input/abc.csv
> input/xyz.csv
> So for example if my mapper is processing abc.csv I should be able to say my
> current count is 0.
> For file xyz.csv I should be able to say the current count is 1.
> Is there a way I can retain the count between mappers and increment it?
> Thanks
> Ankit
> --
> View this message in context: http://old.nabble.com/retain-states-between-mappers-tp30851293p30851293.html

Harsh J
