hadoop-common-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: TextInputFormat unique key across files
Date Mon, 04 May 2009 20:04:45 GMT
Hi Rares,

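You can access the name of the current file by looking at the
"map.input.file" configuration variable in the Configuration object.
Prefixing each key with that file name makes it unique across files.

If you're using Hadoop Streaming, this is available as the environment
variable $map_input_file (dots in the property name become underscores).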
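
For the Streaming case, a minimal sketch of a mapper that builds keys unique across files might look like the following. It assumes Streaming exports job properties as environment variables with dots replaced by underscores, and it checks both the older names (map_input_file, map_input_start) and the newer ones (mapreduce_map_input_file, mapreduce_map_input_start), since the exact name depends on the Hadoop release. The helper names first_env and keyed_lines are hypothetical. Because a Streaming mapper fed by TextInputFormat normally receives only the line text, the key here is composed as file, split start, and line index within the split rather than the original byte offset.

```python
import os
import sys

# Assumption: Streaming exposes job properties as environment variables with
# dots replaced by underscores; which of these names is set varies by release.
FILE_VARS = ("mapreduce_map_input_file", "map_input_file")
START_VARS = ("mapreduce_map_input_start", "map_input_start")


def first_env(names, env, default):
    """Return the value of the first name found in env, else default."""
    for name in names:
        if name in env:
            return env[name]
    return default


def keyed_lines(lines, env):
    """Yield (key, line) pairs with key '<file>:<split start>:<line index>'.

    The file name and split start identify the split, and the line index is
    unique within it, so the combined key is unique across all input files.
    """
    path = first_env(FILE_VARS, env, "unknown")
    start = first_env(START_VARS, env, "0")
    for index, line in enumerate(lines):
        yield "%s:%s:%d" % (path, start, index), line.rstrip("\n")


def main():
    # Emit tab-separated key/value pairs, the default Streaming output format.
    for key, line in keyed_lines(sys.stdin, os.environ):
        sys.stdout.write("%s\t%s\n" % (key, line))


if __name__ == "__main__":
    main()
```

Saved as an executable script, this would be passed to the Streaming job with the -mapper option. The "unknown" fallback only exists so the script can be run locally outside Hadoop.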

Hope that helps,
-Todd

On Mon, May 4, 2009 at 12:46 PM, Rares Vernica <rares@ics.uci.edu> wrote:

> Hello,
>
> TextInputFormat is a perfect match for my problem. The only drawback is
> the fact that keys are unique only within a file. Is there an easy way
> to make keys unique across files? That is, each line in any file should
> get a unique key. Is there a unique ID for each file? If so, maybe I
> can concatenate the two if I can access the file ID from the map function.
>
> Thanks,
> Rares
>
