hadoop-common-dev mailing list archives

From Arun C Murthy <ar...@yahoo-inc.com>
Subject JobConf.setOutputKeyComparatorClass
Date Thu, 29 Jun 2006 04:06:59 GMT

      I have a *map* which does some processing and then a *reduce* which sorts the results.

      TextInputFormat & TextOutputFormat are the input/output formats respectively.

      However the *sort* I want to perform is as follows:
      I want to sort the output by comparing selected 'columns' of each key in the Comparator,
rather than the entire key.

      E.g. with the sort-spec (column1, column0), i.e. sort on column 1 first and break ties on column 0, the input:
      aaa ccc ggg
      bbb aaa hhh

      should result in:
      bbb aaa hhh
      aaa ccc ggg
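
      To make the intended ordering concrete, here is a minimal sketch of the comparison alone, in plain Java with no Hadoop classes. It assumes whitespace-separated, zero-indexed columns and treats a missing column as an empty string. The names ColumnComparator and sortLines are hypothetical, and the sort-spec is passed to the constructor only because how the framework could supply it is exactly the open question here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: orders text lines by a list of column indices (the sort-spec). */
public class ColumnComparator implements Comparator<String> {
    // Zero-based column indices, most significant first, e.g. {1, 0}.
    private final int[] columns;

    public ColumnComparator(int... columns) {
        this.columns = columns.clone();
    }

    @Override
    public int compare(String a, String b) {
        // Assumption: columns are separated by runs of whitespace.
        String[] fa = a.trim().split("\\s+");
        String[] fb = b.trim().split("\\s+");
        for (int col : columns) {
            // Assumption: a missing column compares as the empty string.
            String ca = col < fa.length ? fa[col] : "";
            String cb = col < fb.length ? fb[col] : "";
            int cmp = ca.compareTo(cb);
            if (cmp != 0) {
                return cmp; // first differing column in the spec decides the order
            }
        }
        return 0; // equal on every column named in the spec
    }

    /** Returns a sorted copy of the given lines, leaving the input untouched. */
    public static List<String> sortLines(List<String> lines, int... columns) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(new ColumnComparator(columns));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("aaa ccc ggg", "bbb aaa hhh");
        // Sort-spec: column1, then column0.
        for (String line : sortLines(input, 1, 0)) {
            System.out.println(line);
        }
    }
}
```

      Run on the two lines above with the spec (1, 0), this puts "bbb aaa hhh" before "aaa ccc ggg", because "aaa" sorts before "ccc" in column 1.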

  I can't seem to find an 'elegant' way to do this via the MR framework, i.e. I can't find
a way to set a *policy* (the sort-spec) for the WritableComparable via the framework.
Is there something I'm missing? In essence, I probably need a *configure* callback for the
WritableComparable interface too. Is there a better way? Or is this outside the scope of the

