incubator-crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CRUNCH-23) PCollection#sort doesn't do a full sort on values
Date Sat, 21 Jul 2012 11:39:33 GMT
Gabriel Reid created CRUNCH-23:
----------------------------------

             Summary: PCollection#sort doesn't do a full sort on values
                 Key: CRUNCH-23
                 URL: https://issues.apache.org/jira/browse/CRUNCH-23
             Project: Crunch
          Issue Type: Bug
            Reporter: Gabriel Reid


When a PCollection is sorted (using PCollection#sort), the sort is performed only
within each reducer; it is not a total sort over all values. As a result, iterating over
a materialized collection does not return the values in sorted order. It also means that
the sorted files output by a sort operation cannot simply be concatenated to produce
a single sorted file.
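
To illustrate the behaviour described above, here is a minimal, self-contained sketch in
plain Java. It does not use the Crunch API. The class name, the method names, the modulo-based
"hash" partitioning and the hand-picked split point are all assumptions made for this example.
It simulates two reducers that each sort their own partition, and then contrasts that with
range partitioning, which is one common way to get a total order in MapReduce-style systems
(Hadoop's TotalOrderPartitioner works along these lines). It is not meant to describe how
Crunch does or should implement the fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from Crunch.
public class PartitionedSortSketch {

    static List<List<Integer>> emptyPartitions(int n) {
        List<List<Integer>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            parts.add(new ArrayList<>());
        }
        return parts;
    }

    // Assumed stand-in for hash partitioning: value mod numReducers picks the reducer.
    // Each reducer then sorts only the values it received.
    static List<List<Integer>> sortPerReducer(List<Integer> values, int numReducers) {
        List<List<Integer>> parts = emptyPartitions(numReducers);
        for (int v : values) {
            parts.get(Math.floorMod(v, numReducers)).add(v);
        }
        for (List<Integer> part : parts) {
            Collections.sort(part);
        }
        return parts;
    }

    // Range partitioning: values below splitPoints[0] go to partition 0, and so on.
    // splitPoints must be ascending; here they are chosen by hand, not sampled.
    static List<List<Integer>> sortWithRangePartitioning(List<Integer> values, int[] splitPoints) {
        List<List<Integer>> parts = emptyPartitions(splitPoints.length + 1);
        for (int v : values) {
            int idx = 0;
            while (idx < splitPoints.length && v >= splitPoints[idx]) {
                idx++;
            }
            parts.get(idx).add(v);
        }
        for (List<Integer> part : parts) {
            Collections.sort(part);
        }
        return parts;
    }

    // Models concatenating the per-reducer output files in partition order.
    static List<Integer> concatenate(List<List<Integer>> parts) {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> part : parts) {
            all.addAll(part);
        }
        return all;
    }

    static boolean isSorted(List<Integer> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1) > values.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(7, 2, 9, 4, 1, 8, 3, 6);

        List<Integer> perReducer = concatenate(sortPerReducer(input, 2));
        System.out.println(perReducer + " sorted=" + isSorted(perReducer));
        // prints: [2, 4, 6, 8, 1, 3, 7, 9] sorted=false

        List<Integer> ranged = concatenate(sortWithRangePartitioning(input, new int[] {5}));
        System.out.println(ranged + " sorted=" + isSorted(ranged));
        // prints: [1, 2, 3, 4, 6, 7, 8, 9] sorted=true
    }
}
```

Each partition is sorted in both cases. Only with range partitioning does every value in one
partition precede every value in the next, which is what makes simple concatenation safe.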

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
