pig-dev mailing list archives

From "Cheolsoo Park (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PIG-4298) Descending order-by is broken in some cases when key is bytearrays
Date Wed, 05 Nov 2014 16:36:36 GMT
Cheolsoo Park created PIG-4298:
----------------------------------

             Summary: Descending order-by is broken in some cases when key is bytearrays 
                 Key: PIG-4298
                 URL: https://issues.apache.org/jira/browse/PIG-4298
             Project: Pig
          Issue Type: Bug
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park
             Fix For: 0.15.0


Here is a repro script (using [PigPen|https://github.com/Netflix/PigPen]):
{code}
REGISTER pigpen.jar;

load4254 = LOAD 'input.clj'
    USING PigStorage('\n')
    AS (value:chararray);

DEFINE udf4265 pigpen.PigPenFnDataBag(
    '(clojure.core/require (quote [pigpen.runtime]) (quote [clojure.edn]))',
    '(pigpen.runtime/exec [(pigpen.runtime/process->bind (pigpen.runtime/pre-process :pig :native)) (pigpen.runtime/map->bind clojure.edn/read-string) (pigpen.runtime/key-selector->bind clojure.core/identity) (pigpen.runtime/process->bind (pigpen.runtime/post-process :pig :native-key-frozen-val))])');

generate4263 = FOREACH load4254 GENERATE
    FLATTEN(udf4265(value));
generate4257 = FOREACH generate4263 GENERATE
    $0 AS key,
    $1 AS value;

order4258 = ORDER generate4257 BY key DESC; -- sort order isn't changed by DESC
dump order4258;
{code}
This script returns the same result for both ASC and DESC orders.

The problem is as follows:
# {{PigBytesRawComparator}} calls {{BinInterSedesTupleRawComparator.compare()}}.
# {{BinInterSedesTupleRawComparator}} applies descending order.
# {{PigBytesRawComparator}} applies descending order again to what {{BinInterSedesTupleRawComparator}} returns.

The two inversions cancel each other out, so descending order is never applied.
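
The double inversion can be illustrated with a small, self-contained Java sketch. It does not use Pig's real classes: {{inner}}, {{outerBuggy}} and {{outerFixed}} are hypothetical stand-ins for the two comparators named in the steps, and the "fixed" variant only shows one possible way to avoid the second inversion, not necessarily what the actual patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DoubleNegationSketch {
    // Hypothetical stand-in for the inner (tuple) comparator:
    // it already honours the descending flag by negating the natural order.
    static Comparator<String> inner(boolean desc) {
        return (a, b) -> desc ? -a.compareTo(b) : a.compareTo(b);
    }

    // Hypothetical stand-in for the outer (bytes) comparator with the bug:
    // it delegates to the inner comparator and then negates the result again.
    static Comparator<String> outerBuggy(boolean desc) {
        Comparator<String> delegate = inner(desc);
        return (a, b) -> {
            int rc = delegate.compare(a, b);
            return desc ? -rc : rc; // second inversion cancels the first
        };
    }

    // One possible fix (an assumption): trust the inner comparator's result
    // and do not apply the descending flag a second time.
    static Comparator<String> outerFixed(boolean desc) {
        return inner(desc);
    }

    // Returns a sorted copy so the input list is left untouched.
    static List<String> sorted(List<String> keys, Comparator<String> cmp) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(cmp);
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("b", "a", "c");
        System.out.println(sorted(keys, outerBuggy(true))); // [a, b, c]
        System.out.println(sorted(keys, outerFixed(true))); // [c, b, a]
    }
}
```

With the buggy variant, DESC and ASC produce the same ascending result, which matches the behaviour of the repro script.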



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
