Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 0C84E200D2A for ; Sat, 28 Oct 2017 18:44:07 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 0B2B6160BE1; Sat, 28 Oct 2017 16:44:07 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 513B8160BDC for ; Sat, 28 Oct 2017 18:44:06 +0200 (CEST) Received: (qmail 55210 invoked by uid 500); 28 Oct 2017 16:44:05 -0000 Mailing-List: contact dev-help@asterixdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@asterixdb.apache.org Delivered-To: mailing list dev@asterixdb.apache.org Received: (qmail 55197 invoked by uid 99); 28 Oct 2017 16:44:05 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 28 Oct 2017 16:44:05 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 6BF45180785 for ; Sat, 28 Oct 2017 16:44:04 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.379 X-Spam-Level: ** X-Spam-Status: No, score=2.379 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RCVD_IN_SORBS_SPAM=0.5, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id E20g7tQaN5uV for ; Sat, 28 Oct 2017 16:44:02 +0000 (UTC) Received: from mail-oi0-f42.google.com (mail-oi0-f42.google.com [209.85.218.42]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 190885FB0B for ; Sat, 28 Oct 2017 16:44:02 +0000 (UTC) Received: by mail-oi0-f42.google.com with SMTP id h6so15305348oia.10 for ; Sat, 28 Oct 2017 09:44:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=G505g/CbaIMz6JYKrokMlE35aV9wObS5RC5HUi8IRBE=; b=V3OKYj3WgumaQpmPLVTZHRZMp6wsqX7N6Vu6nd9FUmN3kXp2haBmpB+KJHnR0S7NN8 KjMvFLGq4WeOUcHMv1YCyT4wyo/yw4vjBfP9TIZ60bSiBTGAiJpuDPP/1Y5UIgmwLRSk SrdDselWj6dfUMy1ro0mX0HRZUFxABxU1XrUlHdplXDqjjzOkeKGxqPtyyk99qO7y0qD JbTBh51amfgf1MsT/zLJgrk25FI/YI+bHyW8S2SXOLxrS1BZa0Bh2xgoO6WnSx5aW2WS JKqcERLdNNcfHdGJ45BMEwDbNMQUleB0Kzok71R0BSPzHy+KzQpqZBY7zoywmq+zfUKl FHzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=G505g/CbaIMz6JYKrokMlE35aV9wObS5RC5HUi8IRBE=; b=QAwjh02MXjV448Zu18R5ovj8GjodQFu1MPYKIQ/DS3ufFSU9dLBYRAINDuDEqqHm2h 3/7ynpyyqtJbfv+pkf2b7NsY2lgxxQl91IjF9NUcMs8KQ+OxtXCimk9ckWE2eW2YeWSw AwHbm75uMJgJXXhBgIBRlH0d/TVrqfpzoTpoLy4RJo0x/4TjfXsxZ96klvFixJ2CEVxV fxyg/FRHnFgTeMtfc36e9JZt5hc6Pw1YFV0qqTYSeP8DcACZO8Fwvr0qan5ufW5Oy4WW 1lRoTqpaAIKQoflEbMhG4XrnbO03FzmYIPc04q7B7GqEHxDBA99sJ9DTA4b8XUmtE2p0 4J/g== X-Gm-Message-State: AMCzsaXlUcDoch8tb3ByjNxcHUKsAPcja5xo5AczsJx0RzBmvK7sH6Ow DiRO9CFi4CjPhCiWfGzWf45aFGv5lBIzMMF4Y7Y= X-Google-Smtp-Source: ABhQp+Rn3aK9HERWy9NxFc5pT6rqQQ5t3zExp0kj+fnyJT4uQu2aqZ6LvPUSGUK9XqjZfJ5Qx6j2ssCw8DkSxf6h81E= X-Received: by 10.157.48.66 with SMTP id w2mr2251360otd.417.1509209041217; Sat, 28 Oct 2017 09:44:01 -0700 (PDT) MIME-Version: 1.0 Received: by 10.157.14.9 with HTTP; Sat, 28 Oct 2017 09:44:00 -0700 (PDT) In-Reply-To: References: <0DB6BB3D-170F-417B-9C3D-99905228C069@gmail.com> <3dd76412.a45d.15f61ce2cc6.Coremail.00008749@whu.edu.cn> From: Wail Alkowaileet Date: Sat, 28 Oct 2017 09:44:00 -0700 Message-ID: Subject: Re: Re: Adapting TimSort into AsterixDB/Hyracks To: dev@asterixdb.apache.org Content-Type: multipart/alternative; boundary="001a113ac576cb2c9c055c9e1ca1" archived-at: Sat, 28 Oct 2017 16:44:07 -0000 --001a113ac576cb2c9c055c9e1ca1 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable P.S Spark implementation is under Apache. On Sat, Oct 28, 2017 at 9:41 AM, Wail Alkowaileet wrote: > Android has an implementation: > https://github.com/retrostreams/android-retrostreams/blob/master/src/ > main/java/java9/util/TimSort.java > > Spark has ported it > https://github.com/apache/spark/blob/master/core/src/ > main/java/org/apache/spark/util/collection/TimSort.java > > We can customize it for AsterixDB comparators. > > On Sat, Oct 28, 2017 at 9:28 AM, Chen Luo wrote: > >> I don't know whether there is an easy way for us to directly reuse TimSo= rt >> in JDK since it's design for sorting objects in an Array, while in Hyrac= ks >> we don't have explicit object creation and sort everything in main memor= y. >> So what I did is that I copied it's source code, and replaced all object >> assignments/swaps/comparisons using Hyracks in-memory operations. >> >> Best regards, >> Chen Luo >> >> On Sat, Oct 28, 2017 at 12:07 AM, =E6=9D=8E=E6=96=87=E6=B5=B7 <00008749@= whu.edu.cn> wrote: >> >> > I believe reusing jdk afap could be better. btw, timsort is better tha= n >> > others by 1x when records are locally ordered . >> > best >> > >> > =E5=9C=A8 2017-10-28 14:38:21=EF=BC=8C"abdullah alamoudi" =E5=86=99=E9=81=93=EF=BC=9A >> > >> > >While I have no answer to the question of legality, this sounds great= . >> > > >> > >~Abdullah. >> > > >> > >> On Oct 27, 2017, at 9:20 PM, Chen Luo wrote: >> > >> >> > >> Hi devs, >> > >> >> > >> I have adapted the TimSort algorithm used in JDK (java.util.TimSort= ) >> > into >> > >> Hyracks, which gives 10-20% performance improvements on random data= . >> It >> > >> will be more useful if the input data is partially sorted, e.g., >> primary >> > >> keys fetched from secondary index scan, which I haven't got time to >> > >> experiment with. >> > >> >> > >> *Before going any further, is it legal to adapt some algorithm >> > >> implementation from JDK into our codebase? *I saw the JDK >> implementation >> > >> itself is adopted from >> > >> http://svn.python.org/projects/python/trunk/Objects/listsort.txt as >> > well. >> > >> >> > >> Best regards, >> > >> Chen Luo >> > > >> > >> > >> > > > > -- > > *Regards,* > Wail Alkowaileet > --=20 *Regards,* Wail Alkowaileet --001a113ac576cb2c9c055c9e1ca1--