Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 965C1200D5C for ; Fri, 1 Dec 2017 01:40:58 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 94B06160C04; Fri, 1 Dec 2017 00:40:58 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id B3194160C01 for ; Fri, 1 Dec 2017 01:40:57 +0100 (CET) Received: (qmail 13157 invoked by uid 500); 1 Dec 2017 00:40:56 -0000 Mailing-List: contact dev-help@asterixdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@asterixdb.apache.org Delivered-To: mailing list dev@asterixdb.apache.org Received: (qmail 13144 invoked by uid 99); 1 Dec 2017 00:40:56 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 Dec 2017 00:40:56 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id DABB7C5C6A for ; Fri, 1 Dec 2017 00:40:55 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.18 X-Spam-Level: ** X-Spam-Status: No, score=2.18 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=2, KB_WAM_FROM_NAME_SINGLEWORD=0.2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=uci-edu.20150623.gappssmtp.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id R1vwt8XN5BLo for ; Fri, 1 Dec 2017 00:40:54 +0000 (UTC) Received: from mail-it0-f47.google.com (mail-it0-f47.google.com [209.85.214.47]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 3FA875F1BA for ; Fri, 1 Dec 2017 00:40:54 +0000 (UTC) Received: by mail-it0-f47.google.com with SMTP id x28so670019ita.0 for ; Thu, 30 Nov 2017 16:40:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uci-edu.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=hLkGiSpDiE1pgfy0qx0FR6aGpaOmem6Fpw3MmKZtOZw=; b=oJE1ftkZyg7qpxMNyb6PF9TFzu5L7J0vENZKZsjr4J4/TPyhqjms5ISWKltymtMicR nQmCy+Z4ZnUMa0s1kApTBF4X5NIAphbRtMFqIk/tKrFHN+YbQtCgJODp8jAKhxSOUO7r fUEcL+FoT1qU6rDfo6Db6/wVeXftTBHbw8e8NoBu5lZ7lAt96Q54nx+DNvsITum+HGF3 2D9LTb1X648yibzj18sJdp8TkTJNYq5xNNZ1Kk7wqPdD7pdJAZBUg6AshyAOpv83NDA9 NrwNOr3JG+FNloPix/Gn3JJJBdhG4Zdf8kndhvDwiYIa8bgLO6tiX3nMHYUoeHI4nX3x 6/9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=hLkGiSpDiE1pgfy0qx0FR6aGpaOmem6Fpw3MmKZtOZw=; b=esY31hDrO6g+acvbuJSM3w0ykIVpDfrMRBE0b2v50QDHkFFUJUui80gyqmH2lTDRvi MoQ7Yh9gufl9PqyXfAUPVvi+CpWINg7te8szUN9TZQ6p8dttR+sVnUz6/HZhiINc8uGJ MdsUqFc/4cFCIHZvcdb42hwhohuPjqMOftkTruEkBPc4KFqa4z18qFtNWAKOCH/LvpIu AbHFHhWjTFk/JNXMb8wxD+lWX8hHW4QMv5aJLLV4fVEEHLhgBrCm5R0NQuQOlGEKoRgZ Vy1D49WaVZUnyGosu9bzijxsfbX5hXEGSGL4IulhpqEg2OhedxrOSG1hdlT1TQ6Sw0tK 3KUA== X-Gm-Message-State: AJaThX4POk9KaYxo5At6q0ycIxesFrJaf0ilpIW3EPUE/pe6BGEshtLO NJolsm2qSY/llbdkOb/7vXzh6Z4NZuGAO6hcdr+sYzfq X-Google-Smtp-Source: AGs4zMZdYjC0HNjsUdRteMJkaf5rKgN5xJ0IQy89vRV1pWSwnuToSNENIZ/sOkKm9CPHirPfvVpqpZSCiDqwHb4JNes= X-Received: by 10.36.31.212 with SMTP id d203mr5518637itd.48.1512088853339; Thu, 30 Nov 2017 16:40:53 -0800 (PST) MIME-Version: 1.0 Received: by 10.79.27.142 with HTTP; Thu, 30 Nov 2017 16:40:52 -0800 (PST) In-Reply-To: References: <79D1C462-7953-4633-974B-DA189424A80B@gmail.com> From: Chen Luo Date: Thu, 30 Nov 2017 16:40:52 -0800 Message-ID: Subject: Re: The IIndexCursor interface To: dev@asterixdb.apache.org Content-Type: multipart/alternative; boundary="001a11446060f8f9ae055f3c9e77" archived-at: Fri, 01 Dec 2017 00:40:58 -0000 --001a11446060f8f9ae055f3c9e77 Content-Type: text/plain; charset="UTF-8" A typo in previous email "internal objects should only be created during the call of open" -> "internal objects should only be created during the first call of open" On Thu, Nov 30, 2017 at 4:37 PM, Chen Luo wrote: > +1. > > It seems to me the main issue for cursor is that a cursor sometime needs > to be re-used for performance reason (e.g., during primary key lookups > after secondary index search). One thing to note is that when make these > changes, or implement new cursors, one has to be very careful that a cursor > might be reused. As a requirement, internal objects should only be created > during the call of open, and close method must not clean up these objects > (unfortunately, the previous implementation of LSMBTreePointSearchCursor[1] > mistakenly clears its internal objects during reset, which results in tons > of objects are created during primary key lookups.) > > [1] https://github.com/apache/asterixdb/blob/ > 89e6a93277205a9dbc76c18e249919a745d224d2/hyracks-fullstack/ > hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/ > apache/hyracks/storage/am/lsm/btree/impls/LSMBTreePointSearchCursor. > java#L142 > > On Thu, Nov 30, 2017 at 3:42 PM, abdullah alamoudi > wrote: > >> Dear devs, >> The IIndexCursor interface is one of the critical interfaces inside >> asteridxb. It is used to access tuples inside indexes, we have many >> implementations for it and it is used differently in a different places. We >> are trying to specify a contract for the interface that all >> implementors/users of the a cursor have to follow to ensure consistent >> state and no leaked resources under any circumstances. The scope of this >> email focuses on the lifecycle of cursors and on the following existing >> methods: >> >> -- void open(ICursorInitialState initialState, ISearchPredicate >> searchPred) throws HyracksDataException; >> -- boolean hasNext() throws HyracksDataException; >> -- void next() throws HyracksDataException; >> -- void close() throws HyracksDataException; >> -- void reset() throws HyracksDataException; >> >> Currently, these calls are "mostly" used as follows in our code: >> >> - If there are multiple search predicates: >> cursor = new cursor(); >> while (more predicates){ >> cursor.reset() >> cursor.open(predicate); >> while (cursor.hasNext()){ >> cursor.next() >> } >> } >> cursor.close(); >> >> - If there is a single search predicate: >> cursor = new cursor(); >> cursor.open(predicate); >> while (cursor.hasNext()){ >> cursor.next() >> } >> cursor.close(); >> >> There are two problems with this: >> >> 1. There is no enforcement of any type of contract. For example, one can >> open a cursor and reset it and then continue to read tuples from the cursor >> as follows: >> >> cursor.open(predicate); >> cursor.hasNext() >> cursor.next() >> cursor.reset() >> cursor.hasNext() >> cursor.next() >> >> and continue to read tuples. This is bug prone and can cause hidden bugs >> to linger for a long time. >> >> 2. Naming and symmetry: open calls don't have corresponding close calls >> "unless we know the cursor will be used with exactly one search predicate" >> With this, the implementation of the cursor lead to either duplicate code >> or having close() call reset() or the other way around and handling of >> special cases. >> Moreover, when there are slight differences, often it is easy to make a >> change in one and forget about the other. >> >> ========================================== >> To deal with these issues, we are proposing the following: >> >> 1. change the methods to: >> >> -- void open(ICursorInitialState initialState, ISearchPredicate >> searchPred) throws HyracksDataException; >> -- boolean hasNext() throws HyracksDataException; >> -- void next() throws HyracksDataException; >> -- void close(); // used to be reset() >> -- void destroy(); // used to be close() >> >> >> The call cycle becomes: >> - If there are multiple search predicates: >> cursor = new cursor(); >> while (more predicates){ >> cursor.open(predicate); >> while (cursor.hasNext()){ >> cursor.next() >> } >> cursor.close(); // used to be reset() >> } >> cursor.destroy(); // used to be close() >> >> - If there is a single search predicate: >> cursor = new cursor(); >> cursor.open(predicate); >> while (cursor.hasNext()){ >> cursor.next() >> } >> cursor.close(); // used to be reset() >> cursor.destroy(); // used to be close() >> >> This way, we have a symmetry and we know that: >> -- A created cursor will always have cursor.destroy() called. >> -- An open cursor will always have cursor.close() called. >> >> >> 2. Enforce the cursor state machine as follows: >> The states are: >> CLOSED >> OPEN >> DESTROYED >> When a cursor object is created, it is in the CLOSED state. >> >> - CLOSED: The only legal calls are open() --> OPEN, or destroy() --> >> DESTROYED >> - OPEN: The only legal calls are hasNext(), next(), or close() --> CLOSED. >> - DESTROYED: All calls are illegal. >> >> We can then add tests to ensure that each of the cursors is enforcing the >> contract. >> >> Thoughts? > > > --001a11446060f8f9ae055f3c9e77--