asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chen Luo <cl...@uci.edu>
Subject Re: The IIndexCursor interface
Date Fri, 01 Dec 2017 00:40:52 GMT
A typo in previous email "internal objects should only be created during
the call of open" -> "internal objects should only be created during the
first call of open"

On Thu, Nov 30, 2017 at 4:37 PM, Chen Luo <cluo8@uci.edu> wrote:

> +1.
>
> It seems to me the main issue for cursor is that a cursor sometime needs
> to be re-used for performance reason (e.g., during primary key lookups
> after secondary index search). One thing to note is that when make these
> changes, or implement new cursors, one has to be very careful that a cursor
> might be reused. As a requirement, internal objects should only be created
> during the call of open, and close method must not clean up these objects
> (unfortunately, the previous implementation of LSMBTreePointSearchCursor[1]
> mistakenly clears its internal objects during reset, which results in tons
> of objects are created during primary key lookups.)
>
> [1] https://github.com/apache/asterixdb/blob/
> 89e6a93277205a9dbc76c18e249919a745d224d2/hyracks-fullstack/
> hyracks/hyracks-storage-am-lsm-btree/src/main/java/org/
> apache/hyracks/storage/am/lsm/btree/impls/LSMBTreePointSearchCursor.
> java#L142
>
> On Thu, Nov 30, 2017 at 3:42 PM, abdullah alamoudi <bamousaa@gmail.com>
> wrote:
>
>> Dear devs,
>> The IIndexCursor interface is one of the critical interfaces inside
>> asteridxb. It is used to access tuples inside indexes, we have many
>> implementations for it and it is used differently in a different places. We
>> are trying to specify a contract for the interface that all
>> implementors/users of the a cursor have to follow to ensure consistent
>> state and no leaked resources under any circumstances. The scope of this
>> email focuses on the lifecycle of cursors and on the following existing
>> methods:
>>
>> -- void open(ICursorInitialState initialState, ISearchPredicate
>> searchPred) throws HyracksDataException;
>> -- boolean hasNext() throws HyracksDataException;
>> -- void next() throws HyracksDataException;
>> -- void close() throws HyracksDataException;
>> -- void reset() throws HyracksDataException;
>>
>> Currently, these calls are "mostly" used as follows in our code:
>>
>> - If there are multiple search predicates:
>> cursor = new cursor();
>> while (more predicates){
>>   cursor.reset()
>>   cursor.open(predicate);
>>   while (cursor.hasNext()){
>>     cursor.next()
>>   }
>> }
>> cursor.close();
>>
>> - If there is a single search predicate:
>> cursor = new cursor();
>> cursor.open(predicate);
>> while (cursor.hasNext()){
>>   cursor.next()
>> }
>> cursor.close();
>>
>> There are two problems with this:
>>
>> 1. There is no enforcement of any type of contract. For example, one can
>> open a cursor and reset it and then continue to read tuples from the cursor
>> as follows:
>>
>> cursor.open(predicate);
>> cursor.hasNext()
>> cursor.next()
>> cursor.reset()
>> cursor.hasNext()
>> cursor.next()
>>
>> and continue to read tuples. This is bug prone and can cause hidden bugs
>> to linger for a long time.
>>
>> 2. Naming and symmetry: open calls don't have corresponding close calls
>> "unless we know the cursor will be used with exactly one search predicate"
>> With this, the implementation of the cursor lead to either duplicate code
>> or having close() call reset() or the other way around and handling of
>> special cases.
>> Moreover, when there are slight differences, often it is easy to make a
>> change in one and forget about the other.
>>
>> ==========================================
>> To deal with these issues, we are proposing the following:
>>
>> 1. change the methods to:
>>
>> -- void open(ICursorInitialState initialState, ISearchPredicate
>> searchPred) throws HyracksDataException;
>> -- boolean hasNext() throws HyracksDataException;
>> -- void next() throws HyracksDataException;
>> -- void close(); // used to be reset()
>> -- void destroy(); // used to be close()
>>
>>
>> The call cycle becomes:
>> - If there are multiple search predicates:
>> cursor = new cursor();
>> while (more predicates){
>>   cursor.open(predicate);
>>   while (cursor.hasNext()){
>>     cursor.next()
>>   }
>>   cursor.close(); // used to be reset()
>> }
>> cursor.destroy(); // used to be close()
>>
>> - If there is a single search predicate:
>> cursor = new cursor();
>> cursor.open(predicate);
>> while (cursor.hasNext()){
>>   cursor.next()
>> }
>> cursor.close(); // used to be reset()
>> cursor.destroy(); // used to be close()
>>
>> This way, we have a symmetry and we know that:
>> -- A created cursor will always have cursor.destroy() called.
>> -- An open cursor will always have cursor.close() called.
>>
>>
>> 2. Enforce the cursor state machine as follows:
>> The states are:
>> CLOSED
>> OPEN
>> DESTROYED
>> When a cursor object is created, it is in the CLOSED state.
>>
>> - CLOSED: The only legal calls are open() --> OPEN, or destroy() -->
>> DESTROYED
>> - OPEN: The only legal calls are hasNext(), next(), or close() --> CLOSED.
>> - DESTROYED: All calls are illegal.
>>
>> We can then add tests to ensure that each of the cursors is enforcing the
>> contract.
>>
>> Thoughts?
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message