gora-dev mailing list archives

From "Alfonso Nishikawa (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GORA-514) Scan of a single key with a limit clears the persistent instance when iterating results
Date Mon, 17 Jul 2017 21:09:00 GMT

    [ https://issues.apache.org/jira/browse/GORA-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090564#comment-16090564 ]

Alfonso Nishikawa commented on GORA-514:
----------------------------------------

I have been digging, and maybe I am wrong.

I had assumed that result#get() should return a fresh (non-reused) instance, as happens
with HBase, but maybe it is HBase that is wrong and result#get() is meant to reuse the instance?
The source code seems to suggest so [1], but I could not find any documentation where this
is specified. Does anyone know?

If this is the case, then HBaseStore is the one deviating by returning new instances, and the
only thing to do in this issue is to state somewhere in the javadoc that the instance returned
by #get() _could be_ a reused one and should be cloned if necessary.
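
To illustrate why a caller would need to clone, here is a minimal, self-contained sketch.
It does not use Gora at all: Record and ReusingResult are hypothetical stand-ins that mimic
a result which clears and reuses a single instance on every next() call. That behaviour is
an assumption made for the illustration, not a statement of what ResultBase actually does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ReusedInstanceSketch {

    /** Hypothetical stand-in for a persistent bean; not a Gora class. */
    static final class Record {
        String value;

        /** Returns an independent copy, playing the role of "cloning". */
        Record copy() {
            Record r = new Record();
            r.value = value;
            return r;
        }

        void clear() {
            value = null;
        }
    }

    /** Hypothetical result that reuses one Record across next() calls. */
    static final class ReusingResult {
        private final String[] keys;
        private final String[] values;
        private final Record shared = new Record();
        private int pos = -1;

        ReusingResult(String[] keys, String[] values) {
            this.keys = keys;
            this.values = values;
        }

        boolean next() {
            shared.clear();            // assumed: the instance is cleared before each advance
            pos++;
            if (pos >= keys.length) {
                return false;          // no more rows, but the shared instance is already cleared
            }
            shared.value = values[pos];
            return true;
        }

        String getKey() {
            return keys[pos];
        }

        Record get() {
            return shared;             // always the same object
        }
    }

    /** Collects results, optionally copying each instance before storing it. */
    static Map<String, Record> collect(ReusingResult result, boolean copy) {
        Map<String, Record> out = new LinkedHashMap<>();
        while (result.next()) {
            Record r = result.get();
            out.put(result.getKey(), copy ? r.copy() : r);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] keys = {"k1"};
        String[] values = {"v1"};

        Map<String, Record> uncopied = collect(new ReusingResult(keys, values), false);
        Map<String, Record> copied = collect(new ReusingResult(keys, values), true);

        System.out.println("without copy: " + uncopied.get("k1").value); // prints "without copy: null"
        System.out.println("with copy: " + copied.get("k1").value);      // prints "with copy: v1"
    }
}
```

In this sketch the stored reference loses its value as soon as the loop calls next() a second
time, while the copied one keeps it.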

What do you think? I now feel quite confident that all this issue needs is that explanation
somewhere :)

[1] https://github.com/apache/gora/blob/apache-gora-0.7/gora-core/src/main/java/org/apache/gora/query/impl/ResultBase.java#L103

> Scan of a single key with a limit clears the persistent instance when iterating results
> ---------------------------------------------------------------------------------------
>
>                 Key: GORA-514
>                 URL: https://issues.apache.org/jira/browse/GORA-514
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: gora-accumulo, gora-cassandra, gora-core, gora-hbase, gora-jcache, gora-mongodb, gora-solr
>    Affects Versions: 0.7, 0.8
>            Reporter: Alfonso Nishikawa
>            Priority: Minor
>         Attachments: GORA-514-Example-Test.diff
>
>
> To put this in context, I am just doing a scan where the start key, end key and limit are configurable:
> {code:java}
> Query<String,Persistent> dataQuery = dataStore.newQuery();
> if (startKey != null && !startKey.equals("")) {
>     dataQuery.setStartKey(startKey);
> }
> if (endKey != null && !endKey.equals("")) {
>     dataQuery.setEndKey(endKey);
> }
> dataQuery.setLimit(limit);
> Result<?,Persistent> result = dataQuery.execute();
> while (result.next()) {
>     results.put(result.getKey(), result.get());
> }
> {code}
> When the start key is equal to the end key, and the limit is set to a value >= 2 (the default value is -1), the second call to result.next() in the while loop clears the instance previously returned by result.get().
> One could argue that this is expected behaviour, since result.get() (specifically for HBase) returns a reusable instance when performing a Get operation, but this clashes with the behaviour generally expected of the usual Scan operation shown in the code example above.
> That is: next() and get() should behave the same when performing a scan, no matter what start/end keys you configure or what maximum number of results you request.
> I implemented a test that shows the issue affecting Accumulo, Cassandra, HBase, JCache, MongoDB and Solr, probably because the problem lies in the core.
> To see the error, you can apply the attached patch with the example tests and run:
> {code}
> mvn -Dtest=#testScanSingleResultWithLimit -fn -DfailIfNoTests=false test
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
