cassandra-commits mailing list archives

From "Sachin Goyal (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-13350) Having utility methods in session object
Date Mon, 20 Mar 2017 06:10:42 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sachin Goyal updated CASSANDRA-13350:
-------------------------------------
    Description: 
Data modeling in Cassandra is the key to querying.
The best way to query is to design tables so that you always query by primary key or by partition key.

And yet the DataStax driver's session object has no method that simplifies this process.
It would be great to have methods like:
# session.getByPrimaryKey (String tableName, Object []primaryKeys)
# session.getByPartitionKey (String tableName, Object []partitionKeys)
# session.getByPartitionKeys (String tableName, Object [][]partitionKeys) // Like an in-query
# session.getByPrimaryKeys (String tableName, Object [][]primaryKeys)

The last is not yet supported in Cassandra, but it would be really useful to have.
It would be the read equivalent of batch statements for writes.
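
As a rough illustration of what a helper such as getByPrimaryKey could do internally, the sketch below builds the parameterized CQL string from a table name and its key columns. The class name KeyQuerySketch, the method selectByKey, and the keyColumns parameter are all hypothetical and not part of the DataStax driver; a real implementation would presumably read the key column names from the driver's schema metadata and might list the selected columns explicitly rather than use select *.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of the DataStax driver API.
final class KeyQuerySketch {

    // Builds "SELECT * FROM <table> WHERE k1 = ? AND k2 = ?" for the given key columns.
    // keyColumns is an assumption: the proposed methods take only key values, so a real
    // implementation would have to look the column names up in the schema metadata.
    static String selectByKey(String tableName, List<String> keyColumns) {
        if (keyColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one key column is required");
        }
        StringJoiner where = new StringJoiner(" AND ");
        for (String column : keyColumns) {
            where.add(column + " = ?");
        }
        return "SELECT * FROM " + tableName + " WHERE " + where;
    }

    public static void main(String[] args) {
        // Table and column names here are made up for the example.
        System.out.println(selectByKey("users", Arrays.asList("tenant_id", "user_id")));
    }
}
```

The same string works for getByPartitionKey if only the partition-key columns are passed; the key values would then be bound to the placeholders in order.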

Advantages:
# Ease of use: the user does not have to create a string query or a prepared query.
# The user does not have to worry about [using prepared statements with select * queries|https://docs.datastax.com/en/developer/java-driver/3.1/manual/statements/prepared/#avoid-preparing-select-queries]. I am not yet sure how the driver would handle it, but if it can, wow!
# If the client uses the same Murmur3 hashing as the cluster, clients can query just the right node (better token awareness?).

Several other libraries provide such methods. Examples:
# Hibernate: [session.get()|https://www.mkyong.com/hibernate/different-between-session-get-and-session-load/]
# JPA: [find()|http://www.java2s.com/Code/Java/JPA/GetEntitybyID.htm]
# Solr: [getById()|https://lucene.apache.org/solr/6_4_1/solr-solrj/org/apache/solr/client/solrj/SolrClient.html#getById-java.lang.String-java.util.Collection-org.apache.solr.common.params.SolrParams-] and several variants of it.

(Please note that these links are only examples; they are not meant to prescribe implementation
details or behavior.)

As a feature, *session.getByPrimaryKeys (String tableName, Object [][]primaryKeys)* should
give users a performance boost, because it allows queries for different partitions to run
in parallel and allows results from the same partition to be fetched in one query.
We can put this in a separate JIRA task if everyone sees it as a useful feature.
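
One way to picture the first step of getByPrimaryKeys is grouping the requested keys by partition, so that each partition needs only one query and the per-partition queries can be issued in parallel. The sketch below shows only that grouping step. The names PartitionGroupingSketch, groupByPartition, and partitionKeySize are hypothetical, and the sketch assumes the partition-key components come first in each key array; how the grouped queries are then executed is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for illustration; these names do not exist in Cassandra or the driver.
final class PartitionGroupingSketch {

    // Groups full primary keys by their partition-key prefix, preserving input order.
    // partitionKeySize (the number of leading partition-key components) is an assumption;
    // a real implementation would presumably take it from the table's schema metadata.
    static Map<List<Object>, List<Object[]>> groupByPartition(Object[][] primaryKeys,
                                                              int partitionKeySize) {
        Map<List<Object>, List<Object[]>> groups = new LinkedHashMap<>();
        for (Object[] key : primaryKeys) {
            if (key.length < partitionKeySize) {
                throw new IllegalArgumentException("key is shorter than the partition key");
            }
            // A List has element-based equals/hashCode, so it can serve as the map key.
            List<Object> partition = Arrays.asList(Arrays.copyOfRange(key, 0, partitionKeySize));
            groups.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up keys: the first component is the partition key, the second a clustering key.
        Object[][] keys = {{"t1", 1}, {"t1", 2}, {"t2", 7}};
        Map<List<Object>, List<Object[]>> groups = groupByPartition(keys, 1);
        for (Map.Entry<List<Object>, List<Object[]>> e : groups.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " key(s)");
        }
    }
}
```

Each group could then become a single query for its partition, with the groups submitted asynchronously and the results merged on the client.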

  was:
Data modeling in Cassandra is the key to querying.
Best way to query is to have tables where you always query by primary-key or by partition-key.

And yet there is no method in the datastax's session object that simplifies this process.
It would be great to have methods like:
# session.getByPrimaryKey (String tableName, Object []primaryKeys)
# session.getByPartitionKey (String tableName, Object []partitionKeys)
# session.getByPartitionKeys (String tableName, Object [][]partitionKeys) // Like an in-query
# session.getByPrimaryKeys (String tableName, Object [][]primaryKeys)

The last is an unsupported feature yet in Cassandra but it would be really awesome to have
the same. It would be like a read equivalent of the batch-statements in write.

Advantages:
# Ease-of-use: User does not have to create a string query or a prepared query.
# User does not have to worry about [using prepared statements with select * queries|https://docs.datastax.com/en/developer/java-driver/3.1/manual/statements/prepared/#avoid-preparing-select-queries].
I am not yet sure how the driver would handle it but if it can, wow!
# If murmur-3 hashing in the client is same as the cluster, clients can query just the right
node (Better token-aware?)

Tools like Hibernate provide such a feature. Examples: 
# [session.get()|https://www.mkyong.com/hibernate/different-between-session-get-and-session-load/]
 and
# [JPA.find|http://www.java2s.com/Code/Java/JPA/GetEntitybyID.htm].

(Please note that these links are just an example, not meant to provide implementation details
or the behavior).

As a feature, *session.getByPrimaryKeys (String tableName, Object [][]primaryKeys)* should
 help get a performance boost to the users because it allows running queries for different
partitions in parallel and also allows getting results from the same partition in one query.
We can put this in a separate JIRA task if it is seen as a useful feature by all.


> Having utility methods in session object
> ----------------------------------------
>
>                 Key: CASSANDRA-13350
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13350
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sachin Goyal

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
