cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jens Rantil <jens.ran...@tink.se>
Subject Re: How to extract all the user id from a single table in Cassandra?
Date Mon, 02 Mar 2015 08:32:03 GMT
Hi Check,

Please avoid double posting on mailing lists. It leads to double work
(respect people's time!) and makes it hard for people in the future having
the same issue as you to follow discussions and answers.

That said, if you have a lot of primary keys

    select user_id from testkeyspace.user_record;

will most definitely timeout. Have a look at `SELECT DISTINCT` at [1]. More
importantly, for larger datasets you will also need to split the token
space into smaller segments and iteratively select your primary keys. See
[2].

[1]
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html
[2]
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html?scroll=reference_ds_d35_v2q_xj__paging-through-unordered-results

If you are having specific issues with the Java Driver I suggest you ask on
that mailing list (only).

Cheers,
Jens

On Sun, Mar 1, 2015 at 6:38 PM, Check Peck <comptechgeeky@gmail.com> wrote:

> Sending again as I didn't got any response on this.
>
> Any thoughts?
>
> On Fri, Feb 27, 2015 at 8:24 PM, Check Peck <comptechgeeky@gmail.com>
> wrote:
>
>> I have a Cassandra table like this -
>>
>>     create table user_record (user_id text, record_name text,
>> record_value blob, primary key (user_id, record_name));
>>
>> What is the best way to extract all the user_id from this table? As of
>> now, I cannot change my data model to do this exercise so I need to find a
>> way by which I can extract all the user_id from the above table.
>>
>> I am using Datastax Java driver in my project. Is there any other easy
>> way apart from code to extract all the user_id from the above table through
>> come cqlsh utility and dump it into some file?
>>
>> I am thinking below code might timed out after some time -
>>
>>     public class TestCassandra {
>>
>>             private Session session = null;
>>             private Cluster cluster = null;
>>
>>             private static class ConnectionHolder {
>>                 static final TestCassandra connection = new
>> TestCassandra();
>>             }
>>
>>             public static TestCassandra getInstance() {
>>                 return ConnectionHolder.connection;
>>             }
>>
>>             private TestCassandra() {
>>                 Builder builder = Cluster.builder();
>>                 builder.addContactPoints("127.0.0.1");
>>
>>                 PoolingOptions opts = new PoolingOptions();
>>                 opts.setCoreConnectionsPerHost(HostDistance.LOCAL,
>> opts.getCoreConnectionsPerHost(HostDistance.LOCAL));
>>
>>                 cluster =
>> builder.withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE).withPoolingOptions(opts)
>>                         .withLoadBalancingPolicy(new TokenAwarePolicy(new
>> DCAwareRoundRobinPolicy("PI")))
>>                         .withReconnectionPolicy(new
>> ConstantReconnectionPolicy(100L))
>>                         .build();
>>                 session = cluster.connect();
>>             }
>>
>>         private Set<String> getRandomUsers() {
>>             Set<String> userList = new HashSet<String>();
>>
>>             String sql = "select user_id from testkeyspace.user_record;";
>>
>>             try {
>>                 SimpleStatement query = new SimpleStatement(sql);
>>                 query.setConsistencyLevel(ConsistencyLevel.ONE);
>>                 ResultSet res = session.execute(query);
>>
>>                 Iterator<Row> rows = res.iterator();
>>                 while (rows.hasNext()) {
>>                     Row r = rows.next();
>>
>>                     String user_id = r.getString("user_id");
>>                     userList.add(user_id);
>>                 }
>>             } catch (Exception e) {
>>                 System.out.println("error= " + e);
>>             }
>>
>>             return userList;
>>         }
>>     }
>>
>> Adding java-driver group and Cassandra group as well to see whether there
>> is any better way to execute this?
>>
>
>


-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

Facebook <https://www.facebook.com/#!/tink.se> Linkedin
<http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
 Twitter <https://twitter.com/tink>

Mime
View raw message