incubator-cassandra-user mailing list archives

From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject Re: pig counting question
Date Thu, 24 Mar 2011 18:34:18 GMT
The per-row column limit defaults to 1024, but you can set it when you use CassandraStorage in Pig, like so:
rows = LOAD 'cassandra://Keyspace/ColumnFamily' USING CassandraStorage(4096);
or whatever value you wish.

Give that a try and see if it gives you more of what you're looking for.
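
Applied to the script from the original message, the change might look like the sketch below. It only combines the two snippets already in this thread. The value 4096 is an arbitrary example, and the assumption is that a row with more columns than the limit would still be cut off at that limit, so the count is only reliable when the limit is larger than your widest row.

```pig
-- Sketch only: same script as in the original message, with the limit raised.
-- 4096 is an example value; choose one above your widest row (assumption:
-- rows wider than the limit are still truncated to the limit).
rows = LOAD 'cassandra://Keyspace/ColumnFamily' USING CassandraStorage(4096);

-- $0 is the row key, $1 is the bag of columns; COUNT($1) counts the columns loaded.
rowct = FOREACH rows GENERATE $0, COUNT($1);
dump rowct;
```

If some rows still report exactly 4096, that would suggest they are hitting the new limit rather than genuinely having that many columns.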

On Mar 24, 2011, at 1:16 PM, Jeffrey Wang wrote:

> Hey all,
>  
> I’m trying to run a very simple Pig script against my Cassandra cluster (5 nodes, 0.7.3). I’ve gotten it all set up and working, but the script is giving me some strange results. Here is my script:
>  
> rows = LOAD 'cassandra://Keyspace/ColumnFamily' USING CassandraStorage();
> rowct = FOREACH rows GENERATE $0, COUNT($1);
> dump rowct;
>  
> If I understand Pig correctly, this should output (row name, column count) tuples, but I’m always seeing 1024 for the column count even though the rows have highly variable number of columns. Am I missing something? Thanks.
>  
> -Jeffrey
>  

