cassandra-user mailing list archives

From Aaron Morton <>
Subject Re: is it possible to map an one from a a file and an one from cassandra?
Date Sun, 16 Jan 2011 20:56:00 GMT
The Pig readers work the same way as any other data source, so you should be able to mix and
match them as you please.

The sample Pig script in contrib/pig/example-script.pig specifies the CassandraStorage
source when loading data:

rows = LOAD 'cassandra://Keyspace1/Standard1' USING CassandraStorage();

The LOAD statement in Pig Latin supports a USING clause that identifies the data source type.

I'm less familiar with Hadoop, but it should be possible there too. AFAIK, though, it's going to
be easier to do a join between data sources with Pig.
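
To make the goal of such a join concrete, here is a minimal in-memory sketch of the comparison in Python, using the sample rows from the quoted question. It is not a Pig or Hadoop job and will not scale to huge inputs; it only illustrates the full-outer-join-on-key logic a distributed job would need to reproduce. The function names (parse_rows, diff_rows) and the "key| value | value" line layout are assumptions made for illustration.

```python
# Sample rows copied from the quoted question.
CASSANDRA_ROWS = [
    "key1| value1 | value2",
    "key2| value3 | value4",
    "key3| value5 | value6",
]
FILE_ROWS = [
    "key1| value1 | value2",
    "key2| value7 | value4",
    "key3| value7 | value6",
]


def parse_rows(lines):
    """Parse 'key| v1 | v2' lines into a dict {key: (v1, v2, ...)}."""
    rows = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        key, *values = [field.strip() for field in line.split("|")]
        rows[key] = tuple(values)
    return rows


def diff_rows(left, right):
    """Compare two keyed datasets, as a full outer join on key would.

    Returns (only_left, only_right, changed):
      only_left  - sorted keys present only in `left`
      only_right - sorted keys present only in `right`
      changed    - {key: (left_values, right_values)} for keys in both
                   datasets whose values differ
    """
    only_left = sorted(k for k in left if k not in right)
    only_right = sorted(k for k in right if k not in left)
    changed = {
        k: (left[k], right[k])
        for k in sorted(left.keys() & right.keys())
        if left[k] != right[k]
    }
    return only_left, only_right, changed


if __name__ == "__main__":
    only_cassandra, only_file, changed = diff_rows(
        parse_rows(CASSANDRA_ROWS), parse_rows(FILE_ROWS)
    )
    print("only in Cassandra:", only_cassandra)
    print("only in file:", only_file)
    print("changed:", changed)
```

Which side counts as "deleted" depends on which dataset you treat as the baseline, so the sketch reports keys missing from either side separately.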

Hope that helps. 

On 15 Jan 2011, at 06:00 PM, 김준영 <> wrote:


Cassandra supports Hadoop, so map and reduce jobs can read from Cassandra.

Now I am trying to find a way to map over a file and Cassandra together.

If both of them were files on my disk, it would be possible by using splits.

But in this kind of situation, which way is possible?

For example:

in Cassandra)
key1| value1 | value2
key2| value3 | value4
key3| value5 | value6

in a file)
key1| value1 | value2
key2| value7 | value4
key3| value7 | value6

Both of them are very large.
I want to get the result of a diff between them:

which keys are deleted?
which values are changed?

