cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: is it possible to map an one from a a file and an one from cassandra?
Date Mon, 17 Jan 2011 00:58:02 GMT
Yup, everything you can do in Pig is doable in normal Hadoop. When you say you want to compare
the keys, you're sort of doing an outer join. That's why I thought Pig might make your life
a bit easier.
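
To make the outer-join idea concrete, here is a minimal sketch in plain Python. It is not Hadoop or Pig code, and it assumes each source has already been read into a dict of key -> tuple of values. The function name diff_by_key and the extra row key4 are made up for illustration; the other rows are the sample data from the original question.

```python
def diff_by_key(cassandra_rows, file_rows):
    """Compare two key -> values mappings the way a full outer join would."""
    only_in_cassandra = []  # keys missing from the file
    only_in_file = []       # keys missing from Cassandra
    changed = {}            # key -> (values in Cassandra, values in file)
    # The union of both key sets is what makes this an outer join.
    for key in sorted(set(cassandra_rows) | set(file_rows)):
        if key not in file_rows:
            only_in_cassandra.append(key)
        elif key not in cassandra_rows:
            only_in_file.append(key)
        elif cassandra_rows[key] != file_rows[key]:
            changed[key] = (cassandra_rows[key], file_rows[key])
    return only_in_cassandra, only_in_file, changed


# Sample rows from the original question; key4 is a hypothetical extra row
# added only to show a key that exists on one side.
cassandra_rows = {
    "key1": ("value1", "value2"),
    "key2": ("value3", "value4"),
    "key3": ("value5", "value6"),
    "key4": ("value8", "value9"),
}
file_rows = {
    "key1": ("value1", "value2"),
    "key2": ("value7", "value4"),
    "key3": ("value7", "value6"),
}

only_in_cassandra, only_in_file, changed = diff_by_key(cassandra_rows, file_rows)
print("only in cassandra:", only_in_cassandra)
print("only in file:", only_in_file)
print("changed:", changed)
```

At real scale the two inputs would not fit in memory. The same per-key comparison would presumably run in a reducer, after each mapper tags its records with the source they came from.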

Good luck.
Aaron

On 17/01/2011, at 1:07 PM, Jun Young Kim <juneng603@gmail.com> wrote:

> Hi Aaron.
> 
> I think that if Pig is able to map both sources, the same job can be expressed in Java code itself.
> 
> I believe we can run a map function that loads a file and Cassandra at the same time.
> 
> P.S. I don't need to join them. I just want to compare the keys that are read from each of them.
> 
> Thanks.
> 
> On 17 Jan 2011, at 5:56 AM, "Aaron Morton" <aaron@thelastpickle.com> wrote:
> > The Pig readers are just the same as any other data source, so you should be able to mix and match them as you please.
> > 
> > The sample Pig script in contrib/pig/example-script.pig specifies the CassandraStorage source when loading data:
> > 
> > rows = LOAD 'cassandra://Keyspace1/Standard1' USING CassandraStorage();
> > 
> > The LOAD command in Pig Latin supports a USING keyword to identify the data source type:
> > http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#Load%2FStore+Functions
> > 
> > I'm less familiar with Hadoop, but it should be possible. AFAIK though, it's going to be easier to do a join between data sources with Pig.
> > 
> > Hope that helps. 
> > Aaron
> >  
> > 
> > 
> > On 15 Jan, 2011,at 06:00 PM, 김준영 <juneng603@gmail.com> wrote:
> > 
> > hi, 
> > 
> > Cassandra supports Hadoop, so map & reduce jobs can read from Cassandra.
> > 
> > Now I am trying to find a way to map over a file and Cassandra together.
> > 
> > I mean, if both of them were files on my disk, it would be possible by using splits.
> > 
> > But in this kind of situation, which way is possible?
> > 
> > for example. 
> > 
> > in cassandra)
> > key1| value1 | value2
> > key2| value3 | value4
> > key3| value5 | value6
> > 
> > in a file)
> > key1| value1 | value2
> > key2| value7 | value4
> > key3| value7 | value6
> > 
> > 
> > Both of them are very large.
> > I want to get a diff between the two.
> > 
> > which keys are deleted?
> > which values are changed?
> > 
> > thanks.
