accumulo-user mailing list archives

From "shweta.agrawal" <shweta.agra...@orkash.com>
Subject Time based aggregation problem on storing data in D4M schema
Date Thu, 24 Sep 2015 13:03:44 GMT
Hi all,

I have stored Twitter graph data using the D4M schema.
In the D4M schema the tweet ID is the row ID, but I want to aggregate
fields by time. Applying a timestamp filter to this query makes it
slow, because the data set is large. I also want to check a condition
before aggregating.

I have 10 years of tweet data and want to run second level aggregations
on two months of data.
For example, I want to aggregate the location field of all tweets that
have the hashtag "modi" and fall within a two-month period.
I can create a reverse index on time, but then I cannot use the index to
apply additional conditions, such as the hashtag "modi" condition.
Can anyone tell me how to aggregate fields by time, subject to some
condition, on D4M-style data?
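
For illustration, the intended computation can be sketched in Python on toy in-memory data. This is only a sketch of the query, not an Accumulo or D4M API: the dictionary `TEDGE`, the `field|value` column names (including the `time|YYYY-MM-DD` format), and all function names are assumptions made up for this example. It intersects the tweet IDs found under a hashtag key with the tweet IDs found by a range scan over time keys, then counts locations.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

# Toy stand-in for a D4M-style edge table (assumed layout):
# row = tweet id, columns = "field|value" strings.
TEDGE = {
    "t1": ["hashtag|modi", "location|Delhi", "time|2015-08-03"],
    "t2": ["hashtag|modi", "location|Mumbai", "time|2015-09-10"],
    "t3": ["hashtag|cricket", "location|Delhi", "time|2015-08-15"],
    "t4": ["hashtag|modi", "location|Delhi", "time|2014-01-20"],
}


def build_transpose(tedge):
    """Build the transpose table: row = "field|value", columns = tweet ids."""
    transpose = {}
    for tweet_id, cols in tedge.items():
        for col in cols:
            transpose.setdefault(col, set()).add(tweet_id)
    return transpose


def ids_in_time_range(transpose, start, end):
    """Range scan over the sorted "time|..." keys, inclusive on both ends."""
    keys = sorted(k for k in transpose if k.startswith("time|"))
    lo = bisect_left(keys, "time|" + start)
    hi = bisect_right(keys, "time|" + end)
    ids = set()
    for key in keys[lo:hi]:
        ids |= transpose[key]
    return ids


def aggregate_locations(tedge, transpose, hashtag, start, end):
    """Count locations of tweets that carry the hashtag and fall in the range."""
    by_hashtag = transpose.get("hashtag|" + hashtag, set())
    matching = by_hashtag & ids_in_time_range(transpose, start, end)
    counts = Counter()
    for tweet_id in matching:
        for col in tedge[tweet_id]:
            if col.startswith("location|"):
                counts[col.split("|", 1)[1]] += 1
    return counts


if __name__ == "__main__":
    transpose = build_transpose(TEDGE)
    print(aggregate_locations(TEDGE, transpose, "modi", "2015-08-01", "2015-09-30"))
```

On real data the open question is where this intersection and the final count should run so that only the two-month slice is read, rather than all 10 years.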

Thanks and Regards
Shweta
