asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: How to calculate the word count in AQL?
Date Wed, 09 Dec 2015 21:08:13 GMT
Excellent!  We are now 100% "map/reduce ready" since we can claim 
wordcount as a use case...  :-)

On 12/9/15 12:43 PM, Jianfeng Jia wrote:
> Yes, it works! Thanks!
>
>> On Dec 9, 2015, at 10:41 AM, Yingyi Bu <buyingyi@gmail.com> wrote:
>>
>> This probably is the query you want:
>>
>> for $ts in $hashtags
>> for $t in $ts
>> group by $tag:=$t with $t
>> return {"g": $tag, "c": count($t)}
>>
>>
>> On Wed, Dec 9, 2015 at 10:34 AM, Jianfeng Jia <jianfeng.jia@gmail.com>
>> wrote:
>>
>>> Hi devs,
>>>
>>> Here is my use case: each tweet has a set of hashtags. Here are the
>>> hashtags for the first five tweets:
>>> [ {{ "Samibeigi" }}, {{ "LeadwithGiants" }}, {{ "MountainView",
>>> "Healthcare", "Job" }}, {{ "SanFrancisco", "job", "NettempsJobs", "IT",
>>> "Hiring", "CareerArc" }}, {{ "BeGreat" }} ]
>>> I want to calculate the most frequent hashtag in all tweets.
>>>
>>> I could generate an internal group count as follows:
>>>
>>> let $inner := for $x in $hashtags
>>> return
>>> for $xx in $x
>>> group by $xx with $xx
>>> return { "g": $xx[0], "c": count($xx)}
>>>
>>> It will return a list of lists:
>>> [ { "g": "Samibeigi", "c": 1 } ]
>>> [ { "g": "LeadwithGiants", "c": 1 } ]
>>> [ { "g": "Healthcare", "c": 1 }, { "g": "Job", "c": 1 }, { "g":
>>> "MountainView", "c": 1 } ]
>>> [ { "g": "CareerArc", "c": 1 }, { "g": "Hiring", "c": 1 }, { "g": "IT",
>>> "c": 1 }, { "g": "NettempsJobs", "c": 1 }, { "g": "SanFrancisco", "c": 1 },
>>> { "g": "job", "c": 1 } ]
>>> [ { "g": "BeGreat", "c": 1 } ]
>>>
>>> I would expect to apply a flatten function to the list, which would give me a
>>> list of records; then I could group by "g". Do we have such an AQL function?
>>> Thank you.
>>>
>>> Best,
>>>
>>> Jianfeng Jia
>>> PhD Candidate of Computer Science
>>> University of California, Irvine
>>>
>>>
>
>
> Best,
>
> Jianfeng Jia
> PhD Candidate of Computer Science
> University of California, Irvine
>
>

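For readers less familiar with AQL, the logic of the two nested `for` clauses plus `group by` in the thread can be sketched in Python. This is only an analogue under stated assumptions, not AsterixDB code: the names `hashtags`, `count_hashtags`, `counts`, and `result` are illustrative, and the sample data is copied from the original question.

```python
from collections import Counter
from itertools import chain

# Sample data copied from the thread: one collection of hashtags per tweet.
hashtags = [
    ["Samibeigi"],
    ["LeadwithGiants"],
    ["MountainView", "Healthcare", "Job"],
    ["SanFrancisco", "job", "NettempsJobs", "IT", "Hiring", "CareerArc"],
    ["BeGreat"],
]


def count_hashtags(tweets):
    """Flatten the per-tweet tag lists, then count each distinct tag.

    chain.from_iterable plays the role of the two nested `for` clauses;
    Counter plays the role of `group by ... with ...` followed by count().
    """
    return Counter(chain.from_iterable(tweets))


counts = count_hashtags(hashtags)

# Shape the counts like the records in the thread: {"g": tag, "c": count}.
result = [{"g": tag, "c": n} for tag, n in sorted(counts.items())]
```

The flattening happens before grouping, which is why no separate flatten step over a list of lists is needed. Note that the comparison is case-sensitive ("Job" and "job" are counted separately), and in this five-tweet sample every tag occurs once, so the "most frequent" hashtag is a tie.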
