incubator-s4-user mailing list archives

From Mariano Vallés <zucari...@gmail.com>
Subject Re: Twittertopiccount
Date Mon, 23 Apr 2012 11:16:32 GMT
Hi again Sangeetha,

It's better if we stick to the s4-user mailing list rather than
sending individual messages, so that anyone can benefit from this
conversation.

To solve your issue, I would add some extra attributes to the
TopicSeen class, so that the tweet and the username (or id, whatever
suits you) travel with the event along the path through the application.
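
A minimal sketch of what that could look like, assuming TopicSeen is a
plain bean with a topic and a count. The tweetText and userName fields
and their accessors are hypothetical names, not taken from the actual
example code:

```java
// Sketch only: the real TopicSeen class in the example may look different.
class TopicSeen {
    private String topic;
    private int count;
    private String tweetText; // assumed new attribute: the tweet that mentioned the topic
    private String userName;  // assumed new attribute: the author of that tweet

    // No-argument constructor, as is usual for bean-style event classes.
    TopicSeen() {
    }

    TopicSeen(String topic, int count, String tweetText, String userName) {
        this.topic = topic;
        this.count = count;
        this.tweetText = tweetText;
        this.userName = userName;
    }

    String getTopic() { return topic; }
    void setTopic(String topic) { this.topic = topic; }

    int getCount() { return count; }
    void setCount(int count) { this.count = count; }

    String getTweetText() { return tweetText; }
    void setTweetText(String tweetText) { this.tweetText = tweetText; }

    String getUserName() { return userName; }
    void setUserName(String userName) { this.userName = userName; }
}
```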

Then in TopicCountAndReportPE I wouldn't wait until the count value
gets to 4, but just output every single event, as most probably every
event will have a different tweet message and of course it will come
from a different user.
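
To illustrate the idea without depending on the S4 classes, here is a
rough sketch. TopicCountSketch, Seen and the downstream consumer are
made-up names standing in for the PE, the event and the output stream;
the point is only that nothing is held back until a threshold is reached:

```java
import java.util.function.Consumer;

// Sketch only: stands in for the counting PE, one instance per topic.
class TopicCountSketch {
    // Hypothetical event carrying the running count plus the tweet and user.
    static final class Seen {
        final String topic;
        final int count;
        final String tweet;
        final String user;

        Seen(String topic, int count, String tweet, String user) {
            this.topic = topic;
            this.count = count;
            this.tweet = tweet;
            this.user = user;
        }
    }

    private final String topic;
    private final Consumer<Seen> downstream; // stands in for the output stream
    private int count;

    TopicCountSketch(String topic, Consumer<Seen> downstream) {
        this.topic = topic;
        this.downstream = downstream;
    }

    void processEvent(String tweet, String user) {
        count++;
        // No "count reached the threshold" check here: every event is
        // forwarded, because each one carries a different tweet and user.
        downstream.accept(new Seen(topic, count, tweet, user));
    }
}
```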

The tricky part is in TopNTopicPE, where you should have 1 or possibly
2 hashmaps in which you store the tweets for each topic and the users
for each topic. Bear in mind that each topic (the key) will have more
than one value, so it's a good idea to use a list as the value for the maps.
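
A minimal sketch of the two maps, assuming Java 8 or later for
computeIfAbsent. TopicDetails and its method names are hypothetical and
not taken from TopNTopicPE:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: keeps every tweet and every user seen for each topic.
class TopicDetails {
    private final Map<String, List<String>> tweetsByTopic = new HashMap<>();
    private final Map<String, List<String>> usersByTopic = new HashMap<>();

    void record(String topic, String tweet, String user) {
        // Create the list the first time a topic shows up, then append to it.
        tweetsByTopic.computeIfAbsent(topic, k -> new ArrayList<>()).add(tweet);
        usersByTopic.computeIfAbsent(topic, k -> new ArrayList<>()).add(user);
    }

    // Most recent tweet for the topic, or null if the topic is unknown.
    String lastTweet(String topic) {
        List<String> tweets = tweetsByTopic.get(topic);
        if (tweets == null || tweets.isEmpty()) {
            return null;
        }
        return tweets.get(tweets.size() - 1);
    }

    // All users seen for the topic, in arrival order (empty if unknown).
    List<String> users(String topic) {
        return usersByTopic.getOrDefault(topic, Collections.<String>emptyList());
    }
}
```

Note that the lists grow without bound in this sketch, so on a live
stream you would probably want to cap their size.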

I went on and worked on it for a while, so here's an example to get
the last tweet related to a certain topic in the /tmp/top_n_hashtags
file. Getting the user and the other set of tweets for the same topic
shouldn't be too hard, in fact it's almost done.
Here's the code:
https://github.com/zucaritas/twittertopiccount

I hope that helps
Regards,

Mariano

2012/4/22 Mariano Vallés <zucaritas@gmail.com>:
> On Sat, Apr 21, 2012 at 11:55 PM, Sangeetha Hebbar <sxhebbar@ualr.edu> wrote:
>>
>> Hi,
>> Can you please tell me what a sample event looks like in the twittertopiccount example?
>
> Hi Sangeetha,
> If you open a browser and go to:
> https://stream.twitter.com/1/statuses/sample.json
> which is the URL present in the TwitterFeedListener.java file, you can
> log in using your Twitter username and password and see what the
> statuses look like.
>
> Note that the "geo" field is null most of the time, as only around 10%
> of the total set of tweets are geolocated.
>
> I hope that helps.
> Regards,
>
> Mariano
>
>
>>
>> Thanks,
>>
>>
>>
>> On Fri, Apr 20, 2012 at 2:39 AM, Matthieu Morel <mmorel@apache.org> wrote:
>>>
>>> On 4/19/12 8:15 PM, Sangeetha Hebbar wrote:
>>>>
>>>> Hi
>>>> I am working on a project to extract the user id, location of the user
>>>> and the tweets of the top ten hashtags. What files do I need to change
>>>> to get this information?
>>>
>>>
>>> Hi, you probably want to have a look at the adapter code:
>>> https://github.com/s4/twittertopiccount/blob/master/src/main/java/org/apache/s4/example/twittertopiccount/TwitterFeedListener.java
>>>
>>> and the Status class: https://github.com/s4/twittertopiccount/blob/master/src/main/java/org/apache/s4/example/twittertopiccount/Status.java
>>>
>>> However, note that these are custom classes to handle the raw JSON Twitter stream,
>>> and you might be better off using a library such as twitter4j
>>>
>>>
>>> Hope this helps,
>>>
>>> Matthieu
>>>
>>>
>>>
>>>
>>
