flink-user mailing list archives

From Kostas Tzoumas <ktzou...@apache.org>
Subject Re: Using Flink to analyze GDELT
Date Fri, 07 Nov 2014 13:54:04 GMT
Hi Tamara!

I have not used GDELT, but it looks pretty cool!

You can certainly use Flink to analyze structured CSV files, and people
have used Flink on both larger and smaller datasets.

So, you can certainly give Flink a spin. Whether Flink is the ideal tool
also depends on what kind of analysis you want to run on this data. Posting
some more details about your jobs would be helpful.


On Fri, Nov 7, 2014 at 10:46 AM, Tamara Mendt <tammymendt@gmail.com> wrote:

> Hello!
> I was wondering if anyone has tried to use Flink to analyze GDELT
> (http://gdeltproject.org/). This database is a structured (CSV)
> repository of global events. It contains about 100 GB of data (approx. 250M
> events, 50 attributes for each event) and is updated with new events every
> day.
> I am a bit concerned that, since this is a structured database that is not
> too big, Flink may not be the ideal tool to work with it. Any insight?
> Thanks!
> --
> Tamara Mendt
