hadoop-hdfs-user mailing list archives

From Matthieu Labour <matth...@actionx.com>
Subject Advice on Migrating to hadoop + hive
Date Thu, 27 Sep 2012 01:04:07 GMT
I have posted in this user group before and received great help. Thank you!
I am also hoping to get some advice on the following Hive/Hadoop questions.
We currently process our log files as follows: we collect the log files,
then a program run by a cron job processes and consolidates them and
inserts rows into a PostgreSQL database. Analysts connect to the database,
run SQL queries, and generate Excel reports. Our logs are growing, and
loading the data into the database is becoming too slow.
We are thinking of leveraging Hadoop, and my questions are the following.
Should we use Hadoop to insert into PostgreSQL, or can we get rid of
PostgreSQL and rely on Hive only?
If we use Hive, can we persist the Hive table so that we only load the data
(run the Hadoop job) one time?
Can we insert into an existing Hive table and add a day of data without
needing to reprocess all the previous days' files?
Are there visual tools for Hive (similar to Postgres Maestro) that would make
it easier for the analysts to build and run queries? (Ideally they would need
to work with Amazon EWS.)
Thank you for your help.
