hive-user mailing list archives

From Marcos Ortiz <mlor...@uci.cu>
Subject Re: Planning a migration from PostgreSQL to Hadoop/Hive
Date Wed, 04 May 2011 22:18:48 GMT
On 05/04/2011 04:14 PM, Alexandre "TAZ" dos Santos Andrade wrote:
> Hi Marcos,
>
> I'm doing exactly the same migration. First of all, remember that
> Hive runs a MapReduce job for every query whose result you don't
> write to a table. Second, migrating the data is a little annoying:
> there's no direct connector, so I used a simple dump, stripped the
> header and footer, and loaded it into the Hive structure.
>
> I hope this helps.
>
> Alexandre dos Santos Andrade
>
> 2011/5/4 Marcos Ortiz <mlortiz@uci.cu>
>
>     We are planning a migration from a large PostgreSQL-based DWH to
>     Hadoop/Hive. The main reason for this migration is the massive
>     growth of the data to analyze (5.6 TB and growing): PostgreSQL,
>     as an MVCC-based RDBMS, has its pitfalls with heavy updates and
>     with queries over large quantities of data. (We have done a lot
>     of query tuning and optimization on the server, with only a minor
>     effect on query latency.)
>
>     So we have looked at Hadoop and run some tests combined with Hive
>     and HBase, and the performance we obtained is awesome.
>
>     Can you give us some advice on developing a good plan for this?
>
>     Environment:
>     - OS: CentOS 5.5, 64-bit
>     - Java version: 1.6 Update 20
>     - Hardware: 8 nodes - AMD Opteron QuadCore 4130, 8 GB RAM, 1 TB HDD
>
>     Regards
>
>     -- 
>     Marcos Luís Ortíz Valmaseda
>      Software Engineer (Large-Scaled Distributed Systems)
>      University of Information Sciences,
>      La Habana, Cuba
>      Linux User # 418229
>     http://about.me/marcosortiz
>
Thanks a lot, Alexandre.
Did you use Sqoop to load the data from PostgreSQL to Hive?
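
For reference, the dump-based approach described above (take a dump, strip the header and footer, load the rest) might look roughly like the sketch below. It assumes the dump is plain-text pg_dump output, where each table's rows sit between a "COPY ... FROM stdin;" header and a "\." footer line; the thread does not say which dump format was actually used. The function name copy_rows and the sample table "events" are hypothetical.

```python
from typing import Iterable, Iterator


def copy_rows(dump_lines: Iterable[str], table: str) -> Iterator[str]:
    """Yield the tab-separated data rows of one table from a plain-text dump.

    Assumption: pg_dump plain format, where a table's data sits between a
    'COPY <table> (...) FROM stdin;' header and a '\\.' footer line.
    Pass the table name exactly as it appears in the dump (some pg_dump
    versions write it schema-qualified, e.g. 'public.events').
    """
    header_prefix = f"COPY {table} "
    in_data = False
    for line in dump_lines:
        line = line.rstrip("\n")
        if not in_data:
            # Skip DDL, comments, and other tables until the header is seen.
            in_data = line.startswith(header_prefix) and line.endswith("FROM stdin;")
        elif line == "\\.":
            # Footer reached: this table's data is complete.
            return
        else:
            yield line


# Hypothetical sample dump for a table named "events".
sample = [
    "SET client_encoding = 'UTF8';\n",
    "COPY events (id, name) FROM stdin;\n",
    "1\tlogin\n",
    "2\t\\N\n",
    "\\.\n",
    "ALTER TABLE ONLY events ADD CONSTRAINT events_pkey PRIMARY KEY (id);\n",
]
rows = list(copy_rows(sample, "events"))
print("\n".join(rows))
```

The rows that come out are tab-separated, so the file written from them could be loaded with LOAD DATA into a Hive table declared with ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'. PostgreSQL's \N null marker matches Hive's default null representation for text tables. The sketch does not handle backslash-escaped characters inside values, so data containing tabs or newlines would need extra care.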



-- 
Marcos Luís Ortíz Valmaseda
  Software Engineer (Large-Scaled Distributed Systems)
  University of Information Sciences,
  La Habana, Cuba
  Linux User # 418229
  http://about.me/marcosortiz

