hive-user mailing list archives

From Bhavesh Shah <>
Subject Re: Is my Use Case possible with Hive?
Date Mon, 14 May 2012 07:48:54 GMT
I have about 1 billion records in my relational database.
Currently I am running locally on a single-node cluster. I also tried this on
Amazon Elastic MapReduce with 10 nodes, but the complete program took the
same time to execute as on my single local machine.
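
One way to see where the total runtime goes is to time each query on its own. The following is a minimal sketch, not the program discussed in this thread: `time_queries` and `run_query` are hypothetical names, and `run_query` stands for any callable that sends one HiveQL string to Hive (for example through a JDBC or Thrift client) and blocks until the query finishes.

```python
import time


def time_queries(run_query, queries):
    """Run each query once and return (query, seconds) pairs, slowest first.

    run_query is a hypothetical callable supplied by the caller; it is
    assumed to execute a single HiveQL string and block until it completes.
    """
    timings = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)
        timings.append((query, time.perf_counter() - start))
    # Sort slowest first so the dominant queries are easy to spot.
    return sorted(timings, key=lambda pair: pair[1], reverse=True)
```

Running the same list of queries on the local machine and on the 10-node cluster and comparing the two results would show which queries, if any, get faster with more nodes.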

On Mon, May 14, 2012 at 1:13 PM, Nitin Pawar <> wrote:

> How many records?
> What is your Hadoop cluster setup? How many nodes?
> If you are running Hadoop on a single-node setup on a normal desktop, I
> doubt it will be of any help.
> You need a stronger cluster setup for better query runtimes, and of course
> query optimization, which I guess you have already taken care of.
> On Mon, May 14, 2012 at 12:39 PM, Bhavesh Shah <> wrote:
>> Hello all,
>> My use case is:
>> 1) I have a relational database (MS SQL Server) that holds a very large
>> amount of data.
>> 2) I want to analyze this data and generate reports from the results
>> of the analysis.
>> In this way I have to generate various reports based on different analyses.
>> I tried to implement this using Hive. What I did is:
>> 1) I imported all tables into Hive from MS SQL Server using Sqoop.
>> 2) I wrote many Hive queries, which are executed using JDBC on the Hive
>> Thrift Server.
>> 3) I am getting the correct results in table form, as I expected.
>> 4) But the problem is that execution takes far too
>> long.
>>    (My complete program takes about 3-4 hours to execute on a *small
>> amount of data*).
>>    I decided to do this using Hive.
>>    As I said above, Hive takes that long to execute, but my
>> organization expects this task to complete in less than
>> 1/2 hour.
>> Now, after spending so much time on a complete execution of this task, what
>> should I do?
>> I want to ask one thing:
>> *Is this use case possible with Hive?* If it is, what should I do in
>> my program to improve the performance?
>> *And if it is not, what is another good way to implement this use
>> case?*
>> Please reply.
>> Thanks
>> --
>> Regards,
>> Bhavesh Shah
> --
> Nitin Pawar

Bhavesh Shah
