From: Namit Jain <njain@fb.com>
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Hive Concurrency Model - does it work?
Date: Wed, 26 Jan 2011 19:04:34 +0000
The patch below has been committed.


https://issues.apache.org/jira/browse/HIVE-1865 was a follow-up patch which should help concurrency.
I have not tried backporting the patch to Hive 0.5 or Hive 0.6, but I don't think it will work, since the code has changed significantly and a number of bug fixes to update the inputs and outputs went in.

By default, concurrency is disabled. If you want to enable it, you need to set hive.support.concurrency to true.
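As a minimal sketch of enabling it from the Hive CLI (the ZooKeeper hostnames below are placeholders for your own quorum; the same properties can also go in hive-site.xml):

    -- turn on the lock manager for this session
    SET hive.support.concurrency=true;
    -- point Hive at the ZooKeeper ensemble used for locks (placeholder hostnames)
    SET hive.zookeeper.quorum=zk1.example.com,zk2.example.com,zk3.example.com;

With concurrency enabled, the idea is that queries take shared locks on the tables they read and exclusive locks on the tables they write.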


Thanks,
-namit


From: Jay Ramadorai <jramadorai@tripadvisor.com>
Reply-To: <user@hive.apache.org>
Date: Wed, 26 Jan 2011 13:52:58 -0500
To: <user@hive.apache.org>
Subject: Hive Concurrency Model - does it work?

https://issues.apache.org/jira/browse/HIVE-1293: Is this JIRA truly fixed and included in 0.7.0?
If so, can the patch be applied separately on top of 0.5.0 or 0.6.0?
Are there instructions somewhere for how to enable/integrate ZooKeeper with Hive for this patch to work?
The JIRA comments indicate the patch was tested and committed; however, the wiki that the JIRA points to, http://wiki.apache.org/hadoop/Hive/Locking, implies concurrency will not be supported. Hence the confusion.
Is there a simple way in Hive to query which tables are currently being accessed?
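If the HIVE-1293 lock manager is enabled, the locking work adds a SHOW LOCKS statement that lists current locks, which can answer this. A minimal sketch (the table name is hypothetical):

    -- list every lock currently held (requires hive.support.concurrency=true)
    SHOW LOCKS;

    -- locks held on one table; page_views is a placeholder name
    SHOW LOCKS page_views;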

More detail:
What I'm trying to do is daily Sqoop imports into Hive from an external database. There are jobs running on the Hive warehouse a lot of the time. I import the data into temporary tables in Hive and then want to drop the permanent tables and rename the (just-imported) temporary ones to the permanent names WITHOUT IMPACTING THE JOBS. At the moment, of course, doing an ALTER TABLE RENAME causes any running jobs accessing the table to die on the next fetch. So I thought that if the above JIRA was indeed fixed, then 0.7.0 should allow the job to complete before the rename gets its X lock, or if the rename is in progress, the job won't get its S lock until the rename is done. However, our test on 0.7.0 trunk (pulled in late September) reveals that the rename happens instantly even with a query accessing the table, not waiting for any locks.
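For reference, a rough sketch of the intended swap, with hypothetical table names; under the locking model described in the wiki above, the DROP and RENAME should each need an exclusive lock and so would wait for readers' shared locks to be released:

    -- page_views is the permanent table, page_views_staging the freshly imported one (placeholder names)
    DROP TABLE page_views;
    ALTER TABLE page_views_staging RENAME TO page_views;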

Barring this patch, are there any other ideas anyone can suggest for accomplishing what I want? Some ideas we have considered:
- Parse Hive logs/XML files looking for a table name to determine whether there is a job currently accessing the table. If not, then rename.
- Create views on temporary tables named by day. Have jobs go against the views. When we are ready to rename, basically replace the view, pointing it at the new table of today. The key question here is: is the view metadata consulted only at query startup, or is it repeatedly looked at during query execution? If only at startup, we might be able to get away with this trick until concurrency truly works (a sketch follows below).
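A minimal sketch of the view-swap idea, with hypothetical table and view names (DROP VIEW plus CREATE VIEW is used here rather than CREATE OR REPLACE VIEW, since availability of the latter depends on the Hive version):

    -- jobs query the view page_views_v instead of the daily table directly
    CREATE VIEW page_views_v AS SELECT * FROM page_views_20110125;

    -- after today's import finishes, repoint the view at the new table
    DROP VIEW page_views_v;
    CREATE VIEW page_views_v AS SELECT * FROM page_views_20110126;

Whether an already-running query survives the swap still hinges on the startup-vs-execution question above, so this would need testing.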

Thanks
Jay