Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id F1403176D3 for ; Wed, 8 Apr 2015 23:36:17 +0000 (UTC) Received: (qmail 58096 invoked by uid 500); 8 Apr 2015 23:36:15 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 58030 invoked by uid 500); 8 Apr 2015 23:36:15 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 58020 invoked by uid 99); 8 Apr 2015 23:36:15 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 08 Apr 2015 23:36:15 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of alanfgates@gmail.com designates 209.85.192.170 as permitted sender) Received: from [209.85.192.170] (HELO mail-pd0-f170.google.com) (209.85.192.170) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 08 Apr 2015 23:35:48 +0000 Received: by pdea3 with SMTP id a3so131660806pde.3 for ; Wed, 08 Apr 2015 16:35:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type; bh=SEIDuDwLkQalqXfAVyoBAJuEtkm4RvWujJSYM0O+DrI=; b=HaLntQby6uJ+jC0Dx0UXcsvlNBBZetDx8AxDuQMaZhAEpPj95oXLoOT7lQRMeTHj5z i8AkFNfhBxcl8+YYJxpRj9E9hK7Y1ZKmiyBzIYaHl3CsVw58J67PS5b0pHuzUMzpSYxv eqyEBv2gf9i2m34n1EprBfCQz9HZWxApURAcqvPSeSIl302O2pCYDVDmolJtxvoeWAgC wB30w6Ru1Z+OCJESLXbxVb+WBmkYyZ+Ou+Bo8VtYpJ+q5eZvs6ysJnGev09YMoNnpI2q HX/04SNC0dfqxkv6/18RQNFdqQF626Rdodb11eEQ9iBG3ESQtZq+W7Kyy+Y8XQzvjtaE qPwQ== X-Received: by 10.70.44.3 with SMTP id a3mr51194212pdm.93.1428536146439; Wed, 08 Apr 2015 16:35:46 -0700 (PDT) Received: from Alan-Gatess-MacBook-Pro.local ([192.175.27.21]) by mx.google.com with ESMTPSA id cy5sm7192753pdb.85.2015.04.08.16.35.44 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 08 Apr 2015 16:35:45 -0700 (PDT) Message-ID: <5525BB4E.4060806@gmail.com> Date: Wed, 08 Apr 2015 16:35:42 -0700 From: Alan Gates User-Agent: Postbox 3.0.11 (Macintosh/20140602) MIME-Version: 1.0 To: user@hive.apache.org Subject: Re: Hive support for concurrency with Oracle Metastore References: <01bd01d07245$642e9a30$2c8bce90$@co.uk> In-Reply-To: <01bd01d07245$642e9a30$2c8bce90$@co.uk> Content-Type: multipart/alternative; boundary="------------030101080605010001060505" X-Virus-Checked: Checked by ClamAV on apache.org This is a multi-part message in MIME format. --------------030101080605010001060505 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit All locks in Hive are on database, table, or partition level. There are no row level locks. When using DbTxnManager the locking is chosen to be as fine grained as possible (ie only partitions are locked when we can a priori know the partitions that will be used in the query). Alan. > Mich Talebzadeh > April 8, 2015 at 14:45 > > Hi, > > Thanks for all the useful info. I managed to set up hive concurrency > using Oracle Metastore for Hive. Had to modify > /hive-txn-schema-0.14.0.oracle.sql/ script in order to drop the > existing tables as I had created the metastore with > hive-schema-0.14.0.oracle.sql initially. > > All the transactions seem to behave as expected (meaning conforming to > behaviour expected from an ACID compliant RDBMS). Except a question > that comes to mind with locking when deleting all rows from the table > (as opposed to truncating the table) > > I created a test table > > create table txtest (col1 int, col2 varchar(30)) clustered by (col1) > into 10 buckets STORED AS orc TBLPROPERTIES('transactional'='true'); > > Inserted 10 rows and updatedstats > > insert into table txtest values > (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e'),(6,'f'),(7,'f'),(8,'g'),(9,'h'),(10,'i'); > > analyse table txtest compute statistics; > > Updated three columns followed by a single column > > update txtest set col2 = col2 where col1 in (1,3,5); > > update txtest set col2 = 'row1' where col1 = 1; > > From another session looked at locks > > show locks; > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > | lockid | database | table | partition | lock_state | > lock_type | transaction_id | last_heartbeat | acquired_at | > user | hostname | > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > | Lock ID | Database | Table | Partition | State | > Type | Transaction ID | Last Hearbeat | Acquired At | > User | Hostname | > > | 21 | oraclehadoop | txtest | NULL | ACQUIRED | > SHARED_WRITE | 6 | 1428525777008 | 1428525774206 | > hduser | rhes564 | > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > Now see what happens if we try to delete two different rows from two > different sessions concurrently (well almost) > > Session 1 > > hive> delete from txtest where col1 = 10; > > Session 2 > > delete from txtest where col1 = 4; > > The locks are shown below > > show locks; > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > | lockid | database | table | partition | lock_state | > lock_type | transaction_id | last_heartbeat | acquired_at | > user | hostname | > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > | Lock ID | Database | Table | Partition | State | > Type | Transaction ID | Last Hearbeat | Acquired At | > User | Hostname | > > | 23 | oraclehadoop | txtest | NULL | ACQUIRED | > SHARED_WRITE | 7 | 1428526836905 | 1428526833132 | > hduser | rhes564 | > > | 24 | oraclehadoop | txtest | NULL | WAITING | > SHARED_WRITE | 8 | 1428526865246 | NULL | > hduser | rhes564 | > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > Now _delete all rows_ from the table > > hive> delete from txtest; > > 0: jdbc:hive2://rhes564:10010/default> show locks; > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > | lockid | database | table | partition | lock_state | > lock_type | transaction_id | last_heartbeat | acquired_at | > user | hostname | > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > | Lock ID | Database | Table | Partition | State | > Type | Transaction ID | Last Hearbeat | Acquired At | > User | Hostname | > > | 25 | oraclehadoop | txtest | NULL | ACQUIRED | > SHARED_WRITE | 9 | 1428527740870 | 1428527737658 | > hduser | rhes564 | > > +----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+ > > The question I have is if we delete from the whole table it seems that > "only" one lock is applied to whole table. Does that mean a full > table lock rather that locks for every row deleted? > > Thanks, > > Mich Talebzadeh > > http://talebzadehmich.wordpress.com > > __ > > Author of the books*"A Practitioner's Guide to Upgrading to > Sybase**ASE 15", **ISBN 978-0-9563693-0-7*. > > co-author *"Sybase Transact SQL Guidelines Best Practices", ISBN > 978-0-9759693-0-4* > > _Publications due shortly:_ > > *Creating in-memory Data Grid for Trading Systems with Oracle TimesTen > and Coherence Cache* > > *Oracle and Sybase, Concepts and Contrasts*, ISBN:978-0-9563693-1-4, > volume one out shortly > > NOTE: The information in this email is proprietary and confidential. > This message is for the designated recipient only, if you are not the > intended recipient, you should destroy it immediately. Any information > in this message shall not be understood as given or endorsed by > Peridale Ltd, its subsidiaries or their employees, unless expressly so > stated. It is the responsibility of the recipient to ensure that this > email is virus free, therefore neither Peridale Ltd, its subsidiaries > nor their employees accept any responsibility. > --------------030101080605010001060505 Content-Type: multipart/related; boundary="------------020304080301060307010804" --------------020304080301060307010804 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit All locks in Hive are on database, table, or partition level.  There are no row level locks.  When using DbTxnManager the locking is chosen to be as fine grained as possible (ie only partitions are locked when we can a priori know the partitions that will be used in the query).

Alan.

April 8, 2015 at 14:45

Hi,

 

Thanks for all the useful info.  I managed to set up hive concurrency using Oracle Metastore for Hive. Had to modify hive-txn-schema-0.14.0.oracle.sql script in order to drop the existing tables as I had created the metastore with hive-schema-0.14.0.oracle.sql initially.

 

All the transactions seem to behave as expected (meaning conforming to behaviour expected from an ACID compliant RDBMS). Except a question that comes to mind with locking when deleting all rows from the table (as opposed to truncating the table)

 

I created a test table

 

create table txtest (col1 int, col2 varchar(30)) clustered by (col1) into 10 buckets  STORED AS orc TBLPROPERTIES('transactional'='true');

 

Inserted 10 rows and updatedstats

 

insert into table txtest values (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e'),(6,'f'),(7,'f'),(8,'g'),(9,'h'),(10,'i');

analyse table txtest compute statistics;

 

Updated three columns followed by a single column

 

update txtest set col2 = col2 where col1 in (1,3,5);

 

update txtest set col2 = 'row1' where col1 = 1;

 

From another session looked at locks

 

show locks;

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

|  lockid  |   database    |  table  | partition  | lock_state  |   lock_type   | transaction_id  | last_heartbeat  |  acquired_at   |  user   | hostname  |

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

| Lock ID  | Database      | Table   | Partition  | State       | Type          | Transaction ID  | Last Hearbeat   | Acquired At    | User    | Hostname  |

| 21       | oraclehadoop  | txtest  | NULL       | ACQUIRED    | SHARED_WRITE  | 6               | 1428525777008   | 1428525774206  | hduser  | rhes564   |

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

 

Now see what happens if we try to delete two different rows from two different  sessions concurrently (well almost)

 

Session 1

hive> delete from txtest where col1 = 10;

 

Session 2

delete from txtest where col1 = 4;

 

The locks are shown below

 

show locks;

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

|  lockid  |   database    |  table  | partition  | lock_state  |   lock_type   | transaction_id  | last_heartbeat  |  acquired_at   |  user   | hostname  |

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

| Lock ID  | Database      | Table   | Partition  | State       | Type          | Transaction ID  | Last Hearbeat   | Acquired At    | User    | Hostname  |

| 23       | oraclehadoop  | txtest  | NULL       | ACQUIRED    | SHARED_WRITE  | 7               | 1428526836905   | 1428526833132  | hduser  | rhes564   |

| 24       | oraclehadoop  | txtest  | NULL       | WAITING     | SHARED_WRITE  | 8               | 1428526865246   | NULL           | hduser  | rhes564   |

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

 

Now delete all rows from the table

 

hive> delete from txtest;

 

0: jdbc:hive2://rhes564:10010/default> show locks;

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

|  lockid  |   database    |  table  | partition  | lock_state  |   lock_type   | transaction_id  | last_heartbeat  |  acquired_at   |  user   | hostname  |

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

| Lock ID  | Database      | Table   | Partition  | State       | Type          | Transaction ID  | Last Hearbeat   | Acquired At    | User    | Hostname  |

| 25       | oraclehadoop  | txtest  | NULL       | ACQUIRED    | SHARED_WRITE  | 9               | 1428527740870   | 1428527737658  | hduser  | rhes564   |

+----------+---------------+---------+------------+-------------+---------------+-----------------+-----------------+----------------+---------+-----------+--+

 

The question I have is if we delete from the whole table it seems that  “only” one lock is applied to whole table. Does that mean a full table lock rather that locks for every row deleted?

 

 

Thanks,

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly

 

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.

 

--------------020304080301060307010804 Content-Type: image/jpeg; x-apple-mail-type=stationery; name="compose-unknown-contact.jpg" Content-Transfer-Encoding: base64 Content-ID: Content-Disposition: inline; filename="compose-unknown-contact.jpg" /9j/4AAQSkZJRgABAQEARwBHAAD/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEC AQEBAQEBAgICAgICAgICAgICAgICAgICAgICAgICAgICAgL/2wBDAQEBAQEBAQICAgICAgIC AgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgL/wAAR CAAZABkDAREAAhEBAxEB/8QAGAAAAwEBAAAAAAAAAAAAAAAABgcICQr/xAA0EAABAwMCAgUK BwAAAAAAAAACAQMEBQYRABITIQcUMUF2CBUXIjI2N0JRtVRWkZOV0dL/xAAYAQEAAwEAAAAA AAAAAAAAAAADAAEEAv/EACQRAAICAAQGAwAAAAAAAAAAAAABAhEDMrHREyExM0FxgfDx/9oA DAMBAAIRAxEAPwDuEt+gW/ULet6oVC3rfqNQqFv0OfPn1GhUqfOmzZtKZlS5UqZMaNwzNwiJ VIl7eXLCaZIGwBl3TY8epPx2+jy2ZNPjvkwc9uhW8j7nCPhvOsQliYIeS7cvCpp8o50qwrC4 v3lsNSDbdmTEhvs2tahxpfV3WnmbbozJEw/gwdadbYExVRXKEKoSdvJcaOSqxE7/AAiX0gXx +a69/JSf9alIlste0VzaNpeFrcT9KKymotyiaZ0KRCnzacoE7Kjzn4gi2KqUh3jqDHDHv4mR UfruTWlMzlVUKIVNp9GguEJnAh0+IZjyAiisgyRDnu5azS8miKqjOTVkKqS/psG37fo1Fbab eg25b8eZPeFJBBJSjMG5HjMeyihnaauZwe4OGiju13GAcpOwBeN+U8/IkGbsiS8b7ryogmbz hbyc9REROfZhERO5ETShjPtvpGqTUyLErytS4siSwx5x2tRH4hPOI0DkjZtaJtFxuVEbIUUi yeNujlBUJGbJN6nM/Cyf2Hf60YgjvKA+NPSP4gT7axpcPtr51YWJnYn9dnAQWl722p4ot37y zqnlfp6FrqbwawG8/9k= --------------020304080301060307010804-- --------------030101080605010001060505--