hive-user mailing list archives

From vic0777 <vic0...@163.com>
Subject Re:Re: Where is the base directory of a transaction table?
Date Wed, 03 Dec 2014 07:13:14 GMT
Hi Alan,

Thanks for your help. I set hive.compactor.initiator.on (=true) and hive.compactor.worker.threads (=2)
in hive-site.xml. After that configuration, I started Hive for its first run. What do you mean
by "When you say you set hive.compactor.initiator.on (=true I hope) and hive.compactor.worker.threads,
did you do that in your metastore process?" Do I need to configure something somewhere else?


To verify the compaction feature, I first executed an alter table t1_txn compact 'major'
command. The request is enqueued, but its state stays at "initiated" even after I restart
Hive. How do I make the request execute? Then I set hive.compactor.delta.num.threshold (=2) in
hive-site.xml. There is supposed to be a minor compaction after two update or delete operations,
but after 5 UPDATEs the compaction did not happen. The show compactions command only lists
the request from the earlier alter table command. Besides, I set hive.compactor.delta.pct.threshold (=0.01).
According to the documentation, it specifies the percentage (fractional) size of the delta files
relative to the base that will trigger a major compaction. Since the base does not exist in
the beginning, how does the system know when to trigger a major compaction? So, my question
is: how do I make compaction work? Is there any tutorial or help?
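
For reference, a minimal sketch of the two statements described above, written out in HiveQL. The table name t1_txn is the one used in this message; nothing else is assumed beyond the statements already mentioned.

```sql
-- Queue a major compaction request for the table named in this message.
ALTER TABLE t1_txn COMPACT 'major';

-- List the compaction requests together with their current state.
SHOW COMPACTIONS;
```

The state column in the SHOW COMPACTIONS output is where the request remains at "initiated".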

Following is my hive-site.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
  <property>
    <name>hive.txn.timeout</name>
    <value>1000</value>
  </property>
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.worker.threads</name>
    <value>2</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
  </property>
  <property>
    <name>hive.in.test</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.compactor.delta.num.threshold</name>
    <value>2</value>
  </property>
  <property>
    <name>hive.compactor.delta.pct.threshold</name>
    <value>0.01</value>
  </property>
</configuration>

At 2014-12-03 09:59:34, "Alan Gates" <gates@hortonworks.com> wrote:
The base directories will only exist after compaction has run. When you say you set hive.compactor.initiator.on
(=true I hope) and hive.compactor.worker.threads, did you do that in your metastore process?
If so, did you restart the metastore after changing the config values?

Alan.


vic0777
December 1, 2014 at 23:12
Hi All,

I am trying to use the new transaction feature in Hive 0.14. According to its documentation, every
transaction table has a base directory and one delta directory for each transaction in HDFS
for data storage. But I cannot find the base directory in HDFS; there are only delta
directories. The following are the commands I used.

create table test_txn (id int ,name string ) clustered by (id) into 2 buckets stored as orc
TBLPROPERTIES('transactional'='true');
insert into table test_txn select * from test_text;
update test_txn set name="liu" where id = 10;

P.S. I have configured the parameters required by the transaction feature:
  hive.support.concurrency,
  hive.enforce.bucketing,
  hive.exec.dynamic.partition.mode,
  hive.txn.manager,
  hive.compactor.initiator.on,
  hive.compactor.worker.threads.

Although I cannot find the base directory in HDFS, all SELECT, UPDATE and DELETE statements
work fine and the data in the table is correct. I am wondering where the base directory is.

Any help is appreciated.

Thanks,
Wantao