hbase-issues mailing list archives

From "Jean-Marc Spaggiari (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8521) Cells cannot be overwritten with bulk loaded HFiles
Date Thu, 16 May 2013 18:45:19 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659818#comment-13659818 ]

Jean-Marc Spaggiari commented on HBASE-8521:
--------------------------------------------

I just tried bulk loading into an existing cluster running 0.94.7 with a new client that
includes this fix, using the use case Jonathan provided above.

{code}
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://node3:9000/user/hbase/familyDir1 test

 echo "scan 'test'" | bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.94.8-SNAPSHOT, r, Thu May 16 08:43:22 EDT 2013

scan 'test'
ROW                                                   COLUMN+CELL
 aaaa                                                 column=myfam:myqual, timestamp=1368157470713, value=oldVal
1 row(s) in 0.7900 seconds

bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://node3:9000/user/hbase/familyDir2 test

echo "scan 'test'" | bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.94.8-SNAPSHOT, r, Thu May 16 08:43:22 EDT 2013

scan 'test'
ROW                                                   COLUMN+CELL
 aaaa                                                 column=myfam:myqual, timestamp=1368157470713, value=oldVal
1 row(s) in 0.6810 seconds
{code}

Results are consistent and everything works as before; no error is displayed.
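The before/after comparison of the two scans can also be checked mechanically. The following is only a rough sketch: `parse_scan` and `CELL_RE` are hypothetical names, and the pattern assumes each cell appears on a single line in the `column=..., timestamp=..., value=...` layout seen in the scan output above.

```python
import re

# Matches one cell line of `scan` output, e.g.
#  aaaa   column=myfam:myqual, timestamp=1368157470713, value=oldVal
CELL_RE = re.compile(
    r"^\s*(?P<row>\S+)\s+column=(?P<column>[^,]+),\s*"
    r"timestamp=(?P<ts>\d+),\s*value=(?P<value>.*?)\s*$"
)

def parse_scan(output):
    """Return {(row, column, timestamp): value} for each cell line in the text."""
    cells = {}
    for line in output.splitlines():
        m = CELL_RE.match(line)
        if m:  # header, banner and summary lines do not match and are skipped
            cells[(m["row"], m["column"], int(m["ts"]))] = m["value"]
    return cells

# Cell lines as captured after the first and the second bulk load.
before = " aaaa   column=myfam:myqual, timestamp=1368157470713, value=oldVal"
after = " aaaa   column=myfam:myqual, timestamp=1368157470713, value=oldVal"

print(parse_scan(before) == parse_scan(after))  # True: the cell is unchanged
```

Comparing the parsed dictionaries rather than the raw text ignores the shell banner and the timing line, which differ between runs.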

Full log:
{code}
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://node3:9000/user/hbase/familyDir2 test
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:host.name=cloudera.distparser.com
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_45
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/local/jdk1.6.0_45/jre
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/home/jmspaggiari/workspace/hbase-0.94.8-SNAPSHOT/bin/../conf:/usr/local/jdk1.6.0_45//lib/tools.jar:/home/jmspaggiari/.m2/repository/asm/asm/3.1/asm-3.1.jar:/home/jmspaggiari/.m2/repository/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar:/home/jmspaggiari/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/home/jmspaggiari/.m2/repository/com/google/guava/guava/11.0.2/guava-11.0.2.jar:/home/jmspaggiari/.m2/repository/com/google/protobuf/protobuf-java/2.4.0a/protobuf-java-2.4.0a.jar:/home/jmspaggiari/.m2/repository/com/sun/jersey/jersey-core/1.8/jersey-core-1.8.jar:/home/jmspaggiari/.m2/repository/com/sun/jersey/jersey-json/1.8/jersey-json-1.8.jar:/home/jmspaggiari/.m2/repository/com/sun/jersey/jersey-server/1.8/jersey-server-1.8.jar:/home/jmspaggiari/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar:/home/jmspaggiari/.m2/repository/com/yammer/metrics/metrics-core/2.1.2/metrics-core-2.1.2.jar:/home/jmspaggiari/.m2/repository/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar:/home/jmspaggiari/.m2/repository/commons-beanutils/commons-beanutils-core/1.8.0/commons-beanutils-core-1.8.0.jar:/home/jmspaggiari/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/home/jmspaggiari/.m2/repository/commons-codec/commons-codec/1.4/commons-codec-1.4.jar:/home/jmspaggiari/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/home/jmspaggiari/.m2/repository/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar:/home/jmspaggiari/.m2/repository/commons-digester/commons-digester/1.8/commons-digester-1.8.jar:/home/jmspaggiari/.m2/repository/commons-el/commons-el/1.0/commons-el-1.0.jar:/home/jmspaggiari/.m2/repository/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar:/home/jmspaggiari/.m2/repository/commons-io/commons-io/2.1/comm
ons-io-2.1.jar:/home/jmspaggiari/.m2/repository/commons-lang/commons-lang/2.5/commons-lang-2.5.jar:/home/jmspaggiari/.m2/repository/commons-logging/commons-logging/1.1.1/commons-logging-1.1.1.jar:/home/jmspaggiari/.m2/repository/commons-net/commons-net/1.4.1/commons-net-1.4.1.jar:/home/jmspaggiari/.m2/repository/javax/activation/activation/1.1/activation-1.1.jar:/home/jmspaggiari/.m2/repository/javax/xml/bind/jaxb-api/2.1/jaxb-api-2.1.jar:/home/jmspaggiari/.m2/repository/junit/junit/4.10-HBASE-1/junit-4.10-HBASE-1.jar:/home/jmspaggiari/.m2/repository/log4j/log4j/1.2.16/log4j-1.2.16.jar:/home/jmspaggiari/.m2/repository/org/apache/avro/avro/1.5.3/avro-1.5.3.jar:/home/jmspaggiari/.m2/repository/org/apache/avro/avro-ipc/1.5.3/avro-ipc-1.5.3.jar:/home/jmspaggiari/.m2/repository/org/apache/commons/commons-math/2.1/commons-math-2.1.jar:/home/jmspaggiari/.m2/repository/org/apache/ftpserver/ftplet-api/1.0.0/ftplet-api-1.0.0.jar:/home/jmspaggiari/.m2/repository/org/apache/ftpserver/ftpserver-core/1.0.0/ftpserver-core-1.0.0.jar:/home/jmspaggiari/.m2/repository/org/apache/ftpserver/ftpserver-deprecated/1.0.0-M2/ftpserver-deprecated-1.0.0-M2.jar:/home/jmspaggiari/.m2/repository/org/apache/hadoop/hadoop-core/1.0.4/hadoop-core-1.0.4.jar:/home/jmspaggiari/.m2/repository/org/apache/hadoop/hadoop-test/1.0.4/hadoop-test-1.0.4.jar:/home/jmspaggiari/.m2/repository/org/apache/httpcomponents/httpclient/4.1.2/httpclient-4.1.2.jar:/home/jmspaggiari/.m2/repository/org/apache/httpcomponents/httpcore/4.1.3/httpcore-4.1.3.jar:/home/jmspaggiari/.m2/repository/org/apache/mina/mina-core/2.0.0-M5/mina-core-2.0.0-M5.jar:/home/jmspaggiari/.m2/repository/org/apache/thrift/libthrift/0.8.0/libthrift-0.8.0.jar:/home/jmspaggiari/.m2/repository/org/apache/velocity/velocity/1.7/velocity-1.7.jar:/home/jmspaggiari/.m2/repository/org/apache/zookeeper/zookeeper/3.4.5/zookeeper-3.4.5.jar:/home/jmspaggiari/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.8.8/jackson-core-asl-1.8.8.jar:/home/jmspaggiari/.m2/
repository/org/codehaus/jackson/jackson-jaxrs/1.8.8/jackson-jaxrs-1.8.8.jar:/home/jmspaggiari/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.8.8/jackson-mapper-asl-1.8.8.jar:/home/jmspaggiari/.m2/repository/org/codehaus/jackson/jackson-xc/1.8.8/jackson-xc-1.8.8.jar:/home/jmspaggiari/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar:/home/jmspaggiari/.m2/repository/org/eclipse/jdt/core/3.1.1/core-3.1.1.jar:/home/jmspaggiari/.m2/repository/org/jamon/jamon-runtime/2.3.1/jamon-runtime-2.3.1.jar:/home/jmspaggiari/.m2/repository/org/jboss/netty/netty/3.2.4.Final/netty-3.2.4.Final.jar:/home/jmspaggiari/.m2/repository/org/jruby/jruby-complete/1.6.5/jruby-complete-1.6.5.jar:/home/jmspaggiari/.m2/repository/org/mockito/mockito-all/1.8.5/mockito-all-1.8.5.jar:/home/jmspaggiari/.m2/repository/org/mortbay/jetty/jetty/6.1.26/jetty-6.1.26.jar:/home/jmspaggiari/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26/jetty-util-6.1.26.jar:/home/jmspaggiari/.m2/repository/org/mortbay/jetty/jsp-2.1/6.1.14/jsp-2.1-6.1.14.jar:/home/jmspaggiari/.m2/repository/org/mortbay/jetty/jsp-api-2.1/6.1.14/jsp-api-2.1-6.1.14.jar:/home/jmspaggiari/.m2/repository/org/mortbay/jetty/servlet-api-2.5/6.1.14/servlet-api-2.5-6.1.14.jar:/home/jmspaggiari/.m2/repository/org/slf4j/slf4j-api/1.4.3/slf4j-api-1.4.3.jar:/home/jmspaggiari/.m2/repository/org/slf4j/slf4j-log4j12/1.4.3/slf4j-log4j12-1.4.3.jar:/home/jmspaggiari/.m2/repository/org/xerial/snappy/snappy-java/1.0.3.2/snappy-java-1.0.3.2.jar:/home/jmspaggiari/.m2/repository/stax/stax-api/1.0.1/stax-api-1.0.1.jar:/home/jmspaggiari/.m2/repository/tomcat/jasper-compiler/5.5.23/jasper-compiler-5.5.23.jar:/home/jmspaggiari/.m2/repository/tomcat/jasper-runtime/5.5.23/jasper-runtime-5.5.23.jar:/home/jmspaggiari/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/home/jmspaggiari/workspace/hbase-0.94.8-SNAPSHOT/bin/../target/classes:/home/jmspaggiari/workspace/hbase-0.94.8-SNAPSHOT/bin/../target/test-classes:/home/jmspaggiari/workspace/hbase-0
.94.8-SNAPSHOT/bin/../target:/home/jmspaggiari/workspace/hbase-0.94.8-SNAPSHOT/bin/../lib/*.jar:
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/local/jdk1.6.0_45/jre/lib/amd64/server:/usr/local/jdk1.6.0_45/jre/lib/amd64:/usr/local/jdk1.6.0_45/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-4-amd64
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:user.name=jmspaggiari
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/jmspaggiari
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/jmspaggiari/workspace/hbase-0.94.8-SNAPSHOT
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=latitude:2181,cube:2181,node3:2181 sessionTimeout=180000 watcher=hconnection
13/05/16 14:38:15 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 16902@cloudera
13/05/16 14:38:15 INFO zookeeper.ClientCnxn: Opening socket connection to server latitude/192.168.23.4:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
13/05/16 14:38:15 INFO zookeeper.ClientCnxn: Socket connection established to latitude/192.168.23.4:2181, initiating session
13/05/16 14:38:15 INFO zookeeper.ClientCnxn: Session establishment complete on server latitude/192.168.23.4:2181, sessionid = 0x23eadc5ddeb008f, negotiated timeout = 40000
13/05/16 14:38:15 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=latitude:2181,cube:2181,node3:2181 sessionTimeout=180000 watcher=catalogtracker-on-org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@79ee2c2c
13/05/16 14:38:15 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 16902@cloudera
13/05/16 14:38:15 INFO zookeeper.ClientCnxn: Opening socket connection to server latitude/192.168.23.4:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
13/05/16 14:38:15 DEBUG catalog.CatalogTracker: Starting catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@77ff92f5
13/05/16 14:38:15 INFO zookeeper.ClientCnxn: Socket connection established to latitude/192.168.23.4:2181, initiating session
13/05/16 14:38:16 INFO zookeeper.ClientCnxn: Session establishment complete on server latitude/192.168.23.4:2181, sessionid = 0x23eadc5ddeb0090, negotiated timeout = 40000
13/05/16 14:38:16 DEBUG client.HConnectionManager$HConnectionImplementation: Looked up root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@79ee2c2c; serverName=node5,60020,1368669066647
13/05/16 14:38:16 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is node6:60020
13/05/16 14:38:16 DEBUG client.ClientScanner: Creating scanner over .META. starting at key 'test,,'
13/05/16 14:38:16 DEBUG client.ClientScanner: Advancing internal scanner to startKey at 'test,,'
13/05/16 14:38:16 DEBUG catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@77ff92f5
13/05/16 14:38:16 INFO zookeeper.ZooKeeper: Session: 0x23eadc5ddeb0090 closed
13/05/16 14:38:16 INFO zookeeper.ClientCnxn: EventThread shut down
13/05/16 14:38:16 DEBUG client.MetaScanner: Scanning .META. starting at row=test,,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@79ee2c2c
13/05/16 14:38:16 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for test,,1368729286037.32038a4d760aa9303643e2985dcd29a5. is node2:60020
13/05/16 14:38:16 DEBUG client.MetaScanner: Scanning .META. starting at row= for max=2147483647 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@79ee2c2c
13/05/16 14:38:17 DEBUG client.MetaScanner: Scanning .META. starting at row=test,,00000000000000 for max=2147483647 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@79ee2c2c
13/05/16 14:38:17 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 246.9m
13/05/16 14:38:17 INFO util.ChecksumType: Checksum can use java.util.zip.CRC32
13/05/16 14:38:17 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://node3:9000/user/hbase/familyDir2/myfam/hfile_1 first=aaaa last=aaaa
13/05/16 14:38:18 DEBUG mapreduce.LoadIncrementalHFiles: Going to connect to server region=test,,1368729286037.32038a4d760aa9303643e2985dcd29a5., hostname=node2, port=60020 for row
{code}
                
> Cells cannot be overwritten with bulk loaded HFiles
> ---------------------------------------------------
>
>                 Key: HBASE-8521
>                 URL: https://issues.apache.org/jira/browse/HBASE-8521
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: Jonathan Natkins
>         Attachments: HBASE-8521.diff, HBASE-8521-v0-0.94.patch, HBASE-8521-v1-0.94.patch, hfileDirs.tar.gz
>
>
> Let's say you have a pre-built HFile that contains a cell:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value1')
> We bulk load this first HFile. Now, let's create a second HFile that contains a cell that overwrites the first:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value2')
> That gets bulk loaded into the table, but the value that HBase bubbles up is still 'value1'.
> It seems that there's no way to overwrite a cell for a particular timestamp without an explicit put operation. This seems to be the case even after minor and major compactions happen.
> My guess is that this is pretty closely related to the sequence number work being done on the compaction algorithm via HBASE-7842, but I'm not sure if one would fix the other.
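
The scenario in the description can be illustrated with a toy model. This is not HBase code and makes no claim about how HBase orders cells internally: the `Cell` type, the `read` function and the `seq` field (a hypothetical per-file load order) are assumptions made only to show why two cells with identical row, family, qualifier and timestamp need some extra tiebreaker before the later one can win.

```python
from typing import NamedTuple

class Cell(NamedTuple):
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str
    seq: int  # hypothetical per-file load order, not an HBase field name

def read(cells, use_seq):
    """Return the value seen for one coordinate (all cells share row/family/qualifier)."""
    def key(c):
        # Newer timestamps sort first; optionally, later-loaded files win ties.
        return (-c.timestamp, -c.seq if use_seq else 0)
    # sorted() is stable, so without a tiebreaker the first-loaded cell stays first.
    return sorted(cells, key=key)[0].value

first = Cell("rowkey1", "family1", "qual1", 1234, "value1", seq=1)
second = Cell("rowkey1", "family1", "qual1", 1234, "value2", seq=2)

print(read([first, second], use_seq=False))  # value1
print(read([first, second], use_seq=True))   # value2
```

With timestamps alone the two cells compare equal and the earlier one is returned, which matches the behavior reported; adding an ordering between files is one way the second load could take precedence.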

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
