hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: HBase: "small" WAL transactions Q
Date Wed, 03 Oct 2012 00:20:10 GMT
This is an interesting observation. I had not thought about HBASE-5229 as a performance
improvement.
Currently HRegion.mutateRowsWithLocks acquires locks on all rows first (since the
contract here is a transaction), so for now you would get unnecessarily reduced concurrency
if you used that API for changes that do not need to be atomic.
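
To illustrate the concurrency point, here is a minimal toy model in plain Java. It is not HBase code: the class ToyRowLocks, its methods, and the choice to lock rows in sorted order (to avoid deadlock between two callers) are all assumptions made for this sketch. It only contrasts holding every row lock for the whole multi-row change with holding one row lock at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Toy model only, not HBase code. All names here are hypothetical.
public class ToyRowLocks {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String row) {
        return locks.computeIfAbsent(row, r -> new ReentrantLock());
    }

    // Transactional style: lock every distinct row up front (sorted, so two
    // callers cannot deadlock), apply all changes, then release everything.
    // Returns how many row locks were held at the same time.
    public int mutateAllOrNothing(List<String> rows, Runnable applyAll) {
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String row : new TreeSet<>(rows)) {
                ReentrantLock lock = lockFor(row);
                lock.lock();
                held.add(lock);
            }
            applyAll.run();
            return held.size();
        } finally {
            for (ReentrantLock lock : held) {
                lock.unlock();
            }
        }
    }

    // Independent style: each row is locked only while its own change is
    // applied, so other writers are blocked on at most one row at a time.
    // Returns the maximum number of row locks held at the same time.
    public int mutateIndependently(List<String> rows, Consumer<String> applyOne) {
        for (String row : rows) {
            ReentrantLock lock = lockFor(row);
            lock.lock();
            try {
                applyOne.accept(row);
            } finally {
                lock.unlock();
            }
        }
        return rows.isEmpty() ? 0 : 1;
    }
}
```

In this model a writer touching any one of the rows has to wait for the whole multi-row change in the first style, but only for that single row's change in the second.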


Also note that a put(List<Put>) operation already writes multiple updates to a single
WALEdit (best-effort batching).

-- Lars



________________________________
 From: Alex Baranau <alex.baranov.v@gmail.com>
To: user@hbase.apache.org 
Sent: Tuesday, October 2, 2012 4:29 PM
Subject: HBase: "small" WAL transactions Q
 
Hello,

Maybe a silly question.

Data in the WAL is written in small transactions. One transaction is a set of
KeyValues for a specific (single) row. Since we want each written transaction to
be durable, we write them into the WAL one by one (ideally with an FS sync()
call on each write), which is very costly.

Having bigger WAL transactions (writing changes to several "close" records)
should be more efficient (it should increase write throughput).
I.e. a WALEdit record would contain updates to multiple different rows.
As far as I understand, something like that was implemented in HBASE-5229 [1].
But it is not the default behavior when sending changes to multiple records to
the RS (e.g. when flushing the client-side buffer). It also cannot be forced. What
are the major reasons for not using that? Does locking multiple "close" rows
look so dangerous? Or is it simply not efficient (is there more to it
besides what I described above)?
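
The cost argument above can be illustrated with a toy model in plain Java. This is not HBase's WAL: ToyWal and its methods are hypothetical names, and sync() here only increments a counter that stands in for an expensive file system sync. The sketch just counts how many syncs each write style needs for the same set of row edits.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, not HBase's WAL. All names here are hypothetical.
public class ToyWal {
    private final List<String> log = new ArrayList<>();
    private int syncCount = 0;

    private void append(String edit) {
        log.add(edit);
    }

    // Stands in for an expensive FS sync; here it only counts calls.
    private void sync() {
        syncCount++;
    }

    // One small transaction per row: append and sync each edit separately.
    public void writeOneByOne(List<String> rowEdits) {
        for (String edit : rowEdits) {
            append(edit);
            sync();
        }
    }

    // One bigger transaction: append all edits, then sync once.
    public void writeBatched(List<String> rowEdits) {
        if (rowEdits.isEmpty()) {
            return;
        }
        for (String edit : rowEdits) {
            append(edit);
        }
        sync();
    }

    public int syncCount() {
        return syncCount;
    }

    public int size() {
        return log.size();
    }
}
```

For N row edits the first style performs N syncs and the second performs one, while both leave the same N entries in the log. The model says nothing about row locking, which is the other half of the question.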

Thank you,
Alex Baranau
------
Sematext :: http://sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr

[1] https://issues.apache.org/jira/browse/HBASE-5229