Date: Thu, 9 Aug 2012 16:30:27 +0530
Subject: Re: multi-threaded HTablePool, incrementColumnValue, compaction and large data set
From: Sambit Tripathy
To: user@hbase.apache.org

So, did you have any success with this problem? You could try asynchbase, the HBase client used in OpenTSDB.

On Mon, Jan 16, 2012 at 6:46 AM, Neil Yalowitz wrote:

> I'm seeing something unusual here and wanted to check whether it has
> occurred for any other HBase 0.90 users. I've read several emails on this
> list that recommend NOT using multi-threading in an MR job, so that's
> certainly under consideration. If anyone could share their experiences
> with multi-threading in an MR job, it would be very helpful. We are
> testing both implementations (with threading and without), and it is the
> threaded solution that causes the problem.
>
> We are processing log files with Puts in the map phase and a follow-up
> incrementColumnValue() to a separate "counts" table in the reducer. The
> reduce phase uses multi-threading: the reducer initializes an HTablePool
> in setup(), starts threads in reduce() (via a Java
> BlockingQueue/CompletionService) which perform the incrementColumnValue()
> and, depending on the value returned, create a Put for the "counts"
> table, and in cleanup() performs a completionService.take() whose result
> is ignored, then flushes the Puts queued by the threads.
>
> There are no issues for approximately the first 100GB of data inserted.
> After approximately 100GB, however, every subsequent job freezes during
> the reduce phase.
> What I see happening: at some point the reduce tasks (where the
> incrementColumnValue() takes place) hang and are eventually killed with
> the reason "task client has not responded for 600 seconds". The counters
> in the reduce job grow briefly, but then all the tasks' counters stop
> increasing and the tasks are eventually killed.
>
> Oddly, the problem does not occur if compaction is completely disabled
> (not just major compaction, but also setting
> hbase.hstore.compactionThreshold = 9999999 and
> hbase.hstore.blockingStoreFiles = 9999999).
>
> Could there be a bug in HTablePool with large data sets and compaction?
> Again, this works as expected for approximately the first 100 jobs (1GB
> each) but consistently fails after that. Also, to repeat: the problem
> does not occur with ALL compaction disabled.
>
> This is a difficult problem to describe, but I'm hoping someone may have
> feedback and/or similar experiences. I can provide code examples if
> anyone is curious.
>
> Neil Yalowitz
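For readers following along: the threading pattern Neil describes (submit increment tasks through a CompletionService in reduce(), then drain the results in cleanup()) can be sketched with plain java.util.concurrent. This is only a skeleton under assumptions, not Neil's actual code; a hypothetical in-memory AtomicLong stands in for the HBase incrementColumnValue() call so the sketch is self-contained and runnable:

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Skeleton of the reduce-phase threading pattern described in the thread.
// The real job would call HTable.incrementColumnValue() inside the task;
// here a hypothetical AtomicLong stands in for the "counts" table.
public class ReducerSketch {
    static final AtomicLong countsTable = new AtomicLong(); // stand-in for the HBase counter

    public static void main(String[] args) throws Exception {
        // setup(): create the worker pool (analogous to initializing HTablePool)
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<Long> completion = new ExecutorCompletionService<>(pool);

        // reduce(): submit one increment task per record
        int submitted = 0;
        for (int i = 0; i < 100; i++) {
            completion.submit(() -> countsTable.incrementAndGet()); // "incrementColumnValue"
            submitted++;
        }

        // cleanup(): take() every result (ignored), ensuring all increments finished
        for (int i = 0; i < submitted; i++) {
            completion.take().get();
        }
        pool.shutdown();
        System.out.println(countsTable.get()); // prints 100
    }
}
```

One detail worth checking in a real job: cleanup() must take() exactly as many results as were submitted, otherwise the task can block forever on take() and look exactly like the 600-second hang described above.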
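For anyone wanting to reproduce the "compaction fully disabled" configuration Neil mentions, the two settings go in hbase-site.xml. This is a debugging sketch only; pushing these limits to extreme values effectively disables compaction and store-file blocking and is not advisable in production:

```xml
<!-- hbase-site.xml: debugging sketch only, per the thread above.
     Extreme values effectively disable minor compaction and the
     store-file blocking limit; do not use in production. -->
<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>9999999</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>9999999</value>
</property>
```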