Subject: Re: Accumulo Utilities
From: roshanp@gmail.com
Date: Thu, 28 Mar 2013 12:15:41 -0400
To: user@accumulo.apache.org

Thanks! I like the idea of sending my own thread pool to the batch scanner, that would definitely be the better solution.

Yeah I thought about creating a batch scanner with only one thread, but I was not sure if that is making a separate thread (outside of the current one) or using the current one. At the time I did not want a new thread to be created at all. Though, didn't realize the Scanner was also spinning up a thread at all, thought that was in process.

To mitigate the separate RPC call per range, would it make more sense to do a "binRanges" based on the ranges at the tablets to reduce the number of ranges?

On Mar 28, 2013, at 11:55 AM, Keith Turner wrote:

> I took a quick look at the code.  Excluding the threading issue, a
> major conceptual difference is that BatchScannerWithScanners seems to
> do an RPC round trip for each range.  The TabletServerBatchReader
> sends all of the ranges that a tablet server needs to lookup in one
> RPC.
>
> Instead of creating a BatchScannerWithScanners, maybe you could create
> a batch scanner with just one thread when resources are exceeded?
> This will be similar to what you are doing now, just one thread will
> be doing work fetching data.  The client thread would just be waiting
> on this background thread.  Although this does allow the processing
> of results to happen concurrently with fetching of data.  Using
> BatchScannerWithScanners would not allow this.
>
> Something to be aware of, the regular scanner will spin up a read
> ahead thread if you read a lot of data through it.  It does not do
> this immediately, only after fetching a few batches of key value pairs
> from the tablet server.  If this happens you could have one thread
> fetching data while the client thread processes results.
>
> Do you think we should open a ticket about giving users control over
> threads created by client code?  Maybe users could pass in their own
> thread pool to a batch scanner?
>
> Keith
>
> On Thu, Mar 28, 2013 at 11:00 AM, wrote:
>> In some of my projects, we needed to control the number of threads
>> spun up with the use of multiple batch scanners. We created a utility
>> to control the number of threads, and if the max threads has been
>> reached, return a batch scanner that is actually backed by Scanners.
>> Wanted to get any feedback on the code. Seems like such a simple
>> thing to do, I bet someone already has this. Thanks!
>>
>> https://github.com/calrissian/mango/tree/master/accumulo
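
For anyone following along, the thread-limiting idea in this thread (cap the total number of query threads, and fall back to a single-thread batch scanner once the cap is hit) can be sketched with nothing but the JDK. This is only a rough sketch under my own assumptions: it is not the code in the mango repository and not an Accumulo API. The names ThreadBudget, Lease, acquire and available are hypothetical, and the number it hands back is meant to be used as the query-thread count when the caller creates its batch scanner.

```java
import java.util.concurrent.Semaphore;

/**
 * Hypothetical helper (not part of Accumulo or mango): caps the number of
 * query threads handed out to batch scanners across one application.
 */
final class ThreadBudget {
    private final Semaphore permits;

    ThreadBudget(int maxThreads) {
        if (maxThreads < 1) {
            throw new IllegalArgumentException("maxThreads must be >= 1");
        }
        this.permits = new Semaphore(maxThreads);
    }

    /** A grant of query threads; closing it returns any held permits. */
    final class Lease implements AutoCloseable {
        private final int threads; // thread count the caller should use
        private final int held;    // permits actually taken from the budget
        private boolean closed;

        private Lease(int threads, int held) {
            this.threads = threads;
            this.held = held;
        }

        int threads() {
            return threads;
        }

        @Override
        public synchronized void close() {
            if (!closed) {          // idempotent: release permits only once
                closed = true;
                permits.release(held);
            }
        }
    }

    /**
     * Never blocks. Grants the full request if the budget allows it;
     * otherwise falls back to one thread, as suggested above. Assumption:
     * the fallback thread is NOT counted against the budget, so the cap
     * can be exceeded by one thread per fallback scanner.
     */
    Lease acquire(int requested) {
        if (requested < 1) {
            throw new IllegalArgumentException("requested must be >= 1");
        }
        if (permits.tryAcquire(requested)) {
            return new Lease(requested, requested);
        }
        return new Lease(1, 0);
    }

    /** Permits currently left in the budget. */
    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        ThreadBudget budget = new ThreadBudget(4);
        try (Lease first = budget.acquire(3); Lease second = budget.acquire(3)) {
            // first fits in the budget; second exceeds it and gets the fallback
            System.out.println(first.threads() + " " + second.threads());
        }
    }
}
```

The lease would be held for the lifetime of the scanner and closed together with it. Using a non-blocking tryAcquire keeps the calling thread from stalling when the budget is exhausted, which seems closer to the original utility's behaviour than waiting for threads to free up. Note that this only limits what the application asks for; it does nothing about the read-ahead thread a regular Scanner may start on its own.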