From: libis
Date: Tue, 5 Sep 2017 09:53:14 +0800
Subject: Re: should we split the scan range into several segments when the scan range is located in a single region?
To: dev@hbase.apache.org

Thanks, Mikhail. I would be glad to pick up HBASE-18090 (my JIRA account is xinxin fan). I notice that HBASE-16894 (https://issues.apache.org/jira/browse/HBASE-16894) tries to address a similar problem. Chia-Ping, could you take a look at it?

2017-09-04 20:41 GMT+08:00 Chia-Ping Tsai:

> Thanks for the information, Mikhail. It seems to me the issue is popular.
> libis, could you take over HBASE-18090? I can assign the issue to you if I
> get your JIRA account.
>
> On 2017-09-04 20:26, Mikhail Antonov wrote:
> > I filed https://issues.apache.org/jira/browse/HBASE-18090 some time ago
> > and attached a draft patch to it. It's not complete, as we need some deeper
> > changes in the way we open regions (see comments), but the basic stuff works.
> > (I ended up going the other route and didn't have the bandwidth to finish
> > it. It would be great if someone picked it up.)
> >
> > Mikhail
> >
> > On Mon, Sep 4, 2017 at 11:13 AM Chia-Ping Tsai wrote:
> >
> > > That sounds good. There are some related issues; see
> > > https://issues.apache.org/jira/browse/HBASE-4914 and
> > > https://issues.apache.org/jira/browse/HBASE-4063.
> > >
> > > On 2017-09-04 15:06, libis wrote:
> > > > Hi
> > > >
> > > > When TableInputFormat is used to source an HBase table in a MapReduce
> > > > job, its splitter creates one map task for each region of the table.
> > > > However, in some cases the user's scan range may fall within a single
> > > > region, so only one mapper is created. For example, if the rowkey of
> > > > the table is 'md5(userid) + timestamp' and a client wants to scan the
> > > > data of a specific user for the latest month with MR, it is quite
> > > > likely that only one mapper will do the work.
> > > >
> > > > In order to scan data in parallel when the user's scan range is located
> > > > in a single region, should we split the scan range into several
> > > > segments within the region?
> > > >
> > > > Best,
> > > >
> > > > xinxin
> > > >
> >
> > --
> > Thanks,
> > Michael Antonov
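
The range-splitting idea raised in the original question can be illustrated with a minimal sketch. It assumes that row keys can be compared as big-endian unsigned integers and that the scan range is bounded on both sides. `split_key_range` is a hypothetical helper written for this illustration; it is not part of HBase or TableInputFormat, and it is not the approach taken in the HBASE-18090 draft patch.

```python
from typing import List, Tuple


def split_key_range(start: bytes, stop: bytes, n: int) -> List[Tuple[bytes, bytes]]:
    """Split the row-key range [start, stop) into at most n contiguous segments.

    Hypothetical helper: keys are right-padded with zero bytes to a common
    width and interpolated as big-endian integers.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if start >= stop:
        raise ValueError("start must sort before stop")

    # Pad both keys to the same width so they compare as fixed-size integers.
    width = max(len(start), len(stop))
    lo = int.from_bytes(start.ljust(width, b"\x00"), "big")
    hi = int.from_bytes(stop.ljust(width, b"\x00"), "big")

    # The keys differ only by trailing zero bytes: nothing to interpolate.
    if hi <= lo:
        return [(start, stop)]

    # Never ask for more segments than there are distinct keys in the range.
    n = min(n, hi - lo)

    # Evenly spaced boundaries, converted back to byte strings.
    bounds = [lo + (hi - lo) * i // n for i in range(n + 1)]
    keys = [b.to_bytes(width, "big") for b in bounds]

    # Keep the caller's exact outer boundaries rather than the padded ones.
    keys[0], keys[-1] = start, stop
    return list(zip(keys[:-1], keys[1:]))


# Each (segment_start, segment_stop) pair could back one map task.
segments = split_key_range(b"\x00", b"\x08", 4)
```

Even spacing in key space does not imply even spacing in data volume, so a real splitter would probably need some knowledge of how rows are distributed inside the region.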