From: Biplob Biswas
Date: Wed, 15 Aug 2018 11:06:29 +0200
Subject: difference - filterlist with rowfilters vs multiget
To: user@hbase.apache.org

Hi,

During our implementation for fetching multiple records from an HBase table, we came across a discussion regarding the best
way to get records out.

The first implementation is something like:

    FilterList filterList = new FilterList(Operator.MUST_PASS_ONE);
    for (String rowKey : rowKeys) {
        filterList.addFilter(new RowFilter(CompareOp.EQUAL,
            new BinaryComparator(Bytes.toBytes(rowKey))));
    }

    Scan scan = new Scan();
    scan.setFilter(filterList);
    ResultScanner resultScanner = table.getScanner(scan);

and the second implementation is something like this:

    List<Get> listGet = rowKeys.stream()
        .map(entry -> new Get(Bytes.toBytes(entry)))
        .collect(Collectors.toList());
    Result[] results = table.get(listGet);

The only difference I see directly is that the filterList version would do a full table scan, whereas the multiget would not. But what other benefits does one have over the other?

Also, when HBase finds out that all the filters in the filterList are RowFilters, would it perform some kind of optimization and do a multiget rather than a full table scan?

Thanks & Regards
Biplob Biswas
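
P.S. To make the difference I am describing concrete, here is a toy sketch. It is not HBase code: a java.util.TreeMap stands in for the sorted table, and the class, method, and key names (ScanVsMultiGet, scanWithRowFilters, multiGet, row-0000 and so on) are all made up for illustration. It assumes the scan has no start/stop row and that no optimization kicks in, which is exactly the part I am unsure about. It only counts how many rows each approach has to look at.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model only: a TreeMap stands in for a sorted table; no HBase classes are used. */
public class ScanVsMultiGet {

    /** Values returned, plus the number of rows the approach had to look at. */
    public static final class Outcome {
        public final List<String> values = new ArrayList<>();
        public int rowsExamined = 0;
    }

    /** Builds a hypothetical table with keys row-0000 .. row-(n-1), for n up to 10000. */
    public static TreeMap<String, String> buildTable(int n) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            table.put(String.format("row-%04d", i), "value-" + i);
        }
        return table;
    }

    /**
     * Models an unbounded scan with OR-ed row-equality filters (assumption: no
     * optimization): every row is visited and tested against the wanted keys.
     */
    public static Outcome scanWithRowFilters(TreeMap<String, String> table, List<String> rowKeys) {
        Set<String> wanted = new HashSet<>(rowKeys);
        Outcome out = new Outcome();
        for (Map.Entry<String, String> row : table.entrySet()) {
            out.rowsExamined++;
            if (wanted.contains(row.getKey())) {
                out.values.add(row.getValue());
            }
        }
        return out;
    }

    /** Models a multiget: one direct lookup per requested key. */
    public static Outcome multiGet(TreeMap<String, String> table, List<String> rowKeys) {
        Outcome out = new Outcome();
        for (String key : rowKeys) {
            out.rowsExamined++;
            String value = table.get(key);
            if (value != null) {
                out.values.add(value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = buildTable(1000);
        List<String> keys = Arrays.asList("row-0003", "row-0500", "row-0999");
        Outcome scan = scanWithRowFilters(table, keys);
        Outcome gets = multiGet(table, keys);
        System.out.println("scan examined " + scan.rowsExamined
            + " rows, returned " + scan.values.size());
        System.out.println("multiget examined " + gets.rowsExamined
            + " rows, returned " + gets.values.size());
    }
}
```

In this toy model both approaches return the same three values, but the scan looks at all 1000 rows while the multiget looks at 3. Whether a real HBase scan with such a filterList behaves like the first method is what I would like to confirm.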