Subject: Re: Question about HFile seeking
From: Stack
To: Hbase-User
Date: Thu, 16 May 2013 14:50:36 -0700

What is your query? If you are scanning over rows of 100k, yeah, you will
go through each row's content unless you specify that you are only
interested in some subset of the rows. Then a 'skipping' facility will cut
in, where we use the index to skip over unwanted content.

St.Ack

On Thu, May 16, 2013 at 2:42 PM, Varun Sharma wrote:

> Nothing, I am just curious...
>
> So, we will do a bunch of wasteful scanning. That is, let's say row1 has
> col1 - col100000, basically 100K columns: we will scan all those key
> values even though we are going to discard them, is that correct?
>
> On Thu, May 16, 2013 at 2:30 PM, Stack wrote:
>
> > What are you seeing, Varun (or think you are seeing)?
> > St.Ack
> >
> > On Thu, May 16, 2013 at 2:30 PM, Stack wrote:
> >
> > > On Thu, May 16, 2013 at 2:03 PM, Varun Sharma wrote:
> > >
> > >> Or do we use some kind of demarcator b/w rows and columns and
> > >> timestamps when building the HFile keys and the indices?
> > >
> > > No demarcation, but in KeyValue we keep the lengths and offsets of
> > > the row, column family name, column family qualifier, etc., so the
> > > comparators only compare pertinent bytes.
> > >
> > > If you are doing a prefix scan w/ row1c, we should be starting the
> > > scan at row1c, not row1 (or, more correctly, at the row that starts
> > > the block we believe has a row1c row in it...).
> > >
> > > St.Ack
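
Illustration of the "no demarcation, but lengths and offsets" point in the thread above. This is a minimal sketch under stated assumptions, not HBase's actual KeyValue layout or comparator: the class KeySketch, its two-field key (a 2-byte row length, the row bytes, then the qualifier bytes), and the methods encode, flat, rowLength and compare are all hypothetical names invented for this example. It only shows why a stored length lets a comparator tell "row1" + "c" apart from "row1c" + "" even though no separator byte is written.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified key layout (NOT the real HBase KeyValue format):
//   [2-byte row length][row bytes][qualifier bytes]
final class KeySketch {

    // Naive key: row and qualifier concatenated with no length and no delimiter.
    static byte[] flat(String row, String qualifier) {
        return (row + qualifier).getBytes(StandardCharsets.UTF_8);
    }

    // Length-prefixed key: the row length is stored so the row/qualifier
    // boundary can be recovered later. Rows over 65535 bytes are not handled.
    static byte[] encode(String row, String qualifier) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[2 + r.length + q.length];
        out[0] = (byte) (r.length >>> 8);
        out[1] = (byte) r.length;
        System.arraycopy(r, 0, out, 2, r.length);
        System.arraycopy(q, 0, out, 2 + r.length, q.length);
        return out;
    }

    // Read back the stored 2-byte row length.
    static int rowLength(byte[] key) {
        return ((key[0] & 0xff) << 8) | (key[1] & 0xff);
    }

    // Compare the row bytes first; only if the rows are equal, compare the
    // qualifier bytes. Each step looks only at the pertinent byte range.
    static int compare(byte[] a, byte[] b) {
        int aRow = rowLength(a);
        int bRow = rowLength(b);
        int c = Arrays.compareUnsigned(a, 2, 2 + aRow, b, 2, 2 + bRow);
        if (c != 0) {
            return c;
        }
        return Arrays.compareUnsigned(a, 2 + aRow, a.length, b, 2 + bRow, b.length);
    }

    public static void main(String[] args) {
        // Without lengths the two keys are byte-for-byte identical.
        System.out.println("flat keys equal: "
                + Arrays.equals(flat("row1", "c"), flat("row1c", "")));
        // With lengths, row "row1" sorts before row "row1c".
        System.out.println("row1/c sorts before row1c/: "
                + (compare(encode("row1", "c"), encode("row1c", "")) < 0));
    }
}
```

In this sketch a scan that wants row1c can seek past every key whose row is row1, however many qualifiers that row has, because the comparison never mixes row bytes with qualifier bytes. Whether a real scan actually skips content depends on what the query asks for, as discussed in the thread.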