Date: Mon, 9 Dec 2013 15:08:03 +0200
Subject: Re: How and where exactly LSM trees are used in HBase?
From: Mikael Sitruk
To: user@hbase.apache.org

LSM trees are the basis for reducing random I/O, which is a major performance factor in big data systems. A good overview can be found in the book "HBase: The Definitive Guide" by Lars George.

The basic idea is that you have an in-memory structure for the latest changes and a structure stored in files. The content of each file is always ordered by key, and each entry in the file is just the row key, column family identifier, column name, timestamp, and value (plus a marker).

When the memory is full, the in-memory structure is flushed to disk. When a certain number of files has accumulated on the filesystem, the files are merged into bigger ones. Since the files are already ordered, the merge is very fast (like the merge step in the mergesort algorithm).

On Sun, Dec 8, 2013 at 8:42 AM, Ted Yu wrote:

> Searching for 'lsm tree hbase' would give you several articles.
>
> I am in China - the search results are mostly in Chinese.
>
> You should be able to read this:
>
> http://stackoverflow.com/questions/13762992/log-structured-merge-tree-in-hbase
>
> Cheers
>
> On Wed, Dec 4, 2013 at 6:49 PM, AnilKumar B wrote:
>
> > Hi,
> >
> > We are trying to understand how and where exactly LSM trees are used in HBase. As per our current understanding, they are used while flushing the memstore to store files and during HFile compaction, and they sit on top of HFiles at the memstore level.
> >
> > Is this understanding correct? Can you please give more insight on this? How exactly is the merging done?
> >
> > Thanks & Regards,
> > B Anil Kumar.
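
P.S. On the question of how the merging is done: the flush-and-merge cycle described in the reply can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not HBase code. The class name ToyLSMStore, the count-based thresholds, and the sequence number used in place of a timestamp are all made up for illustration; a real store flushes by memory size, keeps multiple versions per cell, and uses more elaborate compaction policies.

```python
import bisect
import heapq


class ToyLSMStore:
    """Hypothetical toy LSM store; names and thresholds are illustrative only."""

    def __init__(self, flush_threshold=3, compact_threshold=3):
        self.flush_threshold = flush_threshold      # entries held in memory before a flush
        self.compact_threshold = compact_threshold  # number of files that triggers a merge
        self.memstore = {}   # in-memory structure: key -> (seq, value)
        self.files = []      # each "file" is a key-sorted list of (key, seq, value); newest last
        self.seq = 0         # stand-in for a timestamp (assumption)

    def put(self, key, value):
        self.seq += 1
        self.memstore[key] = (self.seq, value)
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the in-memory entries out as one new file, ordered by key."""
        if not self.memstore:
            return
        run = sorted((k, s, v) for k, (s, v) in self.memstore.items())
        self.files.append(run)
        self.memstore = {}
        if len(self.files) >= self.compact_threshold:
            self.compact()

    def compact(self):
        """Merge all files into one, as in the merge step of mergesort."""
        merged = []
        # heapq.merge streams the already-sorted files without re-sorting them.
        # Ordering by (key, -seq) puts the newest version of each key first.
        for key, seq, value in heapq.merge(*self.files, key=lambda e: (e[0], -e[1])):
            if merged and merged[-1][0] == key:
                continue  # an older version of a key that was already emitted
            merged.append((key, seq, value))
        self.files = [merged]

    def get(self, key):
        """Check memory first, then the files from newest to oldest."""
        if key in self.memstore:
            return self.memstore[key][1]
        for run in reversed(self.files):
            i = bisect.bisect_left(run, (key,))  # binary search in a sorted file
            if i < len(run) and run[i][0] == key:
                return run[i][2]
        return None
```

Because every file is already sorted, the merge only ever compares the current head entry of each file, so it reads each file sequentially once. That sequential access pattern is what keeps random I/O low.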