In-Reply-To: <1372693093.35732.YahooMailNeo@web140604.mail.bf1.yahoo.com>
References: <1372693093.35732.YahooMailNeo@web140604.mail.bf1.yahoo.com>
Date: Mon, 1 Jul 2013 22:33:18 +0530
Subject: Re: How many column families in one table ?
From: Vimal Jain
To: user@hbase.apache.org, lars hofhansl
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=089e0102f40082d40a04e07634fc

--089e0102f40082d40a04e07634fc
Content-Type: text/plain; charset=ISO-8859-1

Hi Lars,
1) I have around 140 columns for each row. Of these, around 100 hold Java primitive data types; the remaining 40 contain serialized Java objects as byte arrays.
Yes, I do delete data, but very infrequently (about 1 out of 5K operations).
I don't run any compactions.
2) I ran the scan while watching CPU, IO, and other system parameters. They looked normal, with system load at 0.1-0.3.
3) Yes, I have 3 versions of each cell (the default value).


On Mon, Jul 1, 2013 at 9:08 PM, lars hofhansl wrote:

> The performance you're seeing is definitely not typical. A couple of
> further questions:
> - How large are your KVs (columns)?
> - Do you delete data? Do you run major compactions?
> - Can you measure: CPU, IO, context switches, etc, during the scanning?
> - Do you have many versions of the columns?
>
>
> Note that HBase is a key value store, i.e. the storage is sparse. Each
> column is represented by its own key value pair, and HBase has to do the
> work to reassemble the data.
>
>
> -- Lars
>
>
>
> ________________________________
> From: Vimal Jain
> To: user@hbase.apache.org
> Sent: Monday, July 1, 2013 4:44 AM
> Subject: Re: How many column families in one table ?
>
>
> Hi,
> We had some hardware constraints, along with the fact that our total data
> size was in GBs.
> That's why, to start with HBase, we first began with pseudo distributed
> mode and thought that if required we would upgrade to fully distributed mode.
>
>
>
> On Mon, Jul 1, 2013 at 5:09 PM, Ted Yu wrote:
>
> > bq. I have configured HBase in pseudo distributed mode on top of HDFS.
> >
> > What was the reason for using pseudo distributed mode in production setup ?
> >
> > Cheers
> >
> > On Mon, Jul 1, 2013 at 1:44 AM, Vimal Jain wrote:
> >
> > > Thanks Dhaval/Michael/Ted/Otis for your replies.
> > > Actually, I asked this question because I am seeing some performance
> > > degradation in my production HBase setup.
> > > I have configured HBase in pseudo distributed mode on top of HDFS.
> > > I have created 17 column families :( . I am actually using 14 out of
> > > these 17 column families.
> > > Each column family has on average 8-10 column qualifiers, so in total
> > > there are around 140 columns for each row key.
> > > I have around 1.6 million rows in the table.
> > > To completely scan the table for all 140 columns, it takes around 30-40
> > > minutes.
> > > Is this normal, or should I redesign my table schema (probably merging 4-5
> > > column families into one, so that in the end I have just 3-4 CFs)?
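A rough back-of-envelope check of the figures quoted above, as a minimal sketch only. It assumes every row has all 140 columns populated and that the scan returns one version per cell (neither is stated in the thread), and it relies on the point that each column is stored as its own key value pair. The names ROWS, COLUMNS_PER_ROW and scan_rates are illustrative, not part of any HBase API.

```python
# Back-of-envelope scan throughput from the figures quoted in the thread.
# Assumptions (not stated in the thread): every row has all 140 columns
# populated, and the scan returns a single version of each cell.

ROWS = 1_600_000          # "around 1.6 million rows"
COLUMNS_PER_ROW = 140     # "around 140 columns for each row key"
SCAN_MINUTES = (30, 40)   # reported full-scan time range


def scan_rates(rows, columns_per_row, minutes):
    """Return (rows/sec, KeyValues/sec) for a full scan taking `minutes`."""
    seconds = minutes * 60
    key_values = rows * columns_per_row  # one KeyValue per populated column
    return rows / seconds, key_values / seconds


total_kvs = ROWS * COLUMNS_PER_ROW
print(f"KeyValues touched by a full scan: {total_kvs:,}")

for m in SCAN_MINUTES:
    rows_per_sec, kvs_per_sec = scan_rates(ROWS, COLUMNS_PER_ROW, m)
    print(f"{m} min: {rows_per_sec:,.0f} rows/s, {kvs_per_sec:,.0f} KeyValues/s")
```

Under those assumptions a full scan touches about 224 million KeyValues, which works out to roughly 670-890 rows per second for the reported 30-40 minute range.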
> > >
> > >
> > > On Sat, Jun 29, 2013 at 12:06 AM, Otis Gospodnetic <
> > > otis.gospodnetic@gmail.com> wrote:
> > >
> > > > Hm, works for me -
> > > >
> > > > http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
> > > >
> > > > Shorter version: http://search-hadoop.com/m/qOx8l15Z1q42
> > > >
> > > > Otis
> > > > --
> > > > Solr & ElasticSearch Support -- http://sematext.com/
> > > > Performance Monitoring -- http://sematext.com/spm
> > > >
> > > >
> > > > On Fri, Jun 28, 2013 at 8:40 AM, Vimal Jain wrote:
> > > > > Hi All,
> > > > > Thanks for your replies.
> > > > >
> > > > > Ted,
> > > > > Thanks for the link, but it's not working. :(
> > > > >
> > > > >
> > > > > On Fri, Jun 28, 2013 at 5:57 PM, Ted Yu wrote:
> > > > >
> > > > >> Vimal:
> > > > >> Please also refer to:
> > > > >>
> > > > >> http://search-hadoop.com/m/qOx8l15Z1q42/column+families+fb&subj=Re+HBase+Column+Family+Limit+Reasoning
> > > > >>
> > > > >> On Fri, Jun 28, 2013 at 1:37 PM, Michel Segel <michael_segel@hotmail.com> wrote:
> > > > >>
> > > > >> > Short answer... As few as possible.
> > > > >> >
> > > > >> > 14 CF doesn't make too much sense.
> > > > >> >
> > > > >> > Sent from a remote device. Please excuse any typos...
> > > > >> >
> > > > >> > Mike Segel
> > > > >> >
> > > > >> > On Jun 28, 2013, at 12:20 AM, Vimal Jain wrote:
> > > > >> >
> > > > >> > > Hi,
> > > > >> > > How many column families should there be in an HBase table? Is there any
> > > > >> > > performance issue in read/write if we have more column families?
> > > > >> > > I have designed one table with around 14 column families in it, with each
> > > > >> > > having on average 6 qualifiers.
> > > > >> > > Is it a good design ?
> > > > >> > >
> > > > >> > > --
> > > > >> > > Thanks and Regards,
> > > > >> > > Vimal Jain
> > > > >> >
> > > > >>
> > > > >
> > > > > --
> > > > > Thanks and Regards,
> > > > > Vimal Jain
> > > >
> > >
> > > --
> > > Thanks and Regards,
> > > Vimal Jain
> >
>
> --
> Thanks and Regards,
> Vimal Jain
>

--
Thanks and Regards,
Vimal Jain

--089e0102f40082d40a04e07634fc--