From: Ameya Kanitkar
Date: Tue, 25 Feb 2014 11:52:21 -0800
Subject: Re: Is HBase is feasible for storing 4-5 MB of data as cell value
To: user@hbase.apache.org
Cc: upendra1024@gmail.com

The only other thing I'd add is that, by default, HBase caps the size of a
single cell (KeyValue) at 10 MB (I think). You can change that with the
hbase.client.keyvalue.maxsize setting in hbase-site.xml. A value of -1 means
no cap; any other number sets a cap appropriate for your use case.

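For illustration, an hbase-site.xml entry that raises the cap might look like
the sketch below. The 20 MB figure is an arbitrary example value, not
something suggested in this thread. This is the client-side configuration
file, and the right number depends on your largest expected cell.

```xml
<!-- hbase-site.xml on the client: illustrative values only -->
<configuration>
  <property>
    <name>hbase.client.keyvalue.maxsize</name>
    <!-- 20971520 bytes = 20 * 1024 * 1024 (20 MB), an assumed example.
         Zero or a negative value such as -1 disables the size check. -->
    <value>20971520</value>
  </property>
</configuration>
```

A put whose KeyValue exceeds this limit is rejected on the client side, so
the cap should sit above the largest attachment you intend to store inline.
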
Ameya

On Tue, Feb 25, 2014 at 12:12 AM, shashwat shriparv
<dwivedishashwat@gmail.com> wrote:

> Yes, for sure you can use HBase for this. You can:
> 1. Keep the different fields of a mail in different columns of a column
>    family, with the attachment as a binary array, also in a column.
> 2. Keep the whole message in columns in HBase, store attachments that are
>    large enough on HDFS, and keep a reference to them in the HBase table.
> 3. Decide the schema yourself; you can lay out a matrix of how you store
>    the values and decide from that.
>
> Warm Regards,
> Shashwat Shriparv
>
> On Tue, Feb 25, 2014 at 12:55 PM, Upendra Yadav wrote:
>
> > I have to use HBase and have mixed types of data.
> >
> > Some items are 1-4 KB (mail headers, ...) and others are over 5 MB
> > (attachments, ...).
> >
> > We also need only random access to any of the data.
> >
> > Is HBase feasible for storing this type of data?
> >
> > What should my schema design be? Will I have to go with 2 different
> > tables, the 1st for the 1-4 KB items and the 2nd for big files (because
> > a memstore flush will also flush the other CF, and because of the heavy
> > random access)?
> >
> > Or is there another way?
> >
> > Thanks