From: Adam Kocoloski
To: user@couchdb.apache.org
Subject: Re: Store meta data of millions of images for search purpose
Date: Wed, 14 Jun 2017 13:17:54 -0400
Message-Id: <8C00C38E-061E-4886-BEDA-53D3BF315F1A@apache.org>
In-Reply-To: <48B90C5F0B65C944A52392D05C410EE031F6D088@genmail2010.vitalhuber.local>

Hi Ajay, the view engine will happily keep up with 10k - 20k updates per day. If you’re using CouchDB 2.0 you can distribute this database across several underlying physical shards. You won’t need to do that just to keep up with your designed update rate, but an index with a billion entries will be easier to manage operationally if it’s sharded. Compaction in particular can be an unwieldy operation on an index that large.

Cheers, Adam

> On Jun 14, 2017, at 8:47 AM, Ajay Pawaskar wrote:
>
> As per the application, there will be multiple images per record [201700000000002...]. Images can be of different types [current, viewable]; for example, one image is marked bCurrent=true and another bCurrent=false. There will then be a search where I need to find the images related to a record that have bCurrent=true/false. If I make one document per image, the number of documents will increase [to more than billions].
>
> -----Original Message-----
> From: aa mm [mailto:assaf.morami@gmail.com]
> Sent: Wednesday, June 14, 2017 6:11 PM
> To: user@couchdb.apache.org
> Subject: Re: Store meta data of millions of images for search purpose
>
> What does each document represent? Why do you need to generate ids? Is this a requirement?
>
> If not, and the image file name is unique, then you can make each document represent an image. The _id will be the image file name, and thus you won't need a view to access an image; you'll need only the image name.
>
> Assaf.
>
>
> On 14 June 2017 01:13 PM, "Ajay Pawaskar" wrote:
>
> Hi,
> I have a question about storing millions/billions of documents in CouchDB and using a view to get the required documents.
> I would like to know about the performance/scalability of views/CouchDB in the following case.
>
> I have an application where I need to store metadata of millions/billions of images for search purposes. Images will be added/updated/deleted/retrieved on a regular basis [10000/20000 per day].
>
> We are thinking of storing these documents in the following format, e.g.
> {
>   "_id": "201700000000002", /* this will be generated by our application */
>   "_rev": "1-b85e805bdd293a5f727517beea9512b3",
>   "12398712397129": {"bCurrent": true, "bCanView": true}, /* "12398712397129" is the image file name */
>   "98127397192319": {"bCurrent": false, "bCanView": false} /* "98127397192319" is the image file name */
> }
>
> {
>   "_id": "201700000000003", /* this will be generated by our application */
>   "_rev": "1-b85e83432d293a5f727517beea9512b3",
>   "89723979823929": {"bCurrent": true, "bCanView": true}, /* "89723979823929" is the image file name */
>   "92347324667324": {"bCurrent": false, "bCanView": false}, /* "92347324667324" is the image file name */
>   "72832532467217": {"bCurrent": true, "bCanView": false} /* "72832532467217" is the image file name */
> }
>
>
> So if a user wants to get the current image for record 201700000000002, we will have the following view:
>
> function(doc) {
>   for (var prop in doc) {
>     // Skip CouchDB's own fields; every other property is an image entry.
>     if (prop != "_id" && prop != "_rev") {
>       if (doc[prop].bCurrent !== undefined && doc[prop].bCurrent) {
>         emit(doc._id, { RecordID: doc._id, ImageID: prop, bCurrent: doc[prop].bCurrent, bCanView: doc[prop].bCanView });
>       }
>     }
>   }
> }
>
> which will be called with key "201700000000002".
>
> But as mentioned earlier, images will be added/updated/deleted/retrieved on a regular basis [10000/20000 per day]. How is this going to affect view performance?
>
> Regards,
> Ajay.
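
The view in the original question can be exercised outside CouchDB to see which rows it would produce. What follows is only a minimal sketch: `emit` here is a stub that collects rows into an array, and `rows` and `queryByKey` are hypothetical helpers that mimic querying the view with a single key; none of this is how CouchDB's view server is actually implemented. The map function is the one from the question, with the `bCanView` key named explicitly and the braces balanced, and the sample documents are the two from the question.

```javascript
// Collected view rows; a stand-in for the view index (assumption, not CouchDB internals).
const rows = [];

// Stub for the emit() function that CouchDB provides to map functions.
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function from the question: one row per image whose bCurrent flag is true.
function map(doc) {
  for (var prop in doc) {
    // Skip CouchDB's own fields; every other property is an image entry.
    if (prop != "_id" && prop != "_rev") {
      if (doc[prop].bCurrent !== undefined && doc[prop].bCurrent) {
        emit(doc._id, {
          RecordID: doc._id,
          ImageID: prop,
          bCurrent: doc[prop].bCurrent,
          bCanView: doc[prop].bCanView
        });
      }
    }
  }
}

// The two sample documents from the question.
const docs = [
  {
    "_id": "201700000000002",
    "_rev": "1-b85e805bdd293a5f727517beea9512b3",
    "12398712397129": { "bCurrent": true, "bCanView": true },
    "98127397192319": { "bCurrent": false, "bCanView": false }
  },
  {
    "_id": "201700000000003",
    "_rev": "1-b85e83432d293a5f727517beea9512b3",
    "89723979823929": { "bCurrent": true, "bCanView": true },
    "92347324667324": { "bCurrent": false, "bCanView": false },
    "72832532467217": { "bCurrent": true, "bCanView": false }
  }
];

docs.forEach(map);

// Hypothetical helper: return the rows a query with this exact key would match.
function queryByKey(key) {
  return rows.filter(function (row) { return row.key === key; });
}

console.log(JSON.stringify(queryByKey("201700000000002"), null, 2));
```

Because the record id is the emitted key, all current images for a record come back from one keyed lookup, while images with bCurrent=false never enter the index at all.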