couchdb-user mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: Store meta data of millions of images for search purpose
Date Wed, 14 Jun 2017 17:17:54 GMT
Hi Ajay, the view engine will happily keep up with 10k - 20k updates per day. If you’re using
CouchDB 2.0 you can distribute this database across several underlying physical shards. You
won’t need to do that just to keep up with your designed update rate, but an index with
a billion entries will be easier to manage operationally if it’s sharded. Compaction in
particular can be an unwieldy operation on an index that large. Cheers,

Adam
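
For readers following the sharding suggestion above, here is a minimal sketch of how a sharded database might be requested. It assumes the `q` query parameter that CouchDB 2.0 accepts on `PUT /{db}` to set the shard count at creation time; the host, the database name `image_meta`, the value `q = 8`, and the helper name `createDbRequest` are placeholders, not anything from this thread.

```javascript
// Sketch only: builds the HTTP request for creating a sharded database.
// Assumption: CouchDB 2.0's PUT /{db} takes the shard count as ?q=<n>.
// Host, database name and q = 8 are hypothetical placeholders.
function createDbRequest(baseUrl, dbName, q) {
  if (!Number.isInteger(q) || q < 1) {
    throw new RangeError("q must be a positive integer");
  }
  // Resolve the database name against the server root and attach ?q=...
  const url = new URL(encodeURIComponent(dbName), baseUrl);
  url.searchParams.set("q", String(q));
  return { method: "PUT", url: url.toString() };
}

const req = createDbRequest("http://localhost:5984/", "image_meta", 8);
console.log(req.method, req.url);
```

The resulting method and URL can be sent with any HTTP client, with whatever credentials the server requires. A suitable shard count depends on the cluster and the data volume, so treat 8 as an arbitrary example.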

> On Jun 14, 2017, at 8:47 AM, Ajay Pawaskar <APawaskar@genesisinfo.com> wrote:
> 
> In our application each record [201700000000002...] will have multiple images. Images
> can be of different types [current, viewable]; for example, one image is marked bCurrent=true and
> another bCurrent=false. There will then be a search where I need to find the images belonging
> to a record that have bCurrent=true/false. If I make one document per image, the number of documents
> will increase [into the billions].
> 
> -----Original Message-----
> From: aa mm [mailto:assaf.morami@gmail.com] 
> Sent: Wednesday, June 14, 2017 6:11 PM
> To: user@couchdb.apache.org
> Subject: Re: Store meta data of millions of images for search purpose
> 
> What does each document represent? Why do you need to generate ids? Is this a requirement?
> 
> If not, and the image file name is unique, then you can make each document represent
> an image. The _id will be the image file name, so you won't need a view to access an image;
> you'll only need the image name.
> 
> Assaf.
> 
> 
> On 14 June 2017 at 01:13 PM, "Ajay Pawaskar" <APawaskar@genesisinfo.com> wrote:
> 
> Hi,
> I have a question about storing millions/billions of documents in CouchDB and
> using a view to get the required documents. I would like to know about the performance/scalability of
> views/CouchDB in the following case.
> 
> I have an application where I need to store metadata for millions/billions of images for
> search purposes. Images will be added/updated/deleted/retrieved on a regular basis [10000/20000
> per day].
> 
> We are thinking of storing these documents in the following format, e.g.
> {
>   "_id": "201700000000002", /* this will be generated by our application */
>   "_rev": "1-b85e805bdd293a5f727517beea9512b3",
>   "12398712397129": {"bCurrent": true, "bCanView": true}, /* "12398712397129" is the image file name */
>   "98127397192319": {"bCurrent": false, "bCanView": false} /* "98127397192319" is the image file name */
> }
> 
> {
>   "_id": "201700000000003", /* this will be generated by our application */
>   "_rev": "1-b85e83432d293a5f727517beea9512b3",
>   "89723979823929": {"bCurrent": true, "bCanView": true}, /* "89723979823929" is the image file name */
>   "92347324667324": {"bCurrent": false, "bCanView": false}, /* "92347324667324" is the image file name */
>   "72832532467217": {"bCurrent": true, "bCanView": false} /* "72832532467217" is the image file name */
> }
> 
> 
> So if a user wants to get the current image for record 201700000000002, we will have the following
> view:
> 
> function(doc) {
>   for (var prop in doc) {
>     // every key other than _id and _rev is an image file name
>     if (prop != "_id" && prop != "_rev") {
>       if (doc[prop].bCurrent !== undefined && doc[prop].bCurrent) {
>         emit(doc._id, { RecordID: doc._id, ImageID: prop,
>                         bCurrent: doc[prop].bCurrent, bCanView: doc[prop].bCanView });
>       }
>     }
>   }
> }
> 
> which will be called with key "201700000000002".
> 
> But as mentioned earlier, images will be added/updated/deleted/retrieved on a regular basis
> [10000/20000 per day]. How is this going to affect view performance?
> 
> Regards,
> Ajay.
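
As an illustration of how the two ideas in this thread could fit together (one document per image, as Assaf suggests, plus the lookup by record and bCurrent that Ajay needs), here is a rough sketch of a map function that emits a compound key. The field name `recordId` and the document shape are assumptions made for this example; the `emit` defined here is only a local stand-in so the map function can be run outside CouchDB.

```javascript
// Hypothetical document shape: one document per image, _id = image file name.
// The field name `recordId` is an assumption, not something from the thread.
const docs = [
  { _id: "12398712397129", recordId: "201700000000002", bCurrent: true, bCanView: true },
  { _id: "98127397192319", recordId: "201700000000002", bCurrent: false, bCanView: false },
  { _id: "89723979823929", recordId: "201700000000003", bCurrent: true, bCanView: true },
];

// Map function as it could appear in a design document:
// the key is [record id, bCurrent], so both values can be matched at once.
function map(doc) {
  if (doc.recordId !== undefined && typeof doc.bCurrent === "boolean") {
    emit([doc.recordId, doc.bCurrent], { bCanView: doc.bCanView });
  }
}

// Local stand-in for CouchDB's emit(), used only to exercise map() here.
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key, value });
}
for (const doc of docs) {
  currentId = doc._id;
  map(doc);
}

// Rough equivalent of querying the view with key=["201700000000002",true].
const wanted = JSON.stringify(["201700000000002", true]);
const hits = rows.filter((row) => JSON.stringify(row.key) === wanted);
console.log(hits.map((row) => row.id));
```

With this layout each image update touches only one small document, and the same view answers both the bCurrent=true and bCurrent=false lookups by changing the second key element.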

