incubator-couchdb-user mailing list archives

From Simon Metson <si...@cloudant.com>
Subject Re: Storage limitations?
Date Mon, 11 Nov 2013 15:19:07 GMT
I think if you can happily fit it all into CouchDB (and from the numbers you’ve mentioned
so far I’m confident you can) I’d go with that - it’s the simplest solution. If your 1000s become
1000000s then you can bite the bullet and switch to keeping metadata in CouchDB and image data in
something else.
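
For illustration only, here is a rough sketch of the two document shapes being compared in this thread: a Photo doc that carries its image as an inline attachment, and a metadata-only doc that just points at an image stored elsewhere. The `type`, `title` and `image_url` fields, the helper names and the file path are all made up for the example; only `_id` and the `_attachments` layout (`content_type` plus base64 `data`) follow CouchDB's inline attachment format.

```javascript
// Option 1 in the thread: one Photo doc per image, image stored as an attachment.
// Field names other than _id and _attachments are hypothetical.
function photoDocWithAttachment(id, title, imageBytes) {
  return {
    _id: id,
    type: "photo",
    title: title,
    _attachments: {
      "image.jpg": {
        content_type: "image/jpeg",
        // Inline attachments carry the body as base64 text.
        data: Buffer.from(imageBytes).toString("base64"),
      },
    },
  };
}

// Option 4 in the thread: metadata only, the image lives outside CouchDB.
// image_url is a hypothetical field; nothing keeps it in sync with the file.
function photoDocWithLink(id, title, imageUrl) {
  return { _id: id, type: "photo", title: title, image_url: imageUrl };
}

// Three bytes standing in for a real JPEG, just to keep the example small.
const inlineDoc = photoDocWithAttachment("photo-1", "Beach", [0xff, 0xd8, 0xff]);
const linkedDoc = photoDocWithLink("photo-1", "Beach", "/photos/photo-1.jpg");

console.log(JSON.stringify(inlineDoc, null, 2));
console.log(JSON.stringify(linkedDoc, null, 2));
```

With the first shape the doc and its image are written together, so there is only one place to look. With the second, the doc and the file are written separately, which is where the "source of truth" question comes from.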


On Monday, 11 November 2013 at 15:16, Mark Deibert wrote:

> @Simon: I understand what you're saying about "source of truth". Which is
> the master record, the files or the Couch records, if the upload succeeds and
> the Couch record write fails or vice versa? Not sure what the answer is
> here either.
>  
>  
> On Mon, Nov 11, 2013 at 10:00 AM, Mark Deibert <mark.deibert@gmail.com> wrote:
>  
> > @Simon: Thanks for the advice. As I type, I don't anticipate any
> > aggregation on the Photo docs, but I should think about that some more.
> >  
> >  
> > On Mon, Nov 11, 2013 at 9:56 AM, Simon Metson <simon@cloudant.com> wrote:
> >  
> > > Hey,
> > > I’d go with what you’re doing (1:1 doc:photo).
> > >  
> > > 2 seems a bit weird - do you mean host the app in a different server to
> > > the data or have different databases for different types of data (that
> > > might work, if you never want to query across types of data)? 3 might work
> > > so long as you never want to aggregate across the categories. 4 is a
> > > reasonable approach for very very large attachments, but you can get into
> > > > consistency issues - what’s the source of truth: the database or the
> > > > filesystem?
> > > Cheers
> > > Simon
> > >  
> > >  
> > > On Monday, 11 November 2013 at 14:41, Mark Deibert wrote:
> > >  
> > > > > A followup on the "1000's of images" question. I could approach this a
> > > > > couple of ways. Currently each image is attached to its own Photo doc, which
> > > > > I've read is better for replication than one doc with many
> > > > > attachments. So that's fine, but will Couch have any issue managing several
> > > > > thousand of these Photo docs, each with a 3MB'ish image attachment? If you
> > > > > were building this Couchapp, would you...
> > > >  
> > > > 1) Keep the photos as described above in one CouchDB
> > > > 2) Move the Photo docs with attachments out into a separate CouchDB
> > > > 3) Do 2, but break Photos into multiple categorized CouchDBs
> > > > 4) Upload the images to the filesystem, just store the link in Couch
> > > >  
> > > > > I want to build this Couchapp in such a way as to not make life miserable
> > > > > for CouchDB :-D
> > > > >  
> > > > > On Sun, Nov 10, 2013 at 6:33 PM, Dave Cottlehuber <dch@jsonified.com> wrote:
> > > > >  
> > > > > > On 10 November 2013 23:14, Mark Deibert <mark.deibert@gmail.com> wrote:
> > > > > > > I read an article somewhere that using include_docs is "hard" on memory or
> > > > > > > disk or in some way taxes Couch and therefore you should just emit the doc.
> > > > > > > Is this true?
> > > > >  
> > > > >  
> > > > >  
> > > > >  
> > > > >  
> > > > > Like most general statements it has some truth and some lies in it
:-)
> > > > >  
> > > > > views and docs are stored in separate .couch btree files on disk.
> > > > >  
> > > > > > emit(key, doc) puts a full copy of the doc (that's already in the doc
> > > > > > .couch btree) into the view btree.
> > > > >  
> > > > > > advantage - no need to hop over to the doc .couch file to retrieve the
> > > > > > document.
> > > > > > disadvantage - you now have 2 copies of the doc in separate files, wasted
> > > > > > space.
> > > > >  
> > > > > If you do things right, and your app fits this model, the generated
> > > > > etags from views and docs can be cached in nginx or similar, and
> > > > > repeated queries don't need to hit your couch.
> > > > >  
> > > > > > So yes, include_docs means extra reads, but like most of these things
> > > > > > you should benchmark your situation, under a realistic load, not just
> > > > > > pumping 1000 single-doc reads at it.
> > > > >  
> > > > > A+
> > > > > Dave
> > > >  
> > >  
> >  
>  
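
To make the emit(key, doc) versus include_docs trade-off from Dave's reply a bit more concrete, here is a toy sketch that runs under Node. It is not CouchDB's indexer: `buildView` and `includeDocs` are hypothetical stand-ins, the Photo fields are invented, and `emit` is passed in as an argument so the code runs on its own (in a real map function `emit` is provided by CouchDB).

```javascript
// Hypothetical Photo docs; the field names are assumptions for illustration.
const docs = [
  { _id: "photo-1", type: "photo", title: "Beach", taken: "2013-11-01" },
  { _id: "photo-2", type: "photo", title: "Hills", taken: "2013-11-02" },
];

// emit(key, doc) style: a full copy of the doc goes into the view.
function mapWithDoc(doc, emit) {
  if (doc.type === "photo") emit(doc.taken, doc);
}

// Key-only style, meant to be queried with include_docs=true.
function mapKeyOnly(doc, emit) {
  if (doc.type === "photo") emit(doc.taken, null);
}

// Toy stand-in for the view index: collects emitted rows, sorted by key.
function buildView(mapFn, allDocs) {
  const rows = [];
  for (const doc of allDocs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key: key, value: value }));
  }
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Toy stand-in for include_docs: one extra lookup in the doc store per row.
function includeDocs(rows, allDocs) {
  const byId = new Map(allDocs.map((d) => [d._id, d]));
  return rows.map((row) => Object.assign({}, row, { doc: byId.get(row.id) }));
}

const fatView = buildView(mapWithDoc, docs);
const thinView = buildView(mapKeyOnly, docs);
const thinWithDocs = includeDocs(thinView, docs);

// Rough proxy for index size: serialized length of the stored rows.
const fatBytes = JSON.stringify(fatView).length;
const thinBytes = JSON.stringify(thinView).length;
console.log({ fatBytes, thinBytes });
```

The "fat" view answers a query without touching the doc store but holds a second copy of every doc; the "thin" view stays small and pays for one extra read per row. Which one wins still depends on benchmarking under a realistic load.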



