couchdb-user mailing list archives

From Simon Metson <si...@cloudant.com>
Subject Re: Efficient way to identify documents
Date Fri, 08 Jun 2012 13:11:08 GMT
Hi,  
You're better off emitting all the docs regardless of type and then using startkey/endkey
to get the docs you're interested in:

function(doc){
  emit(doc._id, null);
}

then query the view with ?startkey="T1_"&endkey="T2_"
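
As a rough sketch of that query, the helper below builds the request URL. The host, database, design document and view names (localhost:5984, mydb, docs, by_id) are made up for illustration and are not from this thread; the only CouchDB-specific detail is that keys are passed as JSON, so string keys keep their double quotes and are URL-encoded.

```javascript
// Build the URL for a key-range query against a CouchDB view.
// All names passed in below (host, database, design doc, view) are hypothetical.
function viewRangeUrl(baseUrl, db, designDoc, view, startkey, endkey) {
  // Keys must be valid JSON, so a string key is sent with its quotes.
  const query = [
    "startkey=" + encodeURIComponent(JSON.stringify(startkey)),
    "endkey=" + encodeURIComponent(JSON.stringify(endkey)),
  ].join("&");
  return baseUrl + "/" + db + "/_design/" + designDoc + "/_view/" + view + "?" + query;
}

// Range covering the IDs that start with "T1_", as in the query above.
const url = viewRangeUrl("http://localhost:5984", "mydb", "docs", "by_id", "T1_", "T2_");
console.log(url);
```

Because the view emits null as its value, the index stays small; if the document bodies are needed, include_docs=true can be added to the query string.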

If you're also interested in the number of docs per T1, T2, etc., you could do:

function(doc){
  emit(doc._id.substr(0,3), 1);
}


and use a _sum or _count reduce function to count the docs.
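
To see what that map and reduce pair produces, here is a small simulation that runs outside CouchDB. It is only a sketch: the sample documents are invented, emit is passed in as an ordinary callback, and the grouping loop stands in for what a _sum reduce returns when the view is queried with group=true.

```javascript
// Invented sample documents; the IDs follow the T1_/T2_ naming used in this thread.
const docs = [{ _id: "T1_1" }, { _id: "T1_2" }, { _id: "T2_1" }];

// The map function from the message, with emit passed in so it can run locally.
function map(doc, emit) {
  emit(doc._id.substr(0, 3), 1);
}

// Stand-in for a _sum reduce with group=true: add up the emitted values per key.
function countByKey(allDocs) {
  const counts = {};
  for (const doc of allDocs) {
    map(doc, (key, value) => {
      counts[key] = (counts[key] || 0) + value;
    });
  }
  return counts;
}

const counts = countByKey(docs);
console.log(counts);
```

Note that substr(0,3) assumes every prefix is exactly three characters long, which holds for T1_ to T9_ but not for longer table names.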
HTH
Simon


On Friday, 8 June 2012 at 08:18, Paulo Carvalho wrote:

> Hello,
>  
> Most of the functionality of the application using this database is listing. This is the main reason why, in my CouchDB structure, I have put T2's information inside T1's documents.
>  
> So I need this structure to improve the performance of the listing functions.
>  
> This is why I am looking for a better way (if one exists) than the one I am using to identify all T1 documents, etc.
>  
> e.g. function to get all the T1 documents:
> function(doc) { if(doc._id.indexOf('T1_') == 0) { emit(null, doc); } }
>  
> Thank you
>  
> Regards
>  
> On Thu, Jun 7, 2012 at 11:00 PM, Simon Metson <simon@cloudant.com> wrote:
> > Hey Paulo,  
> > I think you need to think about what data you have and how you want to interrogate it, rather than how you currently structure your database schema; it sounds like you have a very normalised schema and you need to denormalise that to make the most of CouchDB.
> >  
> > Moving to a NoSQL DB means reviewing your use cases and requirements and changing how you think about the data. It might be that the tradeoffs you need to make don't fit your data, or that you have no choice but to change your data model to adopt a tool that has the features you need.
> > Cheers
> > Simon
> >  
> > On Thursday, 7 June 2012 at 15:02, Paulo Carvalho wrote:
> >  
> > > Hello,
> > >  
> > > Thanks for your answer.
> > >  
> > > However, I don't know if the proposed solution works for my purpose, because I have several more tables (and so many more documents).
> > >  
> > > Before, in my relational database, I had several tables. To simplify, I will call them T1, T2, T3, T4, T5, ..., TN.
> > >  
> > > Now, in the couchDB database, I have a structure like that:
> > >  
> > > {
> > >   _id = T1_1
> > >   T2 {
> > >      ...
> > >   }
> > > }
> > >  
> > > {
> > >   _id = T1_2
> > >  T2 {
> > >      ...
> > >   }
> > > }
> > >  
> > > {
> > >   _id = T1_3
> > >  T2 {
> > >      ...
> > >   }
> > >  T3 {
> > >      ...
> > >   }
> > > }
> > >  
> > > {
> > >   _id = T2_1
> > >  ...
> > > }
> > >  
> > >  
> > > {
> > >   _id = T2_2
> > >  ...
> > > }
> > >  
> > >  
> > > {
> > >   _id = T2_3
> > >  ...
> > > }
> > >  
> > > {
> > >   _id = T2_3
> > >   T5 {
> > >     ...
> > >   }
> > >   ...
> > > }
> > >  
> > >  
> > >  
> > > As you can see, at the "root" level of the DB, I can have documents which are "instances" of different tables.
> > >  
> > > What I would like to do is find the most efficient way to list all the documents which come from the T1 table, all the documents which come from the T2 table, etc.
> > >  
> > > Do you understand my question?
> > >  
> > > Thank you
> > >  
> > > Best regards
> > >  
> > > Paulo
> > >  
> > >  
> > > On Thu, Jun 7, 2012 at 3:38 PM, Simon Metson <simon@cloudant.com> wrote:
> > > > Hi,  
> > > > Moving this to user@
> > > >  
> > > > If you just need the authors' names, and books are unique, you could just have your book documents be something like:
> > > >  
> > > > {  
> > > >   _id: book_title,
> > > >   author: {
> > > >     name: author_name,
> > > >     age: author_age
> > > >   },
> > > >   year: book_year
> > > > }
> > > >  
> > > > then pull out all the authors via a view.   
> > > >  
> > > > You shouldn't include the doc in the view value; that's very space-inefficient. I think you'll want a view like:
> > > >  
> > > > // view for books
> > > > function(doc) {
> > > >   emit(doc._id, 1);
> > > > }
> > > >  
> > > >  
> > > > //view for authors
> > > > function(doc) {
> > > >   emit(doc.author.name, 1);
> > > > }
> > > >  
> > > >  
> > > > (you could use a _count reduce to return the number of books/authors, if that's useful)
> > > >  
> > > > The above depends a lot on what you want to use the data for, and what other information you need to include in your docs, though. You might need to have a list of authors, for example, or include their biography, in which case the above isn't great.
> > > >  
> > > > Cheers
> > > > Simon
> > > >  
> > > > On Thursday, 7 June 2012 at 13:57, pjmorce wrote:
> > > > > Hello,
> > > > >  
> > > > > I am new to noSQL databases and more precisely to couchDB.
> > > > >  
> > > > > I have migrated my PostgreSQL database to couchDB.
> > > > >  
> > > > > Before, in my relational database I had 2 tables: Authors and Books
> > > > >  
> > > > > > Now, for each row of these tables, in my couchDB database I have a document:
> > > > >  
> > > > > { _id = "author_1"  
> > > > >  
> > > > > {
> > > > > name = "a"
> > > > > age = "b"
> > > > > }
> > > > > }
> > > > > { _id = "author_2"
> > > > >  
> > > > > {
> > > > > name = "abc"
> > > > > age = "bcd"
> > > > > }
> > > > > }
> > > > > { _id = "book_1"
> > > > >  
> > > > > {
> > > > > title = "the x files"
> > > > > year = "1994"
> > > > > }
> > > > > }
> > > > > { _id = "book_2"
> > > > >  
> > > > > {
> > > > > title = "the jungle book"
> > > > > year = "1964"
> > > > > }
> > > > > }
> > > > > ...
> > > > >  
> > > > > For getting all the authors I created the following view:
> > > > >  
> > > > > function(doc) {
> > > > > if(doc._id.indexOf('author_') == 0) {
> > > > > emit(null, doc);
> > > > > }
> > > > > }
> > > > >  
> > > > > and for getting all the books I created the following view:
> > > > >  
> > > > > function(doc) {
> > > > > if(doc._id.indexOf('book_') == 0) {
> > > > > emit(null, doc);
> > > > > }
> > > > > }
> > > > >  
> > > > >  
> > > > > > Is there a more efficient way to do this? I think this solution will not
> > > > > > perform well once there is a large number of documents in the database...
> > > > >  
> > > > > Thank you
> > > > >  
> > > > > Regards
> > > > >  
> > > > > --
> > > > > View this message in context: http://couchdb-development.1959287.n2.nabble.com/Efficient-way-to-identify-documents-tp7580274.html
> > > > > > Sent from the CouchDB Development mailing list archive at Nabble.com.
> > > > >  
> > > > >  
> > > > >  
> > > >  
> > > >  
> > >  
> >  
>  