incubator-couchdb-user mailing list archives

From Sean Copenhaver <sean.copenha...@gmail.com>
Subject Re: Help with data modelling
Date Mon, 13 Jun 2011 11:54:45 GMT
I honestly don't know whether having many tiny documents is a valid concern.
I could certainly understand using one document per user session, with an
update function that appends each log entry to it.

I think you'll need to consider how you are going to use the information and
run some tests. Perhaps someone with experience managing a database with
millions of tiny documents could chime in. Personally, I would test it with
more than the expected load, if possible, and see how write performance and
database size hold up. There are probably pros and cons to both methods, but
I would only be speculating about what they are.
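
To make the comparison concrete before testing against a real database, here
is a rough back-of-the-envelope sketch in Python. The function names and the
document shape are made up for illustration; nothing here is a CouchDB API.
It assumes the numbers from your example (1000 users, one hour per day, one
log entry per second) and a deliberately crude model in which every update to
a per-session document writes a new revision containing all entries so far.

```python
# Illustrative model only: names and document shape are hypothetical.

def docs_one_per_entry(users: int, entries_per_session: int) -> int:
    """New documents per day if every log entry is its own document."""
    return users * entries_per_session


def docs_one_per_session(users: int) -> int:
    """New documents per day if each session is a single document."""
    return users


def entries_rewritten(entries_per_session: int) -> int:
    """Entries written in total for one session document, assuming each
    append stores a new revision holding every entry so far (1 + 2 + ... + n)."""
    n = entries_per_session
    return n * (n + 1) // 2


def append_log_entry(session_doc, entry):
    """Pure-Python stand-in for what an update function would do:
    return a new document with the entry appended, creating it if missing."""
    if session_doc is None:
        session_doc = {"type": "debug_log_session", "entries": []}
    updated = dict(session_doc)
    updated["entries"] = list(session_doc["entries"]) + [entry]
    return updated


if __name__ == "__main__":
    users, per_session = 1000, 3600
    print(docs_one_per_entry(users, per_session))   # 3600000
    print(docs_one_per_session(users))              # 1000
    print(entries_rewritten(per_session))           # 6481800 per session

    doc = append_log_entry(None, "app started")
    doc = append_log_entry(doc, "screen opened")
    print(len(doc["entries"]))                      # 2
```

Under that crude model, the per-session option creates far fewer documents
but writes each entry many times over until the database is compacted, which
is one reason I would measure both options rather than guess.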

2011/6/13 Javier Rodríguez Escolar <javiescolar@gmail.com>

> Hello Sean,
>
> First of all, thanks for your answer. In fact, I have no strong reasons to
> combine things in single documents; it is just a matter of balance between
> the number of documents and the amount of data they contain. For instance,
> let's imagine my system holds 1000 users and each of them runs the
> application for 1 hour per day (notice that the log information must be
> sent to the server in real time, once per second). If I decide to write
> each log entry into its own document, I will be creating 3,600,000 new
> documents per day, each of them containing just a few words. If I instead
> decide to create one document per application life cycle, I will be
> creating 1000 new documents per day, each of them containing 3,600 log
> entries, which to my mind seems more balanced. I don't know whether this
> reasoning is appropriate. What do you think?
>
> Best regards,
>
> Javi
>
> 2011/6/10 Sean Copenhaver <sean.copenhaver@gmail.com>
>
> > It sounds like you are trying to combine things into a single document.
> > What concerns make you want to put, for example, all manufacturers or all
> > log entries in a single document instead of getting them back in a view
> > query result?
> >
> > I don't think there would be an issue with having everything as a
> > separate document. There are ways to pull back related documents in one
> > query, which would resolve concerns about making many requests. As an
> > example, you could create a view that pulls back the device, user, and
> > profile in one query.
> >
> > The wiki has some information on managing relationships:
> > http://wiki.apache.org/couchdb/EntityRelationship
> >
> > On the flip side, there are ways to perform CRUD operations on multiple
> > documents at once, including transactionally (although with multiple dbs
> > you can run into inconsistencies, as the transaction won't hold up across
> > replication):
> > http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API
> >
> >
> > 2011/6/10 Javier Rodríguez Escolar <javiescolar@gmail.com>
> >
> > > Hello, I'm a CouchDB newbie trying to migrate an existing application
> > > from SQL to NoSQL. I have designed different approaches to model the
> > > CouchDB documents and I have been leafing through a couple of books
> > > [1],[2] in order to figure out the possible problems each approach might
> > > cause, but I still have some doubts. Basically, the data model of my
> > > application domain has the following scheme:
> > >
> > > *Data model overview*
> > >
> > >   - Mobile manufacturers (on the order of 60). Each manufacturer has
> > >   different models:
> > >   - Mobile models (on the order of 2000 per manufacturer)
> > >   - Errors. Each manufacturer has a set of types of errors (on the
> > >   order of 1000 per manufacturer)
> > >   - User
> > >   - Mobile device
> > >   - Profile. Identified by a User and a MobileDevice
> > >   - DebugLog. Each debug log takes just 10 words and one DebugLog per
> > >   second is sent to the server.
> > >   - ErrorLog. Each error log takes just 10 words and they are generated
> > >   once in a while.
> > >
> > > So, my main doubts are listed below:
> > >
> > >
> > > *Doubt 1 (manufacturers and models)*
> > >
> > >   - Option 1
> > >      - One document for all the manufacturers: "Manufacturers". It just
> > >      includes a list of manufacturers, each of which has an identifier.
> > >      - One document per model: "ModelX". Each model includes a
> > >      reference to its manufacturer.
> > >   - Option 2
> > >      - One document for all the manufacturers: "Manufacturers". It
> > >      includes a list of manufacturers. Each manufacturer points to a
> > >      list of models.
> > >      - One document per manufacturer: "ListOfModels". It includes all
> > >      the models for a given manufacturer.
> > >
> > > *Doubt 2 (logs)*
> > >
> > >   - Option 1
> > >      - One document per DebugLog: "DebugLogX".
> > >   - Option 2
> > >      - One document per application life cycle:
> > >      "DebugLogsDuringApplicationLifeCycleX". It includes all the debug
> > >      logs created by the application during its life cycle. An
> > >      application life cycle might last from just a few seconds to some
> > >      hours.
> > >
> > > *Doubt 3 (user, mobile and profile)*
> > >
> > >   - Option 1
> > >      - One document per profile: "ProfileX". It includes information
> > >      about the mobile device and the user.
> > >   - Option 2
> > >      - One document per user: "UserX"
> > >      - One document per device: "DeviceX"
> > >      - One document for all the profiles: "Profiles". It contains a
> > >      list of profiles, each one pointing to its associated user and
> > >      device.
> > >
> > >
> > > *Doubt 4 (manufacturer errors)*
> > >
> > >   - Option 1
> > >      - One document for all the errors. Each error is associated with
> > >      its manufacturer.
> > >   - Option 2
> > >      - One document per manufacturer: "ManufacturerXErrors".
> > >
> > >
> > > I would appreciate any piece of advice.
> > >
> > > Thanks in advance and congrats for your project,
> > >
> > >
> > > [1] http://www.amazon.com/Beginning-CouchDB-ebook/dp/B003U890N2/ref=sr_1_10?ie=UTF8&qid=1307691243&sr=8-10
> > > [2] http://www.amazon.com/CouchDB-Definitive-Guide-Animal-ebook/dp/B0043D2E9U/ref=sr_1_3?ie=UTF8&qid=1307691243&sr=8-3
> > >
> >
> >
> >
> > --
> > “The limits of language are the limits of one's world.” -Ludwig von
> > Wittgenstein
> >
>



-- 
“The limits of language are the limits of one's world.” -Ludwig von
Wittgenstein
