couchdb-user mailing list archives

From Jeremy Wall <jw...@google.com>
Subject Re: Looking for advice using CouchDB for a FreeSoftware project
Date Sat, 13 Jun 2009 16:25:35 GMT
Ahh, OK, that makes sense then.

On Sat, Jun 13, 2009 at 11:21 AM, fana <fana@2flub.org> wrote:

> Hi, thanks for the quick reply,
>
>
> On Sat, 13 Jun 2009 10:11:00 -0500, Jeremy Wall <jwall@google.com> wrote:
> > I think you can actually get rid of the relation table.
> > just put the movie hash as an attribute of your subtitle document.
>
> This was one of my first thoughts, too.
> At the beginning I had a list of movie hashes in the SubtitleFile document.
> Then I thought it would be better in the other direction and have a list of
> subtitle hashes in the MovieFile document.
>
> The problem I had is that there is a many-to-many relation between them.
> One MovieFile can have many suitable SubtitleFiles and vice versa.
>
> Maybe I still think too "relational-databased", but the advantage I see
> with the "relation" document is that if somebody wants to add further
> matching hashes, I don't have to make sure that existing hashes don't
> get lost.
>
>
> > On Sat, Jun 13, 2009 at 9:13 AM, fana <fana@2flub.org> wrote:
> >
> >> Hi,
> >>
> >> I heard about CouchDB in a German podcast[1] last week
> >> and I think I found the last missing piece for a FreeSoftware
> >> project[2].
> >>
> >>  Background:
> >>
> >> There is a program called "SubDownloader"[3] which is an XML-RPC client
> >> to the XML-RPC server of http://www.opensubtitles.org . It works like
> >> this:
> >>
> >>  * You have a movie and you want a subtitle for it.
> >>  * You open your movie with Subdownloader.
> >>  * Subdownloader hashes[4] your movie file.
> >>  * Subdownloader asks the XML-RPC server whether it has a subtitle for
> >>    this movie hash and downloads it.
> >>
> >> The problem now is that the opensubtitles.org infrastructure can't
> >> handle the load anymore[5] and it's not possible to scale it.
> >>
> >> We are now re-implementing the XML-RPC server in Python, but designing
> >> the database was a big headache, because we don't want to "navigate
> >> the ship into the same iceberg" as opensubtitles.org did.
> >>
> >> I think that CouchDB is perfect for us in terms of scalability,
> >> replication, collaboration and design changes in the future.
> >>
> >> As I want to eliminate as many mistakes as possible from the beginning,
> >> I would like to ask here for advice. I have created a first draft of
> >> how our database would look.
> >>
> >> Would this draft work out with CouchDB or is there a better way?
> >>
> >
> > Modify this to be:
> >
> >>
> >> SubtitleFile
> >> ------------
> >>
> >> {
> >>  "_id"              : "String",       (MD5 hash of subtitle file)
> >
> >    "movie_hash"  : "String",       (Id from the movie document)
> >
> >>
> >>  "type"             : "subtitlefile",
> >>  "format"           : "String",       (e.g. "SubRip")
> >>  "language"         : "String",       (ISO 639-2 code)
> >>  "hearing_impaired" : "String",       ("True" or "False")
> >>  "fansub"           : "String",       ("True" or "False")
> >>  "uploader"         : "String",
> >>  "_attachments"     :
> >>
> >>  {
> >>    "subtitle.srt":
> >>    {
> >>      "content_type" : "text\/plain",
> >>      "data"         : "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
> >>    }
> >>  }
> >>
> >> }
> >>
> >>  THERE IS NO HOSTING OF MOVIE FILES OF THE MOVIE INDUSTRY
> >>  (just peoples' file hashes)
> >>
> >
> > Keep this the same:
> >
> >
> >>
> >> MovieFile
> >> ---------
> >>
> >> {
> >>  "_id"      : "String",               (Computed hash of movie file)
> >>  "type"     : "moviefile",
> >>  "length"   :  number,                (seconds)
> >>  "filesize" :  number,                (kb)
> >>  "fps"      :  number,
> >>  "uploader" : "String"
> >> }
> >>
> >
> > get rid of this completely:
> >
> >
> >>
> >> Relation
> >> --------
> >>
> >> {
> >>                            (here "_id" will be generated by CouchDB)
> >>  "type"            : "relation",
> >>  "id_subtitlefile" : "String",        (the MD5 hash of the subtitle)
> >>  "id_moviefile"    : "String"         (the     hash of the movie file)
> >
> >
> >> }
> >
> >
> > You can still look up subtitles by the movie id and you get rid of an
> > unnecessary document. In my (admittedly limited) experience a linking
> > document is usually unnecessary.
> >
> >>
> >>
> >>
> >> [1] http://chaosradio.ccc.de/cre125.html
> >> [2] https://launchpad.net/osclone
> >> [3] http://subdownloader.net
> >> [4] http://trac.opensubtitles.org/projects/opensubtitles/wiki/HashSourceCodes
> >> [5] http://forum.opensubtitles.org/viewtopic.php?t=1775
> >>
>
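
For reference, the many-to-many point discussed above can be sketched in
plain Python. This is only an illustration under stated assumptions: the
hashes are made up, relation_doc and subtitles_by_movie are hypothetical
helper names, and the in-memory grouping merely stands in for a view keyed
on "id_moviefile". It does not talk to CouchDB at all.

```python
from collections import defaultdict


def relation_doc(subtitle_id, movie_id):
    """Build one link document shaped like the "Relation" draft.

    The "_id" is left out on purpose: in the draft it is generated by CouchDB.
    """
    return {
        "type": "relation",
        "id_subtitlefile": subtitle_id,
        "id_moviefile": movie_id,
    }


def subtitles_by_movie(docs):
    """Group relation documents by movie hash (a stand-in for a view).

    Returns a dict mapping each movie hash to a sorted list of subtitle hashes.
    """
    index = defaultdict(set)
    for doc in docs:
        if doc.get("type") == "relation":
            index[doc["id_moviefile"]].add(doc["id_subtitlefile"])
    return {movie: sorted(subs) for movie, subs in index.items()}


# Hypothetical hashes, for illustration only.
docs = [
    relation_doc("sub-aaa", "movie-111"),
    relation_doc("sub-aaa", "movie-222"),  # one subtitle fits two movie files
    relation_doc("sub-bbb", "movie-111"),  # one movie file has two subtitles
]

# Adding a further match is a new document; no existing document is edited,
# so no previously stored hash can get lost.
docs.append(relation_doc("sub-bbb", "movie-333"))

index = subtitles_by_movie(docs)
print(index["movie-111"])
```

With a list of hashes embedded in a SubtitleFile or MovieFile document
instead, every new match would mean rewriting that document, which is where
the worry about losing existing hashes comes from.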
