couchdb-user mailing list archives

From Jeremy Wall <jw...@google.com>
Subject Re: Looking for advice using CouchDB for a FreeSoftware project
Date Sat, 13 Jun 2009 15:11:00 GMT
I think you can actually get rid of the relation table.
Just put the movie hash as an attribute of your subtitle document.

On Sat, Jun 13, 2009 at 9:13 AM, fana <fana@2flub.org> wrote:

> Hi,
>
> I heard about CouchDB in a German podcast[1] last week
> and I think I found the last missing piece for a FreeSoftware project[2].
>
>  Background:
>
> There is a program called "SubDownloader"[3] which is an XML-RPC client
> to the XML-RPC server of http://www.opensubtitles.org . It works like
> this:
>
>  * You have a movie and you want a subtitle for it.
>  * You open your movie with Subdownloader.
>  * Subdownloader hashes[4] your movie file.
>  * Subdownloader asks XML-RPC server whether it has a subtitle for this
> movie hash and downloads it.
>
> The problem now is that the opensubtitles.org infrastructure can't handle
> the load anymore[5] and it's not possible to scale it.
>
> We are now re-implementing the XML-RPC server in Python, but designing the
> database was a big headache, because we don't want to "navigate the ship
> into the same iceberg" as opensubtitles.org did.
>
> I think that CouchDB is perfect for us in terms of scalability,
> replication, collaboration and design changes in the future.
>
> As I want to eliminate as many mistakes as possible from the beginning,
> I would like to ask here for advice, and I have created a first draft of
> how our database would look.
>
> Would this draft work out with CouchDB or is there a better way?
>

Modify this to be:

>
> SubtitleFile
> ------------
>
> {
>  "_id"              : "String",       (MD5 hash of subtitle file)

   "movie_hash"  : "String",       (Id from the movie document)

>
>  "type"             : "subtitlefile",
>  "format"           : "String",       (e.g. "SubRip")
>  "language"         : "String",       (ISO 639-2 code)
>  "hearing_impaired" : "String",       ("True" or "False")
>  "fansub"           : "String",       ("True" or "False")
>  "uploader"         : "String",
>  "_attachments"     :
>
>  {
>    "subtitle.srt":
>    {
>      "content_type" : "text\/plain",
>      "data"         : "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
>    }
>  }
>
> }
>
>  THERE IS NO HOSTING OF MOVIE FILES OF THE MOVIE INDUSTRY
>  (just peoples' file hashes)
>

Keep this the same:


>
> MovieFile
> ---------
>
> {
>  "_id"      : "String",               (Computed hash of movie file)
>  "type"     : "moviefile",
>  "length"   :  number,                (seconds)
>  "filesize" :  number,                (kb)
>  "fps"      :  number,
>  "uploader" : "String"
> }
>

Get rid of this completely:


>
> Relation
> --------
>
> {
>                                       (here "_id" will be generated by CouchDB)
>  "type"            : "relation",
>  "id_subtitlefile" : "String",        (the MD5 hash of the subtitle)
>  "id_moviefile"    : "String"         (the     hash of the movie file)


> }


You can still look up subtitles by the movie id, and you get rid of an
unnecessary document. In my (admittedly limited) experience, a linking
document is usually unnecessary.
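
A rough sketch of how that lookup could work, since your server is in Python.
This is only an illustration under assumptions, not a tested setup: the design
document name "subtitles", the view name "by_movie_hash", the database name
"osclone" and the base URL are all made up for the example. The map function
is JavaScript stored as a string, which is how CouchDB keeps view code in a
design document. The last function just mimics the view in plain Python so
the idea can be tried without a running server.

```python
import json
from urllib.parse import quote

# Hypothetical design document. The names "subtitles" and "by_movie_hash"
# are illustrative and do not come from the original draft.
DESIGN_DOC = {
    "_id": "_design/subtitles",
    "views": {
        "by_movie_hash": {
            # CouchDB map functions are JavaScript, stored as a string.
            "map": (
                "function(doc) {"
                "  if (doc.type === 'subtitlefile' && doc.movie_hash) {"
                "    emit(doc.movie_hash, null);"
                "  }"
                "}"
            )
        }
    },
}


def view_url(base_url, db_name, movie_hash):
    """Build the URL that asks the view for all subtitles of one movie."""
    # View keys are JSON values, so the hash is JSON-encoded, then URL-quoted.
    key = quote(json.dumps(movie_hash), safe="")
    return (
        f"{base_url}/{db_name}/_design/subtitles/_view/by_movie_hash"
        f"?key={key}&include_docs=true"
    )


def subtitles_for_movie(docs, movie_hash):
    """Plain-Python stand-in for the view lookup, for illustration only."""
    return [
        d for d in docs
        if d.get("type") == "subtitlefile" and d.get("movie_hash") == movie_hash
    ]
```

With something like this, the server would take the movie hash sent by the
client, request view_url("http://localhost:5984", "osclone", movie_hash), and
read the matching subtitle documents from the "rows" of the response. No
Relation document is involved at any point.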

>
>
>
> [1] http://chaosradio.ccc.de/cre125.html
> [2] https://launchpad.net/osclone
> [3] http://subdownloader.net
> [4]
> http://trac.opensubtitles.org/projects/opensubtitles/wiki/HashSourceCodes
> [5] http://forum.opensubtitles.org/viewtopic.php?t=1775
>
