incubator-couchdb-user mailing list archives

From fana <f...@2flub.org>
Subject RE: Looking for advice using CouchDB for a FreeSoftware project
Date Sat, 13 Jun 2009 17:20:46 GMT
> I think I'd make the movie hash a regular field in the document instead
> of the _id. Then you can just have multiple subtitle documents for a
> movie and you could create a view that emits this field as a key so you
> can query for it.

The problem is that there is a many-to-many relation between them:
one MovieFile can have many suitable SubtitleFiles, and vice versa.

With a separate "relation" document per match, I don't have to make sure
that existing hashes don't get lost when a document is updated, and it is
easier for someone to add further matching hashes.
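
To make the many-to-many point concrete, here is a minimal in-memory sketch in Python. It does not talk to CouchDB at all; it only imitates what two views keyed on "id_moviefile" and "id_subtitlefile" would return. The hash values and the helper names (index_by, add_match) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical relation documents; the hash values are placeholders.
relations = [
    {"type": "relation", "id_moviefile": "movie-a", "id_subtitlefile": "sub-1"},
    {"type": "relation", "id_moviefile": "movie-a", "id_subtitlefile": "sub-2"},
    {"type": "relation", "id_moviefile": "movie-b", "id_subtitlefile": "sub-1"},
]


def index_by(docs, key_field, value_field):
    """Group relation documents roughly the way a view keyed on key_field would."""
    index = defaultdict(list)
    for doc in docs:
        if doc.get("type") == "relation":
            index[doc[key_field]].append(doc[value_field])
    return dict(index)


def add_match(docs, movie_hash, subtitle_hash):
    """Record a new match as a new document; existing documents stay untouched."""
    docs.append({
        "type": "relation",
        "id_moviefile": movie_hash,
        "id_subtitlefile": subtitle_hash,
    })


add_match(relations, "movie-c", "sub-2")

subs_for_movie = index_by(relations, "id_moviefile", "id_subtitlefile")
movies_for_sub = index_by(relations, "id_subtitlefile", "id_moviefile")
```

Both directions of the lookup come from the same set of documents, which is what I want from the relation type.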

> JSON has booleans, so I'd make fansub a boolean instead of a True or
> False string.

Yes, good point. Thanks for the hint.
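
A small sketch of how the "True"/"False" strings from the draft could be turned into real JSON booleans on the Python side. The helper name to_bool is my own and only for illustration.

```python
import json


def to_bool(value):
    """Convert the draft's "True"/"False" strings into real booleans."""
    if isinstance(value, bool):
        return value
    if value in ("True", "False"):
        return value == "True"
    raise ValueError(f"not a boolean flag: {value!r}")


# Fields as they appear in the first draft (strings instead of booleans).
draft = {"type": "subtitlefile", "hearing_impaired": "False", "fansub": "True"}

fixed = {
    **draft,
    "hearing_impaired": to_bool(draft["hearing_impaired"]),
    "fansub": to_bool(draft["fansub"]),
}

# json.dumps writes Python True/False as JSON true/false.
encoded = json.dumps(fixed, sort_keys=True)
```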

> You could use attachments to store the subtitle files.

Yeah, that's already in my draft (the "_attachments" field of SubtitleFile).
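
For reference, this is roughly how I would build the inline attachment from the draft in Python. It is only a sketch: the helper name inline_attachment is hypothetical, and the sample text is simply what the base64 string in the draft decodes to.

```python
import base64


def inline_attachment(name, text, content_type="text/plain"):
    """Build an _attachments entry in the inline (base64) form used in the draft."""
    data = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return {name: {"content_type": content_type, "data": data}}


doc = {
    "type": "subtitlefile",
    "format": "SubRip",
    "_attachments": inline_attachment(
        "subtitle.srt", "This is a base64 encoded text"
    ),
}

# Decoding the stored data gives back the original subtitle text.
stored = doc["_attachments"]["subtitle.srt"]["data"]
round_trip = base64.b64decode(stored).decode("utf-8")
```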


On Sat, 13 Jun 2009 18:24:39 +0200, Nils Breunese <N.Breunese@vpro.nl> wrote:
> Hello,
> 
> Some ideas:
> 
> - I think I'd make the movie hash a regular field in the document instead
> of the _id. Then you can just have multiple subtitle documents for a
> movie and you could create a view that emits this field as a key so you
> can query for it.
> - JSON has booleans, so I'd make fansub a boolean instead of a True or
> False string.
> - You could use attachments to store the subtitle files.
> 
> Nils Breunese.
> 
> ________________________________________
> From: fana [fana@2flub.org]
> Sent: Saturday, 13 June 2009 16:13
> To: user@couchdb.apache.org
> Subject: Looking for advice using CouchDB for a FreeSoftware project
> 
> Hi,
> 
> I heard about CouchDB in a German podcast[1] last week
> and I think I found the last missing piece for a FreeSoftware project[2].
> 
>   Background:
> 
> There is a program called "SubDownloader"[3] which is an XML-RPC client
> to the XML-RPC server of http://www.opensubtitles.org . It works like
> this:
> 
>  * You have a movie and you want a subtitle for it.
>  * You open your movie with SubDownloader.
>  * SubDownloader hashes[4] your movie file.
>  * SubDownloader asks the XML-RPC server whether it has a subtitle for
> this movie hash and downloads it.
> 
> The problem now is that the opensubtitles.org infrastructure can't handle
> the load anymore[5] and it's not possible to scale it.
> 
> We are now re-implementing the XML-RPC server in Python, but designing
> the database was a big headache, because we don't want to "navigate the
> ship into the same iceberg" as opensubtitles.org did.
> 
> I think that CouchDB is perfect for us in terms of scalability,
> replication, collaboration and design changes in the future.
> 
> As I want to eliminate as many mistakes as possible from the beginning,
> I would like to ask here for advice. I have created a first draft of
> what our database would look like.
> 
> Would this draft work out with CouchDB or is there a better way?
> 
> SubtitleFile
> ------------
> 
> {
>   "_id"              : "String",       (MD5 hash of subtitle file)
>   "type"             : "subtitlefile",
>   "format"           : "String",       (e.g. "SubRip")
>   "language"         : "String",       (ISO 639-2 code)
>   "hearing_impaired" : "String",       ("True" or "False")
>   "fansub"           : "String",       ("True" or "False")
>   "uploader"         : "String",
>   "_attachments"     :
> 
>   {
>     "subtitle.srt":
>     {
>       "content_type" : "text\/plain",
>       "data"         : "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
>     }
>   }
> 
> }
> 
> 
> 
>   THERE IS NO HOSTING OF MOVIE FILES OF THE MOVIE INDUSTRY
>   (just peoples' file hashes)
> 
> MovieFile
> ---------
> 
> {
>   "_id"      : "String",               (Computed hash of movie file)
>   "type"     : "moviefile",
>   "length"   :  number,                (seconds)
>   "filesize" :  number,                (kb)
>   "fps"      :  number,
>   "uploader" : "String"
> }
> 
> Relation
> --------
> 
> {
>                                        (here "_id" will be generated by CouchDB)
>   "type"            : "relation",
>   "id_subtitlefile" : "String",        (the MD5 hash of the subtitle)
>   "id_moviefile"    : "String"         (the     hash of the movie file)
> }
> 
> 
> [1] http://chaosradio.ccc.de/cre125.html
> [2] https://launchpad.net/osclone
> [3] http://subdownloader.net
> [4] http://trac.opensubtitles.org/projects/opensubtitles/wiki/HashSourceCodes
> [5] http://forum.opensubtitles.org/viewtopic.php?t=1775
