incubator-couchdb-user mailing list archives

From Guby <guby.m...@gmail.com>
Subject Re: URL as document_id
Date Mon, 03 Mar 2008 23:36:02 GMT
Hi Chris
When using URLs as IDs you will run into trouble when the URLs
contain question marks! I couldn't get it to work at all.
I think it might be related to the errors I get when querying views
with keys that contain question marks, = or &. I actually sent
a message about this to the list yesterday, and Neil suggested that
there might be an error in

src/CouchDB/mod_couch.erl.

where this command is called:
case regexp:split(RequestUri, "\\?") of
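
To illustrate why a split on "?" is a problem for URL-shaped IDs, here is a minimal Python sketch (not CouchDB code). It mimics splitting a request URI on every literal question mark, then shows two possible workarounds: percent-encoding the URL before using it as an ID, and the MD5 approach Chris mentions. The database name "crawl", the "rev" query parameter, the example URL, and all function names are assumptions made up for the illustration. Whether the server decodes an encoded ID correctly is not something this sketch verifies.

```python
import hashlib
from urllib.parse import quote, unquote


def naive_split(request_uri):
    # Mimics splitting the request URI on every literal "?",
    # which is roughly what a regexp split on "\\?" would do.
    return request_uri.split("?")


def encoded_doc_id(url):
    # Percent-encode every reserved character, including "/", "?", "=" and "&".
    return quote(url, safe="")


def hashed_doc_id(url):
    # Alternative: a fixed-length hex digest of the URL.
    return hashlib.md5(url.encode("utf-8")).hexdigest()


# Hypothetical crawled page and database name.
page_url = "http://example.com/search?q=couchdb&page=2"

# Raw URL as ID: the "?" inside the ID is indistinguishable from the
# "?" that starts the real query string, so the split yields three parts.
raw_request = "/crawl/" + page_url + "?rev=123"
print(naive_split(raw_request))

# Encoded URL as ID: only the real query-string "?" remains.
safe_request = "/crawl/" + encoded_doc_id(page_url) + "?rev=123"
print(naive_split(safe_request))

# The encoding is reversible; the hash is not, but it is short and uniform.
print(unquote(encoded_doc_id(page_url)) == page_url)
print(hashed_doc_id(page_url))
```

With the raw URL the path and query string can no longer be told apart, while the encoded form leaves a single "?" to split on. The hash avoids the character problem entirely, at the cost of having to store the original URL inside the document.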

Hope that helps.

Best regards
G




On Mar 3, 2008, at 6:17 PM, Chris Anderson wrote:

> Hello all.
>
> I'm planning to store the results of a web-crawl in CouchDB, and want
> to use the page urls as document_ids. I understand that I can get the
> same unique identifier constraints by using an MD5 of the url, but the
> raw URL appeals to me.
>
> The only downside to using a URL as the document_id, is that they can
> contain a wide set of characters, and can be quite long. It's not
> clear from the wiki if there are any practical limitations on
> document_ids -- I'm hoping that gives the go-ahead for me to just pour
> raw web sewage (URLs) into CouchDB document_ids.
>
> Thanks for any advice/warnings,
> Chris
>
> -- 
> Chris Anderson
> http://jchris.mfdz.com

