incubator-couchdb-user mailing list archives

From "Chris Anderson" <>
Subject URL as document_id
Date Mon, 03 Mar 2008 21:17:26 GMT
Hello all.

I'm planning to store the results of a web crawl in CouchDB, and want
to use the page URLs as document_ids. I understand that I can get the
same unique-identifier constraint by using an MD5 of the URL, but the
raw URL appeals to me.

The only downside to using a URL as the document_id is that URLs can
contain a wide set of characters and can be quite long. It's not
clear from the wiki whether there are any practical limitations on
document_ids -- I'm hoping that absence gives me the go-ahead to just
pour raw web sewage (URLs) into CouchDB document_ids.
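
To make the two options concrete, here is a rough sketch in Python. The
helper names and the "crawl" database name are made up for illustration,
and I'm assuming the raw URL would need to be percent-encoded to fit into
a single path segment of the request; whether CouchDB accepts ids like
this is exactly what I'm asking.

```python
import hashlib
from urllib.parse import quote


def md5_doc_id(url: str) -> str:
    """Option 1: a fixed-length, 32-character hex digest of the URL."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def raw_url_doc_path(db: str, url: str) -> str:
    """Option 2: the raw URL as the id, percent-encoded for the request path.

    safe="" also encodes "/", so the whole URL stays one path segment.
    The "/<db>/<id>" layout is an assumption for illustration.
    """
    return "/" + quote(db, safe="") + "/" + quote(url, safe="")


if __name__ == "__main__":
    page = "http://example.com/a b?q=1"  # hypothetical crawled page
    print(md5_doc_id(page))
    print(raw_url_doc_path("crawl", page))
```

The MD5 id is short and uses a safe character set but is opaque; the raw
URL stays readable but grows with the URL and has to be encoded on every
request.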

Thanks for any advice/warnings,

Chris Anderson
