couchdb-user mailing list archives

From Rob Crowell <robccrow...@gmail.com>
Subject Re: Building views to locate documents WITHOUT a certain set of tags
Date Wed, 30 Nov 2011 21:04:47 GMT
On Wed, Nov 30, 2011 at 3:56 PM, Dave Cottlehuber <dave@muse.net.nz> wrote:
> On 30 November 2011 21:49, Rob Crowell <robccrowell@gmail.com> wrote:
>> I suppose it would be possible to make multiple queries, using
>> startkey and endkey to pull out the ranges.
>>
>> 1. Sort the "bad" tags: (BROKEN_IMAGE, OFFENSIVE_IMAGE)
>> 2. For each bad tag, request documents:
>>    i. Query 1:
>>        startkey = []
>>        endkey = ["BROKEN_IMAGE"]
>>
>>    ii. Query 2:
>>        startkey = ["BROKEN_IMAGE", {}]
>>        endkey = ["OFFENSIVE_IMAGE"]
>>
>>    iii. Query 3:
>>        startkey = ["OFFENSIVE_IMAGE", {}]
>>        endkey = [{}]
>>
>> Requires making N+1 queries, which for a fairly small list wouldn't be too bad.
>>
>> On Wed, Nov 30, 2011 at 3:10 PM, Rob Crowell <robccrowell@gmail.com> wrote:
>>> Hey everyone, view question here.
>>>
>>> I've got couch records that represent images.  They may have any
>>> number of tags (from zero to hundreds).  However, while there are
>>> thousands of tags in the dataset, there are only a couple that are
>>> considered "bad" (BROKEN_IMAGE, BLANK_IMAGE, etc.)  Here's an example
>>> document:
>>>
>>> {
>>>    _id: ...,
>>>    url: "http://example.org/whatever.png",
>>>    tags: ["OUTDOORS", "BEACH", "RED_DRESS"]
>>> }
>>>
>>> I wrote a view to emit documents that don't have these "bad" tags by
>>> hard-coding the list of bad tags and checking every tag against this
>>> list.  If none of the tags are bad, then emit the document.
>>>
>>> However, a user may also specify tags that he doesn't like
>>> (OFFENSIVE_IMAGE, DENVER_BRONCOS, whatever).  Is there any good way to
>>> build a view around this idea ("show me all documents that don't have
>>> a set of tags") short of defining a custom view (with their own "bad"
>>> tags list) for every user?
>>>
>>> I could do this filtering client-side of course, but if I wanted to
>>> generate an exhaustive list of matching documents (for a report or
>>> something similar) then it would be a lot of work.  I'm stumped at the
>>> moment.  Thanks for any suggestions!
>>>
>
> foo AND bar
> NOT baz
> CONTAINS beer
>
> Classic use cases for couchdb-lucene or elasticsearch.
>
> A+
> Dave
>

Thanks, I'll look into those.

I think the method I outlined in my second message doesn't work
anyway, since documents can have multiple tags (d'oh!).  A document
tagged both BEACH and BROKEN_IMAGE would still come back under the
BEACH key.  I'd need to get all documents, then get the list of
documents that have any of the invalid tags (using multiple queries
similar to my incorrect solution earlier), and then write some code to
remove the documents with at least one of the bad tags from the
overall list.  Yuck.
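
For what it's worth, here is a rough Python sketch of that set
difference, and of why the key-range idea falls over.  Nothing in it
talks to CouchDB: the `docs` dict and the helper names are made up for
illustration, and `by_tag_rows` only stands in for a view that emits one
row per (tag, doc id).

```python
# Hypothetical sample data: doc id -> list of tags.
docs = {
    "img1": ["OUTDOORS", "BEACH", "RED_DRESS"],
    "img2": ["BEACH", "BROKEN_IMAGE"],
    "img3": [],
    "img4": ["OFFENSIVE_IMAGE"],
}


def by_tag_rows(docs):
    """Stand-in for a view that emits one (tag, doc_id) row per tag."""
    return sorted((tag, doc_id) for doc_id, tags in docs.items() for tag in tags)


def ids_with_any_tag(rows, bad_tags):
    """Doc ids that appear under at least one of the bad tags."""
    bad = set(bad_tags)
    return {doc_id for tag, doc_id in rows if tag in bad}


def ids_without_tags(docs, bad_tags):
    """All doc ids minus those carrying any bad tag (the set difference)."""
    return set(docs) - ids_with_any_tag(by_tag_rows(docs), bad_tags)


def ids_from_range_gaps(rows, bad_tags):
    """The earlier, incorrect idea: keep every row whose key is not a bad tag."""
    bad = set(bad_tags)
    return {doc_id for tag, doc_id in rows if tag not in bad}


bad_tags = ["BROKEN_IMAGE", "OFFENSIVE_IMAGE"]
good = ids_without_tags(docs, bad_tags)
leaky = ids_from_range_gaps(by_tag_rows(docs), bad_tags)
```

With this sample data, `leaky` still contains img2 (it shows up under
BEACH) and misses img3 entirely (no tags, so no rows), while `good`
holds only the documents with none of the bad tags.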
