Date: Fri, 13 May 2016 09:42:43 -0600
Subject: Re: Is there an equivalent to an SQL "select distinct" in Solr
From: John Bickerstaff
To: solr-user@lucene.apache.org
archived-at: Fri, 13 May 2016 15:42:51 -0000

I should clarify:

http://XXX.XXX.XX.XX:8983/solr/yourCoreName/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=category

"yourCoreName" will get built in for you if you use the Solr Admin UI
for queries.

On Fri, May 13, 2016 at 9:36 AM, John Bickerstaff wrote:

> In case it's helpful for a quick and dirty peek at your facets, the
> following URL (in a browser or curl) will get you basic facets for a
> field named "category", assuming you change the IP address / hostname
> to match yours.
>
> http://XXX.XXX.XX.XX:8983/solr/statdx_shard1_replica3/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=category
>
> You can also do this in the Admin UI by checking the "facet" box and
> entering the field name in the facet.field box that pops up. You can
> leave the query field at the default *:*
>
> You need to make sure that you put a "0" in the rows field as well
> (right under "sort") in order to get back just the facet counts.
>
> On Fri, May 13, 2016 at 7:52 AM, Joel Bernstein wrote:
>
>> You may also want to try out the SQL interface in Solr 6.0, which
>> supports SELECT DISTINCT queries.
>>
>> https://cwiki.apache.org/confluence/display/solr/Parallel+SQL+Interface#ParallelSQLInterface-SELECTDISTINCTQueries
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Fri, May 13, 2016 at 9:47 AM, GW wrote:
>>
>> > Thank you Shawn,
>> >
>> > I will toy with these over the weekend. Solr/Hadoop/Hbase has been
>> > a nasty learning curve for me. It probably would have been a lot
>> > easier if I didn't have 30 years of RDBMS stuck in my head.
>> >
>> > Again,
>> >
>> > Many thanks for your response.
>> >
>> > On 13 May 2016 at 08:57, Shawn Heisey wrote:
>> >
>> > > On 5/13/2016 6:48 AM, GW wrote:
>> > > > Let's say I have 10,000 documents and there is a field named
>> > > > "category", and let's say there are 200 categories but I do
>> > > > not know what they are.
>> > > >
>> > > > My question: Is there a query/filter that can pull a list of
>> > > > distinct categories?
>> > >
>> > > Sounds like a job for faceting or grouping. Which one of them to
>> > > use will depend on exactly what you're trying to obtain in your
>> > > results.
>> > >
>> > > https://cwiki.apache.org/confluence/display/solr/Faceting
>> > > https://cwiki.apache.org/confluence/display/solr/Result+Grouping
>> > >
>> > > Thanks,
>> > > Shawn
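
For anyone who would rather script the facet approach from this thread
than paste the URL into a browser, here is a minimal Python sketch. It
builds the same kind of select URL and turns the facet section of a JSON
response into a value-to-count mapping. Several things are assumptions
and not from the thread: the host "http://localhost:8983", the helper
names facet_url and distinct_values, and the made-up sample response.
The facet.limit=-1 and facet.mincount=1 parameters are also additions:
Solr's default facet.limit is 100, so with roughly 200 categories the
plain URL would likely return only the 100 most frequent ones. The
parsing assumes Solr's default JSON layout for facet fields, a flat list
alternating value and count.

```python
import json
from urllib.parse import urlencode


def facet_url(base_url, core, field):
    """Build a select URL that returns only facet counts for one field."""
    params = {
        "q": "*:*",           # match every document
        "rows": 0,            # no documents, just the facet counts
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": -1,    # assumption: lift the default cap of 100 values
        "facet.mincount": 1,  # assumption: skip values with no documents
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


def distinct_values(response, field):
    """Convert the flat [value, count, value, count, ...] list to a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hypothetical host and core name; replace with your own.
url = facet_url("http://localhost:8983", "yourCoreName", "category")

# Made-up response fragment, shaped like Solr's default JSON facet output.
sample = json.loads(
    '{"facet_counts": {"facet_fields": {"category": ["books", 120, "music", 45]}}}'
)
counts = distinct_values(sample, "category")

print(url)
print(sorted(counts))
```

The keys of the returned mapping are the distinct values of the field,
which is the closest equivalent here to SQL's "select distinct"; the
counts come along for free.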