Date: Wed, 29 Apr 2015 10:41:32 -0400
Subject: custom collector
From: Robust Links
To: "solr-user@lucene.apache.org", java-user@lucene.apache.org

Hi,

I need help porting my Lucene code from 4 to 5. In particular, I need to customize a collector to collect all doc IDs in the index, which can be more than 30 million docs. Below is how I achieved this in Lucene 4. Are there guidelines on how to do this in Lucene 5, especially on the semantic changes from AtomicReaderContext (which seems deprecated) to the new LeafReaderContext?
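Thank you in advance.

    import java.io.IOException;
    import java.util.HashSet;

    import org.apache.lucene.index.AtomicReaderContext;
    import org.apache.lucene.index.BinaryDocValues;
    import org.apache.lucene.search.Collector;
    import org.apache.lucene.search.FieldCache;
    import org.apache.lucene.search.Scorer;
    import org.apache.lucene.util.BytesRef;

    public class CustomCollector extends Collector {
        private HashSet<String> data = new HashSet<String>();
        private Scorer scorer;
        private int docBase;
        private BinaryDocValues dataList;

        public boolean acceptsDocsOutOfOrder() {
            return true;
        }

        public void setScorer(Scorer scorer) {
            this.scorer = scorer;
        }

        // Called once per segment: remember its docBase and load the "title" values.
        public void setNextReader(AtomicReaderContext ctx) throws IOException {
            this.docBase = ctx.docBase;
            dataList = FieldCache.DEFAULT.getTerms(ctx.reader(), "title", false);
        }

        // doc is relative to the current segment, not a global doc ID.
        public void collect(int doc) throws IOException {
            BytesRef t = new BytesRef();
            dataList.get(doc, t);
            // Skip documents that have no value for the field.
            if (t.bytes != BytesRef.EMPTY_BYTES) {
                data.add(t.utf8ToString());
            }
        }

        public void reset() {
            data.clear();
            dataList = null;
        }

        public HashSet<String> getData() {
            return data;
        }
    }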
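
For the question about context semantics: Lucene 5.0 renamed AtomicReaderContext to LeafReaderContext, and the per-segment contract appears to be unchanged. The collector is handed one segment at a time (in 5.x, for example, through SimpleCollector.doSetNextReader(LeafReaderContext)), and collect(int) still receives a segment-relative doc ID that has to be added to docBase to get an index-wide ID. The sketch below models only that contract in plain Java so it runs without Lucene on the classpath. SegmentSketch, TitleCollectorSketch, setNextSegment and collectAll are hypothetical stand-ins invented for illustration, not Lucene classes or methods.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for one index segment (not a Lucene class). */
final class SegmentSketch {
    final int docBase;      // global ID of this segment's first document
    final String[] titles;  // per-document "title" values; null means no value

    SegmentSketch(int docBase, String[] titles) {
        this.docBase = docBase;
        this.titles = titles;
    }
}

/** Hypothetical collector that mirrors the per-segment callback pattern. */
final class TitleCollectorSketch {
    private final Set<String> data = new HashSet<>();
    private final List<Integer> globalDocIds = new ArrayList<>();
    private SegmentSketch current;

    /** Called once per segment, before any collect() call for that segment. */
    void setNextSegment(SegmentSketch segment) {
        this.current = segment;
    }

    /** Called with a segment-relative doc ID. */
    void collect(int doc) {
        // Rebase to an index-wide ID by adding the segment's docBase.
        globalDocIds.add(current.docBase + doc);
        String title = current.titles[doc];
        if (title != null) {
            data.add(title);
        }
    }

    Set<String> getData() {
        return data;
    }

    List<Integer> getGlobalDocIds() {
        return globalDocIds;
    }

    /** Drives the collector over every document of every segment, in order. */
    static TitleCollectorSketch collectAll(List<SegmentSketch> segments) {
        TitleCollectorSketch collector = new TitleCollectorSketch();
        for (SegmentSketch segment : segments) {
            collector.setNextSegment(segment);
            for (int doc = 0; doc < segment.titles.length; doc++) {
                collector.collect(doc);
            }
        }
        return collector;
    }
}
```

In a real port, the body of setNextSegment would move into the per-segment hook and would also be the place to fetch the segment's field values. In Lucene 5, FieldCache is no longer in the core search package, so how the "title" values are obtained there is a separate decision that this sketch does not cover.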