Subject: Re: Get Analyzed/Tokenized Field List
From: Jordon Saardchit
Date: Thu, 23 Dec 2010 15:32:38 -0800
To: java-user@lucene.apache.org

The basic use case is determining the rules for building a query.
I've got an application that programmatically builds queries (without
any pre-existing knowledge of the contents of the index it is searching).
We have a custom-designed analyzer and filter chain. However, it is
applied only to certain fields at index time. The fields it is applied to
are unstored.

On the search side, I want to be able to determine at runtime which
fields the analyzer should be applied to and which it should not. I could
be approaching the solution incorrectly, but I figured this would be a
pretty common or natural use case.

Jordon

On Dec 23, 2010, at 2:51 PM, Erick Erickson wrote:

> Ah, you didn't mention indexed but unstored in your original message,
> just indexed/analyzed....
>
> I don't think you can (someone jump in here if I'm wrong, please). The
> problem is that Lucene doesn't require any sort of schema, so you are
> perfectly free to store a field in one document and NOT store it in
> another. All the variants specified in IndexReader.FieldOption can
> quickly be determined by just looking at the various index files. But
> you'd have to spin through all the #documents# in order to answer the
> question "is this field ever stored?". Sounds like a table scan in the
> DB world.
>
> I don't think Lucene keeps meta-data for this, and spinning through all
> the documents would be expensive...
>
> Why do you want to know? Perhaps there's another way to satisfy the
> use-case.
>
> I could be way off base here, I'm speaking from general principles not
> knowledge of the code...
>
> Best
> Erick
>
> On Thu, Dec 23, 2010 at 4:43 PM, Jordon Saardchit wrote:
>
>> Yes I have, and after testing each of the various options denoted in
>> IndexReader.FieldOption, I cannot retrieve fieldnames that are indexed
>> (analyzed), and unstored. I figured this would be relatively easy to do
>> and I was simply overlooking something. Is it perhaps not possible to do
>> this?
>>
>> Jordon
>>
>> On Dec 23, 2010, at 1:30 PM, Erick Erickson wrote:
>>
>>> Have you looked at IndexReader.getFieldNames()?
>>>
>>> Best
>>> Erick
>>>
>>> On Thu, Dec 23, 2010 at 3:23 PM, Jordon Saardchit wrote:
>>>
>>>> Is there an easy way to retrieve a collection of fields (or field
>>>> names) that are analyzed/tokenized from any given index?
>>>>
>>>> Jordon
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
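
One possible way to approach the use case described in this thread, sketched under assumptions: if the indexing side can record which fields it runs through the custom analyzer, the search side can consult that record instead of asking the index. The class and method names below (AnalyzedFieldRegistry, markAnalyzed, queryTerms) are hypothetical and are not part of Lucene, and the analyze method is only a stand-in (lowercase plus whitespace split) for the real analyzer and filter chain. Lucene's PerFieldAnalyzerWrapper may cover a similar need when the field-to-analyzer mapping is known up front.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/**
 * Hypothetical sidecar record of which fields were analyzed at index time.
 * Nothing here is Lucene API; it only illustrates the bookkeeping idea.
 */
final class AnalyzedFieldRegistry {
    private final Set<String> analyzedFields = new HashSet<>();

    /** Called by the indexing code whenever a field is sent through the analyzer. */
    void markAnalyzed(String fieldName) {
        analyzedFields.add(fieldName);
    }

    /** Answers the question the search side needs: was this field analyzed? */
    boolean isAnalyzed(String fieldName) {
        return analyzedFields.contains(fieldName);
    }

    /** Stand-in for the custom analyzer chain: lowercase, then split on whitespace. */
    static List<String> analyze(String text) {
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(normalized.split("\\s+")));
    }

    /**
     * Terms to use when querying a field: analyzed terms if the field was
     * analyzed at index time, otherwise the raw value as a single term.
     */
    List<String> queryTerms(String fieldName, String rawValue) {
        if (isAnalyzed(fieldName)) {
            return analyze(rawValue);
        }
        List<String> single = new ArrayList<>();
        single.add(rawValue);
        return single;
    }
}
```

In this sketch the registry would have to be persisted alongside the index (for example in a small properties file) so that a search application with no other knowledge of the index can load it at startup; how that is stored is left open here.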