Subject: svn commit: r547234 - in /lucene/java/trunk: CHANGES.txt src/java/org/apache/lucene/document/package.html
Date: Thu, 14 Jun 2007 12:43:41 -0000
To: java-commits@lucene.apache.org
From: gsingers@apache.org
Reply-To: java-dev@lucene.apache.org
Message-Id: <20070614124341.C85BD1A981A@eris.apache.org>

Author: gsingers
Date: Thu Jun 14 05:43:40 2007
New Revision: 547234

URL: http://svn.apache.org/viewvc?view=rev&rev=547234
Log:
LUCENE-926: document package javadocs

Modified:
    lucene/java/trunk/CHANGES.txt
    lucene/java/trunk/src/java/org/apache/lucene/document/package.html

Modified: lucene/java/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=diff&rev=547234&r1=547233&r2=547234
==============================================================================
--- lucene/java/trunk/CHANGES.txt (original)
+++ lucene/java/trunk/CHANGES.txt Thu Jun 14 05:43:40 2007
@@ -278,6 +278,8 @@
  5. LUCENE-925: Added analysis package javadocs. (Grant Ingersoll and Doron Cohen)

+ 6. LUCENE-926: Added document package javadocs. (Grant Ingersoll)
+
 Build

  1. LUCENE-802: Added LICENSE.TXT and NOTICE.TXT to Lucene jars.

Modified: lucene/java/trunk/src/java/org/apache/lucene/document/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/package.html?view=diff&rev=547234&r1=547233&r2=547234
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/document/package.html (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/document/package.html Thu Jun 14 05:43:40 2007
@@ -2,9 +2,37 @@
-
-The Document abstraction.
+The logical representation of a {@link org.apache.lucene.document.Document} for indexing and searching.

+The document package provides the user level logical representation of content to be indexed and searched. The
+package also provides utilities for working with {@link org.apache.lucene.document.Document}s and {@link org.apache.lucene.document.Fieldable}s.

+Document and Fieldable

+A {@link org.apache.lucene.document.Document} is a collection of {@link org.apache.lucene.document.Fieldable}s. A
+ {@link org.apache.lucene.document.Fieldable} is a logical representation of a user's content that needs to be indexed or stored.
+ {@link org.apache.lucene.document.Fieldable}s have a number of properties that tell Lucene how to treat the content (such as indexed, tokenized,
+ stored, etc.). See the {@link org.apache.lucene.document.Field} implementation of {@link org.apache.lucene.document.Fieldable}
+ for specifics on these properties.

+Note: it is common to refer to {@link org.apache.lucene.document.Document}s having {@link org.apache.lucene.document.Field}s, even though technically they have
+{@link org.apache.lucene.document.Fieldable}s.
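
To make the Document/Fieldable relationship concrete, here is a minimal, self-contained Java sketch. It is not Lucene's code: the names SketchDocument and SketchField and the three boolean flags are hypothetical stand-ins for Document, Field and the indexed/tokenized/stored properties mentioned above, and the lookup rule in get() is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model for illustration only; these are NOT Lucene classes.
final class DocumentModelSketch {

    /** Stand-in for a field: a named value plus flags saying how to treat it. */
    static final class SketchField {
        final String name;
        final String value;
        final boolean indexed;   // should the value be searchable?
        final boolean tokenized; // should the value be split into terms first?
        final boolean stored;    // should the original value be retrievable?

        SketchField(String name, String value,
                    boolean indexed, boolean tokenized, boolean stored) {
            this.name = name;
            this.value = value;
            this.indexed = indexed;
            this.tokenized = tokenized;
            this.stored = stored;
        }
    }

    /** Stand-in for a document: an ordered collection of fields. */
    static final class SketchDocument {
        private final List<SketchField> fields = new ArrayList<>();

        void add(SketchField field) {
            fields.add(field);
        }

        List<SketchField> getFields() {
            return Collections.unmodifiableList(fields);
        }

        /** Value of the first stored field with this name, or null if none (assumed rule). */
        String get(String name) {
            for (SketchField field : fields) {
                if (field.stored && field.name.equals(name)) {
                    return field.value;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        SketchDocument doc = new SketchDocument();
        doc.add(new SketchField("title", "Quarterly report", true, true, true));
        doc.add(new SketchField("body", "text of the report", true, true, false));
        System.out.println(doc.get("title")); // stored, so the value comes back
        System.out.println(doc.get("body"));  // indexed but not stored, so null
    }
}
```

The point of the sketch is only that the per-field flags, not the document itself, decide what can be searched and what can be read back.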

+Working with Documents

+First and foremost, a {@link org.apache.lucene.document.Document} is something created by the user application. It is your job
+ to create Documents based on the content of the files you are working with in your application (Word, txt, PDF, Excel or any other format).
+ How this is done is completely up to you. That being said, there are many tools available in other projects that can make
+ the process of taking a file and converting it into a Lucene {@link org.apache.lucene.document.Document} easier. To see an example of this,
+ take a look at the Lucene demo and the associated source code
+ for extracting content from HTML.

+The {@link org.apache.lucene.document.DateTools} and {@link org.apache.lucene.document.NumberTools} classes are utility
+classes to make dates, times and longs searchable (remember, Lucene only searches text).
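
Because only text is searched, numbers and dates have to be turned into strings whose lexicographic order matches their natural order. The following plain-Java sketch shows that idea with zero padding and a fixed date pattern. The class SortableText and its methods are hypothetical, and this is not the encoding that DateTools or NumberTools actually use.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Illustration of the idea only; NOT the DateTools/NumberTools implementation.
final class SortableText {

    // Long.MAX_VALUE has 19 decimal digits, so pad every value to that width.
    private static final int WIDTH = 19;

    /** Encodes a non-negative long so that string order equals numeric order. */
    static String encodeLong(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("this sketch handles non-negative values only");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }

    /** Encodes a date at day resolution, in GMT, as yyyyMMdd. */
    static String encodeDay(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) {
        // Unpadded, "7" sorts after "42" as text; padded, the order is numeric.
        System.out.println("7".compareTo("42") > 0);
        System.out.println(encodeLong(7L).compareTo(encodeLong(42L)) < 0);
        System.out.println(encodeDay(new Date(0L)));
    }
}
```

With an encoding of this kind, a range query over the text terms behaves like a numeric or date range.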

+The {@link org.apache.lucene.document.FieldSelector} interface provides a mechanism to tell Lucene how to load Documents from
+storage. If no FieldSelector is used, all Fieldables on a Document will be loaded. As an example of FieldSelector usage, consider
+ the common use case of
+displaying search results on a web page and then having users click through to see the full document. In this scenario, it is often
+ the case that there are many small fields and one or two large fields (containing the contents of the original file). Before the FieldSelector,
+the full Document had to be loaded, including the large fields, in order to display the results. Now, using the FieldSelector, one
+can {@link org.apache.lucene.document.FieldSelectorResult#LAZY_LOAD} the large fields, thus only loading the large fields
+when a user clicks on the actual link to view the original content.
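
The lazy-loading idea can be sketched in plain Java as follows. This is an assumption-laden illustration, not the FieldSelector API: LazyFieldSketch, LazyValue and loadDocument are hypothetical names, and a Map stands in for stored fields on disk.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Illustration of lazy field loading; NOT the Lucene FieldSelector API.
final class LazyFieldSketch {

    /** A field value that is fetched from "storage" only on first access. */
    static final class LazyValue {
        private final Supplier<String> loader;
        private String value;
        private boolean loaded;

        LazyValue(Supplier<String> loader) {
            this.loader = loader;
        }

        boolean isLoaded() {
            return loaded;
        }

        String get() {
            if (!loaded) {
                value = loader.get(); // the expensive read happens here, once
                loaded = true;
            }
            return value;
        }
    }

    /** Loads every field eagerly except those named in lazyFields. */
    static Map<String, LazyValue> loadDocument(Map<String, String> storage,
                                               Set<String> lazyFields) {
        Map<String, LazyValue> doc = new LinkedHashMap<>();
        for (String name : storage.keySet()) {
            LazyValue value = new LazyValue(() -> storage.get(name));
            if (!lazyFields.contains(name)) {
                value.get(); // small fields are read right away
            }
            doc.put(name, value);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> storage = new LinkedHashMap<>();
        storage.put("title", "Quarterly report");
        storage.put("contents", "the full text of the original file");

        Map<String, LazyValue> doc =
                loadDocument(storage, Collections.singleton("contents"));

        // Results page: only the small field has been read so far.
        System.out.println(doc.get("title").isLoaded());
        System.out.println(doc.get("contents").isLoaded());

        // Click-through: the large field is read on demand.
        System.out.println(doc.get("contents").get());
    }
}
```

In this model the results page pays only for the small fields, and the cost of the large field is deferred until a user actually asks for it.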