Message-ID: <359a92830610021219u7107aac5lffa2950a5fab71a1@mail.gmail.com>
Date: Mon, 2 Oct 2006 15:19:50 -0400
From: "Erick Erickson"
To: java-user@lucene.apache.org
Subject: Re: Search in HTML code
In-Reply-To: <7b2520d40610020550k466f230ax80ab6cd2f8ebb495@mail.gmail.com>

I guess the thundering silence is rooted in the problem statement. I have a hard time understanding how this index is used. By storing things this way, you'll force the user to know the *exact* format of anything she's looking for. That is, it's hard to search for