Message-ID: <21440406.post@talk.nabble.com>
Date: Tue, 13 Jan 2009 09:27:40 -0800 (PST)
From: ppuyen
To: general@lucene.apache.org
Subject: Get element Class DOM !!!!

Hi everyone,

I am running the HTML file indexing example from "Lucene in Action". It can extract the title and the body of an HTML file with getTitle and getBody:

    // Uses org.w3c.dom.Element, Node, NodeList and Text.
    protected String getTitle(Element rawDoc) {
        if (rawDoc == null) {
            return null;
        }
        String title = "";
        NodeList children = rawDoc.getElementsByTagName("title");
        if (children.getLength() > 0) {
            Element titleElement = (Element) children.item(0);
            // The first child may be missing or may not be a text node,
            // so check the type before casting.
            Node first = titleElement.getFirstChild();
            if (first instanceof Text) {
                title = ((Text) first).getData();
            }
        }
        System.out.println("getTitle:" + title);
        return title;
    }

My project is a commercial (shopping) search engine. That is, when I search for a product (for example, Nokia N72) and click the "Submit" button, the results need to show the product name and its price at each shop.

The HTML indexing code can already get the title and the body. My problem now is getting an element by its class (for example, a price such as $40 inside a <b> element that has a class attribute), but each website uses a different class name.

How can I do this? Thanks so much.

-- 
View this message in context: http://www.nabble.com/Get-element-Class-DOM-%21%21%21%21-tp21440406p21440406.html
Sent from the Lucene - General mailing list archive at Nabble.com.
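
One possible way to approach the class lookup described above, as a rough sketch only: keep a small table that maps each shop to the class name it uses for prices, then walk the DOM and collect the text of every element whose class attribute contains that name. The class ClassLookup, the map PRICE_CLASS_BY_SHOP, the host names and the class names "price" and "cost" are all hypothetical, made up for illustration. The sketch also assumes the page is well-formed XHTML; real-world HTML would need to go through a cleaning parser first, as the book example does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ClassLookup {

    // Hypothetical per-shop configuration: host name -> class name used for the price.
    static final Map<String, String> PRICE_CLASS_BY_SHOP = new HashMap<>();
    static {
        PRICE_CLASS_BY_SHOP.put("shop-a.example", "price");
        PRICE_CLASS_BY_SHOP.put("shop-b.example", "cost");
    }

    // Parses well-formed XHTML from a string (assumption: input is already clean).
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("markup is not well-formed", e);
        }
    }

    // Returns the trimmed text of every descendant element of root whose
    // class attribute contains className as a whitespace-separated token.
    static List<String> textByClass(Element root, String className) {
        List<String> result = new ArrayList<>();
        NodeList all = root.getElementsByTagName("*"); // "*" matches all descendants
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            // getAttribute returns "" when the attribute is absent.
            for (String token : e.getAttribute("class").trim().split("\\s+")) {
                if (token.equals(className)) {
                    result.add(e.getTextContent().trim());
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String page = "<html><body><h1>Nokia N72</h1>"
                + "<b class=\"price\">$40</b></body></html>";
        String cls = PRICE_CLASS_BY_SHOP.get("shop-a.example");
        System.out.println(textByClass(parse(page).getDocumentElement(), cls));
    }
}
```

The per-shop table is the part that deals with "each website uses a different class name": the DOM walk stays the same, and only the configured class name changes per site. The extracted price could then be stored in its own Lucene field next to the title and body.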