From: "Eshwaramoorthy Babu"
To: java-user@lucene.apache.org
Date: Mon, 4 Dec 2006 13:55:19 +0400
Subject: Re: lucene - general question

Hi Buics,

Thanks for your response.

I will receive 2 XML files. I have to compare them and generate an XML
report containing:

1. Matching IDs from both XML files
2. Duplicate IDs from both XML files

The requirement is to reconcile the data of 2 applications. For this I have
to get all the IDs from the 1st XML file and search for the matching and
duplicate IDs in the 2nd XML file.

If I use a database, I would still have to write a procedure or Java code to
do the comparison and generate the report.

Thanks,
Babu

On 12/4/06, rociel.buico@gmail.com wrote:
>
> Hi Babu,
>
> Your sample XML schema contains only a few fields,
> so why not consider using a DB (MySQL)?
>
> To do:
> Read your XML file, then use Digester to convert it to Java objects, and
> after that insert them into your DB.
> When you're done with the inserts, you can simply query your DB any time
> you like.
>
> cheers,
> Buics
>
>
> On 12/4/06, Lukas Vlcek wrote:
> >
> > Hi,
> >
> > Try looking at Groovy (I haven't used it yet, but some people say it is
> > much easier to work with XML files in Groovy than in Java). It produces
> > class files, so it can be integrated with your existing Java code. A 6MB
> > file is not that much unless you are working in a limited environment
> > (like a mobile device?).
> >
> > Also, if the only thing you really need is to search for some strings in
> > two files, and you don't need to integrate this function with other Java
> > code, then you can simply go with *unix command-line tools (grep, wc, ...)
> > that should give you what you need very quickly.
> >
> > Lukas
> >
> > On 12/4/06, Eshwaramoorthy Babu wrote:
> > >
> > > Hi Lukas,
> > >
> > > Thanks for your response.
> > > I was planning to search for the 1st XML file's IDs in the 2nd XML
> > > file, so I thought of using Lucene for the search.
> > > Can you please suggest a scripting solution? Is Perl the right
> > > solution?
> > >
> > > Thanks,
> > > Babu
> > >
> > > On 12/4/06, Lukas Vlcek wrote:
> > > >
> > > > Hi Babu,
> > > >
> > > > Sorry, but I don't see any point in using Lucene if you don't need
> > > > search functionality. Also, for parsing XML files I would consider
> > > > using some scripting language (as opposed to a pure Java-based
> > > > solution). The reason is that scripting languages can be more
> > > > effective when simplicity of the resulting code is important, and as
> > > > of Java 6 they can run right inside the JVM, so integration with
> > > > your Java code is very simple.
> > > >
> > > > Just my 2 cents.
> > > >
> > > > Lukas
> > > >
> > > > On 12/4/06, Eshwaramoorthy Babu wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > We have a requirement to compare 2 XML files and generate a
> > > > > result (reconciliation report).
> > > > > The XML files are 6MB each and the format is as below:
> > > > >
> > > > > 123
> > > > > 123
> > > > >
> > > > > I have to implement the logic below:
> > > > >
> > > > > Number of matching IDs in both XML files
> > > > > Number of non-matching IDs in both XML files
> > > > > Number of non-matching IDs in both XML files
> > > > >
> > > > > I am planning to use Digester and Lucene for the above requirement.
> > > > >
> > > > > Is my decision to use Lucene correct, or is there a better
> > > > > approach for my problem above?
> > > > >
> > > > > Thanks,
> > > > > Babu
> > > > >
> > > >
> > >
> >
>
> --
> "Programs must be written for people to read, and only incidentally for
> machines to execute."
>
> - Abelson & Sussman, SICP, preface to the first edition
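
For reference, the reconciliation discussed in this thread (matching IDs across both files, duplicate IDs within each file, and IDs present in only one file) can be sketched in plain Java without Lucene or a database. This is only a minimal sketch under assumptions: the sample XML in the original message lost its tags, so the element name "id" is hypothetical, as are the class name, the method names, and the made-up sample documents. It uses the StAX parser (javax.xml.stream) that ships with Java 6 and later, written in current Java syntax.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; not part of Lucene or Digester.
final class IdReconciler {

    // Streams through the XML and counts how often each ID value occurs.
    // idElement is the (assumed) name of the element holding the ID text.
    static Map<String, Integer> countIds(Reader xml, String idElement) {
        Map<String, Integer> counts = new HashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && idElement.equals(reader.getLocalName())) {
                    counts.merge(reader.getElementText().trim(), 1, Integer::sum);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
        return counts;
    }

    // IDs that occur in both files.
    static Set<String> matching(Map<String, Integer> a, Map<String, Integer> b) {
        Set<String> result = new TreeSet<>(a.keySet());
        result.retainAll(b.keySet());
        return result;
    }

    // IDs that occur in the first file but not in the second.
    static Set<String> onlyIn(Map<String, Integer> a, Map<String, Integer> b) {
        Set<String> result = new TreeSet<>(a.keySet());
        result.removeAll(b.keySet());
        return result;
    }

    // IDs that occur more than once within a single file.
    static Set<String> duplicates(Map<String, Integer> counts) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data; the real element names are not known from the thread.
        String first = "<records><id>101</id><id>102</id><id>102</id><id>103</id></records>";
        String second = "<records><id>102</id><id>103</id><id>104</id><id>104</id></records>";

        Map<String, Integer> a = countIds(new StringReader(first), "id");
        Map<String, Integer> b = countIds(new StringReader(second), "id");

        System.out.println("Matching:         " + matching(a, b));   // [102, 103]
        System.out.println("Only in first:    " + onlyIn(a, b));     // [101]
        System.out.println("Only in second:   " + onlyIn(b, a));     // [104]
        System.out.println("Duplicates in 1:  " + duplicates(a));    // [102]
        System.out.println("Duplicates in 2:  " + duplicates(b));    // [104]
    }
}
```

For real input, a reader over each file would replace the StringReader. Because the parser streams and only the ID counts are kept in memory, two 6MB files should not be a problem on an ordinary JVM.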