Date: Tue, 24 Nov 2009 01:23:06 +0100
Subject: Re: Ivy Indexer
From: Xavier Hanin
To: Ant Developers List

2009/11/21 Nicolas Lalevée:
>
> On 19 Nov 2009, at 12:06, Xavier Hanin wrote:
>
>> I really like the idea of using a Solr instance colocated with the
>> repository. I saw a presentation on Solr yesterday at Devoxx, and it
>> sounds very close to what we need. The only problem I see is that it
>> requires installing a server-side component, which gets closer to what
>> repository managers do. If we install a Solr instance, I'm not sure
>> why we wouldn't use it to update the index too. Solr takes care of
>> problems like transactions and concurrency, so I think it's a perfect
>> fit...
>
> I think transactions would be supported at the Lucene index level. I
> don't think there is any mechanism to make Solr manage an extra "data
> storage". As far as I remember, Solr is only able to read the external
> "data storage" in order to index it.
> But what would work is a Solr instance deployed right next to an Ivy
> repository: let Ivy publish artifacts as it already does, but also make
> Ivy ask Solr to index the newly published artifact.

Yes, this is exactly what I was thinking about.

>
> And, as spotted by a friend, Solr 1.4 [1] supports replication in Java
> [2], à la rsync!

I'm not sure this is even necessary, except for very large Ivy
installations with huge repositories. Most of the time a single Solr
instance should be enough.

>
> So Solr might be the easiest way of achieving an Ivy indexer.

Probably.

As a side note, while thinking about installing a server-side component
to provide search, I started to wonder why we wouldn't use a repository
manager in that case. During Devoxx I talked with people from
Artifactory, and their latest version now supports Ivy (the support may
still be limited, but they are working on improving it). They also
provide a REST API for their search feature, so it might be interesting
and easy to use their software. But if we don't want to depend on their
API, we could try to define some sort of "standard" REST API for
accessing a repository search feature. This is something they are OK to
discuss. Any repository manager implementing this API could then be
used.

Alternatively, we could define a Java interface for accessing a search
service (there is already one, but it is very limited) and have
different implementations: based on a local index as initially
suggested, using Solr, Artifactory, or anything else. That keeps us
open to future options.

Note that, compared to using Artifactory, using Solr still has the
advantage of probably being usable with any kind of Ivy repo, not just
Artifactory, which probably has some limitations (because it was not
designed as an Ivy repo manager, I suppose it has some proxying and
layout limitations).

>
> I have to admit I am not a big fan of having to deploy a webapp next to
> a dumb, simple repo.
> On the other hand, managing an index on the client side depends
> enormously on the kind of repository (at work we have an Ivy repo in
> svn, accessible both over HTTP and as a checkout), it would consume
> more bandwidth, some publication locking would probably have to be in
> place, etc.

I agree that having to deploy a webapp is an additional burden when
setting up the build ecosystem. But people are now used to installing a
CI server, an SCM server, and so on. So I don't think it should stop
us, because I think dealing with this from the client side only will
have some serious limitations.

Xavier

>
> Nicolas
>
> [1] http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.4.0/CHANGES.txt
> [2] https://issues.apache.org/jira/browse/SOLR-561
>
>>
>> My 2 c.
>>
>> Xavier
>>
>> 2009/11/18 Jon Schneider
>>
>>> While I digest Nicolas' novel :) (thanks for the additional insight on
>>> Lucene, by the way), I will suggest one other idea.
>>>
>>> We could allow for the option of a Solr instance colocated with the
>>> repository on one machine to serve up the index stored on the
>>> repository. IvyDE could be configured by the user to either read the
>>> index directly from the remote filesystem or send its requests via
>>> HTTP to a Solr server. The Solr server would not be responsible for
>>> maintaining the index in the same way that Archiva/Nexus/Artifactory
>>> do, but would simply be a querying tool. In the case where Solr is
>>> serving the index, the index would still be maintained through some
>>> combination of the index ant task and the publish proxy.
>>>
>>> This way we don't get into the complexity of pushing out index
>>> updates to clients.
>>>
>>> The rsync strategy is a very intriguing idea though, especially in
>>> light of how Lucene segments its index into multiple files. What
>>> happens when optimize is called on the index and the segments are
>>> combined into one file?
>>> In this case, any search slaves would essentially have to download
>>> the whole index, right? How much segmentation is too much before we
>>> optimize the index to favor search speed over index publishing
>>> speed?
>>>
>>> I'll be trying to wrap this up enough (at least with the remote
>>> filesystem index read strategy) to make a patch so others can see it
>>> in action. We are a little busy at work, but I will be coming back to
>>> it in the coming days.
>>>
>>> Thanks for all the feedback so far,
>>> Jon
>>>
>>
>> --
>> Xavier Hanin - 4SH France - http://www.4sh.fr/
>> BordeauxJUG creator & leader - http://www.bordeauxjug.org/
>> Apache Ivy Creator - http://ant.apache.org/ivy/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
> For additional commands, e-mail: dev-help@ant.apache.org
>

-- 
Xavier Hanin - 4SH France - http://www.4sh.fr/
BordeauxJUG creator & leader - http://www.bordeauxjug.org/
Apache Ivy Creator - http://ant.apache.org/ivy/
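
To make the "Java interface with different implementations" idea from this message a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the names RepositorySearch, ModuleHit and LocalIndexSearch are hypothetical, this is not Ivy's existing search interface, and the in-memory list only stands in for a real local index. A Solr-backed or repository-manager-backed class would implement the same interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical value type: the coordinates of one module revision found by a search. */
final class ModuleHit {
    final String organisation;
    final String name;
    final String revision;

    ModuleHit(String organisation, String name, String revision) {
        this.organisation = organisation;
        this.name = name;
        this.revision = revision;
    }

    @Override
    public String toString() {
        return organisation + "#" + name + ";" + revision;
    }
}

/** Hypothetical search abstraction (an assumption, not Ivy's existing interface). */
interface RepositorySearch {
    List<ModuleHit> search(String query);
}

/** Backend that scans a list held in memory, standing in for a local index. */
final class LocalIndexSearch implements RepositorySearch {
    private final List<ModuleHit> entries;

    LocalIndexSearch(List<ModuleHit> entries) {
        this.entries = new ArrayList<>(entries);
    }

    @Override
    public List<ModuleHit> search(String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<ModuleHit> hits = new ArrayList<>();
        for (ModuleHit entry : entries) {
            // Case-insensitive substring match on organisation or module name.
            if (entry.organisation.toLowerCase(Locale.ROOT).contains(needle)
                    || entry.name.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(entry);
            }
        }
        return hits;
    }
}

final class SearchDemo {
    public static void main(String[] args) {
        // Sample data only; a real backend would read an index instead.
        RepositorySearch search = new LocalIndexSearch(Arrays.asList(
                new ModuleHit("org.apache.ant", "ant", "1.7.1"),
                new ModuleHit("org.apache.ivy", "ivy", "2.1.0"),
                new ModuleHit("org.apache.lucene", "lucene-core", "2.9.1")));
        // Callers depend only on RepositorySearch, so another backend
        // could replace LocalIndexSearch here without further changes.
        for (ModuleHit hit : search.search("ivy")) {
            System.out.println(hit);
        }
    }
}
```

The point of the sketch is only the shape: client code such as IvyDE would talk to the interface, and the choice between a local index, Solr, or a repository manager becomes a configuration decision.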