From: Chapuis Bertil
To: droids-dev@incubator.apache.org
Date: Tue, 10 Nov 2009 08:40:36 +0100
Subject: Re: HandlerFactory fails with multithreaded implementation

I had the same problem and solved it in my handler's implementation by using only local variables and limiting concurrent access.
As I understand the issue, the same limitation could occur with a custom CrawlingDroid implementation, since all workers use the same Droid. A nice fix could be to make the Droid and GenericFactory abstractions cloneable and to invoke the clone method in the Worker's constructor.

Best regards,

Bertil Chapuis

On Mon, Nov 9, 2009 at 1:37 PM, Thorsten Scherler <thorsten.scherler.ext@juntadeandalucia.es> wrote:

> On Fri, 2009-11-06 at 14:29 +0100, Javier Puerto wrote:
> > Hi, I'm working with Droids and wrote some URL crawlers to save a lot of
> > web pages to disk. In a JUnit test, I run a small HTTP server and crawl
> > 20 pages. Most of the time everything works, but in rare cases I get an
> > error. I found the problem in the HandlerFactory implementation. In the
> > example, the call to the handlers looks like this:
> >
> >   protected void handle(ContentEntity entity, Link link)
> >       throws DroidsException, IOException
> >   {
> >     droid.getHandlerFactory().handle(link.getURI(), entity);
> >   }
> >
> > If two or more workers try to handle content at the same time, the
> > HandlerFactory will serve them all with the same handler instance. A fix
> > can favor either saving memory or improving performance.
> >
> > The first solution could be implemented by adding "synchronized" to
> > HandlerFactory.handle, like this:
> >
> >   public synchronized boolean handle(URI uri, ContentEntity entity)
> >       throws DroidsException, IOException {
> >     for (Handler handler : getMap().values()) {
> >       handler.handle(uri, entity);
> >     }
> >     return true;
> >   }
> >
> > This shares a single handler among all workers, but it is a performance
> > killer. The other approach is the opposite: each worker has its own
> > instance of the HandlerFactory or handler.
> >
> > Which solution do you think is more appropriate?
>
> It depends on the use case, I guess. However, I think the second option is
> the more common solution.
>
> salu2
>
> > Salu2.
> --
> Thorsten Scherler
> Open Source Java
>
> Sociedad Andaluza para el Desarrollo de la Sociedad
> de la Información, S.A.U. (SADESI)
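
For illustration, here is a minimal sketch of the clone-per-worker idea discussed in this thread. It is written in plain Java and does not use the Droids API: CountingHandler, Worker and ClonePerWorkerDemo are hypothetical names invented for this example, and the handler's StringBuilder merely stands in for whatever mutable state a real handler might keep. The assumption is that each worker clones a shared prototype in its constructor, so no handler instance is ever touched by two threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical handler with mutable state; unsafe to share between threads.
class CountingHandler implements Cloneable {
    private StringBuilder buffer = new StringBuilder(); // not thread-safe
    private int handled;

    void handle(String uri, String content) {
        buffer.setLength(0);
        buffer.append(uri).append(':').append(content.length());
        handled++;
    }

    int getHandled() {
        return handled;
    }

    @Override
    public CountingHandler clone() {
        try {
            CountingHandler copy = (CountingHandler) super.clone();
            // super.clone() is shallow, so give the copy its own buffer.
            copy.buffer = new StringBuilder();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: we implement Cloneable
        }
    }
}

// Hypothetical worker that takes a private copy of the handler when constructed.
class Worker implements Callable<Integer> {
    private final CountingHandler handler;
    private final List<String> uris;

    Worker(CountingHandler prototype, List<String> uris) {
        this.handler = prototype.clone(); // one handler instance per worker
        this.uris = uris;
    }

    @Override
    public Integer call() {
        for (String uri : uris) {
            handler.handle(uri, "content of " + uri);
        }
        return handler.getHandled();
    }
}

class ClonePerWorkerDemo {
    // Runs the given number of workers and returns the total pages handled.
    static int crawl(int workers, int pagesPerWorker) {
        CountingHandler prototype = new CountingHandler();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                List<String> uris = new ArrayList<>();
                for (int p = 0; p < pagesPerWorker; p++) {
                    uris.add("http://localhost/" + w + "/" + p);
                }
                results.add(pool.submit(new Worker(prototype, uris)));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(crawl(4, 5)); // prints 20
    }
}
```

Compared with a synchronized handle method, this trades memory (one handler per worker) for the absence of lock contention. It only helps if clone() really gives each copy its own mutable state, which is why the buffer is replaced after super.clone().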