<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>droids-dev@incubator.apache.org Archives</title>
<link rel="self" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/?format=atom"/>
<link href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/"/>
<id>http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/</id>
<updated>2009-12-05T23:27:47Z</updated>
<entry>
<title>Re: HandlerFactory fails with multithreaded implementation</title>
<author><name>Javier Puerto &lt;jpuerto@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200912.mbox/%3c3f271d880912020951k3dc96554x9c1aa8a7fcf5efe1@mail.gmail.com%3e"/>
<id>urn:uuid:%3c3f271d880912020951k3dc96554x9c1aa8a7fcf5efe1@mail-gmail-com%3e</id>
<updated>2009-12-02T17:51:18Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
I'm working on a solution for this problem and I think that the best fix
would be to move the HandlerFactory object from CrawlingDroid to
CrawlingWorker. With the cloneable solution, every handler must override the
clone() method and implement the Cloneable interface.
I think that moving the factory to the worker could be the best solution,
because each thread (worker) will have a new factory object with
new handlers and we can forget about the clone implementation. Also,
the clone solution will create and destroy all handlers for each target
handled in each thread. By contrast, the move fix will take more memory, but
each thread will have its own handlers.
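
A minimal sketch of the per-worker idea, assuming hypothetical stand-in types
(SketchHandler, SketchHandlerFactory, SketchWorker) rather than the real Droids
classes: each worker builds its own factory, so handler state is never shared
between threads.

```java
// Hypothetical stand-ins for illustration; these are NOT the real Droids types.
interface SketchHandler {
    void handle(String uri);

    int handledCount();
}

// A handler with mutable state, i.e. unsafe to share between threads.
class CountingHandler implements SketchHandler {
    private int count; // plain field: only safe while a single thread owns it

    public void handle(String uri) {
        count++;
    }

    public int handledCount() {
        return count;
    }
}

// Each factory instance owns its own handler instances.
class SketchHandlerFactory {
    private final SketchHandler[] handlers = { new CountingHandler() };

    void handle(String uri) {
        for (SketchHandler handler : handlers) {
            handler.handle(uri);
        }
    }

    int totalHandled() {
        int total = 0;
        for (SketchHandler handler : handlers) {
            total += handler.handledCount();
        }
        return total;
    }
}

// The worker creates its own factory instead of asking a shared droid for one.
class SketchWorker implements Runnable {
    private final SketchHandlerFactory factory = new SketchHandlerFactory();
    private final String[] uris;

    SketchWorker(String[] uris) {
        this.uris = uris;
    }

    public void run() {
        for (String uri : uris) {
            factory.handle(uri);
        }
    }

    int handled() {
        return factory.totalHandled();
    }
}

class PerWorkerFactorySketch {
    // Runs two workers in parallel and returns how many URIs each one handled.
    static int[] runTwoWorkers() {
        SketchWorker first = new SketchWorker(new String[] { "http://a/1", "http://a/2" });
        SketchWorker second = new SketchWorker(new String[] { "http://b/1" });
        Thread t1 = new Thread(first);
        Thread t2 = new Thread(second);
        t1.start();
        t2.start();
        try {
            t1.join(); // join() makes the workers' writes visible to this thread
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { first.handled(), second.handled() };
    }

    public static void main(String[] args) {
        int[] counts = runTwoWorkers();
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

The cost is the one mentioned above: every worker holds its own handler
instances, so memory grows with the number of threads.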

I have one open question about this fix: where should the HandlerFactory
be initialized for the Worker?

What do you think?

Salu2.


</pre>
</div>
</content>
</entry>
<entry>
<title>NYC Search &amp; Discovery Meetup</title>
<author><name>Otis Gospodnetic &lt;otis_gospodnetic@yahoo.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200912.mbox/%3c858380.66889.qm@web50306.mail.re2.yahoo.com%3e"/>
<id>urn:uuid:%3c858380-66889-qm@web50306-mail-re2-yahoo-com%3e</id>
<updated>2009-12-01T20:39:08Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hello,

For those living in or near NYC, you may be interested in joining (and/or presenting?) at
the NYC Search &amp; Discovery Meetup.
Topics are: search, machine learning, data mining, NLP, information gathering, information
extraction, etc.

  http://www.meetup.com/NYC-Search-and-Discovery/

Our previous/first meetup was about solr-python and parse.ly (a service that makes use of
Solr and solr-python).

Tomorrow (December 2 2009) we have:

  Incorporating Probabilistic Retrieval Knowledge into TFIDF-based Search Engine

You can RSVP at:
  http://www.meetup.com/NYC-Search-and-Discovery/calendar/11745435/

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Closed: (DROIDS-69) Add link to Javadoc</title>
<author><name>&quot;Thorsten Scherler (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c1143716591.1258366959752.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1143716591-1258366959752-JavaMail-jira@brutus%3e</id>
<updated>2009-11-16T10:22:39Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/DROIDS-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Thorsten Scherler closed DROIDS-69.
-----------------------------------

    Resolution: Fixed

http://incubator.apache.org/droids/api/
committed revision 880695

&gt; Add link to Javadoc
&gt; -------------------
&gt;
&gt;                 Key: DROIDS-69
&gt;                 URL: https://issues.apache.org/jira/browse/DROIDS-69
&gt;             Project: Droids
&gt;          Issue Type: Improvement
&gt;          Components: documentation
&gt;            Reporter: Otis Gospodnetic
&gt;            Priority: Minor
&gt;
&gt; It would be good to add a link to Droids Javadoc (trunk and the latest release when there
&gt; is one).
&gt;  http://ci.apache.org/projects/droids/api/overview-summary.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Queue: in memory or on disk?</title>
<author><name>Chapuis Bertil &lt;bchapuis@agimem.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c60B9C320-6705-4F6D-862D-7DDD01303BE3@agimem.com%3e"/>
<id>urn:uuid:%3c60B9C320-6705-4F6D-862D-7DDD01303BE3@agimem-com%3e</id>
<updated>2009-11-14T12:06:35Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Personally, I used Droids to crawl a website of approximately 250000 pages. The queue was stored
in memory and I arbitrarily allocated 1GB of memory to Java. Everything worked fine.

That's not a large number of web pages, but I think Droids' current implementation is well suited
to such jobs: crawling a relatively small set of web pages or crawling an intranet. This is
particularly true if you need to customize how the pages are handled.

I hope this experience helps.

Bertil Chapuis


On Nov 14, 2009, at 3:59 AM, Otis Gospodnetic wrote:

&gt; OK, thanks.
&gt; 
&gt; So how do people really use Droids at scale, e.g. crawling a large number of web pages?
&gt; I happen to use it for something smallish, so I never had issues with the queue being in the
&gt; JVM heap and getting OOMs because of that.  But I imagine that anyone using it for a larger
&gt; crawl would hit OOM sooner or later, no?
&gt; 
&gt; Does this imply that either nobody is using Droids for large-scale crawls, or that everyone
&gt; who does implemented their own, custom disk-backed queue?
&gt; 
&gt; 
&gt; Thanks,
&gt; Otis
&gt; --
&gt; Sematext is hiring -- http://sematext.com/about/jobs.html?mls
&gt; Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
&gt; 
&gt; 
&gt; 
&gt; ----- Original Message ----
&gt;&gt; From: Ryan McKinley &lt;ryantxu@gmail.com&gt;
&gt;&gt; To: droids-dev@incubator.apache.org
&gt;&gt; Sent: Fri, November 13, 2009 5:17:51 PM
&gt;&gt; Subject: Re: Queue: in memory or on disk?
&gt;&gt; 
&gt;&gt; ya, the standard one is in memory.
&gt;&gt; 
&gt;&gt; It is easy to write one to store things to disk or whatever -- I use one that 
&gt;&gt; stores tasks to an h2 database, but it is not general enough to contribute 
&gt;&gt; back...
&gt;&gt; 
&gt;&gt; I think Migfa was looking at replacing the droids Queue interface with a 
&gt;&gt; standard java.util.Queue interface
&gt;&gt; 
&gt;&gt; ryan
&gt;&gt; 
&gt;&gt; 
&gt;&gt; On Nov 13, 2009, at 5:10 PM, Chapuis Bertil wrote:
&gt;&gt; 
&gt;&gt;&gt; I think the current implementation only provides in-memory queues of tasks.
&gt;&gt;&gt; However, since the TaskQueue interface is relatively simple, it shouldn't be too
&gt;&gt;&gt; hard to persist the data on disk or to implement a TaskQueue which works
&gt;&gt;&gt; with a JMS broker or something else.
&gt;&gt;&gt; 
&gt;&gt;&gt; 
&gt;&gt;&gt; On Nov 12, 2009, at 10:37 PM, Otis Gospodnetic wrote:
&gt;&gt;&gt; 
&gt;&gt;&gt;&gt; Hello,
&gt;&gt;&gt;&gt; 
&gt;&gt;&gt;&gt; I haven't looked at the sources.  But who stores items put in the Queue?  Are
&gt;&gt;&gt;&gt; they in memory, or does something write them to disk, or something else?
&gt;&gt;&gt;&gt; 
&gt;&gt;&gt;&gt; Thanks,
&gt;&gt;&gt;&gt; Otis
&gt;&gt;&gt;&gt; --
&gt;&gt;&gt;&gt; Sematext is hiring -- http://sematext.com/about/jobs.html?mls
&gt;&gt;&gt;&gt; Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
&gt;&gt;&gt;&gt; 
&gt;&gt;&gt; 
&gt; 



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (DROIDS-69) Add link to Javadoc</title>
<author><name>&quot;Otis Gospodnetic (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c432846708.1258167999944.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c432846708-1258167999944-JavaMail-jira@brutus%3e</id>
<updated>2009-11-14T03:06:39Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Add link to Javadoc
-------------------

                 Key: DROIDS-69
                 URL: https://issues.apache.org/jira/browse/DROIDS-69
             Project: Droids
          Issue Type: Improvement
          Components: documentation
            Reporter: Otis Gospodnetic
            Priority: Minor


It would be good to add a link to Droids Javadoc (trunk and the latest release when there
is one).
 http://ci.apache.org/projects/droids/api/overview-summary.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Queue: in memory or on disk?</title>
<author><name>Otis Gospodnetic &lt;ogjunk-droids@yahoo.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c355335.29756.qm@web50308.mail.re2.yahoo.com%3e"/>
<id>urn:uuid:%3c355335-29756-qm@web50308-mail-re2-yahoo-com%3e</id>
<updated>2009-11-14T02:59:39Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
OK, thanks.

So how do people really use Droids at scale, e.g. crawling a large number of web pages?  I
happen to use it for something smallish, so I never had issues with the queue being in the
JVM heap and getting OOMs because of that.  But I imagine that anyone using it for a larger
crawl would hit OOM sooner or later, no?

Does this imply that either nobody is using Droids for large-scale crawls, or that everyone
who does implemented their own, custom disk-backed queue?


Thanks,
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
&gt; From: Ryan McKinley &lt;ryantxu@gmail.com&gt;
&gt; To: droids-dev@incubator.apache.org
&gt; Sent: Fri, November 13, 2009 5:17:51 PM
&gt; Subject: Re: Queue: in memory or on disk?
&gt; 
&gt; ya, the standard one is in memory.
&gt; 
&gt; It is easy to write one to store things to disk or whatever -- I use one that 
&gt; stores tasks to an h2 database, but it is not general enough to contribute 
&gt; back...
&gt; 
&gt; I think Migfa was looking at replacing the droids Queue interface with a 
&gt; standard java.util.Queue interface
&gt; 
&gt; ryan
&gt; 
&gt; 
&gt; On Nov 13, 2009, at 5:10 PM, Chapuis Bertil wrote:
&gt; 
&gt; &gt; I think the current implementation only provides in-memory queues of tasks.
&gt; &gt; However, since the TaskQueue interface is relatively simple, it shouldn't be too
&gt; &gt; hard to persist the data on disk or to implement a TaskQueue which works
&gt; &gt; with a JMS broker or something else.
&gt; &gt; 
&gt; &gt; 
&gt; &gt; On Nov 12, 2009, at 10:37 PM, Otis Gospodnetic wrote:
&gt; &gt; 
&gt; &gt;&gt; Hello,
&gt; &gt;&gt; 
&gt; &gt;&gt; I haven't looked at the sources.  But who stores items put in the Queue?  Are
&gt; &gt;&gt; they in memory, or does something write them to disk, or something else?
&gt; &gt;&gt; 
&gt; &gt;&gt; Thanks,
&gt; &gt;&gt; Otis
&gt; &gt;&gt; --
&gt; &gt;&gt; Sematext is hiring -- http://sematext.com/about/jobs.html?mls
&gt; &gt;&gt; Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
&gt; &gt;&gt; 
&gt; &gt; 



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Queue: in memory or on disk?</title>
<author><name>Ryan McKinley &lt;ryantxu@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c284B06A9-0F29-4590-9EF1-A9C5C2880468@gmail.com%3e"/>
<id>urn:uuid:%3c284B06A9-0F29-4590-9EF1-A9C5C2880468@gmail-com%3e</id>
<updated>2009-11-13T22:17:51Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
ya, the standard one is in memory.

It is easy to write one to store things to disk or whatever -- I use  
one that stores tasks to an h2 database, but it is not general enough  
to contribute back...

I think Migfa was looking at replacing the droids Queue interface with  
a standard java.util.Queue interface

ryan


On Nov 13, 2009, at 5:10 PM, Chapuis Bertil wrote:

&gt; I think the current implementation only provides in-memory queues of
&gt; tasks. However, since the TaskQueue interface is relatively simple,
&gt; it shouldn't be too hard to persist the data on disk or to
&gt; implement a TaskQueue which works with a JMS broker or something else.
&gt;
&gt;
&gt; On Nov 12, 2009, at 10:37 PM, Otis Gospodnetic wrote:
&gt;
&gt;&gt; Hello,
&gt;&gt;
&gt;&gt; I haven't looked at the sources.  But who stores items put in the  
&gt;&gt; Queue?  Are they in memory, or does something write them to disk,  
&gt;&gt; or something else?
&gt;&gt;
&gt;&gt; Thanks,
&gt;&gt; Otis
&gt;&gt; --
&gt;&gt; Sematext is hiring -- http://sematext.com/about/jobs.html?mls
&gt;&gt; Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
&gt;&gt;
&gt;



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Queue: in memory or on disk?</title>
<author><name>Chapuis Bertil &lt;bchapuis@agimem.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c4A5C84AB-6CE2-40AA-A939-00E54098CDC6@agimem.com%3e"/>
<id>urn:uuid:%3c4A5C84AB-6CE2-40AA-A939-00E54098CDC6@agimem-com%3e</id>
<updated>2009-11-13T22:10:28Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
I think the current implementation only provides in-memory queues of tasks. However, since
the TaskQueue interface is relatively simple, it shouldn't be too hard to persist the data
on disk or to implement a TaskQueue which works with a JMS broker or something else.
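
As a rough illustration of what persisting tasks on disk could look like, here is a minimal
sketch. It is an assumption-laden example, not the Droids TaskQueue interface: the class
FileBackedQueueSketch, its add/poll methods and the one-URI-per-line file format are all
hypothetical. Tasks are appended to a plain text file and only a line counter stays in memory.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: an append-only text file acts as the queue storage.
class FileBackedQueueSketch {
    private final File file;
    private int consumed; // number of lines already handed out

    FileBackedQueueSketch(File file) {
        this.file = file;
    }

    // Appends one task (here just a URI string without line breaks) to the file.
    synchronized void add(String uri) throws IOException {
        FileWriter writer = new FileWriter(file, true); // true = append mode
        try {
            writer.write(uri + "\n");
        } finally {
            writer.close();
        }
    }

    // Returns the next unconsumed task, or null when nothing is left.
    synchronized String poll() throws IOException {
        if (!file.exists()) {
            return null;
        }
        BufferedReader reader = new BufferedReader(new FileReader(file));
        try {
            // Skip the lines that were already handed out by earlier calls.
            for (int skipped = 0; skipped != consumed; skipped++) {
                reader.readLine();
            }
            String next = reader.readLine();
            if (next != null) {
                consumed++;
            }
            return next;
        } finally {
            reader.close();
        }
    }
}

class FileBackedQueueDemo {
    // Adds two tasks, polls three times, and joins the results with commas.
    static String run() {
        try {
            File file = File.createTempFile("droids-queue-sketch", ".txt");
            file.deleteOnExit();
            FileBackedQueueSketch queue = new FileBackedQueueSketch(file);
            queue.add("http://example.org/a");
            queue.add("http://example.org/b");
            return queue.poll() + "," + queue.poll() + "," + queue.poll();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

This sketch rescans the file on every poll and never truncates it, so it only shows the idea
of moving queue contents out of the JVM heap; a real implementation would more likely use a
database or a message broker.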


On Nov 12, 2009, at 10:37 PM, Otis Gospodnetic wrote:

&gt; Hello,
&gt; 
&gt; I haven't looked at the sources.  But who stores items put in the Queue?  Are they in
&gt; memory, or does something write them to disk, or something else?
&gt; 
&gt; Thanks,
&gt; Otis
&gt; --
&gt; Sematext is hiring -- http://sematext.com/about/jobs.html?mls
&gt; Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
&gt; 



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Javadoc?</title>
<author><name>Chapuis Bertil &lt;bchapuis@agimem.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c319BBEA0-D00C-4722-8A98-0B41EDA66A6C@agimem.com%3e"/>
<id>urn:uuid:%3c319BBEA0-D00C-4722-8A98-0B41EDA66A6C@agimem-com%3e</id>
<updated>2009-11-13T22:01:58Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hello,

Here is the Javadoc:

http://ci.apache.org/projects/droids/api/overview-summary.html

Best regards,

Bertil



On Nov 12, 2009, at 10:38 PM, Otis Gospodnetic wrote:

&gt; I tried looking for Droids javadoc, hoping my question about Queue would be in the javadoc, but [...]



</pre>
</div>
</content>
</entry>
<entry>
<title>Javadoc?</title>
<author><name>Otis Gospodnetic &lt;otis_gospodnetic@yahoo.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c175031.30683.qm@web50301.mail.re2.yahoo.com%3e"/>
<id>urn:uuid:%3c175031-30683-qm@web50301-mail-re2-yahoo-com%3e</id>
<updated>2009-11-12T21:38:46Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi,

I tried looking for Droids javadoc, hoping my question about Queue would be in the javadoc,
but I could not find the javadoc anywhere on Droids' site.  I then tried looking for the link
to the svn web view (so I don't have to svn co), but I couldn't find that either.

Am I just not seeing them?

Thanks,
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



</pre>
</div>
</content>
</entry>
<entry>
<title>Queue: in memory or on disk?</title>
<author><name>Otis Gospodnetic &lt;otis_gospodnetic@yahoo.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c787391.49037.qm@web50304.mail.re2.yahoo.com%3e"/>
<id>urn:uuid:%3c787391-49037-qm@web50304-mail-re2-yahoo-com%3e</id>
<updated>2009-11-12T21:37:22Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hello,

I haven't looked at the sources.  But who stores items put in the Queue?  Are they in memory,
or does something write them to disk, or something else?

Thanks,
Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: HandlerFactory fails with multithreaded implementation</title>
<author><name>Javier Puerto &lt;jpuerto@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c3f271d880911120828v726a38dbg7600420fa1363fa5@mail.gmail.com%3e"/>
<id>urn:uuid:%3c3f271d880911120828v726a38dbg7600420fa1363fa5@mail-gmail-com%3e</id>
<updated>2009-11-12T16:28:36Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
2009/11/12 Chapuis Bertil &lt;bchapuis@agimem.com&gt;

&gt; Not in the short term; I am quite busy this month. But Javier could create a
&gt; ticket for the issue.
&gt;
&gt; Best regards,
&gt;
&gt; Bertil Chapuis
&gt;

Ticket created at https://issues.apache.org/jira/browse/DROIDS-68
I added the synchronized fix to the ticket as a quick workaround.
The cloneable fix is better, but I'm busy too.

Best regards.


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (DROIDS-68) HandlerFactory fails with multithreaded implementation</title>
<author><name>&quot;Javier Puerto (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c403269262.1258042779632.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c403269262-1258042779632-JavaMail-jira@brutus%3e</id>
<updated>2009-11-12T16:19:39Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
HandlerFactory fails with multithreaded implementation
------------------------------------------------------

                 Key: DROIDS-68
                 URL: https://issues.apache.org/jira/browse/DROIDS-68
             Project: Droids
          Issue Type: Bug
          Components: core
         Environment: Ubuntu 9.04 i386
Java Runtime 1.6_16
            Reporter: Javier Puerto
            Priority: Critical


Hi, I'm working with Droids and made some URL crawlers to save a lot of web pages to disk.
In a JUnit test I run a small HTTP server and crawl 20 pages; most of the time everything works,
but in rare cases I get an error. I found the problem in the HandlerFactory implementation.
In the example, the call to the handlers looks like this:

protected void handle(ContentEntity entity, Link link)
    throws DroidsException, IOException
{
  droid.getHandlerFactory().handle(link.getURI(), entity);
}


If two or more workers try to handle at the same time, the HandlerFactory will serve them all
with the same handler instance. A solution can favor either saving memory or improving performance.

The first solution could be implemented by adding "synchronized" to HandlerFactory.handle, like
this:

public synchronized boolean handle(URI uri, ContentEntity entity)
    throws DroidsException, IOException {
  for (Handler handler : getMap().values()) {
    handler.handle(uri, entity);
  }
  return true;
}
This solution works, but it is only a workaround.

The real solution was discussed on the dev list: make the Droid and the GenericFactory
abstractions cloneable and invoke the clone method in the Worker's constructor.
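
A minimal sketch of the cloning approach follows. All class names in it
(CloneableHandlerSketch, CloneableFactorySketch, CloningWorkerSketch) are hypothetical
stand-ins, not the real Droids or GenericFactory types; the point is only that the worker's
constructor takes a deep copy, so no handler instance is shared between workers.

```java
// Hypothetical stand-ins for illustration; NOT the real Droids classes.
class CloneableHandlerSketch implements Cloneable {
    private int handled; // mutable per-handler state

    void handle(String uri) {
        handled++;
    }

    int handled() {
        return handled;
    }

    @Override
    public CloneableHandlerSketch clone() {
        try {
            return (CloneableHandlerSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

class CloneableFactorySketch implements Cloneable {
    private CloneableHandlerSketch[] handlers = { new CloneableHandlerSketch() };

    void handle(String uri) {
        for (CloneableHandlerSketch handler : handlers) {
            handler.handle(uri);
        }
    }

    int handled() {
        int total = 0;
        for (CloneableHandlerSketch handler : handlers) {
            total += handler.handled();
        }
        return total;
    }

    @Override
    public CloneableFactorySketch clone() {
        try {
            CloneableFactorySketch copy = (CloneableFactorySketch) super.clone();
            // Deep copy: super.clone() alone would still share the handler array.
            copy.handlers = new CloneableHandlerSketch[handlers.length];
            for (int i = 0; i != handlers.length; i++) {
                copy.handlers[i] = handlers[i].clone();
            }
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

class CloningWorkerSketch {
    private final CloneableFactorySketch factory;

    CloningWorkerSketch(CloneableFactorySketch shared) {
        this.factory = shared.clone(); // private copy for this worker
    }

    void work(String uri) {
        factory.handle(uri);
    }

    int handled() {
        return factory.handled();
    }
}

class CloneDemoSketch {
    // Returns the counts seen by worker a, worker b, and the shared prototype.
    static int[] run() {
        CloneableFactorySketch shared = new CloneableFactorySketch();
        CloningWorkerSketch a = new CloningWorkerSketch(shared);
        CloningWorkerSketch b = new CloningWorkerSketch(shared);
        a.work("http://a/1");
        a.work("http://a/2");
        b.work("http://b/1");
        return new int[] { a.handled(), b.handled(), shared.handled() };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

Note that a shallow clone would not be enough here: every handler has to be cloned as well,
which is why each handler class would need its own clone() implementation.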

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: HandlerFactory fails with multithreaded implementation</title>
<author><name>Chapuis Bertil &lt;bchapuis@agimem.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c142d723c0911112358p215be92i6d4b17c209c70ec8@mail.gmail.com%3e"/>
<id>urn:uuid:%3c142d723c0911112358p215be92i6d4b17c209c70ec8@mail-gmail-com%3e</id>
<updated>2009-11-12T07:58:49Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Wed, Nov 11, 2009 at 9:25 AM, Thorsten Scherler &lt;
thorsten.scherler.ext@juntadeandalucia.es&gt; wrote:

&gt; On Tue, 2009-11-10 at 08:40 +0100, Chapuis Bertil wrote:
&gt; &gt; I had the same problem and solved it in my handler's implementation by only
&gt; &gt; using local variables and limiting concurrent access. As I understand the
&gt; &gt; issue, the same limitation could occur with a custom CrawlingDroid
&gt; &gt; implementation since all workers are using the same Droid. A nice fix could
&gt; &gt; be to make the Droid and the GenericFactory abstractions cloneable and
&gt; &gt; invoke the clone method in the Worker's constructor.
&gt;
&gt; Could you provide a patch?


Not in the short term; I am quite busy this month. But Javier could create a
ticket for the issue.

Best regards,

Bertil Chapuis


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: [Vote] Javier Puerto as apache droids committer</title>
<author><name>Thorsten Scherler &lt;thorsten.scherler.ext@juntadeandalucia.es&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c1257928269.3697.15.camel@asf%3e"/>
<id>urn:uuid:%3c1257928269-3697-15-camel@asf%3e</id>
<updated>2009-11-11T08:31:09Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Tue, 2009-11-03 at 09:53 +0100, Thorsten Scherler wrote:
&gt; On Tue, 2009-11-03 at 09:34 +0100, Thorsten Scherler wrote:
&gt; &gt; The droids pmc proposes Javier Puerto to become a new committer.

During the time period there were no negative votes, and more than 3
positive votes.

So welcome, Javier, as a new Apache Droids committer!

Here are the next steps. There is no rush. You can provide
the answers here or use the droids-private AT incubator.a.o address if
you prefer.

We are generally following the procedure at:
http://www.apache.org/dev/#pmc
http://www.apache.org/dev/pmc.html#newcommitter

You need to send a Contributor License Agreement to the ASF.
Normally you would send just an Individual CLA. If you also make
contributions done in work time or using work resources then
see the additional Corporate CLA. It is up to you if you
need that additional CLA, as the Individual CLA declares that
you are legally entitled. Ask us if you have any issues.
http://www.apache.org/licenses/#clas

You need to choose a preferred ASF user name and alternatives.
See the existing names http://www.apache.org/~jim/committers.html

When we see the ASF volunteer secretary record the receipt of
the CLA in an svn commit, we can proceed to ask Infrastructure
to set up your account.

The developer section of the website describes the roles
and provides other resources. Especially important are
the ones for "new committers".
http://www.apache.org/foundation/how-it-works.html
http://www.apache.org/dev/

salu2
-- 
Thorsten Scherler &lt;thorsten.at.apache.org&gt;
Open Source Java &lt;consulting, training and solutions&gt;

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)






</pre>
</div>
</content>
</entry>
<entry>
<title>Re: HandlerFactory fails with multithreaded implementation</title>
<author><name>Thorsten Scherler &lt;thorsten.scherler.ext@juntadeandalucia.es&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c1257927908.3697.11.camel@asf%3e"/>
<id>urn:uuid:%3c1257927908-3697-11-camel@asf%3e</id>
<updated>2009-11-11T08:25:08Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Tue, 2009-11-10 at 08:40 +0100, Chapuis Bertil wrote:
&gt; I had the same problem and solved it in my handler's implementation by only
&gt; using local variables and limiting concurrent access. As I understand the
&gt; issue, the same limitation could occur with a custom CrawlingDroid
&gt; implementation since all workers are using the same Droid. A nice fix could
&gt; be to make the Droid and the GenericFactory abstractions cloneable and
&gt; invoke the clone method in the Worker's constructor.

Could you provide a patch? When I talked with Javier we discussed this
option as well and IMO it is an elegant solution.

Thanks for your feedback Bertil.

salu2

&gt; 
&gt; Best regards,
&gt; 
&gt; Bertil Chapuis
&gt; 
&gt; 
&gt; On Mon, Nov 9, 2009 at 1:37 PM, Thorsten Scherler &lt;
&gt; thorsten.scherler.ext@juntadeandalucia.es&gt; wrote:
&gt; 
&gt; &gt; On Fri, 2009-11-06 at 14:29 +0100, Javier Puerto wrote:
&gt; &gt; &gt; Hi, I'm working with Droids and made some URL crawlers to save a lot of web
&gt; &gt; &gt; pages to disk. In a JUnit test I run a small HTTP server and crawl 20 pages;
&gt; &gt; &gt; most of the time everything works, but in rare cases I get an error. I found
&gt; &gt; &gt; the problem in the HandlerFactory implementation. In the example, the call
&gt; &gt; &gt; to the handlers looks like this:
&gt; &gt; &gt;
&gt; &gt; &gt; protected void handle(ContentEntity entity, Link link)
&gt; &gt; &gt;     throws DroidsException, IOException
&gt; &gt; &gt; {
&gt; &gt; &gt;   droid.getHandlerFactory().handle(link.getURI(), entity);
&gt; &gt; &gt; }
&gt; &gt; &gt;
&gt; &gt; &gt;
&gt; &gt; &gt; If two or more workers try to handle at the same time, the HandlerFactory
&gt; &gt; &gt; will serve them all with the same instance of the handler. A solution
&gt; &gt; &gt; can favor either saving memory or improving performance.
&gt; &gt; &gt;
&gt; &gt; &gt; The first solution could be implemented by adding "synchronized" to
&gt; &gt; &gt; HandlerFactory.handle, like this:
&gt; &gt; &gt;
&gt; &gt; &gt; public synchronized boolean handle(URI uri, ContentEntity entity)
&gt; &gt; &gt;     throws DroidsException, IOException {
&gt; &gt; &gt;   for (Handler handler : getMap().values()) {
&gt; &gt; &gt;     handler.handle(uri, entity);
&gt; &gt; &gt;   }
&gt; &gt; &gt;   return true;
&gt; &gt; &gt; }
&gt; &gt; &gt;
&gt; &gt; &gt; This keeps a single handler shared by all workers, but it is a
&gt; &gt; &gt; performance killer. The other approach is the opposite: each worker has
&gt; &gt; &gt; its own instance of the HandlerFactory or handler.
&gt; &gt; &gt;
&gt; &gt; &gt; Which solution do you think is more appropriate?
&gt; &gt;
&gt; &gt; It depends on the use case, I guess. However, I think the second option is
&gt; &gt; the more common solution.
&gt; &gt;
&gt; &gt; salu2
&gt; &gt;
&gt; &gt; &gt;
&gt; &gt; &gt; Salu2.
&gt; &gt; --
&gt; &gt; Thorsten Scherler &lt;thorsten.at.apache.org&gt;
&gt; &gt; Open Source Java &lt;consulting, training and solutions&gt;
&gt; &gt;
&gt; &gt; Sociedad Andaluza para el Desarrollo de la Sociedad
&gt; &gt; de la Información, S.A.U. (SADESI)
&gt; &gt;
&gt; &gt;
&gt; &gt;
&gt; &gt;
&gt; &gt;
-- 
Thorsten Scherler &lt;thorsten.at.apache.org&gt;
Open Source Java &lt;consulting, training and solutions&gt;

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)






</pre>
</div>
</content>
</entry>
<entry>
<title>Re: HandlerFactory fails with multithreaded implementation</title>
<author><name>Chapuis Bertil &lt;bchapuis@agimem.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c142d723c0911092340q635dc012o9e2cdcd0e0c2a361@mail.gmail.com%3e"/>
<id>urn:uuid:%3c142d723c0911092340q635dc012o9e2cdcd0e0c2a361@mail-gmail-com%3e</id>
<updated>2009-11-10T07:40:36Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
I had the same problem and solved it in my handler's implementation by only
using local variables and limiting concurrent access. As I understand the
issue, the same limitation could occur with a custom CrawlingDroid
implementation since all workers are using the same Droid. A nice fix could
be to make the Droid and the GenericFactory abstractions cloneable and
invoke the clone method in the Worker's constructor.

Best regards,

Bertil Chapuis


On Mon, Nov 9, 2009 at 1:37 PM, Thorsten Scherler &lt;
thorsten.scherler.ext@juntadeandalucia.es&gt; wrote:

&gt; On Fri, 2009-11-06 at 14:29 +0100, Javier Puerto wrote:
&gt; &gt; Hi, I'm working with Droids and made some URL crawlers to save a lot of web
&gt; &gt; pages to disk. In a JUnit test I run a small HTTP server and crawl 20 pages;
&gt; &gt; most of the time everything works, but in rare cases I get an error. I found
&gt; &gt; the problem in the HandlerFactory implementation. In the example, the call
&gt; &gt; to the handlers looks like this:
&gt; &gt;
&gt; &gt; protected void handle(ContentEntity entity, Link link)
&gt; &gt;     throws DroidsException, IOException
&gt; &gt; {
&gt; &gt;   droid.getHandlerFactory().handle(link.getURI(), entity);
&gt; &gt; }
&gt; &gt;
&gt; &gt;
&gt; &gt; If two or more workers try to handle at the same time, the HandlerFactory
&gt; &gt; will serve them all with the same instance of the handler. A solution
&gt; &gt; can favor either saving memory or improving performance.
&gt; &gt;
&gt; &gt; The first solution could be implemented by adding "synchronized" to
&gt; &gt; HandlerFactory.handle, like this:
&gt; &gt;
&gt; &gt; public synchronized boolean handle(URI uri, ContentEntity entity)
&gt; &gt;     throws DroidsException, IOException {
&gt; &gt;   for (Handler handler : getMap().values()) {
&gt; &gt;     handler.handle(uri, entity);
&gt; &gt;   }
&gt; &gt;   return true;
&gt; &gt; }
&gt; &gt;
&gt; &gt; This keeps a single handler shared by all workers, but it is a
&gt; &gt; performance killer. The other approach is the opposite: each worker has
&gt; &gt; its own instance of the HandlerFactory or handler.
&gt; &gt;
&gt; &gt; Which solution do you think is more appropriate?
&gt;
&gt; It depends on the use case, I guess. However, I think the second option is
&gt; the more common solution.
&gt;
&gt; salu2
&gt;
&gt; &gt;
&gt; &gt; Salu2.
&gt; --
&gt; Thorsten Scherler &lt;thorsten.at.apache.org&gt;
&gt; Open Source Java &lt;consulting, training and solutions&gt;
&gt;
&gt; Sociedad Andaluza para el Desarrollo de la Sociedad
&gt; de la Información, S.A.U. (SADESI)
&gt;
&gt;
&gt;
&gt;
&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: HandlerFactory fails with multithreaded implementation</title>
<author><name>Thorsten Scherler &lt;thorsten.scherler.ext@juntadeandalucia.es&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c1257770220.3697.2.camel@asf%3e"/>
<id>urn:uuid:%3c1257770220-3697-2-camel@asf%3e</id>
<updated>2009-11-09T12:37:00Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Fri, 2009-11-06 at 14:29 +0100, Javier Puerto wrote:
&gt; Hi, I'm working with Droids and made some URL crawlers to save a lot of web
&gt; pages to disk. In a JUnit test I run a small HTTP server and crawl 20 pages;
&gt; most of the time everything works, but in rare cases I get an error. I found
&gt; the problem in the HandlerFactory implementation. In the example, the call
&gt; to the handlers looks like this:
&gt; 
&gt; protected void handle(ContentEntity entity, Link link)
&gt;     throws DroidsException, IOException
&gt; {
&gt;   droid.getHandlerFactory().handle(link.getURI(), entity);
&gt; }
&gt; 
&gt; 
&gt; If two or more workers try to handle at the same time, the HandlerFactory
&gt; will handle them all with the same handler instance. The solution could
&gt; favor either saving memory or improving performance.
&gt; 
&gt; The first solution could be implemented by adding "synchronized" to
&gt; HandlerFactory.handle, like this:
&gt; 
&gt; public synchronized boolean handle(URI uri, ContentEntity entity)
&gt;     throws DroidsException, IOException {
&gt;   for (Handler handler : getMap().values()) {
&gt;     handler.handle(uri, entity);
&gt;   }
&gt;   return true;
&gt; }
&gt; 
&gt; With this, only one handler is shared by all workers, but the solution
&gt; is a performance killer. The other approach is the opposite: each worker
&gt; has its own instance of the HandlerFactory or handler.
&gt; 
&gt; Which solution do you think is more appropriate?

It depends on the use case, I guess. However, I think the second option is
the more common solution.

salu2

&gt; 
&gt; Salu2.
-- 
Thorsten Scherler &lt;thorsten.at.apache.org&gt;
Open Source Java &lt;consulting, training and solutions&gt;

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)






</pre>
</div>
</content>
</entry>
<entry>
<title>Incubator PMC/Board report for November 2009 (&quot;Droids Developers&quot; &lt;droids-dev@incubator.apache.org&gt;)</title>
<author><name>Marvin &lt;marvin@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c20091106142727.GA21808@minotaur.apache.org%3e"/>
<id>urn:uuid:%3c20091106142727-GA21808@minotaur-apache-org%3e</id>
<updated>2009-11-06T14:27:27Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Dear Droids Developers,

This email was sent by an automated system on behalf of the Apache Incubator PMC.
It is an initial reminder to give you plenty of time to prepare your quarterly
board report.

The board meeting is scheduled for  Wed, 18 November 2009, 2 pm Pacific. The report 
for your podling will form a part of the Incubator PMC report. The Incubator PMC 
requires your report to be submitted one week before the board meeting, to allow 
sufficient time for review.

Please submit your report with sufficient time to allow the incubator PMC, and 
subsequently board members to review and digest. Again, the very latest you 
should submit your report is one week prior to the board meeting.

Thanks,

The Apache Incubator PMC

Submitting your Report
----------------------

Your report should contain the following:

 * Your project name
 * A brief description of your project, which assumes no knowledge of the project
   or necessarily of its field
 * A list of the three most important issues to address in the move towards 
   graduation.
 * Any issues that the Incubator PMC or ASF Board might wish/need to be aware of
 * How has the community developed since the last report
 * How has the project developed since the last report.
 
This should be appended to the Incubator Wiki page at:

  http://wiki.apache.org/incubator/November2009

Note: This page is manually populated. You may need to wait a little before it is
      created from a template.

Mentors
-------
Mentors should review reports for their project(s) and sign them off on the 
Incubator wiki page. Signing off reports shows that you are following the 
project - projects that are not signed may raise alarms for the Incubator PMC.

Incubator PMC



</pre>
</div>
</content>
</entry>
<entry>
<title>Board report reminder emails</title>
<author><name>Upayavira &lt;uv@odoko.co.uk&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c1257516472.15730.1006.camel@urgyen%3e"/>
<id>urn:uuid:%3c1257516472-15730-1006-camel@urgyen%3e</id>
<updated>2009-11-06T14:07:52Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Every month, a third of incubator podlings must submit reports to the
incubator PMC.

I have written and am about to test a script that will run at the
beginning of each month to send out reminders to those podlings that are
due to report.

As your report is due this month, if everything goes to plan you'll see
a reminder mail soon after this one.

When you do, please let me know if you see any errors. Please note, I'm
likely not subscribed to this list.

Upayavira



</pre>
</div>
</content>
</entry>
<entry>
<title>HandlerFactory fails with multithreaded implementation</title>
<author><name>Javier Puerto &lt;jpuerto@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c3f271d880911060529p6296e506j423b3262f45ad63b@mail.gmail.com%3e"/>
<id>urn:uuid:%3c3f271d880911060529p6296e506j423b3262f45ad63b@mail-gmail-com%3e</id>
<updated>2009-11-06T13:29:46Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi, I'm working with Droids and have written some URL crawlers that save many
web pages to disk. In a JUnit test, I run a small HTTP server and crawl 20 pages.
Most of the time everything works, but in rare cases I get an error. I traced
the problem to the HandlerFactory implementation. In the example, the call to
the handlers looks like this:

protected void handle(ContentEntity entity, Link link)
    throws DroidsException, IOException
{
  droid.getHandlerFactory().handle(link.getURI(), entity);
}


If two or more workers try to handle at the same time, the HandlerFactory
will handle them all with the same handler instance. The solution could
favor either saving memory or improving performance.

The first solution could be implemented by adding "synchronized" to
HandlerFactory.handle, like this:

public synchronized boolean handle(URI uri, ContentEntity entity)
    throws DroidsException, IOException {
  for (Handler handler : getMap().values()) {
    handler.handle(uri, entity);
  }
  return true;
}

With this, only one handler is shared by all workers, but the solution
is a performance killer. The other approach is the opposite: each worker
has its own instance of the HandlerFactory or handler.

Which solution do you think is more appropriate?
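
For illustration, here is a minimal sketch of the second option, where each
worker owns its handler and nothing needs to be synchronized while handling.
CountingHandler, Worker and crawl are hypothetical stand-ins written for this
example; they are not Droids classes and do not reproduce the real Handler or
HandlerFactory API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PerWorkerHandlers {

    // Hypothetical stateful handler (not the Droids Handler interface).
    // Its counter is unsynchronized, so it must never be shared by threads.
    static class CountingHandler {
        private int handled;

        void handle(String uri) {
            handled++;
        }

        int handled() {
            return handled;
        }
    }

    // Hypothetical worker: it creates its own handler instead of asking a
    // shared factory, so no locking is needed while handling.
    static class Worker implements Runnable {
        private final CountingHandler handler = new CountingHandler();
        private final int pages;
        private final AtomicInteger total;

        Worker(int pages, AtomicInteger total) {
            this.pages = pages;
            this.total = total;
        }

        public void run() {
            for (int i = 0; i != pages; i++) {
                handler.handle("http://localhost/page" + i);
            }
            // Publish the per-worker result once, at the end.
            total.addAndGet(handler.handled());
        }
    }

    // Runs the given number of workers and returns the pages handled overall.
    static int crawl(int workers, int pagesPerWorker) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger total = new AtomicInteger();
        for (int i = 0; i != workers; i++) {
            pool.execute(new Worker(pagesPerWorker, total));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(crawl(4, 1000));
    }
}
```

The cost is one handler instance per worker (more memory); the gain is that
workers never wait on each other inside handle.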

Salu2.


</pre>
</div>
</content>
</entry>
<entry>
<title>Free live video streaming of ApacheCon US 2009</title>
<author><name>Michael McCandless &lt;lucene@mikemccandless.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c9ac0c6aa0911040525g1ec9402dr7cbb8bd5ea614aea@mail.gmail.com%3e"/>
<id>urn:uuid:%3c9ac0c6aa0911040525g1ec9402dr7cbb8bd5ea614aea@mail-gmail-com%3e</id>
<updated>2009-11-04T13:25:25Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Team,

For those Lucene fanatics not in Oakland this week for ApacheCon US,
don't miss the FREE live video streaming, starting today:

  http://streaming.linux-magazin.de/en/program-apachecon-us-2009.htm

Note that there are many talks available, covering Apache Hadoop,
Apache HTTPD, Lucene, as well as the Apache Pioneer's Panel and
keynote presentations.

Lucene's track is this Friday (NOTE these times are UTC -- use
http://www.timeanddate.com to map to your time zone):

 17:00 Implementing an Information Retrieval Framework for an
       Organizational Repository, Sithu D Sudarsan

 18:00 Apache Mahout - Going from raw data to information
       Isabel Drost

 19:15 MIME Magic with Apache Tika
       Jukka Zitting

 20:15 Keynote: How Open Source Developers Can (Still!) Save The World
       Brian Behlendorf

 22:00 Building Intelligent Search Applications with the Lucene
       Ecosystem, Ted Dunning

 23:00 Realtime Search
       Jason Rutherglen

Happy viewing,

Mike


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: [Vote] Javier Purto as apache droids committer</title>
<author><name>Ryan McKinley &lt;ryantxu@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3cC9181963-53D2-4F7C-853F-996FFC8CB246@gmail.com%3e"/>
<id>urn:uuid:%3cC9181963-53D2-4F7C-853F-996FFC8CB246@gmail-com%3e</id>
<updated>2009-11-03T20:43:30Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

On Nov 3, 2009, at 3:34 AM, Thorsten Scherler wrote:

&gt; The droids pmc proposes Javier Purto to become a new committer.
&gt;

+1


</pre>
</div>
</content>
</entry>
<entry>
<title>Re: [Vote] Javier Puerto as apache droids committer</title>
<author><name>Thorsten Scherler &lt;thorsten.scherler.ext@juntadeandalucia.es&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c1257238976.7747.6.camel@asf%3e"/>
<id>urn:uuid:%3c1257238976-7747-6-camel@asf%3e</id>
<updated>2009-11-03T09:02:56Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Tue, 2009-11-03 at 09:34 +0100, Thorsten Scherler wrote:
&gt; The droids pmc proposes Javier Purto to become a new committer.

Upss sorry Javier! It is Puerto. Javier Puerto!

sorry!

salu2

&gt; 
&gt; He has offered great contributions since the beginning of the project
&gt; (even in labs) and follows them through with the feedback given by the
&gt; developers.
&gt; 
&gt; Please cast your votes. The voting period will end a week from today.
&gt; 
&gt; http://www.timeanddate.com/counters/customcounter.html?year=2009&amp;month=11&amp;day=10
&gt; 
&gt; salu2
-- 
Thorsten Scherler &lt;thorsten.at.apache.org&gt;
Open Source Java &lt;consulting, training and solutions&gt;

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)






</pre>
</div>
</content>
</entry>
<entry>
<title>Re: [Vote] Javier Purto as apache droids committer</title>
<author><name>Thorsten Scherler &lt;thorsten.scherler.ext@juntadeandalucia.es&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c1257238380.7747.5.camel@asf%3e"/>
<id>urn:uuid:%3c1257238380-7747-5-camel@asf%3e</id>
<updated>2009-11-03T08:53:00Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
On Tue, 2009-11-03 at 09:34 +0100, Thorsten Scherler wrote:
&gt; The droids pmc proposes Javier Purto to become a new committer.

+1

salu2
-- 
Thorsten Scherler &lt;thorsten.at.apache.org&gt;
Open Source Java &lt;consulting, training and solutions&gt;

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)






</pre>
</div>
</content>
</entry>
<entry>
<title>[Vote] Javier Purto as apache droids committer</title>
<author><name>Thorsten Scherler &lt;thorsten@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c20CDF737-5D90-4BD7-A763-107DB7BAD3EB@apache.org%3e"/>
<id>urn:uuid:%3c20CDF737-5D90-4BD7-A763-107DB7BAD3EB@apache-org%3e</id>
<updated>2009-11-03T08:34:53Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
The droids pmc proposes Javier Purto to become a new committer.

He has offered great contributions since the beginning of the project
(even in labs) and follows them through with the feedback given by the
developers.

Please cast your votes. The voting period will end a week from today.

http://www.timeanddate.com/counters/customcounter.html?year=2009&amp;month=11&amp;day=10

salu2


</pre>
</div>
</content>
</entry>
<entry>
<title>assistance with Apache Forrest</title>
<author><name>David Crossley &lt;crossley@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200911.mbox/%3c20091102032651.GE15843@igg.indexgeo.com.au%3e"/>
<id>urn:uuid:%3c20091102032651-GE15843@igg-indexgeo-com-au%3e</id>
<updated>2009-11-02T03:26:51Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
I did some research to find the ASF projects that
manage their websites with Apache Forrest, and am
sending a similar email to each project's dev mail list.

The purposes of this email are to remind people
about some of the useful facilities of Forrest,
and also alert them to discussion about the status
and future directions of Forrest, and to appeal for
people to assist Forrest.

--- oOo ---

These are useful facilities to assist with developing
and managing a Forrest solution for your project's website.

"How to deploy documentation with the Forrestbot
svn workstage"
This explains how the Forrest project manages our
own documentation.
http://forrest.apache.org/howto-forrestbot-svn.html

"Generate an ASF mirrors page using interactive web form"
http://forrest.apache.org/docs/dev/howto/howto-asf-mirror.html

"ForrestBar - Firefox toolbar to ease navigation
and search of Forrest resources"
http://forrest.apache.org/tools/forrestbar.html

"How to do development with Apache Forrest"
http://forrest.apache.org/howto-dev.html

"Frequently Asked Questions"
http://forrest.apache.org/faq.html

"The Anakia output plugin"
This was developed to assist the old Incubator
website to stop using Forrest and export all content
to an Anakia "xdoc" format. From there it could be used
by an Anakia-based build system, or be further transformed.
http://forrest.apache.org/pluginDocs/plugins_0_80/org.apache.forrest.plugin.output.Anakia/

As usual, if you need further assistance with anything
then please ask on the Forrest mail lists.

--- oOo ---

There is discussion currently underway on the Forrest
dev mail list about the current status and future
direction of Forrest.
http://thread.gmane.org/gmane.text.xml.forrest.devel/27325

If anyone can assist Forrest, in any capacity, then
please do.

-David


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Closed: (DROIDS-67) Better control of the finished task</title>
<author><name>&quot;Thorsten Scherler (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c264491259.1256557379379.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c264491259-1256557379379-JavaMail-jira@brutus%3e</id>
<updated>2009-10-26T11:42:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/DROIDS-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Thorsten Scherler closed DROIDS-67.
-----------------------------------

    Resolution: Fixed

thanks javier

&gt; Better control of the finished task
&gt; -----------------------------------
&gt;
&gt;                 Key: DROIDS-67
&gt;                 URL: https://issues.apache.org/jira/browse/DROIDS-67
&gt;             Project: Droids
&gt;          Issue Type: Improvement
&gt;          Components: core
&gt;            Reporter: Javier Puerto
&gt;         Attachments: awaitTermination.patch
&gt;
&gt;
&gt; The interface TaskMaster defines the method:
&gt; void awaitTermination(long timeout, TimeUnit unit) throws InterruptedException;
&gt; But the implementation in MultiThreadedTaskMaster is based on ExecutorService, whose
&gt; method of the same name returns a boolean value.
&gt; Returning the boolean is better: if you build two multithreaded Droids where one
&gt; consumes the output of the other, with the void version you cannot tell whether the
&gt; first (producer) has finished its work before running the second (consumer).
&gt; Attached is a patch that solves the problem; it is also implemented in SequentialTaskMaster.
&gt; Salu2.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (DROIDS-67) Better control of the finished task</title>
<author><name>&quot;Thorsten Scherler (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c28617644.1256556899434.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c28617644-1256556899434-JavaMail-jira@brutus%3e</id>
<updated>2009-10-26T11:34:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/DROIDS-67?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12769991#action_12769991
] 

Thorsten Scherler commented on DROIDS-67:
-----------------------------------------

Committed revision 829755.

&gt; Better control of the finished task
&gt; -----------------------------------
&gt;
&gt;                 Key: DROIDS-67
&gt;                 URL: https://issues.apache.org/jira/browse/DROIDS-67
&gt;             Project: Droids
&gt;          Issue Type: Improvement
&gt;          Components: core
&gt;            Reporter: Javier Puerto
&gt;         Attachments: awaitTermination.patch
&gt;
&gt;
&gt; The interface TaskMaster defines the method:
&gt; void awaitTermination(long timeout, TimeUnit unit) throws InterruptedException;
&gt; But the implementation in MultiThreadedTaskMaster is based on ExecutorService, whose
&gt; method of the same name returns a boolean value.
&gt; Returning the boolean is better: if you build two multithreaded Droids where one
&gt; consumes the output of the other, with the void version you cannot tell whether the
&gt; first (producer) has finished its work before running the second (consumer).
&gt; Attached is a patch that solves the problem; it is also implemented in SequentialTaskMaster.
&gt; Salu2.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Reopened: (DROIDS-65) Add Droids to ASF Maven repository</title>
<author><name>&quot;Thorsten Scherler (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c391566366.1256111399472.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c391566366-1256111399472-JavaMail-jira@brutus%3e</id>
<updated>2009-10-21T07:49:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/DROIDS-65?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Thorsten Scherler reopened DROIDS-65:
-------------------------------------


Sonatype does not respond to the search above, but if you browse the repository in the
tree view you will find a couple of artifacts. However, not all artifacts are there. This
issue needs a little more investigation.

&gt; Add Droids to ASF Maven repository
&gt; ----------------------------------
&gt;
&gt;                 Key: DROIDS-65
&gt;                 URL: https://issues.apache.org/jira/browse/DROIDS-65
&gt;             Project: Droids
&gt;          Issue Type: Improvement
&gt;          Components: infrastructure
&gt;            Reporter: Otis Gospodnetic
&gt;            Priority: Minor
&gt;
&gt; Droids should be added to Apache's Maven repository: https://repository.apache.org/index.html#nexus-search;droids

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Closed: (DROIDS-65) Add Droids to ASF Maven repository</title>
<author><name>&quot;Thorsten Scherler (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c1085737115.1256075759466.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1085737115-1256075759466-JavaMail-jira@brutus%3e</id>
<updated>2009-10-20T21:55:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/DROIDS-65?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Thorsten Scherler closed DROIDS-65.
-----------------------------------

    Resolution: Fixed

When all the mirrors are updated, Droids will be listed in the snapshots repository.

&gt; Add Droids to ASF Maven repository
&gt; ----------------------------------
&gt;
&gt;                 Key: DROIDS-65
&gt;                 URL: https://issues.apache.org/jira/browse/DROIDS-65
&gt;             Project: Droids
&gt;          Issue Type: Improvement
&gt;          Components: infrastructure
&gt;            Reporter: Otis Gospodnetic
&gt;            Priority: Minor
&gt;
&gt; Droids should be added to Apache's Maven repository: https://repository.apache.org/index.html#nexus-search;droids

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Commented: (DROIDS-65) Add Droids to ASF Maven repository</title>
<author><name>&quot;Thorsten Scherler (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c243078274.1256075639392.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c243078274-1256075639392-JavaMail-jira@brutus%3e</id>
<updated>2009-10-20T21:53:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

    [ https://issues.apache.org/jira/browse/DROIDS-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12767954#action_12767954
] 

Thorsten Scherler commented on DROIDS-65:
-----------------------------------------

Committed revision 827812.

You have to add the following to your settings.xml to be able to deploy the snapshot:
&lt;settings&gt;
  &lt;servers&gt;
    &lt;server&gt;
      &lt;id&gt;org.apache.people&lt;/id&gt;
      &lt;directoryPermissions&gt;775&lt;/directoryPermissions&gt;
      &lt;filePermissions&gt;644&lt;/filePermissions&gt;
      &lt;privateKey&gt;/Users/YOU/.ssh/id_rsa&lt;/privateKey&gt;
      &lt;passphrase&gt;yourPhrase&lt;/passphrase&gt;
    &lt;/server&gt;
  &lt;/servers&gt;
&lt;/settings&gt;

You need to be a committer to be able to deploy.

&gt; Add Droids to ASF Maven repository
&gt; ----------------------------------
&gt;
&gt;                 Key: DROIDS-65
&gt;                 URL: https://issues.apache.org/jira/browse/DROIDS-65
&gt;             Project: Droids
&gt;          Issue Type: Improvement
&gt;          Components: infrastructure
&gt;            Reporter: Otis Gospodnetic
&gt;            Priority: Minor
&gt;
&gt; Droids should be added to Apache's Maven repository: https://repository.apache.org/index.html#nexus-search;droids

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Updated: (DROIDS-67) Better control of the finished task</title>
<author><name>&quot;Javier Puerto (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c1934004631.1256055719709.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1934004631-1256055719709-JavaMail-jira@brutus%3e</id>
<updated>2009-10-20T16:21:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>

     [ https://issues.apache.org/jira/browse/DROIDS-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Javier Puerto updated DROIDS-67:
--------------------------------

    Attachment: awaitTermination.patch

Changes the interface TaskMaster and the classes that implement it to support the boolean
return type on the awaitTermination method.

&gt; Better control of the finished task
&gt; -----------------------------------
&gt;
&gt;                 Key: DROIDS-67
&gt;                 URL: https://issues.apache.org/jira/browse/DROIDS-67
&gt;             Project: Droids
&gt;          Issue Type: Improvement
&gt;          Components: core
&gt;            Reporter: Javier Puerto
&gt;         Attachments: awaitTermination.patch
&gt;
&gt;
&gt; The interface TaskMaster defines the method:
&gt; void awaitTermination(long timeout, TimeUnit unit) throws InterruptedException;
&gt; But the implementation in MultiThreadedTaskMaster is based on ExecutorService, whose
&gt; method of the same name returns a boolean value.
&gt; Returning the boolean is better: if you build two multithreaded Droids where one
&gt; consumes the output of the other, with the void version you cannot tell whether the
&gt; first (producer) has finished its work before running the second (consumer).
&gt; Attached is a patch that solves the problem; it is also implemented in SequentialTaskMaster.
&gt; Salu2.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (DROIDS-67) Better control of the finished task</title>
<author><name>&quot;Javier Puerto (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c69488605.1256055479614.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c69488605-1256055479614-JavaMail-jira@brutus%3e</id>
<updated>2009-10-20T16:17:59Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Better control of the finished task
-----------------------------------

                 Key: DROIDS-67
                 URL: https://issues.apache.org/jira/browse/DROIDS-67
             Project: Droids
          Issue Type: Improvement
          Components: core
            Reporter: Javier Puerto


The interface TaskMaster defines the method:

void awaitTermination(long timeout, TimeUnit unit) throws InterruptedException;

But the implementation in MultiThreadedTaskMaster is based on ExecutorService, whose
method of the same name returns a boolean value.
Returning the boolean is better: if you build two multithreaded Droids where one
consumes the output of the other, with the void version you cannot tell whether the
first (producer) has finished its work before running the second (consumer).

Attached is a patch that solves the problem; it is also implemented in SequentialTaskMaster.
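
As a rough illustration of the producer/consumer case, here is a minimal
sketch that relies only on the boolean returned by
ExecutorService.awaitTermination (true if everything finished, false if the
timeout elapsed first). The TaskMaster interface below is reduced to that one
method, and PoolTaskMaster, Step and runPipeline are hypothetical names made
up for this example; this is not the attached patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AwaitTerminationSketch {

    // Hypothetical reduction of TaskMaster with the boolean return value
    // proposed in this issue; the real interface has more methods.
    interface TaskMaster {
        boolean awaitTermination(long timeout, TimeUnit unit)
            throws InterruptedException;
    }

    // Hypothetical task master that delegates to an ExecutorService.
    static class PoolTaskMaster implements TaskMaster {
        private final ExecutorService pool = Executors.newFixedThreadPool(2);

        void start(Runnable task) {
            pool.execute(task);
            pool.shutdown(); // accept no further tasks, let this one finish
        }

        public boolean awaitTermination(long timeout, TimeUnit unit)
            throws InterruptedException {
            // true: all tasks finished; false: the timeout elapsed first
            return pool.awaitTermination(timeout, unit);
        }
    }

    // Appends a marker to the shared log when run.
    static class Step implements Runnable {
        private final StringBuffer log;
        private final String marker;

        Step(StringBuffer log, String marker) {
            this.log = log;
            this.marker = marker;
        }

        public void run() {
            log.append(marker);
        }
    }

    // Starts the consumer only after the producer reports termination.
    static String runPipeline() {
        StringBuffer log = new StringBuffer();
        try {
            PoolTaskMaster producer = new PoolTaskMaster();
            producer.start(new Step(log, "produced;"));
            if (!producer.awaitTermination(10, TimeUnit.SECONDS)) {
                return "producer timed out";
            }
            PoolTaskMaster consumer = new PoolTaskMaster();
            consumer.start(new Step(log, "consumed;"));
            if (!consumer.awaitTermination(10, TimeUnit.SECONDS)) {
                return "consumer timed out";
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(runPipeline());
    }
}
```

With a void awaitTermination the two timeout checks above would not be
possible; the caller could not tell a finished producer from one still running.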

Salu2.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>Re: Niocchi - java asynchronous crawl library released</title>
<author><name>Andrzej Bialecki &lt;ab@getopt.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c4ADB29F0.3000502@getopt.org%3e"/>
<id>urn:uuid:%3c4ADB29F0-3000502@getopt-org%3e</id>
<updated>2009-10-18T14:45:04Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Lukáš Vlček wrote:
&gt; Hi,
&gt; 
&gt; I just noticed that Niocchi has been released recently.
&gt; http://www.niocchi.com/
&gt; 
&gt; Niocchi is a java asynchronous crawl library implemented with NIO. It is 
&gt; designed to crawl several thousands of hosts in parallel on a single low 
&gt; end server. It is currently being used in production by Enormo 
&gt; &lt;http://www.enormo.com/&gt; to crawl thousands of websites daily, and 
&gt; by Vitalprix &lt;http://www.vitalprix.com/&gt;.

Well, of course we should optimize our use of resources, and we could 
check what this library can offer - but I doubt that optimizations on 
this level would bring significant benefits in terms of increased speed 
of crawling - low-level IO handling is rarely the bottleneck. Most of 
the time the politeness limits (max rate of requests per host) are the 
bottleneck.
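
To put rough numbers on that, here is a back-of-the-envelope sketch. The host
count and per-host delay are made-up values for illustration, and
maxPagesPerSecond is a hypothetical helper, not part of Droids or Niocchi.

```java
class PolitenessCeiling {

    // Upper bound on pages per second when every host accepts at most one
    // request per delayMillis, however fast the I/O layer is.
    static double maxPagesPerSecond(int hosts, long delayMillis) {
        return hosts * 1000.0 / delayMillis;
    }

    public static void main(String[] args) {
        // Assumed values: 50 hosts, one request every 2 seconds per host.
        System.out.println(maxPagesPerSecond(50, 2000));
    }
}
```

Under these assumed values the crawl cannot exceed 25 pages per second in
total, whichever IO library sits underneath.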


-- 
Best regards,
Andrzej Bialecki     &lt;&gt;&lt;
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



</pre>
</div>
</content>
</entry>
<entry>
<title>Niocchi - java asynchronous crawl library released</title>
<author><name>Lukáš Vlček &lt;lukas.vlcek@gmail.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c52c3ddca0910180411h55cf543dv9e05a4055f767b0e@mail.gmail.com%3e"/>
<id>urn:uuid:%3c52c3ddca0910180411h55cf543dv9e05a4055f767b0e@mail-gmail-com%3e</id>
<updated>2009-10-18T11:11:41Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi,
I just noticed that Niocchi has been released recently.
http://www.niocchi.com/

Niocchi is a Java asynchronous crawl library implemented with NIO. It is
designed to crawl several thousands of hosts in parallel on a single low end
server. It is currently being used in production by Enormo
&lt;http://www.enormo.com/&gt; to crawl thousands of websites daily, and by
Vitalprix &lt;http://www.vitalprix.com/&gt;.

Regards,
Lukas


</pre>
</div>
</content>
</entry>
<entry>
<title>[OMAS] Features of a multi-&quot;cognitive agent&quot; system</title>
<author><name>Florent André &lt;florent.andre-dev@4sengines.com&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c00832768e1bd5043f8255bbeffbb8cb4@4sengines.eu%3e"/>
<id>urn:uuid:%3c00832768e1bd5043f8255bbeffbb8cb4@4sengines-eu%3e</id>
<updated>2009-10-14T09:32:50Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Hi Droids,

I found the OMAS (Open Multi-Agent System) platform, which has some cool
features (but works only on Windows/Mac).

Description:
"OMAS is a  multi-agent platform which allows prototyping systems of
cognitive agents"

I recommend reading the main features:
http://www.utc.fr/%7Ebarthes/OMAS/N244L-OMAS-7-Features.pdf

For further inspection, see the official website:
http://www.utc.fr/~barthes/OMAS/


I'm not sure whether multi-agent (+ cognitive?) systems are within the scope of
Droids, but maybe this will give someone some ideas...

++


</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (DROIDS-66) Create DOAP for Droids</title>
<author><name>&quot;Otis Gospodnetic (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c2087183202.1254934351674.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c2087183202-1254934351674-JavaMail-jira@brutus%3e</id>
<updated>2009-10-07T16:52:31Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Create DOAP for Droids
----------------------

                 Key: DROIDS-66
                 URL: https://issues.apache.org/jira/browse/DROIDS-66
             Project: Droids
          Issue Type: Improvement
          Components: infrastructure
            Reporter: Otis Gospodnetic
            Priority: Minor


DOAP for Droids should be created and added to http://projects.apache.org/indexes/alpha.html#D

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>[jira] Created: (DROIDS-65) Add Droids to ASF Maven repository</title>
<author><name>&quot;Otis Gospodnetic (JIRA)&quot; &lt;jira@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3c1878395908.1254934351638.JavaMail.jira@brutus%3e"/>
<id>urn:uuid:%3c1878395908-1254934351638-JavaMail-jira@brutus%3e</id>
<updated>2009-10-07T16:52:31Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Add Droids to ASF Maven repository
----------------------------------

                 Key: DROIDS-65
                 URL: https://issues.apache.org/jira/browse/DROIDS-65
             Project: Droids
          Issue Type: Improvement
          Components: infrastructure
            Reporter: Otis Gospodnetic
            Priority: Minor


Droids should be added to Apache's Maven repository: https://repository.apache.org/index.html#nexus-search;droids

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



</pre>
</div>
</content>
</entry>
<entry>
<title>ApacheCon US</title>
<author><name>Grant Ingersoll &lt;gsingers@apache.org&gt;</name></author>
<link rel="alternate" href="http://mail-archives.apache.org/mod_mbox/incubator-droids-dev/200910.mbox/%3cEE4984D0-A622-4A17-AB89-B7FC2AD6AB65@apache.org%3e"/>
<id>urn:uuid:%3cEE4984D0-A622-4A17-AB89-B7FC2AD6AB65@apache-org%3e</id>
<updated>2009-10-07T10:35:42Z</updated>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<pre>
Just a friendly reminder to all about Lucene ecosystem events at
ApacheCon US this year.  We have two days of talks on pretty much
every project under Lucene (see
http://lucene.apache.org/#14+August+2009+-+Lucene+at+US+ApacheCon)
plus a meetup and a two day training on Lucene and a 1 day training
on Solr.  The Lucene training will cover Lucene 2.9 and I'm sure
Erik's Solr one will cover Solr 1.4.  I also know there will be quite
a few Lucene, et al. committers at ApacheCon this year, so it should
be a good year to interact and discuss your favorite projects.

ApacheCon US is in Oakland (near San Francisco) the week of November  
2nd.  The trainings are on the 2nd and 3rd, and the main conference  
starts on the 4th.

You can register at http://www.us.apachecon.com/c/acus2009/

Hope to see you there,
Grant


</pre>
</div>
</content>
</entry>
</feed>
