incubator-general mailing list archives

From Rodrigo Fernandes Moreira <rodr...@eteg.com.br>
Subject New Project Proposal: Sinon
Date Mon, 25 Apr 2005 15:28:04 GMT

Hi,

My name is Rodrigo and I am a partner at a Brazilian company called
Eteg, which focuses on Java software development. I would like to
present a project that I hope can become part of the Apache Incubator.

In 2001 we launched a site called eBabel
<http://web.archive.org/web/20021002235400/http:/www.ebabel.com.br>,
a price comparison site similar to MySimon
<http://www.mysimon.com/> and Froogle <http://froogle.google.com/>.
It had some success, but not enough to be commercially viable, so
unfortunately it had to be closed.

Last year, a company that knew we had built eBabel asked us to
develop a system to monitor its competitors, based on the
information those competitors publish on the Internet. Since
technology had advanced a lot since eBabel, and since there were many
things we would not do the same way again, we developed this system from
scratch.

This system is composed of a core that we call *Sinon* and another part
that is specific to that company (mainly to consolidate all the
information retrieved). Sinon is what we would like to donate to the ASF.

Briefly, *Sinon* is an engine built to retrieve information from
partially or fully structured sources on the Internet. It collects
this information according to XML configuration files. It is
written in Java and uses JDOM, Jakarta Commons HttpClient, Jakarta
Velocity and Jakarta Commons Logging.
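
To make the idea of XML-driven retrieval concrete, here is a minimal
sketch of what such a configuration and its interpreter could look like.
It is not Sinon's actual format or code: the <source> and <field>
elements, the regular-expression approach, and the class name
ConfigDrivenExtractor are all assumptions made for illustration, and the
sketch uses the JDK's built-in XML parser rather than JDOM so that it
runs without extra libraries.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ConfigDrivenExtractor {

    // Hypothetical configuration format (an assumption, not Sinon's):
    // each <field> names a value and gives a regular expression whose
    // first capture group holds that value in the page.
    static final String CONFIG =
        "<source name=\"example-store\">"
      + "<field name=\"title\" pattern=\"&lt;h1&gt;(.*?)&lt;/h1&gt;\"/>"
      + "<field name=\"price\" pattern=\"class=&quot;price&quot;&gt;([0-9.]+)&lt;\"/>"
      + "</source>";

    // Applies every <field> rule in the configuration to the page text
    // and returns the values found, keyed by field name.
    static Map<String, String> extract(String configXml, String page) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(configXml)));
        NodeList fields = doc.getElementsByTagName("field");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            Pattern pattern = Pattern.compile(field.getAttribute("pattern"), Pattern.DOTALL);
            Matcher matcher = pattern.matcher(page);
            if (matcher.find()) {
                result.put(field.getAttribute("name"), matcher.group(1));
            }
            // A field with no match is simply left out of the result.
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // A made-up page fragment standing in for a fetched product page.
        String page = "<html><h1>Some Book</h1><span class=\"price\">19.90</span></html>";
        System.out.println(extract(CONFIG, page));
    }
}
```

With this shape, supporting a new site would mean writing a new
configuration rather than new Java code, which is the property the
paragraph above describes.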

As an example, with *Sinon* it is possible to build, in one hour, a
system that collects the prices of books from Amazon and B&N (books are
a simpler case because they have a unique identifier, the ISBN).

Sinon has been up and running at this client for some months. So, in a
way, it is a "finished" product; on the other hand, we have a lot of
ideas for improvements, such as:
* A change flag. If the structure of a web page has changed, the
developer would be notified so that the XML file can be updated.
* A cache, for the case where a page has not changed since the last
retrieval. This way, we could achieve better performance.
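
The two ideas above could be sketched roughly as follows. This is only
an illustration under stated assumptions, not existing Sinon code: the
class PageTracker and its methods are hypothetical, a page counts as
"unchanged" when the SHA-256 digest of its text is identical, and the
structure counts as "changed" when a field the configuration expects is
missing from the extracted values.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class PageTracker {

    // Last digest seen and last extraction result, both keyed by URL.
    private final Map<String, String> lastDigest = new HashMap<>();
    private final Map<String, Map<String, String>> cachedResult = new HashMap<>();

    // Hex SHA-256 digest of the page text, used as a cheap equality check.
    static String digest(String content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                .digest(content.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, hash).toString(16);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Cache: runs the extractor only when the page differs from the
    // last one seen for this URL; otherwise returns the stored result.
    Map<String, String> extract(String url, String content,
                                Function<String, Map<String, String>> extractor) {
        String current = digest(content);
        if (current.equals(lastDigest.get(url))) {
            return cachedResult.get(url);
        }
        Map<String, String> result = extractor.apply(content);
        lastDigest.put(url, current);
        cachedResult.put(url, result);
        return result;
    }

    // Change flag: true when at least one expected field was not found,
    // which suggests the page layout no longer matches the XML file.
    static boolean structureChanged(Set<String> expectedFields,
                                    Map<String, String> extracted) {
        return !extracted.keySet().containsAll(expectedFields);
    }
}
```

Note that this sketch only saves the extraction work, since the page
still has to be downloaded before it can be hashed. Avoiding the
download as well would need something like HTTP conditional requests
(If-Modified-Since or ETag), where the target site supports them.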

I would like to know whether this makes sense for the ASF and, if it
does, what my next step should be in order to enter the incubation
process.

Best Regards,
Rodrigo Fernandes
Eteg Internet Ltd
