hc-dev mailing list archives

From Henri Yandell <flame...@gmail.com>
Subject Re: What became of norbert the robots.txt parser?
Date Fri, 24 Jun 2005 13:10:13 GMT
It never made it into the sandbox for two reasons:

1) It just never hit the top of my todo list
2) Popping things into the sandbox wasn't working well for something
else, and I've been trying to avoid putting things in there that I'm
the sole community for. I tend to forget about them there, whereas
sitting at osjava I don't.

Maturity-wise, Norbert is helped by the norobots spec coming with a
nice set of test files; I was easily able to use these for the unit
tests, so from day 1 it was more mature than you'd usually expect a
new library to be.

Norbert had a 0.3 release a month ago, which was just the addition of
a settable User-Agent header so that it can hit the robots.txt file as
your application and not as 'OSJava Norbert' or whatever it was.
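
For illustration only, and not Norbert's actual API: a minimal sketch of
what "hitting the robots.txt file as your application" amounts to at the
HTTP level, using only java.net from the JDK. The class name
RobotsFetchSketch, the method prepare, and the agent string are all
hypothetical. The request is prepared but not sent.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

public class RobotsFetchSketch {

    // Hypothetical helper (not part of Norbert): builds a GET request for
    // the site's /robots.txt that identifies itself with the caller's own
    // User-Agent. Nothing goes over the network until the caller reads
    // from the returned connection.
    static HttpURLConnection prepare(String siteRoot, String userAgent)
            throws IOException {
        // robots.txt always lives at the root of the host.
        URL robots = URI.create(siteRoot).resolve("/robots.txt").toURL();
        HttpURLConnection conn = (HttpURLConnection) robots.openConnection();
        conn.setRequestMethod("GET");
        // Replace the library default with the application's own identity.
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }
}
```

Calling RobotsFetchSketch.prepare("http://example.com", "MyCrawler/1.0")
and then reading conn.getInputStream() would fetch the file under the
application's own name rather than a library default.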

http://www.osjava.org/norbert/changes-report.html

Hen

On 6/20/05, Oleg Kalnichevski <olegk@apache.org> wrote:
> On Mon, 2004-11-01 at 18:37 -0500, Henri Yandell wrote:
> > <snip>
> > I'll go ahead and migrate it into the sandbox at some point soon.
> >
> > On the web crawler side; there's:
> >
> > http://www.osjava.org/scraping-engine/
> >
> > I need to migrate it to use commons-configuration, and it already sits
> > on top of HttpClient. Food for thought anyway I hope. I use it
> > personally and it's used at my workplace, but haven't really pushed it
> > outside of my own use yet.
> >
> > Hen
> >
> 
> Henri was going to migrate it to commons-sandbox but has probably had
> more important things to do ever since.
> 
> Actually, at this point migrating 'norbert' straight to Jakarta
> HttpClient (not to be confused with Jakarta Commons HttpClient) is a
> feasible option as well.
> 
> Oleg
> PS: cc-ing to Henri
> 
> On Mon, 2005-06-20 at 16:22 -0400, Alexander Fairley wrote:
> > I'm thinking about incorporating norbert in a project I'm working on.
> > Googling around, I came across a discussion amongst you (circa Nov.
> > 2004) about putting norbert into the sandbox and potentially including
> > it in a revision of HttpClient. This was good news to me, because it
> > made me think Norbert was likely fairly functional. However, when I
> > went poking about the sandbox, norbert was nowhere to be found. Did
> > you folks pass judgement on Norbert and find him wanting, or just
> > never end up making the moves to put him into the sandbox?
> >
> > Alexander Fairley
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
> >
> >
> 
>
