lucene-java-user mailing list archives

From "Nader S. Henein" <>
Subject RE: Italian web sites
Date Wed, 24 Apr 2002 09:11:40 GMT
Sniff the IP, then use the database at the
internet topology website
to find the country of origin. (Use the results to populate your
own DB, so remote retrievals decrease as you accumulate IPs.) Note that this
will give you websites hosted in Italy, not Italian-language websites.
Unfortunately, unless a distinct encoding is used for the page, picking the
language up from the page won't help much.
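
A rough sketch of the "populate your own DB" idea, in case it helps. The
remote country lookup is passed in as a function because the actual database
is not shown here; CountryCache, remoteLookup and the in-memory map are all
hypothetical names, and a real crawler would likely persist the table rather
than keep it in memory.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Caches IP-to-country answers so the remote database is asked only once per IP. */
public class CountryCache {
    // Local "DB" of answers seen so far (in-memory here; an assumption).
    private final Map<String, String> countryByIp = new HashMap<>();
    // Hypothetical stand-in for the remote IP-to-country database.
    private final Function<String, String> remoteLookup;
    private int remoteCalls = 0;

    public CountryCache(Function<String, String> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    /** Returns the country code for an IP, asking the remote source only on a cache miss. */
    public String countryForIp(String ip) {
        String cached = countryByIp.get(ip);
        if (cached != null) {
            return cached;
        }
        remoteCalls++;
        String country = remoteLookup.apply(ip);
        if (country != null) {
            countryByIp.put(ip, country);
        }
        return country;
    }

    /** Resolves a host name to its IP first (needs DNS), then looks up the country. */
    public String countryForHost(String host) throws UnknownHostException {
        return countryForIp(InetAddress.getByName(host).getHostAddress());
    }

    /** Number of lookups that had to go to the remote source. */
    public int remoteCalls() {
        return remoteCalls;
    }
}
```

The spider would call countryForHost() before fetching a site and skip it
unless the answer is "IT". As said above, this only tells you where the
server sits, not what language the pages are in.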

-----Original Message-----
From: []
Sent: Wednesday, April 24, 2002 1:03 PM
Subject: Italian web sites

Hi all,

I'm using Jobo for spidering web sites and Lucene for indexing. The
problem is that I'd like to spider only Italian web sites.
How can I discover the country of a web site?

Do you know of a method that you could suggest?


