www-community mailing list archives

From Ben Hyde <bh...@pobox.com>
Subject Re: Where are we?
Date Thu, 30 Jan 2003 02:23:32 GMT
This will scrape the locations...

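#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(timeout => 30, agent => "Krell-GeoScraper/0.1");

# urls.txt holds one "username: url" pair per line.
open(my $fh, "<", "urls.txt") or die "Cannot open urls.txt: $!";
while (my $line = <$fh>) {
   chomp $line;
   my ($username, $url) = split(/: */, $line, 2);
   next unless defined $url;
   my $res = $ua->get($url);
   # Leave both fields empty when the fetch fails or no ICBM tag is found.
   my ($lat, $lon) = ("", "");
   if ($res->is_success
       and $res->content =~ m{<meta\s+name\s*=\s*"ICBM"\s+content="\s*([.+0-9-]*)\s*[,;]\s*([.+0-9-]*)\s*"\s*[/]*>}is) {
      ($lat, $lon) = ($1, $2);
   }
   print "$username:$lat:$lon:$url\n";
}
close($fh);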

... which prints, e.g.

bhyde:42.41528:-71.15694:http://enthusiasm.cozy.org/
erikabele:48.7942:10.1151:http://www.codefaktor.de/weblog/
coar:35.90528:-78.85000:http://Ken.Coar.Org/blog/
fitz:-87.67350:41.97200:http://www.red-bean.com/fitz/
jwoolley:::http://www.cs.virginia.edu/~jcw5q/
stevenn:51.0749:3.7473:http://blogs.cocoondev.org/stevenn/
thommay:51.502798:-0.329835:http://www.planetarytramp.net/
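
To make the pattern easier to follow, here is a rough sketch of a close variant of it written with the /x modifier so each piece can carry a comment. The page fragment in $html is made up for illustration, and matching the leading whitespace outside the capture group is an assumption of this sketch rather than something taken from the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A close variant of the scraper's pattern, spread out with /x for annotation.
my $icbm = qr{
    <meta \s+ name \s* = \s* "ICBM"   # opening tag with name="ICBM"
    \s+ content="                     # start of the content attribute
    \s* ([.+0-9-]*)                   # first number: digits, sign, decimal point
    \s* [,;] \s*                      # comma or semicolon separator
    ([.+0-9-]*)                       # second number
    \s* " \s* [/]* >                  # closing quote, optional slash, end of tag
}isx;

# Hypothetical page fragment, made up for illustration.
my $html = '<meta name="ICBM" content="42.41528, -71.15694" />';

# Default to empty fields, as happens for a page without the tag.
my ($lat, $lon) = ("", "");
($lat, $lon) = ($1, $2) if $html =~ $icbm;
print "example:$lat:$lon\n";
```

A page with no ICBM tag leaves both fields empty, which is how a line like the jwoolley one above comes about.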

   - ben

ps. I kind of feel that the user names are private and ought
not to appear in any public reports; hmm...
pps. you gotta love regular expressions, well you do if you're ever going
to understand da bastards.



