lucene-java-user mailing list archives

From Karl Øie <k...@gan.no>
Subject Re: Italian web sites
Date Wed, 24 Apr 2002 11:57:38 GMT
hm... this looks very interesting! if it is a perl exe you can just copy the 
text into a temp file, run the perl exe on that file, and redirect the 
output to another tmp file. then read that file and use the result in a lucene 
keyword field.
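
A minimal sketch of the temp-file approach described above, in Java. The class and method names (ExternalGuesser, runOnTempFile) are hypothetical, and the UTF-8 charset is an assumption; the command to run is passed in by the caller, since the exact way TextCat is invoked is not stated here and should be taken from the library's own documentation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ExternalGuesser {

    /**
     * Writes text to a temp file, runs the given command with that file's
     * path appended as the last argument, redirects the command's standard
     * output to a second temp file, and returns that file's trimmed contents.
     * Names and charset are illustrative assumptions, not a TextCat API.
     */
    public static String runOnTempFile(List<String> command, String text)
            throws IOException, InterruptedException {
        Path in = Files.createTempFile("guess-in", ".txt");
        Path out = Files.createTempFile("guess-out", ".txt");
        try {
            // Step 1: copy the text into a temp file.
            Files.write(in, text.getBytes(StandardCharsets.UTF_8));

            // Step 2: run the external program on that file,
            // sending its output to the second temp file.
            List<String> cmd = new ArrayList<>(command);
            cmd.add(in.toString());
            Process p = new ProcessBuilder(cmd)
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.INHERIT)
                    .start();
            int exit = p.waitFor();
            if (exit != 0) {
                throw new IOException("command exited with status " + exit);
            }

            // Step 3: read the result back.
            return new String(Files.readAllBytes(out), StandardCharsets.UTF_8).trim();
        } finally {
            // Clean up both temp files whether or not the run succeeded.
            Files.deleteIfExists(in);
            Files.deleteIfExists(out);
        }
    }
}
```

The command list would hold the perl interpreter followed by the path to the script. The returned string could then be stored with the document as a keyword field (Field.Keyword(name, value) in the Lucene 1.x API) so that searches can be restricted by language.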

mvh karl øie

On Wednesday 24 April 2002 13:46, lucene@libero.it wrote:
> Hi all,
> 
> I have found a very interesting library which is written in perl.
> The problem now is how I can use this library.
> 
> Anyway, the library is TextCat and you can find it at:
> 
> http://odur.let.rug.nl/~vannoord/TextCat/
> 
> Bye
> 
> Laura
> 
>
> > combined with that you could use an italian stop-word list to run statistics
> > on a page :-) ?!?
> > 
> > On Wednesday 24 April 2002 11:02, lucene@libero.it wrote:
> >
> > > Hi all,
> > > 
> > > I'm using Jobo for spidering web sites and lucene for indexing. The 
> > > problem is that I'd like to spider only Italian web sites. 
> > > How can I discover the country of a web site?
> > > 
> > > Do you know of a method that you can suggest?
> > > 
> > > Thanks
> > > 
> > > 
> > > Laura
> > > 
> > 
> > --
> > To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>



