poi-dev mailing list archives

From "Amrun" <Am...@gmx.at>
Subject Re: function to get only plain text?
Date Sat, 24 Apr 2004 13:37:35 GMT
I'm sorry for my unclear explanation.
I am trying to analyse Word documents in order to index them for searching,
e.g. I remove stopwords and so on. I use POI to get the content of a document,
but as I said before, I also get "keywords" like "HYPERLINK" that I don't need.
So I would like to know whether I have to remove these keywords on my own, or
whether there is an option I can pass to POI so that these words are ignored.
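
A minimal sketch of what "removing them on my own" could look like. It rests on
an assumption: that the extracted text still contains the control characters
Word's binary format uses to delimit a field (0x13 before the field code, 0x14
between the code and the visible result, 0x15 at the end). Whether they survive
depends on how the text is extracted. FieldCodeFilter and stripFieldCodes are
hypothetical names for illustration, not part of POI.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper, not part of POI: drops Word field codes, keeps field results. */
final class FieldCodeFilter {
    // Assumed field delimiters as they appear in text taken from a binary .doc file.
    private static final char FIELD_BEGIN = 0x13;     // starts the code, e.g. HYPERLINK "..."
    private static final char FIELD_SEPARATOR = 0x14; // ends the code, starts the visible result
    private static final char FIELD_END = 0x15;       // closes the field

    private FieldCodeFilter() {}

    static String stripFieldCodes(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // One entry per open field: TRUE while we are still inside its code part.
        Deque<Boolean> inCode = new ArrayDeque<>();
        int hidden = 0; // number of open fields that are currently in their code part

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == FIELD_BEGIN) {
                inCode.push(Boolean.TRUE);
                hidden++;
            } else if (c == FIELD_SEPARATOR) {
                if (!inCode.isEmpty()) {
                    if (inCode.pop()) {
                        hidden--; // the code part of the innermost field is over
                    }
                    inCode.push(Boolean.FALSE);
                }
            } else if (c == FIELD_END) {
                if (!inCode.isEmpty() && inCode.pop()) {
                    hidden--; // field had no separator, so no visible result
                }
            } else if (hidden == 0) {
                out.append(c); // ordinary text or a visible field result
            }
        }
        return out.toString();
    }
}
```

The delimiter characters themselves are always dropped, nested fields are
handled through the stack, and text without any field markers passes through
unchanged, so the filter can run before the stopword removal.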

thx Amrun


> I don't know exactly what you are expecting.
> 
> If this is about Word documents and I have understood you correctly, then
> use the API from http://www.textmining.org. It is basically
> built using only the POI API.
> 
> regards
> 
> 
> --- Amrun <Amrun@gmx.at> wrote:
> > Hi, 
> > 
> > is there a function in POI to get only the plain text without "keywords"
> > like "HYPERLINK" (if a www-url is included in the text)? Or do I need to
> > filter them on my own?
> > 
> > thx for help
> > Amrûn
> > 



---------------------------------------------------------------------
To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: poi-dev-help@jakarta.apache.org

