poi-user mailing list archives

From "Koundinya (Sudhakar Chavali)" <sudhakar_koundi...@yahoo.com>
Subject Silly mistake in textmining
Date Tue, 30 Mar 2004 11:43:11 GMT
Hi Ryan,

I have identified a simple mistake in the WordExtractor code of
textmining. Have a look at the following code from the
extractText method:

// code snippet of extractText method
    while (runIt.hasNext())
    {
      CHPX chpx = (CHPX)runIt.next();
      boolean deleted = isDeleted(chpx.getGrpprl());
      if (deleted)
      {
        continue;
      }

      int runStart = chpx.getStart();
      int runEnd = chpx.getEnd();

      while (runStart >= currentTextEnd) // possibility of raising exceptions
      {
        currentPiece = (TextPiece) textIt.next (); // because of this :(
        currentTextStart = currentPiece.getStart ();
        currentTextEnd = currentPiece.getEnd ();
      }


---------------------------------------------------

The mistake is in this line:

      while (runStart >= currentTextEnd)

It should be:

      while (runStart >= currentTextEnd && textIt.hasNext())

Otherwise textIt.next() can be called when no text pieces are left, and
the parser raises an exception for certain documents. I hit this
problem with at least 2 out of 100 documents.
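
To show the guarded loop in isolation, here is a minimal sketch. It is
not the actual textmining code: the Piece class and the advance method
are hypothetical stand-ins for TextPiece and the inner loop of
extractText, and textIt is assumed to be a plain java.util.Iterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class GuardedPieceAdvance {
    /** Hypothetical stand-in for TextPiece: a [start, end) character range. */
    static final class Piece {
        final int start;
        final int end;
        Piece(int start, int end) { this.start = start; this.end = end; }
    }

    /**
     * Advances through the pieces until one ends after runStart, or until
     * the iterator is exhausted. Returns the end offset of the last piece read.
     */
    static int advance(Iterator<Piece> textIt, int runStart, int currentTextEnd) {
        // The hasNext() check is the fix: without it, next() is called on an
        // exhausted iterator whenever a run starts past the last piece.
        while (runStart >= currentTextEnd && textIt.hasNext()) {
            Piece currentPiece = textIt.next();
            currentTextEnd = currentPiece.end;
        }
        return currentTextEnd;
    }

    public static void main(String[] args) {
        List<Piece> pieces = Arrays.asList(new Piece(0, 10), new Piece(10, 20));
        // Run inside the first piece: the loop stops after one step.
        System.out.println(advance(pieces.iterator(), 5, 0));  // prints 10
        // Run past the last piece: the loop ends instead of throwing.
        System.out.println(advance(pieces.iterator(), 25, 0)); // prints 20
    }
}
```

In the second call the loop exits with runStart still beyond
currentTextEnd, so the calling code may want to skip such a run rather
than extract text for it.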

Regards
Sudhakar


