poi-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 48075] New: Broken paragraph to text mapping in some documents
Date Wed, 28 Oct 2009 14:00:03 GMT
https://issues.apache.org/bugzilla/show_bug.cgi?id=48075

           Summary: Broken paragraph to text mapping in some documents
           Product: POI
           Version: 3.5-dev
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HWPF
        AssignedTo: dev@poi.apache.org
        ReportedBy: max.valjanski@gmail.com


WordExtractor.getParagraphText() extracts incomplete and broken text from the
attached document. However, WordExtractor.getTextFromPieces() extracts the
complete, correct text (the same as shown in MS Office).

It seems that there is a problem in paragraph to text mapping.
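
To narrow down where the two results start to disagree, something like the
following could be used. It is only a rough sketch, not part of POI: the class
ExtractionDiff and its methods are hypothetical helpers, and the inputs are
assumed to be the String[] from getParagraphText() and the String from
getTextFromPieces(). It also assumes that line-ending differences between the
two results are not significant, so they are normalized before comparing.

```java
// Hypothetical diagnostic helper (not a POI class). It compares the text
// assembled from per-paragraph strings with the text taken from the pieces.
final class ExtractionDiff {

    // Concatenate the paragraph strings in document order.
    static String join(String[] paragraphs) {
        StringBuilder sb = new StringBuilder();
        for (String p : paragraphs) {
            sb.append(p);
        }
        return sb.toString();
    }

    // Assumption: "\r\n" and "\r" both mark a line end and should compare equal.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    // Returns the offset (in normalized text) of the first differing character,
    // or -1 if both texts are identical after normalization. If one text is a
    // prefix of the other, the length of the shorter text is returned.
    static int firstDivergence(String[] paragraphs, String piecesText) {
        String a = normalize(join(paragraphs));
        String b = normalize(piecesText);
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return a.length() == b.length() ? -1 : n;
    }

    public static void main(String[] args) {
        // Stand-in data; in practice these would come from WordExtractor.
        String[] paragraphs = {"First paragraph.\r", "Secnd paragraph.\r"};
        String pieces = "First paragraph.\rSecond paragraph.\r";
        System.out.println("First divergence at offset: "
                + firstDivergence(paragraphs, pieces));
    }
}
```

Printing some context around the reported offset from both texts should show
which paragraph is mapped to the wrong text range.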

The problem occurs with a few documents from the same source; text extraction
from many other documents works fine.

POI version: poi-3.6-beta1-20091002 (svn trunk)


