jackrabbit-dev mailing list archives

From Martin Chalupka <martin.chalu...@gmx.net>
Subject best practices for searching binary content?
Date Wed, 04 May 2005 15:12:08 GMT
Hello,

What is the best practice for managing searchable binary content (such as Word
or PDF documents) in Jackrabbit?
I am thinking about extracting the text with a tool like Apache Jakarta POI and
writing it to the repository as text content, with a structure like this:

mynt:wordDocument
 |
 +- nt:unstructured (stripped text goes here)
 |
 +- nt:file (word doc as binary goes here)
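
To make the idea concrete, here is a minimal sketch in plain Java. It does not
use the JCR or Jackrabbit API at all: the names WordDocumentSketch, Entry, store,
search and extractText are made up for illustration, and extractText is only a
stand-in for a real extractor such as POI. The point is just that the extracted
text and the original binary are stored side by side, and searches look only at
the text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical in-memory model of the proposed layout; NOT the JCR or Jackrabbit API.
final class WordDocumentSketch {

    /** One document entry: the extracted text plus the untouched original bytes. */
    static final class Entry {
        final String name;
        final String strippedText; // would live in the nt:unstructured child
        final byte[] binary;       // would live in the nt:file child

        Entry(String name, String strippedText, byte[] binary) {
            this.name = name;
            this.strippedText = strippedText;
            this.binary = binary.clone();
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Placeholder for a real extractor; here it simply decodes the bytes as UTF-8. */
    static String extractText(byte[] binary) {
        return new String(binary, StandardCharsets.UTF_8);
    }

    /** Stores the binary together with the text extracted from it. */
    void store(String name, byte[] binary) {
        entries.add(new Entry(name, extractText(binary), binary));
    }

    /** Case-insensitive substring search over the extracted text only. */
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Entry e : entries) {
            if (e.strippedText.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(e.name);
            }
        }
        return hits;
    }
}
```

In a real repository the two fields of Entry would correspond to the two child
nodes shown above, and the search would go through the repository's query
facilities rather than a loop.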
 
 
Would that be the right way?
 
Greetings,
Martin


