manifoldcf-dev mailing list archives

From chalitha udara Perera <chalithaud...@gmail.com>
Subject Repository document stream empty after Tika Transformation
Date Fri, 17 Jul 2015 11:25:46 GMT
Hi All,

I'm writing a transformation connector to extract low-level features from
images. When I first used the connector without the Tika extractor, it worked
fine. But when I place it after the Tika connector in the pipeline, it fails
to extract features. After debugging, I found that the stream is empty
after the Tika transformation.
Inside the Tika connector, a new in-memory or file-backed output stream is
created, but the original input stream is never copied to it. The connector
should reset the binary stream after using it to extract metadata, so that
the original input stream remains available from connector to connector.

I have attached a simple fix that copies and resets the stream, which worked
for me.
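
The attachment itself is not reproduced here. As a rough illustration of the
copy-and-reset idea only (not the attached patch, and not the actual
ManifoldCF or Tika connector code), a minimal sketch in plain Java might look
like the following. The class name StreamCopyReset and the helper methods
readFully and extractMetadata are hypothetical; extractMetadata is just a
stand-in for any step that consumes the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamCopyReset {

    // Copy the whole input stream into memory so it can be replayed later.
    // (Hypothetical helper; very large documents would call for a temp file.)
    public static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for a metadata extraction step: it reads the stream to the
    // end and returns the number of bytes it saw.
    public static int extractMetadata(InputStream in) {
        try {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream original = new ByteArrayInputStream(
                "fake image bytes".getBytes(StandardCharsets.UTF_8));

        // Copy once, then work on a replayable in-memory stream.
        ByteArrayInputStream replay =
                new ByteArrayInputStream(readFully(original));

        int seen = extractMetadata(replay);   // consumes the stream
        replay.reset();                       // rewind to the start

        // The next connector in the chain can now read the same bytes.
        int seenDownstream = extractMetadata(replay);
        System.out.println(seen == seenDownstream);
    }
}
```

Without the reset() call, the second read would find the stream already at
its end, which matches the empty stream I saw after the Tika step.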

Thanks,
Chalitha

-- 
J.M Chalitha Udara Perera

*Department of Computer Science and Engineering,*
*University of Moratuwa,*
*Sri Lanka*
