oodt-dev mailing list archives

From Tyler Palsulich <tpalsul...@gmail.com>
Subject Re: Tyler - I may need your help
Date Wed, 21 Jan 2015 22:13:27 GMT
Hi Val,

Hmm... Is there a particular (wrong) mime-type that keeps getting detected
(like text/plain, or something)? I'm curious if the type is just returning
a default. Or, is it a seemingly random file type? What are the contents of
your mime-types.xml file? If it differs from
https://raw.githubusercontent.com/apache/tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml,
can you try copying it over?
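
To make the "default vs. random" question concrete, here is a rough sketch
in plain Java (not OODT or Tika code). The class and method names are made
up for illustration, and the list in main() is placeholder data; you would
substitute the strings your log messages report from
this.mimeRepo.getMimeType(file). If every file comes back with the same
type, that hints at a fallback default rather than per-file misdetection.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of OODT or Tika.
class MimeTally {
    // Counts how many times each detected mime-type string occurs.
    public static Map<String, Integer> tally(List<String> detectedTypes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String type : detectedTypes) {
            counts.merge(type, 1, Integer::sum);
        }
        return counts;
    }

    // True when every file received the same type, which suggests
    // the detector is falling back to a single default.
    public static boolean looksLikeDefault(List<String> detectedTypes) {
        return !detectedTypes.isEmpty() && tally(detectedTypes).size() == 1;
    }

    public static void main(String[] args) {
        // Placeholder values; replace with the types taken from your logs.
        List<String> detected = List.of(
                "application/octet-stream",
                "application/octet-stream",
                "application/octet-stream");
        System.out.println(tally(detected));
        System.out.println(looksLikeDefault(detected));
    }
}
```

If the tally shows a mix of unrelated types instead, the problem is more
likely in the detection rules themselves than in a missing or unread
mime-types.xml.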

I'm not sure I'll be able to replicate your error on my computer without a
bit of difficulty. Do you think there is any way you could create a JUnit
test case with the problem?

Tyler


On Wed, Jan 21, 2015 at 1:26 PM, Mallder, Valerie <
Valerie.Mallder@jhuapl.edu> wrote:

> Hi Tyler,
>
> I have been looking into an issue that cropped up in my OODT system when
> I upgraded to OODT 0.8. The issue is, my AutoDetectProductCrawler, which is
> launched from a PGETaskInstance, is unable to determine the mime-type for my
> product files.  I am using the same filemgr/etc/mime-types.xml file that I
> was using with OODT 0.7, and I am using the same
> oodt/extensions/policy/mime-extractor-map.xml file that I was using with
> OODT 0.7, but now, in MimeTypeRepo::getExtractorSpecsForFile, the call to
> this.mimeRepo.getMimeType(file) is returning the wrong mime-types for all
> of my files, and so the AutoDetectProductCrawler is telling me I have no
> extractor specs for my files.
>
> I noticed that you did some work on MimeTypeUtils for OODT-630 in OODT
> 0.8. At first glance, it doesn't look like any of this work would be
> directly responsible. Can you think of anything that might be causing this
> to happen? I don't know anything about Tika. Do I need to make any changes
> to my policy files to remain compatible?  Just looking for clues on how to
> resolve this.  I have verified by adding log messages throughout the code
> that, prior to launching the AutoDetectProductCrawler, all of the policy
> files are read correctly. The MimeExtractorConfigReader is reading the
> correct mime-extractor-map.xml file, and it is calling setMimeRepoFile with
> the correct mime-types.xml file, and it is setting the correct extractor
> config file, etc. But once AutoDetectProductCrawler starts crawling, it
> calls getExtractorSpecsForFile, determines the wrong mime type, and then
> can't find the extractor spec.
>
> Thanks,
> Val
>
>
>
> Valerie A. Mallder
>
> New Horizons Deputy Mission System Engineer
> The Johns Hopkins University/Applied Physics Laboratory
> 11100 Johns Hopkins Rd (MS 23-282), Laurel, MD 20723
> 240-228-7846 (Office) 410-504-2233 (Blackberry)
>
>
