oodt-commits mailing list archives

From "Lewis John McGibbney (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OODT-754) contribute ProdTypePatternMetExtractor
Date Thu, 25 Sep 2014 01:11:33 GMT

    [ https://issues.apache.org/jira/browse/OODT-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147207#comment-14147207 ]

Lewis John McGibbney commented on OODT-754:
-------------------------------------------

[~rickdn] this is an excellent idea. [~skhudiky] and I were discussing this today, and
it is certainly a shortcoming of the other extractor implementations that they do not
account for the following case.
Say you have a file named AAAA-BB-CCCCCC-DD.png which you wish to treat as a product.
 * AAAA represents the instrument/device which produced the picture
 * BB is an identifier for the project the picture was produced for
 * CCCCCC is the date, e.g. YYMMDD
 * DD is the number of products produced on that date for that project by that instrument.
What happens if DD > 99?
The file name no longer fits the expected two-digit field, so the FileNameExtractor (or
whatever it is called) policy breaks and we begin ingesting incorrect information.
The extractor you describe on the wiki makes cases like the above much easier to deal
with.
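
To make the failure concrete, here is a minimal sketch in plain Java. It does not use any OODT class or configuration; the class name, the metadata key names, the sample file names and the assumption that the existing policy slices the name at fixed character offsets are all hypothetical and only illustrate the case above. The regex variant shows one way a pattern-based approach could tolerate a wider DD field; it is not the extractor proposed on the wiki.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: not part of OODT.
class FilenameFieldsSketch {

    // Fixed-offset slicing: assumes every field has a constant width,
    // so a three-digit DD is silently cut down to its first two digits.
    static Map<String, String> byOffsets(String name) {
        Map<String, String> met = new LinkedHashMap<>();
        met.put("Instrument", name.substring(0, 4));
        met.put("Project", name.substring(5, 7));
        met.put("Date", name.substring(8, 14));
        met.put("Count", name.substring(15, 17));
        return met;
    }

    // Pattern with named groups: the count may be two or more digits.
    private static final Pattern NAME = Pattern.compile(
        "(?<instrument>[A-Za-z0-9]{4})-(?<project>[A-Za-z0-9]{2})"
        + "-(?<date>\\d{6})-(?<count>\\d{2,})\\.png");

    static Map<String, String> byPattern(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            // Fail loudly instead of ingesting wrong metadata.
            throw new IllegalArgumentException("Unexpected file name: " + name);
        }
        Map<String, String> met = new LinkedHashMap<>();
        met.put("Instrument", m.group("instrument"));
        met.put("Project", m.group("project"));
        met.put("Date", m.group("date"));
        met.put("Count", m.group("count"));
        return met;
    }

    public static void main(String[] args) {
        String[] names = {"AAAA-BB-140925-07.png", "AAAA-BB-140925-123.png"};
        for (String name : names) {
            System.out.println(name);
            System.out.println("  offsets: " + byOffsets(name));
            System.out.println("  pattern: " + byPattern(name));
        }
    }
}
```

For the second file name the offset-based version reports a count of 12 without any error, while the pattern-based version reports 123 and rejects names that do not match at all.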
Thanks 

> contribute ProdTypePatternMetExtractor
> --------------------------------------
>
>                 Key: OODT-754
>                 URL: https://issues.apache.org/jira/browse/OODT-754
>             Project: OODT
>          Issue Type: New Feature
>          Components: metadata container
>            Reporter: Ricky Nguyen
>            Assignee: Ricky Nguyen
>             Fix For: 0.8
>
>
> There has been renewed interest in implementing the ProdTypePatternMetExtractor proposed [here|https://cwiki.apache.org/confluence/display/OODT/MetExtractors+for+Crawler].
> I was going to add it to the "metadata" module under the "org.apache.oodt.cas.metadata.extractors" package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
