oodt-user mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Two File Manager Questions
Date Wed, 09 Nov 2011 16:08:29 GMT
Hey Tim,

Thanks for your questions. My comments below:

On Nov 8, 2011, at 3:00 PM, Stough, Timothy M (388F) wrote:

> Hello All,
> 
> I've been using CAS for a while and have it mostly working.  I have a couple of issues
> with versioning and search that I need some help with.  See below.
> 
> Thanks in advance!
> Tim.
> 
> A) Getting the right versioning behavior.
> 
> 1) I need to have versioning set up so that multiple products of the same name can be
> ingested and a version number incremented.

Right now the ProductionDateTimeVersioner will do this, using the Date/Time as a
versioning key. However, the file name actually changes on disk when it performs
this operation.

You're probably going to have to roll your own Versioner and client-side Met Extractor
for this particular use case. Do you want the actual product name in the catalog to be
the same? If so, then you don't need a client-side Met Extractor. If you want the version
number in the Product/File name, then you'll need one, and you'll need to compute the
version number there.

As for the Versioner, you can model it off of the MetadataBasedFileVersioner. Just define
a file path spec like:

/some/path/[FilenameBase]_[Version]_[Ext]

Then, compute FilenameBase, Version, and Ext from the metadata, and call
super.createDataStoreReferences with your computed/derived met. The derived met won't be
plumbed back to the catalog (because the met object is read-only at that point), but it
will be used in the versioning process. DirectoryProductVersioner does something like
this if you want to see an example.
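
To make the path spec idea concrete, here is a minimal, standalone sketch of the substitution step. It does not use any OODT classes: the class name PathSpecSketch, the fill helper, and the plain Map standing in for the metadata object are all hypothetical, and the real versioner may resolve keys differently.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the OODT API.
public class PathSpecSketch {

    // Matches placeholders such as [FilenameBase] or [Version].
    private static final Pattern KEY = Pattern.compile("\\[([A-Za-z0-9_]+)\\]");

    // Replaces every [Key] in the spec with the value stored under that key.
    public static String fill(String spec, Map<String, String> met) {
        Matcher m = KEY.matcher(spec);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = met.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException(
                        "No metadata value for key: " + m.group(1));
            }
            // quoteReplacement guards against '$' or '\' in the value.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed values; in practice these would be derived from the product metadata.
        Map<String, String> met = new HashMap<>();
        met.put("FilenameBase", "ALPSRP225250610-H1.0__A");
        met.put("Version", "2");
        met.put("Ext", "dat");
        System.out.println(fill("/some/path/[FilenameBase]_[Version]_[Ext]", met));
    }
}
```

With the assumed values above, the spec resolves to /some/path/ALPSRP225250610-H1.0__A_2_dat. How Version itself is computed (a counter, a lookup of existing products, and so on) is left open here.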

> 
> 2) When I search, if multiple versions are available, I only need to return the most
> recent.

Yep, that's the default behavior -- you can just pop the latest one off the top.

> 
> B) Disallowed characters in product names?
> 
> My product names look like this:  "ALPSRP225250610-H1.0__A"  If I try to search for a
> product by name using a Lucene query through query_tool, the query breaks across the "_".
> What I see in the log is:
> 
> WARNING: Query: [q=Filename:ALPSRP154650650-H1.0 AND Filename:A] for Product Type: [urn:oodt:GenericFile] returned no results
> 
> So it looks like the __ gets turned into an "AND".  What's the deal and how do I fix
> it?

Yep, this has to do with the LuceneQuery analysis that's going on with the CASAnalyzer.
You may want to simply use the SQLQuery interface.
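
As a rough illustration of the effect in your log, the sketch below splits a value on underscores and joins the pieces with AND. This is not the CASAnalyzer code; the class UnderscoreSplitSketch and its toQuery helper are hypothetical and only imitate the behavior the WARNING line shows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; it mimics the logged behavior, not the real analyzer.
public class UnderscoreSplitSketch {

    // Splits the value on runs of underscores and ANDs the resulting terms.
    public static String toQuery(String field, String value) {
        List<String> clauses = new ArrayList<>();
        for (String token : value.split("_+")) {
            if (!token.isEmpty()) {
                clauses.add(field + ":" + token);
            }
        }
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(toQuery("Filename", "ALPSRP225250610-H1.0__A"));
    }
}
```

A single product name ends up as two separate terms, and neither term matches the stored file name on its own, which would explain the empty result.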

Thanks!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

