chemistry-dev mailing list archives

From Jens Hübel (Commented) (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CMIS-515) Enhance command line client in test-util to upload local files with metadata extraction
Date Tue, 27 Mar 2012 08:03:50 GMT

    [ https://issues.apache.org/jira/browse/CMIS-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239293#comment-13239293 ]

Jens Hübel commented on CMIS-515:
---------------------------------


The command line test client gets a new command and new parameters. The following types of
commands are now supported (connection parameters omitted here):

org.apache.chemistry.opencmis.client.main.ObjGenApp --Command=CopyFiles --File=picture.jpg --RepositoryId=A1
This uploads a single file to repository A1.

org.apache.chemistry.opencmis.client.main.ObjGenApp --Command=CopyFiles --Dir=D:\temp\mymedia --RepositoryId=A1
This uploads all files recursively from a directory to repository A1.

More parameters are available; see the usage output for details.

Metadata extraction and the mapping from MIME types to CMIS document types are configurable in
a properties file called mapping.properties, which must be located on the classpath.
The default parser for metadata extraction is Apache Tika. Custom parsers can be added by configuration.
Metadata extraction is optional, and a default type for files without extraction can be configured.
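
To illustrate how such a mapping could be evaluated, here is a minimal sketch that resolves a
MIME type to a CMIS type id. It uses the key names from the sample configuration in this
message and the pattern syntax described there (exact type, image/*, or a colon separated
list). The class and method names are hypothetical and this is not the actual client code;
the image entry is set to image/* here only to show the wildcard case.

```java
import java.util.Properties;

public class TypeMappingSketch {

    /** Returns true if mimeType matches one entry of a colon separated pattern list. */
    static boolean matches(String patternList, String mimeType) {
        for (String pattern : patternList.split(":")) {
            String p = pattern.trim();
            if (p.endsWith("/*")) {
                // wildcard such as image/*: compare only the major type and the slash
                if (mimeType.startsWith(p.substring(0, p.length() - 1))) {
                    return true;
                }
            } else if (p.equalsIgnoreCase(mimeType)) {
                return true;
            }
        }
        return false;
    }

    /** Resolves the CMIS type id for a MIME type, falling back to the default document type. */
    static String resolveTypeId(Properties cfg, String mimeType) {
        for (String key : cfg.getProperty("mapping.contentTypes", "").split(",")) {
            String k = key.trim();
            if (k.isEmpty()) {
                continue;
            }
            String patterns = cfg.getProperty("mapping.contentType." + k);
            String typeId = cfg.getProperty("mapping.contentType." + k + ".typeId");
            if (patterns != null && typeId != null && matches(patterns, mimeType)) {
                return typeId.trim();
            }
        }
        return cfg.getProperty("mapping.contentType.default.document", "cmis:document").trim();
    }

    /** A reduced configuration for the demo (assumption: a subset of the delivered sample). */
    static Properties sampleConfig() {
        Properties cfg = new Properties();
        cfg.setProperty("mapping.contentTypes", "image, pdf, email");
        cfg.setProperty("mapping.contentType.default.document", "cmis:document");
        cfg.setProperty("mapping.contentType.image", "image/*");
        cfg.setProperty("mapping.contentType.pdf", "application/pdf");
        cfg.setProperty("mapping.contentType.email", "application/vnd.ms-outlook:message/rfc822");
        cfg.setProperty("mapping.contentType.image.typeId", "exifImage");
        cfg.setProperty("mapping.contentType.pdf.typeId", "pdfDocument");
        cfg.setProperty("mapping.contentType.email.typeId", "emailDocument");
        return cfg;
    }

    public static void main(String[] args) {
        Properties cfg = sampleConfig();
        System.out.println(resolveTypeId(cfg, "image/png"));      // exifImage
        System.out.println(resolveTypeId(cfg, "message/rfc822")); // emailDocument
        System.out.println(resolveTypeId(cfg, "video/mp4"));      // cmis:document
    }
}
```

A type without a matching entry falls back to mapping.contentType.default.document, which
corresponds to the "default type for files without extraction" mentioned above.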

MIME type recognition is done by Apache Tika and can be overridden based on the file extension.
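
A minimal sketch of how such an extension-based override could work, assuming the
mapping.contentType.forceContentType.<extension> keys from the sample configuration in this
message. The class and method names are hypothetical, and the detected type is passed in as a
plain string instead of calling Tika so that the example stays self-contained.

```java
import java.util.Locale;
import java.util.Properties;

public class ForcedContentTypeSketch {

    static final String PREFIX = "mapping.contentType.forceContentType.";

    /**
     * Returns the forced MIME type configured for the file extension,
     * or the detected type if no override is configured.
     */
    static String effectiveMimeType(Properties cfg, String fileName, String detected) {
        int dot = fileName.lastIndexOf('.');
        if (dot >= 0 && dot < fileName.length() - 1) {
            // assumption: extensions are compared in lower case
            String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
            String forced = cfg.getProperty(PREFIX + ext);
            if (forced != null && !forced.trim().isEmpty()) {
                return forced.trim();
            }
        }
        return detected;
    }

    /** The two overrides from the sample configuration. */
    static Properties sampleConfig() {
        Properties cfg = new Properties();
        cfg.setProperty(PREFIX + "mp4", "video/mp4");
        cfg.setProperty(PREFIX + "webm", "video/webm");
        return cfg;
    }

    public static void main(String[] args) {
        Properties cfg = sampleConfig();
        // an .mp4 file is forced to video/mp4 regardless of the detected type
        System.out.println(effectiveMimeType(cfg, "clip.mp4", "application/octet-stream"));
        // no override for .jpg, so the detected type is kept
        System.out.println(effectiveMimeType(cfg, "picture.jpg", "image/jpeg"));
    }
}
```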

The InMemory server comes with default types to support the default configuration for
Office files, pictures, audio, emails and PDF.

A sample configuration with comments looks like this (delivered as default):



# configuration file describing how to map file metadata to CMIS types and properties
mapping.contentTypes = image, mp3, pdf, office, email

#default CMIS type id for those files/folders without a special mapping configured
mapping.contentType.default.document = cmis:document
mapping.contentType.default.folder = cmis:folder

# configure the MIME types for each key in value of mapping.contentTypes 
# syntax can be like image/jpeg or image/* or colon separated list image/jpeg:image/tiff:image/png
mapping.contentType.mp3 = audio/mpeg
mapping.contentType.image = image/jpeg
mapping.contentType.pdf = application/pdf
mapping.contentType.office = application/vnd.openxmlformats-officedocument.presentationml.presentation:application/vnd.openxmlformats-officedocument.presentationml.template:application/vnd.openxmlformats-officedocument.presentationml.slideshow:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet:application/vnd.openxmlformats-officedocument.spreadsheetml.template:application/vnd.openxmlformats-officedocument.wordprocessingml.document:application/vnd.openxmlformats-officedocument.wordprocessingml.template
mapping.contentType.email = application/vnd.ms-outlook:message/rfc822

# CMIS type ids mapped to each key in value of mapping.contentTypes
mapping.contentType.mp3.typeId = audioFile
mapping.contentType.image.typeId = exifImage
mapping.contentType.pdf.typeId = pdfDocument
mapping.contentType.office.typeId = officeDocument
mapping.contentType.email.typeId = emailDocument

# classes for parsers and property mappers, usually MetadataParserTika and PropertyMapperTika
# can be overridden with special implementations for other parsers, MetadataParserTika and
# PropertyMapperTika are the default if not specified
mapping.contentType.image.parserClass = org.apache.chemistry.opencmis.client.parser.MetadataParserExif
mapping.contentType.image.mapperClass = org.apache.chemistry.opencmis.client.mapper.PropertyMapperExif
mapping.contentType.mp3.parserClass = org.apache.chemistry.opencmis.client.parser.MetadataParserTika
mapping.contentType.mp3.mapperClass = org.apache.chemistry.opencmis.client.mapper.PropertyMapperTika
mapping.contentType.pdf.parserClass = org.apache.chemistry.opencmis.client.parser.MetadataParserTika
mapping.contentType.pdf.mapperClass = org.apache.chemistry.opencmis.client.mapper.PropertyMapperTika
mapping.contentType.office.parserClass = org.apache.chemistry.opencmis.client.parser.MetadataParserTika
mapping.contentType.office.mapperClass = org.apache.chemistry.opencmis.client.mapper.PropertyMapperTika
mapping.contentType.email.parserClass = org.apache.chemistry.opencmis.client.parser.MetadataParserTika
mapping.contentType.email.mapperClass = org.apache.chemistry.opencmis.client.mapper.PropertyMapperTika

# content types that are not detected by Tika, or that you want to treat differently,
# can be overridden depending on the file extension (.xyz to foo/bar)
mapping.contentType.forceContentType.mp4 = video/mp4
mapping.contentType.forceContentType.webm = video/webm

# CMIS properties in use
# a standard syntax for a Tika parser
mapping.contentType.mp3.id.xmpDM\:artist = artist
mapping.contentType.mp3.id.xmpDM\:album = album
mapping.contentType.mp3.id.title = title
mapping.contentType.mp3.id.xmpDM\:logComment = comment
mapping.contentType.mp3.id.xmpDM\:genre = genre
mapping.contentType.mp3.id.xxx = length
mapping.contentType.mp3.id.xmpDM\:trackNumber = track
mapping.contentType.mp3.id.xmpDM\:releaseDate = year
mapping.contentType.mp3.id.xmpDM\:composer = composer
mapping.contentType.mp3.id.yyy = discNumber
mapping.contentType.mp3.id.xmpDM\:audioCompressor = audioFormat
mapping.contentType.mp3.id.xmpDM\:audioSampleRate = sampleRate
mapping.contentType.mp3.id.xmpDM\:audioChannelType = audioChannelType
mapping.contentType.mp3.id.channels = noChannels
mapping.contentType.mp3.id.version = compressorVersion

# images tags in exif directory
# This is an example for a custom parser with a substructure in the tags
mapping.contentType.image.exif.id.0x0100 = imageWidth
mapping.contentType.image.exif.id.0x0101 = imageHeight
mapping.contentType.image.exif.id.0x0102 = bitsPerSample
mapping.contentType.image.exif.id.0x0103 = compression
mapping.contentType.image.exif.id.0x0106 = photometricInterpretation
mapping.contentType.image.exif.id.0x010e = imageDescription
mapping.contentType.image.exif.id.0x010f = make
mapping.contentType.image.exif.id.0x0110 = model
mapping.contentType.image.exif.id.0x0112 = orientation       
mapping.contentType.image.exif.id.0x011a = xResolution
mapping.contentType.image.exif.id.0x011b = yResolution
mapping.contentType.image.exif.id.0x0128 = resolutionUnit
mapping.contentType.image.exif.id.0x0131 = software
mapping.contentType.image.exif.id.0x0132 = dateTime
mapping.contentType.image.exif.id.0x013b = artist
mapping.contentType.image.exif.id.0x0213 = yCbCrPositioning  
mapping.contentType.image.exif.id.0xa406 = sceneCaptureType
mapping.contentType.image.exif.id.0x8298 = copyright
mapping.contentType.image.exif.id.0x829a = exposureTime
mapping.contentType.image.exif.id.0x829d = fNumber
mapping.contentType.image.exif.id.0x8822 = exposureProgram
mapping.contentType.image.exif.id.0x8827 = isoSpeed
mapping.contentType.image.exif.id.0x8825 = gpsAltitudeRef-0x0005
mapping.contentType.image.exif.id.0x882b = selfTimerMode
mapping.contentType.image.exif.id.0x882a = timeZoneOffset
mapping.contentType.image.exif.id.0x9003 = dateTimeOriginal
mapping.contentType.image.exif.id.0x9004 = createDate
mapping.contentType.image.exif.id.0x9201 = shutterSpeedValue
mapping.contentType.image.exif.id.0x9202 = apertureValue
mapping.contentType.image.exif.id.0x9203 = brightnessValue
mapping.contentType.image.exif.id.0x9204 = exposureCompensation
mapping.contentType.image.exif.id.0x9205 = maxApertureValue
mapping.contentType.image.exif.id.0x9207 = meteringMode
mapping.contentType.image.exif.id.0x9206 = subjectDistance
mapping.contentType.image.exif.id.0x9208 = lightSource
mapping.contentType.image.exif.id.0x9209 = flash
mapping.contentType.image.exif.id.0x920a = focalLength
mapping.contentType.image.exif.id.0x9286 = userComment
mapping.contentType.image.exif.id.0xa001 = colorSpace
mapping.contentType.image.exif.id.0xa002 = pixelXDimension
mapping.contentType.image.exif.id.0xa003 = pixelYDimension
mapping.contentType.image.exif.id.0xa402 = exposureMode
mapping.contentType.image.exif.id.0xa403 = whiteBalance
mapping.contentType.image.exif.id.0xa420 = imageUniqueId
mapping.contentType.image.exif.id.0xa430 = ownerName
mapping.contentType.image.exif.id.0xa431 = serialNumber
mapping.contentType.image.exif.id.0x4746 = rating
mapping.contentType.image.exif.id.0x4749 = ratingPercent
# gps directory
mapping.contentType.image.gps.id.0x0001 = gpsLatitudeRef
mapping.contentType.image.gps.id.0x0002 = gpsLatitude
mapping.contentType.image.gps.id.0x0003 = gpsLongitudeRef
mapping.contentType.image.gps.id.0x0004 = gpsLongitude
mapping.contentType.image.gps.id.0x0005 = gpsAltitudeRef
mapping.contentType.image.gps.id.0x0006 = gpsAltitude
# jpeg directory 
mapping.contentType.image.jpeg.id.0x0000 = dataPrecision
mapping.contentType.image.jpeg.id.0x0001 = imageHeight
mapping.contentType.image.jpeg.id.0x0003 = imageWidth

# PDF type
mapping.contentType.pdf.id.xmpTPg\:NPages = noPages
mapping.contentType.pdf.id.title = title
mapping.contentType.pdf.id.author = author
mapping.contentType.pdf.id.creator = creator
mapping.contentType.pdf.id.Keywords = keywords
mapping.contentType.pdf.id.producer = producer
mapping.contentType.pdf.id.subject = subject
mapping.contentType.pdf.id.Creation-Date = createdDate
mapping.contentType.pdf.id.Last-Modified = modifiedDate
mapping.contentType.pdf.id.trapped  = trapped

# Office type
mapping.contentType.office.id.Application-Name = applicationName
mapping.contentType.office.id.Application-Version = applicationVersion
mapping.contentType.office.id.Author = author
mapping.contentType.office.id.Category = category
mapping.contentType.office.id.Content-Status = contentStatus
mapping.contentType.office.id.Comments = comments
mapping.contentType.office.id.Company = company
mapping.contentType.office.id.Keywords = keywords
mapping.contentType.office.id.Last-Author = lastAuthor
mapping.contentType.office.id.Manager = manager
mapping.contentType.office.id.Notes = notes
mapping.contentType.office.id.Presentation-Format = presentationFormat
mapping.contentType.office.id.Revision-Number = revisionNumber
mapping.contentType.office.id.Template = template
mapping.contentType.office.id.Version = version
mapping.contentType.office.id.Character-Count = characterCount
mapping.contentType.office.id.Character-Count-With-Spaces = characterCountWithSpaces
mapping.contentType.office.id.Word-Count = wordCount
mapping.contentType.office.id.Line-Count = lineCount
mapping.contentType.office.id.Page-Count = pageCount
mapping.contentType.office.id.Slide-Count = slideCount
mapping.contentType.office.id.Paragraph-Count = paragraphCount
mapping.contentType.office.id.Total-Time = totalTime
mapping.contentType.office.id.Edit-Time = editTime
mapping.contentType.office.id.Creation-Date = creationDate
mapping.contentType.office.id.Last-Save-Date = lastSaveDate
mapping.contentType.office.id.Last-Printed = lastPrinted

#email
mapping.contentType.email.id.Message-Recipient-Address = messageRecipientAddress
mapping.contentType.email.id.Message-From = from
mapping.contentType.email.id.Message-To = to
mapping.contentType.email.tokenizer.to = ;
mapping.contentType.email.id.Message-Cc = cc
mapping.contentType.email.tokenizer.cc = ;
mapping.contentType.email.id.Message-Bcc = bcc
mapping.contentType.email.tokenizer.bcc = ;
mapping.contentType.email.id.subject = subject
mapping.contentType.email.id.Creation-Date = creationDate
mapping.contentType.email.id.Last-Save-Date = lastSaveDate
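
A note on the backslashes in keys such as xmpDM\:artist: assuming the file is read with
java.util.Properties (the message only says it is a properties file on the classpath), an
unescaped colon ends the key, just like an equals sign. The following sketch shows the
difference; the class name is hypothetical and the second input line is a deliberately
unescaped variant made up for the demonstration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class EscapedColonSketch {

    /** Parses properties text the same way Properties.load() reads a file. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties p = parse(
              "mapping.contentType.mp3.id.xmpDM\\:artist = artist\n"  // colon escaped
            + "mapping.contentType.mp3.id.xmpDM:genre = genre\n");    // colon NOT escaped

        // Escaped: the colon is part of the key.
        System.out.println(p.getProperty("mapping.contentType.mp3.id.xmpDM:artist"));
        // prints "artist"

        // Unescaped: the key ends at the colon and the rest becomes the value.
        System.out.println(p.getProperty("mapping.contentType.mp3.id.xmpDM"));
        // prints "genre = genre"

        // The intended key does not exist in this case.
        System.out.println(p.getProperty("mapping.contentType.mp3.id.xmpDM:genre"));
        // prints "null"
    }
}
```

Tag names that contain a colon therefore need the backslash in every key where they appear.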



                
> Enhance command line client in test-util to upload local files with metadata extraction
> ---------------------------------------------------------------------------------------
>
>                 Key: CMIS-515
>                 URL: https://issues.apache.org/jira/browse/CMIS-515
>             Project: Chemistry
>          Issue Type: New Feature
>          Components: opencmis-test
>            Reporter: Jens Hübel
>            Assignee: Jens Hübel
>            Priority: Minor
>             Fix For: OpenCMIS 1.0.0
>
>
> For nicer demos it would be useful to have a tool that can upload files from the file
> system to a CMIS server. This could be enhanced by automatic metadata extraction and a mapping
> to CMIS types and properties. The InMemory server should get by default types that support
> metadata for a few standard file types.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
