From: Jonathan Koren
To: tika-dev@lucene.apache.org
Subject: Re: Using standard XMP schemas for image and audio metadata
Date: Tue, 10 Feb 2009 22:00:25 -0800
In-Reply-To: <510143ac0902090941k5c7ac9e9u9870b7a68d95842@mail.gmail.com>

On Feb 9, 2009, at 9:41 AM, Jukka Zitting wrote:

> Because they are useful pieces of metadata that are already accurately
> defined in the respective XMP schemas. I for example didn't propose
> changing the MIDI metadata key "patches", as AFAIK there is no
> standard schema that covers that piece of information.

That's what I realized after I slept on it.

>> You create a new class that takes the raw key-value pairs that stored in
>> Tika::Metadata and translates them to something else. Call it Metadata2XMP
>> or whatever. That can be packaged within Tika as a convenient class
>> that does least common denominator mapping in a well defined way.
>
> Having such a mapping class within Tika is an alternative, but as
> discussed in the Dublin Core thread [1] in December, I'm not sure if
> it's worth the added complexity. My proposal covers the use case with
> much less extra code or documentation.

Storing the raw metadata in Metadata according to its native ontology
is the most important thing, which seems to be the consensus from the
December thread you linked to.

--
Jonathan Koren
jonathan@soe.ucsc.edu
http://www.soe.ucsc.edu/~jonathan/