List-Id: "Jakarta Commons Users List"
Subject: RE: JMimeMagic (was [fileUpload] file content-type)
Date: Wed, 19 Apr 2006 16:45:35 +0200
From: Jörg Schaible
To: "Jakarta Commons Users List"

Andrea Spinelli wrote on Wednesday, April 19, 2006 4:05 PM:

> Jörg Schaible wrote:
>
>> 1) It is definitely not possible to build on jmimemagic code because
>> of licensing reasons
>
> Yep, but Mark wants to start a new one - reusing ideas is not
> forbidden ;-)
>
>> 2) Although jMimeMagic claims to use an imported magic file
>> itself, its magic.xml misses a lot of formats (e.g. tiff)
>> that have been present in file magic for ages
>
> Mark and his friends [including me, as far as I have some spare time]
> could read the files magic and magic.mime and generate something
> similar to magic.xml (or better).

The problem with an automated process here is that

a) due to the limitations with variable lengths, most magic bytes for
   the text formats have to be revised
b) the magic bytes in magic and magic.mime differ for the same formats
c) a+b prevent the automated process from being repeated

So the question is simply whether writing a generator is worth the
time for a one-time task. There is also the question of whether the
content of the two files should be merged at all; e.g. for "image/gif"
it does not matter which of the two GIF formats is used. If we also
want to support a more informative textual format description, the
matching trees should be kept separate internally, even if we have a
single configuration file.

>> 3) Debug it! The code was definitely not designed for speed,
>> as one would expect from a utility that should do such
>> examinations on the fly
>
> I agree - maybe you can produce a checklist of points not to be
> missed?
>
> a.

For the design, keep in mind that some formats don't have a header;
they simply append info (e.g. mp3 tags).

If you look at file magic you can see that it uses precompiled
versions of the two files. This might be an option too. Another
approach would be to generate a lexer from the configuration.

A general problem is the sorting of the matchers. A more general
matcher must not clobber a specialized one. This problem grows with
multiple configuration files.

An application typically deals with the same mime types all the time.
A user should be able to define which formats should be looked for at
all and in what sequence (maybe via a priority).

Support a callback/monitoring/listener mechanism that fires events if
a matcher fails or succeeds. This helps to optimize the sequence
(maybe even on the fly).

This is just a summary from thinking out loud though ...

- Jörg
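
To make the points about matcher ordering, user-defined sequence and
the listener mechanism a bit more concrete, here is a minimal sketch in
Java. All names in it (MimeDetector, MagicMatcher, MatchListener) are
hypothetical and are not part of jMimeMagic or any Commons component.
It only handles fixed magic bytes at the start of the data, so it does
not cover formats that append their info at the end.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; none of these types exist in jMimeMagic or Commons. */
final class MimeDetector {

    /** Notified for every matcher that is tried, whether it fails or succeeds. */
    interface MatchListener {
        void matcherTried(String mimeType, boolean matched);
    }

    /** Matches when the data starts with a fixed sequence of magic bytes. */
    static final class MagicMatcher {
        final String mimeType;
        final byte[] magic;
        final int priority; // user-defined; lower values are tried first

        MagicMatcher(String mimeType, byte[] magic, int priority) {
            this.mimeType = mimeType;
            this.magic = magic.clone();
            this.priority = priority;
        }

        boolean matches(byte[] data) {
            if (data.length < magic.length) {
                return false;
            }
            for (int i = 0; i < magic.length; i++) {
                if (data[i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    private final List<MagicMatcher> matchers = new ArrayList<>();
    private final List<MatchListener> listeners = new ArrayList<>();

    void register(MagicMatcher matcher) {
        matchers.add(matcher);
        // Sort by user priority; on a tie the longer magic goes first, so a
        // general matcher cannot clobber a more specialized one.
        matchers.sort(Comparator
                .comparingInt((MagicMatcher m) -> m.priority)
                .thenComparingInt(m -> -m.magic.length));
    }

    void addListener(MatchListener listener) {
        listeners.add(listener);
    }

    /** Returns the mime type of the first matcher that succeeds, or null. */
    String detect(byte[] data) {
        for (MagicMatcher matcher : matchers) {
            boolean matched = matcher.matches(data);
            for (MatchListener listener : listeners) {
                listener.matcherTried(matcher.mimeType, matched);
            }
            if (matched) {
                return matcher.mimeType;
            }
        }
        return null;
    }
}
```

An application that mostly sees GIF files could register a "GIF8"
matcher with priority 0 and a TIFF matcher with priority 1, and use a
listener to count failures per matcher in order to adjust the
priorities later. Whether a simple sorted list like this is fast
enough, compared with precompiled tables or a generated lexer, is an
open question.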