jackrabbit-users mailing list archives

From "Adal Gil Darias" <adar...@avantic.net>
Subject Re: Why lucene content search in txt files doesn work?
Date Wed, 31 Jan 2007 10:02:28 GMT
I looked for a library and found this project on SourceForge. I'm not using
it myself, but it may be useful to you.
http://sourceforge.net/projects/jmimemagic/
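
To make the magic-number idea discussed in the quoted thread concrete, here is
a minimal sketch that uses only the JDK. It is not jmimemagic's API and not
what Jackrabbit does; the class name MimeSniffer, its method names, and the
short signature and extension tables are all hypothetical choices for
illustration. It checks a few well-known leading byte signatures first, then
falls back to the file extension, and finally guesses text/plain when the
bytes decode as strict UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of Jackrabbit or jmimemagic.
final class MimeSniffer {

    /** Returns true if data begins with the given magic bytes. */
    static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns true if data has no NUL bytes and decodes as strict UTF-8. */
    static boolean looksLikeUtf8Text(byte[] data) {
        for (byte b : data) {
            if (b == 0) {
                return false;
            }
        }
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Extension-based guess; returns null when the extension is unknown. */
    static String fromExtension(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".txt")) return "text/plain";
        if (lower.endsWith(".pdf")) return "application/pdf";
        if (lower.endsWith(".png")) return "image/png";
        return null;
    }

    /**
     * Guesses a MIME type. Pass the whole content, or a prefix that is not
     * cut in the middle of a multi-byte character, so the UTF-8 check holds.
     */
    static String detect(byte[] data, String name) {
        if (startsWith(data, 0x25, 0x50, 0x44, 0x46)) return "application/pdf"; // "%PDF"
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47)) return "image/png";       // PNG signature
        if (startsWith(data, 0x50, 0x4B, 0x03, 0x04)) return "application/zip"; // "PK\3\4"
        String byExtension = fromExtension(name);
        if (byExtension != null) {
            return byExtension;
        }
        if (data.length > 0 && looksLikeUtf8Text(data)) {
            return "text/plain";
        }
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] pdf = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        byte[] text = "plain UTF-8 text".getBytes(StandardCharsets.UTF_8);
        System.out.println(detect(pdf, "report.bin"));
        System.out.println(detect(text, "notes.txt"));
    }
}
```

The returned string could then be stored as the jcr:mimeType value mentioned
in the thread. A real signature database, such as the one behind the "file"
command, covers far more formats than these three.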

2007/1/31, Paco Avila <pavila@git.es>:
>
> On Tue, 30-01-2007 at 22:17 +0200, Jukka Zitting wrote:
> > Hi,
> >
> > On 1/30/07, Adal Gil Darias <adarias@avantic.net> wrote:
> > > I'm trying to search for text in the content of my documents. I added
> > > all the filters to the configuration XML, but I can't find the documents
> > > with the extension *.txt. I use UTF-8 encoding when I save the contents
> > > in the repository. What is wrong? Where am I failing?
> >
> > Do you set the jcr:mimeType property of the node? Jackrabbit uses that
> > property to determine the index filter to use when indexing a node. For
> > plain text documents you should set the property to text/plain.
> >
> > The indexer could of course be smarter and interpret common name
> > extensions like ".txt" or magic numbers within the binary streams to
> > automatically determine how to index things... Perhaps an improvement
> > request in Jira is in order.
>
> I'm looking for a Java library to get the MIME type from the magic numbers.
> Do you know of one? Currently I get the MIME type from the file extension,
> but the magic-number approach is much better, like the "file" command
> in Linux.
>
> Thanks in advance!
>
> --
> GIT Consultors S.L.
> c\ Francesc Rover 2-B
> 07003 Palma de Mallorca
> (Illes Balears)
>
>
