lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bernd Mueller" <bernd.muel...@scai.fraunhofer.de>
Subject RE: indexing images of a document
Date Tue, 10 Jun 2008 18:40:29 GMT
Hi Steve,


Actually it is not about searching the binary data. I just wanna have the
whole document (together with the "image stuff") to be stored in the
index. So when I got a hit of a search on a document that I can extract
the document with its images (and the images' meta information).

Nice would be some data structure to be stored in the "image"-field of the
document (maybe a HashSet?). So I could extract the found documents with
all their image information from the index for easy visualization purpose
of the search results.


Regards,

Bernd


Steven A Rowe wrote:
> Hi Bernd,
>
> It's still not clear what you want to do.  What will a search look like?
>
> On 06/10/2008 at 8:36 AM, Bernd Mueller wrote:
>> I will try to explain what I mean with image stuff. An image in
>> xml-documents is usually an url to the location where the image is
>> stored. Additionally, such an image tag contains information like
>> offset, width and height. I am thinking about storing the
>> images (binary data) and meta-information (offset, width, height) in one
>> field instead of using these tagging information as values for a field
>> in the index.
>
> You seem to be saying that you want to store the binary image along with
> its metadata all in one field - is this correct?  Why do you want to do
> this?
>
>> I guess, your last statement "Lucene doesn't index binary data..."
>> indicates that this isn't possible...
>
> While Lucene will not *index* binary data, it can *store* it.
>
> Steve
>
>> Erick Erickson wrote:
>> > You add as many fields to the document while indexing as you need to
>> > correctly contain your "image stuff".
>> >
>> > If this answer seems cryptic, it's as clear as your problem statement
>> > <G>. To give you a meaningful answer, we need a much clearer
>> > problem statement. What is "image stuff", what format is it in, and
>> > what do you want to do with it?
>> >
>> > Lucene doesn't index binary data...
>> >
>> > Best
>> > Erick
>> >
>> > On Tue, Jun 10, 2008 at 8:13 AM, Bernd Mueller
>> > <bernd.mueller@scai.fhg.de> wrote:
>> >
>> > > Hello,
>> > >
>> > > I have XML-documents containing image information. These images
>> should
>> > > be indexed with the document by having one additional field with the
>> > > image stuff. Could anyone please give me some hints how I can manage
>> > > this?
>> > >
>> > > Regards,
>> > > Bernd
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Diplom Informatiker (FH)
Bernd Mueller
Fraunhofer SCAI
SchloƟ Birlinghoven
53754 Sankt Augustin

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message