lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Richard Marr <>
Subject Re: semantic vectors
Date Mon, 06 Apr 2009 20:45:08 GMT
Hi Nitin,

I'm assuming you're asking about Latent Semantic Indexing and similar.
This may not be the best place to ask about this. Not sure where else
to suggest though.

If I understand your quesion correctly, the basic idea is that you
take a document (usually text but it could also be an image, video,
whatever) and create a meaningful vector from it. With a text document
you could create the vector by specifying each word in your vocabulary
as a dimension and defining the position in that dimension as the
number of occurences of that word. You can then treat the Euclidian
distance between documents as a measure of similarity. Similar
documents are clustered together.

With images you need different techniques to create the vectors.
Different vector creation algorithms create different measures of
similarity (colour, shape, etc).

I wouldn't say that semantic indexing methods are "different" from the
semantic web, because I see them as overlapping concepts. From my
perspective the semantic web comprises of many approaches, both
top-down (centralised interpretation of meaning like LSI) and
bottom-up (manual meta-tagging of data through mark-up, linking,
microformats, etc.)

Does that help at all? If anyone has corrections, or more to add,
please pile in... I'm wearing my flame-retardant underpants  :)


2009/4/1 nitin gopi <>:
> hi all,
>        I want to know everything about semantic vectors. I want to know how
> does it indexes the documents such that the results produced are
> semantically better than normal search. I also want to know how it is
> different from semantic web, which uses the concept of ontologies and
> metadata. It would be very helpful if somebody mail me all the study
> material related to it?
> Thanking You
> Nitin

Richard Marr
07976 910 515

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message