lucene-java-user mailing list archives

From Herb Roitblat <>
Subject Re: Dimension mismatch exception
Date Thu, 20 Mar 2014 22:09:39 GMT
If you want to compute the cosines between pairs of documents (each a compared with each b),
then the dimension is 100, the size of each document. If you want to compare the whole index,
then you will need to make the vectors the same length (number of elements) by padding the
shorter one with zeroes. There are computational shortcuts, but this is the principle.
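
A minimal sketch of the zero-padding idea, in plain Java rather than the commons-math3 RealVector that appears in the stack trace. It assumes each document is available as a map from term to frequency; the class name, method names, and sample terms are hypothetical and not taken from the original code. Both documents are expanded over the union of their terms, so the two vectors always have the same length and any term missing from one document contributes a zero.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: cosine similarity over a shared vocabulary.
final class SharedVocabularyCosine {

    // Expand a sparse term-frequency map into a dense vector over the
    // shared vocabulary. Terms the document lacks are padded with 0.
    static double[] toVector(Map<String, Integer> termFreqs, SortedSet<String> vocabulary) {
        double[] vector = new double[vocabulary.size()];
        int i = 0;
        for (String term : vocabulary) {
            vector[i++] = termFreqs.getOrDefault(term, 0);
        }
        return vector;
    }

    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        // The union of both term sets fixes one dimension for both vectors.
        SortedSet<String> vocabulary = new TreeSet<>(a.keySet());
        vocabulary.addAll(b.keySet());

        double[] va = toVector(a, vocabulary);
        double[] vb = toVector(b, vocabulary);

        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < va.length; i++) {
            dot += va[i] * vb[i];
            normA += va[i] * va[i];
            normB += vb[i] * vb[i];
        }
        // An empty document has no direction; report 0 instead of NaN.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> docA = new HashMap<>();
        docA.put("lucene", 2);
        docA.put("index", 1);

        Map<String, Integer> docB = new HashMap<>();
        docB.put("lucene", 1);
        docB.put("search", 3);

        System.out.println(cosine(docA, docB));
    }
}
```

Building the vocabulary per pair keeps the sketch short. For many comparisons across two indexes, one vocabulary covering both would normally be built once and reused, which is one of the shortcuts mentioned above.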

How are you representing the sentences as numerical values?


> On Mar 20, 2014, at 5:07 PM, "Stefy D." <> wrote:
> Dear all,
> I am trying to compute the cosine similarity between several documents. I have an indexed
> directory A made using 10000 files and another indexed directory B made using 20000 files.
> All the indexed documents from both directories have the same length (100 sentences). I want
> to get the cosine similarity between documents from directory A and documents from directory
> B. I have used the code from here but on the two indexed directories. So I use something like
> getCosineSimilarity(docs_A[i], docs_B[j]);
> I get the following error:
> Exception in thread "main" org.apache.commons.math3.exception.DimensionMismatchException: 44,375 != 596,263
>     at org.apache.commons.math3.linear.RealVector.checkVectorDimensions(
>     at org.apache.commons.math3.linear.RealVector.checkVectorDimensions(
>     at org.apache.commons.math3.linear.RealVector.dotProduct(
>     at NewApp.testCosine.getCosineSimilarity(
> Please help me. Thank you very much!
