devicemap-dev mailing list archives

From Reza Naghibi <reza.nagh...@yahoo.com.INVALID>
Subject RE: Deterministic Ngram Matcher Hits
Date Mon, 29 Dec 2014 14:55:55 GMT
I believe the traversal ordering should match the index ordering. This was done on purpose
because I think a similar bug existed in the past.
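To illustrate the point about ordering, here is a minimal sketch, not the actual DeviceMap code. It assumes the index is kept in an insertion-ordered map (`LinkedHashMap`) and that the matcher keeps the first candidate with the highest score. The class name `OrderedPick`, the method `firstBest`, and the scores are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OrderedPick {
    /**
     * Returns the first key with the highest score, following the map's
     * iteration order. With a LinkedHashMap that order is the insertion
     * (index) order, so ties are always broken the same way. With a plain
     * HashMap the iteration order is unspecified and may differ between
     * JDK versions.
     */
    static String firstBest(Map<String, Integer> scores) {
        String best = null;
        int bestScore = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> e : scores.entrySet()) {
            // Strict comparison: a later candidate with an equal score
            // never replaces an earlier one.
            if (best == null || e.getValue() > bestScore) {
                best = e.getKey();
                bestScore = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical scores; both patterns tie on purpose.
        Map<String, Integer> index = new LinkedHashMap<>();
        index.put("HTC_One_X", 3);
        index.put("HTC One X", 3);
        System.out.println(firstBest(index));
    }
}
```

The tie is still resolved arbitrarily, by whichever pattern was indexed first, but at least the result no longer depends on the JDK.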

As for ranking, all matches are considered and the highest ranking is picked. If you look
at the ranking function, it has several inputs.

So are you saying we should return multiple possible devices to the user? I'm going to say
no; it's the project's job to remove this kind of ambiguity for the user.

The example below is not an algorithm problem. It's a data problem. We just need to clean
up the data and get rid of these incorrect patterns.



-------- Original message --------
From: Volkan YAZICI <volkan.yazici@gmail.com>
Date: 12/29/2014 6:46 AM (GMT-05:00)
To: dev@devicemap.apache.org
Subject: Deterministic Ngram Matcher Hits

Hi all,

If I am not mistaken, the employed ngram matcher has the potential to return
different results for different traversal orderings provided by the
underlying collections framework. This is also evident from the following
issues:

   - HTC One X+ matches to both HTC One X and HTC_One_X.
   <http://markmail.org/message/rzgioqbm22wtzt3p>
   - DMAP-112: Java client test fails with JDK 1.8.0-25
   <https://issues.apache.org/jira/browse/DMAP-112>

I have been thinking about this and it occurred to me that instead of
returning a single hit with the highest score (which varies with the
employed collection traversal ordering), we can return the set of all
feasible hits with the same score. I believe this will make it easier to
unit test the matcher on different platforms. Comments?
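
A minimal sketch of what I have in mind, not tied to the current matcher code: the class `TopHits`, the method `topHits`, and the scores are hypothetical, and I assume each candidate already has an integer score. All candidates sharing the highest score are returned in a sorted set, so the result is the same regardless of how the input map is traversed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class TopHits {
    /** Returns every candidate id that shares the highest score, sorted by id. */
    static SortedSet<String> topHits(Map<String, Integer> scores) {
        SortedSet<String> best = new TreeSet<>();
        int bestScore = Integer.MIN_VALUE;
        for (Map.Entry<String, Integer> e : scores.entrySet()) {
            int score = e.getValue();
            if (score > bestScore) {
                // A strictly better score invalidates all earlier hits.
                bestScore = score;
                best.clear();
            }
            if (score == bestScore) {
                best.add(e.getKey());
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical scores; the two HTC patterns tie on purpose.
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("HTC_One_X", 3);
        scores.put("Other", 1);
        scores.put("HTC One X", 3);
        System.out.println(topHits(scores));
    }
}
```

A unit test can then assert on the whole set instead of on whichever single hit a given JDK happens to visit first.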

Best.