devicemap-dev mailing list archives

From Reza Naghibi <reza.nagh...@yahoo.com.INVALID>
Subject Re: Deleting the legacy ODDR client and related artifacts from SVN
Date Fri, 02 Jan 2015 19:13:18 GMT
Also, here are two examples of pure pattern matching algorithms that work extremely well (both
happen to be written by me):

-dClass [0]. This project uses pure pattern matching to do device classification. It is also
able to do OS detection and browser detection using a separate OS and browser index.

-stats.zone [1]. This is a side project I wrote which parses human language in the baseball
domain. It uses pure pattern matching to accomplish language classification (NLP). This should
demonstrate the power of pattern matching on a much more complicated domain, the English language.

I strongly feel that we need to embrace a full-on parallel pattern matching algorithm as the
path forward in this project. This is why I would like to distance this project from the legacy
serial user agent parsing algorithm. Parsing user agents has failed time and time again due
to complexities and an ever-changing device landscape. It's also slow, complicated, and very
error-prone. Pattern matching is extremely simple, extremely fast, and purely data-driven.
I don't expect the core algorithm to change much over the course of major releases, only the
data powering the algorithm.

This project is in a very treacherous landscape since device classification is not widely
accepted. Therefore, we need to be as state of the art as possible. We also need to be flexible
and fast.

[0] https://github.com/TheWeatherChannel/dClass
[1] http://stats.zone/

---
From: Reza Naghibi <reza.naghibi@yahoo.com.INVALID>
To: Devicemap-dev <dev@devicemap.apache.org>
Sent: Friday, January 2, 2015 12:58 PM
Subject: Deleting the legacy ODDR client and related artifacts from SVN

Any objections to deleting the legacy ODDR Java client and its related artifacts from SVN?
This is purely a code cleanup. Here are my thoughts on this matter:

-The legacy client was rewritten a year ago, and the new client offers a huge set of improvements.
It's simpler, several orders of magnitude faster, more predictable, and it moves all of the device
logic from code to data. Basically, it's modern. One of the biggest changes is that the legacy
ODDR client loops through every pattern looking for a match, one by one, using a complicated
set of heuristics specific to each class of user-agents. This does not scale. The new client
is able to check all patterns in parallel using pure pattern matching. This scales extremely
well.

-The DDR data can no longer evolve to support the legacy client. While the 1.0.x releases
may work, once 2.0 is released, the legacy client will in no way, shape, or form still work.

-The legacy client is a distraction. It's taking focus away from moving our current objectives
forward. This project, like all projects, must evolve. This means rewriting clients, reformatting
data, and basically throwing old things away. This is a natural process in any software development
project. The same considerations must be given to old artifacts in this project. This project
must evolve.
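
To illustrate the first point above, here is a minimal Python sketch of what "checking all
patterns in parallel" can look like when the patterns live in data rather than code. It is
not the actual DeviceMap or dClass implementation. The pattern table, the device ids, and
the function names are all hypothetical, and "most tokens wins" is an assumed ranking rule.
The idea is that the user agent is tokenized once and each token costs one hash lookup, so
the work grows with the length of the input rather than with the number of patterns.

```python
import re
from collections import defaultdict

# Hypothetical pattern data: device id -> tokens that must all appear.
# In a data-driven design this table would be loaded from a data file.
PATTERNS = {
    "iphone": ["iphone"],
    "ipad": ["ipad"],
    "nexus5": ["android", "nexus", "5"],
    "generic_android": ["android"],
}


def build_index(patterns):
    """Map each token to the set of device ids whose pattern contains it."""
    index = defaultdict(set)
    for device_id, tokens in patterns.items():
        for token in tokens:
            index[token].add(device_id)
    return index


def tokenize(user_agent):
    """Lowercase the string and split it into unique alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", user_agent.lower()))


def classify(user_agent, patterns, index):
    """Return the most specific device id whose tokens all occur, or None."""
    hits = defaultdict(int)
    # One index lookup per input token; no loop over the pattern table.
    for token in tokenize(user_agent):
        for device_id in index.get(token, ()):
            hits[device_id] += 1
    # A pattern matches only if every one of its tokens was seen.
    matched = [d for d, n in hits.items() if n == len(set(patterns[d]))]
    # Assumed ranking rule: prefer the pattern with the most tokens.
    return max(matched, key=lambda d: (len(patterns[d]), d), default=None)


INDEX = build_index(PATTERNS)
```

Adding a new device in this sketch means adding a row to PATTERNS; the classify function
itself does not change.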

If there are no objections, I will be removing the legacy artifacts from SVN in 5 days (120
hours).


  