Date: Wed, 10 Dec 2014 18:13:57 +0100
Subject: Re: 2x Performance Increase in classify()
From: Werner Keil
To: dev@devicemap.apache.org, Reza Naghibi

Volkan/Reza,

Let's keep in mind that the W3C DDR implementation has specialized
recognition classes, such as OrderedTokenDeviceBuilder and
TwoStepDeviceBuilder and their subclasses, that analyze the User-Agent more
thoroughly and currently provide better recognition of, say, an update to
Android 4 or 5.

Werner

On Wed, Dec 10, 2014 at 5:43 PM, Reza Naghibi <reza.naghibi@yahoo.com.invalid> wrote:

> Volkan,
>
> Thanks for the performance patch. I reviewed it and it looks pretty good.
> Pre patch, we were running each ngram set through some raw string processing
> normalizations. Your patch does a good job moving that to the beginning and
> optimizing the regex. Good job :)
>
> As for pattern matching, if you look at the normalization method, we only
> look at alphanumerics. This was done for simplicity's sake. The downside
> here is that we weaken any pattern which contains non-alphanumerics. There
> are several ways to address and fix this, but since DeviceMap has control
> over its own data, I prefer fixing the patterns and keeping the matching
> engine simple.
> The thing to remember is that our data came from OpenDDR,
> which had a more complex classification algorithm and heuristics, so we
> have a bit of legacy baggage to sort through as this project evolves.
>
> Regarding our next release, I already have the Java client 1.1.0 ready to
> go. I would like to get your patch in on the next release, 1.1.1.
>
> Reza
>
>
> From: Volkan YAZICI
> To: "devicemap-dev@incubator.apache.org" <devicemap-dev@incubator.apache.org>
> Sent: Wednesday, December 10, 2014 9:32 AM
> Subject: 2x Performance Increase in classify()
>
> Good news everyone!
>
> Here is the patch that introduces JMH-based benchmarks for Java client:
> DMAP-106
>
> And here is the patch that introduces >2x performance gain: DMAP-107
>
>
> *Sample output:*
>
> $ export userAgentFile=/path/to/user-agents.txt
> $ wc -l $userAgentFile
> 195325
> $ java \
>     -jar devicemap/java/classifier-benchmark/target/devicemap-client-benchmark.jar \
>     -jvmArgsAppend "-server -XX:+TieredCompilation -XX:+AggressiveOpts -Xms1024m -Xmx4096m -DuserAgentFile=$userAgentFile" \
>     -wi 5 -i 5 -bm avgt -tu ms -f 3 \
>     ".*DeviceMapClientBenchmark.*"
>
> # Using the most recent trunk.
> Result: 12079.408 ±(99.9%) 1240.628 ms/op [Average]
>   Statistics: (min, avg, max) = (11232.424, 12079.408, 16011.000), stdev = 1160.484
>   Confidence interval (99.9%): [10838.781, 13320.036]
>
> # Using the enhanced classify().
> Result: 5505.355 ±(99.9%) 441.748 ms/op [Average]
>   Statistics: (min, avg, max) = (5060.269, 5505.355, 6508.699), stdev = 413.211
>   Confidence interval (99.9%): [5063.607, 5947.103]
>
>
> Cheers!
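
For illustration, the approach discussed in this thread (normalize the User-Agent to alphanumerics once, up front, with a precompiled regex, and only then build the n-grams) might look roughly like the Java sketch below. The class name, method names, regex, and the space-joined n-gram format are all assumptions made for this example; they are not taken from the DeviceMap code base or from the DMAP-107 patch, which may do this differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual DeviceMap client code.
class NormalizeOnceSketch {
    // Compiled a single time; matches each run of characters that is not a-z or 0-9.
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-z0-9]+");

    // Lowercase the input and replace every non-alphanumeric run with one space.
    static String normalize(String userAgent) {
        String lower = userAgent.toLowerCase(Locale.ROOT);
        return NON_ALNUM.matcher(lower).replaceAll(" ").trim();
    }

    // Build token n-grams of sizes 1..maxSize from text that is ALREADY normalized,
    // so no per-n-gram string cleanup is needed.
    static List<String> ngrams(String normalized, int maxSize) {
        List<String> out = new ArrayList<>();
        if (normalized.isEmpty()) {
            return out;
        }
        String[] tokens = normalized.split(" ");
        for (int start = 0; start < tokens.length; start++) {
            StringBuilder gram = new StringBuilder();
            for (int size = 1; size <= maxSize && start + size <= tokens.length; size++) {
                if (size > 1) {
                    gram.append(' '); // assumed separator, for readability only
                }
                gram.append(tokens[start + size - 1]);
                out.add(gram.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String ua = "Mozilla/5.0 (Linux; Android 4.4.2; Nexus 5)";
        String normalized = normalize(ua); // normalization happens once per User-Agent
        System.out.println(normalized);
        System.out.println(ngrams(normalized, 2));
    }
}
```

Note that under this kind of normalization a version string such as "4.4.2" is split into the separate tokens "4", "4" and "2", which is the weakening of non-alphanumeric patterns that Reza describes.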