Date: Tue, 9 Dec 2014 18:11:08 +0100
Subject: Re: Handling Bots and HTTP Clients
From: Werner Keil
To: dev@devicemap.apache.org
Cc: Reza Naghibi

A new benchmark would of course be great. For now, in the absence of other
performance tests, I have had to present the figures from the W3C DDR
implementation. If there are others (I believe Eberhard or other contributors
once or twice mentioned blazing fast performance, but so far there has been no
sustainable benchmark for others to execute and measure themselves ;-), they
would benefit more than just events like ApacheCon.

Werner

On Tue, Dec 9, 2014 at 5:59 PM, Volkan YAZICI wrote:

> The model I proposed will not buy us a significant performance gain, which
> was also not my major motivation. (That being said, I also second the idea
> of implementing a benchmark.) Instead, I wanted to address the issue of
> separating the concerns of handling bots and regular devices.
>
> Maybe I should rephrase my starting point: how can we add new bot and HTTP
> client footprints to the existing DDR?
>
> On Tue Dec 09 2014 at 2:31:24 PM Reza Naghibi wrote:
>
> > So let me explain some of the issues with this. Regardless, I would still
> > like you to benchmark said patch and share the results. This will help
> > drive the direction of future work on the clients.
> >
> > 1) I'm almost certain isBot(ua) will perform worse than classify(ua),
> > defeating the whole purpose of short-circuiting classify.
> > How do you plan on implementing isBot()? If that algorithm performs
> > better than classify(), we might as well use it to match the entire DDR.
> > No?
> >
> > 2) Under no circumstances should we implement DDR logic in code. The code
> > should remain as generic as possible. This means that it is just a plain
> > old ngram matcher. This kind of logic belongs in the DDR definition. Right
> > now this allows for patterns and ranking. So maybe what you are asking is
> > that high-ranking patterns be checked first in a very quick way? Well, why
> > are bots so high ranking? In normal traffic, bots make up a very small
> > percentage. So wouldn't it make sense to check for Samsung and Apple
> > products?
> >
> > Once again, if possible, please benchmark some before and afters so we can
> > get a better idea of what we are working with here. Even though I'm
> > leaning towards saying this is a bad idea, I think it is a good exercise.
> >
> >
> > From: Volkan YAZICI
> > To: "devicemap-dev@incubator.apache.org"
> > Sent: Tuesday, December 9, 2014 7:34 AM
> > Subject: Handling Bots and HTTP Clients
> >
> > Hello,
> >
> > In the context of the discussion "how do we handle HTTP clients", I would
> > like to vote for treating them as bots. Further, I want to propose adding
> > a thin layer above DeviceMapClient.classify() as a shortcut for handling
> > bots, as follows.
> >
> > private final static Map<String, String> botAttributes =
> >         Collections.singletonMap("is_bot", "true");
> >
> > public Map<String, String> classify(String userAgent) {
> >     if (isBot(userAgent)) return botAttributes;
> >     // otherwise: continue with the regular DDR match
> > }
> >
> > The motivation for this change is as follows:
> >
> > - Almost none of the attributes make sense for a bot, and we lose time
> >   matching it against the whole DDR.
> > - The bot database will be able to evolve independently.
> > - We can come up with a single compiled j.u.regex.Pattern to check bots.
> >   (I am pretty sure Reza knows much better-performing approaches, but
> >   maybe for a future release.)
> >
> > If the development team is OK with that, I want to implement this feature.
> >
> > Best.
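
For illustration, the thin layer proposed in the quoted message could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the project's implementation: the class name BotAwareClassifier, the tokens inside BOT_PATTERN, and the Function delegate standing in for DeviceMapClient.classify() are all hypothetical and chosen only to keep the example self-contained.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical wrapper: short-circuits bots before the full DDR match runs.
final class BotAwareClassifier {

    // Illustrative tokens only (assumption); a real list would come from a
    // separately maintained bot database.
    private static final Pattern BOT_PATTERN = Pattern.compile(
            "googlebot|bingbot|curl|wget|python-requests",
            Pattern.CASE_INSENSITIVE);

    private static final Map<String, String> BOT_ATTRIBUTES =
            Collections.singletonMap("is_bot", "true");

    // Stand-in for the existing classify() call (assumption).
    private final Function<String, Map<String, String>> delegate;

    BotAwareClassifier(Function<String, Map<String, String>> delegate) {
        this.delegate = delegate;
    }

    // True if any bot token occurs anywhere in the user agent string.
    static boolean isBot(String userAgent) {
        return userAgent != null && BOT_PATTERN.matcher(userAgent).find();
    }

    // Bots get the fixed attribute map; everything else goes to the delegate.
    Map<String, String> classify(String userAgent) {
        if (isBot(userAgent)) {
            return BOT_ATTRIBUTES;
        }
        return delegate.apply(userAgent);
    }
}
```

Whether the regex check is actually cheaper than the existing matcher is exactly the open question raised in the thread, so a wrapper like this would only be useful as the "after" side of a before/after benchmark.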