From: Constantin Teodorescu
Date: Sat, 30 Jul 2016 15:17:41 +0300
Subject: Mango full text search is immune to accented letters?
To: dev@couchdb.apache.org

Is the Mango full-text indexer/search accent-insensitive (or will it be)?

I'm planning to use it to search for "posta", but the documents may contain "poştă"!
SQLite3 FTS4 is able to do that!

For the moment I'm using CouchDB 1.6 views with an explicit "flatten function" in JavaScript to create a non-accented index:

// Accented Romanian letters (cedilla and comma-below variants) to replace
var translate_re = /[ŞȘŢȚÎĂÂÁşșţțîăâá]/g,
    translate = {
      'Ş': 'S', 'ş': 's', 'Ș': 'S', 'ș': 's',
      'Ţ': 'T', 'ţ': 't', 'Ț': 'T', 'ț': 't',
      'Ă': 'A', 'ă': 'a', 'Â': 'A', 'â': 'a',
      'Á': 'A', 'á': 'a', 'Î': 'I', 'î': 'i'
    };

// Return s with every accented letter replaced by its unaccented counterpart
function makeSearchString(s) {
  return s.replace(translate_re, function(match) {
    return translate[match];
  });
}

Teo
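
For illustration only, here is a minimal sketch of how a flatten function like the one above might be used from a view's map function. The field name "title", the lowercasing step, the word splitting, and the emit() stub are all assumptions made for this example; none of them come from the original message. Inside CouchDB the view server supplies emit(), so the stub is only there to let the sketch run standalone (for example in Node.js).

```javascript
// Same idea as the flatten function above: map each accented Romanian
// letter to its unaccented ASCII counterpart.
var translate = {
  'Ş': 'S', 'ş': 's', 'Ș': 'S', 'ș': 's',
  'Ţ': 'T', 'ţ': 't', 'Ț': 'T', 'ț': 't',
  'Ă': 'A', 'ă': 'a', 'Â': 'A', 'â': 'a',
  'Á': 'A', 'á': 'a', 'Î': 'I', 'î': 'i'
};
var translate_re = /[ŞȘŢȚÎĂÂÁşșţțîăâá]/g;

function makeSearchString(s) {
  return s.replace(translate_re, function (match) {
    return translate[match];
  });
}

// Stand-in for CouchDB's built-in emit() (assumption for standalone runs).
// Inside a real view, drop this stub and the rows array.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Hypothetical map function: "title" is an assumed field name.
// It emits one row per flattened, lowercased word of the title.
function map(doc) {
  if (typeof doc.title !== 'string') { return; }
  var words = makeSearchString(doc.title).toLowerCase().split(/\s+/);
  for (var i = 0; i < words.length; i++) {
    if (words[i]) { emit(words[i], null); }
  }
}

// Example document with accented text.
map({ _id: 'doc1', title: 'Poştă rapidă' });
```

With an index built this way, the search term would need the same treatment before querying (makeSearchString followed by lowercasing), so that "posta" and "poştă" end up as the same key.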