Return-Path:
X-Original-To: apmail-incubator-crunch-dev-archive@minotaur.apache.org
Delivered-To: apmail-incubator-crunch-dev-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id EE101DABA
    for ; Mon, 20 Aug 2012 07:59:24 +0000 (UTC)
Received: (qmail 74968 invoked by uid 500); 20 Aug 2012 07:59:24 -0000
Delivered-To: apmail-incubator-crunch-dev-archive@incubator.apache.org
Received: (qmail 74874 invoked by uid 500); 20 Aug 2012 07:59:22 -0000
Mailing-List: contact crunch-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: crunch-dev@incubator.apache.org
Delivered-To: mailing list crunch-dev@incubator.apache.org
Received: (qmail 74673 invoked by uid 99); 20 Aug 2012 07:59:20 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 20 Aug 2012 07:59:20 +0000
X-ASF-Spam-Status: No, hits=-0.0 required=5.0 tests=SPF_PASS
X-Spam-Check-By: apache.org
Received-SPF: pass (nike.apache.org: local policy)
Received: from [93.94.224.195] (HELO owa.exchange-login.net) (93.94.224.195)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 20 Aug 2012 07:59:10 +0000
Received: from HC6.hosted.exchange-login.net (93.94.224.215)
    by edge2.hosted.exchange-login.net (93.94.224.195)
    with Microsoft SMTP Server (TLS) id 14.2.298.4; Mon, 20 Aug 2012 09:59:03 +0200
Received: from [192.168.1.246] (93.94.224.250)
    by owa.exchange-login.net (93.94.224.215)
    with Microsoft SMTP Server (TLS) id 14.2.318.1; Mon, 20 Aug 2012 09:58:48 +0200
Message-ID: <5031EE36.3040303@xebia.com>
Date: Mon, 20 Aug 2012 13:28:46 +0530
From: Rahul
User-Agent: Mozilla/5.0 (Windows NT 6.0; WOW64; rv:14.0) Gecko/20120713 Thunderbird/14.0
MIME-Version: 1.0
To:
Subject: BloomFilters in Crunch
Content-Type: multipart/mixed; boundary="------------040209090206060603030605"
X-Originating-IP: [93.94.224.250]

--------------040209090206060603030605
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

Today I tried to create BloomFilters using Crunch; the attached test case
shows the approach. I do not know whether there is a better way to
accomplish this.

I think APIs to create and load BloomFilters would be a good addition to
Crunch's existing set. If people feel they should be added, I can prepare
a patch.

Regards,
Rahul

--------------040209090206060603030605
Content-Type: text/plain; charset="windows-1252"; name="BloomFiltersTest.java"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="BloomFiltersTest.java"

package org.apache.crunch.bloomfilter;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.io.IOException;
import java.io.Serializable;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

import org.apache.commons.lang.StringUtils;
import org.apache.crunch.CombineFn.Aggregator;
import org.apache.crunch.CombineFn.AggregatorCombineFn;
import org.apache.crunch.DoFn;
import org.apache.crunch.Emitter;
import org.apache.crunch.PCollection;
import org.apache.crunch.PTable;
import org.apache.crunch.Pair;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.io.At;
import org.apache.crunch.test.TemporaryPath;
import org.apache.crunch.test.TemporaryPaths;
import org.apache.crunch.types.PTypeFamily;
import org.apache.crunch.types.writable.Writables;
import org.apache.hadoop.util.bloom.BloomFilter;
import org.apache.hadoop.util.bloom.Key;
import org.apache.hadoop.util.hash.Hash;
import org.junit.Rule;
import org.junit.Test;

import com.google.common.collect.ImmutableList;

public class BloomFiltersTest implements Serializable {
  @Rule
  public transient TemporaryPath tmpDir = TemporaryPaths.create();

  @Test
  public void testFilterCreation() throws IOException {
    String inputPath = tmpDir.copyResourceFileName("sample1.txt");
    BloomFilterFn<String> filterFn = new BloomFilterFn<String>() {
      @Override
      List<String> getKeys(String input) {
        if (input.length() > 4)
          return Arrays.asList(input);
        return null;
      }
    };
    BloomFilter filter = createFilter(inputPath, filterFn);
    assertFalse(filter.membershipTest(new Key("Mcbeth".getBytes())));
    assertTrue(filter.membershipTest(new Key("apple".getBytes())));
  }

  @Test
  public void testFilterCreation2() throws IOException {
    String inputPath = tmpDir.copyResourceFileName("shakes.txt");
    BloomFilterFn<String> filterFn = new BloomFilterFn<String>() {
      @Override
      List<String> getKeys(String input) {
        return Arrays.asList(StringUtils.split(input, " "));
      }
    };
    BloomFilter filter = createFilter(inputPath, filterFn);
    assertTrue(filter.membershipTest(new Key("Mcbeth".getBytes())));
    assertTrue(filter.membershipTest(new Key("apples".getBytes())));
  }

  private BloomFilter createFilter(String inputPath, BloomFilterFn<String> filterFn) throws IOException {
    MRPipeline pipeline = new MRPipeline(BloomFiltersTest.class);
    PCollection<String> shakespeare = pipeline.read(At.textFile(inputPath));
    PTypeFamily tf = shakespeare.getTypeFamily();
    PTable<Boolean, BloomFilter> table = shakespeare.parallelDo(filterFn,
        tf.tableOf(tf.booleans(), Writables.writables(BloomFilter.class)));
    PTable<Boolean, BloomFilter> combineValues = table.groupByKey(1).combineValues(
        new AggregatorCombineFn<Boolean, BloomFilter>(new BloomFilterAggregator()));
    Iterator<Pair<Boolean, BloomFilter>> iterator = combineValues.materialize().iterator();
    shakespeare.getPipeline().run();
    return iterator.next().second();
  }
}

class BloomFilterAggregator implements Aggregator<BloomFilter> {
  private static final long serialVersionUID = 1L;
  private transient BloomFilter bloomFilter = null;

  @Override
  public void reset() {
    if (bloomFilter == null) {
      bloomFilter = new BloomFilter(1000, 5, Hash.MURMUR_HASH);
    }

  }

  @Override
  public void update(BloomFilter value) {
    bloomFilter.or(value);
  }

  @Override
  public Iterable<BloomFilter> results() {
    return ImmutableList.of(bloomFilter);
  }

}

abstract class BloomFilterFn<S> extends DoFn<S, Pair<Boolean, BloomFilter>> {
  private static final long serialVersionUID = -4170907490047335387L;
  private transient BloomFilter bloomFilter = null;

  @Override
  public void initialize() {
    super.initialize();
    bloomFilter = new BloomFilter(1000, 5, Hash.MURMUR_HASH);
  }

  @Override
  public void process(S input, Emitter<Pair<Boolean, BloomFilter>> emitter) {
    List<String> key = getKeys(input);
    if (key != null) {
      for (String value : key) {
        if (StringUtils.isNotBlank(value))
          bloomFilter.add(new Key(value.getBytes()));
      }
    }
  }

  abstract List<String> getKeys(S input);

  @Override
  public void cleanup(Emitter<Pair<Boolean, BloomFilter>> emitter) {
    emitter.emit(Pair.of(true, bloomFilter));
  }
}

--------------040209090206060603030605--
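
The idea behind the attached test, as far as the attachment shows, is that each task builds a partial filter over its share of the input and the partial filters are then OR-ed into a single one. That idea can be tried without Hadoop or Crunch. What follows is only a minimal sketch under stated assumptions: `SketchBloomFilter` and `BloomMergeSketch` are hypothetical names, the bit vector is a plain `java.util.BitSet`, and the bit positions come from a simple double-hashing scheme rather than the Murmur hash the attachment selects. Only the vector size (1000) and hash count (5) are taken from the attachment.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for a Bloom filter; NOT Hadoop's BloomFilter class.
final class SketchBloomFilter {
  private final BitSet bits;
  private final int vectorSize; // number of bits in the vector
  private final int numHashes;  // number of bit positions set per key

  SketchBloomFilter(int vectorSize, int numHashes) {
    this.bits = new BitSet(vectorSize);
    this.vectorSize = vectorSize;
    this.numHashes = numHashes;
  }

  // Assumption: double hashing over two cheap hashes, used here instead of Murmur.
  private int position(byte[] key, int i) {
    int h1 = Arrays.hashCode(key);
    int h2 = fnv1a(key);
    return Math.floorMod(h1 + i * h2, vectorSize);
  }

  // 32-bit FNV-1a hash of the key bytes.
  private static int fnv1a(byte[] data) {
    int h = 0x811c9dc5;
    for (byte b : data) {
      h ^= (b & 0xff);
      h *= 0x01000193;
    }
    return h;
  }

  void add(String key) {
    byte[] raw = key.getBytes(StandardCharsets.UTF_8);
    for (int i = 0; i < numHashes; i++) {
      bits.set(position(raw, i));
    }
  }

  // True for every added key; may also be true for a key that was never added.
  boolean mightContain(String key) {
    byte[] raw = key.getBytes(StandardCharsets.UTF_8);
    for (int i = 0; i < numHashes; i++) {
      if (!bits.get(position(raw, i))) {
        return false;
      }
    }
    return true;
  }

  // Merges another filter into this one; both must share size and hash count.
  void or(SketchBloomFilter other) {
    if (other.vectorSize != vectorSize || other.numHashes != numHashes) {
      throw new IllegalArgumentException("filters must have identical parameters");
    }
    bits.or(other.bits);
  }
}

final class BloomMergeSketch {
  // Builds one partial filter per split (standing in for one map task each),
  // then ORs the partial filters together (standing in for the combine step).
  static SketchBloomFilter build(List<List<String>> splits) {
    SketchBloomFilter merged = new SketchBloomFilter(1000, 5);
    for (List<String> split : splits) {
      SketchBloomFilter partial = new SketchBloomFilter(1000, 5);
      for (String line : split) {
        for (String word : line.split(" ")) {
          if (!word.trim().isEmpty()) {
            partial.add(word);
          }
        }
      }
      merged.or(partial);
    }
    return merged;
  }

  public static void main(String[] args) {
    List<List<String>> splits = Arrays.asList(
        Arrays.asList("to be or not to be"),
        Arrays.asList("that is the question"));
    SketchBloomFilter filter = build(splits);
    // Added keys are always reported as present (no false negatives).
    System.out.println(filter.mightContain("question"));
    // Absent keys are usually, but not always, reported as absent.
    System.out.println(filter.mightContain("apples"));
  }
}
```

The merge works only because every partial filter uses the same vector size and hash scheme, which is why `or` rejects mismatched parameters. A real API for Crunch would presumably let the caller choose those parameters instead of hard-coding them.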