From: Charlie Hull
Date: Thu, 1 Dec 2016 10:07:43 +0000
Subject: Re: Feasability
To: general@lucene.apache.org

Hi,

You have several options:

1. Try to recruit someone with existing Solr skills (hard: there is a skills shortage, certainly in the UK and I suspect worldwide, and good Solr people are rare and expensive).
2. Try to recruit someone with good enterprise Java skills, and hopefully some interest in search and Solr, and train them up (slightly easier but a longer timescale; various organisations provide Solr training/mentoring, including my own).
3. Engage a consultancy like us with pre-existing experience in Solr development to build what you need (much quicker, but obviously there's a cost).
4. Buy a 'packaged' Solr-based search engine such as Fusion from our partner Lucidworks, which will save a lot of the time and effort of developing something (very quick, but with an initial and ongoing subscription cost).

Hope this helps!

Charlie
Flax
www.flax.co.uk

On 1 December 2016 at 05:21, Xavier Morera wrote:

> Yes, I would say you need someone with Solr and some sort of development
> skills.
>
> Xavier
>
> --------------------------------------------
> Sent from a small attention grabbing screen
>
> > On Nov 30, 2016, at 21:37, Reda Kouba wrote:
> >
> > Someone with good programming experience and a good knowledge of
> > Lucene and IR.
> >
> > best,
> > reda
> >
> >> On 1 Dec. 2016, at 14:33, Chris Manu wrote:
> >>
> >> Thank you for responding. So, theoretically, I would need to hire
> >> someone with Apache programming experience to do this correctly (given
> >> that I know nothing about programming)? What type of experience should
> >> I look for?
> >>
> >> ________________________________
> >> From: Xavier Morera <xavier@familiamorera.com>
> >> Sent: December 1, 2016 2:23 AM
> >> To: general@lucene.apache.org
> >> Subject: Re: Feasability
> >>
> >> The answer is yes, but you would need to do some programming and
> >> configuring.
> >>
> >>> On Wed, Nov 30, 2016 at 7:54 PM, Chris Manu wrote:
> >>>
> >>> Hello,
> >>>
> >>> I want to start off by saying that I am not a programmer and have very
> >>> little knowledge in this area.
> >>>
> >>> What I would like to know is whether Apache would be capable of doing
> >>> the following:
> >>>
> >>> Take an extensive list (A) of strings of unique words (these are titles,
> >>> anywhere from 4 to 30 words long) saved in either an Excel worksheet or
> >>> a text file, and search for instances (B) where these can be found in PDF
> >>> files saved on a hard drive (over 100k files). The search would need to
> >>> use fuzzy logic rather than exact matching, and the output would be an
> >>> Excel file listing the unique string found (A), the name of the file in
> >>> which the match was made (B), the page number where the match was made,
> >>> and the surrounding text on either side of the match.
> >>>
> >>> As well, would this be a complicated program, or would it be usable by
> >>> novices coached in the process necessary to input the title file (A) and
> >>> direct the search to the relevant folder containing the PDF files (B)?
> >>>
> >>> I eagerly await (hopefully) an affirmative answer.
> >>>
> >>> Cheers!
> >>
> >> --
> >> *Xavier Morera*
> >> Entrepreneur | Author & Trainer | Consultant | Developer & Scrum Master
> >> www.xaviermorera.com
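
For readers wondering what the task in the original question involves, here is a rough sketch of the matching step only. It is not Lucene or Solr code: it uses Python's standard-library difflib as a stand-in for a real fuzzy search, and it assumes the text of each PDF has already been extracted page by page with some other tool. The function names, the 0.85 threshold and the input layout (file name mapped to a list of page strings) are all illustrative assumptions, and this brute-force approach would likely be far too slow for 100k files, which is where a search index comes in.

```python
import csv
import io
from difflib import SequenceMatcher


def find_fuzzy_matches(titles, pages_by_file, threshold=0.85, context_words=5):
    """Return one row per (title, file, page) whose best window clears the threshold.

    titles:        list of title strings (list A in the question)
    pages_by_file: hypothetical input, {file name: [page 1 text, page 2 text, ...]}
    """
    rows = []
    for title in titles:
        title_words = title.lower().split()
        n = len(title_words)
        target = " ".join(title_words)
        for filename, pages in pages_by_file.items():
            for page_number, text in enumerate(pages, start=1):
                words = text.split()
                best_score, best_start = 0.0, 0
                # Slide a window of n words across the page and score each one.
                for start in range(max(len(words) - n + 1, 1)):
                    window = " ".join(words[start:start + n]).lower()
                    score = SequenceMatcher(None, target, window).ratio()
                    if score > best_score:
                        best_score, best_start = score, start
                if best_score >= threshold:
                    # Keep a few words on either side of the best window.
                    lo = max(best_start - context_words, 0)
                    hi = best_start + n + context_words
                    context = " ".join(words[lo:hi])
                    rows.append(
                        (title, filename, page_number, round(best_score, 2), context)
                    )
    return rows


def rows_to_csv(rows):
    """Render the matches as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["title", "file", "page", "score", "context"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample_titles = ["Annual Report on Coastal Erosion"]
    sample_pages = {
        "a.pdf": [
            "Nothing relevant here at all today.",
            "See the Annual Reprot on Coastal Erosion for details.",
        ]
    }
    print(rows_to_csv(find_fuzzy_matches(sample_titles, sample_pages)))
```

The sample input contains a deliberate typo ("Reprot") to show that a near match on page 2 is still reported. Lowering the threshold finds looser matches at the cost of more false positives.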