lucene-general mailing list archives

From Reda Kouba <redateksys...@gmail.com>
Subject Re: Feasability
Date Thu, 01 Dec 2016 03:37:03 GMT
Someone with solid programming experience and a good knowledge of Lucene and information retrieval (IR).

best,
reda
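
For anyone weighing up the task described in the quoted message below, the core matching step can be sketched roughly as follows. This is a minimal illustration in plain Python using the standard library's difflib, not Lucene, and it is not what anyone in this thread proposed. It assumes the PDF text has already been extracted page by page with a separate tool, and it writes CSV (which Excel can open) rather than a native Excel file. The function names, the 0.85 threshold, and the five-word context window are all hypothetical choices for illustration. Comparing every title against every page this way would likely be far too slow for 100k files, which is where an index such as Lucene comes in.

```python
import csv
import io
from difflib import SequenceMatcher


def best_window(title, page_text):
    """Return (score, start, end) for the run of words most similar to title."""
    title_words = title.lower().split()
    page_words = page_text.split()
    n = len(title_words)
    target = " ".join(title_words)
    best = (0.0, 0, 0)
    # Slide a window the same length as the title across the page.
    for start in range(0, max(1, len(page_words) - n + 1)):
        window = " ".join(page_words[start:start + n]).lower()
        score = SequenceMatcher(None, target, window).ratio()
        if score > best[0]:
            best = (score, start, start + n)
    return best


def find_fuzzy_matches(titles, pages, threshold=0.85, context_words=5):
    """pages is an iterable of (file_name, page_number, text) tuples.

    Assumption: the text was extracted from the PDFs beforehand.
    """
    rows = []
    for file_name, page_number, text in pages:
        words = text.split()
        for title in titles:
            score, start, end = best_window(title, text)
            if score >= threshold:
                before = " ".join(words[max(0, start - context_words):start])
                after = " ".join(words[end:end + context_words])
                rows.append([title, file_name, page_number,
                             round(score, 2), before, after])
    return rows


def to_csv(rows):
    """Render the matches as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["title", "file", "page", "score",
                     "text_before", "text_after"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    titles = ["A Study of Fuzzy Matching in Large Archives"]
    pages = [
        ("a.pdf", 3, "As noted in A Study of Fuzy Matching in Large "
                     "Archives the results vary widely"),
        ("b.pdf", 1, "Nothing relevant appears on this page at all"),
    ]
    print(to_csv(find_fuzzy_matches(titles, pages)))
```

The sample data at the bottom contains a misspelled title ("Fuzy") to show that an inexact occurrence is still reported, together with its file name, page number, and the words on either side.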

> On 1 Dec. 2016, at 14:33, Chris Manu <chrismanu90@hotmail.com> wrote:
> 
> Thank you for responding. So, theoretically, I would need to hire someone with Apache
> programming experience to do this correctly (given that I know nothing about programming)? What
> type of experience should I look for?
> 
> 
> ________________________________
> From: Xavier Morera <xavier@familiamorera.com>
> Sent: December 1, 2016 2:23 AM
> To: general@lucene.apache.org
> Subject: Re: Feasability
> 
> The answer is yes, but you would need to do some programming and
> configuring.
> 
> On Wed, Nov 30, 2016 at 7:54 PM, Chris Manu <chrismanu90@hotmail.com> wrote:
> 
>> Hello,
>> 
>> 
>> I want to start off by saying that I am not a programmer...and have very
>> little knowledge in this area.
>> 
>> 
>> What I would like to know is whether Apache would be capable of doing the
>> following:
>> 
>> Take an extensive list (A) of strings of unique words (these are titles,
>> anywhere from 4 to 30 words long) saved in either an Excel worksheet or a
>> text file, and search for instances (B) where these can be found in PDF
>> files saved on a hard drive (over 100k files). The search would need to
>> use fuzzy logic rather than exact matching, and the output would be an
>> Excel file listing the unique string found (A), the name of the file in
>> which the match was made (B), the page number where the match was made, and
>> the surrounding text on either side of the match. As well, would this be a
>> complicated program, or one usable by novices coached in the process
>> necessary to input the title file (A) and direct the search to the relevant
>> folder containing the PDF files (B)?
>> 
>> 
>> I eagerly await (hopefully) an affirmative answer.
>> 
>> 
>> Cheers!
>> 
>> 
> 
> 
> --
> 
> *Xavier Morera*
> 
> Entrepreneur | Author & Trainer | Consultant | Developer & Scrum Master
> 
> *www.xaviermorera.com <http://www.xaviermorera.com/>*
> 
> office:  (305) 600-4919
> 
> cel:     +506 8849-8866
> 
> skype: xmorera
> Twitter <https://twitter.com/xmorera> | LinkedIn <https://www.linkedin.com/in/xmorera> | Pluralsight Author <http://www.pluralsight.com/author/xavier-morera>

