lucenenet-user mailing list archives

From "William Morgenweck" <morgenw...@gmail.com>
Subject Jumping back in.
Date Sun, 28 Aug 2016 14:01:32 GMT
To all,

 

I'm looking for a little direction.  I've been a member of the Lucene.Net
group for many years (almost 15+?) but never used it at the level that I'm
hoping to now.  I can still remember the caricature of Dan where the code was
originally kept.  In the past I've indexed some data from an MS SQL database
and used the search function just to keep my fingers in it.  I work for a
cancer center, and now we want to build a non-profit, commercial-quality /
enterprise-level application (meaning bulletproof) that indexes NIH grant
announcements such as
http://grants.nih.gov/grants/guide/pa-files/PAR-16-393.html.  There will be
over 1,000 pages/files at a time, each with its own URL.  I want either to
create a web bot, or to create a process that I pass a URL so it indexes the
page, or to download, save, and process each file locally.  I already have a
process that saves the file locally, and another that takes many of the
searchable elements of the document and saves them as a JSON object locally.
I'm looking for opinions about which direction I should take, and maybe a
pointer to some sample code; I'm even open to using a NoSQL database if that
will be the best long-term solution.  Also, should I use 3.0.3 RC2 or the
4.8 beta?  I don't have a lot of time for trial-and-error checking in each
area and was hoping for input from some of the readers, since you folks have
enormously more expertise with Lucene.Net than I do.  Hopefully I can give it
a little exposure to governmental agencies and cancer centers across the
United States.
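For what it's worth, here is a minimal sketch of the kind of indexing step described above, written against the Lucene.Net 3.0.3 API (the field names, directory path, and example values are placeholders I've made up, not anything from a real application):

```csharp
using System.IO;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Version = Lucene.Net.Util.Version;

class GrantIndexer
{
    static void Main()
    {
        // On-disk index directory (placeholder path).
        var dir = FSDirectory.Open(new DirectoryInfo("grant-index"));
        var analyzer = new StandardAnalyzer(Version.LUCENE_30);

        using (var writer = new IndexWriter(dir, analyzer,
                                            IndexWriter.MaxFieldLength.UNLIMITED))
        {
            // One document per grant announcement page; in practice the
            // title/body values would come from the downloaded file or
            // the locally saved JSON object.
            var doc = new Document();
            doc.Add(new Field("url",
                "http://grants.nih.gov/grants/guide/pa-files/PAR-16-393.html",
                Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.Add(new Field("title", "example title text",
                Field.Store.YES, Field.Index.ANALYZED));
            doc.Add(new Field("body", "full extracted page text goes here",
                Field.Store.NO, Field.Index.ANALYZED));

            writer.AddDocument(doc);
            writer.Commit();
        }
    }
}
```

Storing the URL unanalyzed keeps it usable as an exact lookup key, while the analyzed body field is what full-text queries would run against.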

 

If this question should be directed more to Stack Overflow, just let me know.

 

Thanks

 

Bill

 

 

