lucene-solr-user mailing list archives

From John Bickerstaff <j...@johnbickerstaff.com>
Subject Re: Archiving documents
Date Thu, 29 Sep 2016 21:28:10 GMT
I'm not the expert, but I think you would need an external process to
handle this.  Solr itself doesn't seem built to use its own collection
data to act on collection data (I'd love to be wrong about that).

So - barring any corrections from the committers, I imagine you'd need
to write some software that queries your collection for documents whose
last_modified_date is older than your cutoff.  It would then add those
document(s) to the "archive" collection, either using the returned Solr
document data (if you stored everything) or by re-querying the data from
the original source based on an id.  Once you were sure all was well with
this process, you could issue a command to delete all the docs with a
last_modified_date older than the cutoff from the main collection.
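
To make that concrete, here is a rough Python sketch of such an external
process. Treat it as an illustration, not a tested tool. It assumes every
field is stored, the uniqueKey field is named "id", the collections are
called "main" and "archive", and Solr answers at the usual
/solr/<collection>/select and /solr/<collection>/update handlers. The
function names and the "send" hook are made up for this example. It pages
with cursorMark and uses an absolute cutoff timestamp so that the copy
and the later delete match exactly the same documents.

```python
import json
import urllib.parse
import urllib.request


def http_json(url, payload=None):
    """GET the URL, or POST payload as JSON when one is given."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def cutoff_query(cutoff):
    # cutoff should be an absolute timestamp such as 2015-09-29T00:00:00Z
    return "last_modified_date:[* TO %s]" % cutoff


def copy_to_archive(base_url, main, archive, cutoff, send=http_json, rows=500):
    """Copy every doc matching the cutoff from main to archive."""
    cursor, copied = "*", 0
    while True:
        params = urllib.parse.urlencode({
            "q": cutoff_query(cutoff), "rows": rows, "wt": "json",
            "sort": "id asc",        # cursorMark needs a sort on the uniqueKey
            "cursorMark": cursor,
        })
        result = send("%s/%s/select?%s" % (base_url, main, params))
        docs = result["response"]["docs"]
        if docs:
            for doc in docs:
                # _version_ belongs to the source collection; drop it so the
                # add is not treated as an optimistic-concurrency update
                doc.pop("_version_", None)
            send("%s/%s/update?commit=true" % (base_url, archive), docs)
            copied += len(docs)
        next_cursor = result.get("nextCursorMark", cursor)
        if next_cursor == cursor:    # unchanged cursor means no more results
            return copied
        cursor = next_cursor


def delete_archived(base_url, main, cutoff, send=http_json):
    """Run only after verifying the archive: delete the same docs from main."""
    body = {"delete": {"query": cutoff_query(cutoff)}}
    return send("%s/%s/update?commit=true" % (base_url, main), body)
```

You would call copy_to_archive("http://localhost:8983/solr", "main",
"archive", "2015-09-29T00:00:00Z"), check the archive collection, and
only then call delete_archived with the same cutoff. If the schema has
copyField targets that are also stored, you would probably want to strip
those fields before re-adding the documents.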

If there's a built-in way to accomplish this - or if others have already
thought this through extensively, I'm certainly interested in hearing about
it.

Good luck!



On Thu, Sep 29, 2016 at 6:55 AM, Vasu Y <vyal2k@gmail.com> wrote:

> Hi,
>  We would like to archive documents based on some criteria (like those that
> were not modified for more than a year OR are least used) in order to
> reduce storage requirements.
> I would like to hear some of the best practices followed.
>
> How about having a main collection and optionally an archive collection
> (or one or more archive collections?) to which we move documents (at
> regular intervals) from the main collection based on some criteria (least
> used or modified date etc.) and provide a flag during search whether to
> include archived documents in search or not?
>
> Thanks,
> Vasu
>
