archiva-users mailing list archives

From Olivier Lamy <ol...@apache.org>
Subject Re: Archiva 2.0.1: Unwanted scans running
Date Thu, 24 Jul 2014 00:09:58 GMT
On 23 July 2014 23:28, Stallard,David <stallard@oclc.org> wrote:
> Brett, we did do some tweaking to the cron schedules for both snapshots
> and internal yesterday, that's probably what initiated the scan.  And I'm
> guessing a directory scan of snapshots is sitting in the queue waiting for
> the internal scan to finish.  We will probably bounce Archiva to stop
> these scans and clear the queues.  Is there any harmful side effect to
> bouncing during a scan?  I think we've done it before without impact.  As
> an enhancement, an admin button to abort an in-progress scan would be
> useful.
>

I don't see any weird side effect.
Good idea regarding the button to abort. Can you create a jira issue for that?

Thanks
Olivier

> Thanks,
> David
>
>
> On 7/23/14, 12:59 AM, "Brett Porter" <brett@apache.org> wrote:
>
>>From a quick look at the code, it looks like that scan will happen
>>whenever the configuration for the repository is changed. Is that what
>>happened for you?
>>
>>Not sure if that was intentional or not.
>>
>>- Brett
>>
>>On 23 Jul 2014, at 7:13 am, Stallard,David <stallard@oclc.org> wrote:
>>
>>> We have roughly 1.6 terabytes of data in our largest Archiva instance
>>>and it grows rapidly.  Because of this amount of data, and/or perhaps
>>>because of limitations of our current hardware (which we are working to
>>>improve), doing a full directory scan degrades performance of Archiva as
>>>a whole and it can take quite a long time to complete...48 hours or more.
>>>
>>> Because of that, we don't do directory scans unless we feel it's
>>>necessary to fix some unusual situation.  The index scans are usually
>>>sufficient.
>>>
>>> Today, a directory scan of the internal repository mysteriously started
>>>up.  Although the System Status page doesn't say what type of scan is
>>>running, I believe it's a directory scan because the Files Processed
>>>number is equal to the New Files number.  This has bogged down the
>>>system as expected and we're getting complaints from users about uploads
>>>and downloads taking a long time.
>>>
>>> Looking in the log to try and find how this scan was started, I found
>>>the following line:
>>>
>>> 2014-07-22 11:09:26,770 [pool-5-thread-1] INFO
>>>org.apache.archiva.scheduler.repository.ArchivaRepositoryScanningTaskExecutor
>>>[] - Executing task from queue with job name: RepositoryTask
>>>[repositoryId=internal, resourceFile=null, scanAll=true,
>>>updateRelatedArtifacts=false]
>>>
>>> This seems to indicate that either the scheduler kicked it off, or at
>>>some point in the past a directory scan was added to the queue and it is
>>>just now being processed.  I don't know if the latter is even possible
>>>or not...I thought that the stuff in the queue was individual artifacts
>>>that had been marked by scans for later processing.
>>>
>>> Our Cron Expression for the internal repository is the following, which
>>>should not have kicked off a scan at the time shown above.  However,
>>>even if it did, I believe that the Cron Expression usually kicks off
>>>index scans rather than directory scans?
>>>
>>> 0 0 19 * * ?
>>>
>>> So, two questions:
>>>
>>>
>>>  1.  Any idea why this directory scan might have been started?
>>>  2.  Is there any way to stop a scan after it has started?  I'm
>>>assuming a bounce of Archiva would stop it, but an option that didn't
>>>incur downtime would be preferable.
>>>
>>> Thanks,
>>> David
>>
>
>



-- 
Olivier Lamy
Ecetera: http://ecetera.com.au
http://twitter.com/olamy | http://linkedin.com/in/olamy
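
The check David describes in the thread (reading the RepositoryTask log line and comparing its timestamp with the repository's cron expression) can be sketched in a few lines of Python. This is only an illustration, not part of Archiva: the function names are hypothetical, the regular expression is inferred from the single log line quoted above, and the cron check assumes the Quartz-style field order (seconds, minutes, hours, day-of-month, month, day-of-week) and handles only a plain daily schedule such as `0 0 19 * * ?`.

```python
import re
from datetime import datetime

# Log line quoted in the thread, rejoined onto one logical line.
LOG_LINE = (
    "2014-07-22 11:09:26,770 [pool-5-thread-1] INFO "
    "org.apache.archiva.scheduler.repository."
    "ArchivaRepositoryScanningTaskExecutor [] - "
    "Executing task from queue with job name: RepositoryTask "
    "[repositoryId=internal, resourceFile=null, scanAll=true, "
    "updateRelatedArtifacts=false]"
)

# Assumption: the layout is inferred from that one line only.
TASK_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r".*RepositoryTask \[(.*)\]\s*$"
)


def parse_task(line):
    """Return (timestamp, fields) for a RepositoryTask line, else None."""
    match = TASK_RE.match(line)
    if match is None:
        return None
    when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    fields = dict(pair.split("=", 1) for pair in match.group(2).split(", "))
    return when, fields


def matches_daily_cron(when, cron):
    """True if `when` is a fire time of a simple daily cron expression.

    Hypothetical helper: supports numeric seconds/minutes/hours only,
    with '*' or '?' in the day and month fields.
    """
    sec, minute, hour, dom, month, dow = cron.split()
    if dom not in ("*", "?") or month != "*" or dow not in ("*", "?"):
        raise ValueError("only plain daily schedules are supported here")
    return (when.hour, when.minute, when.second) == (
        int(hour), int(minute), int(sec))


if __name__ == "__main__":
    when, fields = parse_task(LOG_LINE)
    print(when, fields["repositoryId"], "scanAll=" + fields["scanAll"])
    print("fired on schedule:", matches_daily_cron(when, "0 0 19 * * ?"))
```

Under these assumptions, the task at 11:09:26 does not fall on the 19:00 daily schedule, which fits Brett's suggestion that a configuration change, rather than the scheduler, queued the scan.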
