archiva-users mailing list archives

From "Stallard,David" <stall...@oclc.org>
Subject Re: Archiva 2.0.1: Unwanted scans running
Date Wed, 23 Jul 2014 13:28:48 GMT
Brett, we did do some tweaking to the cron schedules for both snapshots
and internal yesterday, that's probably what initiated the scan.  And I'm
guessing a directory scan of snapshots is sitting in the queue waiting for
the internal scan to finish.  We will probably bounce Archiva to stop
these scans and clear the queues.  Is there any harmful side effect to
bouncing during a scan?  I think we've done it before without impact.  As
an enhancement, an admin button to abort an in-progress scan would be
useful.

Thanks,
David


On 7/23/14, 12:59 AM, "Brett Porter" <brett@apache.org> wrote:

>From a quick look at the code, it looks like that scan will happen
>whenever the configuration for the repository is changed. Is that what
>happened for you?
>
>Not sure if that was intentional or not.
>
>- Brett
>
>On 23 Jul 2014, at 7:13 am, Stallard,David <stallard@oclc.org> wrote:
>
>> We have roughly 1.6 terabytes of data in our largest Archiva instance
>>and it grows rapidly.  Because of this amount of data, and/or perhaps
>>because of limitations of our current hardware (which we are working to
>>improve), doing a full directory scan degrades performance of Archiva as
>>a whole and it can take quite a long time to complete...48 hours or more.
>> 
>> Because of that, we don't do directory scans unless we feel it's
>>necessary to fix some unusual situation.  The index scans are usually
>>sufficient.
>> 
>> Today, a directory scan of the internal repository mysteriously started
>>up.  Although the System Status page doesn't say what type of scan is
>>running, I believe it's a directory scan because the Files Processed
>>number is equal to the New Files number.  This has bogged down the
>>system as expected and we're getting complaints from users about uploads
>>and downloads taking a long time.
>> 
>> Looking in the log to try and find how this scan was started, I found
>>the following line:
>> 
>> 2014-07-22 11:09:26,770 [pool-5-thread-1] INFO org.apache.archiva.scheduler.repository.ArchivaRepositoryScanningTaskExecutor [] - Executing task from queue with job name: RepositoryTask [repositoryId=internal, resourceFile=null, scanAll=true, updateRelatedArtifacts=false]
>> 
>> This seems to indicate that either the scheduler kicked it off, or at
>>some point in the past a directory scan was added to the queue and it is
>>just now being processed.  I don't know if the latter is even possible
>>or not...I thought that the stuff in the queue was individual artifacts
>>that had been marked by scans for later processing.
>> 
>> Our Cron Expression for the internal repository is the following, which
>>should not have kicked off a scan at the time shown above.  However,
>>even if it did, I believe that the Cron Expression usually kicks off
>>index scans rather than directory scans?
>> 
>> 0 0 19 * * ?
>> 
>> So, two questions:
>> 
>> 
>>  1.  Any idea why this directory scan might have been started?
>>  2.  Is there any way to stop a scan after it has started?  I'm
>>assuming a bounce of Archiva would stop it, but an option that didn't
>>incur downtime would be preferable.
>> 
>> Thanks,
>> David
>
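
For anyone searching their own logs for the same kind of entry, the sketch below pulls the key=value fields out of a "RepositoryTask [...]" line like the one quoted above. The helper name parse_repository_task is hypothetical and not part of Archiva, and the thread does not confirm what scanAll=true means, so treat the extracted fields as raw data rather than proof of a scan type.

```python
import re

# The log entry quoted in the thread, joined back into one line.
LOG_LINE = (
    "2014-07-22 11:09:26,770 [pool-5-thread-1] INFO "
    "org.apache.archiva.scheduler.repository.ArchivaRepositoryScanningTaskExecutor [] - "
    "Executing task from queue with job name: RepositoryTask "
    "[repositoryId=internal, resourceFile=null, scanAll=true, "
    "updateRelatedArtifacts=false]"
)


def parse_repository_task(line):
    """Return the key=value fields of a 'RepositoryTask [...]' entry, or None.

    Hypothetical helper for illustration; it only understands the flat
    'key=value, key=value' layout seen in the quoted line.
    """
    match = re.search(r"RepositoryTask\s*\[([^\]]*)\]", line)
    if match is None:
        return None
    fields = {}
    for pair in match.group(1).split(","):
        key, _, value = pair.strip().partition("=")
        fields[key] = value
    return fields


task = parse_repository_task(LOG_LINE)
print(task)
```

Values come back as strings ("true", "null"), exactly as they appear in the log.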



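The cron expression in the quoted message, 0 0 19 * * ?, can be checked against the log timestamp with a small sketch. It assumes the six-field Quartz layout (seconds, minutes, hours, day of month, month, day of week), which the trailing ? suggests, and it deliberately handles only plain numbers, * and ?. The function name matches_simple_cron is hypothetical. Note also that the log timestamp is when the task was taken from the queue, which is not necessarily when it was queued.

```python
from datetime import datetime


def matches_simple_cron(expression, moment):
    """Check a datetime against a six-field, Quartz-style cron expression.

    Simplified on purpose: each field must be a plain integer, '*' or '?'.
    Ranges, lists, steps and names (e.g. MON-FRI) raise ValueError.
    """
    fields = expression.split()
    if len(fields) != 6:
        raise ValueError("expected six space-separated fields")
    # Quartz numbers days 1 (Sunday) to 7 (Saturday);
    # Python's isoweekday() is 1 (Monday) to 7 (Sunday).
    quartz_day_of_week = moment.isoweekday() % 7 + 1
    actual = (
        moment.second,
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        quartz_day_of_week,
    )
    for field, value in zip(fields, actual):
        if field in ("*", "?"):
            continue  # wildcard: any value is accepted
        if int(field) != value:
            return False
    return True


CRON = "0 0 19 * * ?"  # expression quoted in the thread

# Time at which the log shows the task being executed.
scan_started = datetime(2014, 7, 22, 11, 9, 26)
fired_by_cron = matches_simple_cron(CRON, scan_started)

# The slot the expression actually describes: 19:00:00 every day.
daily_slot = matches_simple_cron(CRON, datetime(2014, 7, 22, 19, 0, 0))

print(fired_by_cron, daily_slot)
```

Under these assumptions the expression fires once a day at 19:00:00 server time, so the 11:09:26 execution does not line up with it, consistent with the scan having been started by something other than the schedule.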