nifi-users mailing list archives

From James McMahon <jsmcmah...@gmail.com>
Subject Re: Unable to restart nifi service
Date Tue, 12 Sep 2017 09:59:06 GMT
Mark, can one execute this thread dump command
bin/nifi.sh dump thread-dump.txt
while nifi continues to try and start up, or in cases where nifi is
actively running? Or does it result in termination of the running nifi
instance?
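
Once a dump file has been captured, the 'main' thread's stack can be pulled out of it with something like the sketch below. The helper name main_thread_stack is made up for illustration. It also assumes each thread entry in the dump starts with the quoted thread name and ends at a blank line, which may not hold for every dump format.

```shell
#!/bin/sh
# Hypothetical helper (not part of nifi): print the stack of the "main"
# thread from a thread dump file.
# Assumption: each thread entry starts with a line beginning with the quoted
# thread name and is terminated by a blank line.
main_thread_stack() {
    awk '
        /^"main"/        { printing = 1 }   # start of the main thread entry
        printing && /^$/ { exit }           # a blank line ends the entry
        printing         { print }
    ' "$1"
}

# Example, after capturing a dump with: bin/nifi.sh dump thread-dump.txt
# main_thread_stack thread-dump.txt
```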

Here is what I tried and what finally worked last night:
1. I deleted all prov.gz files from my provenance_repository.
2. When startup still failed, I deleted everything under my
provenance_repository.
3. It still failed, so I ensured nifi was entirely shut down and then
deleted all swap files from flowfile_repository/swap.
4. It still failed, and we continued to get GC out-of-memory errors in
nifi-app.log. We added memory to our nifi server and bumped -Xmx up to
24576m.
If anyone else reads this, let me emphasize that I am no nifi sysadmin
expert, and these steps are not without drawbacks re: provenance and flow.
I leave it to the experts to offer better guidance. All I can offer in my
defense is this: I tried a sequence of steps, progressing from low cost
to higher cost. When those didn't work, I fell back on brute force, waved
a white flag in surrender, and threw more memory at the problem
to *hopefully* allow it to get through startup. I had to get our
production workflows up and running again.
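
The four steps above can be sketched as a small set of shell functions. This is only a sketch under stated assumptions: the function names, the NIFI_HOME default, the repository directories sitting directly under NIFI_HOME, the *.prov.gz file pattern, and a java.arg.N=-Xmx... line in conf/bootstrap.conf all describe a default-style install and may not match a given server. Nothing is deleted unless a function is called, and nifi should be fully shut down first.

```shell
#!/bin/sh
# Sketch only. NIFI_HOME and the layout beneath it are assumptions.
NIFI_HOME="${NIFI_HOME:-/opt/nifi}"

# Step 1: delete only the prov.gz files from the provenance repository.
remove_prov_gz() {
    find "$NIFI_HOME/provenance_repository" -type f -name '*.prov.gz' \
        -exec rm -f {} +
}

# Step 2: delete everything (non-hidden) under the provenance repository,
# keeping the directory itself. All provenance history is lost.
clear_provenance_repo() {
    dir="$NIFI_HOME/provenance_repository"
    [ -d "$dir" ] || return 1
    rm -rf "$dir"/*
}

# Step 3: delete all swap files. Run only with nifi entirely shut down.
remove_swap_files() {
    find "$NIFI_HOME/flowfile_repository/swap" -type f -exec rm -f {} +
}

# Step 4: rewrite the -Xmx value in bootstrap.conf (default 24576m here,
# matching the value used above). Other lines are left untouched.
set_max_heap() {
    conf="$NIFI_HOME/conf/bootstrap.conf"
    heap="${1:-24576m}"
    sed "s/^\(java\.arg\.[0-9][0-9]*=-Xmx\).*/\1${heap}/" "$conf" > "$conf.tmp" &&
        mv "$conf.tmp" "$conf"
}
```

Calling the functions one at a time, and attempting a restart between each, mirrors the low-cost-to-high-cost progression described above.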

Thank you again to all for your advice. I'd welcome feedback. -Jim

On Mon, Sep 11, 2017 at 3:46 PM, Mark Payne <markap14@hotmail.com> wrote:

> Jim,
>
> Would recommend you grab a thread dump and see what the 'main' thread is
> doing.
> (kill -3 <nifi pid>) or (bin/nifi.sh dump thread-dump.txt) or (jstack
> <nifi pid>).
>
> That will tell you what it's doing.
>
> Thanks
> -Mark
>
>
> On Sep 11, 2017, at 2:39 PM, James McMahon <jsmcmahon3@gmail.com> wrote:
>
> Am getting this error in the nifi-app.log file on start attempt:
>
> ERROR [ Framework Task Thread Thread-3] .o.a.n.controller.tasks.ExpireFlowFiles
> Failed to expire FlowFiles due to java.lang.IllegalStateException: Cannot
> update repository until record recovery has been performed
>
>
>
> On Mon, Sep 11, 2017 at 2:27 PM, James McMahon <jsmcmahon3@gmail.com>
> wrote:
>
>> I have nearly 1000 prov.gz files in my provenance_repository. Could it be
>> that the system is taking a long time to restore flowfiles to state?
>>
>> On Mon, Sep 11, 2017 at 2:23 PM, James McMahon <jsmcmahon3@gmail.com>
>> wrote:
>>
>>> my oversight....sorry....0.7.x
>>>
>>> On Mon, Sep 11, 2017 at 2:22 PM, Joe Witt <joe.witt@gmail.com> wrote:
>>>
>>>> Version of nifi?
>>>>
>>>> On Mon, Sep 11, 2017 at 2:20 PM, James McMahon <jsmcmahon3@gmail.com>
>>>> wrote:
>>>> > About a half hour ago my Nifi UI hung, and in the log I found this
>>>> error:
>>>> >
>>>> > ERROR [Provenance Repository Rollover Thread-2] o.a.n.p.
>>>> > PersistentProvenanceRepository
>>>> > java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>> >
>>>> > I was unable to get the UI to respond.
>>>> >
>>>> > I went to my command line and executed   service nifi restart
>>>> >
>>>> > My Nifi log shows a last message of org.apache.nifi.BootstrapListener
>>>> > Successfully initiated connection with Bootstrap
>>>> >
>>>> > but I have been waiting for fifteen minutes and no further messages
>>>> are
>>>> > posted. I'm not sure whether this is simply taking long to review
>>>> provenance
>>>> > records, or hung.
>>>> >
>>>> > What can I do to get my production workflows back up and running?
>>>>  -Jim
>>>>
>>>
>>>
>>
>
>
