From: Joe Gresock
Date: Wed, 21 Nov 2018 18:04:00 +0000
Subject: Re: Recovering corrupt flowfile repo
To: dev@nifi.apache.org

Ok... I have an interesting update.
I was able to remotely debug the server with the corrupt ff repo. I set a
breakpoint at the line where the above exception was thrown, and then I
used Eclipse to manually "introspect" the following code:

    while (sentinelByte != 1) {
        sentinelByte = in.read();
    }

I had to do this several times, but after each "fast forwarding" of the
input stream, it read some actual WALI update records and was able to
successfully recover the repo.

At this point, the repo seems to be fixed, but I have no idea what effect my
shenanigans had on it.

On Wed, Nov 21, 2018 at 12:26 AM Joe Gresock wrote:

> Ok, I finally found the jar and got it working using java -cp instead of
> java -jar. However, I suspect this procedure might not fit my particular
> case, because it's looking for partition-* directories, and I don't have
> those. Perhaps the flowfile repo implementation I'm using is different
> from the one for which that tool (
> https://github.com/apache/nifi/blob/rel/nifi-1.6.0/nifi-toolkit/nifi-toolkit-flowfile-repo/src/main/java/org/apache/nifi/toolkit/repos/flowfile/RepairCorruptedFileEndings.java)
> applies:
>
> # My configuration
> nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
> nifi.flowfile.repository.wal.implementation=org.apache.nifi.wali.SequentialAccessWriteAheadLog
>
> So, here is the actual error I'm getting, in case anyone can help me work
> through it. I don't want to give up on the flowfile repo, because this was
> production data.
>
> 2018-11-20 20:07:53,901 INFO [main] o.a.n.wali.SequentialAccessWriteAheadLog Successfully recovered 249151 records and 10 swap files from Snapshot at /data/nifi/flowfile_repository/checkpoint with Max Transaction ID of 4233134189 in 8593 milliseconds.
> Now recovering records from 1 journal files
> 2018-11-20 20:07:53,907 INFO [main] o.a.nifi.wali.LengthDelimitedJournal Recovering records from journal /data/nifi/flowfile_repository/journals/4233134190.journal
> 2018-11-20 20:07:54,592 INFO [main] o.a.nifi.wali.LengthDelimitedJournal 6.78% of the way finished recovering journal /data/nifi/flowfile_repository/journals/4233134190.journal, having recovered 15845 updates
>
> ... skipping some lines
>
> 2018-11-20 20:08:05,363 INFO [main] o.a.nifi.wali.LengthDelimitedJournal 88.09% of the way finished recovering journal /data/nifi/flowfile_repository/journals/4233134190.journal, having recovered 322778 updates
> 2018-11-20 20:08:06,054 INFO [main] o.a.nifi.wali.LengthDelimitedJournal 94.89% of the way finished recovering journal /data/nifi/flowfile_repository/journals/4233134190.journal, having recovered 352084 updates
> 2018-11-20 20:08:06,576 ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow from cluster due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.io.IOException: Expected to read a Sentinel Byte of '1' but got a value of '64' instead
> org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.io.IOException: Expected to read a Sentinel Byte of '1' but got a value of '64' instead
>         at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:946)
>         at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:516)
>         at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:872)
>         at org.apache.nifi.NiFi.<init>(NiFi.java:157)
>         at org.apache.nifi.NiFi.<init>(NiFi.java:71)
>         at org.apache.nifi.NiFi.main(NiFi.java:292)
> Caused by: java.io.IOException: Expected to read a Sentinel Byte of '1' but got a value of '64' instead
>         at org.apache.nifi.repository.schema.SchemaRecordReader.readRecord(SchemaRecordReader.java:65)
>         at org.apache.nifi.controller.repository.SchemaRepositoryRecordSerde.deserializeRecord(SchemaRepositoryRecordSerde.java:124)
>         at org.apache.nifi.controller.repository.SchemaRepositoryRecordSerde.deserializeEdit(SchemaRepositoryRecordSerde.java:109)
>         at org.apache.nifi.controller.repository.SchemaRepositoryRecordSerde.deserializeEdit(SchemaRepositoryRecordSerde.java:46)
>         at org.apache.nifi.wali.LengthDelimitedJournal.recoverRecords(LengthDelimitedJournal.java:335)
>         at org.apache.nifi.wali.SequentialAccessWriteAheadLog.recoverRecords(SequentialAccessWriteAheadLog.java:198)
>         at org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.loadFlowFiles(WriteAheadFlowFileRepository.java:545)
>         at org.apache.nifi.controller.FlowController.initializeFlow(FlowController.java:746)
>         at org.apache.nifi.controller.StandardFlowService.initializeController(StandardFlowService.java:956)
>         at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:928)
>         ... 5 common frames omitted
> 2018-11-20 20:08:06,576 INFO [main] o.a.n.c.c.node.NodeClusterCoordinator ip-172-31-55-35.ec2.internal:8443 requested disconnection from cluster due to org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.io.IOException: Expected to read a Sentinel Byte of '1' but got a value of '64' instead
>
> Thanks,
> Joe
>
> On Tue, Nov 20, 2018 at 8:18 PM Joe Gresock wrote:
>
>> Hi,
>>
>> I'm trying to restore a corrupt flowfile repo using these instructions:
>> https://community.hortonworks.com/content/supportkb/149943/errorjavaioioexception-expected-to-read-a-sentinel.html
>>
>> I get basically the same error noted on that post.
>> However, the instructions reference a jar called
>>
>> nifi-toolkit-flowfile-repo-1.2.0-SNAPSHOT-jar-with-dependencies.jar
>>
>> I downloaded nifi-toolkit 1.6.0 (since that's the nifi version I'm using),
>> and I see a nifi-toolkit-flowanalyzer-1.6.0.jar, but nothing called
>> nifi-toolkit-flowfile-repo.
>>
>> Does anyone know where this tool lives now?
>>
>> Thanks,
>> Joe
>>
>> --
>> I know what it is to be in need, and I know what it is to have plenty. I
>> have learned the secret of being content in any and every situation,
>> whether well fed or hungry, whether living in plenty or in want. I can
>> do all this through him who gives me strength. *-Philippians 4:12-13*
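
For reference, the manual fast-forward described at the top of this message (evaluating the sentinel loop in the debugger) can be sketched as standalone Java. This is only an illustration under assumptions: `SentinelSkipper` and `skipToSentinel` are hypothetical names, not part of NiFi or WALI, and the sample bytes are made up. Unlike the loop evaluated in the debugger, the sketch also stops at end of stream, because `InputStream.read()` returns -1 there and a loop that only checks for 1 would never exit.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper for illustration only; not part of NiFi or WALI. */
final class SentinelSkipper {

    /**
     * Reads and discards bytes until one equal to {@code sentinel} has been consumed.
     *
     * @return the number of bytes discarded before the sentinel was found
     * @throws EOFException if the stream ends before the sentinel is seen
     */
    static long skipToSentinel(InputStream in, int sentinel) throws IOException {
        long skipped = 0;
        int b;
        while ((b = in.read()) != sentinel) {
            if (b == -1) {
                throw new EOFException("Stream ended after skipping " + skipped
                        + " bytes without finding sentinel " + sentinel);
            }
            skipped++;
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        // Made-up data: 64 and 7 stand in for the corrupt region, 1 is the
        // sentinel, and 42 is the first byte after it.
        InputStream in = new ByteArrayInputStream(new byte[] {64, 7, 1, 42});
        long skipped = skipToSentinel(in, 1);
        System.out.println("skipped=" + skipped + " next=" + in.read());
    }
}
```

A byte with value 1 can also occur inside corrupt data, so landing on one does not guarantee a real record boundary, which may be why the skip had to be repeated several times. Any update records inside the skipped bytes are not replayed.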