asterixdb-notifications mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ASTERIXDB-1969) org.apache.asterix.common.exceptions.ACIDException: Failed to read a checkpoint file
Date Sun, 09 Jul 2017 20:55:02 GMT

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079717#comment-16079717 ]

ASF subversion and git services commented on ASTERIXDB-1969:
------------------------------------------------------------

Commit c80f53b6d1062fa60540f00e1d88e6eddb756098 in asterixdb's branch refs/heads/master from
[~mhubail]
[ https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;h=c80f53b ]

[ASTERIXDB-1969][STO] Ignore corrupted checkpoints

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
- Ignore and delete corrupted checkpoint files.
- In case all checkpoint files are corrupted, force full recovery.
- Add test to check the new behavior of CheckpointManager.
- Remove unused recovery manager method.

Change-Id: Ied8a188501b63a0d339e6391cac684e3378f4c37
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1871
Sonar-Qube: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Reviewed-by: abdullah alamoudi <bamousaa@gmail.com>
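
The behavior listed under "Details" above can be illustrated with a minimal sketch. This is not the code from the commit: the class name CheckpointScanSketch, the methods isReadable, timestampOf and latestValid, and the brace check that stands in for real JSON parsing are all assumptions made for illustration. It only assumes, based on the directory listing in the issue, that checkpoint files are named checkpoint_<timestamp>.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical sketch, not the AsterixDB CheckpointManager.
public class CheckpointScanSketch {

    // Assumed naming scheme, taken from the directory listing in the issue.
    static final String PREFIX = "checkpoint_";

    // Stand-in for real JSON parsing: an empty or truncated file fails this check.
    static boolean isReadable(Path file) {
        try {
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return content.startsWith("{") && content.endsWith("}");
        } catch (IOException e) {
            return false;
        }
    }

    // Extracts the numeric suffix of a checkpoint file name.
    static long timestampOf(Path file) {
        return Long.parseLong(file.getFileName().toString().substring(PREFIX.length()));
    }

    // Returns the newest readable checkpoint and deletes every unreadable one
    // that is newer than it. An empty result means no usable checkpoint exists,
    // which the caller would treat as "run a full recovery".
    static Optional<Path> latestValid(Path dir) throws IOException {
        List<Path> candidates = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(p -> p.getFileName().toString().startsWith(PREFIX))
                 .forEach(candidates::add);
        }
        // Newest first.
        candidates.sort((a, b) -> Long.compare(timestampOf(b), timestampOf(a)));
        for (Path candidate : candidates) {
            if (isReadable(candidate)) {
                return Optional.of(candidate);
            }
            Files.delete(candidate); // ignore and delete the corrupted file
        }
        return Optional.empty();
    }
}
```

Applied to a directory like the one in the issue below, this sketch would delete the zero-byte checkpoint_1498733111492 and fall back to checkpoint_1498731014520, provided that older file passes the readability check.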


> org.apache.asterix.common.exceptions.ACIDException: Failed to read a checkpoint file
> ------------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1969
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1969
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: STO - Storage
>            Reporter: Taewoo Kim
>            Assignee: Murtadha Hubail
>
> I have a four-node cluster.
> When I started the instance, I saw the following exception in the log of one NC (the first one).
> {code}
> Jun 30, 2017 6:04:15 PM org.apache.hyracks.control.nc.NodeControllerService start
> INFO: Starting NodeControllerService
> Jun 30, 2017 6:04:15 PM org.apache.hyracks.control.nc.NodeControllerService start
> INFO: Setting uncaught exception handler org.apache.hyracks.api.lifecycle.LifeCycleComponentManager@6043cd28
> Jun 30, 2017 6:04:15 PM org.apache.asterix.hyracks.bootstrap.NCApplication start
> INFO: Starting Asterix node controller: 1
> Jun 30, 2017 6:04:16 PM org.apache.asterix.transaction.management.service.recovery.AbstractCheckpointManager getLatest
> WARNING: Reading snapshot file: /lv_scratch/scratch/taewok2/4node/txnlog/checkpoint_1498733111492
> Jun 30, 2017 6:04:16 PM org.apache.hyracks.control.nc.NCDriver main
> SEVERE: Exiting NCDriver due to exception
> org.apache.asterix.common.exceptions.ACIDException: Failed to read a checkpoint file
> 	at org.apache.asterix.transaction.management.service.recovery.AbstractCheckpointManager.getLatest(AbstractCheckpointManager.java:95)
> 	at org.apache.asterix.app.nc.TransactionSubsystem.<init>(TransactionSubsystem.java:78)
> 	at org.apache.asterix.app.nc.NCAppRuntimeContext.initialize(NCAppRuntimeContext.java:189)
> 	at org.apache.asterix.hyracks.bootstrap.NCApplication.start(NCApplication.java:108)
> 	at org.apache.hyracks.control.nc.NodeControllerService.startApplication(NodeControllerService.java:337)
> 	at org.apache.hyracks.control.nc.NodeControllerService.start(NodeControllerService.java:275)
> 	at org.apache.hyracks.control.nc.NCDriver.main(NCDriver.java:47)
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
>  at [Source: ; line: 1, column: 0]
> 	at org.apache.asterix.common.transactions.Checkpoint.fromJson(Checkpoint.java:131)
> 	at org.apache.asterix.transaction.management.service.recovery.AbstractCheckpointManager.getLatest(AbstractCheckpointManager.java:93)
> 	... 6 more
> Caused by: com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
>  at [Source: ; line: 1, column: 0]
> 	at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:270)
> 	at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3838)
> 	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3783)
> 	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2842)
> 	at org.apache.asterix.common.transactions.Checkpoint.fromJson(Checkpoint.java:129)
> 	... 7 more
> {code}
> The contents of the transaction log directory are:
> {code}
> total 8.0K
> -rw-r--r-- 1 taewok2 grad 120 Jun 29 03:10 checkpoint_1498731014520
> -rw-r--r-- 1 taewok2 grad   0 Jun 29 03:45 checkpoint_1498733111492
> -rw-r--r-- 1 taewok2 grad 288 Jun 29 03:45 transaction_log_144
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
