[CONTROLLER-1297] Clustering: Journal recovery error on restart Created: 07/May/15  Updated: 25/Aug/15  Resolved: 25/Aug/15

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Tom Pantelis Assignee: Tom Pantelis
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 3154
Priority: High

 Description   

The following error was seen after a controller restart (Helium SR2):

java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Metadata not available for modification [NodeModification [identifier=(com:brocade:neutron:odl?revision=2014-10-02)subnet, modificationType=SUBTREE_MODIFIED, childModification={(com:brocade:neutron:odl?revision=2014-10-02)subnet[{(com:brocade:neutron:odl?revision=2014-10-02)id=ace19864-b874-47a9-9cef-b02afd52f37b}]=NodeModification [identifier=(com:brocade:neutron:odl?revision=2014-10-02)subnet[{(com:brocade:neutron:odl?revision=2014-10-02)id=ace19864-b874-47a9-9cef-b02afd52f37b}], modificationType=DELETE, childModification={}]}]]
at java.util.concurrent.FutureTask.report(FutureTask.java:122)[:1.7.0_76]
at java.util.concurrent.FutureTask.get(FutureTask.java:188)[:1.7.0_76]
at org.opendaylight.controller.cluster.datastore.Shard.syncCommitTransaction(Shard.java:586)[301:org.opendaylight.controller.sal-distributed-datastore:1.1.2.Helium-SR2]
at org.opendaylight.controller.cluster.datastore.Shard.onRecoveryComplete(Shard.java:729)[301:org.opendaylight.controller.sal-distributed-datastore:1.1.2.Helium-SR2]
at org.opendaylight.controller.cluster.raft.RaftActor.onRecoveryCompletedMessage(RaftActor.java:257)[294:org.opendaylight.controller.sal-akka-raft:1.1.2.Helium-SR2]
at org.opendaylight.controller.cluster.raft.RaftActor.handleRecover(RaftActor.java:160)[294:org.opendaylight.controller.sal-akka-raft:1.1.2.Helium-SR2]
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedPersistentActor.onReceiveRecover(AbstractUntypedPersistentActor.java:52)[293:org.opendaylight.controller.sal-clustering-commons:1.1.2.Helium-SR2]
at org.opendaylight.controller.cluster.datastore.Shard.onReceiveRecover(Shard.java:237)[301:org.opendaylight.controller.sal-distributed-datastore:1.1.2.Helium-SR2]

The modification is for a node delete and it seems "Metadata not available ..." indicates the node doesn't exist. If that's true, how did this modification entry get into the persisted journal? Transaction modifications should only get into the journal if the transaction succeeds.

The ramification of this failure is that the rest of the data failed to recover as well. This is because we batch journal entries 5000 at a time into a single transaction. Batching is faster, but the side effect is that one failed modification fails every entry in the same batch.
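To illustrate the blast radius of batching, here is a minimal, self-contained sketch. It is a toy model and not the Shard recovery code: the class RecoveryBatchSketch, the Entry type, and the recover method are hypothetical names, the "data store" is a plain map, and a delete of a missing key is assumed to fail the way the "Metadata not available" error does above. Each batch is treated as one all-or-nothing transaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RecoveryBatchSketch {

    /** Hypothetical journal entry: a put when value is non-null, a delete when value is null. */
    static final class Entry {
        final String key;
        final String value;

        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Applies one entry. Deleting a key that is not present fails (assumption of this toy model). */
    static void apply(Map<String, String> store, Entry entry) {
        if (entry.value != null) {
            store.put(entry.key, entry.value);
        } else if (store.remove(entry.key) == null) {
            throw new IllegalArgumentException("Metadata not available for " + entry.key);
        }
    }

    /** Replays the journal in batches; each batch commits fully or is discarded fully. */
    static Map<String, String> recover(List<Entry> journal, int batchSize) {
        Map<String, String> store = new HashMap<>();
        for (int start = 0; start < journal.size(); start += batchSize) {
            int end = Math.min(start + batchSize, journal.size());
            // Work on a copy so a failure leaves the committed state untouched.
            Map<String, String> working = new HashMap<>(store);
            try {
                for (Entry entry : journal.subList(start, end)) {
                    apply(working, entry);
                }
                store = working; // commit the whole batch
            } catch (IllegalArgumentException e) {
                // One bad entry discards every entry in this batch.
            }
        }
        return store;
    }

    public static void main(String[] args) {
        List<Entry> journal = new ArrayList<>();
        journal.add(new Entry("a", "1"));
        journal.add(new Entry("missing", null)); // delete of a node that does not exist
        journal.add(new Entry("b", "2"));

        System.out.println("batch size 5000: " + recover(journal, 5000).size() + " entries recovered");
        System.out.println("batch size 1:    " + recover(journal, 1).size() + " entries recovered");
    }
}
```

In this model, a batch size of 5000 puts all three entries in one transaction, so the bad delete discards the valid puts as well, while a batch size of 1 loses only the bad entry. Note that the sketch assumes the entry is genuinely invalid; it only shows why a single failure takes the rest of the batch down with it.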

In addition, the failed entry remains in the RaftActor's in-memory journal, so in a 3-node cluster, if that node becomes the leader, it wipes out the data on the other nodes too. We need to prevent a corrupted journal (or a recovery failure) on one node from corrupting the whole cluster.



 Comments   
Comment by Tom Pantelis [ 08/May/15 ]

Interestingly, changing it to apply the journal modification entries one at a time, as was done when they were originally committed, alleviates the issue - no errors and all the data was recovered.

So it seems there's an issue when certain modification sequences are applied all at once rather than one by one. Theoretically, both ways should yield the same result.
I suspect this issue doesn't exist in Lithium, as the tree modification code was refactored and there were bug fixes. I will verify to make sure.

Comment by Moiz Raja [ 11/May/15 ]

Some of my colleagues have reported seeing this problem within a single transaction. I'll try to find the exact sequence and let you know. I think that could be the cause of this issue.

To summarize,

This works,

tx1.put(id, node);
tx1.submit();

tx2.merge(id, node);
tx2.submit();

This does not

tx3.put(id, node);
tx3.merge(id, node);
tx3.submit();

Comment by Tom Pantelis [ 22/May/15 ]

Submitted https://git.opendaylight.org/gerrit/#/c/20736/ to stable/helium to default the journal recovery batch size to 1.

Comment by Moiz Raja [ 25/Aug/15 ]

May be fixed. Appears to work in Lithium.
