[CONTROLLER-1297] Clustering: Journal recovery error on restart Created: 07/May/15 Updated: 25/Aug/15 Resolved: 25/Aug/15 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | clustering |
| Affects Version/s: | Helium |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Tom Pantelis | Assignee: | Tom Pantelis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Operating System: All |
| External issue ID: | 3154 |
| Priority: | High |
| Description |
|
The following error was seen after a controller restart (Helium SR2):

java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: Metadata not available for modification [NodeModification [identifier=(com:brocade:neutron:odl?revision=2014-10-02)subnet, modificationType=SUBTREE_MODIFIED, childModification={(com:brocade:neutron:odl?revision=2014-10-02)subnet[ {(com:brocade:neutron:odl?revision=2014-10-02)id=ace19864-b874-47a9-9cef-b02afd52f37b}]=NodeModification [identifier=(com:brocade:neutron:odl?revision=2014-10-02)subnet[ {(com:brocade:neutron:odl?revision=2014-10-02)id=ace19864-b874-47a9-9cef-b02afd52f37b}], modificationType=DELETE, childModification={}]}]]

The modification is a node delete, and "Metadata not available ..." appears to indicate that the node does not exist. If that is true, how did this modification entry get into the persisted journal? Transaction modifications should only reach the journal if the transaction succeeds.

The ramification of this failure is that the rest of the data failed to recover as well. This is because, during recovery, we batch journal entries 5000 at a time into a single transaction. Batching performs better, but the side effect is that one failed modification fails the entire batch. In addition, the failed entry remains in the RaftActor's in-memory journal, so in a 3-node cluster, if that node becomes the leader, it wipes out the other nodes too. We need to prevent a corrupted journal (or a recovery failure) on one node from corrupting the whole cluster. |
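A minimal sketch of the batching tradeoff described above, using hypothetical stand-in types rather than the actual shard recovery code: with a large batch size, one corrupt entry aborts every entry in its batch, while a batch size of 1 limits the loss to the corrupt entry itself.

import java.util.List;

final class RecoverySketch {

    /** Stand-in for a persisted journal entry (hypothetical type). */
    interface JournalEntry {
        void applyTo(Transaction tx); // may throw IllegalArgumentException for a corrupt entry
    }

    /** Stand-in for a datastore write transaction (hypothetical type). */
    interface Transaction {
        void commit();
        void abort();
    }

    /** Stand-in for whatever creates recovery transactions (hypothetical type). */
    interface TransactionFactory {
        Transaction newTransaction();
    }

    /** Batched recovery: one bad entry aborts every entry in its batch. */
    static void recover(List<JournalEntry> entries, TransactionFactory factory, int batchSize) {
        for (int start = 0; start < entries.size(); start += batchSize) {
            List<JournalEntry> batch = entries.subList(start, Math.min(start + batchSize, entries.size()));
            Transaction tx = factory.newTransaction();
            try {
                for (JournalEntry entry : batch) {
                    entry.applyTo(tx);
                }
                tx.commit();
            } catch (IllegalArgumentException e) {
                // With batchSize == 5000, up to 5000 good entries are lost here;
                // with batchSize == 1, only the corrupt entry itself is skipped.
                tx.abort();
            }
        }
    }
}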
| Comments |
| Comment by Tom Pantelis [ 08/May/15 ] |
|
Interestingly, changing recovery to apply the journal modification entries one at a time, as they were originally committed, alleviates the issue: no errors occur and all the data is recovered. So it seems there is an issue when certain modification sequences are applied all at once rather than one by one. In theory, both approaches should yield the same result. |
| Comment by Moiz Raja [ 11/May/15 ] |
|
Some of my colleagues have reported seeing this problem within a single transaction. I'll try to find the exact sequence and let you know. I think this problem could be happening for that reason. To summarize:

This works:
tx1.put(id, node);
tx2.merge(id, node);

This does not:
tx3.put(id, node); |
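The comment above does not spell out the complete failing sequence, so the following is only an assumed reading, sketched with the controller MD-SAL binding API: the same put/merge pair issued first as two separate transactions and then combined into a single transaction. The method names and class layout here are illustrative, not taken from the issue.

import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.md.sal.binding.api.WriteTransaction;
import org.opendaylight.controller.md.sal.common.api.data.LogicalDatastoreType;
import org.opendaylight.yangtools.yang.binding.DataObject;
import org.opendaylight.yangtools.yang.binding.InstanceIdentifier;

final class PutMergeSketch {

    /** Two separate transactions against the same path: reported to work. */
    static <T extends DataObject> void separateTransactions(DataBroker broker, InstanceIdentifier<T> id, T node) {
        WriteTransaction tx1 = broker.newWriteOnlyTransaction();
        tx1.put(LogicalDatastoreType.CONFIGURATION, id, node);
        tx1.submit();

        WriteTransaction tx2 = broker.newWriteOnlyTransaction();
        tx2.merge(LogicalDatastoreType.CONFIGURATION, id, node);
        tx2.submit();
    }

    /** The same modifications combined into one transaction: assumed to be the problematic case. */
    static <T extends DataObject> void singleTransaction(DataBroker broker, InstanceIdentifier<T> id, T node) {
        WriteTransaction tx3 = broker.newWriteOnlyTransaction();
        tx3.put(LogicalDatastoreType.CONFIGURATION, id, node);
        tx3.merge(LogicalDatastoreType.CONFIGURATION, id, node);
        tx3.submit();
    }
}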
| Comment by Tom Pantelis [ 22/May/15 ] |
|
Submitted https://git.opendaylight.org/gerrit/#/c/20736/ to stable/helium to default the journal recovery batch size to 1. |
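For reference, the recovery batch size is exposed as a datastore configuration setting in the distribution; the file and property name below are stated from memory and should be verified against the release in use:

# etc/org.opendaylight.controller.cluster.datastore.cfg (property name is an assumption; verify for your release)
shard-journal-recovery-log-batch-size=1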
| Comment by Moiz Raja [ 25/Aug/15 ] |
|
May be fixed. Appears to work in Lithium. |