[CONTROLLER-1358] Clustering: Persisted log incorrect on follower after install snapshot Created: 03/Jun/15 Updated: 11/Jun/15 Resolved: 11/Jun/15 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | mdsal |
| Affects Version/s: | Lithium |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Tom Pantelis | Assignee: | Tom Pantelis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| External issue ID: | 3570 |
| Priority: | High |
| Description |
|
I ran a test in which I created 18K car items using 18 transactions of 1000 cars each. Everything replicated fine and all 3 nodes were in sync. I stopped a follower, deleted its journal, and restarted it to simulate a node re-install after a catastrophic failure. The leader decided to install a snapshot to catch up the follower. This succeeded and the follower's JMX data looked correct. I then stopped the follower and switched it to single-node mode to verify persistence recovery. However, on restart, the log was empty and there was no car data. When the leader installs a snapshot, the follower applies the snapshot and re-initializes its in-memory log but doesn't persist the snapshot. It seems it should persist the snapshot and clear out the persisted journal, as a normal snapshot does. |
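
The failure mode described above can be reproduced in miniature. The following is a minimal, self-contained sketch and not the controller's actual code: `DurableStore`, `Follower`, and the `persistOnInstall` flag are hypothetical names introduced only for illustration, and durable storage is modelled as plain in-memory lists. It contrasts a follower that only updates its in-memory state on install snapshot with one that also saves the snapshot and clears the persisted journal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names come from the controller code base. */
final class InstallSnapshotSketch {

    /** Stand-in for durable storage: a journal of entries plus the latest saved snapshot. */
    static final class DurableStore {
        final List<String> journal = new ArrayList<>();
        List<String> snapshot = null; // null means no snapshot has been saved

        void saveSnapshot(List<String> state) { snapshot = new ArrayList<>(state); }
        void clearJournal() { journal.clear(); }
    }

    /** Follower whose in-memory state must be rebuilt from the DurableStore after a restart. */
    static final class Follower {
        final DurableStore store;
        final boolean persistOnInstall; // assumption: toggles the behaviour proposed in this issue
        final List<String> state = new ArrayList<>();

        Follower(DurableStore store, boolean persistOnInstall) {
            this.store = store;
            this.persistOnInstall = persistOnInstall;
        }

        /** Handles a snapshot pushed by the leader. */
        void installSnapshot(List<String> leaderState) {
            // Always done: replace the in-memory state with the leader's snapshot.
            state.clear();
            state.addAll(leaderState);
            if (persistOnInstall) {
                // Proposed: save the snapshot first, then drop the now-obsolete journal.
                store.saveSnapshot(leaderState);
                store.clearJournal();
            }
        }

        /** Simulates a restart: rebuild state from the saved snapshot, then replay the journal. */
        static Follower recover(DurableStore store, boolean persistOnInstall) {
            Follower restarted = new Follower(store, persistOnInstall);
            if (store.snapshot != null) {
                restarted.state.addAll(store.snapshot);
            }
            restarted.state.addAll(store.journal);
            return restarted;
        }
    }

    public static void main(String[] args) {
        List<String> leaderState = Arrays.asList("car-1", "car-2", "car-3");

        // Behaviour reported in this issue: nothing is persisted, so recovery finds no data.
        DurableStore unpersisted = new DurableStore();
        new Follower(unpersisted, false).installSnapshot(leaderState);
        System.out.println(Follower.recover(unpersisted, false).state.size()); // prints 0

        // Proposed behaviour: the snapshot is persisted, so recovery restores the cars.
        DurableStore persisted = new DurableStore();
        new Follower(persisted, true).installSnapshot(leaderState);
        System.out.println(Follower.recover(persisted, true).state.size()); // prints 3
    }
}
```

In this sketch the snapshot is saved before the journal is cleared, so that a crash between the two steps would leave stale journal entries behind rather than lose data. Whether the real fix orders the two operations this way is an assumption.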
| Comments |
| Comment by Moiz Raja [ 04/Jun/15 ] |
|
Tom, I think we should persist on install snapshot. It was a mistake not to. I think this is critical. |
| Comment by Tom Pantelis [ 04/Jun/15 ] |
|
Agree. I changed it back to critical. (In reply to Moiz Raja from comment #1) |
| Comment by Moiz Raja [ 10/Jun/15 ] |
| Comment by Moiz Raja [ 11/Jun/15 ] |