[CONTROLLER-1328] Clustering: Recovery misses flows installed Created: 18/May/15 Updated: 02/Jun/15 Resolved: 02/Jun/15 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | clustering |
| Affects Version/s: | Post-Helium |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Kamal Rameshan | Assignee: | Kamal Rameshan |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
|
| External issue ID: | 3260 |
| Priority: | Highest |
| Description |
|
Steps to reproduce
The log is filled with messages:
2015-05-18 14:07:20,796 | INFO | lt-dispatcher-25 | Shard | 175 - org.opendaylight.controller.sal-akka-raft - 1.2.0.SNAPSHOT | Recovery snapshot applied for member-1-shard-inventory-config in 2.119 s: snapshotIndex=39999, snapshotTerm=1, journal-size=0
The snapshot was taken at index=39999 with 0 unapplied entries. |
| Comments |
| Comment by Kamal Rameshan [ 18/May/15 ] |
|
Attachment recovery-issue-1.log has been added with description: recovery log |
| Comment by Kamal Rameshan [ 18/May/15 ] |
|
Snapshots are taken with 0 unapplied entries, in batches of 20000:
2015-05-18 14:02:42,024 | INFO | lt-dispatcher-33 | Shard | 175 - org.opendaylight.controller.sal-akka-raft - 1.2.0.SNAPSHOT | member-1-shard-inventory-config: Persisting of snapshot done:Snapshot= {lastTerm:1, lastIndex:39999, LastAppliedIndex:39999, LastAppliedTerm:1, UnAppliedEntries size:0}
2015-05-18 14:02:42,025 | INFO | lt-dispatcher-33 | Shard | 175 - org.opendaylight.controller.sal-akka-raft - 1.2.0.SNAPSHOT | member-1-shard-inventory-config: Removed in-memory snapshotted entries, adjusted snaphsotIndex:39999 and term:1 |
| Comment by Tom Pantelis [ 18/May/15 ] |
|
The "Received ReplicatedLogEntry for recovery" output is logged to DEBUG. I assume you changed it to INFO in your build. I suspect when we trimmed the persistent journal after the snapshot we blew away the 322 journal entries that occurred in between the time the snapshot was started and it was committed. So something's wrong there. Either the lastSequenceNumber to delete was incorrect or the 322 entries should've been in the unapplied list. |
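The hypothesis above can be illustrated with a toy model. This is only a sketch and not the sal-akka-raft code: the `Journal` class, `deleteUpTo`, and `entriesLeftForRecovery` are hypothetical names invented for illustration. It assumes 40000 entries exist when the snapshot is captured (indices 0 to 39999) and that 322 more are appended before the snapshot is committed. Trimming to the journal's last sequence number at commit time leaves nothing to replay, while trimming to the sequence number at capture time keeps the 322 entries.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT the sal-akka-raft code) of trimming a journal after a snapshot. */
public class JournalTrimSketch {

    /** Hypothetical in-memory journal; sequence numbers start at 0. */
    static final class Journal {
        private final List<Long> entries = new ArrayList<>();
        private long next = 0;

        void append() { entries.add(next++); }

        long lastSequenceNumber() { return next - 1; }

        /** Deletes every entry whose sequence number is <= seq. */
        void deleteUpTo(long seq) { entries.removeIf(e -> e <= seq); }

        int size() { return entries.size(); }
    }

    /**
     * Returns how many journal entries survive the post-snapshot trim.
     * trimToCaptureIndex=false models the suspected bug (trim to the
     * sequence number seen at commit time); true models the intended
     * behaviour (trim to the sequence number seen at capture time).
     */
    static int entriesLeftForRecovery(int beforeCapture, int duringPersist,
                                      boolean trimToCaptureIndex) {
        Journal journal = new Journal();
        for (int i = 0; i < beforeCapture; i++) {
            journal.append();
        }
        long captureSeq = journal.lastSequenceNumber(); // snapshot started here
        for (int i = 0; i < duringPersist; i++) {
            journal.append(); // entries arriving while the snapshot is persisted
        }
        long trimTo = trimToCaptureIndex ? captureSeq : journal.lastSequenceNumber();
        journal.deleteUpTo(trimTo); // snapshot committed, journal trimmed
        return journal.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesLeftForRecovery(40000, 322, false)); // prints 0
        System.out.println(entriesLeftForRecovery(40000, 322, true));  // prints 322
    }
}
```

The first case matches the journal-size=0 seen in the recovery log; whether the real cause is the delete boundary or the unapplied list is still open at this point in the thread.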
| Comment by Kamal Rameshan [ 18/May/15 ] |
|
Yes, I did change the log levels to INFO (somehow the logback.xml change was not working for me). I am trying to dig in... |
| Comment by Tom Pantelis [ 18/May/15 ] |
|
I've never used logback.xml - I think that's legacy. With karaf, you turn on debug in etc/org.ops4j.pax.logging.cfg. It's standard log4j, e.g. log4j.logger.org.opendaylight...Shard=DEBUG. I'll take a look at the code as well. |
| Comment by Kamal Rameshan [ 28/May/15 ] |