[CONTROLLER-1557] Snapshot files showed a size of 0 bytes on restart after poweroff
Created: 12/Oct/16  Updated: 19/Oct/17  Resolved: 28/Oct/16

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: Lithium
Fix Version/s: None

Type: Bug
Reporter: HeYunBo
Assignee: Unassigned
Resolution: Won't Do
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Linux
Platform: All


External issue ID: 6929

 Description   

I powered off the virtual machine (CentOS) while creating 100K entries with the dsbenchmark test script, which triggered snapshots with the config shard-snapshot-batch-count=100. After restarting the virtual machine, I found that all snapshot files written just before the power-off had a size of 0 bytes, and datastore recovery ultimately failed.

This happened only in a virtual machine environment; on a physical host the behavior was normal.



 Comments   
Comment by Robert Varga [ 18/Oct/16 ]

This sounds like a problem in either the persistence provider or the filesystem itself. Moving to clustering component.

Comment by HeYunBo [ 19/Oct/16 ]

I have consulted the Akka community about this problem. They replied:

"What snapshot store are you using? The LocalSnapshotStore is only provided for use in tests/dev scenarios not for actual use in a production application."

Actually, ODL uses the LocalSnapshotStore for saving snapshots if no snapshot-store is specified in akka.conf.

I think it's a file system problem: it sounds like the snapshot buffer was not synchronized to the physical medium.
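To illustrate what "synchronizing to the physical medium" would involve, here is a minimal sketch of a durable file write on Linux. It is not the LocalSnapshotStore implementation; the function name `write_snapshot_durably` and the temporary file prefix are hypothetical, chosen only for this example. The idea is to write to a temporary file, force it to disk, and only then rename it over the final name, so a power-off should leave either the previous file or the complete new one rather than an empty one.

```python
import os
import tempfile


def write_snapshot_durably(path: str, data: bytes) -> None:
    """Write data to path via a temp file, fsync, and an atomic rename.

    Hypothetical helper for illustration; not part of Akka or ODL.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=".snapshot-", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # userspace buffer -> OS page cache
            os.fsync(tmp.fileno())  # OS page cache -> storage device
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Do not leave a partial temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory entry so the rename itself is persisted
    # (opening a directory like this works on Linux/POSIX).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even with this pattern, a hypervisor or disk cache that is configured to ignore flush requests could still lose data on power-off, which may be one reason the behavior differed between the virtual machine and the physical host.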

In addition, Akka also provides some plugins for Akka Persistence:

http://akka.io/community/#plugins-to-akka-persistence

These are related to database storage; I don't know how to choose or use them.

Comment by Tom Pantelis [ 27/Oct/16 ]

We use the simple file-based local snapshot store by default. As with the journal, we can't use an external DB out of the box. Of course, users are free to choose a different persistence provider based on their needs.

I agree this sounds like a file system buffering issue. Closing the bug...

Comment by Vratko Polak [ 28/Oct/16 ]

> Closing the bug...

I think INVALID is for closing reports of bugs that are not really happening (typically user errors).

Even if the real cause is in upstream (in this case the filesystem), ODL could still attempt to implement a workaround. And if we are not interested in implementing any workarounds, the correct state is WONTFIX.
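As a rough idea of what such a workaround might look like, the sketch below skips zero-byte snapshot files and falls back to the newest non-empty one. It is only an illustration under stated assumptions: the function name `newest_usable_snapshot` is hypothetical, it assumes one flat directory of snapshot files, and it orders files by modification time rather than by any sequence number encoded in the file name. It does not validate that a non-empty file is actually complete.

```python
import os
from typing import Optional


def newest_usable_snapshot(snapshot_dir: str) -> Optional[str]:
    """Return the path of the most recently modified non-empty file.

    Hypothetical helper for illustration; returns None if every
    snapshot file is empty or the directory has no files.
    """
    candidates = []
    for name in os.listdir(snapshot_dir):
        path = os.path.join(snapshot_dir, name)
        if not os.path.isfile(path):
            continue  # ignore subdirectories and other entries
        if os.path.getsize(path) == 0:
            continue  # a 0-byte snapshot cannot be recovered from
        candidates.append((os.path.getmtime(path), name, path))
    if not candidates:
        return None
    # Newest modification time wins; the name breaks ties.
    return max(candidates)[2]
```

With a fallback like this, recovery would start from an older snapshot and replay more of the journal, instead of failing on the truncated file.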

Generated at Wed Feb 07 19:55:51 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.