Details
Type: Improvement
Status: Resolved
Priority: Medium
Resolution: Done
Description
When the datastore holds a large amount of data, the daexim export operation fails with AskTimeoutException, especially when it is performed on a non-leader node. The current export implementation reads all data from the root of the data tree in one shot, so the read operation does not scale well.
Reading data in smaller chunks, e.g. per module or node, would be more scalable. I think the data is read in one shot to keep it consistent across the different modules. In some scenarios, however, this consistency need not be enforced, e.g. when there is no data dependency across models or when write operations to the datastore can be prevented while the export is running.
This ticket adds a new boolean option called 'strict-data-consistency' to the input of the export operation. When the option is true (the default), the data is read in one shot, as it is today. When it is false, reads are performed per module or node, after the exclusions are removed, and the results are combined to write the output file. Both methods produce identical output files; only the way the files are produced differs.
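
The two read strategies can be illustrated with a small sketch. This is not the daexim implementation: the in-memory `DATASTORE` dictionary and the `read_root`, `read_module`, and `export` helpers are hypothetical stand-ins, and only the option name 'strict-data-consistency' comes from this ticket.

```python
import json

# Hypothetical in-memory stand-in for the datastore: module name -> subtree.
DATASTORE = {
    "module-a": {"nodes": [1, 2]},
    "module-b": {"links": ["x"]},
    "module-c": {"internal": True},
}


def read_root(store):
    """One-shot read: return a single snapshot of the whole tree."""
    return dict(store)


def read_module(store, module):
    """Chunked read: return the subtree of a single module."""
    return store[module]


def export(store, excluded=(), strict_data_consistency=True):
    """Serialize the store, skipping excluded modules (illustrative only)."""
    if strict_data_consistency:
        # Read everything at once, then drop the excluded modules.
        snapshot = read_root(store)
        data = {m: d for m, d in snapshot.items() if m not in excluded}
    else:
        # Drop the excluded modules first, then read one module at a time
        # and combine the results.
        data = {m: read_module(store, m) for m in store if m not in excluded}
    return json.dumps(data, sort_keys=True)


strict = export(DATASTORE, excluded=("module-c",), strict_data_consistency=True)
relaxed = export(DATASTORE, excluded=("module-c",), strict_data_consistency=False)
```

In this sketch `strict` and `relaxed` are equal only because nothing writes to the store between the per-module reads, which is the precondition described above for turning strict consistency off.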