[CONTROLLER-1511] CDS: persist SchemaContext Created: 13/Apr/16  Updated: 25/Jul/23

Status: Confirmed
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Improvement
Reporter: Robert Varga Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
blocks CONTROLLER-1448 Optimize NormalizedNode streaming by ... Resolved
blocks CONTROLLER-1517 Upgrading model leads to existing dat... Resolved

 Description   

Recovery of data persisted by previous versions poses an interesting problem: we need to interpret the stored data based on an upgraded SchemaContext, without knowing what the data actually was.

Distributed data store has access to the current global SchemaContext, as it is required for DataTree operation. Include a portable (either YIN- or YANG-based) copy of SchemaContext in the data store Snapshot and use that for initial recovery. Also perform an explicit snapshot when the SchemaContext changes.
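A minimal sketch of what such a snapshot could look like, assuming the schema is carried as YANG source text keyed by module name and the data tree is an opaque byte array. The class name `SchemaAwareSnapshot` and the wire layout are hypothetical illustrations, not the actual CDS snapshot format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical snapshot layout: YANG module sources first, opaque data tree bytes second. */
final class SchemaAwareSnapshot {
    final Map<String, String> yangModules; // module name -> YANG source text
    final byte[] dataTree;                 // serialized data tree, treated as opaque here

    SchemaAwareSnapshot(Map<String, String> yangModules, byte[] dataTree) {
        this.yangModules = new LinkedHashMap<>(yangModules);
        this.dataTree = dataTree.clone();
    }

    /** Writes the module sources ahead of the data so recovery can read the schema first. */
    byte[] toBytes() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeInt(yangModules.size());
            for (Map.Entry<String, String> e : yangModules.entrySet()) {
                out.writeUTF(e.getKey());
                // Length-prefixed UTF-8, because writeUTF is limited to 64 KiB per string.
                byte[] text = e.getValue().getBytes(StandardCharsets.UTF_8);
                out.writeInt(text.length);
                out.write(text);
            }
            out.writeInt(dataTree.length);
            out.write(dataTree);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Reads back the layout produced by toBytes(). */
    static SchemaAwareSnapshot fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int count = in.readInt();
            Map<String, String> modules = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                byte[] text = new byte[in.readInt()];
                in.readFully(text);
                modules.put(name, new String(text, StandardCharsets.UTF_8));
            }
            byte[] data = new byte[in.readInt()];
            in.readFully(data);
            return new SchemaAwareSnapshot(modules, data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Putting the schema ahead of the data means a recovering node can rebuild the persisted schema before it touches the data tree bytes.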

This will allow us to restore the data store snapshot, irrespective of what the runtime software is, and then perform an explicit schema adaptation step, where we convert the data tree to our current running schema context.

Furthermore, it will allow software-asymmetric clusters, such as those that exist while a cluster is being upgraded node-by-node without having been shut down.

A final benefit is that SchemaContext will be available at the persistence layer, allowing us to perform schema-informed serialization and deserialization (including object de-duplication). This capability is needed to deal with heavily-aliased data, such as that coming from BGP, which reuses NormalizedNode instances for equal path attributes.
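To illustrate the de-duplication idea only, here is a small sketch that writes each distinct value once and emits a back-reference for every repeat. It works on plain strings and is not schema-informed; the `DedupCodec` class and its `V:`/`R:` token format are hypothetical and stand in for whatever the real NormalizedNode stream would use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical de-duplicating codec: a value is written once, repeats become references. */
final class DedupCodec {
    /** Emits "V:<value>" on first sight and "R:<index>" for an equal value seen before. */
    static List<String> encode(List<String> values) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String v : values) {
            Integer idx = seen.get(v);
            if (idx == null) {
                seen.put(v, seen.size()); // next free index in first-seen order
                out.add("V:" + v);
            } else {
                out.add("R:" + idx);
            }
        }
        return out;
    }

    /** Rebuilds the list; every back-reference resolves to the same instance as its original. */
    static List<String> decode(List<String> tokens) {
        List<String> table = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (t.startsWith("V:")) {
                String v = t.substring(2);
                table.add(v);
                out.add(v);
            } else {
                out.add(table.get(Integer.parseInt(t.substring(2))));
            }
        }
        return out;
    }
}
```

Because the decoder hands back the table entry itself, values that were aliased before serialization are aliased again after recovery, which is the property that matters for memory footprint.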



 Comments   
Comment by Robert Varga [ 26/May/16 ]

Actually, we cannot really rely on schema context assembly, because it is subject to implementation bugs, which means that we have no guarantee that the YANG models persisted by the old version will result in exactly the same schema context when assembled by the new version.

In order to perform reasonable data migrations, the new version has to recover the same effective schema context.

Hence we need to define a data format for the persisted data.

I think the best option is to use an XSD specification and emit XMLWriter events. This will give us XML output, which we can encode into binary using EXI. We can even do schema-informed encoding, as Exificient can turn XSDs into grammars.
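As a rough illustration of the event-emitting half of this idea, the sketch below uses the JDK's StAX `XMLStreamWriter` to produce XML for a single module. The element and attribute names (`module`, `name`, `namespace`) are made up for the example and do not come from any defined XSD; the EXI encoding step is not shown.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical emitter: turns one module's name and namespace into XML via StAX events. */
final class SchemaXmlEmitter {
    static String emitModule(String name, String namespace) {
        StringWriter sw = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(sw);
            w.writeStartElement("module");      // hypothetical element name
            w.writeAttribute("name", name);
            w.writeStartElement("namespace");   // hypothetical element name
            w.writeCharacters(namespace);
            w.writeEndElement();                // closes namespace
            w.writeEndElement();                // closes module
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("failed to emit schema XML", e);
        }
        return sw.toString();
    }
}
```

The same sequence of events could in principle be fed to an EXI encoder instead of a text writer, which is what makes the event-based approach attractive here.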

Comment by Robert Varga [ 26/May/16 ]

Since we are changing the snapshot format as part of BUG-5280, I will reserve a place for such a document in the new format, so we can retrofit it later without any breakage.

Also linking updated BUG-2880 as it is blocking progress here.

Comment by Robert Varga [ 15/Mar/19 ]

I think an acceptable solution is to store the YANG models and rely on SchemaContext assembly producing a sufficiently compatible result across major revision bumps.

Comment by Robert Varga [ 29/Mar/19 ]

We actually have everything we need to make this happen, as we can easily turn SchemaContext into a set of ModuleDeclaredStatements and stream them out as YANG text. Since we are bumping the serialization format in CONTROLLER-1888, it makes sense to batch this change in. It requires a new payload, emitted when the SchemaContext changes, and an update to the snapshot format (i.e. an additional metadata class).
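One way the "emit a payload when the SchemaContext changes" part might be approached is sketched below, assuming the modules are already available as YANG text keyed by module name. The `SchemaChangeTracker` class is hypothetical; it only decides whether a new payload is due, by comparing a SHA-256 digest of the module sources with the previous one.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical tracker: reports module sources for persisting only when they changed. */
final class SchemaChangeTracker {
    private byte[] lastDigest; // null until the first schema has been seen

    /** Returns the modules to persist, or empty if they match the last ones seen. */
    Optional<Map<String, String>> onSchemaUpdated(Map<String, String> yangModules) {
        // Sort by module name so the digest does not depend on map iteration order.
        Map<String, String> sorted = new TreeMap<>(yangModules);
        byte[] digest = digest(sorted);
        if (Arrays.equals(digest, lastDigest)) {
            return Optional.empty();
        }
        lastDigest = digest;
        return Optional.of(sorted);
    }

    private static byte[] digest(Map<String, String> sorted) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (Map.Entry<String, String> e : sorted.entrySet()) {
                md.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator between name and text
                md.update(e.getValue().getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator between modules
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

A digest comparison keeps the check cheap and avoids writing a new payload when the SchemaContext is re-published without any actual model change.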

Generated at Wed Feb 07 19:55:44 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.