[CONTROLLER-497] Provide uniform Clustering Services across AD-SAL and MD-SAL Created: 20/May/14  Updated: 25/Jul/23  Resolved: 06/Jun/15

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Moiz Raja Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Mac OS
Platform: PC


External issue ID: 1052
Priority: High

 Description   

The current version of the controller supports data clustering using Infinispan. Infinispan is a key-value store that provides notifications on data changes. This form of clustering is used primarily by AD-SAL applications. One drawback of Infinispan is that it only supports storing data in a Map structure, which is not very efficient when it comes to storing trees. The MD-SAL provides a datastore that stores all data in a hierarchical, tree-like structure that closely follows the YANG model defined by a service provider.

We need a solution which does the following:

1. Allow clustering of data regardless of the data structure. At the very minimum we need to support both a key-value type of store and a tree store.
2. Use the same clustering and remoting mechanism for invoking remote operations and for notifications.

This needs to be fixed in Helium.



 Comments   
Comment by Moiz Raja [ 20/May/14 ]

See proposed design for an MD-SAL Clustered Data Store. This needs to be expanded to support a key-value store as well.

https://wiki.opendaylight.org/view/OpenDaylight_Controller:MD-SAL:Architecture:Clustered_Data_Store

Comment by Giovanni Meo [ 20/May/14 ]

(In reply to Moiz Raja from comment #0)
> In the current version of the controller we do support data clustering using
> Infinispan. Infinispan is a key value store that provides notifications on
> data changes. This form of clustering is primarily being used by AD-SAL
> applications. One drawback of Infinispan is that it only supports storing
> data in a Map structure which is not very efficient when it comes to storing
> trees. The MD-SAL provides a datastore which stores all the data in a
> hierarchical tree like structure which closely models the yang model defined
> by a service provider.
>
> We need a solution which does the following,
>
> 1. Allow clustering of data regardless of the data structure. At the very
> mininmum we need to support both a key-value type of store and a tree store.
> 2. Use the same clustering and remoting mechanism for invoking remote
> operations and for notifications.
>
>
> This needs to be fixed in Helium

Moiz, one clarification here. Infinispan also allows storing tree data, as shown in:

https://wiki.opendaylight.org/view/OpenDaylight_Controller:MD-SAL:The_Infinispan_Data_Store

In fact, it's not clear what exactly Infinispan is missing to be the clustering solution, instead of reinventing the wheel.

Thanks,
Giovanni

Comment by Mathieu Lemay [ 20/May/14 ]

Clustering should be designed in a way that allows for various feature deployments on nodes, and as such it is not strictly bound to the datastore. We need to sync up to discuss how this is a runtime service and not specifically MD-SAL related.

Comment by Moiz Raja [ 20/May/14 ]

Giovanni,

Here are some of my issues with Infinispan:

a. Infinispan is essentially a key-value cache with a TreeCache interface built on top. It is not natively a tree, and I am concerned that because of this generalization we may not be able to make it as performant as a native tree structure, which is what NormalizedNode is.

b. With TreeCache, data notifications are not guaranteed to work. Moreover, they do not work the way we expect to use them (see the MD-SAL DataNotification interfaces). Now we could argue that the MD-SAL DataNotification interface is wrong, but I believe it is driven by features that consumers need, and we would need to provide an alternative if that interface has to be changed.

c. In the Infinispan prototype TreeCache#removeNode was not working properly - sometimes it would just not remove the node.

d. AFAIK Infinispan does not provide a mechanism to invoke remote operations and do non-data related notifications. This is something that MD-SAL components would need.

Akka seems to provide the appropriate building blocks for our needs. I've been playing with Akka and Scala for about a week now, and I find that it's pretty easy to learn and that it offers a lot of the features we're looking for in a clustering solution. Of course, we do need to put it through its paces by prototyping and running performance tests on it, which hopefully will give us a clear answer.

Please do look up the Akka documentation, specifically the parts on akka-clustering, akka-remoting and akka-persistence; that may give you a better appreciation for it.

Comment by Giovanni Meo [ 21/May/14 ]

Moiz,

thanks for replying on this topic, some more points inline ...

(In reply to Moiz Raja from comment #4)
> Giovanni,
>
> Here are some of my issues with Infinispan,
>
> a. Infinispan is essentially a key-value cache with a TreeCache interface
> built on top. It is not natively a tree and I am concerned that due to this
> generalization we may not be able to make it as performant as a native tree
> structure itself which NormalizedNode is.

Well, I'm curious to understand how you will organize the tree natively so that it performs better. Infinispan simply uses the FQDN and the key of the record as a key in the underlying KV store, which doesn't sound that wrong. But I'm open to understanding how this is inefficient compared with your projected way of implementing the tree.
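
As a rough illustration of the keying scheme described above, a tree can be flattened into a plain map by using each node's fully qualified path plus the record key as the map key. This is only a sketch of the idea and not Infinispan's actual implementation; the `flat_key` helper, the `/` and `#` separators, and the sample paths are all assumptions made up for the example.

```python
# Hypothetical sketch: a tree stored in a flat key-value map.
# Each entry is keyed by (node path, record key). Illustration only;
# this is not how Infinispan's TreeCache is implemented.

def flat_key(path, key):
    """Join a node path such as ('nodes', 'node1') and a record key into one map key."""
    return "/" + "/".join(path) + "#" + key


store = {}  # stands in for the underlying key-value store


def put(path, key, value):
    store[flat_key(path, key)] = value


def get(path, key):
    # Reading a single record is one map lookup, whatever the tree depth.
    return store.get(flat_key(path, key))


put(("nodes", "node1"), "id", "openflow:1")
put(("nodes", "node1", "ports", "p1"), "speed", 1000)

print(get(("nodes", "node1", "ports", "p1"), "speed"))
```

With this layout a single-record read stays cheap; the open question in this thread is what happens when a consumer asks for a whole subtree rather than one record.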

> b. With TreeCache Data Notifications are not guaranteed to work. Moreover
> they do not work the same way that we expect to use it (see the MD-SAL
> DataNotification interfaces). Now we could argue that the MD-SAL
> DataNotification interface is wrong - but I believe it is driven by features
> that consumers need and we would need to provide an alternative if that
> interface has to be changed.

That doesn't mean it cannot be fixed in Infinispan, I guess; it's open source anyway.

> c. In the Infinispan prototype TreeCache#removeNode was not working properly
> - sometimes it would just not remove the node.

Debugging would probably tell the reason for it.

> d. AFAIK Infinispan does not provide a mechanism to invoke remote operations
> and do non-data related notifications. This is something that MD-SAL
> components would need.

You can look at this:

http://infinispan.org/docs/6.0.x/user_guide/user_guide.html#_distributed_execution_framework

> Akka seems to provide a the appropriate building blocks for our needs. I've
> been playing with Akka and Scala for about a week now and I find that it's
> pretty easy to learn and it offers a lot of the features that we're looking
> for in a clustering solution. Ofcourse we do need to take it through it's
> paces by prototyping and running performance tests on it which hopefully
> will give us a clear answer.

Akka is wonderful, probably until we get to use it fully. Everything looks great from 10,000 feet.

> Please do lookup the Akka documentation. Specifically the parts on
> akka-clustering, akka-remoting and akka-persistence and that may give you a
> better appreciation for it.

I looked at them; very nice documentation. It's just that Akka is a remoting infrastructure, and all the data structures (both tree and KV store) need to be implemented on top of it. Looking at the Infinispan code base, that task doesn't seem trivial, and it would still need to go through hardening, which I don't quite see happening in the few months of the Helium release.

Comment by Moiz Raja [ 21/May/14 ]

Giovanni,

Thanks for this conversation - I think it's great.

Performance concern
====================

So previously I didn't fully explain why in the long run Infinispan may not be able to match the native NormalizedNode performance.

Firstly, the DataStore interfaces need to offer a NormalizedNode to their consumers when they request some data. The NormalizedNode is a tree structure, but it is richer in the sense that its structure very closely follows the YANG structure of the model, which makes it easier for consumer code to convert the NormalizedNode structure into a BindingAware DataObject.

Now if I flattened and stored the NormalizedNode in an Infinispan TreeCache, I would have to reconstruct the NormalizedNode when a consumer requested it. This obviously would not be as efficient in Infinispan, which is a key-value store, as it would be with a native tree structure like NormalizedNode. In the key-value case I would need to look through several entries in the cache as I reconstruct the required NormalizedNode, whereas if the structure were natively a NormalizedNode I would just have to walk the tree and return a reference to a node in the tree, which would itself be a NormalizedNode since NormalizedNode is a composite.
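
The difference between the two read paths can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the real API: `Node` is a made-up stand-in for a composite tree node (it is not NormalizedNode), and `flatten`, `flat_read` and `native_read` are hypothetical helpers written only for this example.

```python
# Hypothetical sketch comparing a native tree read with a read that
# has to rebuild the subtree from flattened key-value entries.
# Node is a stand-in composite; it is not the real NormalizedNode API.

class Node:
    def __init__(self, name, value=None):
        self.name = name
        self.value = value
        self.children = {}

    def child(self, name):
        # Return the existing child with this name, or create it.
        return self.children.setdefault(name, Node(name))


def native_read(root, path):
    """Walk the tree and return a reference to the existing subtree."""
    node = root
    for name in path:
        node = node.children[name]
    return node


def flatten(root):
    """Flatten the tree into {path tuple: value}, one entry per node below root."""
    flat = {}
    stack = [((), root)]
    while stack:
        path, node = stack.pop()
        for name, child in node.children.items():
            child_path = path + (name,)
            flat[child_path] = child.value
            stack.append((child_path, child))
    return flat


def flat_read(flat, path):
    """Rebuild the subtree at a non-empty `path` from the flat entries under it."""
    if path not in flat:
        raise KeyError(path)
    rebuilt = Node(path[-1], flat[path])
    for entry_path, value in flat.items():
        if len(entry_path) > len(path) and entry_path[:len(path)] == path:
            node = rebuilt
            for name in entry_path[len(path):]:
                node = node.child(name)
            node.value = value
    return rebuilt


root = Node("root")
root.child("nodes").child("node1").child("ports").child("p1").value = 1000

native = native_read(root, ("nodes", "node1"))
rebuilt = flat_read(flatten(root), ("nodes", "node1"))

print(native is root.children["nodes"].children["node1"])
print(rebuilt.children["ports"].children["p1"].value)
```

The native read hands back a reference to a node that already exists, while the flat read allocates a new node per entry. The linear scan in `flat_read` is a simplification; a real store could index entries by prefix, so this shows the extra work in principle rather than any measured cost.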

Notifications
==============

Fixing notifications in Infinispan for the TreeCache could certainly be something we could do, and I agree with you on the principle, but I don't find it to be something practical we can achieve in the short term, primarily because we may go ahead and implement it and still not get a released version of Infinispan in the Helium time frame. I think we need to be able to control our destiny.

Distributed Execution
=====================

I agree that this could be a way for us to execute operations remotely.

Akka
=====

I agree with you that this is new stuff and that it could have problems that we don't know of yet, but the building blocks we need are there, and I think if we work together we could possibly create data structures that are tailored to our needs.

If you have input on how we could put Akka through its paces and what kinds of prototyping we should do to ensure that Akka is a good fit, I would like to hear it.

Comment by Giovanni Meo [ 22/May/14 ]

Hi Moiz,

comment inline ...

(In reply to Moiz Raja from comment #6)
> Giovanni,
>
> Thanks for this conversation - I think it's great.
>
> Performance concern
> ====================
>
> So previously I didn't fully explain why in the long run Infinispan may not
> be able to match the native NormalizedNode performance.
>
> Firstly the DataStore interfaces needs to offer a NormalizedNode to it's
> consumers when the request for some data. The NormalizedNode is a tree
> structure but it's richer in the sense that it's structure very closely
> models the yang structure of the model and it makes it easier for consumer
> code to convert the NormalizedNode structure into a BindingAware DataObject.

This puzzles me: how can you offer a tree structure when that tree itself could be spread across multiple shards? Does it mean that we claim to shard the tree, but then don't shard effectively in order to support the NormalizedNode?

> Now if I flattened and stored the NormalizedNode in an Infinispan TreeCache
> I would have to reconstruct the NormalizedNode when a consumer requested it.
> This obviously would not be as efficient in Infinispan which is a key-value
> store that with a native tree structure like NormalizedNode. In the key
> value case I would need to look through several entries in the cache as I
> reconstruct the required NormalizedNode whereas if the structure was
> natively a NormalizedNode I would just have to walk the tree and return a
> reference to a node in the tree which would itself be a NormalizeNode since
> NormalizedNode is a composite.

Well, this walking of the tree to return the NormalizedNode makes me think things are not going to be very scalable; see the concern above.

> Notifications
> ==============
>
> Fixing notifications in Infinispan for the TreeCache could certainly be a
> something we could do, I agree with you on the principle - but I don't find
> that to be something practical we can achieve in the short term. Primarily
> because we may go ahead and implement it and we still might not get a
> released version of Infinispan in the Helium time frame. I think we need to
> be able to control our destiny.

That is not an issue at all. With OSGi you can simply override and substitute any class, starting from a released Infinispan version. For example, in clustering.services_implementation we embed the Infinispan artifacts, and as such we can override any class, because any class provided in the class path comes before the embedded one.

> Distributed Execution
> =====================
>
> I agree that this could be a way for us to execute operations remotely.
>
>
> Akka
> =====
>
> I agree with you that this is new stuff and that it could have problems that
> we don't know of yet but the building blocks we need are there and I think
> if we work together we could possibly create data-structures which are
> tailored to our needs.
>
> If you have input on how we could take Akka through the paces and what kinds
> of prototyping we should do to ensure that Akka is a good fit I would like
> to hear it.

Moiz, the only suggestion is to just do it, I guess. There are many variables around, so it's better to start fixing some of them.

Comment by Mathieu Lemay [ 22/May/14 ]

(In reply to Giovanni Meo from comment #7)
> Hi Moiz,
>
> comment inline ...
>
> (In reply to Moiz Raja from comment #6)
> > Giovanni,
> >
> > Thanks for this conversation - I think it's great.
> >
> > Performance concern
> > ====================
> >
> > So previously I didn't fully explain why in the long run Infinispan may not
> > be able to match the native NormalizedNode performance.
> >
> > Firstly the DataStore interfaces needs to offer a NormalizedNode to it's
> > consumers when the request for some data. The NormalizedNode is a tree
> > structure but it's richer in the sense that it's structure very closely
> > models the yang structure of the model and it makes it easier for consumer
> > code to convert the NormalizedNode structure into a BindingAware DataObject.
>
> This puzzle me, how can you offer a tree structure when that tree itself
> could be spread in multiple shards? Does it mean that we claim to shard the
> tree, but then we don't shard effectively to support the NormalizedNode?

I really had to jump in on that one, as this is my current concern with the approaches. I understand the data is closely mapped to YANG, but is that a good thing from a scalability perspective? Maps shard best, trees are shardable but harder to manage, and graphs are worst. Every system I've seen shards maps and has localized tree / graph caches. If we want to benefit from external stores, I think we also need to think about those issues, and I know you guys have been thinking about this.
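
To make the map-versus-tree sharding point concrete, here is a small sketch under assumptions of my own: `NUM_SHARDS`, the `SUBTREE_SHARDS` table and both function names are invented for illustration and do not describe any existing ODL, Infinispan or Akka mechanism. A flat key can be placed by hashing it on its own, while a tree needs an explicit subtree-to-shard assignment that someone has to maintain.

```python
import hashlib

NUM_SHARDS = 4  # assumed cluster size for this sketch


def map_shard(key):
    """Flat map: each key is placed by its own hash, independent of other keys."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS


# Tree: subtrees are assigned explicitly so that related nodes stay together.
# These prefixes and shard numbers are made-up examples.
SUBTREE_SHARDS = {
    ("inventory",): 0,
    ("topology",): 1,
    ("inventory", "nodes", "node7"): 2,  # a hot subtree moved to its own shard
}


def tree_shard(path):
    """Tree: use the shard of the longest configured prefix of `path`."""
    for length in range(len(path), 0, -1):
        shard = SUBTREE_SHARDS.get(path[:length])
        if shard is not None:
            return shard
    raise KeyError("no shard configured for %r" % (path,))


print(map_shard("inventory/nodes/node1"))
print(tree_shard(("inventory", "nodes", "node1")))
print(tree_shard(("inventory", "nodes", "node7", "ports")))
```

The map case needs no configuration and spreads keys on its own, but it scatters the nodes of one subtree across shards. The tree case keeps a subtree together at the price of a prefix table that must be managed as the data grows or moves.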

> > Now if I flattened and stored the NormalizedNode in an Infinispan TreeCache
> > I would have to reconstruct the NormalizedNode when a consumer requested it.
> > This obviously would not be as efficient in Infinispan which is a key-value
> > store that with a native tree structure like NormalizedNode. In the key
> > value case I would need to look through several entries in the cache as I
> > reconstruct the required NormalizedNode whereas if the structure was
> > natively a NormalizedNode I would just have to walk the tree and return a
> > reference to a node in the tree which would itself be a NormalizeNode since
> > NormalizedNode is a composite.
>
> Well this walking the tree to return the NormalizedNode make me think things
> are not going to be very scalable, see concern above.

I side with Gio on that one. I want to propose "context composition" for endpoints. More to come on that soon, but the idea is simple: try to keep things flat as long as you can, and bring them up to a tree or a graph only when necessary. Most plugins and add-ons will have a limited scope of action (a limited number of levels), so giving these endpoints stacked scopes in which they can act, instead of a full tree node, will yield much better performance results.

>
> > Notifications
> > ==============
> >
> > Fixing notifications in Infinispan for the TreeCache could certainly be a
> > something we could do, I agree with you on the principle - but I don't find
> > that to be something practical we can achieve in the short term. Primarily
> > because we may go ahead and implement it and we still might not get a
> > released version of Infinispan in the Helium time frame. I think we need to
> > be able to control our destiny.
>
> That is not at all an issue, with OSGi you can simply override and
> substitute any class starting from an Infinispan released version, in fact
> for example in clustering.services_implementation we embed the Infinispan
> artifacts and as such we can override any class because any class provided
> in the class path comes before the one embedded.
>
> > Distributed Execution
> > =====================
> >
> > I agree that this could be a way for us to execute operations remotely.
> >
> >
> > Akka
> > =====
> >
> > I agree with you that this is new stuff and that it could have problems that
> > we don't know of yet but the building blocks we need are there and I think
> > if we work together we could possibly create data-structures which are
> > tailored to our needs.
> >
> > If you have input on how we could take Akka through the paces and what kinds
> > of prototyping we should do to ensure that Akka is a good fit I would like
> > to hear it.
>
> Moiz, the only suggestion is to just do it i guess. Many variables around
> better to start fixing some of them.

I am biased here, but I think Gio has a point about Infinispan. To me, clustered data, cluster management and endpoints / components are important but different matters to solve. Let's try doing these one at a time? For Akka-based clusters, functionality will be exposed via messaging and OSGi, as well as a GUI, by the end of our sprint.

Cheers
Mathieu

Generated at Wed Feb 07 19:53:10 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.