[INFRAUTILS-11] Ready service Created: 10/May/17  Updated: 19/Mar/18  Resolved: 15/Mar/18

Status: Resolved
Project: infrautils
Component/s: General
Affects Version/s: Nitrogen
Fix Version/s: Oxygen

Type: Improvement Priority: Medium
Reporter: Michael Vorburger Assignee: Michael Vorburger
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
blocks CONTROLLER-519 Provide information on MD-SAL's "Read... Resolved
blocks DAEXIM-3 New "auto-import-on-boot" feature Resolved
is blocked by INFRAUTILS-17 BundleDiagInfos Missing dependencies ... Resolved
is blocked by INFRAUTILS-27 Ready API for more fine grained per-b... Resolved
is blocked by ODLPARENT-86 Milestore: upgrade karaf to 4.1.2 or ... Verified
Duplicate
is duplicated by CONTROLLER-519 Provide information on MD-SAL's "Read... Resolved

 Description   

Previous discussions, incl. a session at last year's ODL DDF, have repeatedly identified the need to have a general way to reliably detect when ODL is "ready".

Today projects sometimes have ad-hoc solutions for this; for example, netvirt's org.opendaylight.netvirt.statemanager.StateManager (which logs "StateManager all is ready") is related to (the lack of a general solution for) this.

I've hit what's basically the same gap again while contributing an "auto-import-on-boot" feature to the Daexim project in https://git.opendaylight.org/gerrit/#/c/55035/, where the Daexim committers would like that feature to hold importing until "the system is full ready" (I don't fully understand why, but that is not the point of this bug). As part of the work I'm doing in that context, I will therefore contribute a general new Ready Service to infrautils.

This will build on top of the work I've done for the Extended SingleFeatureTest (SFT) incl. TestBundleDiag in odlparent (see e.g. https://lists.opendaylight.org/pipermail/release/2017-January/009062.html), and actually re-use that same code.

The first contribution for this will be minimalistic and fulfil the immediate need I have (in Daexim). Later enhancements by others or myself on top of that first iteration can then obviously extend it; ideas for possible future follow-up improvements which won't be in my initial proposal include but are not limited to:

1. YANG model to expose the simple API I'll offer as RESTCONF etc. RPC

2. YANG data model which sets a flag in the data store (like netvirt's StateManager)

3. JMX ?

4. ...



 Comments   
Comment by Robert Varga [ 10/May/17 ]

Doesn't daexim really want to have the data store and models ready, but applications not started?

Comment by Michael Vorburger [ 10/May/17 ]

> Doesn't daexim really want to have the data store
> and models ready, but applications not started?

I've no idea; but that is not the point of this bug... New DAEXIM-3 open in Daexim now.

Comment by Michael Vorburger [ 10/May/17 ]

> contribute a general new Ready Service, to infrautils

==> https://git.opendaylight.org/gerrit/#/c/56749/, basic v1 done from my end and ready for general review now; I'm hoping to merge that soon-ish (i.e. "days, not weeks"). I'll be raising a separate follow-up Gerrit to add Karaf 3 support on top of the above.

Comment by David Suarez [ 12/May/17 ]

We are interested in using this functionality to know when a restore procedure is really finished.

Comment by Vratko Polak [ 25/Sep/17 ]

> https://bugs.opendaylight.org/show_bug.cgi?id=9165#c9

Responding here, as CONTROLLER-1771 was opened just for a simple logging change.

I have never studied software design, so here is my list of ad-hoc ideas.

What is the "northbound" for this service? Both in terms of mechanism (Just write to operational datastore, or provide some other publishing method?) and amount of data presented (Will there be timestamps? Will there be IDs to identify currently un-ready components?).

Cluster-wide or local member only?

What about granularity? For example, topology-netconf might be ready, but when detecting configuration change (initiating a connect attempt to a netconf device) should the device connector be regarded as a sub-component with its own readiness? Can we create a hierarchy of such sub-components in an easy fashion? (Our own implementation of ListenableFuture which notifies the readiness service?)

Do we want to report readiness, or quiescence (or both)? In the previous example, while device connector is not ready, topology-netconf (as a service) is ready, but not quiet (as the device operational status is in transient state).

In CSIT we would use both. Global quiescence in only jobs to ensure conditions are repeatable (and karaf.log clean), fine-grained readiness in all jobs to speed up test execution while making sure the functionality works even if ODL is not quiet yet.

What if a sub-component fails (connection to netconf device refused)? Should the sub-component report readiness when it processes the failure? Can a super-component "unregister" the sub-component when performing cleanup? Do we want a Failure Service?

Any leaks we should think about before finalizing our design?
If a sub-component did not unregister and is garbage collected, we want to let the garbage collection happen even if it has registered (and unregister it).
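
The difference between readiness and quiescence in a hierarchy of sub-components can be illustrated with a minimal sketch. Everything here is hypothetical (the `Component` class and its fields are not part of infrautils or any ODL API); it only assumes that readiness is a per-component flag while quiescence must hold for a whole subtree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """Hypothetical node in a readiness hierarchy (not an infrautils type)."""
    name: str
    ready: bool = False          # able to serve requests
    in_transition: bool = False  # some state is still changing
    children: List["Component"] = field(default_factory=list)

    def is_ready(self) -> bool:
        # Readiness is local: a parent can be ready while a child is not.
        return self.ready

    def is_quiet(self) -> bool:
        # Quiescence is recursive: the whole subtree must be ready and settled.
        return (self.ready and not self.in_transition
                and all(child.is_quiet() for child in self.children))

    def unready(self) -> List[str]:
        # Names of the components in this subtree that are not ready yet.
        names = [] if self.ready else [self.name]
        for child in self.children:
            names.extend(child.unready())
        return names


# The example from the comment above: the service is ready, the connector is not.
connector = Component("device-connector", ready=False, in_transition=True)
topology = Component("topology-netconf", ready=True, children=[connector])

print(topology.is_ready())  # True
print(topology.is_quiet())  # False
print(topology.unready())   # ['device-connector']
```

In this sketch a CSIT job that needs repeatable conditions would wait for `is_quiet()` on the root, while a job that only exercises one feature would wait for `is_ready()` on that feature's component.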

Comment by Michael Vorburger [ 25/Sep/17 ]

Changing Target from Nitrogen to Oxygen. Pending:

I'm definitely wanting to close this general "launch" issue within Oxygen, and then have more specific follow-up new issues (linked here as Blocks) for future extensions...

Comment by Michael Vorburger [ 25/Sep/17 ]

> What is the "northbound" for this service?

I'm not sure how you mean the term "northbound" in this context - I always understood "northbound" as RESTCONF API - whereas infrautils.ready is lower-level, just an internal OSGi service.

> write to operational datastore, or provide some other publishing method?

infrautils.ready itself will never write to the operational datastore (because infrautils will never depend on controller/mdsal), BUT infrautils.diag is proposing a CLI command which among other things will also expose infrautils.ready's status, after we merge .

You can find out more about infrautils.diag in the spec on https://git.opendaylight.org/gerrit/#/c/51171/

Another project, perhaps genius, could write the infrautils.ready status and/or the infrautils.diag services status into MDSAL - if someone needs that and wants to contribute this.

And infrautils.ready happily spams the log, with the same information that you get from SFT, now included at run-time.

> Will there be timestamps?

Not in the API/CLI, no, because it's not "historical". Yes in the LOG.

> Will there be IDs to identify currently un-ready components?

infrautils.diag has something a bit like this, but I'm not 100% sure.

> Cluster-wide or local member only?

Currently infrautils.ready reports local status only.

infrautils.diag has an upcoming CLI command to query the entire cluster's status, see https://git.opendaylight.org/gerrit/#/c/62900/

> What about granularity?

infrautils.ready's scope is the OSGi bundles.

infrautils.diag's scope is (possibly several) "functional services" within those bundles. It currently has no notion of hierarchy, but if you feel strongly that this could be of value, then perhaps it is something you'd like to open a separate new enhancement issue for, and discuss further (likely more with Faseela than with me) over there.

> Do we want to report readiness, or quiescence (or both)?

infrautils.ready reports readiness of OSGi bundles (incl. their BP state).

I'm not sure yet how quiescence relates to this, but curious to learn more...
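
Reporting readiness at the scope of OSGi bundles amounts to reducing many per-bundle states to one overall state. The sketch below shows one way that reduction could work; it is an illustration under assumptions, not the infrautils.ready implementation. The `BundleState` and `SystemState` enums and the `system_state` function are hypothetical names, and the set of bundle states is deliberately simplified.

```python
from enum import Enum
from typing import Dict


class BundleState(Enum):
    # Simplified, assumed set of states; real OSGi/Karaf bundles have more.
    ACTIVE = "active"
    STARTING = "starting"
    WAITING = "waiting"    # e.g. still waiting for a dependency
    FAILURE = "failure"


class SystemState(Enum):
    # Hypothetical overall states for this sketch.
    BOOTING = "booting"
    ACTIVE = "active"
    FAILURE = "failure"


def system_state(bundles: Dict[str, BundleState]) -> SystemState:
    """Reduce per-bundle states to a single overall state."""
    states = set(bundles.values())
    if BundleState.FAILURE in states:
        # Any failed bundle makes the whole system count as failed.
        return SystemState.FAILURE
    if states <= {BundleState.ACTIVE}:
        # Every bundle is active (an empty map is vacuously active).
        return SystemState.ACTIVE
    # Otherwise at least one bundle is still starting or waiting.
    return SystemState.BOOTING


booting = {"bundle-a": BundleState.ACTIVE, "bundle-b": BundleState.WAITING}
print(system_state(booting))  # SystemState.BOOTING
```

Checking for failure first means a single failed bundle is reported even while others are still starting, which matches the intent of detecting problems as early as possible.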

Comment by Michael Vorburger [ 15/Mar/18 ]

The infrautils.ready service is available in Oxygen, and this issue is overly broad and has nothing actionable (that I can see) for Fluorine. I encourage all 13 watchers of this issue who have an interest in infrautils.ready to email infrautils-dev with new requirements, which can lead to more specific fine grained issues than this for possible Improvements in Fluorine (an example of such an extension is INFRAUTILS-27).

Comment by Jamo Luhrsen [ 16/Mar/18 ]

vorburger and any other folks that worked to provide this functionality, THANK YOU!

I'm starting work to use it in our CSIT jobs so we can just wait for this for-sure (hopefully) utility to tell us
when we are good to go.

I'll email infrautils, but I think a feature request I have is to make this information available via REST API.
I know deployment tools out there that can benefit from this. Also other fun things like haproxy.

Comment by Faseela K [ 17/Mar/18 ]

jluhrsen, the ready service status is incorporated in the showSvcStatus CLI output of diagstatus, which you would have seen sathwik incorporating in genius CSIT. The same status is available as an MBean, which can be accessed over REST.

Comment by Jamo Luhrsen [ 17/Mar/18 ]

> Jamo Luhrsen ready service status is Incorporated in the showSvcStatus Cli output of diagstatus, which you would have seen sathwik incorporating in genius CSIT.. The same status is available as Mbean, which can be accessed over REST..

I do remember seeing one or more CSIT patches with that showSvcStatus CLI, but didn't think it was related to this "ready status". What is the URI to hit to see this with REST?

Comment by Michael Vorburger [ 19/Mar/18 ]

jluhrsen, just create a new JIRA if it is still needed following this email thread.

Comment by Daniel Farrell [ 19/Mar/18 ]

This is awesome, thanks vorburger!

jluhrsen - Once you have an example of this in CSIT I'd like to copy it in Int/Pack logic.

Comment by Jamo Luhrsen [ 19/Mar/18 ]

> This is awesome, thanks Michael Vorburger!
>
> Jamo Luhrsen - Once you have an example of this in CSIT I'd like to copy it in Int/Pack logic.

dfarrell07, here is a patch where I'm grepping the karaf.log for the specific message. I'll figure out the REST way of doing this too and maybe propose that as well.
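
Waiting for a message to appear in karaf.log can be sketched as a simple polling loop. This is not the patch mentioned above, only an illustration: the function name `wait_for_log_message` is hypothetical, and both the log path and the message pattern in the usage note are assumptions, since the exact message is not quoted in this issue.

```python
import re
import time
from pathlib import Path


def wait_for_log_message(log_path, pattern, timeout=300.0, interval=1.0):
    """Poll a log file until a line matches `pattern` or `timeout` seconds pass.

    Returns True if a matching line was found, False on timeout.
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    path = Path(log_path)
    while True:
        # The file may not exist yet early in the boot, so check each round.
        if path.exists():
            with path.open(errors="replace") as log:
                if any(regex.search(line) for line in log):
                    return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller might use it as `wait_for_log_message("data/log/karaf.log", r"<ready message>", timeout=600)`, where the path assumes a default Karaf layout and `<ready message>` stands for whatever text the ready service actually logs. Re-reading the whole file on every poll is simple but wasteful for large logs; remembering the last read offset would be the obvious refinement.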

Generated at Wed Feb 07 20:02:02 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.