[INFRAUTILS-8] JobCoordinator (ex-DataStoreJobCoordinator) job failures should indicate stack trace of original caller who submitted job Created: 09/Mar/17 Updated: 24/Sep/21 Resolved: 24/Sep/21 |
|
| Status: | Resolved |
| Project: | infrautils |
| Component/s: | General |
| Affects Version/s: | (unspecified) |
| Fix Version/s: | None |
| Type: | Improvement | ||
| Reporter: | Michael Vorburger | Assignee: | Unassigned |
| Resolution: | Won't Do | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Issue Links: |
|
||||||||
| Description |
|
In order to better understand root causes of issues such as e.g.

NB: https://bugs.opendaylight.org/show_bug.cgi?id=7917#c3 "Adding appropriate info in the caller's mainWorker toString would help to identify the originator. However I think capturing the caller's stack trace would be too expensive in production although it could be done in a debug mode."

I'm wondering how other Java frameworks deal with not losing the stack of the original caller when working with async lambdas in Java. There must be some "prior art" in this domain? It is perhaps worth learning a bit more about this through online research before jumping into an implementation. |
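A minimal sketch of the "debug mode" idea quoted above, assuming a plain `Runnable` job. The class `CallerTracing`, the method `wrap` and the flag `captureCallerStack` are hypothetical names used for illustration only; none of them exist in infrautils. The submitter's stack is recorded by creating a `Throwable` at submission time, and it is attached to any later failure as a suppressed exception, so the cost is only paid when the flag is on.

```java
/** Hypothetical helper for illustration; not part of infrautils. */
final class CallerTracing {
    private CallerTracing() { }

    /**
     * Wraps a job so that a RuntimeException thrown by it also carries
     * the stack trace of the thread that called wrap().
     */
    static Runnable wrap(Runnable job, boolean captureCallerStack) {
        // Constructing the Throwable is what records the submitter's stack.
        // It is skipped entirely when capture is disabled (e.g. in production).
        final Throwable submittedAt =
                captureCallerStack ? new Throwable("Job submitted here") : null;
        return () -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                if (submittedAt != null) {
                    // Shows up as "Suppressed: ..." when the failure is logged.
                    e.addSuppressed(submittedAt);
                }
                throw e;
            }
        };
    }
}
```

Whether the flag would be driven by a log level, a system property or configuration is left open here.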
| Comments |
| Comment by Michael Vorburger [ 09/Mar/17 ] |
|
https://bugs.opendaylight.org/show_bug.cgi?id=7917#c5 :
> The only way to capture caller identity is to capture it via a Throwable.
> Clean way of achieving this is to route the failure back to the requestor –
> I mean, at the end of the day, the requestor needs to know about the failure, right?

Right... so the REAL problem here is that all the enqueueJob methods in the JobCoordinator (ex-DataStoreJobCoordinator) should, instead of void, really be returning a ListenableFuture that you can attach some sort of LoggingFutureCallback to (via Futures.addCallback), right?

But even with this, you still wouldn't get a nice stack trace in a log, would you? You would just get an ERROR log from the lambda you passed as the onFailure of the FutureCallback... so this alone still wouldn't actually solve the real problem I was after above, I believe. Is there any solution to that?

But returning a Future from JobCoordinator enqueueJob would actually be very interesting for testability as well (it's something I've been battling with for the component tests). Once https://git.opendaylight.org/gerrit/#/c/51431/ is in (I don't want to hold it back further), we should probably be changing & adding that then... |
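A rough sketch of what a future-returning enqueueJob could look like, under several assumptions: it uses the JDK's `CompletableFuture` as a stand-in for Guava's `ListenableFuture` to stay dependency-free, and `FutureReturningCoordinator` and its `enqueueJob(String, Runnable)` signature are hypothetical, not the actual JobCoordinator API. It also unconditionally records the call site in a `Throwable`, which is the costly part discussed above, so that the failure routed back to the requestor includes where the job was enqueued.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical illustration; not the real JobCoordinator. */
final class FutureReturningCoordinator {
    private final Executor executor;

    FutureReturningCoordinator(Executor executor) {
        this.executor = executor;
    }

    /** Runs the job on the executor and reports its outcome via the returned future. */
    CompletableFuture<Void> enqueueJob(String key, Runnable job) {
        // Records the requestor's stack at enqueue time (assumed acceptable cost here).
        final Throwable enqueuedAt = new Throwable("enqueueJob(" + key + ") called here");
        final CompletableFuture<Void> result = new CompletableFuture<>();
        executor.execute(() -> {
            try {
                job.run();
                result.complete(null);
            } catch (Throwable t) {
                // Attach the enqueue site, then route the failure back to the requestor.
                t.addSuppressed(enqueuedAt);
                result.completeExceptionally(t);
            }
        });
        return result;
    }
}
```

A requestor could then attach a logging callback with `whenComplete((v, t) -> ...)`, and a test could block on the returned future instead of polling for the job to finish.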
| Comment by Michael Vorburger [ 22/Sep/17 ] |
|
New |
| Comment by Michael Vorburger [ 04/Oct/17 ] |
|
On further thought, I think (optionally, due to perf impact) capturing the caller's stack trace is a low priority, because e.g. in

One thing we ARE missing badly is context in case of thread death due to "Thread terminated due to uncaught exception" - such as in |
| Comment by Robert Varga [ 24/Sep/21 ] |
|
|