[AAA-176] Random errors while fetching JOLOKIA URLs in 3 node CSIT Created: 05/Jul/18  Updated: 30/Aug/18  Resolved: 30/Aug/18

Status: Resolved
Project: aaa
Component/s: None
Affects Version/s: Oxygen
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Jamo Luhrsen Assignee: Ryan Goulding
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

We are seeing the following 500 errors coming in 3node CSIT jobs (netvirt at least):

 	<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body>
<h2>HTTP ERROR: 500</h2>
<p>Problem accessing /jolokia/read/org.opendaylight.controller:Category=ShardManager,name=shard-manager-config,type=DistributedConfigDatastore. Reason:
<pre>    org.apache.shiro.session.UnknownSessionException: There is no session with id [4c109fd8-a9e1-40ea-a7b1-d0de164b77f8]</pre></p>
<hr />
</body>
</html>

An email discussion is here



 Comments   
Comment by Tom Pantelis [ 05/Jul/18 ]

The full error trace is:

2018-07-02T02:48:05,934 | WARN  | qtp1371496041-116 | HttpChannel                      | 156 - org.eclipse.jetty.util - 9.3.21.v20170918 | //10.30.170.46:8181/jolokia/read/org.opendaylight.controller:type=DistributedOperationalDatastore,Category=ShardManager,name=shard-manager-operational/LocalShards
org.apache.shiro.session.UnknownSessionException: There is no session with id [bf682403-082b-4520-993f-987a8804bd31]
        at org.apache.shiro.session.mgt.eis.AbstractSessionDAO.readSession(AbstractSessionDAO.java:170) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.mgt.DefaultSessionManager.retrieveSessionFromDataSource(DefaultSessionManager.java:236) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.mgt.DefaultSessionManager.retrieveSession(DefaultSessionManager.java:222) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.mgt.AbstractValidatingSessionManager.doGetSession(AbstractValidatingSessionManager.java:118) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.mgt.AbstractNativeSessionManager.lookupSession(AbstractNativeSessionManager.java:148) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.mgt.AbstractNativeSessionManager.lookupRequiredSession(AbstractNativeSessionManager.java:152) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.mgt.AbstractNativeSessionManager.getHost(AbstractNativeSessionManager.java:237) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.mgt.DelegatingSession.getHost(DelegatingSession.java:111) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.ProxiedSession.getHost(ProxiedSession.java:93) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.session.ProxiedSession.getHost(ProxiedSession.java:93) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.subject.support.DefaultSubjectContext.resolveHost(DefaultSubjectContext.java:270) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.mgt.DefaultSubjectFactory.createSubject(DefaultSubjectFactory.java:45) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.mgt.DefaultSecurityManager.doCreateSubject(DefaultSecurityManager.java:369) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.mgt.DefaultSecurityManager.createSubject(DefaultSecurityManager.java:344) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.mgt.DefaultSecurityManager.createSubject(DefaultSecurityManager.java:183) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.mgt.DefaultSecurityManager.login(DefaultSecurityManager.java:283) [134:org.apache.shiro.core:1.3.2]
        at org.apache.shiro.subject.support.DelegatingSubject.login(DelegatingSubject.java:256) [134:org.apache.shiro.core:1.3.2]
        at org.opendaylight.aaa.authenticator.ODLAuthenticator.authenticate(ODLAuthenticator.java:42) [198:org.opendaylight.aaa.shiro:0.8.0.SNAPSHOT]
        at Proxyb44c2d91_1f07_4b19_be55_2f3cb0209c21.authenticate(Unknown Source) [?:?]
        at org.jolokia.osgi.security.ServiceAuthenticationHttpContext.handleSecurity(ServiceAuthenticationHttpContext.java:72) [189:org.jolokia.osgi:1.5.0]
        at org.ops4j.pax.web.service.internal.WebContainerContextWrapper.handleSecurity(WebContainerContextWrapper.java:46) [440:org.ops4j.pax.web.pax-web-runtime:6.0.9]
        at org.ops4j.pax.web.service.jetty.internal.HttpServiceServletHandler.doHandle(HttpServiceServletHandler.java:69) [438:org.ops4j.pax.web.pax-web-jetty:6.0.9]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) [151:org.eclipse.jetty.security:9.3.21.v20170918]
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.ops4j.pax.web.service.jetty.internal.HttpServiceContext.doHandle(HttpServiceContext.java:284) [438:org.ops4j.pax.web.pax-web-jetty:6.0.9]
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [154:org.eclipse.jetty.servlet:9.3.21.v20170918]
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.ops4j.pax.web.service.jetty.internal.JettyServerHandlerCollection.handle(JettyServerHandlerCollection.java:80) [438:org.ops4j.pax.web.pax-web-jetty:6.0.9]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.server.Server.handle(Server.java:534) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [153:org.eclipse.jetty.server:9.3.21.v20170918]
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) [145:org.eclipse.jetty.io:9.3.21.v20170918]
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108) [145:org.eclipse.jetty.io:9.3.21.v20170918]
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [145:org.eclipse.jetty.io:9.3.21.v20170918]
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [156:org.eclipse.jetty.util:9.3.21.v20170918]
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [156:org.eclipse.jetty.util:9.3.21.v20170918]
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [156:org.eclipse.jetty.util:9.3.21.v20170918]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [156:org.eclipse.jetty.util:9.3.21.v20170918]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [156:org.eclipse.jetty.util:9.3.21.v20170918]
        at java.lang.Thread.run(Thread.java:748) [?:?]
Comment by Tom Pantelis [ 05/Jul/18 ]

Do we have anymore context for this issue? Eg, does it involve downed nodes etc

Comment by Jamo Luhrsen [ 05/Jul/18 ]

I haven't dug in to the upstream netvirt CSIT where k.faseela has seen this, but the one instance I notice recently in downstream CSIT
saw this happen after a node was isolated and brought back. The 500 was happening for 60s before the job gave up.

Comment by Tom Pantelis [ 08/Jul/18 ]

I have a potential solution for this - will push a patch in he next couple days.

Comment by Tom Pantelis [ 08/Jul/18 ]

Let's see if https://git.opendaylight.org/gerrit/#/c/73821/ alleviates the issue.

Comment by Tom Pantelis [ 09/Jul/18 ]

Patch is merged.

Comment by Jamo Luhrsen [ 09/Jul/18 ]

k.faseela do you have the easy way to check this is not happening any more? I think your netvirt 3node csit work was
seeing this right? Could you check again after this patch was merged and close this jira out if it's not there? please

Comment by Faseela K [ 09/Jul/18 ]

jluhrsen : I was hitting this error, when I remove the retry for fetching the jolokia URL. I will see if I am hitting the issue on the older patchset of my integration/test patch

Comment by Victor Pickard [ 27/Jul/18 ]

I'm seeing this issue also in the d/s scale lab. I'm using the clustering monitoring tool, which works and I can see the cluster/shard status just fine.

 

Then, when I start the scale tests using browbeat, all of the nodes return http 500 errors, and I see all of these same exceptions in the logs.  

 

 

Generated at Wed Feb 07 19:08:50 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.