[NEUTRON-154] Neutron's Jersey Problem during the big odlparent/yangtools version bump Created: 15/Jan/18  Updated: 22/Jan/18  Resolved: 16/Jan/18

Status: Resolved
Project: neutron
Component/s: General
Affects Version/s: None
Fix Version/s: master

Type: Bug Priority: Highest
Reporter: Michael Vorburger Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to NEUTRON-124 Upgrade Jersey from 1.X to 2.X Resolved
relates to NETCONF-502 API docs does not work after upstream... Resolved

 Description   

The "big bump" of odlparent & yangtools with https://git.opendaylight.org/gerrit/#/c/66509/ breaks Neutron:

[INFO] --- maven-failsafe-plugin:2.20.1:integration-test (default) @ integration-test ---
[INFO] 
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.opendaylight.neutron.e2etest.ITNeutronE2E
(...)
Karaf started in 1s. Bundle stats: 12 active, 12 total
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 221.886 s <<< FAILURE! - in org.opendaylight.neutron.e2etest.ITNeutronE2E
[ERROR] test(org.opendaylight.neutron.e2etest.ITNeutronE2E) Time elapsed: 195.033 s <<< FAILURE!
java.lang.AssertionError: Network Collection GET to URL http://127.0.0.1:8181/controller/nb/v2/neutron/networks with wait failed
 at org.opendaylight.neutron.e2etest.ITNeutronE2E.test(ITNeutronE2E.java:90)

The cause is this which can be seen in integration/test/target/exam/.../data/log/karaf.log:

2018-01-15T14:42:51,103 | ERROR | qtp881932043-138 | ContainerResponse                | 35 - com.sun.jersey.jersey-server - 1.17.0 | The registered message body writers compatible with the MIME media type are:
*/* ->
  com.sun.jersey.core.impl.provider.entity.FormProvider
  com.sun.jersey.core.impl.provider.entity.MimeMultipartProvider
  com.sun.jersey.core.impl.provider.entity.StringProvider
  com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
  com.sun.jersey.core.impl.provider.entity.FileProvider
  com.sun.jersey.core.impl.provider.entity.InputStreamProvider
  com.sun.jersey.core.impl.provider.entity.DataSourceProvider
  com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General
  com.sun.jersey.core.impl.provider.entity.ReaderProvider
  com.sun.jersey.core.impl.provider.entity.DocumentProvider
  com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider
  com.sun.jersey.core.impl.provider.entity.SourceProvider$SourceWriter
  com.sun.jersey.server.impl.template.ViewableMessageBodyWriter
  com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General
  com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General
2018-01-15T14:42:51,103 | ERROR | qtp881932043-138 | ContainerResponse                | 35 - com.sun.jersey.jersey-server - 1.17.0 | Mapped exception to response: 500 (Internal Server Error)
javax.ws.rs.WebApplicationException: com.sun.jersey.api.MessageException: A message body writer for Java class org.opendaylight.neutron.northbound.api.NeutronNetworkRequest, and Java type class org.opendaylight.neutron.northbound.api.NeutronNetworkRequest, and MIME media type application/json was not found
        at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:285) [35:com.sun.jersey.jersey-server:1.17.0]

There are other exceptions in the log, but they are not causing this problem. Notably, the problem below is unrelated, even though it seems to sometimes cause ITNeutronE2E to fail at first with org.ops4j.pax.swissbox.tracker.ServiceLookupException: gave up waiting for service org.ops4j.pax.exam.ProbeInvoker. On retrying the build, the test gets past that and hits the error above. The problem below is something else, for which skitt says he has a solution coming up in controller:

2018-01-15T14:39:56,109 | ERROR | ConfigFeatureListener - ConfigPusher | FeatureConfigPusher              | 47 - config-persister-feature-adapter - 0.8.0.SNAPSHOT | Giving up (after 100 retries) on Karaf featuresService.listInstalledFeatures() which has not yet finished installing feature jaas-boot 0.0.0
2018-01-15T14:39:56,154 | INFO  | ConfigFeatureListener - ConfigPusher | FeatureConfigPusher              | 47 - config-persister-feature-adapter - 0.8.0.SNAPSHOT | Karaf Feature Service has not yet finished installing feature jaas-boot/0.0.0 (retry 0)

It is trivial to reproduce this issue: start neutron/karaf/target/assembly/bin/karaf and issue an HTTP GET on http://127.0.0.1:8181/controller/nb/v2/neutron/networks, even just with your web browser, using HTTP BASIC admin/admin. This produces the above in the Karaf log.
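
For anyone who prefers a scripted reproduction over a browser, here is a minimal Java sketch. It assumes the Karaf instance from the assembly is already running locally with the default admin/admin credentials. The class and method names (NeutronGetRepro, basicAuthHeader, fetchStatus) are made up for illustration and are not part of the neutron code base.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of the neutron code base.
final class NeutronGetRepro {
    static final String NETWORKS_URL =
        "http://127.0.0.1:8181/controller/nb/v2/neutron/networks";

    // Builds the value of the HTTP "Authorization" header for BASIC auth.
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
            .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Issues the GET and returns the HTTP status code reported by the server.
    static int fetchStatus(String url, String user, String password) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/json");
            conn.setRequestProperty("Authorization", basicAuthHeader(user, password));
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling NeutronGetRepro.fetchStatus(NeutronGetRepro.NETWORKS_URL, "admin", "admin") against an affected build should, going by the log above, come back with a 500 and leave the MessageException in karaf.log.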



 Comments   
Comment by Tom Pantelis [ 15/Jan/18 ]

I got past the "A message body writer..." error by adding:

 
    <dependency>
      <groupId>com.sun.jersey</groupId>
      <artifactId>jersey-json</artifactId>
    </dependency>
 
to the odl-neutron-northbound-api feature pom based on https://stackoverflow.com/questions/13108161/a-message-body-writer-for-java-class-not-found  but now I get:
 
2018-01-13T02:18:44,772 | ERROR | qtp1602373834-139 | ContainerResponse                | 35 - com.sun.jersey.jersey-server - 1.17.0 | Mapped exception to response: 500 (Internal Server Error)
javax.ws.rs.WebApplicationException: javax.xml.bind.JAXBException
 - with linked exception:
[java.lang.ClassNotFoundException: com.sun.xml.bind.v2.ContextFactory cannot be found by org.opendaylight.neutron.northbound-api_0.10.0.SNAPSHOT]
        at com.sun.jersey.core.provider.jaxb.AbstractRootElementProvider.writeTo(AbstractRootElementProvider.java:159) [34:com.sun.jersey.core:1.17.0]
        at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:306) [35:com.sun.jersey.jersey-server:1.17.0]
 
I got past that by adding com.sun.xml.bind.v2 to Import-Package, but now get another failure:
 
java.lang.AssertionError: E2E Tests Failed - Collection not Array
  at org.opendaylight.neutron.e2etest.ITNeutronE2E.test_fetch_collection_response(ITNeutronE2E.java:276)
  at org.opendaylight.neutron.e2etest.ITNeutronE2E.test_fetch_with_one_query_item(ITNeutronE2E.java:296)    
  at org.opendaylight.neutron.e2etest.ITNeutronE2E.test(ITNeutronE2E.java:95)
 
 
 
 

Comment by Robert Varga [ 15/Jan/18 ]

This is probably related to https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.8 and more specifically to https://github.com/FasterXML/jackson-jaxrs-providers/issues/22 .

Comment by Michael Vorburger [ 15/Jan/18 ]

https://git.opendaylight.org/gerrit/#/c/67165/ from skitt fixes the ConfigPusher (which has nothing at all directly to do with this, but initially seemed to be the reason for the ITNeutronE2E failure, but turned out not to be).

The "Collection not Array" is because jsonElementValue now suddenly is not a JSON array anymore, so somehow Jersey is behaving differently after the bump. I've added logging to the test and see that it now is this: "id":"4e8e5957-649f-477b-9e5b-f1f75b21c03c","tenant_id":"9bacb3c5d39d41a79512987f338cf177","name":"net1","admin_state_up":"true","status":"ACTIVE","shared":"false","external":"false","network_type":"flat","segments":[null,null]. I know nothing at all about neutron code, but will try to understand whether this is some... mapping annotation that needs to be adjusted?

Comment by Robert Varga [ 15/Jan/18 ]

Well, I think this has something to do with MOXyJsonProvider, as that seems to be the thing that is supposed to take care of JSON data. I suspect adding jersey-json changes the provider behavior...

Comment by Michael Vorburger [ 15/Jan/18 ]

Right, so the full response we get from a GET on http://127.0.0.1:8181/controller/nb/v2/neutron/networks?status=ACTIVE is:

{"networks":{"id":"4e8e5957-649f-477b-9e5b-f1f75b21c03c","tenant_id":"9bacb3c5d39d41a79512987f338cf177","name":"net1","admin_state_up":"true","status":"ACTIVE","shared":"false","external":"false","network_type":"flat","segments":[null,null]}}

which in the code comes from the class NeutronNetworksNorthbound which returns a NeutronNetworkRequest (yup, you read that right, that appears to be the Response despite the class being named Request!). Now that NeutronNetworkRequest has an @XmlElement(name = "networks") List<NeutronNetwork> which due to some change in the JSON serialization with this bump is no longer serialized as a JSON array despite being a List. Hm.
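
To make the "array vs. object" distinction concrete, here is a small sketch of a shape check on the response body. It is a naive string scan using only the JDK, written for illustration. The real ITNeutronE2E check may well be implemented differently, and the names CollectionShapeCheck and isJsonArray are hypothetical.

```java
// Hypothetical helper for illustration; the real ITNeutronE2E check may differ.
final class CollectionShapeCheck {
    // Returns true if the value following the first occurrence of the key
    // starts with '['. Naive scan that assumes no whitespace between the key
    // and the colon; it is not a JSON parser.
    static boolean isJsonArray(String json, String key) {
        String needle = "\"" + key + "\":";
        int start = json.indexOf(needle);
        if (start < 0) {
            throw new IllegalArgumentException("key not found: " + key);
        }
        int i = start + needle.length();
        // Skip any whitespace between the colon and the value.
        while (i < json.length() && Character.isWhitespace(json.charAt(i))) {
            i++;
        }
        return i < json.length() && json.charAt(i) == '[';
    }
}
```

Applied to the response above, isJsonArray(body, "networks") is false, because the value starts with '{' rather than '['.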

Comment by Michael Vorburger [ 15/Jan/18 ]

Today's free fun fact: If we hack a bulk.add(new NeutronNetwork()); into line 42 of NeutronNetworkRequest, then it is a JSON array! I'm obviously not suggesting that be the fix, but it proves that there is some... wrong optimization in whatever JSON serializer is being used here, which thinks that if a List contains only 1 element it can be written out as just a JSON object instead of an array. Huh.
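
The observed behavior can be mimicked with a toy model. This is emphatically not the code of any real JSON provider. It only shows how a "reduce single-item lists" shortcut would produce exactly the two shapes seen here, and all names in it (SingleItemListModel, write, reduceSingleItem) are invented for the illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// Toy model only; this is not the code of any real JSON provider.
final class SingleItemListModel {
    // Writes already-rendered JSON items under a key. When reduceSingleItem is
    // true, a one-element list is written as the bare element, not as an array.
    static String write(String key, List<String> items, boolean reduceSingleItem) {
        String value;
        if (reduceSingleItem && items.size() == 1) {
            value = items.get(0);
        } else {
            value = items.stream().collect(Collectors.joining(",", "[", "]"));
        }
        return "{\"" + key + "\":" + value + "}";
    }
}
```

With the shortcut enabled, one network yields an object under "networks", while two networks (the bulk.add hack) yield an array, which matches what the test sees.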

Perhaps the better next step here would be to truly understand how the Big Bump of odlparent and yangtools is causing this change...

Comment by Robert Varga [ 15/Jan/18 ]

Yeah, that is codec weirdness/optimization of flattening a collection to a single element? At any rate we should figure out why moxy no longer works as it should.

Comment by Robert Varga [ 15/Jan/18 ]

Another thing is the question of whether moxy was ever used at all, and whether the breakage actually comes from the upgraded jackson.

 

Comment by Robert Varga [ 15/Jan/18 ]

So to sum this up:

  • in default configuration MOXy does not seem to be used to perform serialization (anymore?, that needs to be confirmed vs. nitrogen)
  • with jersey-json we get serialization working, but it is broken around single-item lists

 

Hence we either need to get moxy working or replace it with a different JSON provider – for AAA we have the proposal to use GSON at https://git.opendaylight.org/gerrit/#/c/66056/.

Comment by Michael Vorburger [ 15/Jan/18 ]

https://www.eclipse.org/eclipselink/api/2.4/org/eclipse/persistence/jaxb/MarshallerProperties.html#JSON_REDUCE_ANY_ARRAYS looks interesting... but yes rovarga, you raise a good point, this seems to be a complete mess now - I'm myself confused about which JSON serializer (codec) is actually used here now.

Note the presence of neutron/northbound-api/src/main/resources/org/opendaylight/controller/networkconfig/neutron/northbound/jaxb.properties, but that is in a weird package, given that NeutronNetworkRequest and Co. are in org.opendaylight.neutron.northbound.api, no? Was this project (neutron) originally using Moxy, and by adding jersey-json we've made it use Jackson with (non-Moxy) JAXB instead, causing this (and probably other...) issues? Perhaps instead of fixing the JSON array problem in isolation, we should take a step back and understand better what's really going on here. The jersey-json dependency may well not actually be the right solution here.

But the real question probably is what actually changed in odlparent 3.0.2 compared to 2.0.x to cause the original issue... suggestions, anyone?

Comment by Michael Vorburger [ 15/Jan/18 ]

> Hence we either need to get moxy working or replace it with a different JSON provider – for AAA we have the proposal to use GSON

given that the world is broken and a lot of people cannot work on master, I think the priority here should be to get neutron working as it used to before the odlparent bump, which (I think) means understanding what makes Jersey use Moxy and how that broke compared to how it worked before the bump. I'm not clear whether that means that jersey-json is a bad idea - we probably need to understand how jersey can be made to work with moxy: how it did before, why it broke, and how to get it back working.

Actually replacing it with a different JSON serializer, like GSON, seems safer to do separately later - who knows what other surprises that may lead to.

Comment by Robert Varga [ 15/Jan/18 ]

The only related change I am aware of is the upgrade from jackson-2.3.2 to jackson-2.8.9...

Comment by Robert Varga [ 15/Jan/18 ]

Given that https://git.opendaylight.org/gerrit/#/c/67168/ , I am not sure the setup story in this project was completely correct to begin with.

Comment by Michael Vorburger [ 15/Jan/18 ]

So the NeutronNorthboundRSApplication configures a MOXyJsonProvider, so Neutron clearly originally wanted to use Moxy instead of Jackson...

Comment by Robert Varga [ 15/Jan/18 ]

jaxb.properties seems to be correct according to https://www.eclipse.org/eclipselink/documentation/2.6/moxy/runtime001.htm#CACFEGHC .

Comment by Michael Vorburger [ 15/Jan/18 ]

> Given that https://git.opendaylight.org/gerrit/#/c/67168/ , I am not sure the setup story in this project was completely correct to begin with.

Interesting, but my suspicion would be that's more of a minor oversight leading to that WARN log, and not the real cause of (or related to) the problem we're chasing here? The MOXyJsonProvider seems to have been configured both as a class and as an instance, and it looks like if it's configured as an instance then one does not need to configure it as a class, which makes sense.

Let us remove the jersey-json dependency again, and try to understand the real cause of the initial WebApplicationException: com.sun.jersey.api.MessageException - despite the configuration of the MOXyJsonProvider ?

Comment by Robert Varga [ 15/Jan/18 ]

One more possibility is that we have a clash on javax.ws.rs(.ext) package and we end up not interpreting MOXyJsonProvider as a correct Provider...

Comment by Robert Varga [ 15/Jan/18 ]

Yup, confirmed here:

javax.ws.rs.client    │ 2.0.1 │ 68 │ javax.ws.rs-api
javax.ws.rs.container │ 2.0.1 │ 68 │ javax.ws.rs-api
javax.ws.rs.core      │ 1.1.1 │ 34 │ com.sun.jersey.core
javax.ws.rs.core      │ 2.0.1 │ 68 │ javax.ws.rs-api
javax.ws.rs.ext       │ 1.1.1 │ 34 │ com.sun.jersey.core
javax.ws.rs.ext       │ 2.0.1 │ 68 │ javax.ws.rs-api
javax.ws.rs           │ 1.1.1 │ 34 │ com.sun.jersey.core
javax.ws.rs           │ 2.0.1 │ 68 │ javax.ws.rs-api

Comment by Michael Vorburger [ 15/Jan/18 ]

> jaxb.properties seems to be correct accordking to https://www.eclipse.org/eclipselink/documentation/2.6/moxy/runtime001.htm#CACFEGHC .

I don't think so; note "the same package (directory) in which your model classes reside", and it's in an old org/opendaylight/controller/networkconfig/neutron/northbound/ instead of in org/opendaylight/neutron/northbound/api/ where it should be now... but that's not our problem here: the javax.ws.rs.WebApplicationException and com.sun.jersey.api.MessageException, where Jersey isn't able to find a codec for MIME application/json, most probably hit us and fail long before any JAXB related stuff. And if the MOXyJsonProvider did work, I suspect that it probably has the equivalent of javax.xml.bind.context.factory=org.eclipse.persistence.jaxb.JAXBContextFactory in jaxb.properties in code instead, so I bet that file has just been wrong for years and we could probably completely remove that jaxb.properties file, wherever it is, and it will still work (once it uses the MOXyJsonProvider).
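
As a quick illustration of the "same package as the model classes" rule quoted above, the following sketch derives the resource path where jaxb.properties would be expected for a given model class. JaxbPropertiesLocation and expectedResourcePath are hypothetical names. Nothing here touches MOXy itself.

```java
// Hypothetical helper: derives the classpath location that the quoted
// documentation implies for jaxb.properties (same package as the model class).
final class JaxbPropertiesLocation {
    static String expectedResourcePath(String fullyQualifiedClassName) {
        int lastDot = fullyQualifiedClassName.lastIndexOf('.');
        if (lastDot < 0) {
            // Class in the default package: the file sits at the classpath root.
            return "jaxb.properties";
        }
        String pkg = fullyQualifiedClassName.substring(0, lastDot);
        return pkg.replace('.', '/') + "/jaxb.properties";
    }
}
```

For org.opendaylight.neutron.northbound.api.NeutronNetworkRequest this gives org/opendaylight/neutron/northbound/api/jaxb.properties, which is not where the file currently lives.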

> One more possibility is that we have a clash on javax.ws.rs(.ext) package and we end up not interpreting MOXyJsonProvider as a correct Provider...
> Yup, confirmed here:

Bloody OSGi! So wait, what does this mean? We got the same package in two bundles with different JAX RS versions? And something sets some stupid static in the wrong one of the two? Phew.

Comment by Robert Varga [ 15/Jan/18 ]

Looking at moxy's headers, it does not specify a version for its javax.ws.rs import:

EclipseLink MOXy (176)
----------------------
Archiver-Version = Plexus Archiver
Build-Jdk = 1.7.0_80
Built-By = genie.eclipselink
Created-By = 1.6.0_21 (Sun Microsystems Inc.)
HK2-Bundle-Name = org.eclipse.persistence:org.eclipse.persistence.moxy
Manifest-Version = 1.0

Bundle-ManifestVersion = 2
Bundle-Name = EclipseLink MOXy
Bundle-SymbolicName = org.eclipse.persistence.moxy
Bundle-Vendor = Eclipse.org - EclipseLink Project
Bundle-Version = 2.6.2.v20151217-774c696

Export-Package =
org.eclipse.persistence.internal.jaxb;version=2.6.2,
org.eclipse.persistence.internal.jaxb.many;version=2.6.2,
org.eclipse.persistence.jaxb;version=2.6.2,
org.eclipse.persistence.jaxb.attachment;version=2.6.2,
org.eclipse.persistence.jaxb.compiler;version=2.6.2,
org.eclipse.persistence.jaxb.dynamic;version=2.6.2,
org.eclipse.persistence.jaxb.dynamic.metadata;version=2.6.2,
org.eclipse.persistence.jaxb.javamodel;version=2.6.2,
org.eclipse.persistence.jaxb.javamodel.oxm;version=2.6.2,
org.eclipse.persistence.jaxb.javamodel.reflection;version=2.6.2,
org.eclipse.persistence.jaxb.metadata;version=2.6.2,
org.eclipse.persistence.jaxb.rs;version=2.6.2,
org.eclipse.persistence.jaxb.xmlmodel;version=2.6.2
Import-Package =
com.sun.xml.bind;resolution:=optional,
com.sun.xml.bind.annotation;resolution:=optional,
com.sun.xml.bind.api;resolution:=optional,
com.sun.xml.bind.api.impl;resolution:=optional,
com.sun.codemodel;resolution:=optional;version="[2.2.11,3)",
com.sun.xml.xsom;resolution:=optional,
com.sun.xml.xsom.impl;resolution:=optional,
com.sun.xml.xsom.impl.parser;resolution:=optional,
com.sun.tools.xjc;resolution:=optional;version="[2.2.11,3)",
com.sun.tools.xjc.model;resolution:=optional;version="[2.2.11,3)",
com.sun.tools.xjc.outline;resolution:=optional;version="[2.2.11,3)",
javax.activation;resolution:=optional,
javax.json;resolution:=optional,
javax.json.stream;resolution:=optional,
javax.naming;resolution:=optional,
javax.validation;resolution:=optional;version=1.1.0,
javax.validation.constraints;resolution:=optional;version=1.1.0,
javax.validation.groups;resolution:=optional;version=1.1.0,
javax.ws.rs;resolution:=optional,
javax.ws.rs.core;resolution:=optional,
javax.ws.rs.ext;resolution:=optional,

 

Hence I suspect it binds to 2.0.1 and jersey no longer recognizes it.
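
The suspicion can be illustrated with a deliberately simplified model of package resolution. The assumption, labeled as such, is that an import without a version constraint matches every exported version and that the resolver then prefers the highest one. A real OSGi resolver applies more rules than this. UnversionedImportModel and its methods are invented for the sketch.

```java
import java.util.List;
import java.util.Optional;

// Simplified model for illustration; a real OSGi resolver applies more rules.
final class UnversionedImportModel {
    // Compares dotted numeric versions such as "1.1.1" and "2.0.1".
    // Missing segments are treated as 0, so "2" equals "2.0.0".
    static int compareVersions(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        for (int i = 0; i < Math.max(as.length, bs.length); i++) {
            int x = i < as.length ? Integer.parseInt(as[i]) : 0;
            int y = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Assumption: with no version constraint every export is a candidate,
    // and this model simply picks the highest version among them.
    static Optional<String> pickForUnversionedImport(List<String> exportedVersions) {
        return exportedVersions.stream().max(UnversionedImportModel::compareVersions);
    }
}
```

Fed the two javax.ws.rs exports present in this container (1.1.1 from com.sun.jersey.core and 2.0.1 from javax.ws.rs-api), the model picks 2.0.1.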

Comment by Robert Varga [ 15/Jan/18 ]

Yup, it is precisely that:

karaf@root()> bundle:requirements 176
org.eclipse.persistence.moxy_2.6.2.v20151217-774c696 [176] requires:
--------------------------------------------------------------------
osgi.wiring.package; (osgi.wiring.package=javax.activation) resolved by:
   osgi.wiring.package; javax.activation 1.1.0 from org.eclipse.osgi_3.11.3.v20170209-1843 [0]
osgi.wiring.package; (osgi.wiring.package=javax.json) resolved by:
   osgi.wiring.package; javax.json 1.0.0 from org.glassfish.javax.json_1.0.4 [179]
osgi.wiring.package; (osgi.wiring.package=javax.json.stream) resolved by:
   osgi.wiring.package; javax.json.stream 1.0.0 from org.glassfish.javax.json_1.0.4 [179]
osgi.wiring.package; (osgi.wiring.package=javax.naming) resolved by:
   osgi.wiring.package; javax.naming 0.0.0 from org.eclipse.osgi_3.11.3.v20170209-1843 [0]
osgi.wiring.package; (&(osgi.wiring.package=javax.validation)(version>=1.1.0)) resolved by:
   osgi.wiring.package; javax.validation 1.1.0.Final from javax.validation.api_1.1.0.Final [66]
osgi.wiring.package; (&(osgi.wiring.package=javax.validation.constraints)(version>=1.1.0)) resolved by:
   osgi.wiring.package; javax.validation.constraints 1.1.0.Final from javax.validation.api_1.1.0.Final [66]
osgi.wiring.package; (&(osgi.wiring.package=javax.validation.groups)(version>=1.1.0)) resolved by:
   osgi.wiring.package; javax.validation.groups 1.1.0.Final from javax.validation.api_1.1.0.Final [66]
osgi.wiring.package; (osgi.wiring.package=javax.ws.rs) resolved by:
   osgi.wiring.package; javax.ws.rs 2.0.1 from javax.ws.rs-api_2.0.1 [68]
osgi.wiring.package; (osgi.wiring.package=javax.ws.rs.core) resolved by:
   osgi.wiring.package; javax.ws.rs.core 2.0.1 from javax.ws.rs-api_2.0.1 [68]
osgi.wiring.package; (osgi.wiring.package=javax.ws.rs.ext) resolved by:
   osgi.wiring.package; javax.ws.rs.ext 2.0.1 from javax.ws.rs-api_2.0.1 [68]

Comment by Robert Varga [ 15/Jan/18 ]

Looking at moxy 2.7.1, it will explicitly bind to version 2.0.1, so upgrading it will not solve our problem. Looking at jersey, 1.19.4 no longer packages javax.ws.rs, so that may be a path out of this. Alternatively, we wrap moxy and force it to resolve javax.ws.rs to 1.1.x.
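
To show what "force it to resolve javax.ws.rs to 1.1.x" would mean in terms of version matching, here is a small sketch of an interval check in the OSGi range notation. The range "[1.1,2)" is an assumed way of expressing 1.1.x, not something taken from an actual manifest, and VersionRangeModel is a hypothetical stand-in rather than the framework's own range class.

```java
// Simplified model for illustration; not the OSGi framework's own range class.
final class VersionRangeModel {
    // Compares dotted numeric versions; missing segments are treated as 0.
    static int compare(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        for (int i = 0; i < Math.max(as.length, bs.length); i++) {
            int x = i < as.length ? Integer.parseInt(as[i]) : 0;
            int y = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Supports ranges like "[1.1,2)": square brackets are inclusive bounds,
    // round brackets are exclusive bounds.
    static boolean inRange(String range, String version) {
        boolean lowInclusive = range.charAt(0) == '[';
        boolean highInclusive = range.charAt(range.length() - 1) == ']';
        String[] bounds = range.substring(1, range.length() - 1).split(",");
        int low = compare(version, bounds[0].trim());
        int high = compare(version, bounds[1].trim());
        boolean aboveLow = lowInclusive ? low >= 0 : low > 0;
        boolean belowHigh = highInclusive ? high <= 0 : high < 0;
        return aboveLow && belowHigh;
    }
}
```

Under that assumed range, 1.1.1 (the version exported by com.sun.jersey.core) is accepted and 2.0.1 is rejected, so a wrapped moxy importing javax.ws.rs with such a range could only wire to the Jersey-provided packages.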

Comment by Robert Varga [ 15/Jan/18 ]

Confirmed to be present.

Comment by Michael Vorburger [ 15/Jan/18 ]

> Looking at moxy 2.7.1, it will explicitly bind to version 2.0.1,

was just looking at that as well... we're apparently on moxy 2.6.2, so attempting a bump to 2.7.1 seems reasonable?

So if we want to avoid surprises down the road on the next round of bumps and get this over with, that would be the right thing to do here, IMHO...

> so upgrading it will not solve our problem. Looking at jersey, 1.19.4 no longer packages javax.ws.rs, so that may be path out of this.

that sounds like worth a try?

> Alternatively we wrap moxy and force it to resolve javax.ws.rs to 1.1.x.

yeah, short term alternative; but then we have to remember to remove this work-around again sooner or later, no?

Comment by Robert Varga [ 15/Jan/18 ]

Yeah, so to unblock netvirt I propose to just munge moxy into northbound-api (yeah, it is ugly) and discuss the jersey situation on the kernel projects call. At this point, everything for the upgrade to 2.x should be ready from odlparent and is really all about aaa+netconf and other downstreams...

Comment by Robert Varga [ 15/Jan/18 ]

This just shows how technical debt can explode at the worst possible time.

Comment by Robert Varga [ 15/Jan/18 ]

As for jersey-1.19: it is probably the middle ground here, but given how much non-progress we have made over the years in moving from 1.17, I would strongly suggest biting the bullet (again) and cleaning up the mess once and hopefully forever.

Comment by Tom Pantelis [ 15/Jan/18 ]

Just catching up - had to step out for a while. Nice find Robert and Michael. 

Comment by Robert Varga [ 16/Jan/18 ]

https://git.opendaylight.org/gerrit/#/c/66509/8

Comment by Isaku Yamahata [ 17/Jan/18 ]

Thanks for working hard during the weekend.
For Fluorine, let's fix it right, uniformly among projects.

Comment by Michael Vorburger [ 22/Jan/18 ]

FTR: NETCONF-502 is hitting something very similar...

Generated at Wed Feb 07 20:25:42 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.