[OPNFLWPLUG-974] Message deserialization failed...msgType: 36864 oxm_field: 6 experimenterID: null was not found - please verify that all needed deserializers ale loaded correctly Created: 23/Jan/18 Updated: 13/Feb/18 Resolved: 13/Feb/18 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | Nitrogen, Carbon |
| Fix Version/s: | Carbon-SR4, Nitrogen-SR2, Oxygen |
| Type: | Bug | Priority: | Highest |
| Reporter: | Sam Hague | Assignee: | Gobinath Suganthan |
| Resolution: | Done | Votes: | 0 |
| Labels: | csit:exception | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
CSIT tests are failing, apparently because of the deserialization errors shown below.
2018-01-22 20:35:19,822 | INFO | nsole user karaf | core | 112 - org.apache.karaf.log.core - 4.0.10 | ROBOT MESSAGE: Starting test Ping External Network PNF from Vm Instance 1
2018-01-22 20:35:21,619 | WARN | entLoopGroup-5-4 | OFDecoder | 360 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.5.2.SNAPSHOT | Message deserialization failed
java.lang.IllegalStateException: Deserializer for key: msgVersion: 4 objectClass: org.opendaylight.yang.gen.v1.urn.opendaylight.openflow.oxm.rev150225.match.entries.grouping.MatchEntry msgType: 36864 oxm_field: 6 experimenterID: null was not found - please verify that all needed deserializers ale loaded correctly
2018-01-22 20:35:22,617 | WARN | entLoopGroup-5-4 | OFDecoder | 360 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.5.2.SNAPSHOT | Message deserialization failed
java.lang.IllegalStateException: Deserializer for key: msgVersion: 4 objectClass: org.opendaylight.yang.gen.v1.urn.opendaylight.openflow.oxm.rev150225.match.entries.grouping.MatchEntry msgType: 36864 oxm_field: 6 experimenterID: null was not found - please verify that all needed deserializers ale loaded correctly
2018-01-22 20:35:23,619 | WARN | entLoopGroup-5-4 | OFDecoder | 360 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.5.2.SNAPSHOT | Message deserialization failed
java.lang.IllegalStateException: Deserializer for key: msgVersion: 4 objectClass: org.opendaylight.yang.gen.v1.urn.opendaylight.openflow.oxm.rev150225.match.entries.grouping.MatchEntry msgType: 36864 oxm_field: 6 experimenterID: null was not found - please verify that all needed deserializers ale loaded correctly
|
| Comments |
| Comment by Anil Vishnoi [ 23/Jan/18 ] | |
|
shague We haven't merged any patches in stable/nitrogen (especially around the de/serializers) after 8th Dec, so were you seeing this issue before 8th Dec as well? | |
| Comment by Sam Hague [ 24/Jan/18 ] | |
|
Not sure... around that time is when the infra changes were happening and most jobs were aborting. We are collecting tcpdump also for this test case and will attach when the job finishes. | |
| Comment by Gobinath Suganthan [ 24/Jan/18 ] | |
|
shague Could you please enable debug logs for org.opendaylight.openflowjava.protocol.impl.core.connection.ConnectionAdapterImpl as well? Also, could you share the wireshark capture for that test case (Ping from external PNF to VM)? Thanks | |
| Comment by Sam Hague [ 24/Jan/18 ] | |
|
debug logs are at https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-nitrogen/196/. I goofed up the tcpdump so need to rerun and collect those with the logs. This job should have the tcpdump and extra debug logs when it finishes in 60m: https://jenkins.opendaylight.org/releng/view/netvirt/job/netvirt-csit-1node-openstack-ocata-gate-stateful-nitrogen/197
| |
| Comment by Sam Hague [ 24/Jan/18 ] | |
|
job with debug and pcaps: | |
| Comment by Sam Hague [ 30/Jan/18 ] | |
|
Verified is a job from 12/06/17 where things passed fine. Shortly after that the infra was changed so we don't have a good history, but around 1/2/18 with [2] we can see the failure reliably.
| |
| Comment by Gobinath Suganthan [ 31/Jan/18 ] | |
|
shague I want to run CSIT with some additional logging. The whole CSIT job takes a while to complete (and sometimes the job fails due to a timeout). Is there any way to run only a particular TC in the CSIT suite? Test case required to be run:
Avishnoi The deserialization is failing because no matchdeserializer is registered for the match of the packetIn message received. This is due to the unknown "msgType" received. The expected value is "0x8000", but it is decoded as "0x9000" (36864 in the log), and hence the matchdeserializer could not be found for the packetIn. On analysing the attached wireshark logs, I could not see any packet with msgType "0x9000", so I guess this must be due to some problem while extracting the "msgType" from the packet in OFplugin. | |
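To make the "0x8000" versus "0x9000" comparison concrete: the msgType printed in the exception appears to correspond to the 16-bit class at the start of a match entry's 4-byte OXM header. The sketch below assumes the standard OpenFlow 1.3 OXM header layout (16-bit class, 7-bit field, 1-bit has-mask flag, 8-bit payload length). `OxmHeaderDemo` and `OxmHeader` are hypothetical names used for illustration only, not classes from openflowjava, and the two sample header words are made up rather than taken from the attached capture.

```java
// Sketch only: splits a 32-bit OXM header word into its four fields.
final class OxmHeaderDemo {

    /** Hypothetical holder for the decoded fields (not a plugin class). */
    static final class OxmHeader {
        final int oxmClass;    // upper 16 bits
        final int field;       // next 7 bits
        final boolean hasMask; // next 1 bit
        final int length;      // lowest 8 bits: payload length in bytes

        OxmHeader(int oxmClass, int field, boolean hasMask, int length) {
            this.oxmClass = oxmClass;
            this.field = field;
            this.hasMask = hasMask;
            this.length = length;
        }
    }

    /** Decodes one header word; uses unsigned shifts because the top bit may be set. */
    static OxmHeader decode(int header) {
        int oxmClass = (header >>> 16) & 0xFFFF;
        int field = (header >>> 9) & 0x7F;
        boolean hasMask = ((header >>> 8) & 0x1) == 1;
        int length = header & 0xFF;
        return new OxmHeader(oxmClass, field, hasMask, length);
    }

    public static void main(String[] args) {
        // Made-up sample words: same field (6) and length (2), different class.
        int[] samples = {0x80000C02, 0x90000C02};
        for (int word : samples) {
            OxmHeader h = decode(word);
            System.out.printf("class=0x%04X (%d) field=%d hasMask=%b length=%d%n",
                    h.oxmClass, h.oxmClass, h.field, h.hasMask, h.length);
        }
    }
}
```

With this layout, a class of 0x9000 is the decimal 36864 seen in the log, while 0x8000 is 32768. A deserializer registry keyed on class and field would therefore miss as soon as the class bits are read from the wrong offset or otherwise altered.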
| Comment by Sam Hague [ 03/Feb/18 ] | |
|
We can't run a single test but we can run a single suite that contains that test. On the job config you just set the suite to run in the SUITES parameter. Let me know if you need that as I don't think you can start a job unless you push it to sandbox. | |
| Comment by Gobinath Suganthan [ 05/Feb/18 ] | |
|
shague Thanks to the wireshark logs you sent earlier, I was able to hard-code the packet and reproduce the issue locally. Observations:
This issue is observed only in stable/nitrogen because the ct_mark action is not present in the other releases (carbon, oxygen).
Nitrogen flows:
cookie=0x6900000, duration=327.336s, table=213, n_packets=5, n_bytes=490, priority=1003,ct_state=+trk,icmp,metadata=0xe0000000000/0xfffff0000000000 actions=ct(commit,zone=5002,exec(set_field:0x1->ct_mark)),resubmit(,17)
Oxygen and Carbon flows:
cookie=0x6900000, duration=293.840s, table=213, n_packets=15, n_bytes=1110, priority=1002,ct_state=+new+trk,tcp,metadata=0x130000000000/0xfffff0000000000 actions=ct(commit,zone=5002),resubmit(,17)
Avishnoi I have raised a patch for this: https://git.opendaylight.org/gerrit/#/c/67905/ Could someone tell us which module/project programs table 213, and why this pipeline change is present only in Nitrogen? | |
| Comment by Aswin Suryanarayanan [ 05/Feb/18 ] | |
|
This issue should be the result of [1], in which a new action, ct_mark, is used by netvirt. Since the change is not yet merged in branches other than Nitrogen, the issue is not observed there. [1] https://git.opendaylight.org/gerrit/#/c/66886/ | |
| Comment by Bertrand Low [ 06/Feb/18 ] | |
|
What happened is that [1] used my implementation for ct_mark support, which assumed a mask for the match entry (see [2]). As Aswin mentioned, [1] has only been merged in Nitrogen. If Gobinath's patch [3] fixes the "deserialization failed" errors, then the problematic flows are probably those with match entries like "ct_mark=0x1", not those with the ct_mark set in the conntrack portion of the action (e.g. "actions=ct(commit,zone=5002,exec(set_field:0x1->ct_mark))"). [1] https://git.opendaylight.org/gerrit/#/c/66886/ | |
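The mask assumption described above can be illustrated with a small sketch. It assumes a 32-bit match field whose payload is 4 bytes without a mask and 8 bytes (value followed by mask) with one, and it uses the has-mask bit of the entry header to decide how much to read. `MaskAwareEntryDemo`, `Entry32`, `EXAMPLE_CLASS` and `EXAMPLE_FIELD` are hypothetical placeholders, not the plugin's codec classes or verified ct_mark identifiers, and this is not the code from the patch.

```java
import java.nio.ByteBuffer;

// Sketch only: read a 32-bit match entry, consuming a mask only when the header says so.
final class MaskAwareEntryDemo {
    // Placeholder identifiers for illustration; not verified against any real field.
    static final int EXAMPLE_CLASS = 0x0001;
    static final int EXAMPLE_FIELD = 107;

    /** Hypothetical decoded entry; mask is null when the entry carries none. */
    static final class Entry32 {
        final int value;
        final Integer mask;

        Entry32(int value, Integer mask) {
            this.value = value;
            this.mask = mask;
        }
    }

    /** Builds a header word: 16-bit class, 7-bit field, 1-bit has-mask, 8-bit length. */
    static int header(int oxmClass, int field, boolean hasMask, int length) {
        return (oxmClass << 16) | (field << 9) | ((hasMask ? 1 : 0) << 8) | length;
    }

    static Entry32 readEntry32(ByteBuffer buf) {
        int header = buf.getInt();
        boolean hasMask = ((header >>> 8) & 0x1) == 1;
        int length = header & 0xFF;
        int expected = hasMask ? 8 : 4;
        if (length != expected) {
            throw new IllegalStateException("Unexpected payload length " + length);
        }
        int value = buf.getInt();
        // Reading a mask unconditionally here would consume bytes of the next entry.
        Integer mask = hasMask ? Integer.valueOf(buf.getInt()) : null;
        return new Entry32(value, mask);
    }

    /** Two entries back to back: an unmasked one, then a masked one. */
    static ByteBuffer sampleBuffer() {
        ByteBuffer buf = ByteBuffer.allocate(20);
        buf.putInt(header(EXAMPLE_CLASS, EXAMPLE_FIELD, false, 4)).putInt(0x1);
        buf.putInt(header(EXAMPLE_CLASS, EXAMPLE_FIELD, true, 8)).putInt(0x1).putInt(0xFF);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = sampleBuffer();
        while (buf.hasRemaining()) {
            Entry32 e = readEntry32(buf);
            System.out.println("value=" + e.value + " mask=" + e.mask
                    + " nextOffset=" + buf.position());
        }
    }
}
```

A decoder that always reads value plus mask would take 8 payload bytes from the first, unmasked entry, so the next header would be parsed from the wrong offset. Whether that is exactly what produced the unexpected class in this ticket is not established here.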
| Comment by Gobinath Suganthan [ 06/Feb/18 ] | |
|
Thanks, Bertrand, for pointing out the flaws. I've added a mask check in the latest patch, as Aswin had suggested. |