[VTN-44] System test for VTN Manager fails. Created: 07/Oct/14  Updated: 19/Oct/17  Resolved: 30/Jun/15

Status: Resolved
Project: vtn
Component/s: VTN Manager
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Hideyuki Tai
Assignee: Unassigned
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
is blocked by CONTROLLER-927 The order of switch event notificatio... Resolved
External issue ID: 2158

 Description   

The Continuous System Integration Test (CSIT) for VTN Manager features has been failing repeatedly.

https://jenkins.opendaylight.org/integration/view/CSIT%20Jobs/job/integration-master-csit-karaf-vtn-all/

Builds #165 through #174 of the above Jenkins job failed.

Sometimes it failed to forward ping packets.
https://jenkins.opendaylight.org/integration/view/CSIT%20Jobs/job/integration-master-csit-karaf-vtn-all/167/robot/

Other times it failed to delete a virtual tenant network.
https://jenkins.opendaylight.org/integration/view/CSIT%20Jobs/job/integration-master-csit-karaf-vtn-all/174/robot/



 Comments   
Comment by Hideyuki Tai [ 08/Oct/14 ]

We've analyzed the failure of build #173 of the csit-karaf-vtn-all job.
https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/173/

Build #173 successfully forwarded ping packets between h1 and h3 on the virtual bridge "vBridge1".
However, it failed to forward ping packets between h2 and h4 on the virtual bridge "vBridge2".
https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/173/robot/

We've also checked the log file of build #173.
https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/173/artifact/karaf.log

We found that TopologyManager received the notification of the edge addition (s3-eth2 <-> s1-eth2) before the notifications of the Node and NodeConnector additions (s3, s3-eth2, and s1-eth2), so TopologyManager ignored the edge.
In such a case, neither dijkstra_implementation nor VTN Manager is notified of the edge.
We suspect this is related to the ping failure.

2014-10-06 08:58:02,394 | WARN | DOM-OPER-DCL-8 | TopologyManagerImpl | 359 - org.opendaylight.controller.topologymanager - 0.4.2.SNAPSHOT | Ignore edge that contains invalid node connector: (OF|3@OF|00:00:00:00:00:00:00:03->OF|2@OF|00:00:00:00:00:00:00:01)

2014-10-06 08:58:02,398 | WARN | DOM-OPER-DCL-8 | TopologyManagerImpl | 359 - org.opendaylight.controller.topologymanager - 0.4.2.SNAPSHOT | Ignore edge that contains invalid node connector: (OF|2@OF|00:00:00:00:00:00:00:01->OF|3@OF|00:00:00:00:00:00:00:03)

2014-10-06 08:58:02,401 | INFO | DOM-OPER-DCL-6 | VTNManagerImpl | 450 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default: addNode: New node OF|00:00:00:00:00:00:00:03

2014-10-06 08:58:02,401 | INFO | DOM-OPER-DCL-6 | VTNManagerImpl | 450 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default: addPort: New port: port=OF|2@OF|00:00:00:00:00:00:00:03, prop=PortProperty[name=s3-eth2,cost=1000,enabled]

2014-10-06 08:58:02,416 | INFO | DOM-OPER-DCL-6 | VTNManagerImpl | 450 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default: addPort: New port: port=OF|2@OF|00:00:00:00:00:00:00:01, prop=PortProperty[name=s1-eth2,cost=1000,enabled]
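
The ordering problem described above can be illustrated with a small toy model. This is only a sketch and not the actual TopologyManagerImpl code: the class `TopologySketch`, the function `replay`, and the `buffer_early_edges` option are hypothetical names invented for illustration. The buffering variant is one conceivable way to tolerate early edge notifications and is not necessarily what any real fix does.

```python
class TopologySketch:
    """Toy model of the ordering issue; NOT the real TopologyManagerImpl."""

    def __init__(self, buffer_early_edges=False):
        self.connectors = set()   # node connectors known so far
        self.edges = set()        # edges accepted into the topology
        self.pending = set()      # early edges kept for later (hypothetical)
        self.buffer_early_edges = buffer_early_edges

    def add_connector(self, connector):
        self.connectors.add(connector)
        # Promote buffered edges whose two endpoints are now both known.
        ready = {e for e in self.pending if set(e) <= self.connectors}
        self.pending -= ready
        self.edges |= ready

    def add_edge(self, src, dst):
        if src in self.connectors and dst in self.connectors:
            self.edges.add((src, dst))
        elif self.buffer_early_edges:
            self.pending.add((src, dst))
        # Otherwise the edge is silently ignored, as in the WARN lines above.


def replay(topology):
    """Replay the order seen in the log: edges first, connectors afterwards."""
    topology.add_edge("s3-eth2", "s1-eth2")
    topology.add_edge("s1-eth2", "s3-eth2")
    topology.add_connector("s3-eth2")
    topology.add_connector("s1-eth2")
    return topology.edges
```

With the default settings, `replay(TopologySketch())` ends with no edges at all, which mirrors a topology in which the link between s1 and s3 is never reported to listeners. With `buffer_early_edges=True`, both directions of the edge are accepted once the connectors arrive.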

In this case, we would expect VTN Manager to output an ERROR message reporting a Path Fault, and to fail to forward ping packets between h1 and h3 as well.
However, we could not find any ERROR message reporting a Path Fault, and forwarding ping packets between h1 and h3 did not fail.
That's weird.

Further investigation is required for this bug.

Comment by Hideyuki Tai [ 09/Oct/14 ]

I've also analyzed the failure of build #175 of the csit-karaf-vtn-all job.

https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/175/

In build #175, exactly the same thing happened.

That is, TopologyManager received the notification of the edge addition (s3-eth2 <-> s1-eth2) before the notifications of the Node and NodeConnector additions (s3, s3-eth2, and s1-eth2), so TopologyManager ignored the edge.

Comment by Hideyuki Tai [ 10/Oct/14 ]

https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/174/

In build #174, the following eight tests failed.

1. Delete a vtn Tenant1.
2. Add a vtn Tenant1.
3. Add a vBridge vBridge1.
4. Add a interface If1.
5. Add a interface if2.
6. Add a vBridge vBridge2.
7. Add a interface If3.
8. Add a interface if4.

The last seven tests assume that the first test (Delete a vtn Tenant1) succeeds and cleans up all VTN configuration.
Consequently, if the first test fails to delete the vtn, the last seven tests inevitably fail.

I have therefore focused on analyzing why the deletion of "Tenant1" failed.

https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/174/robot/Karaf-Vtn/VTN/Vtn%20Manager/Delete%20a%20vtn%20Tenant1/

The test client output the following error message when the deletion of "Tenant1" failed.

ConnectionError: [Errno 104] Connection reset by peer

I think it means that the controller sent a RST packet to the test client.
However, I don't know why this happens.
I could not find any message related to this issue in the controller's log file.
https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/174/artifact/karaf.log
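
For reference, "[Errno 104] Connection reset by peer" is what a client sees when the remote side aborts a TCP connection with a RST. The following minimal sketch reproduces that condition on the loopback interface using only the Python standard library. It is purely illustrative: the function name `provoke_reset` is hypothetical, and the sketch says nothing about which controller bundle caused the reset in the CSIT runs.

```python
import errno
import socket
import struct


def provoke_reset():
    """Open a loopback TCP connection, abort it from the server side,
    and return the errno the client observes (None if no reset is seen)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()

    # SO_LINGER enabled with a zero timeout makes close() abort the
    # connection with a RST instead of the normal FIN handshake.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    conn.close()

    try:
        client.recv(1)
        return None
    except ConnectionResetError as exc:
        return exc.errno
    finally:
        client.close()
        listener.close()
```

On Linux, `errno.ECONNRESET` has the value 104, which matches the number in the error message reported by the test client.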

Build #172 also failed for exactly the same reason.
https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/172/
The test client failed to delete a vtn named "Tenant1", and it output the following message.
ConnectionError: [Errno 104] Connection reset by peer
https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-all/172/robot/Karaf-Vtn/VTN%20OF13/Vtn%20Manager/Delete%20a%20vtn%20Tenant1/

In any case, I think the root cause of the failure of build #174 is different from the root cause of the failure of build #173.

Comment by Luis Gomez [ 27/Oct/14 ]

Hi Hideyuki,

After trying on my local device, I do not really see any ping failure, but I do see the tenant deletion problem from time to time.

ConnectionError: [Errno 104] Connection reset by peer

BR/Luis

Comment by Hideyuki Tai [ 28/Oct/14 ]

Hi Luis,

Thank you for sharing the observation!

We think CONTROLLER-927 caused the ping problem.

> ConnectionError: [Errno 104] Connection reset by peer

I'm not sure why this happened.
I think it means that the controller sent a RST packet to the test client.
Because VTN Manager never sends RST packets, I suspect something is wrong in another bundle.

Comment by Ed Warnicke [ 12/Nov/14 ]

Is this a TCP RST being sent to a REST client?

Comment by Hideyuki Tai [ 19/Nov/14 ]

(In reply to Ed Warnicke from comment #6)
> Is this a TCP RST being sent to a REST client?

I think so.
I'm not sure which bundle makes this happen.

Comment by Hideyuki Tai [ 01/Dec/14 ]

I've submitted a patch for this issue to controller.git.
https://git.opendaylight.org/gerrit/#/c/12876/

Comment by Kuldip [ 19/Dec/14 ]

After starting Karaf, I see it becoming very slow. The main features installed are: odl-nsf-all, odl-openflowplugin-all, odl-adsal-compatibility-all, and odl-vtn-manager-all.

Is there any conflict among these?

The features installed in Karaf include:
Name | Version | Installed | Repository | Description
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
standard | 3.0.1 | x | standard-3.0.1 | Karaf standard feature
config | 3.0.1 | x | standard-3.0.1 | Provide OSGi ConfigAdmin support
region | 3.0.1 | x | standard-3.0.1 | Provide Region Support
package | 3.0.1 | x | standard-3.0.1 | Package commands and mbeans
http | 3.0.1 | x | standard-3.0.1 | Implementation of the OSGI HTTP Service
war | 3.0.1 | x | standard-3.0.1 | Turn Karaf as a full WebContainer
kar | 3.0.1 | x | standard-3.0.1 | Provide KAR (KARaf archive) support
ssh | 3.0.1 | x | standard-3.0.1 | Provide a SSHd server on Karaf
management | 3.0.1 | x | standard-3.0.1 | Provide a JMX MBeanServer and a set of MBeans in K
odl-ovsdb-all | 1.0.1-Helium-SR1 | x | ovsdb-1.0.1-Helium-SR1 | OpenDaylight :: OVSDB :: all
odl-ovsdb-library | 1.0.1-Helium-SR1 | x | ovsdb-1.0.1-Helium-SR1 | OVSDB :: Library
odl-ovsdb-schema-openvswitch | 1.0.1-Helium-SR1 | x | ovsdb-1.0.1-Helium-SR1 | OVSDB :: Schema :: Open_vSwitch
odl-ovsdb-schema-hardwarevtep | 1.0.1-Helium-SR1 | x | ovsdb-1.0.1-Helium-SR1 | OVSDB :: Schema :: hardware_vtep
odl-ovsdb-plugin | 1.0.1-Helium-SR1 | x | ovsdb-1.0.1-Helium-SR1 | OpenDaylight :: OVSDB :: Plugin
odl-ovsdb-northbound | 0.6.1-Helium-SR1 | x | ovsdb-1.0.1-Helium-SR1 | OpenDaylight :: OVSDB :: Northbound
odl-openflowjava-protocol | 0.5.1-Helium-SR1 | x | odl-openflowjava-0.5.1-Helium-SR1 | OpenDaylight :: Openflow Java :: Protocol
odl-vtn-manager-all | 0.2.1-Helium-SR1 | x | vtn-manager-0.2.1-Helium-SR1 | OpenDaylight VTN Manager All
odl-vtn-manager-java-api | 0.2.1-Helium-SR1 | x | vtn-manager-0.2.1-Helium-SR1 | OpenDaylight :: VTN Manager :: Java API
odl-vtn-manager-northbound | 0.2.1-Helium-SR1 | x | vtn-manager-0.2.1-Helium-SR1 | OpenDaylight :: VTN Manager :: Northbound
odl-vtn-manager-neutron | 0.2.1-Helium-SR1 | x | vtn-manager-0.2.1-Helium-SR1 | OpenDaylight :: VTN Manager :: Neutron Interface
odl-restconf | 1.1.1-Helium-SR1 | x | odl-controller-1.1.1-Helium-SR1 | OpenDaylight :: Restconf
odl-restconf-noauth | 1.1.1-Helium-SR1 | x | odl-controller-1.1.1-Helium-SR1 | OpenDaylight :: Restconf
odl-mdsal-apidocs | 1.1.1-Helium-SR1 | x | odl-controller-1.1.1-Helium-SR1 | OpenDaylight :: MDSAL :: APIDOCS
odl-openflowplugin-all | 0.0.4-Helium-SR1 | x | openflowplugin-0.0.4-Helium-SR1 | OpenDaylight :: Openflow Plugin :: All
odl-openflowplugin-southbound | 0.0.4-Helium-SR1 | x | openflowplugin-0.0.4-Helium-SR1 | OpenDaylight :: Openflow Plugin :: SouthBound
odl-openflowplugin-flow-services | 0.0.4-Helium-SR1 | x | openflowplugin-0.0.4-Helium-SR1 | OpenDaylight :: Openflow Plugin :: Flow Services
odl-openflowplugin-flow-services-rest | 0.0.4-Helium-SR1 | x | openflowplugin-0.0.4-Helium-SR1 | OpenDaylight :: Openflow Plugin :: Flow Services :
odl-openflowplugin-flow-services-ui | 0.0.4-Helium-SR1 | x | openflowplugin-0.0.4-Helium-SR1 | OpenDaylight :: Openflow Plugin :: Flow Services :
odl-mdsal-common | 1.1.1-Helium-SR1 | x | odl-config-0.2.6-Helium-SR1 | OpenDaylight :: Config :: All
odl-config-api | 0.2.6-Helium-SR1 | x | odl-config-0.2.6-Helium-SR1 | OpenDaylight :: Config :: API
odl-config-netty-config-api | 0.2.6-Helium-SR1 | x | odl-config-0.2.6-Helium-SR1 | OpenDaylight :: Config :: Netty Config API
odl-config-core | 0.2.6-Helium-SR1 | x | odl-config-0.2.6-Helium-SR1 | OpenDaylight :: Config :: Core
odl-config-manager | 0.2.6-Helium-SR1 | x | odl-config-0.2.6-Helium-SR1 | OpenDaylight :: Config :: Manager
odl-protocol-framework | 0.5.1-Helium-SR1 | x | odl-protocol-framework-0.5.1-Helium-SR1 | OpenDaylight :: Protocol Framework
odl-mdsal-broker | 1.1.1-Helium-SR1 | x | odl-mdsal-1.1.1-Helium-SR1 | OpenDaylight :: MDSAL :: Broker
odl-mdsal-xsql | 1.1.1-Helium-SR1 | x | odl-mdsal-1.1.1-Helium-SR1 |
odl-yangtools-models | 0.6.3-Helium-SR1 | x | odl-yangtools-0.6.3-Helium-SR1 | OpenDaylight :: Yangtools :: Models
odl-yangtools-data-binding | 0.6.3-Helium-SR1 | x | odl-yangtools-0.6.3-Helium-SR1 | OpenDaylight :: Yangtools :: Data Binding
odl-yangtools-binding | 0.6.3-Helium-SR1 | x | odl-yangtools-0.6.3-Helium-SR1 | OpenDaylight :: Yangtools :: Binding
odl-yangtools-common | 0.6.3-Helium-SR1 | x | odl-yangtools-0.6.3-Helium-SR1 | OpenDaylight :: Yangtools :: Common
odl-yangtools-binding-generator | 0.6.3-Helium-SR1 | x | odl-yangtools-0.6.3-Helium-SR1 | OpenDaylight :: Yangtools :: Binding Generator
odl-flow-model | 1.1.1-Helium-SR1 | x | odl-flow-1.1.1-Helium-SR1 | OpenDaylight :: Flow :: Model
odl-flow-services | 1.1.1-Helium-SR1 | x | odl-flow-1.1.1-Helium-SR1 | OpenDaylight :: Flow :: Services
odl-dlux-core | 0.1.1-Helium-SR1 | x | odl-dlux-0.1.1-Helium-SR1 |
odl-platformmanager | 0.0.1-SNAPSHOT | x | platformmanager-0.0.1-SNAPSHOT | OpenDaylight :: OneController :: Platform Manager
odl-netconf-api | 0.2.6-Helium-SR1 | x | odl-netconf-0.2.6-Helium-SR1 | OpenDaylight :: Netconf :: API
odl-netconf-mapping-api | 0.2.6-Helium-SR1 | x | odl-netconf-0.2.6-Helium-SR1 | OpenDaylight :: Netconf :: Mapping API
odl-netconf-util | 0.2.6-Helium-SR1 | x | odl-netconf-0.2.6-Helium-SR1 |
odl-netconf-impl | 0.2.6-Helium-SR1 | x | odl-netconf-0.2.6-Helium-SR1 | OpenDaylight :: Netconf :: Impl
odl-config-netconf-connector | 0.2.6-Helium-SR1 | x | odl-netconf-0.2.6-Helium-SR1 | OpenDaylight :: Netconf :: Connector
odl-netconf-netty-util | 0.2.6-Helium-SR1 | x | odl-netconf-0.2.6-Helium-SR1 | OpenDaylight :: Netconf :: Netty Util
odl-netconf-monitoring | 0.2.6-Helium-SR1 | x | odl-netconf-0.2.6-Helium-SR1 | OpenDaylight :: Netconf :: Monitoring
transaction | 1.0.1 | x | enterprise-3.0.1 | OSGi Transaction Manager
odl-aaa-authn | 0.1.1-Helium-SR1 | x | odl-aaa-0.1.1-Helium-SR1 | OpenDaylight :: AAA :: Authentication
pax-jetty | 8.1.14.v20131031 | x | org.ops4j.pax.web-3.1.0 | Provide Jetty engine support
pax-http | 3.1.0 | x | org.ops4j.pax.web-3.1.0 | Implementation of the OSGI HTTP Service
pax-http-whiteboard | 3.1.0 | x | org.ops4j.pax.web-3.1.0 | Provide HTTP Whiteboard pattern support
pax-war | 3.1.0 | x | org.ops4j.pax.web-3.1.0 | Provide support of a full WebContainer
odl-config-netty | 0.2.6-Helium-SR1 | x | odl-config-persister-0.2.6-Helium-SR1 | OpenDaylight :: Config-Netty
odl-config-persister | 0.2.6-Helium-SR1 | x | odl-config-persister-0.2.6-Helium-SR1 | OpenDaylight :: Config Persister
odl-config-startup | 0.2.6-Helium-SR1 | x | odl-config-persister-0.2.6-Helium-SR1 | OpenDaylight :: Config Persister:: Config Startup
odl-adsal-all | 0.8.2-Helium-SR1 | x | adsal-0.8.2-Helium-SR1 | OpenDaylight AD-SAL All Features
odl-adsal-core | 0.8.2-Helium-SR1 | x | adsal-0.8.2-Helium-SR1 | OpenDaylight :: AD-SAL :: Core
odl-adsal-networkconfiguration | 0.0.4-Helium-SR1 | x | adsal-0.8.2-Helium-SR1 | OpenDaylight :: AD-SAL :: Network Configuration
odl-adsal-connection | 0.1.3-Helium-SR1 | x | adsal-0.8.2-Helium-SR1 | OpenDaylight :: AD-SAL :: Connection
odl-adsal-clustering | 0.5.2-Helium-SR1 | x | adsal-0.8.2-Helium-SR1 | OpenDaylight :: AD-SAL :: Clustering
odl-adsal-configuration | 0.4.4-Helium-SR1 | x | adsal-0.8.2-Helium-SR1 | OpenDaylight :: AD-SAL :: Configuration
odl-adsal-thirdparty | 0.8.2-Helium-SR1 | x | adsal-0.8.2-Helium-SR1 | OpenDaylight :: AD-SAL :: Third-Party Depenencies
odl-base-all | 1.4.3-Helium-SR1 | x | odl-base-1.4.3-Helium-SR1 | OpenDaylight Controller
odl-base-dummy-console | 1.1.1-Helium-SR1 | x | odl-base-1.4.3-Helium-SR1 | Temporary Dummy Console
odl-base-felix-dm | 3.1.0 | x | odl-base-1.4.3-Helium-SR1 | Felix Dependency Manager
odl-base-aries-spi-fly | 1.0.0 | x | odl-base-1.4.3-Helium-SR1 | Aries SPI Fly
odl-base-netty | 4.0.23.Final | x | odl-base-1.4.3-Helium-SR1 |
odl-base-jersey | 1.17 | x | odl-base-1.4.3-Helium-SR1 | Jersey
odl-base-jackson | 2.3.2 | x | odl-base-1.4.3-Helium-SR1 | Jackson JAX-RS
odl-base-slf4j | 1.7.2 | x | odl-base-1.4.3-Helium-SR1 | SLF4J Logging
odl-base-apache-commons | 1.4.3-Helium-SR1 | x | odl-base-1.4.3-Helium-SR1 | Apache Commons Libraries
odl-base-eclipselink-persistence | 2.0.4.v201112161009 | x | odl-base-1.4.3-Helium-SR1 | EclipseLink Persistence API
odl-base-gemini-web | 2.2.0.RELEASE | x | odl-base-1.4.3-Helium-SR1 | Gemini Web
odl-base-tomcat | 7.0.53 | x | odl-base-1.4.3-Helium-SR1 | OpenDaylight Tomcat
odl-base-spring | 3.1.3.RELEASE | x | odl-base-1.4.3-Helium-SR1 | Opendaylight Spring Support
odl-base-spring-web | 3.1.3.RELEASE | x | odl-base-1.4.3-Helium-SR1 | OpenDaylight Spring Web
odl-base-spring-security | 3.1.3.RELEASE | x | odl-base-1.4.3-Helium-SR1 | OpenDaylight Spring Security
odl-nsf-all | 0.4.3-Helium-SR1 | x | nsf-0.4.3-Helium-SR1 | OpenDaylight :: NSF :: All Network Service Functio
odl-nsf-managers | 0.4.3-Helium-SR1 | x | nsf-0.4.3-Helium-SR1 | OpenDaylight :: AD-SAL :: Network Service Function
odl-adsal-northbound | 0.4.3-Helium-SR1 | x | nsf-0.4.3-Helium-SR1 | OpenDaylight :: AD-SAL :: Northbound APIs

Comment by Hideyuki Tai [ 06/Jan/15 ]

Patches to the Topology Manager for this bug have been merged into the master branch and the stable/helium branch of controller.git.

[master]
https://git.opendaylight.org/gerrit/#/c/12876/

[stable/helium]
https://git.opendaylight.org/gerrit/#/c/13920/

Comment by Hideyuki Tai [ 30/Jun/15 ]

Recently, all CSIT jobs for the VTN project have been working well.

https://jenkins.opendaylight.org/releng/view/vtn/job/vtn-csit-1node-cds-manager-all-master/
https://jenkins.opendaylight.org/releng/view/vtn/job/vtn-csit-1node-cds-manager-only-master/
https://jenkins.opendaylight.org/releng/view/vtn/job/vtn-csit-1node-cds-manager-all-stable-lithium/
https://jenkins.opendaylight.org/releng/view/vtn/job/vtn-csit-1node-cds-manager-only-stable-lithium/

I think the following patches also helped us.

https://git.opendaylight.org/gerrit/#/c/22650/ (stable/lithium)
https://git.opendaylight.org/gerrit/#/c/22254/ (master)

Generated at Wed Feb 07 20:47:51 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.