[VTN-137] CSIT Failure in Mininet Ping test cases of VTN Manager for Beryllium and Boron Created: 21/Jul/16  Updated: 19/Oct/17

Status: Open
Project: vtn
Component/s: VTN Manager
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Karthik Sivasamy Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 6245

 Description   

In releng, the CSIT Mininet Ping test cases fail intermittently, and a different piece of functionality fails on each occurrence. Analysis of the logs suggests a path-fault issue.

This occurs in both Beryllium and Boron.

The issue occurs only in releng; it has not been reproduced on a local machine.

Below is one scenario in which the Mininet Ping test cases fail due to the path-fault issue.

Test scenario:
==============
CSIT for "VBR IF Flowfilter" with action "vtn-set-vlan-pcp-action"

Topology used
=============
Created topology in mininet for Network with hosts in different vlan.
https://wiki.opendaylight.org/view/OpenDaylight_Virtual_Tenant_Network_(VTN):Scripts:Mininet#Network_with_hosts_in_different_vlan

The CSIT test script performs the following steps for both OF10 and OF13.

1. Create Tenant1
2. Create vBridge1
3. Create interfaces(if1 and if2) in vBridge1
4. Create portmapping for if1 and if2.
5. Add flow condition to match h1, h3 ip address
6. Add VTN flowfilter to set flow action "vtn-set-vlan-pcp-action"
7. Ping h1 and h3
8. Verify ping Success ---> Here test case fails.
9. Read the flow entries in mininet switch s2 to verify whether actions=mod_vlan_pcp is set. --> Because the ping in the previous step failed, no flow entry with vlan-pcp is installed, so this check also fails.
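
The check in step 9 can be sketched roughly as follows. This is a minimal illustration, not the actual CSIT keyword: it assumes the flow entries of s2 are available as text in the usual `actions=...` dump format, and the function name and the sample text are hypothetical (the sample is not taken from the failing job).

```python
import re


def flows_with_pcp_action(dump_flows_output):
    """Return the dump lines whose action list contains mod_vlan_pcp."""
    matching = []
    for line in dump_flows_output.splitlines():
        # Everything after "actions=" is the comma-separated action list.
        _, sep, actions = line.partition("actions=")
        if sep and re.search(r"\bmod_vlan_pcp:\d+", actions):
            matching.append(line.strip())
    return matching


# Hypothetical sample output, written for illustration only.
SAMPLE = """\
 cookie=0x0, duration=1.2s, table=0, n_packets=3, n_bytes=294, priority=10,in_port=2,dl_vlan=200 actions=mod_vlan_pcp:6,output:1
 cookie=0x0, duration=9.8s, table=0, n_packets=0, n_bytes=0, priority=0 actions=CONTROLLER:65535
"""

if __name__ == "__main__":
    for flow in flows_with_pcp_action(SAMPLE):
        print(flow)
```

When the ping in step 7 fails, no such flow entry exists on the switch, so a check of this kind returns an empty list and the test case fails as described.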

The failure above can be seen in the following releng Jenkins log:
https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-beryllium/505/archives/log.html.gz#s1-s1-s6

I analyzed the karaf.log for this failure, which is available at:

https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-beryllium/505/archives/karaf.log.gz

The log shows that a path fault occurs, which changes the vBridge status to DOWN. This in turn causes the ping between the hosts to fail.

2016-07-21 07:56:54,660 | TRACE | Runner: VTN Main | VTNPassFilter | 177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | vBridge-IF:Tenant1/vBridge1/if1%IN.1: Flow action was applied: VTNSetVlanPcpAction[pcp=6,order=1]
2016-07-21 07:56:54,660 | TRACE | Runner: VTN Main | VTNPassFilter | 177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | vBridge-IF:Tenant1/vBridge1/if1%IN.1: Packet matched the condition: cond_1
2016-07-21 07:56:54,665 | TRACE | TN Flow Thread-0 | FlowAddContext | 177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | Flow entry has been installed: flow=[id=7f5600000000002b-1, pri=10, timeout=(0,0), node=openflow:1, ingress=openflow:1:2, cond={DL_SRC=d6:6a:2d:b5:2e:96,DL_DST=2a:f9:1d:d3:fd:9e,DL_VLAN=200}, actions={OUTPUT(port=openflow:1:1, len=65535)}

2016-07-21 07:56:54,666 | WARN | on-dispatcher-70 | VTenantManager | 177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | vBridge:Tenant1/vBridge1: Path fault: openflow:1 -> openflow:2
2016-07-21 07:56:54,666 | INFO | on-dispatcher-70 | VTenantManager | 177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | vBridge:Tenant1/vBridge1: vBridge status has been changed: old={state=UP, path-faults=0}, new={state=DOWN, path-faults=1}

2016-07-21 07:56:54,667 | TRACE | TN Flow Thread-0 | FlowAddContext | 177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | Flow entry has been installed: flow=[id=7f5600000000002b-0, pri=10, timeout=(300,0), node=openflow:2, ingress=openflow:2:3, cond={DL_SRC=d6:6a:2d:b5:2e:96,DL_DST=2a:f9:1d:d3:fd:9e,DL_VLAN=200}, actions={OUTPUT(port=openflow:2:1, len=65535)}
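
To locate such events in a large karaf.log, a small scan like the one below may help. It is only a sketch: the regular expression is derived from the single WARN line quoted above and may need adjusting for other log layouts, and the function name is hypothetical.

```python
import re

# Pattern modelled on the "Path fault" WARN line quoted in this report.
PATH_FAULT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s*\|\s*WARN\s*\|.*"
    r"vBridge:(?P<bridge>\S+): Path fault: (?P<src>\S+) -> (?P<dst>\S+)"
)


def find_path_faults(lines):
    """Return (timestamp, vBridge path, source node, destination node) tuples."""
    faults = []
    for line in lines:
        match = PATH_FAULT.search(line)
        if match:
            faults.append(
                (match["ts"], match["bridge"], match["src"], match["dst"])
            )
    return faults


# Two lines copied from the excerpt above: one WARN and one unrelated TRACE.
LOG = [
    "2016-07-21 07:56:54,666 | WARN | on-dispatcher-70 | VTenantManager | "
    "177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | "
    "vBridge:Tenant1/vBridge1: Path fault: openflow:1 -> openflow:2",
    "2016-07-21 07:56:54,660 | TRACE | Runner: VTN Main | VTNPassFilter | "
    "177 - org.opendaylight.vtn.manager.implementation - 0.4.3.SNAPSHOT | "
    "vBridge-IF:Tenant1/vBridge1/if1%IN.1: Packet matched the condition: cond_1",
]

if __name__ == "__main__":
    for fault in find_path_faults(LOG):
        print(fault)
```

Comparing the reported timestamps with the flow installation entries shows whether the path fault was raised while the flow for the ping was still being installed, as in the excerpt above.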

 Comments   
Comment by Karthik Sivasamy [ 21/Jul/16 ]

These failures occurred in the jobs below:

1) https://jenkins.opendaylight.org/releng/view/vtn/job/vtn-csit-1node-manager-only-beryllium/505/

2) https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-boron/550/archives/

3) https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-all-beryllium/367/archives/

4) https://logs.opendaylight.org/releng/jenkins092/vtn-csit-3node-manager-all-boron/14/archives/odl1_karaf.log.gz

5) https://logs.opendaylight.org/releng/jenkins092/vtn-csit-3node-manager-only-boron/29/archives/odl1_karaf.log.gz

Comment by Karthik Sivasamy [ 09/Aug/16 ]

Pushed patch https://git.opendaylight.org/gerrit/#/c/42703/.
In this patch, we modified the vlanmap topology, added a topology wait time to allow more time for topology updates, and modified the test cases for the new topology.

Reason for the patch:
====================
We found that the issue was due to a delay in updating the inter-switch link connection. To verify this, we changed the topology to match the one used in the other tests.
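
For illustration only, a delay of this kind can also be absorbed on the test side by polling for a condition instead of sleeping for a fixed time. The sketch below is not what the patch does; `wait_for` and the `links_ready` predicate are hypothetical names, and how link discovery would actually be queried is left open.

```python
import time


def wait_for(predicate, timeout_s=3.0, interval_s=0.2,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical stand-in for "the inter-switch links have been discovered":
    # it reports success on the third poll.
    state = {"polls": 0}

    def links_ready():
        state["polls"] += 1
        return state["polls"] >= 3

    print(wait_for(links_ready, timeout_s=1.0, interval_s=0.05))
```

A test would run the ping only after `wait_for` returns True, and report a topology timeout otherwise, which separates slow link discovery from a genuine flow filter failure.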

Tested the above patch in the sandbox; the issue was not reproduced.

After the patch is merged, we need to check in releng whether the topology change resolves the path-fault issue. If not, we need to investigate what causes the topology issue in vlanmapping.

Comment by Karthik Sivasamy [ 09/Aug/16 ]

Patch merged in Git on Aug 2nd.

After the merge we checked the releng jobs, but the path-fault issue was still reproduced in some of them, mainly in vtn-csit-3node-all-boron.

https://logs.opendaylight.org/releng/jenkins092/vtn-csit-3node-manager-all-boron/35/archives/log.html.gz

To investigate further, we are now enabling TRACE logging for openflowplugin and reducing the topology wait time below the default of 3000 ms.

Patch pushed to analyse the issue:
https://git.opendaylight.org/gerrit/#/c/43210/

Comment by Venkatrangan Govindarajan [ 24/Aug/16 ]

Tests have been excluded temporarily. The problem will be analyzed again after the release.

Comment by Karthik Sivasamy [ 02/Dec/16 ]

Pushed a patch enabling the vlan-pcp test cases in the vbrifflowfilter test file.

Patch:
https://git.opendaylight.org/gerrit/#/c/48911/2

Tested and verified in the sandbox with the jobs below; the issue was not reproduced and all tests pass.

https://jenkins.opendaylight.org/sandbox/job/vtn-csit-1node-manager-only-boron/
https://jenkins.opendaylight.org/sandbox/job/vtn-csit-3node-manager-all-boron/
https://jenkins.opendaylight.org/sandbox/job/vtn-csit-3node-manager-only-boron/

After the patch is merged, we will wait for a few test runs in Jenkins. If the issue is not reproduced, we will close this bug.

Comment by Venkatrangan Govindarajan [ 11/Jan/17 ]

Problem recreated here:
https://logs.opendaylight.org/releng/jenkins092/vtn-csit-3node-manager-all-boron/185/archives/log.html.gz#s1-s2-s7-t13
