Troubleshooting Controller CSIT (NETVIRT-1315)

[NETVIRT-1316] review existing tests Created: 19/Jun/18  Updated: 03/Oct/18  Resolved: 03/Oct/18

Status: Resolved
Project: netvirt
Component/s: None
Affects Version/s: None
Fix Version/s: Fluorine

Type: Sub-task Priority: Medium
Reporter: Sam Hague Assignee: Victor Pickard
Resolution: Done Votes: 0
Labels: csit:3node
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to NETVIRT-1007 Review tests to verify if they are valid Resolved

 Description   

The controller csit tests should be validated to be relevant and applicable.



 Comments   
Comment by Jamo Luhrsen [ 26/Jun/18 ]

This job is probably the best one to start digging into next:
https://jenkins.opendaylight.org/releng/user/jluhrsen/my-views/view/controller%203node/job/controller-csit-3node-clustering-all-oxygen

Comment by Victor Pickard [ 28/Jun/18 ]

Yes, agreed. If you look into this job, you will see that these tests:

  1. Start ODL with Tell Based False or True
  2. Run some tests
  3. Restart ODL with Tell Based True or False
  4. Run some more tests

I think the priority should be issues where ODL is started/restarted with Tell Based False (meaning Ask Based), since my understanding is that Ask Based is what most (if not all) downstream consumers currently use. So, I'm going to focus on those failures.

Comment by Victor Pickard [ 28/Jun/18 ]

Controller CSIT 3node clustering oxygen
Failing test cases when Tell Based is False (Ask Based)

Job 310 - 0 failures
Job 309 - 0 failures
Job 308 - 0 failures
Job 307 - 2 failures - Partition and Heal
Job 306 - 0 failures
Job 305 - 1 failure - Chasing the Leader
Job 304 - no robot plugin link
Job 303 - 4 failures - Global RPC Kill
Job 302 - 0 failures
Job 301 - 0 failures
Job 300 - 0 failures
Job 299 - 0 failures
Job 298 - 0 failures
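
As a quick sanity check on the numbers above, here is a minimal sketch that tallies the per-job results. The counts are copied from the list; the dictionary layout and the summarize helper are hypothetical and only for illustration, not part of any CSIT tooling.

```python
# Failure counts per job, copied from the list above.
# None marks job 304, which had no robot plugin link, so no count is known.
failures_by_job = {
    310: 0, 309: 0, 308: 0, 307: 2, 306: 0, 305: 1, 304: None,
    303: 4, 302: 0, 301: 0, 300: 0, 299: 0, 298: 0,
}


def summarize(results):
    """Summarize failure counts, skipping runs with no recorded result."""
    known = {job: count for job, count in results.items() if count is not None}
    failing = sorted(job for job, count in known.items() if count > 0)
    return {
        "runs_with_results": len(known),
        "failing_jobs": failing,
        "total_failures": sum(known.values()),
    }


summary = summarize(failures_by_job)
print(summary)
```

Under these assumptions, only three of the runs with results had any Ask Based failures, which is why a job limited to those three suites seems worthwhile.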

 

I've also got a job queued up that I will run with just the tests above. They run just like the normal controller CSIT, i.e., start/restart ODL, run tests, start/restart ODL, run tests, and so on.

This should give us a quicker way to run the tests and see results (and to add debug logs).

Run CSIT with only the above tests, as follows:

Restart ODL with Tell Based False
Partition and Heal
Restart ODL with Tell Based False
Chasing the Leader
Restart ODL with Tell Based False
Global RPC Kill

The corresponding suite files, in the same order:

controller/dom_data_broker/restart_odl_with_tell_based_false.robot
controller/cluster_singleton/partition_and_heal.robot
controller/dom_data_broker/restart_odl_with_tell_based_false.robot
controller/cluster_singleton/chasing_the_leader.robot
controller/dom_data_broker/restart_odl_with_tell_based_false.robot
controller/singleton_service/global_rpc_kill.robot
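
The ordering above (a restart suite before each failing suite) could be generated rather than typed by hand. This is only a sketch: the paths are copied exactly as written in this comment, and build_plan is a hypothetical helper, not something that exists in releng/builder or int/test.

```python
# Suite paths exactly as listed in the comment above.
RESTART_SUITE = "controller/dom_data_broker/restart_odl_with_tell_based_false.robot"
TEST_SUITES = [
    "controller/cluster_singleton/partition_and_heal.robot",
    "controller/cluster_singleton/chasing_the_leader.robot",
    "controller/singleton_service/global_rpc_kill.robot",
]


def build_plan(restart_suite, test_suites):
    """Return the suites in run order, with a restart before each test suite."""
    plan = []
    for suite in test_suites:
        plan.append(restart_suite)  # restart ODL with Tell Based False first
        plan.append(suite)          # then run the suite under investigation
    return plan


plan = build_plan(RESTART_SUITE, TEST_SUITES)
print("\n".join(plan))
```

Keeping the restart suite in one variable should make it easy to try the same three suites against a different restart suite later.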

 

Comment by Jamo Luhrsen [ 28/Jun/18 ]

vpickard, don't forget that the sandbox gets purged at some point every Saturday. All these jobs and logs
are purged. It's best to keep your job configs in sync with commits in your local releng/builder and
int/test repos (or in Gerrit patches), so it's easy to push them back after the purge. Also, if there are some logs
you want to keep to look at later, use the "copy-logs: <job_name>/<job_number>" Gerrit keyword (on any patch).
