[NETVIRT-609] very small load brings odl/ovs system in cycle reconnect mode Created: 07/Apr/17  Updated: 30/Oct/17  Resolved: 21/Apr/17

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Boron
Fix Version/s: None

Type: Bug
Reporter: Nikolas Hermanns Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: JPEG File small load.JPG    
Issue Links:
Duplicate
duplicates NETVIRT-436 Enhancement: Allow configuration of i... Resolved
External issue ID: 8186

 Description   

I have only a few VMs started in my deployment:

[root@overcloud-controller-0 ~]# nova list
+--------------------------------------+-------------------+--------+------------+-------------+--------------------------------------------------+
| ID                                   | Name              | Status | Task State | Power State | Networks                                         |
+--------------------------------------+-------------------+--------+------------+-------------+--------------------------------------------------+
| ebac75a9-2471-4429-aa3b-16d42980d497 | sdnvpn-3-2-quagga | ACTIVE |            | Running     | sdnvpn-3-2-quagga-net=10.10.11.5, 192.168.37.202 |
| 348ff4e8-85da-478a-8cc5-bd04fda2f92d | sdnvpn-4-1        | ACTIVE |            | Running     | sdnvpn-4-1-net=10.10.10.7                        |
| e1adbebb-0a35-4ba9-bdc4-78d28dea5d9f | sdnvpn-4-2        | ACTIVE |            | Running     | sdnvpn-4-1-net=10.10.10.12                       |
| 847734a2-af60-4852-b151-ab3c2e30f8ed | sdnvpn-4-3        | ACTIVE |            | Running     | sdnvpn-4-1-net=10.10.10.8                        |
| ee728f1a-ddb8-4ea9-9e6e-99164c064f92 | sdnvpn-4-4        | ACTIVE |            | Running     | sdnvpn-4-2-net=10.10.11.3                        |
| 1c03e896-340f-4626-98d5-265854ddb668 | sdnvpn-4-5        | ACTIVE |            | Running     | sdnvpn-4-2-net=10.10.11.11                       |
| 3593e96e-3fe6-45e1-8715-7b7ba239b45c | sdnvpn-8-1        | ACTIVE |            | Running     | sdnvpn-8-1=10.10.10.11, 192.168.37.203           |
| bbd4f29b-0b18-459b-957b-b95f0ef73859 | sdnvpn-8-2        | ACTIVE |            | Running     | sdnvpn-8-2=10.10.20.5                            |
+--------------------------------------+-------------------+--------+------------+-------------+--------------------------------------------------+
[root@overcloud-controller-0 ~]# neutron port-list
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                             |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| 04e9f21e-0336-4773-9020-e9642149201e |      | fa:16:3e:32:99:a4 | {"subnet_id": "850a01b8-4d98-4638-8e49-4fff4471c303", "ip_address": "192.168.37.204"} |
| 0680b874-d259-4250-a983-d88f779fcfd7 |      | fa:16:3e:eb:d9:ff | {"subnet_id": "35d017d1-a098-4c29-9bcd-8bd9274e493f", "ip_address": "10.10.10.2"}     |
| 13de230f-5e18-4936-8c61-832f088174c8 |      | fa:16:3e:d1:34:6d | {"subnet_id": "850a01b8-4d98-4638-8e49-4fff4471c303", "ip_address": "192.168.37.215"} |
| 2329e32a-888e-4a2c-adca-6fddcbffc179 |      | fa:16:3e:16:25:be | {"subnet_id": "35d017d1-a098-4c29-9bcd-8bd9274e493f", "ip_address": "10.10.10.1"}     |
| 246ffaad-3f1f-48cd-8a15-c34c848ec2ea |      | fa:16:3e:a0:92:e5 | {"subnet_id": "430a2296-c009-4b13-9984-2f532913acc4", "ip_address": "10.10.10.1"}     |
| 36c5d510-a339-4e35-a42e-8a295c9858e8 |      | fa:16:3e:7e:77:8b | {"subnet_id": "850a01b8-4d98-4638-8e49-4fff4471c303", "ip_address": "192.168.37.208"} |
| 3acb8e14-c3ff-4925-83be-fdfecbf0dc65 |      | fa:16:3e:f2:99:c0 | {"subnet_id": "cd3bb114-9f4e-4936-9eff-8989b01427e5", "ip_address": "10.10.11.2"}     |
| 3d20beb3-9251-46fa-9cdd-ad8b8ef2a2bf |      | fa:16:3e:ae:17:5f | {"subnet_id": "850a01b8-4d98-4638-8e49-4fff4471c303", "ip_address": "192.168.37.202"} |
| 4b9e6f37-871a-4717-983d-b87be9b38d67 |      | fa:16:3e:0a:d0:8a | {"subnet_id": "7c6b5f56-71fb-4728-bb95-5020b33b55c5", "ip_address": "10.10.20.5"}     |
| 5e92c202-d215-49b2-b9d7-1dd516d8e9fa |      | fa:16:3e:eb:02:87 | {"subnet_id": "58399558-1fd9-43ba-a214-6236c0113194", "ip_address": "10.10.10.2"}     |
| 60549335-ff8c-4eaa-9c9d-f845c261b4ff |      | fa:16:3e:75:cb:83 | {"subnet_id": "58399558-1fd9-43ba-a214-6236c0113194", "ip_address": "10.10.10.1"}     |
| 74c879d8-f94c-4df6-85a5-96f096b26990 |      | fa:16:3e:65:58:fc | {"subnet_id": "ed189d60-b81b-4f05-87b3-5f436548345f", "ip_address": "10.10.11.5"}     |
| 75c558c8-b4b3-4c0f-a5a1-2266593b14a5 |      | fa:16:3e:96:84:65 | {"subnet_id": "430a2296-c009-4b13-9984-2f532913acc4", "ip_address": "10.10.10.2"}     |
| 7743129e-e28b-4aac-b0df-49b11aa9e450 |      | fa:16:3e:e0:eb:79 | {"subnet_id": "ed189d60-b81b-4f05-87b3-5f436548345f", "ip_address": "10.10.11.2"}     |
| 79be70a5-afe4-484a-ba2d-dda93c1fbcd4 |      | fa:16:3e:61:1c:90 | {"subnet_id": "430a2296-c009-4b13-9984-2f532913acc4", "ip_address": "10.10.10.7"}     |
| 81484fb1-4265-4d6d-a1c6-2a4aadc4f571 |      | fa:16:3e:37:3d:57 | {"subnet_id": "7c6b5f56-71fb-4728-bb95-5020b33b55c5", "ip_address": "10.10.20.2"}     |
| 99dbe740-a456-471b-ad85-ef708b9fac03 |      | fa:16:3e:2b:14:38 | {"subnet_id": "850a01b8-4d98-4638-8e49-4fff4471c303", "ip_address": "192.168.37.203"} |
| 9a72f914-2dac-4c0e-869e-d1cf84ab350b |      | fa:16:3e:02:59:42 | {"subnet_id": "ed189d60-b81b-4f05-87b3-5f436548345f", "ip_address": "10.10.11.1"}     |
| a89e432a-c485-4549-bdf0-9802a04efbca |      | fa:16:3e:26:a8:6d | {"subnet_id": "cd3bb114-9f4e-4936-9eff-8989b01427e5", "ip_address": "10.10.11.11"}    |
| abc364c2-6560-4657-8402-ad890a049cb7 |      | fa:16:3e:c5:ff:13 | {"subnet_id": "430a2296-c009-4b13-9984-2f532913acc4", "ip_address": "10.10.10.8"}     |
| b99dc5ff-0506-49fb-a54a-67606a6a438b |      | fa:16:3e:fd:f6:03 | {"subnet_id": "cd3bb114-9f4e-4936-9eff-8989b01427e5", "ip_address": "10.10.11.3"}     |
| be694def-9bff-461a-aea9-dfcb15b74497 |      | fa:16:3e:fa:33:09 | {"subnet_id": "850a01b8-4d98-4638-8e49-4fff4471c303", "ip_address": "192.168.37.211"} |
| c9aadf0f-8608-4f39-96e1-af7b13178960 |      | fa:16:3e:01:47:29 | {"subnet_id": "850a01b8-4d98-4638-8e49-4fff4471c303", "ip_address": "192.168.37.201"} |
| e82cda88-ce29-4f0c-a519-80fc1686b6ca |      | fa:16:3e:8c:61:6c | {"subnet_id": "35d017d1-a098-4c29-9bcd-8bd9274e493f", "ip_address": "10.10.10.11"}    |
| e9484795-4a9c-4267-8f8f-32baf275406c |      | fa:16:3e:37:18:96 | {"subnet_id": "430a2296-c009-4b13-9984-2f532913acc4", "ip_address": "10.10.10.12"}    |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+

As you can see in the attached JPG, OVS keeps connecting to and disconnecting from ODL. ODL consumes close to 100% CPU:

1 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||98.1%] Tasks: 135, 1192 thr; 23 running
2 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||95.5%] Load average: 12.45 13.26 9.64
3 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||96.8%] Uptime: 02:11:36
4 [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||97.4%]
Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||7.35G/9.61G]
Swp[ 0K/0K]

PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
28511 odl 20 0 6177M 2597M 21064 S 362. 26.4 1h16:33 /usr/bin/java -Djava.security.properties=/opt/opendaylight/etc/odl.java.security -server -Xms128M -Xmx2048m
28679 odl 20 0 6177M 2597M 21064 R 92.6 26.4 13:36.81 /usr/bin/java -Djava.security.properties=/opt/opendaylight/etc/odl.java.security -server -Xms128M -Xmx2048m
28678 odl 20 0 6177M 2597M 21064 R 88.1 26.4 13:33.38 /usr/bin/java -Djava.security.properties=/opt/opendaylight/etc/odl.java.security -server -Xms128M -Xmx2048m
28680 odl 20 0 6177M 2597M 21064 R 84.9 26.4 13:37.51 /usr/bin/java -Djava.security.properties=/opt/opendaylight/etc/odl.java.security -server -Xms128M -Xmx2048m
28677 odl 20 0 6177M 2597M 21064 R 82.3 26.4 13:36.90 /usr/bin/java -Djava.security.properties=/opt/opendaylight/etc/odl.java.security -server -Xms128M -Xmx2048m

This is insane. 10 GB of RAM and 4 cores should be enough to run with 3 OVS instances.



 Comments   
Comment by Nikolas Hermanns [ 07/Apr/17 ]

Attachment small load.JPG has been added with description: small load

Comment by Nikolas Hermanns [ 07/Apr/17 ]

OK, it is not insane; I see that the setup is too small. However, OVS should still not keep connecting and disconnecting. I guess it has something to do with ODL being overloaded and heartbeats not being transmitted. I tried fetching logs, but ODL was too overloaded.

Comment by Vishal Thapar [ 21/Apr/17 ]

This is a known issue: at some point, one will always run into inactivity probe timeouts for a given set of resources. That is why it is recommended to tweak the inactivity_probe on OVS to a non-default value based on the setup. Optimizations to reduce resource usage are always ongoing, but there will always be some scale at which one can run into this issue.

Option added as part of fix for https://bugs.opendaylight.org/show_bug.cgi?id=7591
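
For anyone hitting the same reconnect cycle, the inactivity_probe tweak mentioned above could look roughly like the sketch below. It only builds the ovs-vsctl command lines and prints them by default. The bridge name br-int, the value of 30000 ms, and the helper names probe_commands and apply are assumptions for illustration, not values taken from this issue; the right probe interval depends on the setup, and the Manager UUID has to be looked up on the node first (for example with "ovs-vsctl list Manager").

```python
import shlex
import subprocess


def probe_commands(bridge, probe_ms, manager_uuid=None):
    """Build ovs-vsctl argv lists that set inactivity_probe (in milliseconds).

    bridge       -- bridge whose Controller record is updated (assumed name)
    probe_ms     -- probe interval in milliseconds (assumed value, tune per setup)
    manager_uuid -- optional UUID of a Manager record to update as well
    """
    if probe_ms < 0:
        raise ValueError("probe_ms must be non-negative")
    setting = f"inactivity_probe={probe_ms}"
    # A Controller record can be addressed by the name of its bridge.
    cmds = [["ovs-vsctl", "set", "Controller", bridge, setting]]
    if manager_uuid is not None:
        # The OVSDB Manager connection has its own inactivity_probe column.
        cmds.append(["ovs-vsctl", "set", "Manager", manager_uuid, setting])
    return cmds


def apply(cmds, dry_run=True):
    """Print the commands (dry run) or execute them on the OVS node."""
    for argv in cmds:
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical values: bridge "br-int", 30 second probe interval.
    apply(probe_commands("br-int", 30000))
```

Running it as-is only prints the command; passing dry_run=False on a node with Open vSwitch installed would apply the setting.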
