[OPNFLWPLUG-1060] instance not able to curl metadata Created: 12/Dec/18  Updated: 19/Mar/19  Resolved: 19/Mar/19

Status: Resolved
Project: OpenFlowPlugin
Component/s: General, openflowplugin-impl
Affects Version/s: None
Fix Version/s: Fluorine-SR2, Neon

Type: Bug Priority: High
Reporter: Jamo Luhrsen Assignee: Somashekhar Javalagi
Resolution: Cannot Reproduce Votes: 0
Labels: apex:gate, csit:failures
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File NETVIRT-1538.karaf.log.gz     File NETVIRT-1538.neutron-server.log     File karaf_busted_metadata_instance_flows.tar.gz    

 Description   

once in aprox 75 jobs, the apex job we are trying to make stable so it can be a voting
gate saw a failure where one openstack instance did not get it's IP address.

Failed Job



 Comments   
Comment by Jamo Luhrsen [ 12/Dec/18 ]

The job failed both the l2 and external suites for the same reason. In the l2 suite, 6 instances are booted and
all but 5 received their ip address. This is not the case where a single compute node is completely bad like
we've seen before with some main flow missing, or ovs not connected.

the karaf.log and neutron server log is attached.

Comment by Vishal Thapar [ 12/Dec/18 ]

This looks like very old dhcp issue where VM couldn't contact metadata service. At the time it was failing to acquire dhcp address and we increased timeout to 3 minutes. Here, it seems to get DHCP IP but something else is missing and not able to contact metadata service [1].

Starting network...
udhcpc (v1.20.1) started
Sending discover...
Sending select for 21.0.0.24...
Lease of 21.0.0.24 obtained, lease time 86400
route: SIOCADDRT: File exists
WARN: failed: route add -net "0.0.0.0/0" gw "21.0.0.1"
cirros-ds 'net' up at 8.07
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 8.27. request failed
failed 2/20: up 20.71. request failed
failed 3/20: up 33.00. request failed
failed 4/20: up 45.14. request failed
failed 5/20: up 57.38. request failed

Route is setup correctly [2]

default via 192.0.2.1 dev br-ex
10.9.9.0/24 dev internet_tap proto kernel scope link src 10.9.9.9
10.10.10.0/24 dev br-datacentre proto kernel scope link src 10.10.10.250
169.254.169.254 via 192.0.2.1 dev br-ex
172.31.0.0/24 dev docker0 proto kernel scope link src 172.31.0.1
192.0.2.0/24 dev br-ex proto kernel scope link src 192.0.2.19
[heat-admin@overcloud-controller-0 ~]$

[1] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/560/jamo-netvirt-csit-1node-0cmb-1ctl-2cmp-apex-queens-gate-snat-conntrack-oxygen/9/robot-plugin/log_full.html#s1-s1-k1-k13-k2-k2-k7-k1

[2] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/560/jamo-netvirt-csit-1node-0cmb-1ctl-2cmp-apex-queens-gate-snat-conntrack-oxygen/9/robot-plugin/log_full.html#s1-s1-k2-k1-k1-k2-k1-k2-k1-k4-k5

Comment by Jamo Luhrsen [ 12/Dec/18 ]

some of the .gz files in the logs I've/Vishal has given will be corrupt. Here is a better link to
logs that should be ok:

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/562/jamo-netvirt-csit-1node-0cmb-1ctl-2cmp-apex-queens-gate-snat-conntrack-oxygen/9/

Comment by Jamo Luhrsen [ 12/Dec/18 ]

Thanks for poking around thapar.

This is a short job (30-40m) so just pay attention to min:sec
timestamps when trying to correlate between the robot html
logs (which get converted to local time) and all the others.
the VMs in the l2 suite started around 47:00

anyway, looking at the meta-data log
you can see that the ip from the broken instance never hit
the metadata service. that ip is 21.0.0.24. you can see the
other ips hitting it.

in the dhcp log you can see that 21.0.0.24 was allocated there.

nothing stands out to me in the neutron server log around
that same time frame though.

Anyway, I didn't find anything more to go on

Comment by Jamo Luhrsen [ 13/Dec/18 ]

here is another example of this bug:

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/564/jamo-netvirt-csit-1node-0cmb-1ctl-2cmp-apex-queens-gate-snat-conntrack-oxygen/115

Comment by Vishal Thapar [ 13/Dec/18 ]

Second job also has exact same issue. Can we log console log for scenario where it does work? May increase logging but will help confirm if the route add error is relevant or not.

Issue is VM not able to reach metadata agent. Most likely a routing/arp issue. Can we capture arp table also along with console log, not sure if possible.

Also found this bug https://jira.opnfv.org/browse/APEX-593 though in this case it goes through but just takes longer. In our case instance-id is never acquired, so suspect some route/arp for metadata is not resolved.

Need someone more familiar with metadata and how networking for it works to figure out what is going wrong.

Comment by Tim Rozet [ 13/Dec/18 ]

After looking through the logs, the metadata agent does not receive the curl from the failed instance. It does however, receive the requests from the instance before and after it on the same network. This looks like an openflow problem, but looking at the flows I don't see an issue. I'm running this in an infinite loop to recreate the same network, subnet, security group, and instance on my machine to see if I can reproduce it.

Comment by Jamo Luhrsen [ 13/Dec/18 ]

yeah, trozet, that's what I noticed and linked to in an earlier comment about the
meta data server log not seeing the request coming in for that failed instance. I also
didn't see anything stand out in the flow dumps.

I can also try to run some extra loops/instances in the csit and let jenkins run
it in a loop. The problem is what debugs to collect if we do hit the failure.

Comment by Jamo Luhrsen [ 13/Dec/18 ]

and another:
https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/567/jamo-netvirt-csit-1node-0cmb-1ctl-2cmp-apex-queens-gate-snat-conntrack-oxygen/123/robot-plugin/log_full.html.gz

Comment by Tim Rozet [ 14/Dec/18 ]

I was able to reproduce the problem after 120 iterations. After metadata failed, I was unable to even ping or ssh into the cirros instance (but it did DHCP successfully). The ping request was making it to the instance but the reply was not making it back, and there was no entry in conntrack. Thanks to aswins for helping me debug the flows, he noticed that the ping request was actually bypassing conntrack. The ping request should end up in table 220, then piped to 239 to create a conntrack session. However this flow was directly sending it to the VM in table 220:

cookie=0x8000007, duration=48738.948s, table=220, n_packets=174, n_bytes=14406, priority=9,reg6=0x20300 actions=output:"tap4386be41-ba"

instead of hitting this flow (due to priority):

cookie=0x6900000, duration=48738.948s, table=220, n_packets=0, n_bytes=0, priority=6,reg6=0x20300 actions=load:0x90020300->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239

 

Therefore when the packets came back from the VM, conntrack was dropping them because there was no session:

#compute: recirc_id(0),in_port(4),eth(src=fa:16:3e:8a:4e:0a,dst=fa:16:3e:58:31:2b),eth_type(0x8100),vlan(vid=1121,pcp=0),encap(eth_type(0x0800),ipv4(src=0.0.0.0/128.0.0.0,dst=0.0.0.0/128.0.0.0,frag=no)), packets:10, bytes:1020, used:0.118s, actions:pop_vlan,5 recirc_id(0),in_port(4),eth(src=fa:16:3e:8a:4e:0a,dst=fa:16:3e:58:31:2b),eth_type(0x8100),vlan(vid=1121,pcp=0),encap(eth_type(0x0806),arp(sip=21.0.0.2,tip=21.0.0.19,op=2/0xff)), packets:0, bytes:0, used:never, actions:pop_vlan,userspace(pid=4294963155,controller(reason=1,dont_send=0,continuation=0,recirc_id=864,rule_cookie=0x822002e,controller_id=0,max_len=65535)),5 recirc_id(0),in_port(5),eth(src=fa:16:3e:58:31:2b,dst=fa:16:3e:8a:4e:0a),eth_type(0x0806),arp(sip=21.0.0.19,tip=21.0.0.2,op=1/0xff,sha=fa:16:3e:58:31:2b), packets:0, bytes:0, used:never, actions:userspace(pid=4286503971,controller(reason=4,dont_send=0,continuation=0,recirc_id=863,rule_cookie=0x822002d,controller_id=0,max_len=65535)),push_vlan(vid=1121,pcp=0),4 recirc_id(0),in_port(5),eth(src=fa:16:3e:58:31:2b),eth_type(0x0800),ipv4(src=21.0.0.19,proto=1,frag=no), packets:10, bytes:980, used:0.118s, actions:ct(zone=5128),recirc(0x35e) recirc_id(0x35e),in_port(5),ct_state(-new-est-rel+inv+trk),ct_mark(0/0x1),eth_type(0x0800),ipv4(frag=no), packets:10, bytes:980, used:0.118s, actions:drop [root@overcloud-novacompute-0 ~]# ovs-dpctl show system@ovs-system: lookups: hit:34958088 missed:721437 lost:0 flows: 21 masks: hit:296264755 total:18 hit/pkt:8.30 port 0: ovs-system (internal) port 1: vxlan_sys_4789 (vxlan: packet_type=ptap) port 2: br-int (internal) port 3: br-ex (internal) port 4: eth0 port 5: tap4386be41-ba

 

I then deleted that flow, and I could then ssh into the cirros machine. Then I tried to curl metadata and it worked. I then needed to figure out if the same flow was causing issues for metadata. From there I created a cron job in cirros to curl for metadata every minute. I could see it hitting the metadata agent. I then added the problem flow back, and metadata curls were not coming into the server anymore. Therefore I believe this flow is the same root cause as the ping issue and the reason for this bug.

In the CI logs I see that the flow that should go from 220 to 239 is missing, and the problem flow exists in that log (two times?):

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/560/jamo-netvirt-csit-1node-0cmb-1ctl-2cmp-apex-queens-gate-snat-conntrack-oxygen/9/robot-plugin/log_full.html#s1-s1-k2-k1-k1-k2-k1-k2-k1-k12-k5

cookie=0x8000007, duration=272.526s, table=220, n_packets=0, n_bytes=0, priority=10,reg6=0xd00,metadata=0x1/0x1 actions=drop cookie=0x8000007, duration=272.522s, table=220, n_packets=0, n_bytes=0, priority=10,reg6=0xc00,metadata=0x1/0x1 actions=drop cookie=0x8000007, duration=721.965s, table=220, n_packets=118, n_bytes=11146, priority=9,reg6=0x400 actions=output:tuna642e00a5c1 cookie=0x8000007, duration=721.965s, table=220, n_packets=10, n_bytes=448, priority=9,reg6=0x300 actions=output:tun3884a6aa6dc cookie=0x8000007, duration=272.527s, table=220, n_packets=170, n_bytes=14940, priority=9,reg6=0xd00 actions=push_vlan:0x8100,set_field:5217->vlan_vid,output:"br-datacentre-"

cookie=0x8000007, duration=272.522s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0xc00 actions=output:"br-datacentre-" cookie=0x6900000, duration=272.470s, table=220, n_packets=107, n_bytes=11218, priority=6,reg6=0xb00 actions=load:0x90000b00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239 cookie=0x8000007, duration=272.470s, table=220, n_packets=101, n_bytes=10494, priority=9,reg6=0x90000b00 actions=output:"tapd419615e-50"

 

 

*cookie=0x8000007, duration=267.979s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0x90000e00 actions=output:"tap8b015b1e-82"

cookie=0x8000007, duration=267.979s, table=220, n_packets=170, n_bytes=13918, priority=9,reg6=0xe00 actions=output:"tap8b015b1e-82"*

 

cookie=0x6900000, duration=248.137s, table=220, n_packets=107, n_bytes=10856, priority=6,reg6=0x1200 actions=load:0x90001200->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239

cookie=0x8000007, duration=248.137s, table=220, n_packets=101, n_bytes=10436, priority=9,reg6=0x90001200 actions=output:"tap2559ce3e-48"

cookie=0x6900000, duration=721.965s, table=239, n_packets=154, n_bytes=19010, priority=100,ip actions=ct_clear,goto_table:240 cookie=0x6900000, duration=721.965s, table=239, n_packets=16, n_bytes=1168, priority=100,ipv6 actions=ct_clear,goto_table:240 cookie=0x6900000, duration=721.965s, table=239, n_packets=44, n_bytes=1896, priority=0 actions=goto_table:240

 

I'm not sure why the there is no flow to match on reg6, 0xe00 or 0x90000e00 and goto table 239. Could be timing with when the flow dumps are collected? Either way in my reproduction environment there is a bug. We can fix that and see if the problem goes away in CI. Will attach karaf log. Flow and other info here:

https://gist.githubusercontent.com/trozet/8a55951c09a7f207ebaf4a111d9d6697/raw/c15f20e405d464d5a13ae8393dea6e5951d11b5d/broken_metadata_instance

 

 

Comment by Aswin Suryanarayanan [ 14/Dec/18 ]

*cookie=0x8000007, duration=267.979s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0x90000e00 actions=output:"tap8b015b1e-82"

cookie=0x8000007, duration=267.979s, table=220, n_packets=170, n_bytes=13918, priority=9,reg6=0xe00 actions=output:"tap8b015b1e-82"*

Here the second flow should not have been there. Instead of that there should be a flow that sends packet to table=239 should be there matching reg6=0xe00.

The second flow where it directly sends the packet to output port from table 220 is valid for ports for which SG is disabled. Usually the neutron port create happens in this order.(this is when I debugged an older release hope this is still the case.)

1)Port with security group disabled is created. (the second flow here will be programmed.)
2)Port update follows immediately which enables the port security.( The flow programmed in 1 will be replaced with a flow to send to table 239)

In this case the openflowplugin may be writing flows out of sync.

Comment by Jamo Luhrsen [ 14/Dec/18 ]

ok, looks like we're narrowing in on the problem. Thanks guys. Here are two table=220 flow dumps from
a failed and passing case of our CSIT jobs. Keep in mind there are 3 instances on a vlan network and
3 on a vxlan network:

PASSING CASE:

153: cookie=0x8000007, duration=122.815s, table=220, n_packets=0, n_bytes=0, priority=10,reg6=0xc00,metadata=0x1/0x1 actions=drop
154: cookie=0x8000007, duration=122.801s, table=220, n_packets=0, n_bytes=0, priority=10,reg6=0xd00,metadata=0x1/0x1 actions=drop
155: cookie=0x8000007, duration=582.334s, table=220, n_packets=16, n_bytes=1384, priority=9,reg6=0x300 actions=output:tun3884a6aa6dc
156: cookie=0x8000007, duration=582.334s, table=220, n_packets=117, n_bytes=11104, priority=9,reg6=0x400 actions=output:tuna642e00a5c1
157: cookie=0x8000007, duration=122.815s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0xc00 actions=output:"br-datacentre-"
158: cookie=0x8000007, duration=122.801s, table=220, n_packets=236, n_bytes=22292, priority=9,reg6=0xd00 actions=push_vlan:0x8100,set_field:5217->vlan_vid,output:"br-datacentre-"
159: cookie=0x6900000, duration=122.695s, table=220, n_packets=107, n_bytes=11218, priority=6,reg6=0xb00 actions=load:0x90000b00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239
160: cookie=0x8000007, duration=122.688s, table=220, n_packets=101, n_bytes=10494, priority=9,reg6=0x90000b00 actions=output:"tapa92fefa9-ca"
161: cookie=0x8000007, duration=118.411s, table=220, n_packets=101, n_bytes=10494, priority=9,reg6=0x90000e00 actions=output:"tap3c18abac-bd"
162: cookie=0x6900000, duration=118.397s, table=220, n_packets=107, n_bytes=11218, priority=6,reg6=0xe00 actions=load:0x90000e00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239
163: cookie=0x8000007, duration=100.575s, table=220, n_packets=100, n_bytes=10394, priority=9,reg6=0x90001200 actions=output:"tap80f684f4-60"
164: cookie=0x6900000, duration=100.575s, table=220, n_packets=106, n_bytes=10814, priority=6,reg6=0x1200 actions=load:0x90001200->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239

FAILING CASE:

153: cookie=0x8000007, duration=263.348s, table=220, n_packets=0, n_bytes=0, priority=10,reg6=0xc00,metadata=0x1/0x1 actions=drop
154: cookie=0x8000007, duration=263.348s, table=220, n_packets=0, n_bytes=0, priority=10,reg6=0xd00,metadata=0x1/0x1 actions=drop
155: cookie=0x8000007, duration=691.042s, table=220, n_packets=16, n_bytes=1384, priority=9,reg6=0x300 actions=output:tun3884a6aa6dc
156: cookie=0x8000007, duration=691.042s, table=220, n_packets=117, n_bytes=11104, priority=9,reg6=0x400 actions=output:tuna642e00a5c1
157: cookie=0x8000007, duration=263.348s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0xc00 actions=output:"br-datacentre-"
158: cookie=0x8000007, duration=263.348s, table=220, n_packets=150, n_bytes=13842, priority=9,reg6=0xd00 actions=push_vlan:0x8100,set_field:5217->vlan_vid,output:"br-datacentre-"
159: cookie=0x8000007, duration=263.336s, table=220, n_packets=92, n_bytes=10116, priority=9,reg6=0x90000b00 actions=output:"tapde1ea417-33"
160: cookie=0x6900000, duration=263.332s, table=220, n_packets=98, n_bytes=10840, priority=6,reg6=0xb00 actions=load:0x90000b00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239
161: cookie=0x8000007, duration=252.629s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0x90000e00 actions=output:"tap0dd6c327-54"
162: cookie=0x8000007, duration=252.629s, table=220, n_packets=157, n_bytes=12997, priority=9,reg6=0xe00 actions=output:"tap0dd6c327-54"
163: cookie=0x8000007, duration=240.479s, table=220, n_packets=100, n_bytes=10394, priority=9,reg6=0x90001200 actions=output:"tap04bf4472-dd"
164: cookie=0x6900000, duration=240.479s, table=220, n_packets=106, n_bytes=10814, priority=6,reg6=0x1200 actions=load:0x90001200->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239

karaf.log from the above failing case.

Comment by Jamo Luhrsen [ 15/Dec/18 ]

Reproduced with TRACE debugging for openflowplugin.

job logs

karaf log

Comment by Vishal Thapar [ 17/Dec/18 ]

aswins Which code is responsible for replacing the flow? Will give an idea for markers in case any exceptions in netvirt code.

Comment by Aswin Suryanarayanan [ 17/Dec/18 ]

First the interface will be created with out port security later Acl service will update the binding

From the logs order in which OF plugin revives the flow programming request seems to be the issue. In the failing case the the flow mod with updated regex after acl service is bound is received before the default bind.

thapar Should this be an issue in genius?

Failing Case

2018-12-15T01:57:27,014 | DEBUG | nc-exe-26 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.9}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=12}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*2415926784*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

2018-12-15T01:57:27,024 | DEBUG | nc-exe-30 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=12}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*7680*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

Success Case

2018-12-15T01:47:51,184 | DEBUG | nc-exe-20 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.d2cf928a-1c67-48f3-ba78-490534828239, getHardTimeout=0, getId=Uri{_value=260122599241271.220.d2cf928a-1c67-48f3-ba78-490534828239.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=8}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*2816*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

2018-12-15T01:47:51,207 | DEBUG | nc-exe-20 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.d2cf928a-1c67-48f3-ba78-490534828239, getHardTimeout=0, getId=Uri{_value=260122599241271.220.d2cf928a-1c67-48f3-ba78-490534828239.9}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=8}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*2415921920*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

Comment by Vishal Thapar [ 18/Dec/18 ]

Could be. We should also look at genius servicebinding logs to see if genius is getting service bindings in correct order or not.

k.faseela The flow with output to port is the default binding created by IFM, right?

Comment by Faseela K [ 18/Dec/18 ]

thapar : Yes!

Comment by Vishal Thapar [ 18/Dec/18 ]

k.faseela Looks like aswins had right idea but used wrong log entries to explain point.

Issue is the default action with output to port is being added AFTER the action to go to Table239 in failure case.

SUCCESS

2018-12-15T01:47:51,184 | DEBUG | nc-exe-20 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.d2cf928a-1c67-48f3-ba78-490534828239, getHardTimeout=0, getId=Uri{_value=260122599241271.220.d2cf928a-1c67-48f3-ba78-490534828239.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, *getOutputNodeConnector=Uri{_value=8}*, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*2816*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group
2018-12-15T01:47:51,204 | DEBUG | nc-exe-20 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=110100480}, getFlowName=acl.egressacl.d2cf928a-1c67-48f3-ba78-490534828239, getHardTimeout=0, getId=Uri{_value=260122599241271.220.d2cf928a-1c67-48f3-ba78-490534828239.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=WriteMetadataCase{getWriteMetadata=WriteMetadata{getMetadata=0, getMetadataMask=1099511627774, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}, Instruction{getInstruction=GoToTableCase{getGoToTable=GoToTable{getTableId=239, augmentations={}}, augmentations={}}, getOrder=1, augmentations={}}, Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=NxActionRegLoadNodesNodeTableFlowApplyActionsCase{getNxRegLoad=NxRegLoad{getDst=Dst{getDstChoice=DstNxRegCase{getNxReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, augmentations={}}, getEnd=31, getStart=0, augmentations={}}, getValue=2415921920, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=2, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*2816*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=6, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group
2018-12-15T01:54:15,347 | DEBUG | nc-exe-24 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=110100480}, getFlowName=acl.egressacl.d2cf928a-1c67-48f3-ba78-490534828239, getHardTimeout=0, getId=Uri{_value=260122599241271.220.d2cf928a-1c67-48f3-ba78-490534828239.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=WriteMetadataCase{getWriteMetadata=WriteMetadata{getMetadata=0, getMetadataMask=1099511627774, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}, Instruction{getInstruction=GoToTableCase{getGoToTable=GoToTable{getTableId=239, augmentations={}}, augmentations={}}, getOrder=1, augmentations={}}, Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=NxActionRegLoadNodesNodeTableFlowApplyActionsCase{getNxRegLoad=NxRegLoad{getDst=Dst{getDstChoice=DstNxRegCase{getNxReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, augmentations={}}, getEnd=31, getStart=0, augmentations={}}, getValue=2684357376, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=2, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*2816*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=6, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

FAILURE

2018-12-15T01:57:27,016 | DEBUG | nc-exe-26 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=110100480}, getFlowName=acl.egressacl.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=WriteMetadataCase{getWriteMetadata=WriteMetadata{getMetadata=0, getMetadataMask=1099511627774, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}, Instruction{getInstruction=GoToTableCase{getGoToTable=GoToTable{getTableId=239, augmentations={}}, augmentations={}}, getOrder=1, augmentations={}}, Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=NxActionRegLoadNodesNodeTableFlowApplyActionsCase{getNxRegLoad=NxRegLoad{getDst=Dst{getDstChoice=DstNxRegCase{getNxReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, augmentations={}}, getEnd=31, getStart=0, augmentations={}}, getValue=2415926784, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=2, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*7680*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=6, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group
2018-12-15T01:57:27,024 | DEBUG | nc-exe-30 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, *getOutputNodeConnector=Uri{_value=12}*, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*7680*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group
2018-12-15T02:02:08,667 | DEBUG | nc-exe-31 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, *getOutputNodeConnector=Uri{_value=12}*, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=*7680*, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

Comment by Faseela K [ 18/Dec/18 ]

thapar : Paste the flow on switch in working and non-working case?

Comment by Aswin Suryanarayanan [ 18/Dec/18 ]
Not Working

cookie=0x6900000, duration=248.818s, table=220, n_packets=0, n_bytes=0, priority=6,reg6=0x1e00 actions=load:0x90001e00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239
cookie=0x8000007, duration=248.817s, table=220, n_packets=157, n_bytes=12369, priority=9,reg6=0x1e00 actions=output:"tapa992567a-ae"

Success

cookie=0x6900000, duration=266.575s, table=220, n_packets=99, n_bytes=9638, priority=6,reg6=0x1b00 actions=load:0x90001b00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239
cookie=0x8000007, duration=266.576s, table=220, n_packets=95, n_bytes=9058, priority=9,reg6=0x90001b00 actions=output:"tapba660f72-15"

More details at [1]

[1]https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/builder-copy-sandbox-logs/571/jamo-netvirt-1538-netvirt-csit-1node-0cmb-1ctl-2cmp-apex-queens-gate-snat-conntrack-oxygen/5/robot-plugin/log_02_l2.html.gz#s1-k2-k1-k1

Comment by Faseela K [ 18/Dec/18 ]

thapar Please supply a run with TRACE enabled for org.opendaylight.genius.interfacemanager

Comment by Vishal Thapar [ 18/Dec/18 ]

Trace logs for failed port in 220:

2018-12-15T01:57:27,014 | DEBUG | nc-exe-26 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.9}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=12}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=2415926784, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group
2018-12-15T01:57:27,016 | DEBUG | nc-exe-26 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=110100480}, getFlowName=acl.egressacl.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=WriteMetadataCase{getWriteMetadata=WriteMetadata{getMetadata=0, getMetadataMask=1099511627774, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}, Instruction{getInstruction=GoToTableCase{getGoToTable=GoToTable{getTableId=239, augmentations={}}, augmentations={}}, getOrder=1, augmentations={}}, Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=NxActionRegLoadNodesNodeTableFlowApplyActionsCase{getNxRegLoad=NxRegLoad{getDst=Dst{getDstChoice=DstNxRegCase{getNxReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, augmentations={}}, getEnd=31, getStart=0, augmentations={}}, getValue=2415926784, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=2, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=7680, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=6, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group
2018-12-15T01:57:27,024 | DEBUG | nc-exe-30 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=12}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=7680, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group
2018-12-15T02:02:08,667 | DEBUG | nc-exe-31 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=12}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=7680, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

Flows in OVS [compute1]:

cookie=0x8000007, duration=248.821s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0x90001e00 actions=output:"tapa992567a-ae"
cookie=0x6900000, duration=248.818s, table=220, n_packets=0, n_bytes=0, priority=6,reg6=0x1e00 actions=load:0x90001e00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239
cookie=0x8000007, duration=248.817s, table=220, n_packets=157, n_bytes=12369, priority=9,reg6=0x1e00 actions=output:"tapa992567a-ae"

Comment by Jamo Luhrsen [ 18/Dec/18 ]

I will work on this today. I'll just enable this module for TRACE and let it run. Keep in mind, k.faseela, that
this is sort of a hard to reproduce failure, but we will hit it eventually.

Comment by Jamo Luhrsen [ 18/Dec/18 ]

this job is where I'm trying to get this

Comment by Faseela K [ 19/Dec/18 ]

d.arunprakash@ericsson.com : We need your help in debugging this from openflowplugin side.. 

Comment by Arunprakash D [ 19/Dec/18 ]

Somashekhar is checking this.

Comment by Somashekhar Javalagi [ 19/Dec/18 ]

I checked the flow programmed by openflowplugin.

For table=220 and reg6=0x1e00, found below flow-mods received by plugin in ovs-vswitchd logs,

2018-12-15T01:57:27.289Z|02466|vconn|DBG|tcp:192.0.2.19:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x242): DEL_STRICT table:220 priority=9,reg6=0x1e00 actions=output:12

2018-12-15T01:57:27.290Z|02468|vconn|DBG|tcp:192.0.2.19:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x244): ADD table:220 priority=6,reg6=0x1e00 cookie:0x6900000 actions=load:0x90001e00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239

2018-12-15T01:57:27.291Z|02471|vconn|DBG|tcp:192.0.2.19:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x247): ADD table:220 priority=9,reg6=0x1e00 cookie:0x8000007 actions=output:12

2018-12-15T02:02:08.942Z|03073|vconn|DBG|tcp:192.0.2.19:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x310): DEL_STRICT table:220 priority=6,reg6=0x1e00 actions=load:0x90001e00->NXM_NX_REG6[],write_metadata:0/0xfffffffffe,goto_table:239

2018-12-15T02:02:08.942Z|03074|vconn|DBG|tcp:192.0.2.19:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x311): ADD table:220 priority=9,reg6=0x1e00 cookie:0x8000007 actions=output:12

 

For the corresponding flow-mods, there are respective changes pushed by the plugin in karaf logs.

2018-12-15T01:57:27,016 | DEBUG | nc-exe-26 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=110100480}, getFlowName=acl.egressacl.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=WriteMetadataCase{getWriteMetadata=WriteMetadata{getMetadata=0, getMetadataMask=1099511627774, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}, Instruction{getInstruction=GoToTableCase{getGoToTable=GoToTable{getTableId=239, augmentations={}}, augmentations={}}, getOrder=1, augmentations={}}, Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=NxActionRegLoadNodesNodeTableFlowApplyActionsCase{getNxRegLoad=NxRegLoad{getDst=Dst{getDstChoice=DstNxRegCase{getNxReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, augmentations={}}, getEnd=31, getStart=0, augmentations={}}, getValue=2415926784, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=2, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=7680, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=6, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

2018-12-15T01:57:27,024 | DEBUG | nc-exe-30 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=12}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=7680, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

2018-12-15T02:02:08,667 | DEBUG | nc-exe-31 | FrmUtil | 369 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.8.0.SNAPSHOT | Check if flow Flow{getCookie=FlowCookie{_value=134217735}, getFlowName=default.a992567a-ae6c-4acf-9b6c-fda693ac6f18, getHardTimeout=0, getId=Uri{_value=75753836899984.220.a992567a-ae6c-4acf-9b6c-fda693ac6f18.0}, getIdleTimeout=0, getInstructions=Instructions{getInstruction=[Instruction{getInstruction=ApplyActionsCase{getApplyActions=ApplyActions{getAction=[Action{getAction=OutputActionCase{getOutputAction=OutputAction{getMaxLength=0, getOutputNodeConnector=Uri{_value=12}, augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, augmentations={}}, getOrder=0, augmentations={}}], augmentations={}}, getMatch=Match{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.general.rev140714.GeneralAugMatchNodesNodeTableFlow=GeneralAugMatchNodesNodeTableFlow{getExtensionList=[ExtensionList{getExtension=Extension{augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxAugMatchNodesNodeTableFlow=NxAugMatchNodesNodeTableFlow{getNxmNxReg=NxmNxReg{getReg=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowjava.nx.match.rev140421.NxmNxReg6, getValue=7680, augmentations={}}}}}, getExtensionKey=interface org.opendaylight.yang.gen.v1.urn.opendaylight.openflowplugin.extension.nicira.match.rev140714.NxmNxReg6Key, augmentations={}}]}}}, getPriority=9, getTableId=220, isBarrier=false, isInstallHw=true, isStrict=true, augmentations={}} is dependent on group

These logs are for flow-add events. We don't have any logs in code for flow delete events.

 

Comment by Faseela K [ 20/Dec/18 ]

so what does that mean now? delete is not pushed to the switch? Did you check the output of config/inventory? we see only two flows there, but there are 3 flows on the switch. What would you need to further debug the issue?

Comment by Somashekhar Javalagi [ 20/Dec/18 ]

We are not logging any message after deleting the flows. The config inventory is captured at the teardown state after which some flows programmed for table 220. 

thapar can I get config inventory dump after all the tests?

Comment by Jamo Luhrsen [ 20/Dec/18 ]

k.faseela, is the config/opendaylight-inventory:nodes model dump what you want?

Comment by Jamo Luhrsen [ 20/Dec/18 ]

I'm having some trouble.

1) even though I have org.opendaylight.genius.interfacemanager set to TRACE in the logging config file
there are no TRACE level logs in the karaf.log

2) I cannot hit this bug right now. this sandbox job has run for 80+
times now without hitting it

 

Comment by Faseela K [ 21/Dec/18 ]

Yes Jamo. What Som is saying is the current teardown dump is captured before the log entry for flows is showing up in the karaf log. So he is suspecting if the teardown was captured at a letr point of time, inventory dump would have had 3 flows

Comment by Faseela K [ 21/Dec/18 ]

Looks like you should set only genius.interfacemanager, the config file now says

log4j2.logger.org_opendaylight_org_opendaylight_genius_interfacemanager.name = org.opendaylight.org.opendaylight.genius.interfacemanager which is wrong.

Comment by Somashekhar Javalagi [ 21/Dec/18 ]

We are receiving add flow event with flow of priority 9 and then we are receiving update flow event  for flow with priority 9 to replace it with flow with priority 6. So eventually we have to  do,

ADD event:           ADD flow priority = 9

UPDATE event :   DELETE flow priority = 9

                              ADD flow priority = 6.

 

But the UPDATE event is processed first before ADD event.

Which is first trying to DELETE flow which doesn't exist followed by adding both the flows.

Though DataTreeChange notification is received in proper order, there is some problem in processing the flow mod events.

Comment by Faseela K [ 21/Dec/18 ]

Somashekhar : Do you know how to fix this? Or you need any further information?

Comment by Faseela K [ 21/Dec/18 ]

jluhrsen : Somashekhar has confirmed that this is something wrong in openflowplugin, so move the Jira to ofp?

Comment by Jamo Luhrsen [ 21/Dec/18 ]

I'm not totally following what we want then. The link I gave in my comment is the
model dump in the suite teardown of the final suite the job runs. We can't get
that debug much later than that, right?

Comment by Jamo Luhrsen [ 21/Dec/18 ]

will do

Comment by Somashekhar Javalagi [ 21/Jan/19 ]

jluhrsen, Can you please enable below log level and reproduce the same?

log:set TRACE org.opendaylight.openflowplugin.applications.frm.nodeconfigurator.NodeConfiguratorImpl

To see the queuing order of flow-mod requests?

Comment by Jamo Luhrsen [ 21/Jan/19 ]

Somashekhar, I wont have time to look at this right now. But, if someone can make
a job in the sandbox, you can use the jenkins build parameters to set custom
logging like you want. You'll need to run it in a loop (once per hour maybe) and
monitor it for when it might fail.

Comment by Somashekhar Javalagi [ 06/Feb/19 ]

I am keep on trying, but not yet able to reproduce. I will update my analysis for added logs onc I am able to reproduce.

Comment by Somashekhar Javalagi [ 19/Mar/19 ]

Created a sandbox and scheduled it to run every hour for a week, but could not see any failures and hence was not able to reproduce. Can be reopened if it is seen again.

Generated at Wed Feb 07 20:34:05 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.