[OVSDB-285] bridge not created if it's configured northbound while ovs node is disconnected Created: 02/Feb/16  Updated: 23/May/16  Resolved: 23/May/16

Status: Resolved
Project: ovsdb
Component/s: Southbound.Open_vSwitch
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Jamo Luhrsen
Assignee: Vinh Nguyen
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 5177

 Description   

If the connection between the plugin and the ovs node is not active when a
configuration is requested via the northbound REST API, that configuration
is not applied to the ovs node when it reconnects.

To reproduce:

1. Connect ovs to the controller:
sudo ovs-vsctl set-manager tcp:${IP}:6640

2. Create a connection in the config store matching the one seen in
operational.

3. Create a bridge in the config store and verify it's there in operational
and on the ovs node.

4. Disconnect the ovs node (iptables can be used to block outbound traffic
on port 6640, or just reboot the ovs node).

5. While disconnected, create a new bridge in the config store.

6. Reconnect the ovs node. The configuration from step 3 will still be there
in operational as well as on the ovs node, but the configuration made in
step 5 will not.
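
The symptom in step 6 (bridge present in config but absent from operational) can be checked from a shell. This is only a sketch under assumptions that are not stated in this ticket: RESTCONF reachable on port 8181, the topology ID ovsdb:1, the default admin:admin credentials, and a bridge node ID of the form <node-id>/bridge/<name>. CONTROLLER, NODE_ID, and the bridge name are placeholders.

```shell
#!/bin/sh
# Hypothetical helpers; the URL layout is an assumption, adjust to your setup.
CONTROLLER="${CONTROLLER:-127.0.0.1}"     # controller address (placeholder)
NODE_ID="${NODE_ID:-ovsdb:%2F%2FHOST1}"   # URL-encoded ovs node ID (placeholder)
TOPO="network-topology:network-topology/topology/ovsdb:1"

# bridge_url STORE BRIDGE: print the RESTCONF URL of a bridge node in
# the given datastore ("config" or "operational").
bridge_url() {
    printf 'http://%s:8181/restconf/%s/%s/node/%s%%2Fbridge%%2F%s\n' \
        "$CONTROLLER" "$1" "$TOPO" "$NODE_ID" "$2"
}

# bridge_in_store STORE BRIDGE: succeed if a GET on the bridge returns 200.
# CURL can be overridden to substitute another client or a stub.
bridge_in_store() {
    code=$(${CURL:-curl} -s -o /dev/null -w '%{http_code}' \
        -u "${ODL_AUTH:-admin:admin}" "$(bridge_url "$1" "$2")")
    [ "$code" = 200 ]
}
```

With these, the bug would show up as bridge_in_store config br-new succeeding while bridge_in_store operational br-new keeps failing after the node has reconnected (br-new being whatever name was used in step 5).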



 Comments   
Comment by Jamo Luhrsen [ 02/Feb/16 ]

email thread for more on the conversation:
https://lists.opendaylight.org/pipermail/ovsdb-dev/2016-February/002528.html

Comment by Vinh Nguyen [ 08/Apr/16 ]

Code review submitted:

https://git.opendaylight.org/gerrit/#/c/37358/

Comment by Jamo Luhrsen [ 12/Apr/16 ]

(In reply to Vinh Nguyen from comment #2)
> Code review submitted:
>
> https://git.opendaylight.org/gerrit/#/c/37358/

Vinh,

I tried to test this patch, but saw the same behavior of this bug.

However, I had some trouble building the patch to create a distribution.
I was using our multipatch job in the sandbox, which will create an
entire distribution based on any patches. I used this patch for that
job, but it failed when building with master (Boron). The failure was
in yangtools (warning message below) so probably nothing wrong with the patch.
But pulling this patch in to the stable/beryllium branch I got a distribution
to build. Using that distribution, I re-tested this bug and saw the
same behavior.

[WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy: org/opendaylight/yangtools/yang2sources/plugin/YangToSourcesMojo : Unsupported major.minor version 52.0

Comment by Vinh Nguyen [ 12/Apr/16 ]

(In reply to Jamo Luhrsen from comment #3)
> (In reply to Vinh Nguyen from comment #2)
> > Code review submitted:
> >
> > https://git.opendaylight.org/gerrit/#/c/37358/
>
> Vinh,
>
> I tried to test this patch, but saw the same behavior of this bug.
>
> However, I had some trouble building the patch to create a distribution.
> I was using our multipatch job in the sandbox, which will create an
> entire distribution based on any patches. I used this patch for that
> job, but it failed when building with master (Boron). The failure was
> in yangtools (warning message below) so probably nothing wrong with the
> patch.
> But pulling this patch in to the stable/beryllium branch I got a distribution
> to build. Using that distribution, I re-tested this bug and saw the
> same behavior.
>
> [WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy:
> org/opendaylight/yangtools/yang2sources/plugin/YangToSourcesMojo :
> Unsupported major.minor version 52.0

Hi Jamo,

I was able to verify the fix for this bug on the stable/beryllium branch. Here are the steps I used; please let me know if I missed something:

1) git checkout stable/beryllium
2) cherry-pick the patch:
git fetch https://git.opendaylight.org/gerrit/ovsdb refs/changes/58/37358/1 && git cherry-pick FETCH_HEAD
3) build and run karaf

4) connect an ovs node to controller
sudo ovs-vsctl set-manager tcp:${IP}:6640

5) create a bridge in config store and verify it's there in operational
and on ovs node

6) reboot the ovs node

7) while the ovs node is disconnected, create a new bridge in config store

8) when the ovs node comes back up, verify that the bridges created in 5 and 7 are present in operational as well as on the ovs node.

Thanks, Vinh
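
The on-node half of the check in step 8 could be scripted roughly as follows. This is a sketch, not part of the patch: it relies on ovs-vsctl br-exists, which exits 0 when the bridge exists, and the bridge names passed in are whatever was created in steps 5 and 7. OVS_VSCTL is a hypothetical override so the command can be swapped out.

```shell
#!/bin/sh
# missing_bridges BRIDGE...: print each listed bridge that is absent on
# the local ovs node; return non-zero if any bridge is missing.
missing_bridges() {
    rc=0
    for br in "$@"; do
        # br-exists exits 0 if the bridge is present, non-zero otherwise.
        if ! ${OVS_VSCTL:-sudo ovs-vsctl} br-exists "$br"; then
            echo "$br"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, missing_bridges br1 br2 prints nothing and returns 0 when both bridges survived the reconnect; with the bug present, the bridge from step 7 would be printed.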

Comment by Jamo Luhrsen [ 12/Apr/16 ]

(In reply to Vinh Nguyen from comment #4)
> (In reply to Jamo Luhrsen from comment #3)
> > (In reply to Vinh Nguyen from comment #2)
> > > Code review submitted:
> > >
> > > https://git.opendaylight.org/gerrit/#/c/37358/
> >
> > Vinh,
> >
> > I tried to test this patch, but saw the same behavior of this bug.
> >
> > However, I had some trouble building the patch to create a distribution.
> > I was using our multipatch job in the sandbox, which will create an
> > entire distribution based on any patches. I used this patch for that
> > job, but it failed when building with master (Boron). The failure was
> > in yangtools (warning message below) so probably nothing wrong with the
> > patch.
> > But pulling this patch in to the stable/beryllium branch I got a distribution
> > to build. Using that distribution, I re-tested this bug and saw the
> > same behavior.
> >
> > [WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy:
> > org/opendaylight/yangtools/yang2sources/plugin/YangToSourcesMojo :
> > Unsupported major.minor version 52.0
>
>
> Hi Jamo,
>
> I was able to verify the bug on the stable/beryllium branch. There are the
> reproduction steps, please let me know if I missed something:
>
>
> 1) git checkout stable/beryllium
> 2) cherry-pick the patch:
> git fetch https://git.opendaylight.org/gerrit/ovsdb
> refs/changes/58/37358/1 && git cherry-pick FETCH_HEAD
> 3) build and run karaf
>
> 4) connect an ovs node to controller
> sudo ovs-vsctl set-manager tcp:${IP}:6640
>
> 5) create a bridge in config store and verify it's there in operational
> and on ovs node
>
> 6) reboot the ovs node
>
> 7) while ovs node disconnected, create a new bridge in config store
>
> 8) when the ovs comes back up, verify that the bridge created in 5 and 7 are
> present in the operational as well as on ovs node.
>
> Thanks, Vinh

Vinh,

I can try again using your steps 1-3. Could you try again as well, and
instead of doing your step 6, can you run this command on your ovs node:

iptables -A OUTPUT -p tcp --dport 6640 -j DROP

that will simulate the ovs node "going away". While it's in that state,
ovs-vsctl should show it's not connected. Now, you can go back to your
step 7 and 8.

Also, I learned today that when a node goes away like this (iptables method)
it will take approximately 3.5 minutes before we remove it from operational. When I was
doing this test earlier today I was not waiting that long. I wonder if
there is any difference there. I have a feeling your step 6 (reboot) will
actively close the ovs connection so the plugin will see the TCP connection
closed and might immediately remove it from operational.

Thanks,
JamO
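
The iptables approach described in this comment can be wrapped in a pair of helpers so the rule is added and removed symmetrically. This is a minimal sketch: the function names and the RUN and OVSDB_PORT variables are made up for illustration, and the connection check simply looks for the is_connected: true line that ovs-vsctl show prints under a connected manager.

```shell
#!/bin/sh
OVSDB_PORT="${OVSDB_PORT:-6640}"

# Rule arguments shared by the add (-A) and delete (-D) forms.
drop_rule() {
    printf 'OUTPUT -p tcp --dport %s -j DROP' "$OVSDB_PORT"
}

# Block / unblock outbound traffic to the manager port on the ovs node.
# Set RUN=echo to print the command instead of executing it.
block_manager()   { ${RUN:-sudo} iptables -A $(drop_rule); }
unblock_manager() { ${RUN:-sudo} iptables -D $(drop_rule); }

# Reads `ovs-vsctl show` output on stdin; succeeds if a manager
# connection is reported as up.
manager_connected() {
    grep -q 'is_connected: true'
}
```

A typical sequence on the ovs node would be block_manager, then sudo ovs-vsctl show | manager_connected to confirm the node no longer reports a connection, and finally unblock_manager to let it reconnect.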

Comment by Vinh Nguyen [ 19/Apr/16 ]

(In reply to Jamo Luhrsen from comment #5)
> (In reply to Vinh Nguyen from comment #4)
> > (In reply to Jamo Luhrsen from comment #3)
> > > (In reply to Vinh Nguyen from comment #2)
> > > > Code review submitted:
> > > >
> > > > https://git.opendaylight.org/gerrit/#/c/37358/
> > >
> > > Vinh,
> > >
> > > I tried to test this patch, but saw the same behavior of this bug.
> > >
> > > However, I had some trouble building the patch to create a distribution.
> > > I was using our multipatch job in the sandbox, which will create an
> > > entire distribution based on any patches. I used this patch for that
> > > job, but it failed when building with master (Boron). The failure was
> > > in yangtools (warning message below) so probably nothing wrong with the
> > > patch.
> > > But pulling this patch in to the stable/beryllium branch I got a distribution
> > > to build. Using that distribution, I re-tested this bug and saw the
> > > same behavior.
> > >
> > > [WARNING] Failed to notify spy hudson.maven.Maven3Builder$JenkinsEventSpy:
> > > org/opendaylight/yangtools/yang2sources/plugin/YangToSourcesMojo :
> > > Unsupported major.minor version 52.0
> >
> >
> > Hi Jamo,
> >
> > I was able to verify the bug on the stable/beryllium branch. There are the
> > reproduction steps, please let me know if I missed something:
> >
> >
> > 1) git checkout stable/beryllium
> > 2) cherry-pick the patch:
> > git fetch https://git.opendaylight.org/gerrit/ovsdb
> > refs/changes/58/37358/1 && git cherry-pick FETCH_HEAD
> > 3) build and run karaf
> >
> > 4) connect an ovs node to controller
> > sudo ovs-vsctl set-manager tcp:${IP}:6640
> >
> > 5) create a bridge in config store and verify it's there in operational
> > and on ovs node
> >
> > 6) reboot the ovs node
> >
> > 7) while ovs node disconnected, create a new bridge in config store
> >
> > 8) when the ovs comes back up, verify that the bridge created in 5 and 7 are
> > present in the operational as well as on ovs node.
> >
> > Thanks, Vinh
>
> Vinh,
>
> I can try again using your steps 1-3. Could you try again as well, and
> instead of doing your step 6, can you run this command on your ovs node:
>
> iptables -A OUTPUT -p tcp --dport 6640 -j DROP
>
> that will simulate the ovs node "going away". While it's in that state,
> ovs-vsctl should show it's not connected. Now, you can go back to your
> step 7 and 8.
>
> Also, I learned today that when a node goes away like this (iptables method)
> it will take approximately 3.5 minutes before we remove it from operational. When I was
> doing this test earlier today I was not waiting that long. I wonder if
> there is any difference there. I have a feeling your step 6 (reboot) will
> actively close the ovs connection so the plugin will see the TCP connection
> closed and might immediately remove it from operational.
>
> Thanks,
> JamO

Hi JamO,

I was able to verify the fix using iptables to add and delete the firewall rule in steps 6 and 8 respectively.
In step 6, the connection drops after 3-4 minutes and the configurations are removed from the datastore.

In step 8, after removing the firewall rule, the node reconnects and the new configurations are reconciled, as in the reboot scenario.

Thanks, Vinh
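
Because the disconnect takes a few minutes to register, a scripted version of this test needs to poll rather than check once. A small generic helper along these lines would do; the name wait_until and its arguments are illustrative only, and the timeout should be chosen to exceed the 3-4 minutes observed above.

```shell
#!/bin/sh
# wait_until TIMEOUT INTERVAL CMD...: re-run CMD every INTERVAL seconds
# until it succeeds (return 0) or TIMEOUT seconds have elapsed (return 1).
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

For instance, wait_until 300 10 some_check would retry some_check every 10 seconds for up to 5 minutes, where some_check is whatever command tests for the node being removed from (or restored to) operational.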

Comment by Sam Hague [ 16/May/16 ]

be: https://git.opendaylight.org/gerrit/38531
b: https://git.opendaylight.org/gerrit/37358

Generated at Wed Feb 07 20:35:59 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.