|
If a peer sends a large batch of BGP updates (either a big RIB sync or a big change to its RIB after the sync has completed) and closes the connection immediately after finishing the transmission, the BGPCEP feature dies. All further connections are accepted and everything seems to work normally, but the data received from the peers is silently discarded (well, silently relative to the connected peers; the discarding is actually pretty loud in karaf.log). The larger the batch of BGP updates, the more likely this situation is to occur.
Examining karaf.log shows that after the connection to the peer with the big update closes, one or more OptimisticLockFailed exceptions appear ("Node was deleted by other transaction"). After them, all other peers' data is rejected with either a "CanCommit Failed: Transaction chain failed" or a "New transaction ABC raced with transaction XYZ" error (where "XYZ" is one of the transactions involved in the preceding OptimisticLockFailed exceptions).
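To check quickly whether a given karaf.log shows this pattern, something like the following rough Python sketch could be used. It only searches for the message fragments quoted above; the function name `scan_log`, the counter keys, and the assumption that transaction IDs contain no whitespace are mine and are not taken from the BGPCEP code or from a verified log format.

```python
import re

# Message fragments quoted in this report. How the surrounding log line is
# laid out is an assumption, so only substring/regex searches are used.
LOCK_FAILURE = "Node was deleted by other transaction"
CHAIN_FAILED = "CanCommit Failed: Transaction chain failed"
# Assumption: transaction IDs are whitespace-free tokens.
RACED = re.compile(r"New transaction (\S+) raced with transaction (\S+)")


def scan_log(lines):
    """Count the three symptoms and collect the IDs of the transactions
    that later transactions are reported to have raced with."""
    counts = {"lock_failure": 0, "chain_failed": 0, "raced": 0}
    raced_with = set()
    for line in lines:
        if LOCK_FAILURE in line:
            counts["lock_failure"] += 1
        if CHAIN_FAILED in line:
            counts["chain_failed"] += 1
        match = RACED.search(line)
        if match:
            counts["raced"] += 1
            # group(2) is the older transaction ("XYZ" in the text above)
            raced_with.add(match.group(2))
    return counts, raced_with
```

Feeding it an open file, for example `scan_log(open("karaf.log", errors="replace"))`, returns the counts and the set of "XYZ" IDs, which can then be compared by hand against the transactions named in the OptimisticLockFailed stack traces.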
This is just a quick bug report intended for myself to make sure the bug does not get lost in the course of debugging various aspects of the Internet Feed tests (and to get a nice ID to refer to the bug later). More info (logs etc.) will be reported/attached as it is acquired.
|