[CONTROLLER-1223] Clustering : Chunk an append entry to ensure that we do not exceed the akka-remoting size limit Created: 24/Mar/15  Updated: 03/Jul/17  Resolved: 03/Jul/17

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: Post-Helium
Fix Version/s: None

Type: Bug
Reporter: Moiz Raja Assignee: Tom Pantelis
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 2890

Comments
Comment by Moiz Raja [ 24/Mar/15 ]

A large transaction with a lot of operations can result in a very large append entry, which may need to be chunked in the same way snapshots are chunked.
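
For illustration, a minimal sketch of the slicing half of such chunking, assuming a hypothetical AppendEntriesChunker class and size limit (neither is part of the controller's actual snapshot-chunking code):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    final class AppendEntriesChunker {
        // Illustrative limit; the real ceiling is the remoting maximum-frame-size.
        static final int MAX_CHUNK_SIZE = 448 * 1024;

        // Slice a serialized append entry into frame-sized chunks, the same
        // basic scheme used for snapshot chunking.
        static List<byte[]> chunk(byte[] serializedEntry) {
            List<byte[]> chunks = new ArrayList<>();
            for (int offset = 0; offset < serializedEntry.length; offset += MAX_CHUNK_SIZE) {
                int end = Math.min(offset + MAX_CHUNK_SIZE, serializedEntry.length);
                chunks.add(Arrays.copyOfRange(serializedEntry, offset, end));
            }
            return chunks;
        }
    }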

Comment by Tom Pantelis [ 18/Jan/16 ]

If a single AppendEntries exceeds the max size, maybe we should just send a snapshot.

Comment by Robert Varga [ 13/Apr/16 ]

This is related to maximum-frame-size. It affects not only AppendEntries but also other messages. Rather than dealing with this in the business logic, I think we should plug into the transport and implement some sort of fragmentation/defragmentation component, if that is possible.

Comment by Robert Varga [ 19/Dec/16 ]

We can solve this by switching to Artery (and perhaps allocating a dedicated channel to carry these messages).

Patch to switch to Artery: https://git.opendaylight.org/gerrit/49466
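
For context, enabling Artery in place of the classic Netty TCP transport is, roughly, a matter of configuration like the following (hostname and port are illustrative; the actual change is in the Gerrit patch above):

    akka {
      remote {
        artery {
          enabled = on
          canonical.hostname = "127.0.0.1"
          canonical.port = 2550
        }
      }
    }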

Comment by Robert Varga [ 20/Dec/16 ]

One additional improvement we can make with Artery is to separate messages into distinct subchannels, eliminating latency cross-talk between components.

Specifically:

  • RAFT messages should use one subchannel
  • Frontend -> Backend messages should use another subchannel
  • Backend -> Frontend messages should use yet another subchannel

This should result in RAFT not being blocked by anything going on in the FE/BE interface, and in Shards being able to make forward progress irrespective of frontend requests being made from their local node to shard leaders on other nodes.

Comment by Tom Pantelis [ 28/Dec/16 ]

It does not appear Artery itself solves this: while the Aeron layer will fragment messages into smaller chunks, Akka still imposes an upper limit governed by artery.advanced.maximum-frame-size. We'll need to bump this up as we did with netty.tcp.maximum-frame-size.

From what I've read, Artery defines 3 subchannels: one for system/heartbeat messages, one for ordinary messages, and one for large messages. By default all user messages go into the "ordinary" subchannel and are thus isolated over the wire from system messages. You can direct particular actors to use the large-message channel via the artery.large-message-destinations setting. We may want to use this for the FE actors (i.e. read replies can be large). I have not seen anything in the Akka docs that would let you define your own custom subchannels.

Artery does allow configuring inbound and outbound "lanes" so that ser/des can be performed in parallel for different destination actors, although the messages still share the same transport subchannel. However, this feature is not yet supported or recommended for use (it needs more hardening).
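
For reference, the settings mentioned above look roughly like this in application.conf; the values and the destination path are illustrative, not ones ODL would actually ship:

    akka.remote.artery {
      # Route messages for the listed actor paths over the dedicated
      # large-message subchannel (the path pattern here is hypothetical).
      large-message-destinations = ["/user/shardmanager-operational/*"]

      advanced {
        # Per-message ceiling, analogous to netty.tcp.maximum-frame-size.
        maximum-frame-size = 1 MiB
        # Ceiling for the large-message subchannel.
        maximum-large-frame-size = 4 MiB

        # Parallel ser/des lanes (noted above as not yet hardened).
        inbound-lanes = 4
        outbound-lanes = 1
      }
    }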

Comment by Jie Han [ 12/Apr/17 ]

I have committed a component to Gerrit which does the fragmentation and defragmentation mentioned above. It has passed a test of reading a large amount of data in one go in a three-node cluster without breaking the Raft heartbeat.
Looking forward to a more in-depth discussion.

Some links:
https://bugs.opendaylight.org/show_bug.cgi?id=7841, which has the UML sequence diagrams for its usage attached.

https://git.opendaylight.org/gerrit/#/c/54753/
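
For illustration, the defragmentation side of such a component reduces to buffering chunks per message and reassembling once the last one arrives. A minimal sketch, assuming in-order chunk delivery and hypothetical names (this is not the API of the Gerrit change above):

    import java.io.ByteArrayOutputStream;
    import java.util.HashMap;
    import java.util.Map;

    final class Defragmenter {
        private final Map<Long, ByteArrayOutputStream> pending = new HashMap<>();

        // Returns the reassembled message once the final chunk arrives,
        // or null while more chunks are expected. Assumes chunks for a
        // given messageId arrive in order, as they would over a single
        // ordered actor channel.
        byte[] onChunk(long messageId, int index, int totalChunks, byte[] data) {
            ByteArrayOutputStream buf = pending.computeIfAbsent(messageId,
                id -> new ByteArrayOutputStream());
            buf.write(data, 0, data.length);
            if (index == totalChunks - 1) {
                pending.remove(messageId);
                return buf.toByteArray();
            }
            return null;
        }
    }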

Comment by Tom Pantelis [ 26/Jun/17 ]

Submitted https://git.opendaylight.org/gerrit/#/c/57301/
