[CONTROLLER-1616] Exception Cannot append new entry Created: 21/Mar/17  Updated: 25/Jul/23  Resolved: 24/Mar/17

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Claudio David Gasparini Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
blocks BGPCEP-392 BGP scaling target not met in 3-node ... Confirmed
Duplicate
is duplicated by CONTROLLER-1621 RaftActor seems to get confused with ... Resolved
External issue ID: 8027

 Description   

Observed during analysis of bgpcep-csit-3node-periodic-bgpclustering-only-carbon

https://logs.opendaylight.org/sandbox/jenkins091/bgpcep-csit-3node-periodic-bgpclustering-only-carbon/5/archives/odl1_karaf.log.gz

2017-03-21 07:46:17,292 | WARN | lt-dispatcher-20 | AbstractReplicatedLogImpl | 227 - org.opendaylight.controller.sal-akka-raft - 1.5.0.SNAPSHOT | member-1-shard-default-config: Cannot append new entry - new index 28 is not greater than the last index 28
java.lang.Exception: stack trace
at org.opendaylight.controller.cluster.raft.AbstractReplicatedLogImpl.append(AbstractReplicatedLogImpl.java:128)[227:org.opendaylight.controller.sal-akka-raft:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.raft.ReplicatedLogImpl.appendAndPersist(ReplicatedLogImpl.java:101)[227:org.opendaylight.controller.sal-akka-raft:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.raft.RaftActor.persistData(RaftActor.java:544)[227:org.opendaylight.controller.sal-akka-raft:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.Shard.persistPayload(Shard.java:465)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardDataTree.replicatePayload(ShardDataTree.java:438)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardDataTree.abortTransaction(ShardDataTree.java:658)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.AbstractShardDataTreeTransaction.abort(AbstractShardDataTreeTransaction.java:75)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardTransaction.closeTransaction(ShardTransaction.java:80)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardTransaction.handleReceive(ShardTransaction.java:66)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardReadTransaction.handleReceive(ShardReadTransaction.java:38)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:28)[226:org.opendaylight.controller.sal-clustering-commons:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:104)[226:org.opendaylight.controller.sal-clustering-commons:1.5.0.SNAPSHOT]
at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.Actor$class.aroundReceive(Actor.scala:497)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.ActorCell.invoke(ActorCell.scala:495)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.run(Mailbox.scala:224)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[200:com.typesafe.akka.actor:2.4.17]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]



 Comments   
Comment by Tom Pantelis [ 21/Mar/17 ]

The transaction actor tried to close/abort the transaction by calling directly into the Shard via the AbstractShardDataTreeTransaction. That violates actor encapsulation. It needs to send a message to the Shard.

Comment by Viera Zelcamova [ 21/Mar/17 ]

Hi Tom,
can you check the bug if can be fixed for Carbon RCO? We have dependency on BGP side and are blocked right now. Thank you.

Comment by Tom Pantelis [ 21/Mar/17 ]

I'm sure it can be fixed by then

(In reply to Viera Zelcamova from comment #2)
> Hi Tom,
> can you check the bug if can be fixed for Carbon RCO? We have dependency on
> BGP side and are blocked right now. Thank you.

Comment by Vratko Polak [ 22/Mar/17 ]

Karaf.log segment from today [0]

2017-03-22 11:54:53,923 | INFO | entLoopGroup-6-2 | AbstractBGPSessionNegotiator | 305 - org.opendaylight.bgpcep.bgp-rib-impl - 0.7.0.SNAPSHOT | BGP Session with peer [id: 0x5cb7d574, L:/10.29.12.73:1790 - R:/10.29.12.154:17900] established successfully.
2017-03-22 11:54:55,215 | WARN | lt-dispatcher-38 | AbstractReplicatedLogImpl | 227 - org.opendaylight.controller.sal-akka-raft - 1.5.0.SNAPSHOT | member-1-shard-default-config: Cannot append new entry - new index 33 is not greater than the last index 33
java.lang.Exception: stack trace
at org.opendaylight.controller.cluster.raft.AbstractReplicatedLogImpl.append(AbstractReplicatedLogImpl.java:128)[227:org.opendaylight.controller.sal-akka-raft:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.raft.ReplicatedLogImpl.appendAndPersist(ReplicatedLogImpl.java:101)[227:org.opendaylight.controller.sal-akka-raft:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.raft.RaftActor.persistData(RaftActor.java:544)[227:org.opendaylight.controller.sal-akka-raft:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.Shard.persistPayload(Shard.java:465)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardDataTree.replicatePayload(ShardDataTree.java:439)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardDataTree.abortTransaction(ShardDataTree.java:659)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.AbstractShardDataTreeTransaction.abort(AbstractShardDataTreeTransaction.java:75)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardTransaction.closeTransaction(ShardTransaction.java:80)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardTransaction.handleReceive(ShardTransaction.java:66)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardReadTransaction.handleReceive(ShardReadTransaction.java:38)[233:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedActor.onReceive(AbstractUntypedActor.java:28)[226:org.opendaylight.controller.sal-clustering-commons:1.5.0.SNAPSHOT]
at org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:104)[226:org.opendaylight.controller.sal-clustering-commons:1.5.0.SNAPSHOT]
at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.Actor$class.aroundReceive(Actor.scala:497)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.ActorCell.invoke(ActorCell.scala:495)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.run(Mailbox.scala:224)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[200:com.typesafe.akka.actor:2.4.17]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
2017-03-22 11:54:55,523 | WARN | lt-dispatcher-23 | ShardManager | 233 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Supervisor Strategy caught unexpected exception - resuming
java.lang.ClassCastException: org.opendaylight.controller.cluster.raft.persisted.ApplyJournalEntries cannot be cast to org.opendaylight.controller.cluster.raft.ReplicatedLogEntry
at akka.persistence.UntypedPersistentActor$$anonfun$persistAsync$1.apply(PersistentActor.scala:247)[207:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.UntypedPersistentActor$$anonfun$persistAsync$1.apply(PersistentActor.scala:247)[207:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.Eventsourced$class.akka$persistence$Eventsourced$$peekApplyHandler(Eventsourced.scala:594)[207:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.Eventsourced$ProcessingState$$anonfun$2.applyOrElse(Eventsourced.scala:610)[207:com.typesafe.akka.persistence:2.4.17]
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.persistence.Eventsourced$$anon$1.stateReceive(Eventsourced.scala:662)[207:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:183)[207:com.typesafe.akka.persistence:2.4.17]
at akka.persistence.UntypedPersistentActor.aroundReceive(PersistentActor.scala:168)[207:com.typesafe.akka.persistence:2.4.17]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[200:com.typesafe.akka.actor:2.4.17]
at akka.actor.ActorCell.invoke(ActorCell.scala:495)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.run(Mailbox.scala:224)[200:com.typesafe.akka.actor:2.4.17]
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[200:com.typesafe.akka.actor:2.4.17]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
2017-03-22 11:54:55,526 | WARN | lt-dispatcher-33 | OneForOneStrategy | 201 - com.typesafe.akka.slf4j - 2.4.17 | org.opendaylight.controller.cluster.raft.persisted.ApplyJournalEntries cannot be cast to org.opendaylight.controller.cluster.raft.ReplicatedLogEntry
2017-03-22 11:54:55,738 | WARN | lt-dispatcher-41 | FrontendClientMetadataBuilder | 233 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Unknown history for aborted transaction member-3-datastore-operational-fe-0-txn-7-0, ignoring

[0] https://logs.opendaylight.org/releng/jenkins092/bgpcep-csit-3node-periodic-bgpclustering-only-carbon/212/archives/odl1_karaf.log.gz

Comment by Robert Varga [ 23/Mar/17 ]

As per Tom's analysis, https://git.opendaylight.org/gerrit/49265 did the wrong thing in ShardTransaction, because ShardTransaction is an actor, which means it must not make upcalls to Shard.

Comment by Robert Varga [ 23/Mar/17 ]

https://git.opendaylight.org/gerrit/53747

Generated at Wed Feb 07 19:56:00 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.