Amoy Testnet Bor Halt

Incident Report for Polygon

Postmortem

Summary

On Dec 10, 2025, the Amoy Testnet stopped producing blocks at ~09:44 UTC; the halt occurred at block 30,273,158. The network recovered once the affected block producer's Heimdall and Bor caught up, and block production resumed at 10:25 UTC.

Impact

  • Bor block production stalled on Amoy, so transactions were not finalised during the incident window.
  • Heimdall continued span rotations, but Bor could not reliably resume block production until the underlying miner deadlock condition cleared.

What happened (high level)

  1. BP4 produced block number 30,269,664.
  2. BP5 failed to import that block, which contained a state sync event. Because BP5's Heimdall was lagging, the import failed with an “invalid bloom” error, leaving BP5 behind the chain tip.
  3. As spans rotated the producer, Bor ended up in a state where span rotations occurred while one producer was down/out-of-sync.
  4. A Bor miner bug caused the active producer to become permanently stuck during these span rotations, preventing it from committing new work even when it became the rightful producer again.
  5. Once BP5's Heimdall had fully synced and BP5 had successfully imported block 30,269,664, it caught up to the tip and began producing blocks, restoring chain progress.

Timeline (UTC)

  • ~09:44 UTC - Chain halts around block 30,273,158.
  • BP5 repeatedly rejects the head block with BAD BLOCK: invalid bloom, and remains behind due to Heimdall lag.
  • Following minutes - Heimdall triggers repeated span rotations as finalisation exceeds the change-producer threshold, but Bor does not consistently resume production.
  • ~10:24 UTC - BP5's Heimdall catches up; Bor imports 30,269,664, then rapidly syncs to the tip.
  • After that, BP5 resumes block production, and the network continues normally.

Root cause

Primary root cause: Bor miner deadlock during span rotations

A critical bug in the miner of BP4's Bor caused the node to become stuck when spans rotated while one block producer was down/out-of-sync:

  • Bor tracks in-flight sealing work in a pendingTasks map.
  • After a block was successfully sealed, the corresponding entry in pendingTasks was not deleted.
  • When the chain stopped advancing (due to producer issues), the normal cleanup path (clearPending) was no longer triggered.
  • As a result, the miner continuously observed “has pending tasks” and refused to commit new work, even after spans rotated back to make it the rightful producer.

This created a self-sustaining deadlock: no new block → no cleanup → pending tasks remain → miner won’t commit → no new block.
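
The loop above can be reproduced in a small toy model. This is a minimal sketch under stated assumptions, not Bor's actual miner code: `toyMiner`, `sealNext`, `onNewHead` and `tryCommit` are hypothetical names, tasks are keyed by block number for simplicity, and the `staleThreshold` value of 7 is chosen only for illustration. Only `pendingTasks`, `clearPending` and `staleThreshold` are names taken from this report.

```go
package main

import "fmt"

// staleThreshold is an illustrative assumption for this sketch: a task is
// kept until the head has reached at least this many blocks past it.
const staleThreshold = 7

// toyMiner is a hypothetical model, not Bor's actual worker type.
type toyMiner struct {
	pendingTasks map[uint64]struct{} // keyed by block number for simplicity
	head         uint64              // latest block number this node has seen
}

func newToyMiner(head uint64) *toyMiner {
	return &toyMiner{pendingTasks: make(map[uint64]struct{}), head: head}
}

// clearPending drops tasks that are stale relative to the given head number.
func (m *toyMiner) clearPending(number uint64) {
	for n := range m.pendingTasks {
		if n+staleThreshold <= number {
			delete(m.pendingTasks, n)
		}
	}
}

// onNewHead models a chain-head event; it is the only cleanup trigger here.
func (m *toyMiner) onNewHead(number uint64) {
	m.head = number
	m.clearPending(number)
}

// sealNext seals head+1 but never deletes its own entry (the bug).
func (m *toyMiner) sealNext() uint64 {
	n := m.head + 1
	m.pendingTasks[n] = struct{}{}
	m.onNewHead(n) // the sealed block becomes the head; the entry survives
	return n
}

// tryCommit is called when a span rotation makes this node the producer.
// It refuses to commit new work while any task is still pending.
func (m *toyMiner) tryCommit() bool {
	if len(m.pendingTasks) > 0 {
		return false
	}
	m.sealNext()
	return true
}

func main() {
	healthy := newToyMiner(100)
	healthy.sealNext()     // seals 101; its entry lingers
	healthy.onNewHead(108) // other producers advance the chain, cleanup runs
	fmt.Println("healthy chain, commit on rotation:", healthy.tryCommit())

	stalled := newToyMiner(100)
	stalled.sealNext() // seals 101, then the chain stops advancing
	for i := 1; i <= 3; i++ {
		fmt.Printf("stalled chain, rotation %d commit: %v\n", i, stalled.tryCommit())
	}
}
```

In this model the healthy miner commits (`true`) because later head events clear the leftover entry, while the stalled miner returns `false` on every rotation: with no new head there is no cleanup, so the leftover entry blocks new work indefinitely.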

Resolution and recovery

  • The network recovered when BP5's Heimdall synced, allowing Bor to import the previously rejected head block, catch up to the tip, and resume block production.

Corrective actions

A new, stable version of Bor was rolled out on Amoy over the weekend and monitored by the team. It has since been rolled out to Mainnet as a precautionary step.

Code fixes (Bor)

  1. Clean up pending sealing tasks
     • Delete the relevant `pendingTasks` entry after successful sealing/head update.
     • Ensure deletion also happens on error paths (e.g., if writing the block fails).
     • Clear obsolete tasks at/below the sealed block number after success.
  2. Set Bor’s stale task threshold appropriately
     • Bor doesn’t use uncle-block semantics (unlike PoW-era Ethereum miner logic), so stale tasks can be discarded immediately.
     • Set `staleThreshold = 0` for Bor to avoid retaining unnecessary backlog.
Posted Dec 15, 2025 - 13:30 UTC

Resolved

Block production has resumed as of 10:25 AM UTC.
The network is stable and operational.
Posted Dec 10, 2025 - 10:26 UTC

Investigating

At 9:44 AM UTC, block production halted due to an issue with Bor on Amoy Testnet.

The chain is currently paused. Our engineers are investigating and working on a resolution.

There is no impact on Mainnet.
Posted Dec 10, 2025 - 10:10 UTC