Operational Defect Database

BugZero found this defect 59 days ago.

MongoDB | 2615893

Namespace may become unavailable while sharding collection during downgrade from v7.2 to v7.0

Last update date:

3/21/2024

Affected products:

MongoDB Server

Affected releases:

7.2.0

Fixed releases:

7.3 Required

Description:

Info

SERVER-81353 changed the create coordinator (used for shardCollection). The changes were released in v7.2.0 and can leave a namespace unavailable when downgrading from v7.2.x to v7.0. The bug is very unlikely to be hit, because a very specific interleaving must occur during the downgrade.

Scenario (downgrade in progress, cluster running a mix of binary versions):

1. A shard running v7.2 receives a shardCollection request and spawns a create coordinator.
2. An error occurs after the critical section is acquired. This results in a call to triggerCleanup, which persists the abort reason on the coordinator document.
3. The shard primary steps down before the coordinator can execute the _cleanupOnAbort procedure introduced by SERVER-81353.
4. A new shard primary running v7.0 is elected.
5. The new shard primary resumes the coordinator and executes _cleanupOnAbort, which is the default implementation because the create coordinator does not override that method in v7.0.
6. The coordinator finishes: DDL locks are released, but the recoverable critical section remains held indefinitely.

Consequences: CRUD operations, and DDL operations other than shardCollection, cannot run on the namespace.

Solution: run the shardCollection command again with the original options. This spawns a new coordinator that reuses the existing critical section and runs again, eventually clearing the state whether it succeeds or fails.
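The workaround of re-running shardCollection with the original options could be scripted roughly as follows. This is a minimal sketch, not an official procedure. The helper names `build_shard_collection_command` and `rerun_shard_collection` are hypothetical, and the code assumes the caller already knows the original namespace, shard key, and options, and passes a client object that exposes `admin.command(...)` (as a PyMongo `MongoClient` connected to a mongos does).

```python
from typing import Any, Mapping, Optional


def build_shard_collection_command(
    namespace: str,
    key: Mapping[str, Any],
    unique: bool = False,
    extra_options: Optional[Mapping[str, Any]] = None,
) -> dict:
    """Build a shardCollection command document for `namespace` ("db.collection").

    Hypothetical helper: the values passed here must match the options of the
    original shardCollection request, as the workaround requires.
    """
    db_name, _, coll_name = namespace.partition(".")
    if not db_name or not coll_name:
        raise ValueError("namespace must have the form 'db.collection'")

    # The command name must be the first key; dicts preserve insertion order.
    command = {"shardCollection": namespace, "key": dict(key)}
    if unique:
        command["unique"] = True
    if extra_options:
        # Any other options from the original request (assumption: the
        # caller has recorded them somewhere).
        command.update(extra_options)
    return command


def rerun_shard_collection(client: Any, command: Mapping[str, Any]) -> Any:
    """Send the command to the admin database and return the server reply.

    `client` is assumed to behave like a PyMongo MongoClient connected
    to a mongos.
    """
    return client.admin.command(dict(command))
```

With PyMongo, `client` would typically be a `MongoClient` pointed at a mongos, and the namespace and shard key would be whatever the original request used. Keeping command construction separate from execution makes it possible to review the exact command document before sending it to the cluster.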


Status

Open
