Operational Defect Database

BugZero found this defect 54 days ago.

MongoDB | 2623652

Reduce reshardingTxnClonerProgressBatchSize in Stepdown Suites

Last update date:

3/26/2024

Affected products:

MongoDB Server

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

By default, the resharding transaction cloner persists its progress only once every 1000 entries. In stepdown suites, a failover is triggered every 8 seconds. On very slow variants (e.g. tsan debug), the cloner may not process enough entries to reach a checkpoint before the next failover occurs, so it resumes from the same point each time and never makes progress (see BF-32013 for an example of this in practice). We should reduce reshardingTxnClonerProgressBatchSize to 1 in stepdown suites so that the cloner is guaranteed to make progress even when the system is very slow.
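
The interaction between batch size and failover interval can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not MongoDB's actual cloner logic. It assumes the cloner processes entries at a constant rate, persists its position only after each full batch (or on completion), and loses all unpersisted work at every failover. The function name `simulate_cloner` and the rate of 100 entries per second are hypothetical values chosen for illustration; only the batch sizes (1000 and 1) and the 8-second failover interval come from the description above.

```python
from typing import Optional


def simulate_cloner(
    total_entries: int,
    batch_size: int,
    entries_per_second: float,
    failover_interval_s: float,
) -> Optional[int]:
    """Return how many failover intervals the cloner needs to finish,
    or None if it can never get past its last persisted position.

    Simplified model (assumption): progress is persisted only after each
    full batch or on completion, and unpersisted work is lost on failover.
    """
    persisted = 0  # last checkpointed position; survives a failover
    # Entries the cloner can process between two consecutive failovers.
    per_interval = int(entries_per_second * failover_interval_s)
    intervals = 0

    while persisted < total_entries:
        intervals += 1
        reachable = min(persisted + per_interval, total_entries)
        if reachable == total_entries:
            # The cloner finished before the next failover.
            persisted = total_entries
            continue
        # Only whole batches are checkpointed before the failover hits.
        full_batches = (reachable - persisted) // batch_size
        if full_batches == 0:
            # No checkpoint reached: every interval restarts from the same
            # position, so the cloner is stuck forever.
            return None
        persisted += full_batches * batch_size

    return intervals


if __name__ == "__main__":
    # Hypothetical slow variant: 100 entries/s, failover every 8 s.
    print(simulate_cloner(5000, 1000, 100, 8))  # default batch size
    print(simulate_cloner(5000, 1, 100, 8))     # proposed batch size
```

In this model, a slow variant that handles 800 entries per failover interval never reaches the 1000-entry checkpoint, whereas a batch size of 1 persists every entry and completes after a bounded number of intervals.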

Status

Open
