BugZero found this defect 33 days ago.
4/16/2024
MongoDB Server
4.2.12
No fixed releases provided.
We have a PSA replica set; each data-bearing node has 32 cores, 64 GB of memory, and a 3 TB SSD. It has been running fine for over two years, but recently, as the data size keeps growing, we ran into a weird problem twice in one month: when high traffic occurred, the primary's CPU usage (we use the primaryPreferred read preference) first went up to around 90%, then dropped below 50%, and all queries slowed down after the drop. We have examined sysctl parameters, ulimit settings, filesystem configuration (XFS, no THP), WiredTiger cache usage (around 80%), disk limits (throughput and IOPS), WiredTiger cache dirty percentage (around 5%), etc., but couldn't figure out the rationale behind the stall. Please help confirm whether this is a bug, or give us a clue about what we are doing wrong. See the attachments for the related FTDC files. We know version 4.2.12 has reached EoL, so apologies in advance if this issue is inappropriate. Many thanks!
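For readers who want to check the same two cache figures on their own deployment, here is a minimal sketch of how they can be derived from a `serverStatus` document. It assumes the document has already been fetched (for example with `db.serverStatus()`) and is passed in as a plain dictionary. The function name `cache_percentages` and the sample values are hypothetical and chosen only to resemble the numbers in this report; they are not taken from the attached FTDC files.

```python
def cache_percentages(server_status):
    """Return (used_pct, dirty_pct) for the WiredTiger cache.

    `server_status` is a dict shaped like the output of the serverStatus
    command; only the wiredTiger.cache sub-document is read.
    """
    cache = server_status["wiredTiger"]["cache"]
    max_bytes = cache["maximum bytes configured"]
    used_pct = 100.0 * cache["bytes currently in the cache"] / max_bytes
    dirty_pct = 100.0 * cache["tracked dirty bytes in the cache"] / max_bytes
    return used_pct, dirty_pct


# Hypothetical sample: a 30 GiB cache that is 80% full with 5% dirty data.
sample_status = {
    "wiredTiger": {
        "cache": {
            "maximum bytes configured": 30 * 1024**3,
            "bytes currently in the cache": 25769803776,
            "tracked dirty bytes in the cache": 1610612736,
        }
    }
}

used, dirty = cache_percentages(sample_status)
print(f"cache used: {used:.1f}%, dirty: {dirty:.1f}%")
```

Sampling these values over time during a traffic spike, rather than reading them once, would show whether the cache fill or the dirty ratio moves before the CPU drop.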
We couldn't find stable reproduction steps.