BugZero found this defect 194 days ago.
3/15/2024
MongoDB Server
5.0.0
6.0.0
7.0.0
7.1.0
No fixed releases provided.
In all versions prior to 7.2, when an aggregation with $lookup reads user data located on the local shard, the server runs a router loop that attempts the aggregation locally up to 10 times, hoping that at least one attempt succeeds. The local access checks the local filtering metadata; if the metadata is not installed yet, the collection access returns StaleConfig. Retrying is usually fine, because StaleConfig is a transient error that only requires a refresh on the shard side. However, because the access is local, the filtering metadata is not refreshed until the error propagates back to the entry point, which performs the refresh and obtains the filtering metadata. This happens only after all 10 attempts have failed, whereas the loop could simply fail on the first attempt in case of StaleConfig.

In 7.2 this issue was unintentionally fixed by SERVER-74816: https://github.com/10gen/mongo/blob/ba27121ae83e40362e418f7f4b0f88ef79977765/src/mongo/db/pipeline/sharded_agg_helpers.cpp#L1822-L1862

The goal of this ticket is to backport that specific change as far back as 5.0.
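The difference between the two behaviors can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's code: the names `LocalShard`, `router_loop`, `entry_point`, and `fail_fast_on_stale_config` are hypothetical, and the real logic lives in the C++ file linked above. It assumes the entry point refreshes the metadata once and then re-runs the loop.

```python
class StaleConfig(Exception):
    """Stand-in for the server's StaleConfig error (toy model only)."""


MAX_ATTEMPTS = 10  # the ticket says the router loop tries 10 times


class LocalShard:
    """Hypothetical shard whose filtering metadata is missing until refreshed."""

    def __init__(self):
        self.metadata_installed = False
        self.local_attempts = 0  # counts every local aggregation attempt

    def run_local_aggregation(self):
        self.local_attempts += 1
        if not self.metadata_installed:
            raise StaleConfig("filtering metadata not installed")
        return "ok"

    def refresh_metadata(self):
        self.metadata_installed = True


def router_loop(shard, fail_fast_on_stale_config):
    """Retry the local read; optionally propagate StaleConfig immediately."""
    last_error = None
    for _ in range(MAX_ATTEMPTS):
        try:
            return shard.run_local_aggregation()
        except StaleConfig as err:
            last_error = err
            if fail_fast_on_stale_config:
                raise  # let the entry point refresh right away
    raise last_error  # all attempts failed without any refresh


def entry_point(shard, fail_fast_on_stale_config):
    """Assumed entry point: on StaleConfig, refresh, then run the loop again."""
    try:
        return router_loop(shard, fail_fast_on_stale_config)
    except StaleConfig:
        shard.refresh_metadata()
        return router_loop(shard, fail_fast_on_stale_config)
```

In this model, `entry_point(LocalShard(), False)` makes 11 local attempts before succeeding (10 wasted retries plus one after the refresh), while `entry_point(LocalShard(), True)` makes only 2.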
JIRAUSER1257318 commented on Tue, 5 Dec 2023 13:15:48 +0000: One idea is to wrap that local read in a shard-role retry loop (proposed in SERVER-77402).