Operational Defect Database

BugZero found this defect 39 days ago.

MongoDB | 2638353

$lookup pipeline ignores index when a let clause variable is NULL

Last update date:

4/11/2024

Affected products:

MongoDB Server

Affected releases:

7.2.2

Fixed releases:

No fixed releases provided.

Description:

Info

Let's say we have two collections, widgets and features Every widgets record has two fields primary_feature and secondary_feature, which store some natural id (assume all numeric and indexed). Primary one is guaranteed to be non-NULL, while the secondary one sometimes/ often could be NULL or undefined or unset We need to perform a $lookup aggregation on widgets to pull up all relevant features, filtered/ transformed by certain criteria in subsequent pipeline stages. To do that we use a pipeline clause with a let variable for each of  primary_feature and secondary_feature While  primary_feature performs properly by using index, records where secondary_feature equals NULL trigger a colscan. This makes this $lookup pipeline unusable for all practical purposes We also found that if we are not using the pipeline clause and instead rely on localField/ foreignField clauses the indices are used properly even when secondary_feature equals NULL. We are using this workaround for now, but this leads to complicated/ fragile $filter/ $map operations for data clean up, and might not be viable at all for certain data sets Please fix the $lookup pipelines to always use available indices even when a let variable is NULL

Top User Comments

JIRAUSER1277026 commented on Thu, 11 Apr 2024 06:16:39 +0000: chris.kelly@mongodb.com thanks for the quick reply and linking the relevant issue. I tried the workaround proposed over there and it worked with a certain modification Posting my workaround tweak into the main issue. We can close this one JIRAUSER1265262 commented on Thu, 11 Apr 2024 01:07:17 +0000: Hi dk@targetchoice.com, Thanks for your report. In order to get more detail, it would be helpful to start with an example aggregation that reproduces your issue, along with a sample document. Specifically, this will help de-dupe against known limitations of $lookup such as SERVER-40362 (which it sounds like you're describing). There is currently a distinction between "null" and "missing" in regard to optimization. If it's that, I would recommend watching that ticket for further work here. Chris

Steps to Reproduce

See description

Additional Resources / Links

Share:

BugZero® Risk Score

What's this?

Coming soon

Status

Investigating

Learn More

Search:

...