Operational Defect Database

BugZero found this defect 2488 days ago.

MongoDB | 381319

[SERVER-29098] ShardingTaskExecutor::scheduleRemoteCommand should not run passed callback before processing metadata

Last update date:


Affected products:

MongoDB Server

Affected releases:

No affected releases provided.

Fixed releases:




Also add comment in task_executor.h that implementations for scheduleWork/scheduleWorkAt/scheduleRemoteCommand should ensure that the passed callback is the last thing that gets called (no post processing directly related to the callback request). orig title: AsyncRequestsSender::_ready should wait for callback to finish running before returning data orig description: Currently, it just clears the handle from the executor and never explicitly waits for it to finish. It looks like AsyncResultsMerger may also have the same issue.

Top User Comments

xgen-internal-githook commented on Wed, 24 May 2017 17:03:44 +0000: Author: {u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'} Message: SERVER-29098 ShardingTaskExecutor::scheduleRemoteCommand should not run passed callback before processing metadata Branch: master https://github.com/mongodb/mongo/commit/e9435bd2d63150d92cedfe19448c78e733036891 esha.maharishi@10gen.com commented on Mon, 15 May 2017 16:16:12 +0000: If we don't fix this at the ShardingTaskExecutor layer by moving the passed-in callback to after metadata updates, we'll have to make sure every use of ShardingTaskExecutor::scheduleRemoteCommand waits for the "wrapper" callback to run. I think this is more difficult to do and maintain, since it's different than the way we use the TaskExecutor interface, but it's often not immediately obvious at a call site whether TaskExecutor or ShardingTaskExecutor is being used, even in sharding code. esha.maharishi@10gen.com commented on Mon, 15 May 2017 16:12:59 +0000: Hm, another way of looking at this: ShardingTaskExecutor's scheduleRemoteCommand method is a thin wrapper around TaskExecutor::scheduleRemoteCommand. The wrapper's job is to perform metadata updates related to the operation (e.g., update clusterTime, update getLastError stats). However, the wrapper calls the passed-in callback before doing these metadata updates. Most callers assume that the callback they pass in to scheduleRemoteCommand is the last thing that the TaskExecutor will run. Thus, the caller may return to the client before the metadata updates, which will be still executing in parallel inside the TaskExecutor, have run! This is problematic, because the metadata updates rely on there being an OperationContext, but if the response has been returned to the client, the associated OperationContext is gone or invalid! So, I think a good fix would be for ShardingTaskExecutor::scheduleRemoteCommand to call the passed-in callback after doing the metadata updates. This ensures that metadata updates will not "lose" the OperationContext, and we still guarantee to callers that once their callback has completed, the TaskExecutor is not executing any more work, including metadata updates, for their operation.

Additional Resources / Links


BugZero Risk Score

Coming soon



Have you been affected by this bug?


Do you know how much operational outages are costing you?

Understand the cost to your business and how BugZero can help you reduce those costs.


Login to read and write comments.

Have you ever...

had your data corrupted from a