Operational Defect Database

BugZero found this defect 2537 days ago.

MongoDB | 365731

[SERVER-28377] Do not check that remote last applied is ahead of local last fetched in OplogFetcher first batch during initial sync

Last update date:

9/7/2017

Affected products:

MongoDB Server

Affected releases:

No affected releases provided.

Fixed releases:

3.4.4

3.5.6

Description:

Info

There is a race in which the sync source's most recent oplog entries can become visible to, and be read by, downstream nodes before the sync source updates its heartbeat map with its new last applied OpTime. As a result, downstream nodes can receive a stale lastAppliedOpTime in the metadata, which is a problem for OplogFetcher::checkRemoteOplogStart.

The only effect is that the OplogFetcher returns early and chooses a new sync source, so it should not cause harm beyond unnecessary sync source changes and some very quick initial sync restarts.

We should remove the check that the remote last applied OpTime is greater than or equal to the local last fetched OpTime in OplogFetcher::checkRemoteOplogStart when "requireFresherSyncSource" is false. This will also require changing the comments that explain the boolean's meaning.

An alternative is to use the max of the metadata lastOpApplied and the last OpTime in the batch as the remote last applied OpTime in OplogFetcher::checkRemoteOplogStart.
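The alternative described above can be illustrated with a minimal sketch. This is not MongoDB's implementation (the server is written in C++); it only models the comparison. The `OpTime` tuple, the function names, and the `use_batch_optime` flag are hypothetical, and ordering OpTimes by (term, timestamp) is a simplifying assumption.

```python
from typing import NamedTuple, Sequence


class OpTime(NamedTuple):
    """Simplified stand-in for a replication OpTime (assumption: ordered by term, then timestamp)."""
    term: int
    timestamp: int


def effective_remote_last_applied(
    metadata_last_applied: OpTime, first_batch: Sequence[OpTime]
) -> OpTime:
    """Return the max of the metadata value and the last OpTime in the first batch.

    Oplog entries arrive in increasing order, so the last entry of the batch
    is the newest one the sync source has demonstrably written.
    """
    if not first_batch:
        return metadata_last_applied
    return max(metadata_last_applied, first_batch[-1])


def remote_is_not_behind(
    local_last_fetched: OpTime,
    metadata_last_applied: OpTime,
    first_batch: Sequence[OpTime],
    use_batch_optime: bool,
) -> bool:
    """Model of the 'remote last applied >= local last fetched' check."""
    remote = (
        effective_remote_last_applied(metadata_last_applied, first_batch)
        if use_batch_optime
        else metadata_last_applied
    )
    return remote >= local_last_fetched


# The race: the batch already contains entries up to timestamp 101, but the
# metadata still reports a last applied OpTime of 99.
local_last_fetched = OpTime(term=1, timestamp=100)
stale_metadata = OpTime(term=1, timestamp=99)
first_batch = [OpTime(1, 100), OpTime(1, 101)]

rejected = not remote_is_not_behind(local_last_fetched, stale_metadata, first_batch, False)
accepted = remote_is_not_behind(local_last_fetched, stale_metadata, first_batch, True)
```

With the stale metadata alone, the sync source looks behind the local node and would be rejected. Taking the max with the last OpTime in the batch makes the same source pass the check.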

Top User Comments

xgen-internal-githook commented on Thu, 6 Apr 2017 15:26:16 +0000:
Author: Matthew Russotto (username: mtrussotto, email: matthew.russotto@10gen.com)
Message: SERVER-28377 If first batch of OplogFetcher has a document ahead of the remote last applied from heartbeat, use the document's time instead. (cherry picked from commit 925e245ca4cb59fdec3c008097df612fd48ae00a)
Branch: v3.4
https://github.com/mongodb/mongo/commit/2a5996b761a64787593192b6413bb774bea06da0

xgen-internal-githook commented on Mon, 3 Apr 2017 23:55:41 +0000:
Author: Matthew Russotto (username: mtrussotto, email: matthew.russotto@10gen.com)
Message: SERVER-28377 If first batch of OplogFetcher has a document ahead of the remote last applied from heartbeat, use the document's time instead.
Branch: master
https://github.com/mongodb/mongo/commit/925e245ca4cb59fdec3c008097df612fd48ae00a

Status

Closed
