Operational Defect Database

BugZero found this defect 17 days ago.

MongoDB | 2660817

RecordIds can be reused

Last update date:

5/2/2024

Affected products:

MongoDB Server

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Info

Steady state replication case

Suppose the following sequence occurs on the primary, which assigns recordIds as inserts come in:

1. Insert {_id: 1}. Oplog entry: {op: "i", _id: 1, rid: 1}
2. Delete {_id: 1}. Oplog entry: {op: "d", _id: 1, rid: 1}
3. Kill and restart the primary.
4. Insert {_id: 2}. The primary checks on disk for the highest recordId, which is currently 0 because no documents exist. It therefore uses recordId(1), creating a new oplog entry: {op: "i", _id: 2, rid: 1}

Now, if the secondary tries to apply these entries as part of a batch, the batch may contain:

[
  {op: "i", _id: 1, rid: 1},
  {op: "d", _id: 1, rid: 1},
  {op: "i", _id: 2, rid: 1},
]

The secondary assigns oplog entries to the applier threads based on the hash of the _id. Therefore it's possible that:

Applier thread 1 gets: [ {op: "i", _id: 1, rid: 1}, {op: "d", _id: 1, rid: 1} ]
Applier thread 2 gets: [ {op: "i", _id: 2, rid: 1} ]

As a result, the threads can interleave in a way that leaves us with data corruption (if applier thread 1 deletes the document with recordId(1) after applier thread 2 has inserted its document at the same recordId).

Initial sync case

The reuse of recordIds due to restart is problematic even when the writes don't appear in the same batch. Let's say we have a primary -> secondary -> initial syncing node chain. The primary generates oplog entries:

[
  ts: 1 -> {op: "i", _id: 1, rid: 1},
  ts: 2 -> {op: "d", _id: 1, rid: 1},
  ...
  // recordId reuse due to restart of primary:
  ts: 10 -> {op: "i", _id: 2, rid: 1},
]

Initial sync starts at ts: 1. However, by the time collection cloning actually starts, the collection only contains the insert from ts: 10, i.e. the document {_id: 2} with recordId(1). After the collection cloning phase of initial sync has completed, we replay oplog entries. But the oplog entry at ts: 1 also writes to recordId(1), although for a different document!

Solutions

See comments
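The steady state replication case described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not MongoDB code: the names ToyPrimary and apply_entry are hypothetical, the secondary is modelled as a plain dictionary keyed by recordId, and the split across applier threads is done by _id directly as a stand-in for hashing. It replays one of the possible bad interleavings deterministically rather than using real threads.

```python
# Toy model only: none of these names are MongoDB APIs.

class ToyPrimary:
    """Assigns recordIds as (highest recordId on disk) + 1, recomputed on restart."""

    def __init__(self):
        self.disk = {}     # recordId -> document
        self.oplog = []    # oplog entries in generation order
        self.next_rid = 1

    def insert(self, doc_id):
        rid = self.next_rid
        self.next_rid += 1
        self.disk[rid] = {"_id": doc_id}
        self.oplog.append({"op": "i", "_id": doc_id, "rid": rid})

    def delete(self, doc_id):
        rid = next(r for r, d in self.disk.items() if d["_id"] == doc_id)
        del self.disk[rid]
        self.oplog.append({"op": "d", "_id": doc_id, "rid": rid})

    def restart(self):
        # The in-memory counter is lost; an empty collection yields highest rid 0.
        self.next_rid = max(self.disk, default=0) + 1


def apply_entry(store, entry):
    """Apply one oplog entry to a store addressed purely by recordId."""
    if entry["op"] == "i":
        store[entry["rid"]] = {"_id": entry["_id"]}
    else:
        store.pop(entry["rid"], None)


primary = ToyPrimary()
primary.insert(1)
primary.delete(1)
primary.restart()
primary.insert(2)   # reuses recordId 1

# Split the batch across two applier threads by _id (stand-in for hashing).
thread1 = [e for e in primary.oplog if e["_id"] == 1]
thread2 = [e for e in primary.oplog if e["_id"] == 2]

# One bad interleaving: thread 2's insert lands before thread 1's delete.
secondary = {}
for entry in (thread1[0], thread2[0], thread1[1]):
    apply_entry(secondary, entry)

# Applying the same entries strictly in oplog order, for comparison.
in_order = {}
for entry in primary.oplog:
    apply_entry(in_order, entry)

print(primary.disk)  # {1: {'_id': 2}}
print(in_order)      # {1: {'_id': 2}}
print(secondary)     # {}  (document {_id: 2} was lost)
```

In this model the delete for {_id: 1} removes whatever currently sits at recordId(1), so the later insert of {_id: 2} is silently discarded on the secondary even though each thread applied its own entries in order.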

Status

Backlog
