Operational Defect Database

BugZero found this defect 52 days ago.

MongoDB | 2626045

[v7.0] Writes in transactions in sharded clusters may not conflict with collection drop and rename, violating snapshot isolation

Last update date:

3/28/2024

Affected products:

MongoDB Server

Affected releases:

7.0.0

Fixed releases:

No fixed releases provided.

Description:

Info

Collection acquisitions from the shard role API (7.1 and later) call CollectionCatalog::establishConsistentCollection() both when the namespace is being used for writes and when it is being used for reads. In 7.0, CollectionCatalog::establishConsistentCollection() is only used for reads. This means that if the collection doesn't exist at the wall-clock time at which the transaction begins on a shard, but did exist at the read timestamp of the transaction, then attempting to write to the collection namespace won't error.

This leads to violations of snapshot isolation in scenarios similar to SERVER-84760 but where writes (e.g. findAndModify) are impacted. If the collection is read within the transaction prior to attempting to write to it, then the write will correctly error when interleaved with a drop or rename.

In 6.0 and earlier, attempting to read or write to a collection which was re-created would cause the transaction to correctly error.

https://github.com/mongodb/mongo/blob/r6.0.14/src/mongo/db/db_raii.cpp#L548-L557
https://github.com/mongodb/mongo/blob/r6.0.14/src/mongo/db/catalog_raii.cpp#L82-L99
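The difference between the read and write paths described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not MongoDB's implementation: ToyCatalog, acquire() and the check_consistency flag are hypothetical names invented for this example, timestamps are plain integers, and catalog entries are assumed to be recorded in increasing timestamp order. The flag stands in for whether a consistency check in the spirit of CollectionCatalog::establishConsistentCollection() is applied to the acquisition.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


class SnapshotUnavailable(Exception):
    """Toy stand-in for the server's SnapshotUnavailable error (code 246)."""


@dataclass
class ToyCatalog:
    # Hypothetical model: namespace -> [(timestamp, collection id or None)].
    # None marks a drop or a rename away from the namespace.
    # Entries are assumed to be appended in increasing timestamp order.
    history: Dict[str, List[Tuple[int, Optional[str]]]] = field(default_factory=dict)

    def record(self, ns: str, ts: int, coll_id: Optional[str]) -> None:
        self.history.setdefault(ns, []).append((ts, coll_id))

    def at(self, ns: str, ts: int) -> Optional[str]:
        """Collection id visible at timestamp ts, or None if absent."""
        visible = None
        for entry_ts, coll_id in self.history.get(ns, []):
            if entry_ts <= ts:
                visible = coll_id
        return visible

    def latest(self, ns: str) -> Optional[str]:
        """Collection id in the most recent catalog state, or None."""
        entries = self.history.get(ns, [])
        return entries[-1][1] if entries else None


def acquire(catalog: ToyCatalog, ns: str, read_ts: int,
            check_consistency: bool) -> Optional[str]:
    """Acquire ns for a transaction whose snapshot is at read_ts.

    With check_consistency=True the acquisition fails if the namespace
    changed between read_ts and the latest catalog state. With False
    (modelling the 7.0 write path as described in this ticket) the
    latest state is returned without any comparison.
    """
    latest = catalog.latest(ns)
    if check_consistency and catalog.at(ns, read_ts) != latest:
        raise SnapshotUnavailable(
            f"Collection {ns} has undergone a catalog change"
        )
    return latest


# Scenario from the description: the collection exists at the read
# timestamp (15) but has been dropped or renamed by the time of the write.
catalog = ToyCatalog()
catalog.record("test.secondcoll", 10, "uuid-A")
catalog.record("test.secondcoll", 20, None)

# Unchecked write path: no error, the write sees "no collection".
unchecked = acquire(catalog, "test.secondcoll", 15, check_consistency=False)

# Checked path: the catalog change is detected and the transaction errors.
try:
    acquire(catalog, "test.secondcoll", 15, check_consistency=True)
    checked_raised = False
except SnapshotUnavailable:
    checked_raised = True
```

In this model the unchecked acquisition silently proceeds against a catalog state that differs from the transaction's snapshot, which is the kind of snapshot isolation violation the ticket describes for writes.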

Top User Comments

max.hirschhorn@10gen.com commented on Thu, 28 Mar 2024 22:39:36 +0000:

It looks like I may have been too eager in filing this ticket. I retested repro_snapshot_txn_readtimestamp_conflicts_with_rename.js against the latest version of 7.0 and see the changes from SERVER-84723 causing the transaction to abort with a SnapshotUnavailable error response when the transaction attempts its first write on the now-renamed collection. I'm going to leave this ticket open for the Catalog & Routing team to confirm.

[jsTest] "ok" : 0,
[jsTest] "errmsg" : "Transaction 8ca1bbdb-3a39-43bc-9295-08c35c247789 - 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= - - :0 was aborted on statement 2 due to: a non-retryable snapshot error :: caused by :: findAndModify :: caused by :: Collection test.secondcoll has undergone a catalog change and no longer satisfies the requirements for the current transaction.",
[jsTest] "code" : 246,
[jsTest] "codeName" : "SnapshotUnavailable",
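A test or client that wants to confirm this behaviour could check a command response for the error shown above. The following is a minimal sketch, assuming the response has already been parsed into a plain dictionary with the same "ok", "code" and "codeName" fields as the log output; is_snapshot_unavailable() is a hypothetical helper, not part of any MongoDB driver.

```python
# Error code and name taken from the log output in the comment above.
SNAPSHOT_UNAVAILABLE_CODE = 246
SNAPSHOT_UNAVAILABLE_NAME = "SnapshotUnavailable"


def is_snapshot_unavailable(response: dict) -> bool:
    """Return True if a parsed command response reports SnapshotUnavailable.

    Hypothetical helper: assumes `response` is a dict carrying the
    "ok", "code" and "codeName" fields seen in the logged error.
    """
    if response.get("ok") != 0:
        return False
    return (
        response.get("code") == SNAPSHOT_UNAVAILABLE_CODE
        or response.get("codeName") == SNAPSHOT_UNAVAILABLE_NAME
    )


# Response shaped like the one in the log (errmsg shortened for the example).
logged = {
    "ok": 0,
    "errmsg": "Collection test.secondcoll has undergone a catalog change",
    "code": 246,
    "codeName": "SnapshotUnavailable",
}
aborted_as_expected = is_snapshot_unavailable(logged)
```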

Steps to Reproduce



Status

Needs Scheduling
