Operational Defect Database

MongoDB | 391696

[SERVER-29517] Data race with ViewGraph::_idCounter can corrupt the in-memory ViewGraph

Last update date:

10/30/2023

Affected products:

MongoDB Server

Affected releases:

3.4.4

Fixed releases:

3.4.5

3.5.9

Description:

Info

The ViewGraph is an in-memory directed acyclic graph in which nodes represent view definitions and edges represent "view-on" relationships. It assigns a unique unsigned 64-bit number to each node in the graph, using ViewGraph::_idCounter: https://github.com/mongodb/mongo/blob/6f7fd7318d61bd145bb75a9a0a5d35387d2a6b9f/src/mongo/db/views/view_graph.h#L182

The intention is that the ViewCatalog's mutex prevents concurrent access to this counter. However, _idCounter is a static data member. There is one ViewCatalog per database, each owning and synchronizing access to a separate ViewGraph instance. Because _idCounter is static, all ViewGraph instances share the same counter, so threads operating on the ViewGraphs of different databases, each holding a different mutex, can access the counter simultaneously. This can lead to the assignment of invalid node ids, which in turn corrupts the in-memory graph. We have seen this manifest as a process-fatal invariant failure, or as an unexpectedly failed view catalog operation (e.g. a view drop, modify, or create).
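
The locking pattern described above can be illustrated with a minimal sketch. It is not the MongoDB source: SketchViewGraph, SketchViewCatalog, addView, allocateNodeId and idsStayUniquePerCatalog are hypothetical names chosen for illustration. The sketch shows the fixed shape, in which each graph has its own counter and is therefore fully protected by its owning catalog's mutex. A comment marks where a static counter would break that assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for ViewGraph (not the real MongoDB class).
class SketchViewGraph {
public:
    // Not thread-safe on its own: the caller must hold the owning catalog's mutex.
    std::uint64_t allocateNodeId() { return _idCounter++; }

private:
    // Buggy shape described in the report: `static std::uint64_t _idCounter;`
    // would be a single counter shared by every graph, so holding one catalog's
    // mutex would not exclude threads working under another catalog's mutex.
    std::uint64_t _idCounter = 0;  // per-instance counter, as in the fix
};

// Hypothetical stand-in for ViewCatalog: one per database, owning one graph.
class SketchViewCatalog {
public:
    std::uint64_t addView() {
        std::lock_guard<std::mutex> lk(_mutex);
        const std::uint64_t id = _graph.allocateNodeId();
        _assignedIds.push_back(id);
        return id;
    }

    std::vector<std::uint64_t> assignedIds() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _assignedIds;
    }

private:
    std::mutex _mutex;
    SketchViewGraph _graph;
    std::vector<std::uint64_t> _assignedIds;
};

// Hammers several catalogs from several threads each, then checks that every
// catalog handed out exactly the ids 0..N-1 with no duplicates.
bool idsStayUniquePerCatalog(std::size_t numCatalogs,
                             std::size_t threadsPerCatalog,
                             std::size_t viewsPerThread) {
    std::vector<SketchViewCatalog> catalogs(numCatalogs);
    std::vector<std::thread> workers;
    for (auto& catalog : catalogs) {
        for (std::size_t t = 0; t < threadsPerCatalog; ++t) {
            workers.emplace_back([&catalog, viewsPerThread] {
                for (std::size_t i = 0; i < viewsPerThread; ++i) {
                    catalog.addView();
                }
            });
        }
    }
    for (auto& worker : workers) worker.join();

    const std::size_t expected = threadsPerCatalog * viewsPerThread;
    for (auto& catalog : catalogs) {
        const auto ids = catalog.assignedIds();
        const std::set<std::uint64_t> unique(ids.begin(), ids.end());
        if (ids.size() != expected || unique.size() != expected) return false;
        if (expected > 0 && *unique.rbegin() != expected - 1) return false;
    }
    return true;
}

// Example entry point for the sketch; call it from main().
void runSketch() { assert(idsStayUniquePerCatalog(4, 4, 1000)); }
```

With the per-instance counter, every access to a given counter happens under the same mutex, so the check holds. With a static counter, threads holding different catalogs' mutexes would perform unsynchronized read-modify-write operations on one variable. That is a data race, and therefore undefined behavior in C++.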

Top User Comments

xgen-internal-githook commented on Thu, 8 Jun 2017 16:29:56 +0000:
Author: David Storch (username: dstorch, email: david.storch@10gen.com)
Message: SERVER-29517 Fix data race by making ViewGraph::_idCounter non-static. (cherry picked from commit e2376ccbb43d3fb2579995a55ebf82f7c16fcb4f)
Branch: v3.4
https://github.com/mongodb/mongo/commit/520b8f3092c48d934f0cd78ab5f40fe594f96863

david.storch commented on Thu, 8 Jun 2017 16:16:20 +0000:
The invariant failure associated with this problem that we've observed in testing looks like this:
[MongoDFixture:job0] 2017-06-07T21:20:45.163+0000 I - [conn1516] Invariant failure node->children.empty() src/mongo/db/views/view_graph.cpp 130
...
[MongoDFixture:job0]
[MongoDFixture:job0] ***aborting after invariant() failure
[MongoDFixture:job0]
[MongoDFixture:job0]
[MongoDFixture:job0] 2017-06-07T21:20:45.166+0000 I COMMAND [conn1580] CMD: drop db167.view_catalog_70
[MongoDFixture:job0] 2017-06-07T21:20:45.168+0000 I COMMAND [conn1572] command db163.coll163 appName: "MongoDB Shell" command: find { find: "coll163", filter: { x: 259.0, tid: 25.0 } } planSummary: IXSCAN { tid: 1 } keysExamined:1200 docsExamined:1200 cursorExhausted:1 numYields:0 nreturned:1 reslen:135 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 267304 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 272ms
[MongoDFixture:job0] 2017-06-07T21:20:45.168+0000 I COMMAND [conn1568] command db163.coll163 appName: "MongoDB Shell" command: find { find: "coll163", filter: { x: 80.0, tid: 27.0 } } planSummary: IXSCAN { tid: 1 } keysExamined:1200 docsExamined:1200 cursorExhausted:1 numYields:0 nreturned:1 reslen:135 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 267300 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 272ms
[MongoDFixture:job0] 2017-06-07T21:20:45.176+0000 F - [conn1516] Got signal: 6 (Abort trap: 6).
[MongoDFixture:job0]
[MongoDFixture:job0] 0x10e2cb27a 0x10e2caaf0 0x7fff91dbef1a 0x110eac000 0x7fff8c4b0b73 0x10e25d348 0x10e0960c9 0x10e08fea8 0x10e08f580 0x10e08f08c 0x10e0913f9 0x10d94fa85 0x10d951453 0x10d945a39 0x10d9a8496 0x10d9a523d 0x10d9a4365 0x10def2e69 0x10db54096 0x10d80f2aa 0x10d80fb78 0x10e2574de 0x10e257af1 0x7fff914d72fc 0x7fff914d7279 0x7fff914d54b1
[MongoDFixture:job0] ----- BEGIN BACKTRACE -----
[MongoDFixture:job0] {"backtrace":[{"b":"10D801000","o":"ACA27A","s":"_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE"},{"b":"10D801000","o":"AC9AF0","s":"_ZN5mongo12_GLOBAL__N_110abruptQuitEi"},{"b":"7FFF91DBA000","o":"4F1A","s":"_sigtramp"},{"b":"0","o":"110EAC000"},{"b":"7FFF8C453000","o":"5DB73","s":"abort"},{"b":"10D801000","o":"A5C348","s":"_ZN5mongo15invariantFailedEPKcS1_j"},{"b":"10D801000","o":"8950C9","s":"_ZN5mongo9ViewGraph23insertWithoutValidatingERKNS_14ViewDefinitionERKNSt3__16vectorINS_15NamespaceStringENS4_9allocatorIS6_EEEEi"},{"b":"10D801000","o":"88EEA8","s":"_ZZN5mongo11ViewCatalog16_upsertIntoGraphEPNS_16OperationContextERKNS_14ViewDefinitionEENK3$_3clES5_b"},{"b":"10D801000","o":"88E580","s":"_ZN5mongo11ViewCatalog16_upsertIntoGraphEPNS_16OperationContextERKNS_14ViewDefinitionE"},{"b":"10D801000","o":"88E08C","s":"_ZN5mongo11ViewCatalog26_createOrUpdateView_inlockEPNS_16OperationContextERKNS_15NamespaceStringES5_RKNS_9BSONArrayENSt3__110unique_ptrINS_17CollatorInterfaceENS9_14default_deleteISB_EEEE"},{"b":"10D801000","o":"8903F9","s":"_ZN5mongo11ViewCatalog10createViewEPNS_16OperationContextERKNS_15NamespaceStringES5_RKNS_9BSONArrayERKNS_7BSONObjE"},{"b":"10D801000","o":"14EA85","s":"_ZN5mongo8Database10createViewEPNS_16OperationContextENS_10StringDataERKNS_17CollectionOptionsE"},{"b":"10D801000","o":"150453","s":"_ZN5mongo12userCreateNSEPNS_16OperationContextEPNS_8DatabaseENS_10StringDataENS_7BSONObjEbRKS5_"},{"b":"10D801000","o":"144A39","s":"_ZN5mongo16createCollectionEPNS_16OperationContextERKNSt3__112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEERKNS_7BSONObjESD_"},{"b":"10D801000","o":"1A7496","s":"_ZN5mongo9CmdCreate3runEPNS_16OperationContextERKNSt3__112basic_stringIcNS3_11char_traitsIcEENS3_9allocatorIcEEEERNS_7BSONObjEiRS9_RNS_14BSONObjBuilderE"},{"b":"10D801000","o":"1A423D","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},{"b":"10D801000","o":"1A3365","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},{"b":"10D801000","o":"6F1E69","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE"},{"b":"10D801000","o":"353096","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},{"b":"10D801000","o":"E2AA","s":"_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKNSt3__110shared_ptrINS_9transport7SessionEEE"},{"b":"10D801000","o":"EB78","s":"_ZNSt3__110__function6__funcIZN5mongo23ServiceEntryPointMongod12startSessionENS_10shared_ptrINS2_9transport7SessionEEEE3$_0NS_9allocatorIS8_EEFvRKS7_EEclESC_"},

xgen-internal-githook commented on Thu, 8 Jun 2017 16:10:31 +0000:
Author: David Storch (username: dstorch, email: david.storch@10gen.com)
Message: SERVER-29517 Fix data race by making ViewGraph::_idCounter non-static.
Branch: master
https://github.com/mongodb/mongo/commit/e2376ccbb43d3fb2579995a55ebf82f7c16fcb4f

Status

Closed
