Operational Defect Database

MongoDB | 428628

[SERVER-31120] Invariant failure remotesExhausted_inlock() || _lifecycleState == kKillComplete

Last update date:

10/30/2023

Affected products:

MongoDB Server

Affected releases:

3.2.19

Fixed releases:

3.6.0-rc0

Description:

Info

Triggered by PyMongo's test suite, on my branch where I'm developing sessions. This is a mongos error from a sharded cluster with auth, when PyMongo is calling "getMore" on an aggregation cursor with "lsid":

    2017-09-17T14:19:16.577-0400 F - [conn187] Invariant failure remotesExhausted_inlock() || _lifecycleState == kKillComplete src/mongo/s/query/async_results_merger.cpp 84

    mongos(_ZN5mongo15invariantFailedEPKcS1_j+0x2E6) [0x10e848a46]
    mongos(_ZN5mongo18AsyncResultsMergerD2Ev+0x196) [0x10dfed976]
    mongos(_ZN5mongo16RouterStageMergeD0Ev+0x1C) [0x10dfea69c]
    mongos(_ZN5mongo23ClusterClientCursorImplD0Ev+0xB6) [0x10dfe93c6]
    mongos(_ZN5mongo20ClusterCursorManager14checkOutCursorERKNS_15NamespaceStringExPNS_16OperationContextE+0x3D3) [0x10e1b0d93]
    mongos(_ZN5mongo11ClusterFind10runGetMoreEPNS_16OperationContextERKNS_14GetMoreRequestE+0x4A) [0x10dfe30ea]
    mongos(_ZN5mongo12_GLOBAL__N_117ClusterGetMoreCmd3runEPNS_16OperationContextERKNSt3__112basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEERKNS_7BSONObjERNS_14BSONObjBuilderE+0x116) [0x10df87ab6]
    mongos(_ZN5mongo12BasicCommand11enhancedRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x77) [0x10e2f0037]
    mongos(_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x20) [0x10e2ee530]
    mongos(_ZN5mongo12_GLOBAL__N_110runCommandEPNS_16OperationContextERKNS_12OpMsgRequestEONS_14BSONObjBuilderE+0xC8F) [0x10dfc6a7f]
    mongos(_ZN5mongo8Strategy13clientCommandEPNS_16OperationContextERKNS_7MessageE+0x341) [0x10dfc32d1]
    mongos(_ZN5mongo23ServiceEntryPointMongos13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x2E5) [0x10df21c25]
    mongos(_ZN5mongo19ServiceStateMachine15_processMessageERNS0_11ThreadGuardE+0x18A) [0x10df2967a]
    mongos(_ZN5mongo19ServiceStateMachine15_runNextInGuardERNS0_11ThreadGuardE+0x175) [0x10df28b35]
    mongos(_ZN5mongo19ServiceStateMachine7runNextEv+0x38) [0x10df294a8]

Log attached.
PyMongo was executing:

    # Use batchSize to ensure multiple getMore messages
    cursor = db.test.aggregate(
        [{'$project': {'_id': '$_id'}}],
        batchSize=5)

    self.assertEqual(
        expected_sum,
        sum(doc['_id'] for doc in cursor))

Top User Comments

xgen-internal-githook commented on Wed, 20 Sep 2017 16:26:09 +0000:
Author: {'email': 'jcarey@argv.me', 'name': 'Jason Carey', 'username': 'hanumantmk'}
Message: SERVER-31120 fix invalid session getMore invariant

When passing the wrong lsid to a cursor (not the lsid used to create it) we invariant in sharding. This appears to be about poor lifetime issues in mongos cursors. This papers over the bad api and adds a test for the fix.

Branch: master
https://github.com/mongodb/mongo/commit/b4fa6b5c46612b7943230dc1a4b24ce7867aa681

jesse commented on Tue, 19 Sep 2017 15:05:53 +0000:
That's great! Very helpful for driver testing if the server uasserts when getMore doesn't have the right lsid. I think the PyMongo code I was testing at the time did send a different lsid with getMore than with aggregate; that's fixed in my code now.

jason.carey commented on Mon, 18 Sep 2017 19:21:58 +0000:
I was able to reproduce this by issuing a getMore with a different lsid than the lsid used to create a cursor. It may also happen if a getMore is issued without an lsid for a cursor that was created with one. The fix will clean that up (so that a helpful uassert shows up instead of an invariant).
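
The behaviour described in the comments above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the mongos implementation: `ToyCursorManager`, `CursorSessionMismatch`, `register_cursor`, `check_out_cursor` and `new_lsid` are hypothetical names invented here. It records the lsid that created a cursor and rejects a later getMore that presents a different lsid, or no lsid at all, with an ordinary user-facing error rather than a process-ending invariant.

```python
import uuid


class CursorSessionMismatch(Exception):
    """Hypothetical stand-in for the user-facing error (uassert) mentioned in the fix."""


def new_lsid():
    """Build a logical session id shaped like {"id": <UUID>} (illustrative only)."""
    return {"id": uuid.uuid4()}


class ToyCursorManager:
    """Toy model of cursor checkout; NOT the real ClusterCursorManager."""

    def __init__(self):
        self._owners = {}   # cursor id -> lsid recorded at creation (None if no session)
        self._next_id = 1

    def register_cursor(self, lsid=None):
        """Record which session (if any) created the cursor, e.g. via aggregate."""
        cursor_id = self._next_id
        self._next_id += 1
        self._owners[cursor_id] = lsid
        return cursor_id

    def check_out_cursor(self, cursor_id, lsid=None):
        """Model a getMore: fail cleanly if the lsid does not match the creator's."""
        owner = self._owners[cursor_id]
        if owner != lsid:
            # Assumption: the fixed server reports an error to the client here
            # instead of tripping an invariant and aborting.
            raise CursorSessionMismatch(
                "getMore lsid %r does not match cursor lsid %r" % (lsid, owner))
        return cursor_id


manager = ToyCursorManager()
session_a, session_b = new_lsid(), new_lsid()
cursor_id = manager.register_cursor(session_a)   # cursor created under session A

manager.check_out_cursor(cursor_id, session_a)   # same lsid: accepted

for wrong_lsid in (session_b, None):             # different lsid, then no lsid
    try:
        manager.check_out_cursor(cursor_id, wrong_lsid)
    except CursorSessionMismatch as exc:
        print("rejected:", exc)
```

On the driver side, the corresponding rule is to send the same lsid on every getMore that was sent on the command that created the cursor.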

Status

Closed
