Operational Defect Database

BugZero found this defect 74 days ago.

MongoDB | 2600919

16MB document in $source fails stream processor with retryable error -- it should be non-retryable or DLQ

Last update date:

3/11/2024

Affected products:

MongoDB Server

Affected releases:

No affected releases provided.

Fixed releases:

8.0.0-rc0

Description:

Info

"changestream -> merge" infinite loop scenario. This scenario leads to a $merge writing a document slightly smaller than 16MB to the collection. When we try to read this document in our changestream $source, the changestream MongoDB server code fails with the stack below. We changed the handling so that the stream processor fails with a non-retryable error.

Note: In SERVER-87669 we will change our $merge/DLQ/$emit to prevent writing documents larger than ~15MB, to avoid this potential changestream issue. We still want this $source change though, because some other producer could create a large doc.

    #5 0x00007f980a77ab3c in mongo::invariantWithLocation (testOK=@0x7f97a897122f: false, expr=0x7f980a1d1363 "false", file=0x7f980a1dcd5b "src/mongo/bson/bsonobj.cpp", line=129) at src/mongo/util/assert_util_core.h:73
    #6 0x00007f980a3b9823 in mongo::BSONObj::_assertInvalid (this=0x7f97a8971570, maxSize=16793600) at src/mongo/bson/bsonobj.cpp:129
    #7 0x00007f980a788aba in mongo::BSONObj::init (this=0x7f97a8971570, data=0x345ba6600008 "?\367\026\001\003_id") at src/mongo/bson/bsonobj.h:743
    #8 0x00007f980a788a32 in mongo::BSONObj::BSONObj (this=0x7f97a8971570, bsonData=0x345ba6600008 "?\367\026\001\003_id", t=...) at src/mongo/bson/bsonobj.h:163
    #9 0x00007f980a7887d0 in mongo::BSONObjBuilderBase::done (this=0x7f97a8971440) at src/mongo/bson/bsonobjbuilder.h:594
    #10 0x00007f980a77ae5c in mongo::BSONObjBuilder::obj (this=0x7f97a8971440) at src/mongo/bson/bsonobjbuilder.h:791
    #11 0x00007f980777445b in mongo::Document::toBson (this=0x7f97a8971580) at src/mongo/db/exec/document_value/document.h:316
    #12 0x00007f97fae76cd8 in mongo::PlanExecutorPipeline::_trySerializeToBson (this=0x345bb8a4cd00, doc=...) at src/mongo/db/pipeline/plan_executor_pipeline.cpp:163
    #13 0x00007f97fae76a14 in mongo::PlanExecutorPipeline::getNext (this=0x345bb8a4cd00, objOut=0x7f97a8971858, recordIdOut=0x0) at src/mongo/db/pipeline/plan_executor_pipeline.cpp:100
    #14 0x00007f97cfe9466e in mongo::(anonymous namespace)::GetMoreCmd::Invocation::generateBatch (this=0x345bb8a4c9c0, opCtx=0x345bb97e3440, cursor=0x345bb8a21b00, cmd=..., isTailable=true, nextBatch=0x7f97a8972348, numResults=0x7f97a8971de8, docUnitsReturned=0x7f97a8971dd0) at src/mongo/db/commands/getmore_cmd.cpp:450

The pipeline is:

    {$source: {db: "test"}, $merge: {db: "test", coll: "testout"}}

The $source reads the same DB that the $merge is writing to. The loop keeps increasing the size of the output documents until they hit the 16MB max.

This gets inserted into the collection:

    { a: 1 }

This is the first input we see, and it gets written to the output collection:

    { .. a bunch of changestream fields, fullDocument: {a: 1} }

This is the second input we see:

    { .. a bunch of fields .., fullDocument: { .. a bunch of fields.., fullDocument: {a: 1 } } }

.. and so on ..

Eventually this fails with:

    Executor error during getMore :: caused by :: BSONObj size: 18282631 (0x116F887) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "8265E8E3FD000000012B042C0100296E5A1004E1CD3A26341442E184C3CD0ABA49D7A8463C6F7065726174696F6E54797065003C696E736572740046646F63756D656E744B65790046465F..." }: generic server error }

This failure comes from the MongoDB server (we get the BSONObjTooBig error code back), not the streams code.

This can be reproduced in both dev and prod.

https://splunk.corp.mongodb.com/en-US/app/cloud/search?earliest=-4h%40m&latest=now&q=sea[…]events&display.general.type=events&sid=1709764729.5182673

    sp.createStreamProcessor('test1', [{$source: {connectionName: "StreamsAtlasConnection", db: "test"}}, {$merge: { into: {connectionName: "StreamsAtlasConnection", db: "test", coll: "testout"} }} ])

The error is the same getMore failure shown above.
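The feedback loop and the intended error handling can be illustrated with a small standalone sketch. It is not the streams or server code: every function name below is hypothetical, JSON length is used only as a rough stand-in for BSON size, and the code name BSONObjectTooLarge is an assumption about what the server reports (the ticket refers to it as BSONObjTooBig). The sketch wraps a document in a change-event-like envelope on each trip around the loop, and classifies the resulting getMore failure as non-retryable.

```javascript
// Hypothetical sketch; none of these names come from the MongoDB server
// or the stream processing code.

// Wrap a document the way a change event would: some metadata plus the
// previous output under fullDocument. The metadata fields are placeholders.
function wrapAsChangeEvent(doc, seq) {
  return {
    _id: { _data: `token-${seq}` },
    operationType: "insert",
    ns: { db: "test", coll: "testout" },
    fullDocument: doc,
  };
}

// Approximate a document's size by its JSON length.
// Assumption: this only stands in for the real BSON size.
function approxSize(doc) {
  return JSON.stringify(doc).length;
}

// Count how many trips around the $source -> $merge loop it takes for the
// nested document to exceed `limit`.
function iterationsUntilTooLarge(seed, limit) {
  let doc = seed;
  let n = 0;
  while (approxSize(doc) <= limit) {
    doc = wrapAsChangeEvent(doc, n);
    n += 1;
  }
  return n;
}

// Assumed code name for the oversized-document failure.
const NON_RETRYABLE_CODE_NAMES = new Set(["BSONObjectTooLarge"]);

// Classify a $source error. An oversized document will never shrink on a
// retry, so it is treated as non-retryable; everything else stays retryable.
function classifySourceError(err) {
  const tooLarge =
    NON_RETRYABLE_CODE_NAMES.has(err.codeName) ||
    /BSONObj size: \d+ .* is invalid/.test(err.message ?? "");
  return tooLarge ? "non-retryable" : "retryable";
}
```

Calling iterationsUntilTooLarge({ a: 1 }, 2000) with a deliberately small limit shows the nesting crossing the threshold after a finite number of wraps, and classifySourceError applied to the getMore message above returns "non-retryable". Retrying cannot help in this case because the same oversized change event would be read again on every attempt.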

Status

Closed
