Operational Defect Database

BugZero found this defect 2373 days ago.

MongoDB | 420956

[SERVER-30864] Query from system.profile errors when executed on the same mongod

Last update date:

9/29/2017

Affected products:

MongoDB Server

Affected releases:

3.4.7

3.5.12

Fixed releases:

No fixed releases provided.

Description:

Info

Short version: I successfully execute an aggregation framework query, which is logged into system.profile. Copy pasting the pipeline from system.profile back into the mongo shell, and executing against the same mongod instance, fails with various errors. Example error: assert: command failed: { "ok" : 0, "errmsg" : "$regex has to be a string", "code" : 2, "codeName" : "BadValue" } : aggregate failed _getErrorWithCode@src/mongo/shell/utils.js:25:13 doassert@src/mongo/shell/assert.js:16:14 assert.commandWorked@src/mongo/shell/assert.js:370:5 DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1319:5 @(shell):1:1   2017-08-28T18:13:14.977+0300 E QUERY [thread1] Error: command failed: { "ok" : 0, "errmsg" : "$regex has to be a string", "code" : 2, "codeName" : "BadValue" } : aggregate failed : _getErrorWithCode@src/mongo/shell/utils.js:25:13 doassert@src/mongo/shell/assert.js:16:14 assert.commandWorked@src/mongo/shell/assert.js:370:5 DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1319:5 @(shell):1:1 > The aggregation pipeline originally comes from the tpch test suite of sqlproxy. I will post a more detailed repro in a dev-only comment.

Top User Comments

henrik.ingo@10gen.com commented on Fri, 15 Sep 2017 12:36:11 +0000: It appears that mongo shell not supporting the extended json strict mode is actually documented feature: https://docs.mongodb.com/manual/reference/mongodb-extended-json/#input-in-strict-mode I'm waiting for more info from drivers people to understand why / what to expect. henrik.ingo@10gen.com commented on Fri, 15 Sep 2017 11:50:03 +0000: Followup for the ns bug: SERVER-31098 henrik.ingo@10gen.com commented on Wed, 13 Sep 2017 16:30:14 +0000: Hi David I've figured out what the problem is. If you simply launch a mysql shell, copy paste the SQL query into it, then open a mongo shell and do db.system.profile.find(), it will give you the format that works. (In that sense, this SERVER ticket can indeed be closed as cannot reproduce.) The problematic format {"$regex":{"$regex":"^.*BRASS$","$options":"i"}}}} ...is returned when using the generate_tpch_agg_queries.py script. It seems the python driver serializes the returned BSON differently from the mongo shell. And in particular, the format returned by the python driver is not useable as a mongodb query. At least not if copy pasted into a shell. An interesting question is whether the captured query would work if used over the python driver? I can take it upon myself to find a driver person and file whatever ticket should be filed based on this finding. IMO a user should be able to expect that the same mongodb query (in json) works in the shell as well as all drivers. I also figured out the other errors I have mentioned in this ticket. nreturned The value of nreturned in your above system.profile snippet was 101, but mongosqld actually returned 460 rows. This is because there is an additional getMore entry in the system.profile, where the additional 359 documents are returned. So this matches up. ns The ns field in your above snippet has the value tpch.mongo_supplier. This is wrong. If you do db.mongo_supplier.aggregate() with the above pipeline, it will return zero documents. The correct collection is mongo_part. Note that this bug reproduces also if you simply execute the pipeline with db.mongo_part.aggregate(), it will record mongo_supplier in the system.profile. You don't need mongosqld to reproduce. The getMore entry correctly has tpch.mongo_part as the namespace. I will file a new SERVER ticket about the ns issue tomorrow, if you don't do it today. david.storch commented on Wed, 13 Sep 2017 14:40:47 +0000: I am closing this ticket as Cannot Reproduce, since I am not able to generate such a log entry when running TPC-H query 1 against the denormalized dataset. I tried with several versions of mongod and mongosqld but could not reproduce the issue. I've left some details below of the steps I took: mongorestore tpch_full_denormalized.bson.gz. Tried with mongod versions 3.5.12, 3.4.5, and 3.2.16. mongosqld version fdc0c4fb2638c19dffc12a47ffab1c5b3253341f Enabled profiling on the mongod and ran query below through the mysql client. SELECT `mongo_supplier`.`s_n_name` AS `n_name`, `mongo_part`.`p_mfgr` AS `p_mfgr`, `mongo_part`.`p_partkey` AS `ps_partkey`, `mongo_partsupp`.`ps_suppkey` AS `ps_suppkey`, `mongo_supplier`.`s_address` AS `s_address`, `mongo_supplier`.`s_comment` AS `s_comment`, `mongo_supplier`.`s_name` AS `s_name`, `mongo_supplier`.`s_phone` AS `s_phone`, `mongo_supplier`.`s_acctbal` AS `sum_s_acctbal_ok` FROM `mongo_part` INNER JOIN `mongo_partsupp` ON (`mongo_part`.`p_partkey` = `mongo_partsupp`.`ps_partkey`) INNER JOIN `mongo_supplier` ON (`mongo_partsupp`.`ps_suppkey` = `mongo_supplier`.`s_suppkey`) INNER JOIN ( SELECT `mongo_part`.`p_partkey` AS `p_partkey`, MIN(`mongo_partsupp`.`ps_supplycost`) AS `__measure__0` FROM `mongo_part` INNER JOIN `mongo_partsupp` ON (`mongo_part`.`p_partkey` = `mongo_partsupp`.`ps_partkey`) WHERE ((`mongo_partsupp`.`ps_s_r_name` = 'EUROPE') AND (`mongo_part`.`p_size` = 15) AND (`mongo_part`.`p_type` LIKE '%BRASS')) GROUP BY 1 ) `t0` ON (`mongo_part`.`p_partkey` = `t0`.`p_partkey`) WHERE (`mongo_partsupp`.`ps_supplycost` = `t0`.`__measure__0`); The query succeeds and appears to generate a valid profile entry. On 3.4.5, here was the output from the profiler: { "op" : "command", "ns" : "tpch.mongo_supplier", "command" : { "aggregate" : "mongo_part", "allowDiskUse" : true, "pipeline" : [ { "$match" : { "p_suppliers" : { "$elemMatch" : { "ps_s_r_name" : "EUROPE" } } } }, { "$unwind" : { "includeArrayIndex" : "p_suppliers_idx", "path" : "$p_suppliers" } }, { "$match" : { "p_suppliers.ps_s_r_name" : "EUROPE" } }, { "$match" : { "$and" : [ { "p_size" : NumberLong(15) }, { "p_type" : { "$regex" : /^.*BRASS$/i } } ] } }, { "$group" : { "min(mongo_partsupp_DOT_ps_supplycost)" : { "$min" : "$p_suppliers.ps_supplycost" }, "_id" : { "mongo_part_DOT_p_partkey" : "$_id" } } }, { "$project" : { "mongo_part_DOT_p_partkey" : "$_id.mongo_part_DOT_p_partkey", "min(mongo_partsupp_DOT_ps_supplycost)" : "$min(mongo_partsupp_DOT_ps_supplycost)" } }, { "$match" : { "mongo_part_DOT_p_partkey" : { "$ne" : null } } }, { "$lookup" : { "as" : "__joined_mongo_part", "from" : "mongo_part", "localField" : "mongo_part_DOT_p_partkey", "foreignField" : "_id" } }, { "$unwind" : { "path" : "$__joined_mongo_part", "preserveNullAndEmptyArrays" : false } }, { "$match" : { "__joined_mongo_part._id" : { "$ne" : null } } }, { "$lookup" : { "from" : "mongo_part", "localField" : "__joined_mongo_part._id", "foreignField" : "_id", "as" : "__joined_mongo_partsupp" } }, { "$unwind" : { "path" : "$__joined_mongo_partsupp", "preserveNullAndEmptyArrays" : false } }, { "$unwind" : { "path" : "$__joined_mongo_partsupp.p_suppliers", "includeArrayIndex" : "__joined_mongo_partsupp.p_suppliers_idx", "preserveNullAndEmptyArrays" : false } }, { "$addFields" : { "__predicate" : { "$let" : { "vars" : { "predicate" : { "$cond" : [ { "$or" : [ { "$eq" : [ { "$ifNull" : [ "$__joined_mongo_partsupp.p_suppliers.ps_supplycost", null ] }, null ] }, { "$eq" : [ { "$ifNull" : [ "$min(mongo_partsupp_DOT_ps_supplycost)", null ] }, null ] } ] }, null, { "$eq" : [ "$__joined_mongo_partsupp.p_suppliers.ps_supplycost", "$min(mongo_partsupp_DOT_ps_supplycost)" ] } ] } }, "in" : { "$cond" : [ { "$or" : [ { "$eq" : [ "$$predicate", false ] }, { "$eq" : [ "$$predicate", 0 ] }, { "$eq" : [ "$$predicate", "0" ] }, { "$eq" : [ "$$predicate", "-0" ] }, { "$eq" : [ "$$predicate", "0.0" ] }, { "$eq" : [ "$$predicate", "-0.0" ] }, { "$eq" : [ "$$predicate", null ] } ] }, false, true ] } } } } }, { "$match" : { "__predicate" : true } }, { "$match" : { "__joined_mongo_partsupp.p_suppliers.ps_suppkey" : { "$ne" : null } } }, { "$lookup" : { "from" : "mongo_supplier", "localField" : "__joined_mongo_partsupp.p_suppliers.ps_suppkey", "foreignField" : "_id", "as" : "__joined_mongo_supplier" } }, { "$unwind" : { "path" : "$__joined_mongo_supplier", "preserveNullAndEmptyArrays" : false } }, { "$project" : { "mongo_partsupp_DOT_ps_suppkey" : "$__joined_mongo_partsupp.p_suppliers.ps_suppkey", "mongo_supplier_DOT_s_comment" : "$__joined_mongo_supplier.s_comment", "mongo_supplier_DOT_s_n_name" : "$__joined_mongo_supplier.s_n_name", "mongo_part_DOT_p_mfgr" : "$__joined_mongo_part.p_mfgr", "mongo_part_DOT_p_partkey" : "$__joined_mongo_part._id", "mongo_supplier_DOT_s_address" : "$__joined_mongo_supplier.s_address", "mongo_supplier_DOT_s_name" : "$__joined_mongo_supplier.s_name", "mongo_supplier_DOT_s_phone" : "$__joined_mongo_supplier.s_phone", "mongo_supplier_DOT_s_acctbal" : "$__joined_mongo_supplier.s_acctbal" } } ], "cursor" : {   } }, "cursorid" : 22890722930, "keysExamined" : 200000, "docsExamined" : 40058, "numYield" : 1566, "locks" : { "Global" : { "acquireCount" : { "r" : NumberLong(4958) } }, "Database" : { "acquireCount" : { "r" : NumberLong(2479) } }, "Collection" : { "acquireCount" : { "r" : NumberLong(2478) } } }, "nreturned" : 101, "responseLength" : 51479, "protocol" : "op_query", "millis" : 865, "planSummary" : "IXSCAN { p_type: 1 }", "ts" : ISODate("2017-09-13T14:34:09.140Z"), "client" : "127.0.0.1", "appName" : "mongosqld", "allUsers" : [ ], "user" : "" } I can also successfully run the profiled command against the server using the mongo shell. henrik.ingo, if you have further guidance on how to reproduce this issue, feel free to re-open the ticket. ryan.chipman, one thing I did notice is that the mongosqld logs appear to contain the invalid $regex format noted by Henrik. The very same query which produced the profile entry above contains this invalid $match in the logs: {"$match":{"$and":[{"p_size":{"$numberLong":"15"}},{"p_type":{"$regex":{"$regex":"^.*BRASS$","$options":"i"}}}]}} Perhaps there is a sqlproxy issue at play here? henrik.ingo@10gen.com commented on Mon, 11 Sep 2017 08:35:11 +0000: If this thread isn't helpful, please arrange a conf call with me so we can walk through it. ian@10gen.com commented on Fri, 8 Sep 2017 15:12:18 +0000: henrik.ingo going through this with the Query team and wondering if you have an easy way to reproduce this? henrik.ingo@10gen.com commented on Tue, 29 Aug 2017 18:49:50 +0000: Ah, I see. Here's the output of "query 1", which is the query failing with the regex, as output when running generate_tpch_agg_queries.py: Running query 1 in file ../../sqlproxy/master/testdata/suites/tpch/handwritten_denormalized.yml... n_name p_mfgr ps_partkey ps_suppkey s_address s_comment s_name s_phone sum_s_acctbal_ok RUSSIA Manufacturer#4 193758 8797 dyz8WERDG3fdz . theodolites boost. regular requests play? blithely special dolphins cajole unusual asympto Supplier#000008797 32-474-915-2675 3540.29 UNITED KINGDOM Manufacturer#1 160328 2845 ZOlKEPI8ftemk3cAGokylKstRcZiBT0sc furiously ironic requests. carefully final pinto beans after the blithely ironic orbi Supplier#000002845 33-639-575-6452 6820.97 RUSSIA Manufacturer#1 157992 3023 a2vIsI7JLHFfBvJoSmPGfm arefully unusual instructions are slyly regular accounts. busy courts sleep furiously. Supplier#000003023 32-598-929-2593 4383.12 It goes on for quite a while. Also, if you look at the full system.profile record in tpch.js, you'll see that it returned 101 records, as well as examined even more, took a number of locks, etc. "docsExamined": 40058, "keysExamined": 200000, "locks": { "Collection": { "acquireCount": { "r": 2476 } }, "nreturned": 101, Finally, I should re-iterate that also the queries that don't result in a parse error, nevertheless return 0 documents. Hence I suspect those queries are wrong as well, compared to what was executed, when the query did return records. The expected number of records of course is what is recorded in nreturned key in the system.profile entry. david.storch commented on Tue, 29 Aug 2017 15:25:10 +0000: henrik.ingo, I'm not sure that this quite answers my question. The profiler entry certainly contains an invalid query, though currently there is no data to support that the query which generated this profiler entry was not itself invalid. Do you have any evidence which proves that the query issued from mongosqld to mongod succeeded, despite generating an invalid profile entry? My current hypothesis is that the query which generated the profiler entry itself failed with "$regex has to be a string". henrik.ingo@10gen.com commented on Tue, 29 Aug 2017 14:57:07 +0000: david.storch The full query is attached in tpch.query2.js. It's one of the queries - the second - I've gotten as a result of using the script from you, in the linked (and non-public) ticket. Specifically, I tried executing mongosqld against mongod 3.4.7 and then verified that the same 3.4.7 could not execute the query that it itself had logged into system.profile. It seems that other versions of mongod have the same issue, but for the purpose of filing this bug, I was extra pedantic about checking with 3.4.7. david.storch commented on Tue, 29 Aug 2017 14:26:59 +0000: I'm re-opening this issue, since it doesn't look like a duplicate of SERVER-6768. That ticket describes the fact that the profiler stores documents with $-prefixed fields and field names containing the "." character. This ticket describes a separate problem: that the server appears to have generated a profiler entry whose query is not grammatically correct. Specifically, this portion is not a valid query: { "p_type": { "$regex": { "$options": "i", "$regex": "^.*BRASS$" } } } This should instead be written like so: { "p_type": { "$regex": "^.*BRASS$", "$options": "i", } } henrik.ingo, could you help me understand how to reproduce this problem? Specifically, could you provide the MongoDB query which caused this profiler entry to be generated? That should help us spot the problem. CC cristopher.stauffer.

Additional Resources / Links

Share:

BugZero Risk Score

Coming soon

Status

Closed

Have you been affected by this bug?

cost-cta-background

Do you know how much operational outages are costing you?

Understand the cost to your business and how BugZero can help you reduce those costs.

Discussion

Login to read and write comments.

Have you ever...

had your data corrupted from a

VMware

bug?

Search:

...