BugZero found this defect 1308 days ago.
7/21/2023
SQL Server 2016 Developer - duplicate (do not use)
SQL Server 2016 Enterprise - duplicate (do not use)
SQL Server 2016 Enterprise Core - duplicate (do not use)
SQL Server 2016 Standard - duplicate (do not use)
SQL Server 2016 Service Pack 1
SQL Server 2017 on Windows (all editions)
SQL Server 2017 on Linux (all editions)
Affected builds: lower than 14.0.3006.16
Fixed build: 14.0.3006.16
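The build threshold above can be checked programmatically. The following is a minimal Python sketch, not an official tool: it assumes you have already obtained the instance's four-part build string (for example from `SELECT SERVERPROPERTY('ProductVersion')`), and the function names `parse_build` and `is_affected_2017` are hypothetical. It only covers SQL Server 2017 (major version 14), because the threshold listed here is a 2017 build; the fixed builds for SQL Server 2016 are not given on this page.

```python
from typing import Tuple

# Fixed build listed on this page (SQL Server 2017 line only).
FIXED_BUILD_2017 = (14, 0, 3006, 16)


def parse_build(version: str) -> Tuple[int, ...]:
    """Turn a dotted build string such as '14.0.3006.16' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def is_affected_2017(version: str) -> bool:
    """Return True if a SQL Server 2017 build is lower than the fixed build.

    Raises ValueError for other major versions, since this page does not
    state the corresponding threshold for them.
    """
    build = parse_build(version)
    if build[0] != 14:
        raise ValueError("only SQL Server 2017 (major version 14) is covered")
    # Tuples compare element by element, which matches build ordering.
    return build < FIXED_BUILD_2017


if __name__ == "__main__":
    for candidate in ("14.0.1000.169", "14.0.3006.16"):
        print(candidate, "affected:", is_affected_2017(candidate))
```

Comparing the parsed tuples rather than the raw strings avoids ordering mistakes such as treating "14.0.900.1" as higher than "14.0.3006.16".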
Assume that you create a PolyBase external table in SQL Server 2016 or SQL Server 2017 that uses a PARQUET file as its data source. The PARQUET file is split into multiple files in Hadoop Distributed File System (HDFS), and each file is larger than the HDFS block size. In this situation, a query against the external table may return duplicate rows.
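One way to spot the symptom is to count repeated rows in a result set fetched from the external table. The sketch below is illustrative only: it assumes the rows have already been loaded into Python as sequences of hashable values, and the helper name `find_duplicate_rows` is hypothetical. It cannot tell duplicates caused by this defect apart from rows that are legitimately repeated in the source data, so compare the counts against what the PARQUET files are known to contain.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence, Tuple


def find_duplicate_rows(
    rows: Iterable[Sequence[Hashable]],
) -> Dict[Tuple[Hashable, ...], int]:
    """Return each row that appears more than once, with its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Made-up sample rows, standing in for a query result.
    sample = [
        (1, "alpha"),
        (2, "beta"),
        (1, "alpha"),
        (3, "gamma"),
    ]
    print(find_duplicate_rows(sample))
```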
This issue is fixed in the following cumulative updates for SQL Server:
Cumulative Update 1 for SQL Server 2017
Cumulative Update 6 for SQL Server 2016 RTM
Cumulative Update 6 for SQL Server 2016 SP1