SQL Query to Delete Duplicate Rows
Last Updated :
26 Nov, 2024
Duplicate rows in a database can cause inaccurate results, waste storage space, and slow down queries. Cleaning duplicate records from our database is an essential maintenance task for ensuring data accuracy and performance. Duplicate rows in a SQL table can lead to data inconsistencies and performance issues, making it crucial to identify and remove them effectively
In this article, we will explain the process of deleting duplicate rows from a SQL table step-by-step, using SQL Server, with examples and outputs. We’ll cover techniques using GROUP BY
, CTE
, and more, incorporating best practices to help us effectively handle duplicates.
What Are Duplicate Rows?
Duplicate rows are records in a database table that have identical values in one or more columns. These rows often arise due to issues like multiple imports, user errors, or missing constraints like primary keys or unique indexes. Duplicate rows can lead to inaccurate reports, redundant data storage, and slower query performance, making it essential to remove them.
A SQL query to delete duplicate rows typically involves identifying duplicates using functions like ROW_NUMBER
()
or COUNT
()
and making sure that only one copy of each record is kept in the table.
Why Remove Duplicate Rows?
- Optimize memory usage by eliminating redundant data.
- Ensure data accuracy for reports and queries.
- Improve query performance by reducing data redundancy.
Steps to Delete Duplicate Rows in SQL
To effectively remove duplicate rows in SQL, we can follow a structured approach. Below are the steps to create a DETAILS table and demonstrate how to identify and delete duplicate records:
Step 1: Create the Sample Table
We will create a table named DETAILS
to demonstrate how to identify and delete duplicate rows. This step helps in setting up the necessary structure to store sample data and perform operations like detecting duplicates and applying deletion techniques.
Query:
CREATE TABLE DETAILS (
SN INT IDENTITY(1,1) PRIMARY KEY,
EMPNAME VARCHAR(25) NOT NULL,
DEPT VARCHAR(20) NOT NULL,
CONTACTNO BIGINT NOT NULL,
CITY VARCHAR(15) NOT NULL
);

Â
Step 2: Insert Data into the Table
Let’s insert some data, including duplicates, into the DETAILS
table. This step allows us to copy real-world scenarios where duplicate records might occur, enabling us to demonstrate how to identify and remove them effectively.
Query:
INSERT INTO DETAILS (EMPNAME, DEPT, CONTACTNO, CITY)
VALUES
('VISHAL', 'SALES', 9193458625, 'GAZIABAD'),
('VIPIN', 'MANAGER', 7352158944, 'BAREILLY'),
('ROHIT', 'IT', 7830246946, 'KANPUR'),
('RAHUL', 'MARKETING', 9635688441, 'MEERUT'),
('SANJAY', 'SALES', 9149335694, 'MORADABAD'),
('VIPIN', 'MANAGER', 7352158944, 'BAREILLY'),
('VISHAL', 'SALES', 9193458625, 'GAZIABAD'),
('AMAN', 'IT', 78359941265, 'RAMPUR');

Â
Output

Â
Step 3: Identify Duplicate Rows
We Use the GROUP BY
clause with the COUNT
(*)
function to find rows with duplicate values. This step helps us group the records by specific columns and count how many times each combination occurs, making it easier to identify duplicates that appear more than once in the table.
Query:Â
SELECT EMPNAME,DEPT,CONTACTNO,CITY,
COUNT(*) FROM DETAILS
GROUP BY EMPNAME,DEPT,CONTACTNO,CITY
HAVING COUNT(*)>1
Output

Â
Step 4: Retain Unique Rows
To identify unique rows, use the same query without the HAVING
clause or filter rows with a count of 1
. This step allows us to isolate the rows that appear only once in the table, ensuring that only the distinct records are retained while duplicates are removed.
Query:
SELECT EMPNAME,DEPT,CONTACTNO,CITY,
COUNT(*) FROM DETAILS
GROUP BY EMPNAME,DEPT,CONTACTNO,CITY
Output

Â
Step 5: Delete Duplicate Rows
Use the GROUP BY
clause along with MIN
(SN)
to retain one unique row for each duplicate group. This method identifies the first occurrence of each duplicate combination based on the SN (serial number) and deletes the other duplicate rows.
Query:
DELETE FROM DETAILS
WHERE SN NOT IN (
SELECT MIN(SN)
FROM DETAILS
GROUP BY EMPNAME, DEPT, CONTACTNO, CITY
);
Select * FROM DETAILS;
Output

Output after Deleting Duplicate Rows
Step 6: Alternative Approach Using CTE
Using a Common Table Expression (CTE), we can delete duplicates in a more structured way. CTEs provide a cleaner approach by allowing us to define a temporary result set that can be referenced within the DELETE
statement. This method can be more readable and maintainable, especially when dealing with complex queries.
Query:
WITH CTE AS (
SELECT SN, EMPNAME, DEPT, CONTACTNO, CITY,
ROW_NUMBER() OVER (PARTITION BY EMPNAME, DEPT, CONTACTNO, CITY ORDER BY SN) AS RowNum
FROM DETAILS
)
DELETE FROM CTE WHERE RowNum > 1;
Output
SN |
EMPNAME |
DEPT |
CONTACTNO |
CITY |
1 |
VISHAL |
SALES |
9193458625 |
GAZIABAD |
2 |
VIPIN |
MANAGER |
7352158944 |
BAREILLY |
3 |
ROHIT |
IT |
7830246946 |
KANPUR |
4 |
RAHUL |
MARKETING |
9635688441 |
MEERUT |
5 |
SANJAY |
SALES |
9149335694 |
MORADABAD |
8 |
AMAN |
IT |
78359941265 |
RAMPUR |
Explanation:
In the above result, the duplicate rows for VISHAL
and VIPIN
have been removed, and only one instance of each remains.
Conclusion
Duplicate rows in SQL databases can negatively impact performance and data accuracy. Using methods like GROUP BY
, ROW_NUMBER
()
, and CTE
, we can efficiently delete duplicate rows in SQL while retaining unique records. Always test your queries on a backup or development environment to ensure accuracy before applying them to production databases. By Using these methods, we can confidently remove duplicate rows in SQL, keeping our database clean and reliable.
Similar Reads
Python MySQL
Python MySQL Connector is a Python driver that helps to integrate Python and MySQL. This Python MySQL library allows the conversion between Python and MySQL data types. MySQL Connector API is implemented using pure Python and does not require any third-party library. This Python MySQL tutorial will
9 min read
How to install MySQL Connector Package in Python
MySQL is a Relational Database Management System (RDBMS) whereas the structured Query Language (SQL) is the language used for handling the RDBMS using commands i.e Creating, Inserting, Updating and Deleting the data from the databases. A connector is employed when we have to use MySQL with other pro
2 min read
Connect MySQL database using MySQL-Connector Python
While working with Python we need to work with databases, they may be of different types like MySQL, SQLite, NoSQL, etc. In this article, we will be looking forward to how to connect MySQL databases using MySQL Connector/Python.MySQL Connector module of Python is used to connect MySQL databases with
2 min read
How to Install MySQLdb module for Python in Linux?
In this article, we are discussing how to connect to the MySQL database module for python in Linux. MySQLdb is an interface for connecting to a MySQL database server from Python. It implements the Python Database API v2.0 and is built on top of the MySQL C API. Installing MySQLdb module for Python o
2 min read
MySQL Queries
Python MySQL - Select Query
Python Database API ( Application Program Interface ) is the Database interface for the standard Python. This standard is adhered to by most Python Database interfaces. There are various Database servers supported by Python Database such as MySQL, GadFly, mySQL, PostgreSQL, Microsoft SQL Server 2000
2 min read
CRUD Operation in Python using MySQL
In this article, we will be seeing how to perform CRUD (CREATE, READ, UPDATE and DELETE) operations in Python using MySQL. For this, we will be using the Python MySQL connector. For MySQL, we have used Visual Studio Code for python. Before beginning we need to install the MySQL connector with the co
6 min read
Python MySQL - Create Database
Python Database API ( Application Program Interface ) is the Database interface for the standard Python. This standard is adhered to by most Python Database interfaces. There are various Database servers supported by Python Database such as MySQL, GadFly, mSQL, PostgreSQL, Microsoft SQL Server 2000,
2 min read
Python MySQL - Update Query
A connector is employed when we have to use MySQL with other programming languages. The work of MySQL-connector is to provide access to MySQL Driver to the required language. Thus, it generates a connection between the programming language and the MySQL Server. Update Clause The update is used to ch
2 min read
Python MySQL - Insert into Table
MySQL is a Relational Database Management System (RDBMS) whereas the structured Query Language (SQL) is the language used for handling the RDBMS using commands i.e Creating, Inserting, Updating and Deleting the data from the databases. SQL commands are case insensitive i.e CREATE and create signify
3 min read
Python MySQL - Insert record if not exists in table
In this article, we will try to insert records and check if they EXISTS or not. The EXISTS condition in SQL is used to check if the result of a correlated nested query is empty (contains no tuples) or not. It can be used to INSERT, SELECT, UPDATE, or DELETE statements. Pre-requisite Connect MySQL Da
4 min read
Python MySQL - Delete Query
Python Database API ( Application Program Interface ) is the Database interface for the standard Python. This standard is adhered to by most Python Database interfaces. There are various Database servers supported by Python Databases such as MySQL, GadFly, PostgreSQL, Microsoft SQL Server 2000, Info
3 min read
MySQL Working with Data
MySQL | Regular Expressions (Regexp)
In MySQL, regular expressions (REGEX) are a powerful tool used to perform flexible pattern matching within string data. By utilizing REGEXP and RLIKE operators, developers can efficiently search, validate, and manipulate string data in more dynamic ways than simple LIKE queries. In this article, we
6 min read
SQL Query to Match Any Part of String
It is used for searching a string or a sub-string to find a certain character or group of characters from a string. We can use the LIKE Operator of SQL to search sub-strings. The LIKE operator is used with the WHERE Clause to search a pattern in a string of columns. The LIKE operator is used in conj
3 min read
SQL Auto Increment
In SQL databases, a primary key is important for uniquely identifying records in a table. However, sometimes it is not practical to manually assign unique values for each record, especially when handling large datasets. To simplify this process, SQL databases offer an Auto Increment feature that aut
6 min read
SQL Query to Delete Duplicate Rows
Duplicate rows in a database can cause inaccurate results, waste storage space, and slow down queries. Cleaning duplicate records from our database is an essential maintenance task for ensuring data accuracy and performance. Duplicate rows in a SQL table can lead to data inconsistencies and performa
5 min read
SQL Query to Convert an Integer to Year Month and Days
With this article, we will be knowing how to convert an integer to Year, Month, Days from an integer value. The prerequisites of this article are you should be having a MSSQL server on your computer. What is a query? A query is a statement or a group of statements written to perform a specific task,
2 min read
Calculate the Number of Months between two specific dates in SQL
In this article, we will discuss the overview of SQL Query to Calculate the Number of Months between two specific dates and will implement with the help of an example for better understanding. Let's discuss it step by step. Overview :Here we will see, how to calculate the number of months between th
3 min read
How to Compare Two Queries in SQL
Queries in SQL :A query will either be an invitation for data results from your info or for action on the info, or each. a question will provide you with a solution to a straightforward question, perform calculations, mix data from totally different tables, add, change, or delete data from info. Cre
2 min read
Joining 4 Tables in SQL
The purpose of this article is to make a simple program to Join two tables using Join and Where clause in SQL. Below is the implementation for the same using MySQL. The prerequisites of this topic are MySQL and the installment of Apache Server on your computer. Introduction :In SQL, a query is a req
3 min read
MySQL Working with Images
Working with MySQL BLOB in Python
In Python Programming, We can connect with several databases like MySQL, Oracle, SQLite, etc., using inbuilt support. We have separate modules for each database. We can use SQL Language as a mediator between the python program and database. We will write all queries in our python program and send th
4 min read
Retrieve Image and File stored as a BLOB from MySQL Table using Python
Prerequisites: MySQL server should be installed In this post, we will be talking about how we can store files like images, text files, and other file formats into a MySQL table from a python script. Sometimes, just like other information, we need to store images and files into our database and provi
3 min read
How to read image from SQL using Python?
In this article, we are going to discuss how to read an image or file from SQL using python. For doing the practical implementation, We will use MySQL database. First, We need to connect our Python Program with MySQL database. For doing this task, we need to follow these below steps: Steps to Connec
3 min read
Boutique Management System using Python-MySQL Connectivity
In this article, we are going to make a simple project on a boutique management system using Python MySql connectivity. Introduction This is a boutique management system made using MySQL connectivity with Python. It uses a MySQL database to store data in the form of tables and to maintain a proper r
15+ min read