7 Ways to Remove Duplicates from a List in Python

Sakshi Khanna 09 Jan, 2024 • 6 min read

Introduction

Python is a versatile programming language that offers developers various functionalities. One common task that Python developers often encounter is removing duplicates from a list. This blog will explore 7 ways to remove duplicates from a list, along with code examples and explanations.


What is a List?

You can store multiple values in a single variable using lists, as they are ordered collections of items.

Key characteristics of a List:

  • Ordered: Items maintain a specific order, accessible by their index.
  • Mutable: After creating the list, you can change, add, or remove items.
  • Can hold any data type: Lists can contain mixed data types (numbers, strings, other lists, etc.)
  • Created using square brackets []

Example 1:

# Creating a list

furnitures = ["chair", "table", "sofa"]

# Accessing elements by index (starts from 0)

first_furniture = furnitures[0]

print(first_furniture)

Output:

chair

Example 2:

last_furniture = furnitures[-1]  # Negative indices count from the end

print(last_furniture)

Output:

sofa

# Modifying elements

furnitures[1] = "bed"  # Replace "table" with "bed"

# Adding elements

furnitures.append("bed")  # Add "bed" to the end

furnitures.insert(2, "table")  # Insert "table" at index 2

# Removing elements

furnitures.remove("bed")  # Remove the first occurrence of "bed"

del furnitures[0]  # Delete the first element

# Other common operations

furnitures.pop()  # Remove and return the last element

furnitures.index("sofa")  # Find the index of "sofa" (index() raises ValueError if the item is missing)

furnitures.count("table")  # Count the occurrences of "table"

furnitures.sort()  # Sort the list alphabetically

furnitures.reverse()  # Reverse the order of the list


Using Set to Remove Duplicates

The first method to remove duplicates from a list is using Python’s set() function. This method is simple and efficient because sets do not allow duplicate elements.

Convert the List to a Set

my_list = [1, 2, 2, 3, 4, 4, 5]

unique_set = set(my_list) 

print(unique_set)

Output:

{1, 2, 3, 4, 5}

If you need a list structure again, convert the set back to a list (optional):

unique_list = list(unique_set)

print(unique_list)

Output:

[1, 2, 3, 4, 5]

Key points:

  • Order is not preserved: Sets are unordered, so the resulting list’s element order might differ from the original.
  • Efficiency: Sets are optimized for fast membership checks and duplicate removal, making this a very efficient method.
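
If the original order matters but you still want to deduplicate with a set, one option is to sort the unique values by their first position in the original list. The following is a minimal sketch; the sample list is illustrative only.

```python
my_list = [3, 1, 3, 2, 1]

# set() removes the duplicates; sorting by first index restores the original order
unique_list = sorted(set(my_list), key=my_list.index)

print(unique_list)  # [3, 1, 2]
```

Each index() lookup scans the list from the start, so this suits short lists better than very long ones.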

Using List Comprehension

List comprehension is a powerful feature in Python that allows us to create new lists concisely and readably. We can leverage list comprehension to remove duplicates from a list.

Create a New List with Unique Elements

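  • Use a list comprehension that iterates through the original list and keeps each element only the first time it appears.

my_list = [1, 2, 2, 3, 4, 4, 5]

# Keep x only if it does not occur earlier in the list
unique_list = [x for i, x in enumerate(my_list) if x not in my_list[:i]]

print(unique_list)

Output:

[1, 2, 3, 4, 5]

Explanation:

  • x for i, x in enumerate(my_list): Iterates through each element x in the original list together with its index i.
  • if x not in my_list[:i]: Checks whether x already appears earlier in the list. If not, it’s added to the new list.
  • A comprehension cannot refer to the list it is building (unique_list does not exist until the comprehension finishes), which is why the check looks at the slice my_list[:i] instead.
  • This creates a list with unique elements, preserving their original order.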

Key points:

  • Order is preserved: Unlike using sets, list comprehension maintains the original order of elements.
  • Efficiency: Each membership check scans a list, so the total work grows quadratically with the list length. This is fine for smaller lists but slow for large ones.
  • Readability: The code is concise and needs no imports.
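
For larger lists, a common alternative is a plain loop that tracks already-seen values in a set, which keeps the original order while avoiding repeated list scans. This is a sketch rather than a definitive implementation: the function name dedupe_preserving_order is hypothetical, and it assumes every element is hashable.

```python
def dedupe_preserving_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()   # values already added to the result
    result = []
    for item in items:
        if item not in seen:   # set lookup is O(1) on average
            seen.add(item)
            result.append(item)
    return result


print(dedupe_preserving_order([1, 2, 2, 3, 4, 4, 5]))  # [1, 2, 3, 4, 5]
```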

Using the Filter() Function

Python’s filter() function can be used to create a new list with elements that satisfy a certain condition. We can use the filter() function to remove duplicates from a list.

  1. Define a Filtering Function
  • Create a function that checks whether an element has been seen before and records it if not.

def is_unique(x, seen):

    if x in seen:

        return False

    seen.add(x)  # Remember x so later occurrences are rejected

    return True

  2. Apply filter() with a Set to Track Unique Elements
  • Use filter() to apply the is_unique() function to each element.
  • Maintain a single shared set to track seen elements for efficient membership checks.

my_list = [1, 2, 2, 3, 4, 4, 5]

seen = set()

unique_list = list(filter(lambda x: is_unique(x, seen), my_list))

print(unique_list)

Output:

[1, 2, 3, 4, 5]

Explanation:

  • filter() takes a function and an iterable (the list).
  • It calls the function for each element, keeping only those where the function returns True.
  • lambda x: is_unique(x, seen): Creates a short anonymous function that calls is_unique() for each element.
  • The seen set is created once, outside the lambda, so it persists across calls. Passing a fresh set() on every call would make every element look unique, and nothing would be removed.

Key points:

  • Order is preserved: Like list comprehension, filter() maintains the original order.
  • Flexibility: filter() can be used with other logic as needed.
  • Efficiency: With a set tracking seen elements, each check is fast, but the extra function call per element makes it slower than set() alone. In return, it offers flexibility for custom filtering.
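
To illustrate the flexibility point, the sketch below uses filter() to remove duplicates by a derived key, here treating strings that differ only in case as duplicates. The helper name make_first_seen_filter and the sample data are hypothetical, chosen only for this example.

```python
def make_first_seen_filter(key):
    """Build a predicate that is True only the first time key(x) appears."""
    seen = set()

    def first_time(x):
        k = key(x)
        if k in seen:
            return False
        seen.add(k)
        return True

    return first_time


names = ["Chair", "chair", "Table", "SOFA", "sofa"]

# Compare lowercase forms, but keep the first spelling that was encountered
unique_names = list(filter(make_first_seen_filter(str.lower), names))

print(unique_names)  # ['Chair', 'Table', 'SOFA']
```

Wrapping the set in a closure keeps the state private to one filtering pass, so each call to make_first_seen_filter() starts with an empty set.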

Using the OrderedDict Class

The OrderedDict class in Python is a dict subclass that remembers the order in which items are added. We can use the OrderedDict class to remove duplicates from a list while preserving the original order of elements.

  1. Import the OrderedDict class:
from collections import OrderedDict
  2. Create an OrderedDict from the list: This automatically removes duplicates based on keys, preserving order:
my_list = [1, 2, 2, 3, 4, 4, 5]

unique_dict = OrderedDict.fromkeys(my_list)
  3. Extract the keys as a list:
unique_list = list(unique_dict.keys())

print(unique_list)

Output:

[1, 2, 3, 4, 5]

Key points:

  • Order is preserved: OrderedDict maintains the original order of elements.
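  • Reordering support: OrderedDict adds order-aware operations such as move_to_end(). Since Python 3.7, regular dictionaries also preserve insertion order, so OrderedDict is mainly needed for those extra operations or for older Python versions.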
  • Unique keys: Dictionary keys must be unique, so duplicates are automatically removed.
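
On Python 3.7 and later, where the language guarantees that regular dictionaries keep insertion order, the same idea can be sketched with the built-in dict and no import. The sample list is illustrative only.

```python
my_list = [3, 1, 3, 2, 1]

# fromkeys() creates one key per distinct value; the first occurrence
# of each value determines its position in the dictionary.
unique_list = list(dict.fromkeys(my_list))

print(unique_list)  # [3, 1, 2]
```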

Using the Counter Class

The Counter class in Python is a powerful tool for counting the occurrences of elements in a list. We can leverage the Counter class to remove duplicates from a list and obtain the count of each unique element.

  1. Import the Counter class:
from collections import Counter
  2. Create a Counter object from the list: The counter counts the occurrences of each element:
my_list = [1, 2, 2, 3, 4, 4, 5]

counter = Counter(my_list)
  3. Extract the unique elements: Use the keys() method to get the unique elements, then convert them to a list:
unique_list = list(counter.keys())

print(unique_list)

Output:

[1, 2, 3, 4, 5]

Key points:

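  • Order depends on the Python version: Counter is a dict subclass, so on Python 3.7+ its keys keep the order of first occurrence. On older versions, the order is not guaranteed.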
  • Element counts: It provides element frequencies if needed:
print(counter.most_common())

Output:

[(2, 2), (4, 2), (1, 1), (3, 1), (5, 1)]
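
Because the counts are available, a Counter can also separate values that occur once from values that repeat. The sketch below assumes Python 3.7+ for the ordering of the results; the variable names singles and repeated are illustrative.

```python
from collections import Counter

my_list = [1, 2, 2, 3, 4, 4, 5]
counter = Counter(my_list)

# Values that occur exactly once in the original list
singles = [item for item, count in counter.items() if count == 1]

# Values that occur more than once
repeated = [item for item, count in counter.items() if count > 1]

print(singles)   # [1, 3, 5]
print(repeated)  # [2, 4]
```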

Using the itertools.groupby() Function

The itertools module in Python provides a groupby() function that allows us to group elements based on a key function. We can use the groupby() function to remove duplicates from a list.

  1. Import the groupby() function:
from itertools import groupby
  2. Sort the list: groupby() works on consecutive elements, so sort the list first:
my_list = [1, 2, 2, 3, 4, 4, 5]

my_list.sort()  # [1, 2, 2, 3, 4, 4, 5]
  3. Apply groupby() and extract unique elements: Group consecutive equal elements together and keep one key per group:
unique_elements = [key for key, _ in groupby(my_list)]

print(unique_elements)

Output:

[1, 2, 3, 4, 5]

Explanation:

  • groupby() takes an iterable (like the sorted list) and a key function (defaults to the identity function).
  • It returns an iterator that yields consecutive keys and groups of elements.
  • key for key, _ in groupby(…): Uses a list comprehension to extract only the keys (unique elements) from the groups.

Key points:

  • Order is not preserved: The result follows the sorted order, not the original order, unless the list was already sorted.
  • Efficiency: groupby() itself makes a single lazy pass over the data, but the required sort adds O(n log n) work, and sort() modifies the original list in place.
  • Grouped processing: It’s useful for further processing or analysis of groups of duplicates.
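
The need for sorting is easiest to see on an unsorted list. In the sketch below (the list name readings is illustrative), groupby() merges only adjacent equal values, so a value that reappears later starts a new group.

```python
from itertools import groupby

readings = [1, 1, 2, 2, 1, 3, 3]

# Without sorting, only consecutive runs are collapsed,
# so the later 1 shows up a second time.
collapsed = [key for key, _ in groupby(readings)]
print(collapsed)  # [1, 2, 1, 3]

# Each group is an iterator over one run, so its length is the run size.
run_lengths = [(key, len(list(group))) for key, group in groupby(readings)]
print(run_lengths)  # [(1, 2), (2, 2), (1, 1), (3, 2)]
```

This behavior is useful in its own right when the goal is to collapse consecutive repeats rather than remove every duplicate.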

Using the Pandas Library

The pandas library is widely used in Python for data manipulation and analysis. We can use the drop_duplicates() method in pandas to remove duplicates from a list and obtain the unique elements.

  1. Import the pandas library:
import pandas as pd
  2. Create a pandas Series from the list:
my_list = [1, 2, 2, 3, 4, 4, 5]

my_series = pd.Series(my_list)
  3. Use the drop_duplicates() method:
unique_series = my_series.drop_duplicates()
  4. Convert back to a list:
unique_list = list(unique_series)

print(unique_list)

Output:

[1, 2, 3, 4, 5]

Key points:

  • Order is preserved: By default, drop_duplicates() keeps the first occurrence of each element in its original position.
  • Flexibility: Customize drop_duplicates() with parameters like:
    • keep: Which duplicates to keep (“first”, “last”, or the boolean False to drop every value that has a duplicate).
    • subset: Columns to consider for duplicate identification (for DataFrames).
  • DataFrame handling: Use DataFrame.drop_duplicates() for duplicate removal in DataFrames.

Alternative using unique():

unique_list = my_series.unique()  # Returns a NumPy array, not a list; call .tolist() on it if you need a list

Conclusion

In this blog, we explored 7 ways to remove duplicates from a list in Python. Each method has its own advantages, and the right choice depends on the specific requirements of the task, such as whether the original order must be preserved or whether element counts are needed. By understanding these methods, Python developers can efficiently handle duplicate elements in lists and optimize their code for better performance. Whether it’s using set(), list comprehension, filter(), OrderedDict, Counter, itertools.groupby(), or the pandas library, Python provides a variety of tools to tackle the common problem of removing duplicates from a list.

If you want to learn Python from scratch, enroll in this free course.

Frequently Asked Questions

Q1. What are the ways to remove duplicates from a list in Python?

A. Use the set() constructor to eliminate duplicates in a list:
unique_list = list(set(original_list)).
Alternatively, employ a list comprehension:
unique_list = [x for i, x in enumerate(original_list) if x not in original_list[:i]].

Q2. What is the fastest way to remove duplicates in Python?

A. The fastest way to remove duplicates in Python is generally the set() constructor: unique_list = list(set(original_list)). Note that this does not preserve the original order of the elements.
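
Speed claims like this depend on the data and the Python version, so it is worth measuring on your own input. The following is a rough sketch using the standard timeit module; the function names with_set and with_dict and the test data are assumptions made for this example, and no particular timings are implied.

```python
import timeit

# Illustrative data: 1000 distinct values, each repeated 5 times
data = list(range(1000)) * 5


def with_set(items):
    return list(set(items))            # order not guaranteed


def with_dict(items):
    return list(dict.fromkeys(items))  # keeps first-occurrence order


for fn in (with_set, with_dict):
    seconds = timeit.timeit(lambda: fn(data), number=200)
    print(f"{fn.__name__}: {seconds:.4f} s for 200 runs")
```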

Q3. How do I remove duplicates from a list in Python efficiently?

A. To efficiently remove duplicates from a list in Python while preserving the original order, you can use the collections.OrderedDict approach:
from collections import OrderedDict
unique_list = list(OrderedDict.fromkeys(original_list))

Q4. How do I extract duplicates from a list in Python?

A. To extract duplicates from a list in Python, you can use the following approach:
original_list = [1, 2, 2, 3, 4, 4, 5]
duplicates = [item for item in set(original_list) if original_list.count(item) > 1]
This list comprehension creates a new list (duplicates) containing items that appear more than once in the original list while using a set to ensure unique values.
