Data Structures

Master Python's built-in data structures — lists, tuples, sets, dicts, and comprehensions.

Beginner 40 min read 🐍 Python

Python's Built-in Data Structures

Four powerful containers built right in

Lists, tuples, sets, and dictionaries: no imports needed. Many programming languages make you import a library for collections. Python gives you four versatile, battle-tested data structures out of the box.

Choosing the right data structure is one of the most important decisions in programming. Using a list where a set belongs can slow membership tests by orders of magnitude on large collections, because a list scans its elements one by one while a set uses a hash lookup. Using a dict when you need a list makes your code confusing. Let's understand each one deeply so you can make the right choice every time.

Lists []

Ordered, mutable, allows duplicates. Your go-to for most sequences.

Tuples ()

Ordered, immutable, allows duplicates. For data that shouldn't change.

Sets {a, b}

Unordered, mutable, NO duplicates. Fast membership testing. An empty set is written set(), because {} creates an empty dict.

Dicts {k:v}

Key-value pairs, insertion-ordered (3.7+). For mapping relationships.
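
As a quick illustration of the four containers above, here is a minimal sketch that builds one of each and checks the properties just listed. The variable names are made up for this example.

```python
# One value of each built-in container (names are illustrative only)
a_list = [3, 1, 3]           # ordered, mutable, duplicates kept
a_tuple = (3, 1, 3)          # ordered, immutable, duplicates kept
a_set = {3, 1, 3}            # duplicates dropped
a_dict = {"x": 1, "y": 2}    # key-value pairs, insertion order kept

print(len(a_list), len(a_tuple), len(a_set))  # 3 3 2
print(list(a_dict))                           # ['x', 'y']

# Gotcha: {} is an empty dict; use set() for an empty set
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
```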

Lists

Lists are Python's workhorse data structure. Think of a list as a numbered shelf — each item has a position (starting from 0), you can add or remove items freely, and items stay in the order you put them. Lists can hold any type of data, and you can even mix types in a single list (though it's usually better not to).

Lists are the most commonly used data structure in Python. If you're unsure which collection to use, a list is usually a safe default. You'll use them for storing search results, collecting user inputs, building sequences, and much more.

Creating and Accessing

Create a list with square brackets. Access elements by index (position number). Remember: indexing starts at 0, not 1. You can also use negative indices — -1 is the last element, -2 is second-to-last:

fruits = ["apple", "banana", "cherry", "date"]
numbers = [1, 2, 3, 4, 5]
mixed = [42, "hello", True, 3.14]

print(fruits[0])     # apple (first)
print(fruits[-1])    # date (last)
print(fruits[1:3])   # ['banana', 'cherry']
print(len(fruits))   # 4
Output
apple
date
['banana', 'cherry']
4

Modifying Lists

fruits = ["apple", "banana", "cherry"]
fruits.append("date")           # Add to end
fruits.insert(1, "avocado")     # Insert at index
fruits.extend(["fig", "grape"]) # Add multiple
print(fruits)

fruits.remove("banana")  # Remove by value
popped = fruits.pop()    # Remove last
print(f"Popped: {popped}")

nums = [3, 1, 4, 1, 5, 9]
nums.sort()
print(nums)
Output
['apple', 'avocado', 'banana', 'cherry', 'date', 'fig', 'grape']
Popped: grape
[1, 1, 3, 4, 5, 9]
Key Takeaway: Lists are O(1) for index access and amortized O(1) for append, but O(n) for insert, remove, and search. Use them when order matters.
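
To see the difference between adding at the end and inserting at the front, here is a rough timing sketch. The helper names and the size n are arbitrary choices for this demo, and the exact timings will vary by machine, so no expected output is shown.

```python
import timeit

def build_with_append(n):
    """Add n items at the end: each append is amortized O(1)."""
    items = []
    for i in range(n):
        items.append(i)
    return items

def build_with_insert_front(n):
    """Add n items at the front: each insert shifts all existing items, O(n)."""
    items = []
    for i in range(n):
        items.insert(0, i)
    return items

n = 20_000  # arbitrary size chosen for the demo
t_append = timeit.timeit(lambda: build_with_append(n), number=5)
t_insert = timeit.timeit(lambda: build_with_insert_front(n), number=5)
print(f"append:    {t_append:.4f}s")
print(f"insert(0): {t_insert:.4f}s")
```

On most machines the insert(0) version should be noticeably slower, and the gap widens as n grows.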

Tuples

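from collections import namedtuple

# Unpack a tuple into separate variables
point = (3, 4)
x, y = point
print(f"x={x}, y={y}")

# Swap two variables by packing and unpacking a tuple
a, b = 1, 2
a, b = b, a
print(a, b)

# A namedtuple adds field names to a plain tuple
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)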
Output
x=3, y=4
2 1
3 4

Tuples vs Lists

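Use tuples for fixed collections (coordinates, RGB values). Use lists for dynamic collections (a shopping cart). Tuples are slightly cheaper to create and store than lists, and a tuple whose elements are all hashable can be used as a dict key or a set member.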
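
The two properties that set tuples apart, hashability and immutability, can be sketched as follows. The grid and color data are invented for illustration. The exact wording of the error messages differs between Python versions, so it is printed rather than listed as expected output.

```python
# Tuples are hashable (when their elements are), so they work as dict keys
grid = {}
grid[(0, 0)] = "start"
grid[(2, 3)] = "goal"
print(grid[(2, 3)])  # goal

# A list cannot be used as a key
try:
    grid[[1, 1]] = "wall"
except TypeError as err:
    list_key_error = str(err)
    print("TypeError:", list_key_error)

# Tuples cannot be modified in place
color = (255, 128, 0)
try:
    color[0] = 0
except TypeError as err:
    assign_error = str(err)
    print("TypeError:", assign_error)
```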

Sets

nums = set([1, 2, 2, 3, 3, 3])  # Duplicates removed
print(nums)  # {1, 2, 3}

a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
print(a | b)    # Union: {1,2,3,4,5,6}
print(a & b)    # Intersection: {3,4}
print(a - b)    # Difference: {1,2}
print(3 in a)   # True — O(1) lookup!
Output
{1, 2, 3}
{1, 2, 3, 4, 5, 6}
{3, 4}
{1, 2}
True

⚠️ Common Mistake: Using a List for Membership Testing

Wrong:

# O(n) — scans every element
if 999_999 in list(range(1_000_000)):  # SLOW
    print("Found!")

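Why: Lists check membership by scanning elements one by one, which is O(n). Sets use hash tables, so a lookup is O(1) on average. Building the set is itself O(n), so the gain comes when you build it once and test membership many times.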

Instead:

# O(1) — hash lookup
large_set = set(range(1_000_000))
if 999_999 in large_set:  # FAST
    print("Found!")
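
In practice the pattern usually looks like the sketch below: convert the collection to a set once, then run many lookups against it. The function name count_known and the sample data are hypothetical.

```python
def count_known(words, vocabulary):
    """Count how many items in `words` appear in `vocabulary`.

    The vocabulary is converted to a set once (O(n)), so each
    membership test in the loop is O(1) on average.
    """
    vocab_set = set(vocabulary)
    return sum(1 for w in words if w in vocab_set)

vocabulary = ["apple", "banana", "cherry", "date"]  # example data
words = ["apple", "kiwi", "date", "apple", "plum"]
print(count_known(words, vocabulary))  # 3
```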

Dictionaries

Creating and Accessing

person = {
    "name": "Alice",
    "age": 30,
    "hobbies": ["reading", "coding"]
}

print(person["name"])                # Alice
print(person.get("phone", "N/A"))    # N/A (safe)

person["email"] = "alice@example.com"  # Add

for key, value in person.items():
    print(f"  {key}: {value}")
Output
Alice
N/A
  name: Alice
  age: 30
  hobbies: ['reading', 'coding']
  email: alice@example.com

Useful Methods

scores = {"Alice": 95, "Bob": 87}

scores.setdefault("Charlie", 0)  # Add if missing
print(scores)

# Merge (Python 3.9+)
defaults = {"theme": "light", "lang": "en"}
user = {"theme": "dark"}
final = defaults | user
print(final)
Output
{'Alice': 95, 'Bob': 87, 'Charlie': 0}
{'theme': 'dark', 'lang': 'en'}
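
Two related idioms, sketched with made-up data: setdefault is handy for grouping because it returns the stored value, and dict unpacking gives the same merge result as the | operator on Python versions older than 3.9.

```python
# Group words by first letter; setdefault creates the list on first use
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)
print(by_letter)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Merging before Python 3.9: unpack both dicts; later keys win
defaults = {"theme": "light", "lang": "en"}
user = {"theme": "dark"}
final = {**defaults, **user}
print(final)  # {'theme': 'dark', 'lang': 'en'}
```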

Comprehensions

# List comprehension
squares = [x**2 for x in range(6)]
print(squares)

# With filter
evens = [x for x in range(10) if x % 2 == 0]
print(evens)

# Dict comprehension
word_len = {w: len(w) for w in ["hi", "hello", "hey"]}
print(word_len)

# Set comprehension
unique = {len(w) for w in ["hi", "hello", "hey", "ha"]}
print(unique)
Output
[0, 1, 4, 9, 16, 25]
[0, 2, 4, 6, 8]
{'hi': 2, 'hello': 5, 'hey': 3}
{2, 3, 5}
Key Takeaway: For simple transformations, comprehensions are usually more readable and somewhat faster than the equivalent for-loop with append.
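
For comparison, here is the same computation written both ways. This is only a small sketch with illustrative variable names.

```python
# Loop version
squares_loop = []
for x in range(10):
    if x % 2 == 0:
        squares_loop.append(x**2)

# Comprehension version: same result in one expression
squares_comp = [x**2 for x in range(10) if x % 2 == 0]

print(squares_loop)                  # [0, 4, 16, 36, 64]
print(squares_loop == squares_comp)  # True
```
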
🔍 Deep Dive: Generator Expressions

Replace the square brackets with parentheses to get a lazy generator: sum(x**2 for x in range(1_000_000)). A generator computes values one at a time, so it never holds the whole sequence in memory.
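
A rough way to see the memory difference is sys.getsizeof, which reports the size of the container object itself. The sketch below assumes CPython; exact byte counts vary by version, so they are not shown.

```python
import sys

n = 1_000_000
as_list = [x**2 for x in range(n)]  # all values stored at once
as_gen = (x**2 for x in range(n))   # values produced on demand

# getsizeof counts the container only, not the integers it refers to
print(sys.getsizeof(as_list))  # megabytes for a list this long
print(sys.getsizeof(as_gen))   # small, and independent of n

total = sum(as_gen)
print(total == sum(as_list))   # True

# A generator can be consumed only once
print(sum(as_gen))  # 0
```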

Choosing the Right Structure

Need                 Use           Why
Ordered, mutable     list          Fast append, index access
Fixed data           tuple         Immutable, hashable
Unique elements      set           O(1) lookup
Key-value mapping    dict          O(1) by key
Counting             Counter       from collections
Default values       defaultdict   from collections
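
The last two rows come from the standard-library collections module. A minimal sketch of each, using made-up sample data:

```python
from collections import Counter, defaultdict

# Counter: tally hashable items
counts = Counter("banana")
print(counts["a"])            # 3
print(counts.most_common(1))  # [('a', 3)]
print(counts["z"])            # 0 (missing keys count as zero)

# defaultdict: missing keys get a default built by the factory (here, list)
pets_by_owner = defaultdict(list)
for owner, pet in [("Alice", "cat"), ("Bob", "dog"), ("Alice", "fish")]:
    pets_by_owner[owner].append(pet)
print(dict(pets_by_owner))  # {'Alice': ['cat', 'fish'], 'Bob': ['dog']}
```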