If you often write small classes that mostly hold data—users, settings rows, API payloads—you have probably duplicated the same __init__, __repr__, and __eq__ boilerplate. Below I start with one runnable @dataclass example, then we build up together: defaults, safe defaults for lists and dicts, your own methods, __post_init__, turning objects into dicts, freezing instances, comparing and ordering, hiding fields from repr or equality, nesting types, a quick inheritance pattern, and how this compares to NamedTuple and SimpleNamespace. When you need every edge case, keep the official dataclasses documentation open beside you—I stay close to what you are most likely to ship in real code.
Tested on: Python 3.13.3; kernel 6.14.0-37-generic.
Simple Python dataclass example
Let's start with the smallest thing that still feels useful: two fields, you construct one object, you print it.
```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


user = User("Alice", 30)
print(user)
```
Run that on your machine and you should get something like User(name='Alice', age=30), a readable repr, without you hand-writing __init__ or __repr__.
What is a dataclass in Python?
Picture a normal Python class, then put @dataclass on top. You have already told Python what each attribute should look like using type annotations; when the class is defined, the decorator reads those annotated names and builds __init__, __repr__, and __eq__ for you (and more if you ask for it). That idea landed in Python 3.7 as PEP 557.
So for you the mental model is simple: you describe the shape of the data once at the top of the class; Python fills in the boring methods you used to copy-paste. If you want to remember what life without it felt like, the next section shows the same idea as a plain class.
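If you want to check what the decorator generated, you can inspect the class after it is defined. This is a small sketch using the standard inspect module and dataclasses.fields(); the Pet class is a made-up example for illustration only.

```python
import inspect
from dataclasses import dataclass, fields


@dataclass
class Pet:  # hypothetical example class
    name: str
    age: int


# The generated __init__ takes one parameter per annotated field.
init_params = list(inspect.signature(Pet).parameters)
print(init_params)  # ['name', 'age']

# fields() returns the Field objects the decorator collected.
field_names = [f.name for f in fields(Pet)]
print(field_names)  # ['name', 'age']

# __repr__ and __eq__ were generated as well.
print(repr(Pet("Milo", 3)))              # Pet(name='Milo', age=3)
print(Pet("Milo", 3) == Pet("Milo", 3))  # True
```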
dataclass vs normal class
Say you need a Book with title, author, and pages. Without dataclasses you usually wire the same three names into __init__, into how the object prints, and into equality—again and again.
```python
class BookPlain:
    def __init__(self, title: str, author: str, pages: int):
        self.title = title
        self.author = author
        self.pages = pages

    def __repr__(self):
        return f"BookPlain(title={self.title!r}, author={self.author!r}, pages={self.pages})"

    def __eq__(self, other):
        # Let Python try the other operand if the types do not match.
        if not isinstance(other, BookPlain):
            return NotImplemented
        return (self.title, self.author, self.pages) == (other.title, other.author, other.pages)
```
With @dataclass you write the fields once and stop there:
```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    author: str
    pages: int
```
You can still call Book("1984", "George Orwell", 328) and use == between two books. The win for you shows up the moment you add or rename a field: you touch one place instead of three. If you want more background on classes and constructors in general, see the class and constructor guides.
Default values in a dataclass
You already know how default arguments work on functions—field defaults work the same way. Anything without a default must come first, then fields with defaults (Python enforces that ordering so the generated __init__ stays valid).
```python
from dataclasses import dataclass


@dataclass
class Profile:
    username: str
    role: str = "member"
    points: int = 0


p = Profile("alex")
print(p)
```
Try creating Profile("alex") without passing role or points; you should see those filled in for you in the printed object.
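To see the ordering rule being enforced, here is a deliberately broken sketch. BrokenProfile is a hypothetical class that puts a defaulted field before a required one; the error is raised when the class is defined, before you ever create an instance.

```python
from dataclasses import dataclass

error_message = ""

try:
    @dataclass
    class BrokenProfile:  # hypothetical class, wrong on purpose
        role: str = "member"
        username: str  # no default, but it follows a defaulted field
except TypeError as exc:
    # Raised at class-definition time, not at construction time.
    error_message = str(exc)
    print(error_message)
```

The exact wording of the message can differ between Python versions, so I would not match on it in real code.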
Use field(default_factory) for lists and dictionaries
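Here is a trap I still see in real code: writing items: list = [] (or items: list[str] = []) as a default. On a plain class, a mutable class-level value like that is shared by every instance. @dataclass refuses the pattern outright: a list, dict, or set default raises ValueError when the class is defined. Either way, you do not want that.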
Instead you hand dataclass a callable that builds a fresh empty container each time—usually list or dict—via field(default_factory=...).
```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    # default_factory is called once per instance to build a new list
    items: list[str] = field(default_factory=list)


cart1 = Cart()
cart2 = Cart()
cart1.items.append("book")
print(cart1.items)
print(cart2.items)
```
Run it: cart1 should show ['book'] while cart2 stays []. That is the behavior you expect when each cart owns its own list. The field() documentation spells out default_factory in more detail if you want the fine print.
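For contrast, this sketch shows what happens without default_factory. PlainCart and BadCart are hypothetical names: the plain class silently shares one list, while @dataclass rejects the same literal default with a ValueError at class definition.

```python
from dataclasses import dataclass


class PlainCart:
    items = []  # class attribute: one list shared by every instance


a, b = PlainCart(), PlainCart()
a.items.append("book")
print(b.items)  # ['book']

mutable_default_error = ""
try:
    @dataclass
    class BadCart:
        items: list = []  # dataclass refuses a list literal as a default
except ValueError as exc:
    mutable_default_error = str(exc)
    print(mutable_default_error)
```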
Add methods to a dataclass
Nothing stops you from adding regular methods next to your fields—it is still a class. I use this when I want a tiny helper (formatting, a computed label) sitting right beside the data it belongs to.
```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    author: str
    pages: int

    def summary(self) -> str:
        return f"{self.title} by {self.author} ({self.pages} pp.)"


b = Book("1984", "George Orwell", 328)
print(b.summary())
```
You should see a single readable line. If your class is turning into a big object graph with lots of behavior, ask yourself whether a plain class (or another pattern) reads clearer for your teammates; dataclasses shine when data is the point.
Use __post_init__ after object creation
The __init__ that dataclass generates calls __post_init__ as its last step, if you defined one on your class. That hook is where I put validation (“if end < start, raise”), normalization, or values derived from other fields.
If a value should not be passed into the constructor at all, mark it with field(init=False) and set it inside __post_init__.
```python
from dataclasses import dataclass, field


@dataclass
class Book:
    title: str
    author: str
    pages: int
    weight_kg: float = field(init=False)  # not a constructor argument

    def __post_init__(self) -> None:
        # Derived from pages once the generated __init__ has set the fields.
        self.weight_kg = round(self.pages * 0.001, 3)


b = Book("1984", "George Orwell", 328)
print(b.weight_kg)
```
You never pass weight_kg into Book(...); it appears after init. You should see a small float based on pages.
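The validation and normalization uses of __post_init__ can be sketched the same way. Booking, guest, start, and end are hypothetical names I picked to match the “if end < start, raise” idea; adapt the checks to your own fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Booking:  # hypothetical example class
    guest: str
    start: date
    end: date

    def __post_init__(self) -> None:
        # Normalization: tidy up user input once, at construction time.
        self.guest = self.guest.strip()
        # Validation: refuse impossible date ranges.
        if self.end < self.start:
            raise ValueError(f"end {self.end} is before start {self.start}")


ok = Booking("  Ana ", date(2025, 1, 1), date(2025, 1, 3))
print(repr(ok.guest))  # 'Ana'

try:
    Booking("Ben", date(2025, 1, 3), date(2025, 1, 1))
except ValueError as exc:
    print(exc)
```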
Convert dataclass to dictionary using asdict()
Sometimes you need JSON-friendly data or you are handing values to something that expects a dict. asdict() walks your instance and builds a dictionary, turning nested dataclasses into nested dicts along the way.
```python
from dataclasses import asdict, dataclass


@dataclass
class User:
    name: str
    age: int


user = User("Alice", 30)
print(asdict(user))
```
You should see {'name': 'Alice', 'age': 30}. If you prefer a tuple in field order instead of a mapping, reach for astuple() the same way.
One heads-up (I come back to this in common mistakes): asdict() is not just “give me __dict__”; it follows the documented recipe, including recursion and copying behavior that can surprise you on large or deeply linked objects.
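Here is a small sketch of that difference, using a hypothetical Playlist class. vars(obj) hands you the instance's own __dict__, so its list is the same object the instance holds; asdict() builds new containers, so edits to the result do not leak back.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Playlist:  # hypothetical example class
    name: str
    tracks: list[str] = field(default_factory=list)


p = Playlist("focus", ["intro"])

copied = asdict(p)               # new dict with a new list inside
copied["tracks"].append("outro")
print(p.tracks)                  # ['intro']

alias = vars(p)                  # the instance's own __dict__
alias["tracks"].append("bonus")
print(p.tracks)                  # ['intro', 'bonus']

# For plain fields like these, the result feeds straight into json.dumps.
payload = json.dumps(asdict(p))
print(payload)                   # {"name": "focus", "tracks": ["intro", "bonus"]}
```

That copying is also why asdict() costs more than reading __dict__ on big object graphs.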
Frozen dataclass for immutable objects
Pass frozen=True when you want the instance to behave more like a value: after construction, attribute writes raise FrozenInstanceError. I use this for small identifiers, coordinates, or anything you do not want changed by accident.
```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


point = Point(1, 2)
try:
    point.x = 10
except FrozenInstanceError as e:
    print(type(e).__name__, str(e))
```
You should see FrozenInstanceError printed along with text telling you the field cannot be assigned.
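Two things usually come along with freezing, sketched below under the assumption that you keep the default eq=True. Frozen instances are hashable, so they work in sets and as dict keys, and dataclasses.replace() gives you a modified copy when you need a "changed" value.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:
    x: int
    y: int


origin = Point(0, 0)

# replace() builds a new instance; origin itself is untouched.
moved = replace(origin, x=5)
print(moved)   # Point(x=5, y=0)
print(origin)  # Point(x=0, y=0)

# Equal frozen points hash the same, so the set holds two entries.
visited = {origin, moved, Point(0, 0)}
print(len(visited))  # 2
```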
Order and compare dataclass objects
Flip on order=True when you want <, >, and friends generated for you. Here is the part that trips people up: Python does not guess which field “matters most.” It walks your fields left to right, like comparing two tuples built from those values.
```python
from dataclasses import dataclass


@dataclass(order=True)
class Score:
    points: int
    name: str


a = Score(90, "Alice")
b = Score(80, "Bob")
print(a > b)
```
You get True because 90 beats 80 on the first field. If points were equal, Python would move on to name and compare strings lexicographically.
So if you had title, then author, then pages, a > comparison looks at title first—not pages alone. When you care about sorting by one column, either declare that field earlier or mark other fields with field(compare=False) so they drop out of comparisons. You are in control of the ordering story.
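As a sketch of the compare=False route, this hypothetical Book orders by pages alone. The titles, authors, and page counts are invented for illustration. Notice the side effect: compare=False also removes those fields from ==, so I show a key function as the alternative when equality should still look at everything.

```python
from dataclasses import dataclass, field
from operator import attrgetter


@dataclass(order=True)
class Book:
    title: str = field(compare=False)   # ignored by <, >, and ==
    author: str = field(compare=False)  # ignored by <, >, and ==
    pages: int


books = [
    Book("Alpha", "A. Writer", 500),
    Book("Beta", "B. Writer", 120),
    Book("Gamma", "C. Writer", 300),
]

by_pages = [b.title for b in sorted(books)]
print(by_pages)  # ['Beta', 'Gamma', 'Alpha']

# Side effect: different books with the same page count now compare equal.
same = Book("Alpha", "A. Writer", 500) == Book("Omega", "Z. Writer", 500)
print(same)  # True

# Alternative that needs no order=True at all: sort with a key.
by_key = [b.title for b in sorted(books, key=attrgetter("pages"))]
print(by_key)  # ['Beta', 'Gamma', 'Alpha']
```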
Exclude fields from repr or comparison
Maybe you do not want a secret token showing up whenever someone prints your object, or you want equality to ignore a volatile cache field. Per-field field() flags let you tune what the generated methods include.
```python
from dataclasses import dataclass, field


@dataclass
class Account:
    username: str
    # Hidden from repr and ignored by ==
    token: str = field(repr=False, compare=False)


a1 = Account("alex", "secret-a")
a2 = Account("alex", "secret-b")
print(a1)
print(a1 == a2)
```
When you print a1, you should not see token in the output, and a1 == a2 should be True because only username participates in equality for you here.
Nested dataclasses
You can nest them like any other type: one dataclass field whose type is another dataclass. When you call asdict() on the outer object, nested dataclasses become nested dicts, which is handy when you are building payloads.
```python
from dataclasses import asdict, dataclass


@dataclass
class Address:
    city: str
    zip_code: str


@dataclass
class Person:
    name: str
    home: Address


p = Person("Riya", Address("Pune", "411001"))
print(asdict(p))
```
You should see something like {'name': 'Riya', 'home': {'city': 'Pune', 'zip_code': '411001'}}.
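Going the other way is not automatic. In this sketch, unpacking the dict straight back into Person leaves home as a plain dict, because dataclasses do not convert values based on annotations; you rebuild the nested object yourself.

```python
from dataclasses import asdict, dataclass


@dataclass
class Address:
    city: str
    zip_code: str


@dataclass
class Person:
    name: str
    home: Address


original = Person("Riya", Address("Pune", "411001"))
data = asdict(original)

# Naive round trip: home stays a dict, not an Address.
naive = Person(**data)
print(type(naive.home).__name__)  # dict

# Rebuild nested dataclasses explicitly.
restored = Person(name=data["name"], home=Address(**data["home"]))
print(restored == original)  # True
```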
Inheritance with dataclasses
You can subclass the same way you already do with regular classes: the child lists extra annotated fields, and the parent fields stay part of the constructor. Just watch the usual rule—non-default fields before defaulted ones—across the whole inheritance chain. When parent __init__ logic still matters, Python super() applies the same cooperative rules as with hand-written constructors.
```python
from dataclasses import dataclass


@dataclass
class Publication:
    title: str


@dataclass
class Magazine(Publication):
    issue: int


m = Magazine("PyMag", 42)
print(m)
```
You should see both title and issue in the repr. If you stack several levels of inheritance, skim the official notes on init=False on parent classes so you are not surprised by constructor shape.
dataclass vs NamedTuple vs SimpleNamespace
Here is how I choose between them in practice:
| Construct | Mutability | Boilerplate | When I reach for it |
|---|---|---|---|
| @dataclass | Mutable unless you set frozen=True | Low for init/repr/eq | App models with fields, defaults, and a few methods |
| typing.NamedTuple / collections.namedtuple | Immutable tuple feel | Low | Small fixed records, unpacking, hashable values when you need them |
| types.SimpleNamespace | Mutable bag of attributes | None | Quick experiments, not a schema I would publish |
If your data is really a row of values, a tuple or named tuple variant might still fit. If you live in string keys and dynamic shape, a dictionary might be simpler than a class. Dataclasses sit in the middle: named, typed fields, less ceremony than rolling everything by hand.
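Here is the same two-field record in all three constructs, as a rough side-by-side sketch. PointDC and PointNT are hypothetical names chosen only to keep the variants apart.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import NamedTuple


@dataclass
class PointDC:
    x: int
    y: int


class PointNT(NamedTuple):
    x: int
    y: int


dc = PointDC(1, 2)
nt = PointNT(1, 2)
ns = SimpleNamespace(x=1, y=2)

dc.x = 10   # fine: dataclasses are mutable by default
ns.z = 3    # fine: a namespace accepts new attributes at any time
x, y = nt   # named tuples unpack like ordinary tuples

try:
    nt.x = 10
except AttributeError:
    print("NamedTuple fields are read-only")

print(nt == (1, 2))   # True: a named tuple is still a tuple
print(dc == (10, 2))  # False: a dataclass only equals its own class
```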
Common mistakes with Python dataclasses
Things I watch for when I review code:
- Forgetting a type annotation on something you meant as a field: without it, dataclass does not treat that name as a field at all, and it stays a plain class attribute.
- Using `items: list[str] = []` (or any mutable literal default): `@dataclass` rejects list, dict, and set defaults with a `ValueError` at class definition, and on a plain class the same pattern shares one list across all instances by accident; use `field(default_factory=list)` instead.
- Expecting Python to reject `User(123, "oops")` at runtime: annotations help you and tools like mypy, but the interpreter itself does not enforce them unless you add checks (often in `__post_init__`).
- Believing `order=True` magically sorts by the field you care about: comparisons walk fields in declaration order; reorder fields or use `compare=False` where needed.
- Growing a dataclass into a huge behavior-heavy hierarchy: sometimes a plain class or composition reads clearer for the next person who opens the file.
- Assuming instances are immutable: by default you can still assign new values to attributes; reach for `frozen=True` when you need immutability.
- Treating `asdict()` as a cheap shallow view: it recursively expands dataclasses and follows the documented copying rules, which matters for performance and for what still aliases.
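Two of those mistakes are easy to see in a few lines. In this sketch, Settings is a hypothetical class with a missing annotation, and the isinstance checks in User.__post_init__ are one simple way to add runtime validation, not something dataclasses do for you.

```python
from dataclasses import dataclass, fields


@dataclass
class Settings:  # hypothetical example class
    theme: str = "light"
    retries = 3  # no annotation: a class attribute, not a field


setting_fields = [f.name for f in fields(Settings)]
print(setting_fields)  # ['theme']


@dataclass
class User:
    name: str
    age: int

    def __post_init__(self) -> None:
        # Annotations are not enforced at runtime, so check by hand.
        if not isinstance(self.name, str):
            raise TypeError(f"name must be str, got {type(self.name).__name__}")
        if not isinstance(self.age, int):
            raise TypeError(f"age must be int, got {type(self.age).__name__}")


try:
    User(123, "oops")
except TypeError as exc:
    print(exc)  # name must be str, got int
```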
Python dataclass quick reference table
| What you want | What you usually write |
|---|---|
| A small data record | @dataclass + annotated fields |
| A scalar default | role: str = "guest" |
| A fresh list or dict per object | items: list[str] = field(default_factory=list) |
| Logic right after construction | def __post_init__(self): ... |
| A field not passed into __init__ | field(init=False) and assign in __post_init__ |
| Immutability | @dataclass(frozen=True) |
| Rich comparisons | @dataclass(order=True) |
| Hide from repr or skip in == | field(repr=False) / field(compare=False) |
| Dict for JSON or logging | asdict(obj) |
| Tuple in field order | astuple(obj) |
Summary
By now you have seen how @dataclass lets you declare fields once and lean on Python for __init__, __repr__, and __eq__, with optional ordering and freezing when you need them. You default scalars directly, use default_factory for mutable defaults, hook __post_init__ for validation and derived values, and use asdict / astuple when another layer of your program wants plain data structures. Keep comparisons honest—field order drives order=True—and remember annotations document intent for you and for static checkers, not automatic runtime guards unless you add them yourself.

