Marshmallow is a small library for structuring, validating, and serializing data in Python. You declare a Schema with fields, call dump() when sending data outward, and load() when accepting untrusted input. The sections below walk through schemas, common field types, validation, nesting, and the hooks that sit around dump() and load().
Install with pip (see pip requirements file for project pinning). For error handling patterns around ValidationError, see Python try except.
Tested on: Python 3.13.3; marshmallow 3.26.2; kernel 6.14.0-37-generic.
What is Marshmallow in Python?
Marshmallow is a serialization and validation library. You describe the expected shape of data once in a Schema class (field names, types, which keys are required, and optional validators). That description then drives two operations: turning Python data into plain dicts you can send as JSON (dump()), and turning untrusted dicts (for example parsed request bodies) into validated Python data (load()).
It does not replace your database or web framework. It sits at the boundary where structured data enters or leaves your program—so you validate early, fail with clear field-level errors, and keep your internal code working with predictable dicts or objects.
A minimal round trip: define fields, load() inbound data, then dump() it again for a response or log line.
```python
from marshmallow import Schema, fields


class BookSchema(Schema):
    title = fields.Str(required=True)
    pages = fields.Int(load_default=0)


payload = {"title": "Python Distilled", "pages": 250}
book = BookSchema().load(payload)  # validate and coerce inbound data
print("after load:", book)
print("after dump:", BookSchema().dump(book))  # serialize back to plain values
```

After load, book is a dict with normalized types (here pages is an int). After dump, you get JSON-serializable values, often strings for dates and decimals, depending on field types. The following sections unpack Schema, individual field types, errors, nesting, and hooks.
Install Marshmallow using pip
```shell
pip install marshmallow
```

Pin a major series in production (for example marshmallow>=3.21,<4) so upgrades stay predictable.
Marshmallow is a third-party package, not part of the standard library, so the examples below need a local environment with it installed: save a snippet to a file and run python example.py, or paste it into python -i.
Create a Marshmallow schema
A schema subclasses Schema and attaches fields as class attributes. Instantiate the class once and reuse it like a small codec:
```python
from marshmallow import Schema, fields


class UserSchema(Schema):
    id = fields.Int(dump_only=True)        # output only
    name = fields.Str(required=True)
    email = fields.Email(required=True)
    age = fields.Int(load_default=None)    # optional on load


schema = UserSchema()
```

This schema expects name and email on input; id is output-only in typical APIs; age may be omitted on load and becomes None.
Common Marshmallow field types
fields.Str, fields.Int, fields.Float, fields.Bool, fields.Email, fields.Url, fields.Date, fields.DateTime, fields.Decimal, fields.List, fields.Dict, and fields.Nested cover most payloads. Use validate helpers such as validate.Length, validate.Range, and validate.OneOf to express rules next to the field.
```python
from marshmallow import Schema, fields, validate


class TagSchema(Schema):
    label = fields.Str(validate=validate.Length(min=1, max=40))
    count = fields.Int(validate=validate.Range(min=0))
```

Serialize data using dump()
dump() walks the object or dict field-by-field and returns a plain dict (order-preserving) ready for json.dumps. It does not run the same validators as load() unless you customize that path—treat dumped data as already trusted.
```python
from marshmallow import Schema, fields


class UserSchema(Schema):
    id = fields.Int(dump_only=True)
    name = fields.Str(required=True)
    email = fields.Email(required=True)
    age = fields.Int(load_default=None)


user = {"id": 1, "name": "Ada", "email": "ada@example.com", "age": 36}
print(UserSchema().dump(user))
```

The result includes id, name, email, and age keys, suitable for JSON encoding.
Deserialize and validate data using load()
load() validates required fields, runs validators, and coerces types. Valid input returns a plain dict:
```python
from marshmallow import Schema, fields


class UserSchema(Schema):
    id = fields.Int(dump_only=True)
    name = fields.Str(required=True)
    email = fields.Email(required=True)
    age = fields.Int(load_default=None)


payload = {"name": "Lin", "email": "lin@example.com"}
print(UserSchema().load(payload))
```

That returns {'name': 'Lin', 'email': 'lin@example.com', 'age': None} because load_default supplies age when missing.
dump() vs load() in Marshmallow
Think in terms of data direction and trust.
load() answers: “Is this dict allowed, and what Python values should I use?” It runs required-field checks, type coercion (strings to numbers, ISO strings to datetime, and so on), and any validators you attached. It is the right place to gate untrusted input (HTTP bodies, CLI flags, config files) before that data reaches your domain logic. When something is wrong, Marshmallow raises ValidationError and you typically map err.messages to HTTP 400 or similar.
dump() answers: “How do I expose this object or dict to the outside world?” It reads attributes or keys and builds a plain dict suitable for json.dumps or logging. By default it does not re-run the same validation pipeline as load(); you assume the value is already valid for your API contract. Use dump() for responses, audit payloads, and message queues—after your business rules have already accepted the data.
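The two calls can intentionally see different fields. A dump_only field (for example id, created_at) appears on output but is not accepted on input: under the default unknown=RAISE policy, sending such a key to load() raises an "Unknown field." error, and it is silently dropped only with unknown=EXCLUDE. A load_only field (for example password) is accepted on input but stripped from serialized output. Constructor options such as only and exclude further narrow which fields participate in a given call without editing the class.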
Required fields and validation
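required=True on a field makes load() fail if the key is missing. A None value is rejected separately, whether or not the field is required, unless allow_none=True is set (allow_none defaults to True only when load_default=None). Combine with field-level validate for richer rules: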
```python
from marshmallow import Schema, fields, validate


class ProfileSchema(Schema):
    username = fields.Str(required=True, validate=validate.Length(min=3))
    bio = fields.Str(load_default="")
```

Handle ValidationError
Wrap load() in try/except and read err.messages—a dict mirroring field paths. For general exception patterns, see Python try except.
```python
from marshmallow import Schema, fields, validate, ValidationError


class LoginSchema(Schema):
    username = fields.Str(required=True, validate=validate.Length(min=3))
    password = fields.Str(required=True, validate=validate.Length(min=8))


try:
    LoginSchema().load({"username": "ab", "password": "short"})
except ValidationError as err:
    print(err.messages)
```

That prints a dict pointing at failing fields (for example length errors under username and password).
Nested schemas in Marshmallow
fields.Nested embeds another schema for structured sub-objects:
```python
from marshmallow import Schema, fields


class AddressSchema(Schema):
    city = fields.Str(required=True)
    country = fields.Str(required=True)


class PersonSchema(Schema):
    name = fields.Str(required=True)
    address = fields.Nested(AddressSchema, required=True)


payload = {
    "name": "Kim",
    "address": {"city": "Seoul", "country": "KR"},
}
print(PersonSchema().load(payload))
```

Serialize and deserialize lists using many=True
When the JSON body is an array of objects, set many=True on the schema instance. Each element is validated independently; errors include the list index in the error path.
```python
from marshmallow import Schema, fields


class UserSchema(Schema):
    id = fields.Int(dump_only=True)
    name = fields.Str(required=True)
    email = fields.Email(required=True)
    age = fields.Int(load_default=None)


rows = [
    {"name": "Ada", "email": "ada@example.com", "age": 36},
    {"name": "Lin", "email": "lin@example.com"},
]
print(UserSchema(many=True).load(rows))
```

For a nested list inside one object, use fields.Nested(SomeSchema, many=True) on the parent field instead.
Format DateTime fields in Marshmallow
Use the format argument to control string output during dump() and accepted string patterns on load():
```python
from datetime import datetime
from marshmallow import Schema, fields


class EventSchema(Schema):
    name = fields.Str()
    starts = fields.DateTime(format="%Y-%m-%d %H:%M:%S")


dt = datetime(2026, 6, 20, 14, 30, 0)
print(EventSchema().dump({"name": "Meet", "starts": dt}))
```

For timezone-aware values and ISO strings, prefer aware datetime objects and document the format your API expects; see also Python datetime for clock and timezone basics.
Use only, exclude, load_only, and dump_only
Constructor arguments only and exclude temporarily narrow which fields participate:
```python
from marshmallow import Schema, fields


class UserSchema(Schema):
    id = fields.Int(dump_only=True)
    name = fields.Str(required=True)
    email = fields.Email(required=True)
    age = fields.Int(load_default=None)


minimal = UserSchema(only=("name", "email")).dump(
    {"id": 1, "name": "Ada", "email": "a@e.com", "age": 40}
)
print(minimal)
```

Field flags load_only and dump_only split read vs write paths; passwords are the classic load_only case:
```python
from datetime import datetime, timezone
from marshmallow import Schema, fields


class AccountSchema(Schema):
    username = fields.Str(required=True)
    password = fields.Str(required=True, load_only=True)
    last_login = fields.DateTime(dump_only=True)


acct = AccountSchema().dump(
    {
        "username": "ada",
        "password": "secret",
        "last_login": datetime(2026, 6, 19, 12, 0, tzinfo=timezone.utc),
    }
)
print(acct)
```

password is accepted on load() but omitted from dump(); last_login appears on output and is not an accepted input key (under the default unknown=RAISE policy, passing it to load() raises an error). The printed dict includes username and last_login only.
Transform data with pre_load, post_load, pre_dump, and post_dump
Method decorators let you normalize or enrich data around validation:
```python
from marshmallow import Schema, fields, pre_load, post_load, pre_dump, post_dump


class DemoSchema(Schema):
    name = fields.Str()

    @pre_load
    def strip_name(self, data, **kwargs):
        # Normalize whitespace before validation runs.
        if isinstance(data, dict) and isinstance(data.get("name"), str):
            data = {**data, "name": data["name"].strip()}
        return data

    @post_load
    def add_meta(self, data, **kwargs):
        # Runs after validation; data is the validated dict.
        data["loaded"] = True
        return data

    @pre_dump
    def pass_through(self, obj, **kwargs):
        # Hand dump() a shallow copy of the input mapping.
        return dict(obj)

    @post_dump
    def tag_dump(self, data, **kwargs):
        # Runs on the serialized dict just before dump() returns it.
        data["dumped"] = True
        return data


print(DemoSchema().load({"name": " ada "}))
print(DemoSchema().dump({"name": "Ada"}))
```

pre_load/post_load wrap load(); pre_dump/post_dump wrap dump().
Common Marshmallow mistakes
- Calling load() on data that still contains JSON strings: parse JSON first so Marshmallow sees dicts and lists.
- Forgetting many=True when the payload is a top-level array.
- Expecting dump() to validate inbound rules: validation defaults target load(); keep that separation clear.
- Reusing a single Schema instance concurrently across threads without care: create one per request if isolation matters.
- Surprised by unknown keys: set policy explicitly on Meta:
```python
from marshmallow import Schema, fields, EXCLUDE


class Loose(Schema):
    name = fields.Str()

    class Meta:
        unknown = EXCLUDE


print(Loose().load({"name": "Ada", "extra": 1}))
```

That loads {'name': 'Ada'} and drops extra instead of raising.
Python Marshmallow quick reference table
| Task | Pattern |
|---|---|
| Serialize | Schema().dump(obj) |
| Deserialize + validate | Schema().load(data) |
| List of objects | Schema(many=True).load([...]) |
| Nested object | fields.Nested(ChildSchema) |
| Nested list | fields.Nested(ChildSchema, many=True) |
| Hide secret on output | load_only=True |
| Read-only API field | dump_only=True |
| Subset of fields | Schema(only=(...)) / exclude=(...) |
| Validation errors | except ValidationError as err: err.messages |
Summary
Marshmallow centers on Schema classes and fields: dump() serializes trusted data for JSON-friendly dicts, while load() deserializes and validates inbound dicts, raising ValidationError when rules fail. Required fields, validators, and ValidationError.messages give structured API errors. fields.Nested and many=True model objects inside objects and lists. DateTime formatting keeps string contracts explicit. only, exclude, load_only, and dump_only separate read and write shapes. pre_load, post_load, pre_dump, and post_dump normalize and enrich data around the core two calls. Wire schemas into a Flask web app once these patterns are clear.

