Get File Extension in Python: pathlib, os.path, and edge cases

Learn how to get a file extension in Python using pathlib (suffix, suffixes), os.path.splitext, case-insensitive checks, multi-part names like .tar.gz, and common pitfalls.

Published

Updated

Read time 4 min read

Reviewed byDeepak Prasad

Get File Extension in Python: pathlib, os.path, and edge cases

File extensions show up in filtering uploads, routing parsers, and building output paths. In modern Python, pathlib.Path is usually the clearest tool; os.path.splitext remains useful on plain strings. This guide walks through those APIs, double extensions such as .tar.gz, practical checks, odd filenames, mistakes to avoid, and a short cheat sheet.

Tested on: Python 3.13.3; kernel 6.14.0-37-generic.


Method 1: Get file extension using pathlib

pathlib is in the standard library (since Python 3.4). Build a Path from a string; .suffix returns the last extension including the leading dot, or an empty string when there is none. The related property Path.stem is the final path segment with the last suffix removed (for report.csv, stem is report).

Example (filename only):

python
from pathlib import Path

name = Path("report.csv")
print(name.suffix)

Running that prints .csv (with the dot).

Example (full path):

python
from pathlib import Path

path = Path("/home/user/project/data/report.csv")
print(path.suffix)
print(path.name)

The suffix is still .csv; path.name is report.csv.

Example (case-insensitive check):

Comparisons should normalize case, because Path("data.CSV").suffix is .CSV, not .csv.

python
from pathlib import Path

path = Path("Report.CSV")
if path.suffix.lower() == ".csv":
    print("Treat as CSV")

Method 2: Get file extension using os.path.splitext

os.path.splitext takes a path string and returns (root, ext) where ext is the part from the last dot onward, including the dot. If there is no extension, ext is ''.

Example (split basename):

python
import os

root, ext = os.path.splitext("document.txt")
print(root, ext)

That prints document and .txt.

Example (path with directories):

python
import os

path = "/var/log/nginx/access.log.1"
root, ext = os.path.splitext(path)
print("root:", root)
print("extension:", ext)

Here ext is .1 (only the final segment after the last dot). For logs like access.log, you get .log; if you need the “double” story, use Method 3.


Method 3: Get multiple extensions like .tar.gz

Path.suffix only reflects the last piece: for archive.tar.gz it is .gz. Use Path.suffixes for every dotted suffix segment.

Example (last suffix only):

python
from pathlib import Path

path = Path("archive.tar.gz")
print(path.suffix)

Output shape: .gz.

Example (all suffix segments):

python
from pathlib import Path

path = Path("archive.tar.gz")
print(path.suffixes)

That yields a list like ['.tar', '.gz']. For document.v2.0.txt, suffixes is ['.v2', '.0', '.txt']—each dotted chunk after the first dot in the basename is modeled as its own suffix, so treat suffixes as “split on dots in the name,” not always “semantic version triple.”

os.path.splitext on archive.tar.gz returns ('archive.tar', '.gz'), same “last extension only” rule as Path.suffix.


Method 4: Check file extension in Python

Use normalized suffix checks instead of substring hacks. Extension checks belong next to other validation (size, MIME, permissions) when security matters.

Example (single allowed type):

python
from pathlib import Path

path = Path("export.csv")
if path.suffix.lower() == ".csv":
    print("CSV branch")

Example (allow-list):

python
from pathlib import Path

allowed = {".csv", ".tsv", ".txt"}
path = Path("notes.TXT")
if path.suffix.lower() in allowed:
    print("Allowed text-like type")

For more on existence checks before opening files, see check if a file exists. For writing outputs, see write to file in Python. CSV routing often pairs extension checks with read and write CSV.

Names without a dot in the basename (README) and typical dotfiles (.gitignore) yield an empty suffix and an empty suffixes list. For those, os.path.splitext still returns a pair: ('README', '') and ('.gitignore', '')—the extension slot is empty, not missing. Multiple dots in one name still give a single final suffix (for example .txt on document.v2.0.txt); see Method 3 for how suffixes breaks that basename into segments.


Common mistakes

  1. Using split(".") for every filename — name.split(".")[-1] drops the leading dot, breaks on paths with directories, and mishandles names with no dot (the whole string becomes the “extension”). Prefer Path.suffix or os.path.splitext.

  2. Forgetting that the extension includes the dot — suffix and splitext give .csv, not csv. Comparisons like suffix == "csv" silently fail.

  3. Trusting the extension as full file validation — extensions are hints only. For uploads or security-sensitive flows, combine size limits, allow-lists, and content checks (for example mimetypes or reading magic bytes); do not rely on a renamed .exe posing as .txt.


Python file extension cheat sheet

Goal Typical approach
Last extension with dot Path(name).suffix or os.path.splitext(name)[1]
All dotted suffix segments Path(name).suffixes
Root + ext tuple os.path.splitext(path)
Name without last suffix Path(name).stem (for a.b.txta.b)
Case-insensitive rule path.suffix.lower() in {".csv", ".tsv"}
No extension in basename suffix is ''; splitext second part is ''
Dotfile like .gitignore Expect empty suffix; use full name rules if needed

Official references: pathlib.Path, os.path.splitext.


Summary

Use pathlib.Path.suffix for the final extension (including the dot), Path.stem when you need the filename with that suffix stripped once, suffixes when double endings such as .tar.gz matter, and os.path.splitext when you want a (root, ext) pair on strings. Normalize with .lower() for comparisons, remember dotfiles often have an empty suffix, and avoid split(".") for production path logic. Extension checks are not a substitute for real content validation on untrusted files. Related: split a string, try / except.


Frequently Asked Questions

1. Should I use pathlib or os.path.splitext to get the extension?

Prefer pathlib.Path for new code: .suffix and .suffixes are clear and work well with joins; os.path.splitext is fine when you already work with str paths and want a (root, ext) tuple.

2. What is the difference between suffix and suffixes on a Path?

suffix is only the final suffix including its leading dot (for example .gz on archive.tar.gz); suffixes is a list of every dotted segment treated as a suffix (for example [.tar, .gz]).

3. Does the extension include the dot?

Yes for pathlib.suffix and the second value from os.path.splitext: they return .csv not csv. Compare with .lower() when you need case-insensitive rules.
Bashir Alam

Data Analyst and Machine Learning Engineer

Computer Science graduate from the University of Central Asia, currently employed as a full-time Machine Learning Engineer at uExel. His expertise lies in OCR, text extraction, data preprocessing, and …