SubRip (.srt) subtitles are plain text with a fixed structure: each cue block has a numeric index, a start --> end time line, and one or more caption lines, and blocks are separated by blank lines. When you convert SRT to plain text (.txt), you usually want only the spoken lines, with no cue numbers and no timestamps, whether for search indexing, LLM input, or a simple export in the browser or Node.js.
This article shows a regex-first way to strip cue headers, a line-skipping variant, and a small block parser that behaves well when a cue spans multiple lines. It also shows how to use the built-in fs module in Node without installing extra packages.
Tested on: Node.js v20.18.2. After each runnable block, a short note describes the plain-text dialogue you should see (blank lines may match the sample spacing).
Quick reference
The table summarizes the steps for converting SRT to plain text with regular expressions in JavaScript.
| Step | Detail |
|---|---|
| Read file | fs.readFileSync(path, "utf8") in Node |
| Strip cue start | Regex on index + HH:MM:SS,mmm --> ... + newline |
| Multi-line cues | Split blocks on blank lines; drop first two lines per block |
| Save srt to txt | fs.writeFileSync("out.txt", plain, "utf8") |
1. Strip cue headers with String.prototype.replace
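Match the start of each cue: a numeric index, a line ending (\r?\n also accepts Windows line endings), the timestamp line, and another line ending. Use the g flag to replace every cue header and the m flag so ^ matches at the start of each line, not only at the start of the string.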
const srt = `1
00:00:51,916 --> 00:00:54,582
London in the 1960s.

2
00:00:54,708 --> 00:00:57,124
Everyone had a story about the Krays.
`;
// In Node.js you can load the same string with:
// const srt = require("node:fs").readFileSync("./captions.srt", "utf8");

// Cue header: index line, then timestamp line, each with its line ending.
const cueHeader =
  /^\d+\r?\n\d{1,2}:\d{2}:\d{2},\d{3} --> \d{1,2}:\d{2}:\d{2},\d{3}\r?\n/gm;

const plain = srt
  .replace(cueHeader, "")      // remove every cue header
  .replace(/\n{3,}/g, "\n\n")  // collapse extra blank lines
  .trim();
console.log(plain);

You should see only the two dialogue lines, separated by a blank line, with no cue indices or timestamps.
\d{1,2} in the hour field accepts one- or two-digit hours, which covers common 00: and 01: style exports. After the headers are stripped, /\n{3,}/g collapses runs of three or more newlines into a single blank line so the output reads like paragraphs.
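The chain in §1 works on clean LF text. Files saved on Windows may start with a byte order mark (which keeps ^\d+ from matching the first cue) and use CRLF endings (which /\n{3,}/g does not collapse). A minimal sketch of normalizing the input first, assuming such a file; the helper name normalizeSrt and the sample string are illustrative and not part of the original article:

```javascript
// Hypothetical helper: prepares raw SRT text before the cue-header regex runs.
function normalizeSrt(raw) {
  return raw
    .replace(/^\uFEFF/, "")   // drop a leading byte order mark, if present
    .replace(/\r\n?/g, "\n"); // CRLF (or a lone CR) becomes LF
}

// After normalization the header regex only needs to handle "\n".
const cueHeader =
  /^\d+\n\d{1,2}:\d{2}:\d{2},\d{3} --> \d{1,2}:\d{2}:\d{2},\d{3}\n/gm;

// Illustrative input with a BOM and Windows line endings.
const raw =
  "\uFEFF1\r\n00:00:51,916 --> 00:00:54,582\r\nLondon in the 1960s.\r\n\r\n" +
  "2\r\n00:00:54,708 --> 00:00:57,124\r\nEveryone had a story about the Krays.\r\n";

const plainText = normalizeSrt(raw)
  .replace(cueHeader, "")
  .replace(/\n{3,}/g, "\n\n")
  .trim();

console.log(plainText);
```

Normalizing once up front keeps every later regex simple, at the cost of one extra pass over the string.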
2. Line loop (skip index, time, and empty lines)
Another way to convert SRT to text is to split the input into lines and skip (continue past) index-only lines, timestamp-only lines, and blanks. Dialogue prints in cue order; blank lines between cues are removed here, so spacing differs from §1.
const srt = `1
00:00:51,916 --> 00:00:54,582
London in the 1960s.

2
00:00:54,708 --> 00:00:57,124
Everyone had a story about the Krays.
`;

// A full timestamp line, anchored at both ends.
const timeLine =
  /^\d{1,2}:\d{2}:\d{2},\d{3} --> \d{1,2}:\d{2}:\d{2},\d{3}$/;

for (const line of srt.split(/\r?\n/)) {
  if (/^\d+$/.test(line)) continue;   // cue index
  if (timeLine.test(line)) continue;  // timestamp line
  if (line.trim() === "") continue;   // blank separator
  console.log(line);
}

Each dialogue line prints on its own console.log line with no blank line between them (the loop skips empty lines).
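One limitation of the loop in this section is that /^\d+$/ also matches dialogue made only of digits, such as a year. A possible refinement, sketched here under the assumption that a real cue index is always followed by a timestamp line, returns a string instead of logging; the function name srtLinesToText and the sample text are hypothetical:

```javascript
// Hypothetical helper: line-based conversion that returns the text.
function srtLinesToText(srt) {
  const timeLine =
    /^\d{1,2}:\d{2}:\d{2},\d{3} --> \d{1,2}:\d{2}:\d{2},\d{3}$/;
  const lines = srt.split(/\r?\n/);
  const out = [];
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (line.trim() === "") continue;  // blank separator
    if (timeLine.test(line)) continue; // timestamp line
    // Treat a digits-only line as a cue index only when a timestamp follows.
    if (/^\d+$/.test(line) && timeLine.test(lines[i + 1] ?? "")) continue;
    out.push(line);
  }
  return out.join("\n");
}

// Illustrative input: the first cue's dialogue is just a number.
const sample = `1
00:00:51,916 --> 00:00:54,582
1984

2
00:00:54,708 --> 00:00:57,124
Everyone had a story about the Krays.
`;

console.log(srtLinesToText(sample));
```

Looking one line ahead costs little and keeps numeric captions, but it still flattens the blank lines between cues.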
3. Block split for multi-line cues
If a cue has two dialogue lines, a regex that removes only the header still leaves both lines, which is what you want. A naive “skip two lines” rule applied without first grouping lines into cues, however, can go wrong. Splitting the input into blank-line-separated blocks is easy to reason about when cues span multiple lines:
function srtToPlain(srt) {
  return srt
    .replace(/\r\n/g, "\n")   // normalize Windows line endings
    .trim()
    .split(/\n\s*\n/)         // one block per cue
    // Drop the index and timestamp lines; keep the dialogue lines.
    .map((block) => block.trim().split("\n").slice(2).join("\n"))
    .filter(Boolean)          // discard blocks with no dialogue
    .join("\n\n");
}

const srt = `1
00:00:00,000 --> 00:00:02,000
Hello there
General Kenobi

2
00:00:02,100 --> 00:00:04,000
Second cue
`;
console.log(srtToPlain(srt));

You should see a block with Hello there and General Kenobi, a blank line, then Second cue. Multi-line cues stay grouped.
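The slice(2) step assumes every block starts with an index line and a timestamp line. If a block is missing its index, slice(2) removes the first dialogue line as well. A more defensive variant, offered only as a sketch, checks what the first lines look like before dropping them; the name srtBlocksToPlain and the malformed sample are assumptions for illustration:

```javascript
// Hypothetical helper: removes header lines only when they are actually present.
function srtBlocksToPlain(srt) {
  const timeLine =
    /^\d{1,2}:\d{2}:\d{2},\d{3} --> \d{1,2}:\d{2}:\d{2},\d{3}/;
  return srt
    .replace(/\r\n/g, "\n")
    .trim()
    .split(/\n\s*\n/)
    .map((block) => {
      const lines = block.trim().split("\n");
      if (/^\d+$/.test(lines[0]) && timeLine.test(lines[1] ?? "")) {
        lines.splice(0, 2); // index + timestamp
      } else if (timeLine.test(lines[0])) {
        lines.splice(0, 1); // timestamp only (index missing)
      }
      return lines.join("\n");
    })
    .filter(Boolean)
    .join("\n\n");
}

// Illustrative input: the second cue has lost its index line.
const malformed = `1
00:00:00,000 --> 00:00:02,000
Hello there
General Kenobi

00:00:02,100 --> 00:00:04,000
Second cue
`;

console.log(srtBlocksToPlain(malformed));
```

Blocks that match neither pattern pass through unchanged, so unexpected input shows up in the output where it can be noticed.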
4. Node.js file I/O (built-in fs)
For scripts that convert an SRT file to a text file, read UTF-8 text with the built-in node:fs module. Do not run npm install fs; fs is part of Node. Apply the same cueHeader chain as in §1 to the string returned by readFileSync(path, "utf8"), then write the result with writeFileSync("out.txt", plain, "utf8").
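Put together, the read, strip, and write steps might look like the sketch below. The function name convertSrtFile is hypothetical, and the demo writes its own sample file into a temporary directory only so that it runs without an existing captions.srt:

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

const cueHeader =
  /^\d+\r?\n\d{1,2}:\d{2}:\d{2},\d{3} --> \d{1,2}:\d{2}:\d{2},\d{3}\r?\n/gm;

// Hypothetical helper: reads an .srt file, strips cue headers, writes a .txt file.
function convertSrtFile(inputPath, outputPath) {
  const srt = fs.readFileSync(inputPath, "utf8");
  const plain = srt
    .replace(cueHeader, "")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
  fs.writeFileSync(outputPath, plain, "utf8");
  return plain;
}

// Demo in a throwaway temp directory (paths are illustrative).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "srt-demo-"));
const inputPath = path.join(dir, "captions.srt");
const outputPath = path.join(dir, "out.txt");
fs.writeFileSync(
  inputPath,
  "1\n00:00:51,916 --> 00:00:54,582\nLondon in the 1960s.\n\n" +
    "2\n00:00:54,708 --> 00:00:57,124\nEveryone had a story about the Krays.\n",
  "utf8"
);

convertSrtFile(inputPath, outputPath);
console.log(fs.readFileSync(outputPath, "utf8"));
```

In a real script you would pass your own input and output paths, for example from process.argv, instead of the temp-directory demo.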
Summary
- Regex replace removes standard cue headers; validate against malformed or WebVTT inputs.
- The line loop is the simplest option but drops the blank lines between cues; the block split keeps multi-line cues grouped.
- Always use utf8 when reading and writing; fs is built into Node, so do not install it from npm.
References
MDN, Node.js, and community references for regex-based SRT cleanup.
- MDN: Regular expressions
- Node.js: fs.readFileSync
- Wikipedia: SubRip (structure of .srt files)
- Stack Overflow: JavaScript replace and cleaning an .srt
