HTML to Markdown Converter
Paste HTML source or upload a file and get clean Markdown — optimised for AI chatbot training and LLM ingestion. Free, no signup, and nothing you submit is stored.
Drop your HTML file here, or click to browse
One .html or .htm file, up to 1 MB. Need bigger files or more files? Start a free trial and you can train your AI on multiple files at once, each up to 30 MB.
Converting a live page instead? Try the URL to Markdown converter →
How to convert HTML to Markdown
There are several good ways to convert HTML to Markdown, and the right one depends on how often you do it and how much code you want to write. The converter on this page is the fastest for one-off conversions; if you're batch-converting hundreds of pages, a Python or Node.js library will serve you better. Here are the methods that actually work.
(Searching for html to md? Same thing — .md is simply Markdown's file extension, and this page converts HTML to .md files.)
Use this free converter (fastest)
1. Paste your HTML into the box at the top of this page, or switch to Upload file and drop a .html file (up to 1 MB either way).
2. Click Convert to Markdown. Conversion runs in seconds, right here, with no signup.
3. Copy the Markdown or download it as a .md file.
Under the hood it's the same engine Resolve247 uses to ingest web content for AI chatbot training: it keeps the heading structure, lists, links and tables, and strips the noise — scripts, styles, navigation and other boilerplate that pollutes LLM context. Your HTML is processed in memory and never stored.
Python: markdownify (best for batches)
markdownify is the most direct Python route — it walks the HTML tree and emits Markdown, preserving headings, links, lists and tables.
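Install:

    pip install markdownify

Convert a saved page:

    from markdownify import markdownify as md

    # Read the saved page, then convert it
    with open("page.html", encoding="utf-8") as f:
        html = f.read()

    print(md(html, heading_style="ATX"))  # ATX = "#" headings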
Note it converts everything it's given — nav bars, footers and cookie banners included. For clean output, pass it just the content element, or pre-extract it with BeautifulSoup.
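The pre-extraction step can be sketched without extra dependencies. This version swaps in the standard library's `html.parser` in place of BeautifulSoup, and it assumes the page wraps its content in a `<main>` or `<article>` element, which not every page does. The names `ContentExtractor` and `extract_content` are illustrative, not part of markdownify.

```python
from html.parser import HTMLParser


class ContentExtractor(HTMLParser):
    """Collect the raw HTML inside the first <main> or <article> element."""

    def __init__(self, targets=("main", "article")):
        # convert_charrefs=False so entities like &amp; are re-emitted unchanged
        super().__init__(convert_charrefs=False)
        self.targets = set(targets)
        self.target = None   # name of the content tag we matched, once found
        self.depth = 0       # open tags of that name, to handle nesting
        self.parts = []

    @property
    def inside(self):
        return self.depth > 0

    def handle_starttag(self, tag, attrs):
        if self.inside:
            if tag == self.target:
                self.depth += 1
            self.parts.append(self.get_starttag_text())
        elif self.target is None and tag in self.targets:
            self.target, self.depth = tag, 1   # enter the content element

    def handle_startendtag(self, tag, attrs):
        if self.inside:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.inside:
            return
        if tag == self.target:
            self.depth -= 1
            if self.depth == 0:
                return                          # closing tag of the content element
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.inside:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.inside:
            self.parts.append(f"&#{name};")


def extract_content(html, targets=("main", "article")):
    """Return the inner HTML of the content element, or the input if none is found."""
    parser = ContentExtractor(targets)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip() or html
```

Passing the result on is then one call: `md(extract_content(html), heading_style="ATX")`. Pages without a recognisable content element fall through unchanged, so the output is never worse than converting the whole page.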
Python: html2text
html2text is the long-standing alternative — highly configurable (ignore links, ignore images, body width) and good when you want plainer, reading-oriented output.
Install:

    pip install html2text

Convert a saved page:

    import html2text

    h = html2text.HTML2Text()
    h.body_width = 0  # don't hard-wrap lines

    with open("page.html", encoding="utf-8") as f:
        print(h.handle(f.read()))
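Both Python libraries above boil down to a string-in, string-out call, so batch conversion is mostly a file loop. Here is a minimal sketch of that loop; `convert_tree` and its arguments are hypothetical names, and the converter is passed in as a plain function so the loop itself depends only on the standard library.

```python
from pathlib import Path
from typing import Callable


def convert_tree(src_dir: str, dst_dir: str, convert: Callable[[str], str]) -> list:
    """Convert every .html/.htm file under src_dir to a .md file under dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    written = []
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".html", ".htm"}:
            continue
        # Mirror the source folder layout, swapping the extension for .md
        target = dst / path.relative_to(src).with_suffix(".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        html = path.read_text(encoding="utf-8", errors="replace")
        target.write_text(convert(html), encoding="utf-8")
        written.append(target)
    return written
```

With markdownify that might look like `convert_tree("site", "markdown", lambda h: md(h, heading_style="ATX"))`; with html2text, pass `html2text.HTML2Text().handle` instead. Reading with `errors="replace"` is a deliberate choice here so one badly encoded page doesn't stop a run of hundreds.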
Node.js: turndown
In a JavaScript stack, turndown is the standard — it runs in Node or the browser, and a plugin adds GitHub-flavored tables and strikethrough.
npm install turndown turndown-plugin-gfm
    const fs = require("fs");
    const TurndownService = require("turndown");
    const { gfm } = require("turndown-plugin-gfm");

    const html = fs.readFileSync("page.html", "utf8"); // the saved page to convert
    const td = new TurndownService({ headingStyle: "atx" });
    td.use(gfm); // tables, strikethrough, task lists

    console.log(td.turndown(html));
CLI: pandoc (one-liner)
Unlike PDF (which pandoc can't read), HTML is one of pandoc's strongest input formats — a single command converts a saved page to GitHub-flavored Markdown.
    # Install (macOS shown; on Debian/Ubuntu use: apt install pandoc)
    brew install pandoc

    # Convert a saved page to GitHub-flavored Markdown
    pandoc -f html -t gfm page.html -o page.md
Pandoc converts the whole document faithfully — which means nav, footers and inline junk come along too. Converters built for LLM ingestion (like this page) exist to strip that boilerplate for you.
Quick guide to Markdown formatting
New to Markdown? It expresses formatting with plain characters instead of tags — which is exactly why LLMs parse it so reliably. Here's how to read (and write) everything this converter produces:
| Formatting | Markdown format | Notes |
|---|---|---|
| Heading | # Title ## Section ### Sub | 1–6 # marks set heading levels 1–6 — the equivalents of <h1>–<h6>. |
| Bold | **bold text** | Renders as bold text (<strong>). |
| Italic | *italic text* or _italic text_ | Renders as italic text (<em>). |
| Bold + italic | ***both*** | Renders as both. |
| Underline | — | There's no underline syntax in Markdown — <u> content converts as plain text. |
| Strikethrough | ~~crossed out~~ | Renders as crossed-out text (<del>). |
| Bullet list | - item or * item | One item per line (<ul><li>); indent two spaces to nest. |
| Numbered list | 1. first item | Numbers auto-correct when rendered — 1. on every line also works. |
| Link | [link text](https://example.com) | Text in square brackets, URL in parentheses. |
| Image | ![alt text](https://example.com/image.png) | A link with a leading !. (This converter outputs text only.) |
| Inline code | `code` | Backticks render text in monospace. |
| Code block | ``` … ``` | Triple backticks on their own lines fence off a multi-line block (<pre>). |
| Quote | > quoted text | A > at the start of a line renders a blockquote. |
| Table | \| Col \| Col \| | Pipes separate cells; a \| --- \| --- \| row under the header row defines the table. |
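
To see the table syntax from the writing side, here is a small sketch that builds a pipe table from Python lists. `markdown_table` is a hypothetical helper, and the backslash escaping of literal pipes assumes a GitHub-flavored Markdown renderer.

```python
def markdown_table(headers, rows):
    """Render headers and rows as a pipe table, escaping literal pipes in cells."""
    def cell(value):
        # A raw "|" would start a new column, and a newline would end the row
        return str(value).replace("|", "\\|").replace("\n", " ")

    lines = [
        "| " + " | ".join(cell(h) for h in headers) + " |",
        "|" + "|".join("---" for _ in headers) + "|",   # separator row under the header
    ]
    for row in rows:
        lines.append("| " + " | ".join(cell(v) for v in row) + " |")
    return "\n".join(lines)
```

Calling `markdown_table(["Formatting", "Markdown format"], [["Bold", "**bold text**"]])` yields a header row, a `|---|---|` separator and one body row, the same shape as the table above.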
Why convert HTML to Markdown for AI?
A typical web page is mostly not content. View source on any article and the words you actually read are buried in scripts, stylesheets, navigation, cookie banners, tracking pixels and div soup — routinely 90% of the bytes. Feed raw HTML to an LLM and you pay for every one of those junk tokens, while the markup noise actively distracts the model from the meaning.
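You can check the ratio on your own pages. The sketch below estimates what share of a page's bytes is text at all, using only the standard library; `VisibleText` and `text_share` are illustrative names. It skips `<script>` and `<style>` contents but still counts navigation and footer text, so treat the result as an upper bound on real content.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.skip = 0        # > 0 while inside a script or style element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        if not self.skip:
            self.chunks.append(data)


def text_share(html: str) -> float:
    """Fraction of the page's UTF-8 bytes that is text outside script/style."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())   # collapse whitespace runs
    total = len(html.encode("utf-8"))
    return len(text.encode("utf-8")) / total if total else 0.0
```

Run it as `text_share(open("page.html", encoding="utf-8").read())`; a low number means most of what you would feed the model is markup rather than words.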
Markdown is the opposite: pure structure, near-zero overhead. Headings stay headings, lists stay lists, tables stay tables — and everything else disappears. That structure is what makes RAG pipelines work well: chunking a document on its real heading boundaries keeps each chunk coherent, which directly improves retrieval and answer quality.
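Heading-boundary chunking is easy to sketch once the input is Markdown. The function below is a minimal illustration, not Resolve247's pipeline: `chunk_by_heading` and `max_level` are assumed names, it only recognises "#"-style headings, and it ignores "#" lines inside fenced code blocks.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3   # a code-fence marker, built this way to keep the example readable


def chunk_by_heading(markdown: str, max_level: int = 2):
    """Split Markdown into chunks that each start at a heading of level <= max_level."""
    chunks, current, in_fence = [], {"heading": None, "lines": []}, False

    def flush():
        # Keep a chunk only if it has a heading or some non-blank text
        if current["heading"] or any(line.strip() for line in current["lines"]):
            chunks.append(current)

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match and len(match.group(1)) <= max_level:
            flush()
            current = {"heading": match.group(2), "lines": []}
        current["lines"].append(line)
    flush()
    return [
        {"heading": c["heading"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

With the default `max_level=2`, each `#` or `##` section becomes one chunk and deeper subsections stay with their parent, which keeps related text together for embedding. Raising `max_level` gives smaller, more specific chunks.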
It's also why Markdown is the standard input for AI chatbot training. When Resolve247 trains a support chatbot on your website or help centre, this exact conversion runs first — clean source material is half of what makes an anti-hallucination guarantee possible. An AI can only answer from your content reliably if your content was ingested cleanly.
And beyond AI: Markdown is plain text. It diffs in git, edits in any editor, and converts onwards to anything. Once your knowledge is out of the markup, it's portable for good.
Want to train an AI chatbot on this data?
Your clean Markdown is chatbot training material. Start a 30-day free trial of Resolve247 and turn it into an AI support agent that answers your customers 24/7 — and never makes things up.
Start a Free Trial: 30 days, no credit card required.
HTML to Markdown FAQ
Is this HTML to Markdown converter free?
Yes. Paste HTML or upload a file and download the Markdown with no signup, no card and no email. There's a fair-use rate limit to keep it fast for everyone — a Resolve247 free trial removes it.
Is my HTML stored?
Nothing is stored. Whether you paste source or upload a file, it's converted in memory and the Markdown is returned in the same request — we keep neither the HTML nor the output once the response is sent.
Should I paste HTML or upload a file?
Whichever is easier — both run through the same engine and produce identical Markdown. Pasting is quickest when the source is already in your clipboard or you're converting a fragment; uploading suits saved .html or .htm files. Both are capped at 1 MB.
What is the maximum HTML size?
1 MB per conversion, whether pasted or uploaded — that's a lot of HTML, since markup is text. Need more? A Resolve247 free trial lets you train your AI on multiple files at once, each up to 30 MB.
Does it remove scripts, styles and boilerplate?
Yes. <script> and <style> blocks are removed entirely, and boilerplate like navigation, footers and cookie banners is stripped where it's detectable — that's most of what makes the output clean enough for LLM ingestion. Content markup (headings, lists, links, tables, emphasis) is preserved as Markdown.
Can I convert a live web page from its URL?
That's what our URL to Markdown converter is for — paste the link and it fetches the HTML for you. Use this page when you already have the source: a saved file, a CMS export, or markup from view-source or your clipboard.
Can I convert a full HTML document or just a fragment?
Both. You can paste a complete document (<!doctype html> down to </html>) or just a fragment — a table, an article body, a few paragraphs. The converter handles either, and imperfect real-world markup is fine too.
How do I convert HTML to MD?
HTML to MD is the same thing as HTML to Markdown — .md is simply the file extension Markdown uses. To convert HTML to MD, paste your source (or upload a .html file) in the converter above, click Convert to Markdown, and download the output as a .md file.
Is the output suitable for LLMs and AI chatbot training?
Yes — that's what it's built for. This is the same conversion engine Resolve247 uses to ingest web content for its AI support chatbots, so the output is structured for chunking, embedding and chatbot training.