PDF to Markdown online

Convert PDF files to clean Markdown — browser-only, private, no upload.

by CHUNKY MUNSTER
What gets converted
Headings (H1–H4) detected from font size · Bold and italic from font names · Bullet and numbered lists · Paragraph breaks · Horizontal rules between pages · Image placeholder markers
Works best with
Text-based digital PDFs: reports, documentation, ebooks, articles, research papers. Not suitable for scanned PDFs (image-only pages), which require OCR.
Where to use the output
GitHub README · Notion · Obsidian · VS Code · Markdown editors · Static site generators (Jekyll, Hugo, Astro) · AI/RAG data pipelines · LangChain document loaders
100% private
This tool runs entirely in your browser using PDF.js. Your file is never uploaded to any server. No sign-up, no tracking, no cost.

How to Convert a PDF to Markdown

  1. Drop your PDF into the upload area above or click to browse.
  2. Adjust options — heading sensitivity, page breaks, image placeholders.
  3. The Markdown output appears instantly. Switch to Preview to see it rendered.
  4. Copy or download the .md file. Your PDF never leaves your browser.

How the Conversion Works

This tool uses PDF.js to read the raw text content of each page, including each text item's font size, font name, and position. It groups text items into visual lines by Y-coordinate, calculates the body font size from the most common size across the document, and classifies anything significantly larger as a heading (H1–H4). Bold and italic are detected from font names. Bullet and numbered lists are identified by their leading characters. Page boundaries are optionally marked with horizontal rules.
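The steps described above can be illustrated with a minimal sketch. It is not the tool's actual source: the `Item` shape, the 2-unit line tolerance, the heading ratio thresholds, and the font-name patterns are all assumptions chosen for illustration. In a real PDF.js setup the values would be derived from the text content of each page; that mapping is left out here so the example stays self-contained. The sketch also emits one block per visual line and does not merge lines into paragraphs.

```typescript
// Simplified text item (hypothetical shape, not the PDF.js type).
interface Item {
  text: string;
  x: number;
  y: number;        // baseline Y in PDF units (larger = higher on the page)
  fontSize: number;
  fontName: string;
}

// Group items into visual lines: items whose Y lies within `tolerance`
// of the first item on the current line share that line.
function groupIntoLines(items: Item[], tolerance = 2): Item[][] {
  const sorted = [...items].sort((a, b) => b.y - a.y || a.x - b.x);
  const lines: Item[][] = [];
  for (const item of sorted) {
    const last = lines[lines.length - 1];
    if (last && Math.abs(last[0].y - item.y) < tolerance) {
      last.push(item);
    } else {
      lines.push([item]);
    }
  }
  for (const line of lines) line.sort((a, b) => a.x - b.x); // left to right
  return lines;
}

// Body font size = most common size, weighted by character count.
function bodyFontSize(items: Item[]): number {
  const counts = new Map<number, number>();
  for (const it of items) {
    const size = Math.round(it.fontSize * 10) / 10;
    counts.set(size, (counts.get(size) ?? 0) + it.text.length);
  }
  let best = 0;
  let bestCount = -1;
  for (const [size, count] of counts) {
    if (count > bestCount) {
      best = size;
      bestCount = count;
    }
  }
  return best;
}

// Map a size ratio to a heading level 1-4, or 0 for body text.
// The thresholds are illustrative assumptions.
function headingLevel(size: number, body: number): number {
  if (body <= 0) return 0;
  const ratio = size / body;
  if (ratio >= 1.8) return 1;
  if (ratio >= 1.5) return 2;
  if (ratio >= 1.3) return 3;
  if (ratio >= 1.15) return 4;
  return 0;
}

// Wrap text in bold/italic markers based on the font name.
function styled(text: string, fontName: string): string {
  if (!text) return "";
  const bold = /bold|black|heavy/i.test(fontName);
  const italic = /italic|oblique/i.test(fontName);
  if (bold && italic) return `***${text}***`;
  if (bold) return `**${text}**`;
  if (italic) return `*${text}*`;
  return text;
}

function lineToMarkdown(line: Item[], body: number): string {
  const plain = line.map((i) => i.text).join(" ").trim();
  const level = headingLevel(Math.max(...line.map((i) => i.fontSize)), body);
  if (level > 0) return "#".repeat(level) + " " + plain;
  const bullet = plain.match(/^[•▪◦*-]\s+(.*)$/); // leading bullet character
  if (bullet) return "- " + bullet[1];
  if (/^\d+[.)]\s+/.test(plain)) return plain.replace(/^(\d+)[.)]\s+/, "$1. ");
  return line.map((i) => styled(i.text.trim(), i.fontName)).filter((s) => s).join(" ");
}

function toMarkdown(items: Item[]): string {
  const body = bodyFontSize(items);
  return groupIntoLines(items).map((l) => lineToMarkdown(l, body)).join("\n\n");
}
```

Weighting the font-size count by character length keeps a document with many short headings from having its heading size mistaken for the body size.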

Frequently Asked Questions

Does this work with scanned PDFs?

No. Scanned PDFs consist of images rather than actual text. OCR (optical character recognition) is needed to extract text from them, and this tool does not perform OCR. The tool will warn you if pages appear to be image-only.
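One plausible way to flag image-only pages is to check how much text was extracted from each page. The sketch below assumes that heuristic; the function name `findImageOnlyPages` and the 20-character threshold are hypothetical, and the tool's real check may differ.

```typescript
// Return the 1-based numbers of pages whose extracted text contains
// fewer than `minChars` non-whitespace characters.
// The threshold is an illustrative assumption.
function findImageOnlyPages(pageTexts: string[], minChars = 20): number[] {
  const suspicious: number[] = [];
  pageTexts.forEach((text, index) => {
    const visible = text.replace(/\s+/g, "").length;
    if (visible < minChars) suspicious.push(index + 1);
  });
  return suspicious;
}
```

A page that is legitimately short, such as a title page, can also fall under the threshold, so a result from this check is better treated as a warning than as an error.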

Can I use this for AI and RAG pipelines?

Yes. The Markdown output is clean structured text, well suited to chunking and embedding in retrieval-augmented generation (RAG) systems, LangChain document loaders, or any text-based AI pipeline. Like tools such as OpenDataLoader PDF, it outputs Markdown, but it runs entirely in your browser.
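As an illustration of chunking, the exported Markdown can be split at its headings before embedding. This is a minimal sketch of one common approach, not part of the tool; the `Chunk` type and `chunkByHeading` function are hypothetical names, and the sketch does not treat `#` lines inside fenced code blocks specially.

```typescript
interface Chunk {
  heading: string; // heading text, or "" for content before the first heading
  level: number;   // 1-4 for H1-H4, 0 for content before the first heading
  body: string;
}

// Split Markdown into one chunk per H1-H4 heading.
function chunkByHeading(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk = { heading: "", level: 0, body: "" };
  const bodyLines: string[] = [];

  // Close the current chunk and store it if it has any content.
  const flush = (): void => {
    current.body = bodyLines.join("\n").trim();
    if (current.heading || current.body) chunks.push(current);
    bodyLines.length = 0;
  };

  for (const line of markdown.split("\n")) {
    const m = /^(#{1,4})\s+(.*)$/.exec(line);
    if (m) {
      flush();
      current = { heading: m[2].trim(), level: m[1].length, body: "" };
    } else {
      bodyLines.push(line);
    }
  }
  flush();
  return chunks;
}
```

Keeping the heading alongside each body gives the embedding step some context about where the text came from in the document.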

What Markdown flavour does the output use?

Standard CommonMark-compatible Markdown — works in GitHub, Notion, Obsidian, VS Code, and all major Markdown renderers and static site generators.

Is there a file size limit?

No hard limit. Large PDFs will take longer to process. Very large files (100+ pages) may take several seconds depending on your device. The file never leaves your browser.

Why are images shown as placeholders?

Markdown references image files separately rather than embedding them. The tool inserts ![Image](image-page-N) markers where images are detected so you know where to add them manually after export.
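If you extract the images yourself, the placeholders can be swapped for real paths in one pass. The sketch below assumes that N in the marker is the page number written as digits; the function name `fillImagePlaceholders` and the example path are hypothetical.

```typescript
// Replace ![Image](image-page-N) markers with real image paths.
// `pathsByPage` maps a page number to the path of the image for that page.
// Markers with no matching entry are left unchanged.
function fillImagePlaceholders(
  markdown: string,
  pathsByPage: Record<number, string>
): string {
  return markdown.replace(
    /!\[Image\]\(image-page-(\d+)\)/g,
    (match: string, page: string): string => {
      const path = pathsByPage[Number(page)];
      return path ? `![Image](${path})` : match;
    }
  );
}
```

Leaving unmatched markers in place makes it easy to search the file afterwards for images that still need attention.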