HTML2PDF 📄

A robust, headless browser-based CLI tool to convert HTML files, online URLs, and entire directories of HTML documents into high-quality PDFs.

Built with Playwright to ensure that complex CSS, background colors, and asynchronous JavaScript content are fully rendered before PDF generation.

✨ Features

Single URL Conversion: Convert any live web page to a PDF.
Local File Conversion: Convert locally saved HTML files with full asset rendering.
Batch Processing: Point the tool at a directory, and it will recursively find and convert all .html and .htm files.
High Fidelity: Retains background colors, images, and waits for network requests to finish (networkidle) to prevent missing assets.
Memory Efficient: Creates fresh page contexts for batch processing to prevent memory leaks during heavy workloads.

🚀 Installation

1. Clone the repository

git clone https://github.com/yourusername/HTML2PDF.git
cd HTML2PDF

2. Install Python Dependencies

It is recommended to use a virtual environment.

pip install -r requirements.txt

3. Install Playwright Browsers

HTML2PDF requires the Chromium browser binary to render pages.

playwright install chromium

💻 Usage

HTML2PDF is operated via a Command Line Interface (CLI).

1. Convert an Online URL

Use the -u (or --url) flag to specify the web address, and -o (or --output) for the output PDF path.

python HTML2PDF.py -u "https://github.com" -o "github_homepage.pdf"

2. Convert a Single Local HTML File

Use the -f (or --file) flag to specify the local file path.

python HTML2PDF.py -f "./sample.html" -o "local_output.pdf"

3. Batch Convert a Directory

Use the -b (or --batch) flag to specify the input directory containing HTML files. The -o flag will act as the output directory.

python HTML2PDF.py -b "./html_inputs" -o "./pdf_outputs"

Note: If the output directory does not exist, the script will create it automatically.

⚙️ How it Works

Unlike traditional converters (e.g., wkhtmltopdf), HTML2PDF spins up a headless Chromium instance. It waits for the networkidle event, ensuring all external CSS, CDN images, and asynchronous JavaScript payloads are fully loaded before capturing the layout as a vector PDF.

📄 License

This project is open-sourced under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
HTML2PDF.py		HTML2PDF.py
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HTML2PDF 📄

✨ Features

🚀 Installation

1. Clone the repository

2. Install Python Dependencies

3. Install Playwright Browsers

💻 Usage

1. Convert an Online URL

2. Convert a Single Local HTML File

3. Batch Convert a Directory

⚙️ How it Works

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HTML2PDF 📄

✨ Features

🚀 Installation

1. Clone the repository

2. Install Python Dependencies

3. Install Playwright Browsers

💻 Usage

1. Convert an Online URL

2. Convert a Single Local HTML File

3. Batch Convert a Directory

⚙️ How it Works

📄 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages