Convert PDF to Word in Python

Andrew Wilson
2 min read · Oct 10, 2024


PDF is a widely used file format known for its fixed layout and cross-platform compatibility, which makes it well suited to sharing and printing documents that must look the same everywhere. Word, on the other hand, is an editable document format that is easy to modify and reformat.

Converting PDF to Word usually stems from the need to edit or reformat the content of a PDF file. This article shows a simple way to convert PDF to Word in Python.

PDF to Word Python Converter Library

Spire.PDF for Python is a third-party Python library that supports generating, processing and converting PDF files. You can install it directly through the following pip command.

pip install Spire.PDF
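To confirm that the install succeeded before running the conversion, a small check along these lines may help. It is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical, and it assumes the distribution is registered under the same name used in the pip command above.

```python
from importlib import metadata


def installed_version(package: str = "Spire.PDF"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # pip has no record of this distribution in the current environment
        return None


if __name__ == "__main__":
    version = installed_version()
    if version:
        print(f"Spire.PDF version: {version}")
    else:
        print("Spire.PDF is not installed; run: pip install Spire.PDF")
```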

How to Convert PDF to Word in Python

Spire.PDF for Python provides a very simple way to convert PDF to Word (Doc or Docx format). Follow the main steps below to complete the task.

  1. Create a PdfDocument object and load a PDF file through its LoadFromFile() method.
  2. To get a .doc file, save the document using the SaveToFile(filename, FileFormat.DOC) method.
  3. To get a .docx file, save the document using the SaveToFile(filename, FileFormat.DOCX) method.
  4. Close the PDF file with the Close() method.

Python Code:

from spire.pdf import PdfDocument
from spire.pdf import FileFormat

# Load a PDF file
pdf = PdfDocument()
pdf.LoadFromFile("file.pdf")

# Convert the PDF file to a Doc file
pdf.SaveToFile("PdfToDoc.doc", FileFormat.DOC)

# Convert the PDF file to a Docx file
pdf.SaveToFile("PdfToDocx.docx", FileFormat.DOCX)

# Close the PDF file
pdf.Close()
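If you convert files regularly, the same calls can be wrapped in a reusable function. The following is only a sketch under a few assumptions: the helper names word_output_path and convert_pdf_to_word are hypothetical, the output file is written next to the input with the extension swapped, and the library is imported inside the function so the path helper can be used on its own. The try/finally block is a design choice that makes sure Close() runs even if loading or saving fails.

```python
from pathlib import Path


def word_output_path(pdf_path, extension=".docx"):
    """Return the Word file path that sits next to `pdf_path` (hypothetical helper)."""
    ext = extension.lower()
    if ext not in (".doc", ".docx"):
        raise ValueError(f"Unsupported Word extension: {extension}")
    return Path(pdf_path).with_suffix(ext)


def convert_pdf_to_word(pdf_path, extension=".docx"):
    """Convert one PDF to .doc or .docx using the calls shown in the article."""
    # Imported here so word_output_path() works without the library installed
    from spire.pdf import PdfDocument, FileFormat

    target = word_output_path(pdf_path, extension)
    file_format = FileFormat.DOCX if target.suffix == ".docx" else FileFormat.DOC

    pdf = PdfDocument()
    try:
        pdf.LoadFromFile(str(pdf_path))
        pdf.SaveToFile(str(target), file_format)
    finally:
        # Close the document even if loading or saving raised an error
        pdf.Close()
    return target
```

Calling convert_pdf_to_word("file.pdf") would then write file.docx to the same folder, and convert_pdf_to_word("file.pdf", ".doc") would write file.doc.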

Output:

[Image: Python PDF to Word]

A PDF file can be converted to an editable Word document with just a few lines of code. Beyond that, the Python PDF library also supports converting PDF to HTML, PDF to Excel, PDF to images, PDF to TIFF, and more.
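For the HTML and Excel cases, the same SaveToFile() pattern can be driven by the output file's extension. Treat the sketch below as an assumption rather than a reference: the enum member names HTML and XLSX are assumed to follow the same naming as DOC and DOCX and should be checked against the library's documentation, and the helper names format_name_for and convert_pdf are hypothetical. Image and TIFF output are left out because they may not go through SaveToFile().

```python
from pathlib import Path

# Extension -> assumed FileFormat member name (HTML and XLSX are assumptions)
FORMAT_NAMES = {
    ".doc": "DOC",
    ".docx": "DOCX",
    ".html": "HTML",
    ".xlsx": "XLSX",
}


def format_name_for(output_path):
    """Return the assumed FileFormat member name for the output file's extension."""
    suffix = Path(output_path).suffix.lower()
    if suffix not in FORMAT_NAMES:
        raise ValueError(f"No format mapped for extension: {suffix!r}")
    return FORMAT_NAMES[suffix]


def convert_pdf(pdf_path, output_path):
    """Save `pdf_path` in the format implied by the extension of `output_path`."""
    # Imported here so format_name_for() works without the library installed
    from spire.pdf import PdfDocument, FileFormat

    # Look the member up by name; raises AttributeError if the assumption is wrong
    file_format = getattr(FileFormat, format_name_for(output_path))

    pdf = PdfDocument()
    try:
        pdf.LoadFromFile(str(pdf_path))
        pdf.SaveToFile(str(output_path), file_format)
    finally:
        pdf.Close()
```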

Written by Andrew Wilson

Explore C#, Java and Python solutions for processing Word/Excel/PowerPoint/PDF files.
