Extract Text from PowerPoint in Java

Andrew Wilson
May 12, 2023

If you want to share the text of a PowerPoint presentation without sending the entire document, you can extract the text from it. In this article, you will learn how to extract the text from all slides of a presentation in Java using a free library.

Import Dependency (2 Methods)

● Download the free library (Free Spire.Presentation for Java), unzip it, and add the Spire.Presentation.jar file to your project as a dependency.

● Add the jar dependency directly to your Maven project by adding the following configuration to pom.xml.

<repositories>
    <repository>
        <id>com.e-iceblue</id>
        <name>e-iceblue</name>
        <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>e-iceblue</groupId>
        <artifactId>spire.presentation.free</artifactId>
        <version>5.1.0</version>
    </dependency>
</dependencies>

Extract Text from a PowerPoint Presentation in Java

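Free Spire.Presentation for Java lets you iterate through the slides of a presentation, then through the shapes on each slide, and read the text of every paragraph with the ParagraphEx.getText() method. The sample reads only shapes that are IAutoShape instances, so text held in other shape types is skipped. The following complete sample extracts the text from a PowerPoint file and saves it to a TXT file.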

import com.spire.presentation.*;

import java.io.FileWriter;

public class ExtractText {
    public static void main(String[] args) throws Exception {
        //Create a Presentation instance
        Presentation presentation = new Presentation();

        //Load a sample PowerPoint document
        presentation.loadFromFile("input.pptx");

        //Create a StringBuilder instance to collect the text
        StringBuilder buffer = new StringBuilder();

        //Loop through the slides, their shapes, and each shape's paragraphs
        for (Object slide : presentation.getSlides()) {
            for (Object shape : ((ISlide) slide).getShapes()) {
                //Only shapes that are IAutoShape instances are read
                if (shape instanceof IAutoShape) {
                    for (Object tp : ((IAutoShape) shape).getTextFrame().getParagraphs()) {
                        buffer.append(((ParagraphEx) tp).getText()).append("\n");
                    }
                }
            }
        }

        //Save the extracted text to a txt file; try-with-resources closes the writer
        try (FileWriter writer = new FileWriter("extractText.txt")) {
            writer.write(buffer.toString());
        }
    }
}
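The sample saves the text with FileWriter, which uses the platform default encoding on JDK versions before 18. If the slides contain accented or non-Latin characters, you may prefer to write the file as UTF-8 explicitly. The following is a minimal sketch that uses only the Java standard library. The TextSaver class and the saveAsUtf8 method are hypothetical names chosen for illustration and are not part of Spire.Presentation, and the StringBuilder in main is a stand-in for the one filled by the extraction loop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TextSaver {

    // Hypothetical helper (not a Spire.Presentation API):
    // writes the given text to the target file encoded as UTF-8
    static void saveAsUtf8(Path target, CharSequence text) throws IOException {
        Files.write(target, text.toString().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the StringBuilder filled while looping over the slides
        StringBuilder buffer = new StringBuilder();
        buffer.append("R\u00e9sum\u00e9").append("\n");
        buffer.append("Slide 2 title").append("\n");

        // A temporary file keeps the sketch from overwriting real output
        Path target = Files.createTempFile("extractText", ".txt");
        saveAsUtf8(target, buffer);

        // Read the file back as UTF-8 to confirm the round trip
        String readBack = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        System.out.println(readBack.equals(buffer.toString()));
    }
}
```

In the extraction sample, a call such as TextSaver.saveAsUtf8(Paths.get("extractText.txt"), buffer) could take the place of the FileWriter block, assuming this helper class is added to the same project.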
