Convert PDF to HTML using Java

PDF lets us share and print read-only documents without losing their formatting. We can easily convert PDF documents to HTML web pages and view them in any browser. In this article, we will learn how to convert PDF documents to HTML web pages using Java.

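This article covers the following topics: setting up the Java API, basic PDF to HTML conversion, converting a range of pages, converting specific pages, converting password-protected PDF files, and adding a watermark during conversion.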

Java API to Convert PDF to HTML — Free Download

To convert PDF to HTML, we will use the GroupDocs.Conversion for Java API. It provides fast, efficient, and reliable file conversion. Please either download the JAR of the API or add the following repository and dependency entries to the pom.xml of a Maven-based Java application (inside the <repositories> and <dependencies> elements, respectively).

<repository>
    <id>GroupDocsJavaAPI</id>
    <name>GroupDocs Java API</name>
    <url>http://repository.groupdocs.com/repo/</url>
</repository>
<dependency>
    <groupId>com.groupdocs</groupId>
    <artifactId>groupdocs-conversion</artifactId>
    <version>21.10.1</version>
</dependency>

PDF to HTML Conversion using Java

We can easily convert PDF documents to HTML webpages programmatically by following the simple steps given below:

  1. Firstly, load a PDF document using the Converter class.
  2. Next, create an instance of the MarkupConvertOptions class.
  3. Then, optionally set various convert options such as FixedLayout, FixedLayoutShowBorders, etc.
  4. Finally, convert PDF to HTML using the Converter.convert() method. It takes the output file path and the convert options as arguments.

The following code sample shows how to convert a PDF document to an HTML webpage using Java.
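A minimal sketch of these steps, assuming the GroupDocs.Conversion JAR from the Maven configuration above is on the classpath. The file names are hypothetical, and the package paths and the setter names setFixedLayout and setFixedLayoutShowBorders are assumptions derived from the option names listed in the steps; please verify them against the API reference for your version.

```java
import com.groupdocs.conversion.Converter;
import com.groupdocs.conversion.options.convert.MarkupConvertOptions;

class PdfToHtml {
    public static void main(String[] args) {
        // Hypothetical paths; replace with your own files.
        String inputPath = "sample.pdf";
        String outputPath = "converted.html";

        // Step 1: load the source PDF document.
        Converter converter = new Converter(inputPath);

        // Step 2: create the options object for HTML (markup) output.
        MarkupConvertOptions options = new MarkupConvertOptions();

        // Step 3 (optional): layout settings. Setter names are assumed
        // from the FixedLayout and FixedLayoutShowBorders options.
        options.setFixedLayout(true);
        options.setFixedLayoutShowBorders(false);

        // Step 4: convert and write the HTML file.
        converter.convert(outputPath, options);
    }
}
```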


Convert Range of Pages from PDF to HTML

We can convert a range of pages of a PDF document to HTML by following the steps given below:

  1. Firstly, load a PDF document using the Converter class.
  2. Next, create an instance of the MarkupConvertOptions class.
  3. Then, set the page number from which the conversion should start.
  4. After that, set the pages count, that is, the total number of pages to convert.
  5. Finally, convert PDF to HTML using the Converter.convert() method.

The following code sample shows how to convert a range of pages from a PDF document to an HTML file using Java.
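As a small illustration of how the start page and the pages count work together, the sketch below lists the pages that a given pair of values selects. The helper pagesCovered is hypothetical and is not part of the API. The commented lines show how the two values would be passed to the API, assuming setters named setPageNumber and setPagesCount on MarkupConvertOptions.

```java
import java.util.ArrayList;
import java.util.List;

class PageRange {
    /** Returns the 1-based page numbers selected by a start page and a pages count. */
    public static List<Integer> pagesCovered(int startPage, int pagesCount) {
        if (startPage < 1 || pagesCount < 1) {
            throw new IllegalArgumentException("startPage and pagesCount must be at least 1");
        }
        List<Integer> pages = new ArrayList<>();
        for (int i = 0; i < pagesCount; i++) {
            pages.add(startPage + i);
        }
        return pages;
    }

    public static void main(String[] args) {
        int startPage = 2;   // first page to convert
        int pagesCount = 3;  // number of consecutive pages to convert

        System.out.println(pagesCovered(startPage, pagesCount)); // prints [2, 3, 4]

        // With GroupDocs.Conversion on the classpath, the same values would be
        // applied roughly as follows (setter names are assumptions):
        //   Converter converter = new Converter("sample.pdf");
        //   MarkupConvertOptions options = new MarkupConvertOptions();
        //   options.setPageNumber(startPage);
        //   options.setPagesCount(pagesCount);
        //   converter.convert("converted.html", options);
    }
}
```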

Convert Specific Pages of PDF to HTML

We can convert specific pages of a PDF document to HTML by following the steps given below:

  1. Firstly, load a PDF document using the Converter class.
  2. Next, create an instance of the MarkupConvertOptions class.
  3. Then, provide the page numbers to convert as a comma-separated list.
  4. Finally, convert PDF to HTML using the Converter.convert() method.

The following code sample shows how to convert specific pages of a PDF document to an HTML file using Java.
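If the page numbers arrive as a comma-separated string, for example from a configuration file, they need to be turned into a list of integers first. The sketch below shows one way to do that. The helper parsePages is hypothetical, and the commented lines assume that MarkupConvertOptions exposes a setPages method that accepts a list of page numbers.

```java
import java.util.ArrayList;
import java.util.List;

class PageListParser {
    /** Parses a string such as "1, 3, 5" into the list [1, 3, 5]. */
    public static List<Integer> parsePages(String commaSeparated) {
        List<Integer> pages = new ArrayList<>();
        for (String token : commaSeparated.split(",")) {
            String trimmed = token.trim();
            if (trimmed.isEmpty()) {
                throw new IllegalArgumentException("Empty page entry in: " + commaSeparated);
            }
            // Integer.parseInt throws NumberFormatException for non-numeric entries.
            int page = Integer.parseInt(trimmed);
            if (page < 1) {
                throw new IllegalArgumentException("Page numbers start at 1, got: " + page);
            }
            pages.add(page);
        }
        return pages;
    }

    public static void main(String[] args) {
        List<Integer> pages = parsePages("1, 3, 5");
        System.out.println(pages); // prints [1, 3, 5]

        // With GroupDocs.Conversion on the classpath, the list would be
        // applied roughly as follows (setPages is an assumption):
        //   Converter converter = new Converter("sample.pdf");
        //   MarkupConvertOptions options = new MarkupConvertOptions();
        //   options.setPages(pages);
        //   converter.convert("converted.html", options);
    }
}
```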

Convert Password-Protected PDF to HTML in Java

We can also convert password-protected PDF documents to HTML webpages by following the steps given below:

  1. Firstly, set the password on an instance of the PdfLoadOptions class.
  2. Next, load the PDF document using the Converter class with the PdfLoadOptions object.
  3. Then, create an instance of the MarkupConvertOptions class.
  4. Finally, convert PDF to HTML using the Converter.convert() method.

The following code sample shows how to convert a password-protected PDF document to an HTML document using Java.
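A minimal sketch of these steps, assuming the GroupDocs.Conversion JAR is on the classpath. The file names and the password are hypothetical. The package paths, the setPassword setter, and the Converter constructor that accepts a load options object are assumptions; some versions of the API may expect the load options to be supplied differently, so please check the API reference for your version.

```java
import com.groupdocs.conversion.Converter;
import com.groupdocs.conversion.options.convert.MarkupConvertOptions;
import com.groupdocs.conversion.options.load.PdfLoadOptions;

class ProtectedPdfToHtml {
    public static void main(String[] args) {
        // Step 1: provide the password through the load options.
        // The password below is a placeholder, not a real credential.
        PdfLoadOptions loadOptions = new PdfLoadOptions();
        loadOptions.setPassword("your-password");

        // Step 2: load the protected PDF together with the load options
        // (constructor signature assumed).
        Converter converter = new Converter("protected.pdf", loadOptions);

        // Step 3: create the options object for HTML output.
        MarkupConvertOptions options = new MarkupConvertOptions();

        // Step 4: convert and write the HTML file.
        converter.convert("converted.html", options);
    }
}
```

In real applications, the password would typically be read from a secure source rather than hard-coded.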

PDF to HTML Conversion with Watermark in Java

We can convert PDF documents to HTML webpages and add watermarks to converted HTML files by following the steps given below:

  1. Firstly, load a PDF document using the Converter class.
  2. Next, create an instance of the WatermarkOptions class.
  3. Then, set various options such as Text, Color, Width, Height, Font, etc.
  4. Next, create an instance of the MarkupConvertOptions class.
  5. After that, assign WatermarkOptions to MarkupConvertOptions.
  6. Finally, convert PDF to HTML using the Converter.convert() method.

The following code sample shows how to convert a PDF document to an HTML document with a watermark.
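A minimal sketch of these steps, assuming the GroupDocs.Conversion JAR is on the classpath. The file names and watermark text are hypothetical. The package path of WatermarkOptions and the setter names (setText, setColor, setWidth, setHeight, setWatermark) are assumptions derived from the option names listed in the steps; please verify them against the API reference for your version.

```java
import java.awt.Color;

import com.groupdocs.conversion.Converter;
import com.groupdocs.conversion.options.convert.MarkupConvertOptions;
import com.groupdocs.conversion.options.convert.WatermarkOptions;

class PdfToHtmlWithWatermark {
    public static void main(String[] args) {
        // Step 1: load the source PDF document.
        Converter converter = new Converter("sample.pdf");

        // Steps 2 and 3: describe the watermark (setter names assumed).
        WatermarkOptions watermark = new WatermarkOptions();
        watermark.setText("Sample watermark");
        watermark.setColor(Color.RED);
        watermark.setWidth(300);
        watermark.setHeight(300);

        // Steps 4 and 5: attach the watermark to the HTML convert options.
        MarkupConvertOptions options = new MarkupConvertOptions();
        options.setWatermark(watermark);

        // Step 6: convert and write the watermarked HTML file.
        converter.convert("converted.html", options);
    }
}
```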


Get a Free License

Please try the API without evaluation limitations by requesting a free temporary license.

Conclusion

In this article, we have learned how to convert PDF documents to HTML web pages in Java, including how to convert a range of pages or specific pages. We have also seen how to convert a password-protected PDF file to HTML and how to add a watermark to the converted file programmatically. You can learn more about the GroupDocs.Conversion for Java API from the documentation. If you have any questions, please feel free to contact us on the forum.
