HTML to PDF conversion

PDF (Portable Document Format) is one of the most widely used document formats for sharing data and information across platforms. One of its key strengths is that a document’s fidelity stays intact on any platform when it is viewed with an application that follows the Adobe specification. HTML (HyperText Markup Language) is likewise a leading format for web page development, and most web browsers support it. However, PDF is often preferred for distribution because it can be viewed on any device without losing the document's formatting. Therefore, in this article, we discuss how to convert an HTML file to PDF format using a .NET API.

C# API to Convert HTML to PDF

To perform the conversion, we first need to install Aspose.PDF for .NET. The API is available as a NuGet package. Run the following command in the Package Manager Console to install it:

Install-Package Aspose.Pdf

Once the installation is complete, Aspose.PDF for .NET will appear under the Packages folder in Solution Explorer.

Convert HTML to PDF in C#

The following steps convert HTML to PDF using C#:

  1. Create an instance of the License class and apply your license file to remove evaluation limitations from the generated PDF.
  2. Create an HtmlLoadOptions object, passing the base URL of the input HTML as the argument to the HtmlLoadOptions(…) constructor.
  3. Initialize a Document object, passing the input HTML file and the HtmlLoadOptions object as arguments to its constructor.
  4. Call the Save(…) method of the Document object to render the output in PDF format.
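
The steps above can be sketched roughly as follows. This is a minimal sketch rather than a definitive implementation: the file and folder names (Aspose.PDF.lic, input.html, output.pdf) are hypothetical placeholders, and it assumes the Aspose.Pdf NuGet package is referenced by the project.

```csharp
using Aspose.Pdf;

class HtmlToPdfExample
{
    static void Main()
    {
        // Hypothetical paths, shown for illustration only; replace with your own.
        const string licensePath = "Aspose.PDF.lic";
        const string basePath = @"C:\docs\html\";            // used to resolve relative links (images, CSS)
        const string inputHtml = @"C:\docs\html\input.html";
        const string outputPdf = @"C:\docs\output.pdf";

        // 1. Apply the license. Omit these two lines to run in evaluation mode.
        var license = new License();
        license.SetLicense(licensePath);

        // 2. Create the load options with the base path of the input HTML.
        var options = new HtmlLoadOptions(basePath);

        // 3. Load the HTML file using those options.
        using (var document = new Document(inputHtml, options))
        {
            // 4. Save the result; the .pdf extension selects PDF output.
            document.Save(outputPdf);
        }
    }
}
```

Wrapping the Document in a using block releases its resources as soon as the file has been written.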

Embed fonts during conversion

Most HTML pages rely on specific fonts (e.g. fonts from local folders, Google Fonts, etc.), and to preserve the page layout, the same fonts should be embedded during rendering. To control font embedding in the resultant document, use the IsEmbedFonts property of the HtmlLoadOptions class.
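
As an illustration, font embedding might be switched on as shown below. This is a sketch under stated assumptions: input.html and output-embedded-fonts.pdf are hypothetical file names, and the fonts the page uses are assumed to be available at conversion time.

```csharp
using Aspose.Pdf;

class EmbedFontsExample
{
    static void Main()
    {
        // Ask the converter to embed the fonts used by the HTML page.
        var options = new HtmlLoadOptions
        {
            IsEmbedFonts = true
        };

        // "input.html" is a hypothetical file in the working directory.
        using (var document = new Document("input.html", options))
        {
            document.Save("output-embedded-fonts.pdf");
        }
    }
}
```

Embedding fonts generally increases the size of the output file, so it is worth enabling only when layout fidelity on other machines matters.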

The page size and orientation of the output can also be adjusted. The unit of measurement in Aspose.PDF is the point, where 1 point equals 1/72 of an inch. An A3 sheet measures 297 × 420 millimeters, or 11.69 × 16.54 inches, which is about 841.89 × 1190.55 points and rounds to 842 × 1191 points. The following code snippet sets the page size of the resultant document to A3 and the page orientation to landscape.
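
A minimal sketch of the page setup, assuming the page dimensions are set through the PageInfo member of HtmlLoadOptions. The MmToPoints helper and the file names are hypothetical and exist only for this example. Note that 420 mm works out to roughly 1190.55 points, which rounds to 1191.

```csharp
using System;
using Aspose.Pdf;

class PageSizeExample
{
    // Hypothetical helper: 1 inch = 25.4 mm = 72 points.
    static double MmToPoints(double mm) => mm / 25.4 * 72.0;

    static void Main()
    {
        var options = new HtmlLoadOptions();

        // A3 is 297 x 420 mm; convert to points and round to whole points.
        options.PageInfo.Width = Math.Round(MmToPoints(297));   // 842
        options.PageInfo.Height = Math.Round(MmToPoints(420));  // 1191
        options.PageInfo.IsLandscape = true;

        // "input.html" is a hypothetical file in the working directory.
        using (var document = new Document("input.html", options))
        {
            document.Save("output-a3-landscape.pdf");
        }
    }
}
```

Computing the dimensions from millimeters keeps the conversion explicit and makes it easy to switch to another paper size.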

Convert Web page to PDF

Besides converting HTML files, we may also need to convert a web page directly to PDF format. To do so, we first fetch the contents of the remote web page using an HttpClient instance, wrap them in a Stream object, and then pass the Stream instance to the Document object. The content is needed as a Stream because the Document constructor accepts only files or stream objects.

The following steps convert a web page to PDF using C#:

  1. Read the contents of the page using an HttpClient object.
  2. Instantiate the HtmlLoadOptions object and set the base URL.
  3. Initialize a Document object and pass the stream object and HtmlLoadOptions instance as arguments.
  4. Call the Save(String) method from the Document class to generate the output.
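
One possible way to put these steps together is sketched below. The URLs and the output file name are hypothetical, and the page is buffered into a MemoryStream as a design choice for this example (so the stream is seekable) rather than as a requirement stated by the API. An async Main method requires C# 7.1 or later.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Aspose.Pdf;

class WebPageToPdfExample
{
    static async Task Main()
    {
        // Hypothetical URLs, for illustration only.
        const string pageUrl = "https://example.com/articles/page.html";
        const string baseUrl = "https://example.com/articles/";

        using (var client = new HttpClient())
        {
            // 1. Read the contents of the page.
            byte[] html = await client.GetByteArrayAsync(pageUrl);

            using (var stream = new MemoryStream(html))
            {
                // 2. Set the base URL so relative links in the page can be resolved.
                var options = new HtmlLoadOptions(baseUrl);

                // 3. Load the page from the stream.
                using (var document = new Document(stream, options))
                {
                    // 4. Generate the PDF output.
                    document.Save("webpage.pdf");
                }
            }
        }
    }
}
```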

Render complete HTML on a single page

During HTML to PDF conversion, the length of the resultant file depends on the amount of content in the input HTML document. If the content does not fit on one page, the resultant file spans multiple pages. However, we may confine the output to a single PDF page by using the IsRenderToSinglePage property of the HtmlLoadOptions class.

Given below is the code snippet for rendering the complete HTML content on a single PDF page using C#.
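
A minimal sketch, assuming a hypothetical input.html file in the working directory:

```csharp
using Aspose.Pdf;

class SinglePageExample
{
    static void Main()
    {
        // Render the whole HTML document onto one PDF page.
        var options = new HtmlLoadOptions
        {
            IsRenderToSinglePage = true
        };

        // "input.html" is a hypothetical file name.
        using (var document = new Document("input.html", options))
        {
            document.Save("output-single-page.pdf");
        }
    }
}
```

With long HTML input, the single output page can become very tall, so this option suits on-screen reading better than printing.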

Get a Free License

You may request a free temporary license to try the API without any evaluation limitations.

Conclusion

In this article, we have learned how to convert HTML files to PDF format using a .NET API. If you would like to learn about other features offered by Aspose.PDF for .NET, please visit the Key features page. A complete set of examples can be found in the GitHub repository.

Quick Tip

We have also developed free online applications for quickly trying out the features offered by our APIs. You may use the Aspose.PDF Conversion App to convert an HTML file to PDF format. The app also supports various other file formats for your conversion requirements.