Convert HTML to Word DOCX using C#

HTML (HyperText Markup Language) is the standard markup language for web pages and is supported by all browsers. In various cases, we may need to convert HTML files or content from live web pages into Word documents (DOC, DOCX, DOT, DOTM, DOCM). Once the content is in a Word document, its text can be edited and formatted with standard word-processing tools. In this article, we will learn how to convert HTML to a Word DOC or DOCX document using C#.

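The following topics are covered in this article:

  1. C# API to Convert HTML to DOCX
  2. Convert HTML to Word DOCX
  3. Convert a Web Page to Word from URL
  4. Convert an HTML String to Word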

C# API to Convert HTML to DOCX — Free Download

To convert HTML files or web pages to Word processing file formats, we will use the Aspose.Words for .NET API. It is a complete solution to create, edit, convert, or analyze Word documents programmatically. Please either download the DLL of the API or install it using NuGet.

Install-Package Aspose.Words

C# Convert HTML to Word DOCX

We can easily convert HTML files to Word documents programmatically in C# by following the steps given below:

  1. Load an HTML file using the Document class.
  2. Call the Document.Save(string, SaveFormat) method to save the loaded document as “output.docx”.

The SaveFormat enumeration value passed to the Document.Save() method specifies the output format for the converted HTML file. The following code sample shows how to convert an HTML file to DOCX using C#.
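The two steps above can be sketched as follows. This is a minimal example, not a definitive implementation: the class name and the file names "input.html" and "output.docx" are placeholders chosen for illustration, and the Aspose.Words NuGet package is assumed to be installed.

```csharp
using Aspose.Words;

class HtmlFileToDocx
{
    static void Main()
    {
        // Load the HTML file; "input.html" is a placeholder path.
        Document doc = new Document("input.html");

        // Save the loaded document in DOCX format.
        doc.Save("output.docx", SaveFormat.Docx);
    }
}
```

Passing a different SaveFormat value, such as SaveFormat.Doc, together with a matching file extension should produce one of the other Word formats instead.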

Convert HTML to Word in C#

C# Convert a Web Page to Word from URL

We can also convert an HTML web page directly from a live URL to a Word document in C# by following the steps given below:

  1. Firstly, download the web page content as a System.Byte array from the specified URL.
  2. Next, initialize a MemoryStream object with the byte array as its argument.
  3. Then, create an instance of the HtmlLoadOptions class.
  4. After that, create an instance of the Document class and initialize it with the MemoryStream and HtmlLoadOptions objects.
  5. Finally, call the Document.Save(string, SaveFormat) method to save the document as “output.docx”.

The following code sample shows how to convert an HTML web page to DOCX using C#.
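A minimal sketch of these steps is given below, under a few assumptions. The URL is a placeholder, and HttpClient is used here to download the bytes, although any method that returns a byte array would do. Setting BaseUri on the load options is an optional addition so that relative links and images in the page can be resolved. In recent versions of the API, HtmlLoadOptions lives in the Aspose.Words.Loading namespace; older versions may differ. The code also assumes C# 8 or later for the using declarations.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Aspose.Words;
using Aspose.Words.Loading;

class WebPageToDocx
{
    static async Task Main()
    {
        // Placeholder URL; replace it with the page to convert.
        string url = "https://example.com/";

        // Step 1: download the page content as a byte array.
        using HttpClient client = new HttpClient();
        byte[] pageBytes = await client.GetByteArrayAsync(url);

        // Step 2: wrap the bytes in a MemoryStream.
        using MemoryStream stream = new MemoryStream(pageBytes);

        // Step 3: create the load options. BaseUri (optional) helps
        // resolve relative resource paths found in the page.
        HtmlLoadOptions options = new HtmlLoadOptions { BaseUri = url };

        // Step 4: load the HTML from the stream.
        Document doc = new Document(stream, options);

        // Step 5: save the document in DOCX format.
        doc.Save("output.docx", SaveFormat.Docx);
    }
}
```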

C# Convert an HTML String to Word

We can generate a Word document from an HTML string dynamically in C# by following the steps given below:

  1. Firstly, create an instance of the Document class.
  2. Next, create an instance of the DocumentBuilder class with the Document object.
  3. Then, insert HTML into the document using the DocumentBuilder.InsertHtml(string) method.
  4. Finally, save the Word document using the Document.Save(string, SaveFormat) method.

The following code sample shows how to convert an HTML string to DOCX using C#.
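The steps above might look like the sketch below. The HTML string, the class name, and the output file name are illustrative placeholders rather than values from the original sample.

```csharp
using Aspose.Words;

class HtmlStringToDocx
{
    static void Main()
    {
        // Illustrative HTML content; any HTML fragment could be used here.
        string html = "<h1>Sample heading</h1><p>Paragraph with <b>bold</b> text.</p>";

        // Step 1: create an empty document.
        Document doc = new Document();

        // Step 2: attach a DocumentBuilder to the document.
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Step 3: insert the HTML at the builder's current position.
        builder.InsertHtml(html);

        // Step 4: save the document in DOCX format.
        doc.Save("output.docx", SaveFormat.Docx);
    }
}
```

Because InsertHtml writes at the builder's current cursor position, the same approach can be used to add HTML fragments to a document that already has content.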

Get a Free License

Please try the API without evaluation limitations by requesting a free temporary license.

Conclusion

In this brief tutorial, we have learned how to convert HTML files and HTML strings to Word documents using C#. We have also seen how to convert live web pages from a URL to Word DOC or DOCX files programmatically. You can learn more about the Aspose.Words for .NET API in the documentation. If you have any questions, please feel free to contact us on the forum.