
Reading HTML in C# lets your .NET applications work with web content, from simple data extraction to more involved web scraping tasks. This blog post covers how to read HTML in C# using three approaches: loading an HTML file, navigating its nodes, and reading a page from a URL as a string.
Configure HTML Reader API in C#
You can download the API from the New Releases section, or install Conholdate.Total for .NET from the NuGet gallery by running the following command in the Package Manager Console in Visual Studio:
PM> NuGet\Install-Package Conholdate.Total
Read an HTML File in C#
HTML (Hypertext Markup Language) is the backbone of web pages, defining the structure and content of websites. When you access a web page, your browser interprets the HTML code and renders it into a visual layout. To read the content of an HTML file in C#, follow these steps:
- Load the source HTML file into an instance of the HTMLDocument class.
- Read the HTML content using the OuterHTML property.
The code snippet below demonstrates how to read an HTML file using C#:
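As a minimal sketch of the two steps above, the example below assumes that the HTMLDocument class is exposed through the `Aspose.Html` namespace of the installed package and that OuterHTML is read from the document's root element. The class name `HtmlFileReader`, the method `ReadOuterHtml`, and the sample file are hypothetical and exist only for illustration.

```csharp
using System;
using System.IO;
using Aspose.Html; // assumption: HTMLDocument is provided by this namespace

public static class HtmlFileReader
{
    // Hypothetical helper: loads an HTML file and returns its markup as a string.
    public static string ReadOuterHtml(string path)
    {
        // Load the source HTML file into an HTMLDocument instance.
        using (var document = new HTMLDocument(path))
        {
            // OuterHTML on the root element returns the serialized markup of the document.
            return document.DocumentElement.OuterHTML;
        }
    }

    public static void Main()
    {
        // Create a small sample file so the sketch does not depend on existing data.
        string path = Path.Combine(Path.GetTempPath(), "sample.html");
        File.WriteAllText(path, "<html><body><p>Hello, HTML!</p></body></html>");

        Console.WriteLine(ReadOuterHtml(path));
    }
}
```

Wrapping the document in a `using` statement releases its resources as soon as the markup has been read.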
Navigate HTML File to Read HTML Contents in C#
To navigate an HTML file and read its contents in C#, follow these steps:
- Prepare HTML code and create an HTMLDocument object.
- Get the reference to the first child (first SPAN) of the BODY.
- Traverse child nodes and extract the required information.
The following code sample shows how to navigate HTML nodes to read HTML contents in C#:
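The steps above can be sketched as follows, again assuming the `Aspose.Html` namespace. The sample markup, the class name `HtmlNavigator`, and the method `CollectTexts` are hypothetical. The sketch walks the sibling chain with `FirstChild` and `NextSibling`, so the whitespace text node between the two SPAN elements is visited as well.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Html; // assumption: HTMLDocument is provided by this namespace

public static class HtmlNavigator
{
    // Hypothetical helper: returns the text of every direct child node of BODY.
    public static List<string> CollectTexts(string htmlCode)
    {
        var texts = new List<string>();

        // Create the document from an HTML string; "." serves as the base URI.
        using (var document = new HTMLDocument(htmlCode, "."))
        {
            // Reference to the first child of BODY (the first SPAN in the sample).
            var node = document.Body.FirstChild;

            // Traverse the siblings and collect their text content.
            while (node != null)
            {
                texts.Add(node.TextContent);
                node = node.NextSibling;
            }
        }

        return texts;
    }

    public static void Main()
    {
        // Illustrative markup: two SPAN elements separated by a space.
        string htmlCode = "<span>Hello,</span> <span>World!</span>";

        foreach (string text in CollectTexts(htmlCode))
        {
            Console.WriteLine(text);
        }
    }
}
```

If only elements matter, the text nodes can be filtered out while traversing, for example by checking each node's type before collecting its text.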
Read HTML File as String in C#
You can read an HTML page from any URL as a string in C# with these steps:
- Initialize an HTMLDocument object with the URL.
- Read the text content of the HTML page.
- Write a TXT file with the extracted text.
The code sample below explains how to read an HTML file as a string in C# from any URL:
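A minimal sketch of these steps follows. It assumes the `Aspose.Html` namespace and uses the `TextContent` property of the root element as the text of the page. The URL, the output file name, the class name `HtmlTextExporter`, and the method `SaveTextContent` are placeholders chosen for illustration.

```csharp
using System;
using System.IO;
using Aspose.Html; // assumption: HTMLDocument is provided by this namespace

public static class HtmlTextExporter
{
    // Hypothetical helper: reads the text of a page and writes it to a TXT file.
    public static string SaveTextContent(string address, string outputPath)
    {
        // Initialize an HTMLDocument from a URL (a local file path works the same way).
        using (var document = new HTMLDocument(address))
        {
            // Read the text content of the page, without the markup.
            string text = document.DocumentElement.TextContent;

            // Write the extracted text to a TXT file.
            File.WriteAllText(outputPath, text);
            return text;
        }
    }

    public static void Main()
    {
        // Placeholder URL and output file; replace with your own values.
        string text = SaveTextContent("https://example.com/", "output.txt");
        Console.WriteLine(text.Length + " characters written to output.txt");
    }
}
```

The extracted string contains the text of every node under the root element, so further filtering may be needed before processing it.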
Free Evaluation License
You can get a free temporary license to avoid any evaluation limitations.
Summing Up
Being able to read HTML in C# is a valuable skill for web‑related projects and data extraction tasks. This post covered three approaches to reading HTML in C#. Use them to scrape or parse information from HTML pages for further processing. Explore additional API features and feel free to reach out on the forum.