Recently, we learned how to read Excel workbooks using the Microsoft Office COM APIs. As you may already know, the COM APIs are slow when performing operations, so in this post we will look at a faster way to read the content.

 

In this article, we will learn how to read Excel 2007 workbooks using the NPOI library, the .NET port of Apache POI, which is freely available to use in your application.

 

Here's how to read Excel 2007 document (XLSX) using NPOI libraries (www.kunal-chowdhury.com)

 

Basic concepts about NPOI library

Before starting with the code, you should have some basic knowledge of the NPOI library. NPOI is the .NET version of the POI Java project hosted at http://poi.apache.org. It is a free, open source project that helps you read and write Word, Excel, and PowerPoint files. You can find the source code of the NPOI project at https://github.com/tonyqus/npoi. The libraries can be downloaded from NuGet at https://www.nuget.org/packages/NPOI.

 


Reading Excel 2007 document format using NPOI

To read the Excel 2007 file format, i.e. files with the .xlsx extension, you need the NPOI.XSSF.Extractor.XSSFExcelExtractor class. It extends the base class POIXMLTextExtractor and implements the IExcelExtractor interface. Its Text property returns the document content as plain text, covering all the sheets.

 

To read the content of such an Excel file, create an instance of XSSFExcelExtractor by passing the file path to the constructor. Optionally, you can include or exclude cell comments, header and footer information, and sheet names in the output. Then read the Text property of the instance to get the file text. The code is shared below for easy reference. You may have to handle the exceptions that you encounter while accessing or reading the content.

 


using System;
using NPOI.XSSF.Extractor;

/// <summary>Gets the text from an Excel 2007 (.xlsx) file.</summary>
/// <param name="filePath">The file path of the Excel workbook.</param>
/// <returns>The text contents of the Excel sheets, or an empty string if reading fails.</returns>
private static string GetTextFromExcel2007Format(string filePath)
{
    XSSFExcelExtractor excelExtractor = null;

    try
    {
        excelExtractor = new XSSFExcelExtractor(filePath);
        excelExtractor.IncludeCellComments = false; // optional
        excelExtractor.IncludeHeaderFooter = false; // optional
        excelExtractor.IncludeSheetNames = false; // optional

        return excelExtractor.Text;
    }
    catch (Exception)
    {
        // Handle or log the exception here (file missing, file locked, invalid format, ...).
        // As written, the method falls through and returns an empty string.
    }
    finally
    {
        // Always release the extractor, whether reading succeeded or not.
        if (excelExtractor != null)
        {
            excelExtractor.Close();
        }
    }

    return string.Empty;
}
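
Once you have the text, you will often want it back as rows and cells. The sketch below is one way to do that, and it rests on an assumption that is not stated in this post: that the extractor writes one worksheet row per line and separates the cells with tab characters. Please verify that against the output of your own workbook. The names ExtractedTextParser and SplitIntoRows are made up for this illustration and are not part of NPOI.

```csharp
using System;
using System.Collections.Generic;

public static class ExtractedTextParser
{
    // Splits extracted text into rows of cells.
    // Assumption: one worksheet row per line, cells separated by tab characters.
    public static List<string[]> SplitIntoRows(string text)
    {
        var rows = new List<string[]>();
        if (string.IsNullOrEmpty(text))
        {
            return rows;
        }

        // Accept both Windows and Unix line endings and skip empty lines.
        string[] lines = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string line in lines)
        {
            rows.Add(line.Split('\t'));
        }

        return rows;
    }

    public static void Main()
    {
        // Stand-in for the string returned by GetTextFromExcel2007Format(filePath).
        string sample = "Name\tQty\nApples\t3\n";

        foreach (string[] cells in SplitIntoRows(sample))
        {
            Console.WriteLine(string.Join(" | ", cells));
        }
    }
}
```

Keep in mind that a cell whose own text contains a tab or a line break would be split incorrectly by this approach, so it only suits simple sheets.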

 

I hope the code above helps you read the Excel 2007 file format. In the next post, we will learn how to read the Excel 97-2003 format using the free, open source NPOI library. Don't forget to read and share the posts that I publish. Have a great day!

 

 
