Sunday, January 18, 2009

Converting Office 2007 Excel to XML using C#

Don't want Office 2007 installed on your web server to access Excel 2007 content?

Here is a Visual Studio C# solution that demonstrates how to read an Excel 2007 workbook and convert it to XML. You don't need Office 2007 installed in order to do this! Once the Excel content is in XML format, you can process it more easily using transformation tools or code in another language.

The solution code uses .NET 3.5 XML support and LINQ. The solution also contains an example console application that will take an Excel 2007 file name on the command line and convert it to XML via the console. Everything should run out of the box using an example Excel 2007 file that is included; just run the example application and accept the default file name to read.

The quickest way to learn how to use the ExcelReader class, and its options, is to examine the brief Program.cs file in the PCarver.OfficeXml.ExampleExcel project. You can register for events to transform cell contents and ignore certain rows (see example code).
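To give a feel for why no Office install is needed, here is a minimal sketch of the general idea. It is not the ExcelReader code from the download. An Excel 2007 file is a ZIP package of XML parts, so LINQ to XML can read it directly. The sketch makes some assumptions: the class and method names (XlsxSketch, SheetToXml, ConvertFile) and the output element names are made up for illustration, the first worksheet is assumed to live at xl/worksheets/sheet1.xml (a robust reader would look that up in the workbook relationships), and it uses ZipFile from .NET 4.5 or later rather than the .NET 3.5 APIs the solution targets.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class XlsxSketch
{
    // Namespace used by SpreadsheetML worksheet and shared-string parts.
    static readonly XNamespace Main =
        "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Turns one worksheet part (plus the optional shared-strings part)
    // into simple <sheet><row><cell ref="A1">text</cell></row></sheet> XML.
    public static XElement SheetToXml(XDocument sheet, XDocument sharedStrings)
    {
        // Simplification: concatenates every <t> under each <si>,
        // which would also pick up phonetic runs if present.
        List<string> strings = sharedStrings == null
            ? new List<string>()
            : sharedStrings.Root.Elements(Main + "si")
                .Select(si => string.Concat(si.Descendants(Main + "t").Select(t => t.Value)))
                .ToList();

        return new XElement("sheet",
            sheet.Descendants(Main + "row").Select(row =>
                new XElement("row",
                    row.Elements(Main + "c").Select(c =>
                        new XElement("cell",
                            new XAttribute("ref", (string)c.Attribute("r") ?? ""),
                            CellText(c, strings))))));
    }

    // Resolves a cell's text: shared-string index, inline string, or raw value.
    static string CellText(XElement c, List<string> strings)
    {
        string type = (string)c.Attribute("t");
        string raw = (string)c.Element(Main + "v");
        if (type == "s" && raw != null) return strings[int.Parse(raw)];
        if (type == "inlineStr")
            return string.Concat(c.Descendants(Main + "t").Select(t => t.Value));
        return raw ?? "";
    }

    // Loads one XML part from the package, or returns null if it is absent.
    static XDocument LoadPart(ZipArchive zip, string name)
    {
        ZipArchiveEntry entry = zip.GetEntry(name);
        if (entry == null) return null;
        using (Stream s = entry.Open()) return XDocument.Load(s);
    }

    // Assumption: the first worksheet is stored at xl/worksheets/sheet1.xml.
    public static XElement ConvertFile(string path)
    {
        using (ZipArchive zip = ZipFile.OpenRead(path))
        {
            XDocument sheet = LoadPart(zip, "xl/worksheets/sheet1.xml");
            if (sheet == null)
                throw new FileNotFoundException("Worksheet part not found.", path);
            return SheetToXml(sheet, LoadPart(zip, "xl/sharedStrings.xml"));
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: XlsxSketch <file.xlsx>");
            return;
        }
        Console.WriteLine(ConvertFile(args[0]));
    }
}
```

Numbers and dates come through as the raw stored values here; formatting them the way Excel displays them needs the styles part as well, which this sketch leaves out.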

Download only the binary files for converting Excel to XML:

Click here for binaries/executables

Download the full Visual Studio Solution:

Click here for Visual Studio 2008 Solution

If you have any bug fixes or nice enhancements, please post a comment and let me know.

Information from Microsoft on these formats is found at:

http://msdn.microsoft.com/en-us/library/aa338205.aspx

9 comments:

A. Mooman said...

Can I use your example code to develop similar code?
Please advise.

Thanks

A.M.

PaulTechGuy said...

Sure, you can use the code under the terms of the GNU General Public License at http://www.gnu.org/licenses/gpl.html.

Have fun with it! If you find any bugs, have nice enhancements, or requests for features, let me know.

A. Mooman said...

Thank you Paul very much. I will be referencing your work.

It is a great job you have done.

Coming from the Unix world and drawn to C# by Mono, I can say your work has made me interested in trying C#. XML has forced all of us to think more broadly than a purely Unix- or Windows-driven mentality.

Thank you Paul.
Great work.
Nasser

Smash said...

This is great for Excel 2007. What about earlier versions, or is that out of the scope of the project?

PaulTechGuy said...

Smash,

The project was specifically focused on Excel 2007 because of the requirement to deploy a solution that did not require an Office installation.

There are no plans to support earlier Office versions (e.g. Office 2003) because reading them uses COM (which requires an Office install) and they don't have the same XML support.

A. Mooman said...

You can use Perl to convert CSV or Excel files into XML format:
http://www.cpan.org/modules/by-module/XML/XML-Excel-0.01.readme

Nasser

A. Mooman said...

You can always use Perl to convert CSV or Excel into XML.
Check this URL:
http://www.cpan.org/modules/by-module/XML/XML-Excel-0.01.readme

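E.g.,
#!/bin/perl
use strict;
use warnings;
use XML::Excel;

# Parse the workbook, treating the first row as column headings
my $excel_obj = XML::Excel->new();
$excel_obj->parse_doc("test.xls", {headings => 1});

# Write the parsed data out as XML
$excel_obj->print_xml("out.xml");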


Hope this helps.

Nasser

PaulTechGuy said...

Nasser,

What version of Office does this method support?

Thanks.

A. Mooman said...

This (XML::Excel) should work with MS Excel 97-2003 worksheets.

Nasser
