Wednesday, May 14, 2008

Aggregation: LINQ and SQL XML fields

I'm thinking ahead about the possibility of having to store some XML snippets in a database XML field. How can we store XML data in the database and use LINQ to easily retrieve it...and report on it. There are lots of examples out there, but I want to look at a more difficult scenario, one involving a variable XML data and aggregating that data from multiple records.

I've dreamed up a scenario based on a survey to collect the opinions of partners. When a partner takes the survey, there could be 1 to n questions on various opinions. I say n since the number of questions could change over time (plus I want to model variable XML data for this example).

For each partner, we decide to store the opinions in a single database XML field. For this example, I use the values "Choice X", but in reality it might be something like "Favor debit cards" or "Want e-mail rewards". Here is an example of what might be stored in the XML field for a single partner record, for a single survey:

<Opinions>
<Opinion>Choice 1
<Opinion>
<Opinion>Choice 2<Opinion>
<Opinion>Choice 5<Opinion>
</Opinions>

Don't get hung up on whether this is a correct way to store survey results...that's not the purpose here.

Notice that since certain opinions were not selected, they were not included in the XML (e.g. Choice 3-4 are missing). Assuming that the opinions are a simple set of checkboxes on a survey page, the database XML column value should be created like:

XElement results = new XElement("Opinions",
cblOpinion.Items.OfType<ListItem>()
.Where(o => o.Selected)
.Select(o => new XElement("Opinion", o.Text)));

These XML results are stored in the database. At some point we are going to have to report on the opinions of the partners. Assuming we can search the database to obtain partner records for a certain survey within a certain time period, we'd like to report on data like:

Choice 1, 378
Choice 2, 120
Choice 3, 629

I spent several iterations trying to figure out the best way to do this. In the end, as I suspected, it was easier that I thought. It wouldn't surprise me to have someone else find even an easier way as LINQ is a powerful language. To accomplish this the code below performs two steps, 1) get all survey opinions from the database, creating a sequence of XElement with all the opinions, and 2) create grouping sequence of all like-opinions. The example uses the "Linq To Sql" support in VS 2008.

// Step #1
// extract out all survey opinions from the DB and create one
// xml element
DbDataContext db = new DbDataContext();
XElement allOpinions = new XElement("Survey", db.Opinions
.Select(x => x.Opinions) // database column is called Opinions
.Select(x => x));

// Step #2
// group like opinions together (could have been done along with
// previous linq statement, but kept separate since may be other
// queries to perform on allOpinions
var groups = allOpinions.Descendants("Opinion")
.OfType<XElement>()
.GroupBy(x => x.Value); // e.g. "Choice 1"

I can output the result of #1 and #2 as:

// Output #1
Console.WriteLine(allOpinions);

// Output #2
foreach (IGrouping<string, XElement> g in groups)
{
Console.WriteLine("{0},{1}", g.Key, g.Count());
}

No comments:

Can't RDP? How to enable / disable virtual machine firewall for Azure VM

Oh no!  I accidentally blocked the RDP port on an Azure virtual machine which resulted in not being able to log into the VM anymore.  I did ...