Generating XML from Recursive, Hierarchical Data

Table of Contents [Hide/Show]


   Overview
      Hierarchy Levels
   Database Objects
      Database Schema
      Helper Code
   Solution
      First Attempt
      A More Generalized Approach
      Sample Output

{outline||<1> - }

Overview

I was given hierarchical configuration data and wanted to store the data in a simple, generalized database schema. I also wanted to generate a hierarchical XML document on-the-fly from the data stored in the database tables.

Hierarchy Levels

The hierarchy used to test the solution consists of four levels.

(1) Application
(2) Report
(3) Subreport
(4a) Subreport Parameters
(4b) Subreport Columns

Database Objects

Database Schema

The DBO.Element Table

CREATE TABLE dbo.Elements(
    ElementId       int          IDENTITY(1,1) NOT NULL,
    ElementName     varchar(100)               NOT NULL,
    ParentElementId int                        NULL,
 CONSTRAINT PK_Element PRIMARY KEY CLUSTERED 
(
    ElementId ASC
) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, 
    ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON PRIMARY
) ON PRIMARY
GO
ALTER TABLE dbo.Elements  WITH CHECK ADD  CONSTRAINT FK_Elements_Elements 
    FOREIGN KEY(ParentElementId)
    REFERENCES dbo.Elements (ElementId)

The DBO.Attributes Table

CREATE TABLE dbo.Attributes(
    AttributeId    int          IDENTITY(1,1) NOT NULL,
    ElementId      int                        NOT NULL,
    AttributeName  varchar(50)                NOT NULL,
    AttributeValue varchar(max)               NOT NULL,
 CONSTRAINT PK_Attributes PRIMARY KEY CLUSTERED 
(
    AttributeId ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
    ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS  = ON) ON PRIMARY
) ON PRIMARY

GO
ALTER TABLE dbo.Attributes WITH CHECK ADD CONSTRAINT 
    FK_Attributes_Elements FOREIGN KEY(ElementId)
REFERENCES dbo.Elements (ElementId)

Helper Code

The DBO.PadLeft() function

ALTER function [dbo].[PadLeft]
    (
     @inputValue                int
    ,@width			int
    ) returns varchar(max) as begin

return substring(
     replicate('0', @width) + convert(varchar(max), @inputValue)
    ,len(convert(varchar(max), @inputValue)) + 1
    ,@width
    )

end

Displaying the Hierarchy

declare @RootElementId int, @IdWidth int

set @RootElementId = 1
set @IdWidth = 10
;
WITH MyElements as (

    select
         RootElementId  = ElementId
        ,ElementId      = ElementId
        ,ElementName
        ,ElementPath    = dbo.PadLeft(ElementId, @IdWidth)
        ,Lvl            = 1    
        ,ParentElementId
    from
        dbo.Elements
    where 1=1
        and ElementId = @RootElementId

    union all select
         RootElementId  = d.RootElementId
        ,ElementId      = e.ElementId
        ,ElementName    = e.ElementName
        ,ElementPath    = d.ElementPath + '\' + dbo.PadLeft(e.ElementId, @IdWidth)
        ,Lvl            = d.Lvl + 1
        ,e.ParentElementId
    from
        MyElements d
        inner join dbo.Elements e
            on e.ParentElementId = d.ElementId
)
,MyData as (
    select 
         RootElementId
        ,ItemId          = ElementId
        ,ItemName        = ElementName
        ,ItemPath        = ElementPath
        ,ItemLvl         = Lvl
        ,ItemType        = 'Element'
    from 
        MyElements

    union select
         RootElementId
        ,ItemId          = AttributeId
        ,ItemName        = '@' + AttributeName + ' = [' + substring(AttributeValue, 1, 50)
                            + case when len(AttributeValue) > 50 then '...' else '' end + ']'
        ,ItemPath        = ElementPath + '\@' + dbo.PadLeft(AttributeId, @IdWidth)
        ,ItemLvl         = Lvl + 1
        ,ItemType        = 'Attribute'
    from
        MyElements e
        inner join dbo.Attributes a
            on e.ElementId = a.ElementId
    )
select
     ItemId = ItemType + ' ' + convert(varchar(4), ItemId)
    ,Item    = replicate('    ', ItemLvl-1) + ItemName
from
    MyData
order by
    ItemPath

This code returns results like the following to visually display the hierarchy of the data. (Only the first few rows are shown.)

ItemId Item
Element 1Application
Attribute 1    @Name = [My Application]
Element 2    Report
Attribute 2        @ReportId = [1]
Attribute 3        @XmlFile = [MyApp.xml]
Attribute 4        @ReportName = [My Report]
Attribute 5        @Template = [c:\Some Network Folder\...]
Element 4        Subreport
Attribute 10            @SubreportName = [My Subreport]
Attribute 11            @SubreportId = [1]

Solution

First Attempt

Unfortunately, I couldn't discover a generalized method to generate the XML document, at least not restricting myself to a pure T-SQL solution with no dynamic SQL. The following function contains four "levels" of SQL, each level being almost identical to the others, so it should be fairly straightforward to make the solution another level deeper if necessary. The names of the fields — [@Name], [@ReportName], [@ReportId], [@XmlFile], [@Template], [@SubreportId], [@SubreportName], [@MaxRecords], [@ConnectionString], [@SqlStatement], [@ParameterName]— is the only hard-coded part of the function's code. These names are taken straight from the dbo.Attributes table, prefixed with an "@" symbol, and wrapped in square brackets.

The DBO.GetConfigXml() Function

ALTER function [dbo].[GetConfigXml]
    (
    @ElementId int
    ) returns xml as begin

declare 
    @result xml

set @result = 

 (

--declare @ElementId int
--set @ElementId = 1

    select 
         [@NodeType] = ElementName
        ,[@ElementId] = ElementId
        ,[@Name]
        ,[@ReportName]
        ,[@ReportId]
        ,[@XmlFile]
        ,[@Template]
        ,[@SubreportId]
        ,[@SubreportName]
        ,[@MaxRecords]
        ,[@ConnectionString]
        ,[@SqlStatement]
        ,[@ParameterName]
        ,(
            select
                 [@NodeType] = ElementName
                ,[@ElementId] = ElementId 
                ,[@Name]
                ,[@ReportName]
                ,[@ReportId]
                ,[@XmlFile]
                ,[@Template]
                ,[@SubreportId]
                ,[@SubreportName]
                ,[@MaxRecords]
                ,[@ConnectionString]
                ,[@SqlStatement]
                ,[@ParameterName]
                ,(
                    select 
                         [@NodeType] = ElementName
                        ,[@ElementId] = ElementId
                        ,[@Name]
                        ,[@ReportName]
                        ,[@ReportId]
                        ,[@XmlFile]
                        ,[@Template]
                        ,[@SubreportId]
                        ,[@SubreportName]
                        ,[@MaxRecords]
                        ,[@ConnectionString]
                        ,[@SqlStatement]
                        ,[@ParameterName]
                        ,(
                            select 
                                 [@NodeType] = ElementName
                                ,[@ElementId] = ElementId
                                ,[@Name]
                                ,[@ReportName]
                                ,[@ReportId]
                                ,[@XmlFile]
                                ,[@Template]
                                ,[@SubreportId]
                                ,[@SubreportName]
                                ,[@MaxRecords]
                                ,[@ConnectionString]
                                ,[@SqlStatement]
                                ,[@ParameterName]
                                ,[@ColumnName]
                                ,[@Size]
                                ,[@Justification]
                                ,[@Format]

                            from (
                                select
                                     e.ElementId
                                    ,ElementName
                                    ,ParentElementId
                                    ,AttributeName  = '@' + AttributeName
                                    ,AttributeValue
                                from
                                    dbo.Elements e
                                    left join dbo.Attributes a
                                        on e.ElementId = a.ElementId
                                where 
                                    e.ParentElementId = Lvl03PivotTable.ElementId
                                ) as Lvl03Table
                                
                            pivot (
                                min(AttributeValue) for AttributeName in (
                                 [@Name]
                                ,[@ReportName]
                                ,[@ReportId]
                                ,[@XmlFile]
                                ,[@Template]
                                ,[@SubreportId]
                                ,[@SubreportName]
                                ,[@MaxRecords]
                                ,[@ConnectionString]
                                ,[@SqlStatement]
                                ,[@ParameterName]
                                ,[@ColumnName]
                                ,[@Size]
                                ,[@Justification]
                                ,[@Format]
                                )) as Lvl04PivotTable

                            for xml path ('Node'), TYPE -- Parameters
                            )
                    from (
                        select
                             e.ElementId
                            ,ElementName
                            ,ParentElementId
                            ,AttributeName  = '@' + AttributeName
                            ,AttributeValue
                        from
                            dbo.Elements e
                            left join dbo.Attributes a
                                on e.ElementId = a.ElementId
                        where 
                            e.ParentElementId = Lvl2PivotTable.ElementId
                        ) as Lvl2Table
                        
                    pivot (
                        min(AttributeValue) for AttributeName in (
                         [@Name]
                        ,[@ReportName]
                        ,[@ReportId]
                        ,[@XmlFile]
                        ,[@Template]
                        ,[@SubreportId]
                        ,[@SubreportName]
                        ,[@MaxRecords]
                        ,[@ConnectionString]
                        ,[@SqlStatement]
                        ,[@ParameterName]
                        )) as Lvl03PivotTable

                    for xml path ('Node'), TYPE -- Subreport
                    )
            from (
                select
                     e.ElementId
                    ,ElementName
                    ,ParentElementId
                    ,AttributeName  = '@' + AttributeName
                    ,AttributeValue
                from
                    dbo.Elements e
                    left join dbo.Attributes a
                        on e.ElementId = a.ElementId
                where 
                    e.ParentElementId = Lvl01PivotTable.ElementId
                ) as Lvl02Table
                
            pivot (
                min(AttributeValue) for AttributeName in (
                 [@Name]
                ,[@ReportName]
                ,[@ReportId]
                ,[@XmlFile]
                ,[@Template]
                ,[@SubreportId]
                ,[@SubreportName]
                ,[@MaxRecords]
                ,[@ConnectionString]
                ,[@SqlStatement]
                ,[@ParameterName]
                )) as Lvl2PivotTable

            for xml path ('Node'), TYPE -- Report
            )

    from (
        select
             e.ElementId
            ,ElementName
            ,ParentElementId
            ,AttributeName  = '@' + AttributeName
            ,AttributeValue
        from
            dbo.Elements e
            left join dbo.Attributes a
                on e.ElementId = a.ElementId
        where 
            e.ElementId = @ElementId 
        ) as Lvl01Table
        
    pivot (
        min(AttributeValue) for AttributeName in (
         [@Name]
        ,[@ReportName]
        ,[@ReportId]
        ,[@XmlFile]
        ,[@Template]
        ,[@SubreportId]
        ,[@SubreportName]
        ,[@MaxRecords]
        ,[@ConnectionString]
        ,[@SqlStatement]
        ,[@ParameterName]
        )) as Lvl01PivotTable

    for xml path ('Node'), TYPE -- Application

    )

return @result

end

Sample Output

The dbo.GetConfigXml() function produces output similar to the following.

<Node NodeType="Application" ElementId="1" Name="My Reports">
  <Node NodeType="Report" ElementId="2" ReportName="MyReport" ReportId="1" XmlFile="MyConfig.xml"
        Template="c:\Templates\Template.xls">
    <Node NodeType="Subreport" ElementId="4" Template="MyTemplate" SubreportId="1" SubreportId="1"
          SubreportName="MyData" MaxRecords="5000" ConnectionString="connection-string"
          SqlStatement="SELECT  ...">
      <Node NodeType="Parameter" ElementId="16" Name="EmployeeId" Source="EMP_ID" />
      <Node NodeType="Parameter" ElementId="17" Name="StartDate" Source="START_DT" />

A More Generalized Approach

The First Attempt (above) is cumbersome to use, especially when it is necessary to add levels to the hierarchy. I therefore developed the following, more generalized, solution, which consists of a view, a table-valued function, and a scalar function to replace the dbo.GetConfigXml() function shown above. This solution places the data from the dbo.Attributes table in XML Nodes instead of XML attributes, thus making the solution cleaner and easier to extend.

The DBO.ElementsExtended View

CREATE view [dbo].[ElementsExtended]
as
select
     ID         = e.ElementId
    -------------------------------------
    ,Name       = ElementName
    -------------------------------------
    ,ParentID   = ParentElementId
    -------------------------------------
    ,Attributes = (
        select 
             Name     = AttributeName
            ,Value    = AttributeValue 
        from 
            dbo.Attributes a 
        where 1=1
            and e.ElementId = a.ElementId 
        for xml 
            path ('Attribute'), type
        )
    -------------------------------------
from
    dbo.Elements e

The DBO.GetElement() Table-Valued Function

/*
select * from dbo.GetElement(1, 0) 
select * from dbo.GetElement(2, 0)
select * from dbo.GetElement(1, 0) for xml path('Element'), type
*/
ALTER function dbo.GetElement
    (
     @ElementId          int
    ,@ParentElementId    int
    ) returns table as return

select
    *
from
    dbo.ElementsExtended e
where 1=1
    and (
        e.ID = @ElementId
        or
        e.ParentID = @ParentElementId
    )

The DBO.GetElementXml() Function

/*
select result = dbo.GetElementXml(1)
select result = dbo.GetElementXml(2)
*/
ALTER function [dbo].[GetElementXml]
    (
    @ElementId int
    ) returns xml as begin

declare 
    @result xml

set @result = (

    select 
        *
        ,(
            select 
                * 
                ,(
                    select 
                        *
                        ,(
                            select 
                                *
                                ,
                                (
                                    select 
                                        * 
                                    from 
                                        dbo.GetElement(0, e4.ID) e5
                                    for xml 
                                        path('Element'), type
                                ) 
                            from 
                                dbo.GetElement(0, e3.ID) e4
                            for xml 
                                path('Element'), type
                            )
                    from 
                        dbo.GetElement(0, e2.ID) e3
                    for xml 
                        path('Element'), type

                    )
            from 
                dbo.GetElement(0, e1.ID) e2
            for xml 
                path('Element'), type
            ) 
    from 
        dbo.GetElement(@ElementId, 0) e1
    for xml 
        path('Element'), type

    )

return @result

end

Sample Output

The dbo.GetElementXml() function produces output similar to the following.
<Element>
  <ID>1</ID>
  <Name>Application</Name>
  <Attributes>
    <Attribute>
      <Name>Name</Name>
      <Value>My Reports</Value>
    </Attribute>
  </Attributes>
  <Element>
    <ID>2</ID>
    <Name>Report</Name>
    <ParentID>1</ParentID>
    <Attributes>
      <Attribute>
        <Name>ReportId</Name>
        <Value>1</Value>
      </Attribute>
      ...