Prepared by Jason Cater
March 15, 2001
Latest Revision: March 23, 2001
Please email any comments/suggestions/concerns/criticisms to the GNUe-Reports mailing list.
Most of the written discussion on GNUe Reports to date has focused on the interaction of Reports with GEAS and third party tools. Not much discussion has gone into actually defining how reports will be represented. As several people need a functioning [or, at least primitively functioning :) ] reporting tool as soon as possible (myself included), I would like to present a working draft for report definitions so work can go into the report server and report clients.
This paper will not discuss internals (methods, et al) of either a report server or a report client, only the attributes of GNUe Reports that are of interest to a report designer/developer.
The ultimate goal of the server engine is to take an XML-based report definition and generate an XML output stream conforming to this report definition. The XML output can then be converted into whatever format is required by the client. It is of no consequence to this draft as to what tools will be used for the conversion or at what point the conversion takes place.
It is also of no consequence to this draft as to what the XML output might look like. As you will see shortly, the report server doesn't care what the outputted XML will look like. Below is an example of what a resultant XML stream *might* look like:
<gnue-report-output>
... (header stuff) ...
<title>
<line>Vendor List in Vendor ID Order</line>
<line>My Company Name</line>
</title>
<rundate format="MM/DD/YYYY HH:MI">02/28/2001 12:01</rundate>
<pageheader><left/><center/><right/></pageheader>
<pagefooter><left/><center/><right/></pagefooter>
<reportheader><left/><center/><right/></reportheader>
<reportfooter><left/><center/><right/></reportfooter>
<body>
<t>
<th><td>Vendor</td><td>Name</td><td>Zip</td></th>
<tr>
<td>123456</td>
<td>Cater, Jason</td>
<td>38117</td>
</tr>
<tr>
<td>234567</td>
<td>The Printing Company</td>
<td>38101</td>
</tr>
<tf/>
</t>
(2 Vendors)
</body>
</gnue-report-output>
This output could then be transformed into whatever format desired (HTML, PS, PDF, TXT,RTF, Spreadsheet, etc).
A report definition is in an XML format with four basic sections: parameters, sorting options, queries (sources), and layout definition:
<gnue-report version="0.0.1" id="AP_VENDORS"
name="Accounts Payable Vendors"> <parameters> ... </parameters> <sortoptions> ... </sortoptions> <queries> ... </queries> <layout format="standard"> ... </layout> </gnue-report>
Each of these sections will be briefly discussed.
A basic requirement of any reporting package is the ability to pass parameters to the report. These parameters can be used in "where" clauses to affect the outcome of the report. They could also be used as fields in the report output.
In the definitions of the parameters, defaults, restrictions, and hints can also be defined. Take as an example the following list of parameters:
<parameters>
<parameter id="company" type="char" name="Company"/>
<parameter id="vendor1" type="char" name="Starting Vendor" default="00000000"
required="no" values="select" limited="no">
#FIXME: This "select" would actually need to be represented by
XML to make database independent#
select vendor, name from ap_vendors order by vendor;
</parameter>
<parameter id="vendor2" type="char" name="Ending Vendor" default="ZZZZZZZZ"
required="no"/>
<parameter id="withbal" type="char" name="Include Balances" required="yes"
values="list" limited="yes">
<item value="y">Yes</item>
<item value="n">No</item>
</parameter>
</parameters>
This section defines four parameters: company, vendor1, vendor2, and withbal.
The parameter "company" is a basic text parameter. It has no restrictions.
The parameter "vendor1" is a text parameter that defaults to "00000000" if no value is supplied. It also suggests to the human interface portion of the report client to present a list of possible choices to the user via a select statement, but since 'limited=no" was specified, the user is not restricted to the results of the select. Note that the "select" statement returns two columns -- the first column will always be the value returned as the parameter; the second column will be what the client displays as a selection.
The parameter "withbal" defines a constant set of possible values. Since "limited=yes" was specified, then "withbal" must be equal to one of the values in the defined set (either "y" or "n").
Many different "reports" are identical except in the manner in which they are sorted. GNUe-Reports can consolidate such reports into a single report by creating "sort options". Basically, sort options are "where clauses" that can be selected by the report client. Take for example:
<sortoptions>
<sortoption id="byVendor" default="yes" name="Vendor ID Order">
<sortcolumn>vendor</sortcolumn>
</sortoption>
<sortoption id="byName" name="Vendor Name Order">
<sortcolumn>name</sortcolumn>
<sortcolumn>vendor</sortcolumn>
</sortoption>
<sortoption id="byBalance" name="Balance Order">
<sortcolumn>balance</sortcolumn>
<sortcolumn>vendor</sortcolumn>
</sortoption>
</sortoptions>
This basically says that the report can be sorted three different ways, in "Vendor ID Order", "Vendor Name Order", or "Balance Order". The selected Sort Option will be combined with the query (explained below) to form the underlying "select" statement.
In defining a query, I focused on three basic options:
Each possibility has its advantages and drawbacks.
One possibility is to use an SQL select statement. Take for example:
<select>select a.vendor as vendor,
a.name as name,
a.zip as zip
from ap_vendors
where company = :company</select>
The upsides to this method are:
The downsides to this method are:
Queries could be defined using an libGDA's XML Query (see <http://www.gnome-db.org/docs/white-papers/xml-queries.php>)
The upsides to this method are:
The downsides to this method are:
The final solution is to define our own XML query:
<queries>
<query name="main">
<sources>
<source id="a" name="test_xml"/>
<source id="b" name="companies"/>
</sources>
<columns>
<column name="cname" source="b.company" group="main" size="30"/>
<column name="name" source="a.name" group="VendorList" size="30"
minsize="20" maxsize="40">
<summary name="ttl_names" function="count"/>
</column>
<column name="vendor" source="a.vendor" group="VendorList" size="12"/>
<column name="zip" source="a.zip" group="VendorList" size="5"/>
</columns>
<qualification>
<eq>
<field id="a.company"/>
<parameter id="company">
</eq>
<eq>
<field id="b.company"/>
<field id="a.company"/>
</eq>
</qualification>
<orderby>
<sort/>
</orderby>
</query>
</queries>
The upsides to this method are:
The downsides to this method are:
The layout section defines what should be output by the report server. It consists of only four tags:
<field name=""/> - This tag inserts the value of field "name" resulting from a query.
<summ name=""/> - This tag inserts the current value of an aggregate summary function.
<param name=""/> - This tag inserts the value of a passed parameter. See the above section on "Report Parameters" for more information.
<section group="" pagebreak=""/> - This tag delimits different repeat sections (or groups) within the report. Every report will have at least one section. If parent/child queries were defined, then each parent will be a section and will contain other sections representing the child queries. It is also possible for a single query to be represented by several sections. This will create a "breaking" or "grouping" effect. Take for instance the following data resulting from a single query:
Joe Cool's Blues, Winton Marsalis, Track 1, Linus & Lucy
Joe Cool's Blues, Winton Marsalis, Track 2, Buggy Ride
Joe Cool's Blues, Winton Marsalis, Track 3, Peppermint Patty
Joe Cool's Blues, Winton Marsalis, Track 4, On Peanuts Playground
Voice of an Angel, Charlotte Church, Track 1, Pie Jesu
Voice of an Angel, Charlotte Church, Track 2, Panis Angelicus
Voice of an Angel, Charlotte Church, Track 3, In Trutina
Voice of an Angel, Charlotte Church, Track 4, Danny Boy
Suppose we reformat the above rows to print as:
Joe Cool's Blues by Winton Marsalis Track 1 Linus & Lucy Track 2 Buggy Ride Track 3 Peppermint Patty Track 4 On Peanuts Playground Voice of an Angel by Charlotte Church Track 1 Pie Jesu Track 2 Panis Angelicus Track 3 In Trutina Track 4 Danny Boy
We would then say that this query is represented by two sections, one containing the Title and Artist, another, a subsection of the first, containing the Track and Song. (If you haven't figured it out yet, this is part of my CD collection :)
All of the above tags will be stripped from the output and substituted with the data they represent. Any other tags will be passed verbatim as output. By doing so, we are not restricting the types of reports that can be generated via our server. Any report that can be represented using XML can be defined using our layout section.
Take for instance...
<layout>
<body>
<section group="Main" pagebreak="yes">
<t>
<th><td>Vendor</td><td>Name</td><td>Zip</td></th>
<section group="VendorList" pagebreak="no">
<tr>
<td><field name="vendor"/></td>
<td><field name="name"/></td>
<td><field name="zip"/></td>
</tr>
</section>
<tf/>
</t>
(<summ name="ttl_names"/> Vendors)
</section>
</body>
</layout>
All extra tags (body,t, th, td, tr) are passed in the output stream.
As stated earlier, the output from the report server is largely dependent on the report definition's layout section. However, a few common elements will exist in all outputs. First, the layout defined by the report definition will be enclosed by a set of tags (<gnue-report-output/>) identifying that this is a report output. Secondly, any parameters (including sortoptions) used to generate this file will be stored in the output using a <gnue-report-request> block (the format of a report-request is discussed later in this draft).
<gnue-report-output format="standard" version="0.0.1">
<gnue-report-request>
<report id="AP_VENDOR"/>
<parameter id="company" value="101"/>
<parameter id="vendor1" value="000000"/>
<parameter id="vendor2" value="ZZZZZZ"/>
<parameter id="withbal" value="y"/>
<sortoption id="byName"/>
<output definition="standard-ps"
destinationtype="printer"
destination="myprinter"/>
</gnue-report-request>
... (layout specific code goes here) ...
</gnue-report-output>
Aside from these elements, the rest of the output is determined by the layout section.
The following subsections identify three common types of output that will be generated by the reporting engine -- Informational Reports, Pre-printed Business Forms, and Data Files. It includes sample representations of these three types of reports. Please note, however, that this draft is not proposing the sample report definitions described below... more thought and discussion needs to go into those definitions. They strictly serve as examples. You have been warned.
Usually when one thinks of reports, they envision financial statements, batch edit reports, vendor lists, etc. For the sake of this draft, we will call these Informational Reports (whether they contain useful information or not is irrelevant ;-) [Perhaps a better description would be "End-User Reports" since these types of reports are typically used by the end-user. Any thoughts???]
Most of the time, this type of report will (should) have a fairly consistent look across the company. Take my company, for example. Most reports:
| ||||||||||||||||||||||||||||||
Also, many times the data is in a semi-tabular format (or can be written in a semi-tabular format).
The following output could be used to describe the contents of the report sample shown above:
<gnue-report-output format="standard" version="0.0.1">
<gnue-report-request>
...
</gnue-report-request>
<title>
<line>Vendor List in Vendor ID Order</line>
<line>My Company Name</line>
</title>
<sortdesc>Vendor ID Order</sortdesc>
<rundate format="MM/DD/YYYY HH:MI">02/28/2001 12:01</rundate>
<pageheader><left/><center/><right/></pageheader>
<pagefooter><left/><center/><right/></pagefooter>
<reportheader><left/><center/><right/></reportheader>
<reportfooter><left/><center/><right/></reportfooter>
<body>
<t>
<th><td>Vendor</td><td>Name</td><td>Zip</td></th>
<tr>
<td>123456</td>
<td>Cater, Jason</td>
<td>38117</td>
</tr>
<tr>
<td>234567</td>
<td>The Printing Company</td>
<td>38101</td>
</tr>
<tf/>
</t>
(2 Vendors)
</body>
</gnue-report-output>
This output could then be used to construct postscript, html or whatever is needed to format the report as the client wishes. The output above would be generated by a report definition such as the following:
<gnue-report version="0.0.1" id="AP_VENDORS"
name="Accounts Payable Vendors"> <parameters> <parameter id="company" type="char" name="Company"/> <parameter id="vendor1" type="char" name="Starting Vendor" default="00000000" required="no" values="select" limited="no"> select vendor, name from ap_vendors order by vendor; </parameter> <parameter id="vendor2" type="char" name="Ending Vendor" default="ZZZZZZZZ" required="no"/> <parameter id="withbal" type="char" name="Include Balances" required="yes" values="list" limited="yes"> <item value="y">Yes</item> <item value="n">No</item> </parameter> </parameters> <sortoptions> <sortoption id="byVendor" default="yes" name="Vendor ID Order"> <sortcolumn>vendor</sortcolumn> </sortoption> <sortoption id="byName" name="Vendor Name Order"> <sortcolumn>name</sortcolumn> <sortcolumn>vendor</sortcolumn> </sortoption> <sortoption id="byBalance" name="Balance Order"> <sortcolumn>balance</sortcolumn> <sortcolumn>vendor</sortcolumn> </sortoption> </sortoptions> <queries> <query name="main"> <sources> <source id="a" name="test_xml"/> <source id="b" name="companies"/> </sources> <columns> <column name="cname" source="b.company" group="main" size="30"/> <column name="name" source="a.name" group="VendorList" size="30" minsize="20" maxsize="40"> <summary name="ttl_names" function="count"/> </column> <column name="vendor" source="a.vendor" group="VendorList" size="12"/> <column name="zip" source="a.zip" group="VendorList" size="5"/> </columns> <qualification> <eq> <field id="a.company"/> <parameter id="company"/> </eq> <eq> <field id="b.company"/> <field id="a.company"/> </eq> </qualification> <orderby> <sort/> </orderby> </query> </queries> <layout format="standard" bestpaper="letter" bestorientation="landscape"> <title> <line>Accounts Payable Vendor List</line> <line>(<sortname/></line> </title> <pageheader><left/><center/><right/></pageheader> <pagefooter><left/><center/><right/></pagefooter> <reportheader><left/><center/><right/></reportheader> <reportfooter><left/><center/><right/></reportfooter> <body> <section group="Main" pagebreak="yes"> Vendors for <field name="cname"> <t> <th><td>Vendor</td><td>Name</td><td>Zip</td></th> <section group="VendorList" pagebreak="no"> <tr> <td><field name="vendor"/></td> <td><field name="name"/></td> <td><field name="zip"/></td> </tr> </section> <tf/> </t> (<summ name="ttl_names"/> Vendors) </section> </body> </layout> </gnue-report>
Another requirement for any reporting package is the ability to print on preprinted business forms. A common example, especially for an accounting package, is some sort of tax form. For our example, we will use a 1099-MISC form.
Below is an example of how the XML output stream might look for this style of report:
<gnue-report-output format="us-irs-1099misc" version="0.0.1">
<gnue-report-request>
...
</gnue-report-request>
<form>
<corrected>X</corrected>
<void>X</void>
<payer-fed-id>12-3456789</payer-fed-id>
<payer-address>
<line>My Company, Inc.</line>
<line>123 N Main St</line>
<line>Memphis, TN 38101</line>
</payer-address>
<recipient-name>Jason Cater</recipient-name>
<recipient-street>
<line>Apt 202</line>
<line>123 N School St</line>
</recipient-street>
<recipient-locale>
<line>Cordova, TN 38018</line>
</recipient-locale>
<account-number>JCATER001</account-number>
<box1>0.00</box1>
<box2>0.00</box2>
<box3>0.00</box3>
<box4>0.00</box4>
<box5>0.00</box5>
<box6>0.00</box6>
<box7>100,000,000.00</box7>
<box8>0.00</box8>
<box9></box9>
<box10>0.00</box10>
<box11>0.00</box11>
<box12>0.00</box12>
<box13>0.00</box13>
<box14>0.00</box14>
<box15></box15>
<box16>0.00</box16>
<box17>0.00</box17>
<box18>0.00</box18>
</form>
<form>
<corrected></corrected>
<void></void>
<payer-fed-id>12-3456789</payer-fed-id>
<payer-address>
<line>My Company, Inc.</line>
<line>123 N Main St</line>
<line>Memphis, TN 38101</line>
</payer-address>
<recipient-name>Bill Smith</recipient-name>
<recipient-street>
<line>201 N School St</line>
</recipient-street>
<recipient-locale>
<line>Memphis, TN 38018</line>
</recipient-locale>
<account-number>BSMITH001</account-number>
<box1>0.00</box1>
<box2>0.00</box2>
<box3>0.00</box3>
<box4>0.00</box4>
<box5>0.00</box5>
<box6>0.00</box6>
<box7>100,000.00</box7>
<box8>0.00</box8>
<box9></box9>
<box10>0.00</box10>
<box11>0.00</box11>
<box12>0.00</box12>
<box13>0.00</box13>
<box14>0.00</box14>
<box15></box15>
<box16>0.00</box16>
<box17>0.00</box17>
<box18>0.00</box18>
</form>
</gnue-report-output>
This output could then be fed through a transformation engine to produce code for line printers (for multi-part forms) or postscript for laser-style forms (or possibly even electronic submission).
Another requirement for any reporting package is the ability to generate data files. Typically, data files, or at least text-based files, can be classed as either a structured markup, delimited,or a fixed-length file. Structured markup files will not be examined as the reporting package by default emits a structured markup (XML).
<gnue-report-output format="datafile" version="0.0.1">
<gnue-report-request>
...
</gnue-report-request>
<format type="delimited" fieldseparator="," recordseparator="\n"/>
<record>
<value>Jason</value>
<value>Cater</value>
<value>123 N School St</value>
<value></value>
<value>Cordova</value>
<value>TN</value>
<value>38018</value>
<value>901-222-3333</value>
<value>Cater, Jason</value>
</record>
<record>
<value>Bryan</value>
<value>Cater</value>
<value>321 N School St</value>
<value></value>
<value>Cordova</value>
<value>TN</value>
<value>38018</value>
<value>901-222-4444</value>
<value>Cater, Bryan</value>
</record>
</gnue-report-output>
The final output would be:
Jason,Cater,123 N School St,,Cordova,TN,38018,"Cater, Jason" Bryan,Cater,321 N School St,,Cordova,TN,38018,"Cater, Bryan"
The <format type="..."> tag could have several predefined types: delimited, fixedlength, comma-delimited, tab-delimited, etc. The comma-delimited and tab-delimited would be the same as the delimited type, except fieldseparator and recordseparator would have predefined values.
As a report server cannot exist as an island, a language and platform independent method of communicating between clients and the server must exists. As with all else in GNUe, XML is chosen for this task.
A dynamic report client at some point will need to retrieve information about a report without actually running the report. An example is when the client must know what parameters a report will prompt for so that it can then prompt the user. The first communication structure is the request-header structure. This is sent from a client to the server to request information about a report.
<gnue-report-request-header version="0.0.1"> <report id="AP_VENDORS"/> </gnue-report-request-header>
The server will respond with a report header:
<gnue-report-header version="0.0.1">
<report id="AP_VENDORS" name="Accounts Payable Vendors"/>
<parameters>
<parameter id="company" name="Company"/>
<parameter id="vendor1" name="Starting Vendor" default="00000000"
required="no" values="select" limited="no">
select vendor, name from ap_vendors order by vendor;
</parameter>
<parameter id="vendor2" name="Ending Vendor" default="ZZZZZZZZ"
required="no"/>
<parameter id="withbal" name="Include Balances" required="yes"
values="list" limited="yes">
<item>y</item>
<item>n</item>
</parameter>
</parameters>
<sortoptions>
<sortoption id="byVendor" default="yes" name="Vendor ID Order"/>
<sortoption id="byName" name="Vendor Name Order"/>
<sortoption id="byBalance" name="Balance Order"/>
</sortoptions>
<saved-options id="LAST" name="Previous Report Options">
<parameter id="company" value="101"/>
<parameter id="vendor1" value="987650"/>
<parameter id="vendor2" value="987650"/>
<parameter id="withbal" value="n"/>
<sortoption id="byVendor"/>
<output definition="standard-pdf"
destinationtype="email-attachment"
destination="user@domain.com"/>
</saved-options>
<saved-options id="UserDef0001"
name="All Vendors in Name Order (With Balances)">
<parameter id="company" value="101"/>
<parameter id="vendor1" value="000000"/>
<parameter id="vendor2" value="ZZZZZZ"/>
<parameter id="withbal" value="y"/>
<sortoption id="byName"/>
<output definition="standard-ps"
destinationtype="printer"
destination="myprinter"/>
</saved-options>
</gnue-report-header>
Notice that the report header also passes saved-options. This optional feature lets a user define a common set of parameters for a report. They can then select these predefined options instead of repeatedly choosing the same parameters.
To initiate a report being run, a request structure is sent from a client to the server. This structure identifies the report to run and and any necessary parameters and sort-options.
<gnue-report-request version="0.0.1">
<report id="AP_VENDOR"/>
<parameter id="company" value="101"/>
<parameter id="vendor1" value="000000"/>
<parameter id="vendor2" value="ZZZZZZ"/>
<parameter id="withbal" value="y"/>
<sortoption id="byName"/>
<output definition="standard-ps"
destinationtype="printer"
destination="myprinter"/>
</gnue-report-request>
The server responds with a results structure.
<gnue-report-results version="0.0.1"> <status result="success">Your report was sent via email to user@domain.com</status> </gnue-report-results>
If an error occurs, the result status reflects this. Note that if a raw XML stream is selected as the destination, this result structure will contain a <gnue-report-output> with the report contents.
<gnue-report-results version="0.0.1">
<status result="success">Your report is enclosed</status>
<gnue-report-output>
...
</gnue-report-output>
</gnue-report-results>
One of the main drawbacks to this style of report markup lies within the layout section. Since tags within the layout section are dependent on the class of report being run, a single DTD cannot describe the generic report definition.
It should be relatively trivial to write a script to take a base report DTD and combine it with the class specific report DTD (a DTD describing the resultant output of a report) to form a report definition DTD. In other words, we can write a DTD to describe every aspect of a report definition except the layout section. This can be combined with a report specific DTD (e.g., the "1099-Misc" DTD) to form a report definition.
If this would be considered poor form, we could only use the four main markups in the layout section (section, field, summ, and parameter) and encode any report-specific codes using < and >. The example given in the "Informational Report" section would then be written as:
<layout>
<body>
<section group="Main" pagebreak="yes">
<t>
<th><td>Vendor</td><td>Name</td>
<td>Zip</td></th>
<section group="VendorList" pagebreak="no">
<tr>
<td><field name="vendor"/></td>
<td><field name="name"/></td>
<td><field name="zip"/></td>
</tr>
</section>
<tf/>
</t>
(<summ name="ttl_names"/> Vendors)
</section>
</body>
</layout>
I'm not sure which is more cumbersome: using a "<" style markup or using separate DTD for each report style.
Any thoughts?