[Gnue-dev] GNUe Reports and xmlns, oh my!


From: Jason Cater
Subject: [Gnue-dev] GNUe Reports and xmlns, oh my!
Date: Wed, 20 Mar 2002 15:54:03 -0600

After much time off from GNUe Reports, I'm finally able to tackle it
once again.  At this point in the development, I have the bulk of the
"reporting engine" completed. By "engine", I'm referring to the code
that extracts data from the database, groups repeating data, generates
summaries, and outputs an XML stream.

It is at this point, the outputting of XML, that I got stuck. The
Reporting Engine, as it sits, will output any type of XML you throw at
it.  The underlying layout engine only recognizes a handful of tags:

  <section>   This denotes a [repeating] section. It can have no
              parent datasource or it can be tied to a datasource
              as either a primary section or a supplementary
              grouping section.  Sections can contain other
              sections or fields and summaries.

  <field>     This inserts a field's value into the current
              position

  <summ>      Similar to <field>, except this inserts a field
              "summary" (e.g., average, total, count, ...)

Combined with triggers and the (possible) addition of formulas (which
would be a cross between summaries and triggers) these few tags handle
all the underlying data transformation needs. (Note: this only refers to
the output generating <layout> section... there are still other
underlying tags [datasources, parameters, sorting, etc] )

Obviously, however, these few tags do nothing with regard to generating
layout codes. So, more layout logic has to be added to the output XML
stream. This could be relative formatting codes such as <reportheader>,
<pageheader>, <column>, etc., or absolute positioning codes, such as
<label x="300" y="200" size="12pt">, etc.

At this point, I don't care WHAT the output XML stream will consist of
-- only that there has to be *something* more than the transformation
logic listed above (<section>, <field>, <summ>). Also, these 3
transformation tags are REPLACED with their results and do not reappear
in the output stream.  To illustrate my point, suppose the following
layout section is tied to a zipcode datasource:

<layout>
  <section source="dtsStates">
    <field name="state"/>
    <section>
      <field name="city"/>
    </section>
  </section>
</layout>

With no other markup in the layout section, the following would be
produced:

AK ADAK
   AKHIOK
   AKIACHAK
   AKIAK
   AKUTAN
   ALAKANUK
   ALEKNAGIK
   ALLAKAKET
   AMBLER
   ANAKTUVUK
   ANAKTUVUK PASS
   ANCHOR POINT
   ANCHORAGE
   ANDERSON
  <..snip..>
AL ABBEVILLE
   ABEL
   ABERNANT
   ACMAR
   ADAMSVILLE
   ADDISON
   ADGER
   AKRON
   ALABASTER
   ALBERTA
   ALBERTVILLE
  <..snip..>
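
As a rough illustration of the grouping behind the listing above (a
sketch only, not the actual engine code), the following assumes the
datasource hands back flat (state, city) rows already sorted by state.
The ROWS data and the render_plain() helper are made up for this
example, and the indentation only mimics how the listing is displayed.

```python
from itertools import groupby

# Hypothetical rows as a datasource might return them, already sorted
# by state; the real engine's row format is not shown in this message.
ROWS = [
    ("AK", "ADAK"),
    ("AK", "AKHIOK"),
    ("AL", "ABBEVILLE"),
    ("AL", "ABEL"),
]


def render_plain(rows):
    """Show each state once, followed by its cities."""
    lines = []
    for state, group in groupby(rows, key=lambda row: row[0]):
        for index, (_, city) in enumerate(group):
            # Only the first row of a group repeats the state value.
            prefix = state if index == 0 else " " * len(state)
            lines.append(f"{prefix} {city}")
    return lines


lines = render_plain(ROWS)
print("\n".join(lines))
```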

Notice there are no markup indicators in the stream. So, something has to
be added for this output to be usefully transformed into a variety of
reports. For the sake of argument, suppose we decide to primarily use
the following set of tags in our output stream:

  <reportheader>
  <reportfooter>
  <pageheader>
  <pagefooter>
  <band>
  <bandheader>
  <bandfooter>
  <bandrow>
   ...

The first output stream might then look like:

<reportheader>Cities by State</reportheader>
<pageheader/>
<band>
  <bandrow><text>AK</text>
    <band>
      <bandrow><text>ADAK</text></bandrow>
      <bandrow><text>AKHIOK</text></bandrow>
      <bandrow><text>AKIACHAK</text></bandrow>
      <bandrow><text>AKIAK</text></bandrow>
      <bandrow><text>AKUTAN</text></bandrow>
      <bandrow><text>ALAKANUK</text></bandrow>
      <bandrow><text>ALEKNAGIK</text></bandrow>
      <bandrow><text>ALLAKAKET</text></bandrow>
      <bandrow><text>AMBLER</text></bandrow>
      <bandrow><text>ANAKTUVUK</text></bandrow>
      <bandrow><text>ANAKTUVUK PASS</text></bandrow>
      <bandrow><text>ANCHOR POINT</text></bandrow>
      <bandrow><text>ANCHORAGE</text></bandrow>
      <bandrow><text>ANDERSON</text></bandrow>

  ..snip..

    </band>
  </bandrow>
  <bandrow><text>AL</text>
    <band>
      <bandrow><text>ABBEVILLE</text></bandrow>
      <bandrow><text>ABEL</text></bandrow>
      <bandrow><text>ABERNANT</text></bandrow>
      <bandrow><text>ACMAR</text></bandrow>
      <bandrow><text>ADAMSVILLE</text></bandrow>
      <bandrow><text>ADDISON</text></bandrow>
      <bandrow><text>ADGER</text></bandrow>
      <bandrow><text>AKRON</text></bandrow>
      <bandrow><text>ALABASTER</text></bandrow>
      <bandrow><text>ALBERTA</text></bandrow>
      <bandrow><text>ALBERTVILLE</text></bandrow>

  ..snip..

    </band>
  </bandrow>
</band>


Since I know it will come up, I'm *NOT* proposing this to be the markup!
I'm simply using this as an example.

The layout section for the above might then look like:

<layout>
  <reportheader>Cities by State</reportheader>
  <pageheader/>
  <band>
  <section source="dtsStates">
    <bandrow>
    <text><field name="state"/></text>
    <band>
    <section>
      <bandrow>
      <text><field name="city"/></text>
      </bandrow>
    </section>
    </band>
    </bandrow>
  </section>
  </band>
</layout>


All of the output stream tags would then just be passed through to the
output stream and the transformation tags (section, field, summ) would
be replaced with the data.
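
A minimal sketch of that pass-through behaviour, assuming the data has
already been grouped into nested records (the DATA structure, its
"rows" key, and the expand() helper are hypothetical, not the engine's
real API). <section> and <field> are replaced by data; every other tag
is copied to the output. <summ> and text following a child element are
left out to keep the example short.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

LAYOUT = """<layout>
  <reportheader>Cities by State</reportheader>
  <band>
    <section source="dtsStates">
      <bandrow>
        <text><field name="state"/></text>
        <band>
          <section>
            <bandrow><text><field name="city"/></text></bandrow>
          </section>
        </band>
      </bandrow>
    </section>
  </band>
</layout>"""

# Hypothetical pre-grouped data: each record carries its child rows.
DATA = [
    {"state": "AK", "rows": [{"city": "ADAK"}, {"city": "AKHIOK"}]},
    {"state": "AL", "rows": [{"city": "ABBEVILLE"}]},
]


def expand(node, record):
    """Return the output stream produced by the children of `node`."""
    parts = []
    for child in node:
        if child.tag == "field":
            # Transformation tag: replaced by the field's value.
            parts.append(escape(str(record[child.get("name")])))
        elif child.tag == "section":
            # Transformation tag: repeated once per child record.
            for sub in record["rows"]:
                parts.append(expand(child, sub))
        else:
            # Formatting tag: copied to the output stream as-is.
            attrs = "".join(
                f" {key}={quoteattr(value)}"
                for key, value in child.attrib.items()
            )
            text = escape((child.text or "").strip())
            inner = expand(child, record)
            parts.append(f"<{child.tag}{attrs}>{text}{inner}</{child.tag}>")
    return "".join(parts)


stream = expand(ET.fromstring(LAYOUT), {"rows": DATA})
print(stream)
```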

Ok; so far, so good.... However, I have two problems with this approach:

  1) Looking at the .GRD report definition, it's hard to
     distinguish which tags are transformation tags and which
     are output formatting tags. With a visual editor,
     you could argue that this point is minimal; but, more
     importantly to me:

  2) This binds our GRD format to a specific output stream markup.
     To some, this might not be a big deal, but I'm *strongly*
     against this. I *don't* want to hardcode formatting tags in
     the reports definition simply because I feel this would
     greatly diminish the usefulness of the underlying reports
     engine.

I considered for a while using multiple, dynamically-generated DTDs
based on the underlying output stream format. This would require a
different GParser file for each format.  This would, of course,
introduce a lot of new requirements into the reports engine code. I
*really* didn't want to go this route.

Pursuing the issue further, it dawned on me that the report engine
doesn't really even want to know about the formatting tags... it simply
wants to pass those straight through to the output stream. So idea #2
was to have all the formatting tags encoded as plain text. In other
words, <reportheader> would actually be stored in the report definition
as &lt;reportheader&gt;. Our sample layout would then look like:

<layout>
  &lt;reportheader&gt;Cities by State&lt;/reportheader&gt;
  &lt;pageheader/&gt;
  &lt;band&gt;
  <section source="dtsStates">
    &lt;bandrow&gt;
    &lt;text&gt;<field name="state"/>&lt;/text&gt;
    &lt;band&gt;
    <section>
      &lt;bandrow&gt;
      &lt;text&gt;<field name="city"/>&lt;/text&gt;
      &lt;/bandrow&gt;
    </section>
    &lt;/band&gt;
    &lt;/bandrow&gt;
  </section>
  &lt;/band&gt;
</layout>

Well, this "solution" beautifully solves the problem from the viewpoint
of the report engine.  However, this introduces two problems:

  1) Similar to point #1 above, this makes the report definition
     XML harder to read.  It would be easy to visualize the
     transformation logic, but much harder to visualize the
     formatting logic. As with point #1 above, this problem is
     minimized if a visual designer is used. However, ...

  2) This would make a visual designer harder to code. In effect,
     each report definition contains at least two independent XML
     streams.  The first stream would be parsed to generate the
     GObject tree, as is done now.  A second XML parsing might
     have to take place to decode the formatting logic.

I still preferred this approach to the former approach, as this approach
trivializes the output stream.  The resulting xml is converted from the
text content of the source definition.
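
To make the escaped-text idea concrete, here is a small sketch under
the same assumptions as before (nested records with a made-up "rows"
key, and a hypothetical emit() helper). The XML parser decodes &lt; and
&gt; while reading the definition, so the engine only has to write the
text content back out; the formatting tags reappear as real markup
without the engine ever looking at them.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

LAYOUT = """<layout>
  &lt;band&gt;
  <section source="dtsStates">
    &lt;bandrow&gt;&lt;text&gt;<field name="state"/>&lt;/text&gt;&lt;/bandrow&gt;
  </section>
  &lt;/band&gt;
</layout>"""

# Hypothetical records for the states datasource.
STATES = [{"state": "AK"}, {"state": "AL"}]


def emit(node, record):
    """Write text content verbatim and replace the transformation tags."""
    # node.text was already decoded by the parser: "&lt;band&gt;" is now
    # "<band>", so writing it out unchanged yields real markup.
    parts = [node.text or ""]
    for child in node:
        if child.tag == "field":
            parts.append(escape(str(record[child.get("name")])))
        elif child.tag == "section":
            for sub in record["rows"]:
                parts.append(emit(child, sub))
        # Text that follows the child element belongs to the output too.
        parts.append(child.tail or "")
    return "".join(parts)


stream = emit(ET.fromstring(LAYOUT), {"rows": STATES})
print(stream)
```

Anything that wants to understand the formatting tags (a designer, for
instance) has to parse the emitted text a second time, which is the
drawback described in point 2 above.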

Then I saw the light: XML Namespaces! Let me explain by example.. Our
sample layout section could now look like:

<layout xmlns:output="GNUe:Reports:Standard">
  <output:reportheader>Cities by State</output:reportheader>
  <output:pageheader/>
  <output:band>
  <section source="dtsStates">
    <output:bandrow>
    <output:text><field name="state"/></output:text>
    <output:band>
    <section>
      <output:bandrow>
      <output:text><field name="city"/></output:text>
      </output:bandrow>
    </section>
    </output:band>
    </output:bandrow>
  </section>
  </output:band>
</layout>
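
A short sketch of how a namespace-aware parser sees such a layout,
assuming the prefix is declared as xmlns:output on <layout>. It uses
Python's standard xml.sax module with namespace processing switched on;
the Classifier handler is made up for this example and only sorts tag
names into two lists.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

OUTPUT_NS = "GNUe:Reports:Standard"

LAYOUT = """<layout xmlns:output="GNUe:Reports:Standard">
  <output:reportheader>Cities by State</output:reportheader>
  <output:band>
    <section source="dtsStates">
      <output:bandrow>
        <output:text><field name="state"/></output:text>
      </output:bandrow>
    </section>
  </output:band>
</layout>"""


class Classifier(ContentHandler):
    """Hypothetical handler: separates engine tags from output tags."""

    def __init__(self):
        super().__init__()
        self.transform = []    # tags the report engine must process
        self.passthrough = []  # tags to hand on untouched

    def startElementNS(self, name, qname, attrs):
        uri, local = name  # uri is None for tags without a namespace
        if uri == OUTPUT_NS:
            self.passthrough.append(local)
        else:
            self.transform.append(local)


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
handler = Classifier()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(LAYOUT.encode("utf-8")))

print("engine tags:", handler.transform)
print("output tags:", handler.passthrough)
```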

This achieves several goals:

  1. It is easy to see which tags are standard report definition
     tags and which are formatting/layout tags -- all formatting
     tags are denoted with a namespace designation.

  2. This would require minimal changes to the report parser as
     Python's SAX2 parser, with namespace processing enabled,
     differentiates between standard XML tags and
     namespace-qualified tags.  The primary GRParser markup
     would only contain the primary report definition tags (with
     the transformation layout tags).  The parser would be
     instructed to simply pass the namespace-qualified tags through
     as some generic (yet-to-be-created) GObject, a GObject that
     the report engine knows to ignore during processing, and pass
     through unchanged when outputting.

  3. GNUe Designer would not simply ignore the namespace GObject,
     but would know how to interpret these and visually lay out the
     report design on the screen.  The existing GObject code in
     Designer could also still be used since the complete report
     is a standard GObject tree (albeit some GObjects are ignored
     by the report engine.)

  4. The xmlns:output="GNUe:Reports:xxxx" declaration implicitly
     provides a standard naming system for various report formats.
     Of course, GNUe, the ERP application, will probably only have
     one format (which, of course, would be the format that GNUe
     Designer would design against.)  But other layouts would still
     be possible/easy to implement.

  5. This leaves the door open for the Reports Engine to "proxy"
     output formatting logic. In other words, this would allow us, if
     we ever so desired, to write a PDF (or HTML or TeX) formatting
     engine (instead of an external XSLT script) for reports of type
     GNUe:Reports:Standard that the engine could load.  Instead of
     physically creating an XML output stream, it creates a stream
     of events to this engine, which would understand the xmlns
     tags.  Or, more to the point, the engine would send a
     "reportheader" event to the GNUe:Reports:Standard:PDF
     formatting engine, which would be creating a "PDF" output
     stream.

     I'm not suggesting that we pursue the "formatting engine", but
     just suggesting that the possibility would always be there
     without much recoding. Of course, I never said that I won't
     officially suggest this later on. :)
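
To illustrate the "proxy" idea in point 5, here is a sketch in which
the engine sends start/text/end events to whichever formatting engine
is registered for the report's namespace. Every name in it (the
FORMATTERS registry, both formatter classes, the run() helper and the
event tuples) is an assumption made for this example; a plain-text
formatter stands in for a real PDF, HTML or TeX engine.

```python
from xml.sax.saxutils import escape


class XmlStreamFormatter:
    """Default behaviour: write the events back out as an XML stream."""

    def __init__(self):
        self.parts = []

    def start(self, tag):
        self.parts.append(f"<{tag}>")

    def text(self, data):
        self.parts.append(escape(data))

    def end(self, tag):
        self.parts.append(f"</{tag}>")

    def result(self):
        return "".join(self.parts)


class PlainTextFormatter:
    """Stand-in for a PDF/HTML/TeX engine: renders events as text lines."""

    def __init__(self):
        self.lines = []
        self.buffer = ""

    def start(self, tag):
        self.buffer = ""

    def text(self, data):
        self.buffer += data

    def end(self, tag):
        if tag == "reportheader":
            # Underline the report title.
            self.lines += [self.buffer, "=" * len(self.buffer)]
        elif tag == "text":
            self.lines.append(self.buffer)
        self.buffer = ""

    def result(self):
        return "\n".join(self.lines)


# Hypothetical registry keyed by (namespace, output kind).
FORMATTERS = {
    ("GNUe:Reports:Standard", "xml"): XmlStreamFormatter,
    ("GNUe:Reports:Standard", "text"): PlainTextFormatter,
}


def run(events, namespace, kind):
    """Send each (event, argument) pair to the selected formatter."""
    formatter = FORMATTERS[(namespace, kind)]()
    for name, argument in events:
        getattr(formatter, name)(argument)
    return formatter.result()


EVENTS = [
    ("start", "reportheader"), ("text", "Cities by State"),
    ("end", "reportheader"),
    ("start", "bandrow"), ("start", "text"), ("text", "AK"),
    ("end", "text"), ("end", "bandrow"),
]

xml_out = run(EVENTS, "GNUe:Reports:Standard", "xml")
text_out = run(EVENTS, "GNUe:Reports:Standard", "text")
print(xml_out)
print(text_out)
```

The same event list drives both formatters, so the engine itself never
needs to know which output format was requested.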

So, unless someone gives me a compelling reason otherwise, this is the
approach I want to take.


