For binary data, creating a
MIME attachment option:
I am dealing with the W3C XOP [1] and
W3C MTOM [2] specifications for another project, and I had the following
thoughts.
W3C XOP specifies an alternative
serialization of the XML Information Set [3] (alternative to XML 1.0)
and can always be used as an alternative solution to producing an XML
document that contains long Base64 character strings representing chunks
of binary data. XOP replaces a "verbose" XML document with a
MIME multipart file containing those binary data chunks in binary
form preceded by a modified XML document in which each Base64 character
string in the original XML document has been replaced by a special
XOP-defined element that references the corresponding binary data chunk
that is present further on in the MIME file.
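To make the replacement concrete, here is a minimal Python sketch of the XML-side transformation only. It assumes a made-up document with an `Image` element and a hypothetical Content-ID scheme (`partN@example.org`); neither comes from any NIST or W3C text. The only XOP-defined pieces are the `Include` element, its namespace, and the `cid:` reference in `href`. Assembling the MIME multipart package itself is left out, so the binary chunks are simply returned in a dictionary keyed by Content-ID.

```python
import base64
import xml.etree.ElementTree as ET

# Namespace of the XOP-defined Include element.
XOP_INCLUDE_NS = "http://www.w3.org/2004/08/xop/include"
ET.register_namespace("xop", XOP_INCLUDE_NS)


def extract_binary_parts(xml_text, binary_tag):
    """Replace Base64 content with xop:Include references.

    Returns (rewritten_xml, parts), where parts maps each Content-ID
    to the raw bytes that would travel as a separate MIME part.
    """
    root = ET.fromstring(xml_text)
    parts = {}
    # Take a snapshot of the matches first, because the tree is modified below.
    for index, elem in enumerate(list(root.iter(binary_tag))):
        content_id = f"part{index}@example.org"  # hypothetical naming scheme
        parts[content_id] = base64.b64decode(elem.text or "")
        elem.text = None  # drop the Base64 character string
        include = ET.SubElement(elem, f"{{{XOP_INCLUDE_NS}}}Include")
        include.set("href", f"cid:{content_id}")
    return ET.tostring(root, encoding="unicode"), parts


# Illustrative input: element names are invented for this example.
original = b"\x00\x01binary image bytes\xff"
sample = (
    "<Record><Image>"
    + base64.b64encode(original).decode("ascii")
    + "</Image></Record>"
)
xop_xml, parts = extract_binary_parts(sample, "Image")
```

In a real package, the rewritten XML would become the root part of the MIME file and each entry of `parts` would follow as its own part, in binary form.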
The good news is that any standard that
specifies an XML-based format can take advantage of XOP **almost
implicitly**, provided that the specification is expressed in terms of
the XML Information Set rather than in terms of "physical" XML elements
and attributes. Since I have done this in other cases, I know it is not
difficult.
In other words, a specification that
said, for example, "An *element information item* named so and so shall
have *attribute information items* such and such, and shall have certain
*element information items* among its [children], etc." (XML
Infoset terminology), would work equally well for describing either a
"verbose" XML document (containing Base64 strings) or
a less-verbose file in XOP format. (Alternatively, a statement upfront
in the specification saying that any occurrence of the word "element" is
an abbreviation of "element information item", any occurrence of the
word "attribute" is an abbreviation of "attribute information item",
etc., would also work.)
Note that use of the formal XML Infoset
terminology has become common in standards produced by the W3C and other
organizations (for example, W3C XML Schema [3], W3C SOAP 1.2 [4], W3C
WSDL 2.0 [5], ISO/IEC 24824-2 [7]), mainly because it allows the
standard to ignore many syntactic details of XML that are insignificant
and (at the same time) to be more precise as to what is really
intended. A consequence of using XML Infoset terminology is that other
non-XML serializations (such as Fast Infoset [7]) become possible, thus
allowing a more compact and fast-to-process representation while
keeping compatibility with XML technologies.
Conformance to a standard written
using formal XML Infoset terminology might be formulated so as to
require support for both XML 1.0 and XOP serializations, or (better yet)
could be structured into multiple levels (where level 2 could mean
mandatory support of both XML 1.0 and XOP, and level 1 would not require
any particular serialization format of the XML Infoset, leaving the
serialization aspect up to the implementation). The latter is being
done for several W3C standards.
The conclusion is that, by using this
approach, the "verbosity" problem of XML with regard to the transfer of
biometric sample images or other potentially large binary data chunks
(which need to be encoded in Base64) is implicitly solved. A "writer"
would be able to choose between an ordinary XML 1.0 format (when the
binary data is small or size is not a concern) and an XOP format. A
"reader" would need to accept both formats (but note that since an XOP
processor must necessarily understand XML 1.0, supporting both is no
more expensive than supporting an XOP format only).
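The "reader" side can be sketched in the same spirit. The Python fragment below is only an illustration under stated assumptions: the `Image` element and the `parts` dictionary (Content-ID to bytes, standing in for the already-parsed MIME parts) are hypothetical, and percent-encoding inside `cid:` references is ignored. It shows that one code path can serve both serializations: if the element holds an XOP `Include`, the bytes are looked up by Content-ID; otherwise the Base64 text is decoded.

```python
import base64
import xml.etree.ElementTree as ET

# Qualified name of the XOP-defined Include element.
XOP_INCLUDE = "{http://www.w3.org/2004/08/xop/include}Include"


def read_binary(elem, parts=None):
    """Return the binary content of elem for either serialization.

    parts is a hypothetical mapping of Content-ID to raw bytes, standing
    in for the MIME parts of an XOP package; it is unused for plain XML.
    """
    include = elem.find(XOP_INCLUDE)
    if include is None:
        # Ordinary XML 1.0 form: the content is a Base64 string.
        return base64.b64decode(elem.text or "")
    href = include.get("href", "")
    if not href.startswith("cid:"):
        raise ValueError(f"unsupported reference: {href!r}")
    # XOP form: follow the cid: reference to the binary part.
    return (parts or {})[href[len("cid:"):]]


# The same three bytes, once as Base64 text and once as an XOP reference.
plain = ET.fromstring("<Image>AAEC</Image>")
xop = ET.fromstring(
    '<Image><xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include"'
    ' href="cid:part0@example.org"/></Image>'
)
from_plain = read_binary(plain)
from_xop = read_binary(xop, {"part0@example.org": b"\x00\x01\x02"})
```

Both calls yield the same bytes, which is the sense in which the two formats describe the same Infoset content.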
(W3C MTOM is about using XOP in the
context of SOAP, and therefore it is not relevant to this discussion.)
From
Dave Weston:
On NIST-ITL and CBEFF "Harmonization"
One of the objectives of the
NIST-ITL 2005 update, widely discussed in the workshop meeting, was
to bring about harmonization of the NIST-ITL standard with the
SC37/M1 biometric standards. I don't think we should give up on
this objective, and the CBEFF proposal should not be dismissed so
easily; I join you in hoping that a convergent solution can be
found. In particular, CBEFF's flexibility to nest biometrics, while
advantageous in the long term, needs to be restrained to the simpler
two-tier transaction-record NIST structure, so as to satisfy those
parties that need to see a clear one-to-one mapping of NIST's ASCII
representation to NIST's XML representation. And the NIST domain
and domain owner concept must continue to be represented, perhaps as
CBEFF clients, although I haven't seen this mentioned. We will continue
to look at ways to effect convergence.
From
Alan Viars:
On accommodating multiple data models:
Is it
possible to satisfy these groups by finding a way to embed their
“style of data” into the lean model? This may cause a data
duplication problem, I know, but maybe that is acceptable.
I am suggesting designating places within the lean model to store
CBEFF (perhaps a record type), GJXDM (perhaps in a specific type-2
field) and DOD-specific XML data (perhaps also an XML document
embedded in a special location in type-2).
I guess I’m suggesting looking for a way to possibly “attach”
GJXDM-, DOD-, and CBEFF-formatted XML data to the lean model.