Schedule for Saturday
chaired by yours truly, Jirka Kosek (morning), and Mohamed Zergaoui (afternoon)
9:30 | Registration desk opens |
10:00 | Opening of the last conference day |
10:10 | X-definition 3.1 Václav Trojan, Jindřich Kocman and Jiří Kamenický (Syntea software group a.s.) |
10:40 | Relational and Semantic Views over Documents John Snelson (MarkLogic) |
11:10 | Coffee break |
11:40 | XQuery/XSLT/XPath 3.x: Ready to Deploy Michael Kay (Saxonica) |
12:10 | On the Descriptions of Data Steven Pemberton (CWI) |
12:40 | FOXpath navigation of physical, virtual and literal file systems Hans-Juergen Rennau (Traveltainment GmbH) |
13:10 | Lunch |
14:40 | DHW: An online introductory toolset for XML encoding Alejandro Bia (Miguel Hernández University) |
15:10 | A Text Structure “Epischema” for TEI Gerrit Imsieke (le-tex publishing services GmbH) |
15:40 | Coffee break |
16:10 | CSS for Print via XSL-FO George Bina and Dan Caprioara (Syncro Soft) |
16:40 | The Merger Between IDPF and W3C and the Future of EPUB Liam Quin (W3C) |
17:00 | Closing of the conference |
Session details
X-definition 3.1
Václav Trojan, Jindřich Kocman and Jiří Kamenický (Syntea software group a.s.)
X-definition is a programming language designed for describing the structure of an XML document and for validating, processing and even constructing it. An X-definition is itself an XML document, and its content is composed of models of elements.
Relational and Semantic Views over Documents
John Snelson (MarkLogic)
SQL is the norm. The relational model has had 45 years of dominance in database users' hearts and minds, and has seeded a vast database tools, BI, and ETL market. Whatever your thoughts on the database market, it’s hard to escape the ubiquity of SQL and the relational model.
However, developers are increasingly learning to embrace the benefits of document databases, and to understand the advantages of hierarchical and ordered data models. Returning to more natural data modelling concepts like entities and relationships, they are rightly beginning to view third normal form as distinctly abnormal.
But if my logical entity is represented by a document, how do I use it from SQL? I may wish to use BI tools on my data, or allow colleagues without a document database background access to it. I may find the uniformity and strong mathematical foundation of querying using relational algebra compelling. Similarly, I may wish to expose some of my data as RDF for use in data integration or Semantic Web projects.
The solution can be found in a declarative live transformation into the target data model (relational or RDF), which is always kept up-to-date – where the data is kept in a logical document model, and exposed through views as more query, domain, or user friendly structures.
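To make the idea of a declarative view over documents concrete, here is a minimal sketch in Python. It is an illustration only and not MarkLogic's mechanism: the view format, the `CUSTOMER_VIEW` name and the sample documents are all assumptions made up for this example. The view is plain data (column name to key path), and rows are computed lazily from the documents, so they always reflect the current document state.

```python
from typing import Any, Iterable, Iterator

# Hypothetical view definition: column name -> path of keys into the document.
CUSTOMER_VIEW = {
    "id": ("id",),
    "name": ("name",),
    "city": ("address", "city"),
}

def lookup(doc: dict, path: tuple) -> Any:
    """Follow a key path into a nested document; return None if a step is missing."""
    node: Any = doc
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

def view_rows(docs: Iterable[dict], view: dict) -> Iterator[tuple]:
    """Project each document to one row, computed on demand from the documents."""
    for doc in docs:
        yield tuple(lookup(doc, path) for path in view.values())

# Made-up sample documents; the second one has no address.
docs = [
    {"id": 1, "name": "Ada", "address": {"city": "Prague"}},
    {"id": 2, "name": "Bob"},
]
rows = list(view_rows(docs, CUSTOMER_VIEW))
```

Missing values surface as `None`, which plays the role of SQL `NULL` in this sketch; a real system would also need typing, indexing and a mapping for repeated elements.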
slides | MarkLogic workspace with examples
XQuery/XSLT/XPath 3.x: Ready to Deploy
Michael Kay (Saxonica)
It’s taken 10 years, but XQuery and XPath 3.1 are now Proposed Recommendations, and XSLT 3.0 is following close behind. The last year has been spent crossing the t’s and dotting the i’s – that is, fixing minor bugs. Getting these specs through the W3C process means not just having bug-free specs, it also means being able to demonstrate viable implementations; one contribution to that has been the development of a large test suite. The interoperability workshops on XSLT 3.0 held at XML Prague and XML London in 2016 were another part of the process.
This talk will provide a reminder of the headline features in the new versions of the specifications, focusing on the things that are “game-changers” in the sense that they make new kinds of applications possible; and it will discuss the maturity of the specifications in an attempt to answer the question “when can I actually take advantage of these features in real life?”.
On the Descriptions of Data
Steven Pemberton (CWI)
Usability describes the ease with which you can use something: how long it takes to achieve your aims, how correctly, and whether it is enjoyable in the process.
While this is normally applied to interactions with processes, such as computer programs, or machines, it is also applicable to notations: how easily can you achieve what you are trying to do, does the notation aid you in avoiding errors, and, indeed, is it enjoyable to do? However, surprisingly little attention is paid to designing notations for usability.
Invisible XML (ixml) is a technique for treating any parsable format as if it were XML, and thus allowing any parsable object to be injected into an XML pipeline. It uses a notation for describing data formats that are to be parsed.
Earlier papers on ixml discuss the design of the notation based on functional requirements of the language. This paper discusses changes to the design following experience with using it, gives examples of its use to develop data descriptions, and, in passing, suggests other output formats.
FOXpath navigation of physical, virtual and literal file systems
Hans-Juergen Rennau (Traveltainment GmbH)
The FOXpath language extends the XPath language by adding support for file system navigation. This paper explores ways to extend file system navigation beyond physical file systems to include logical file systems like jar files, SVN repositories or GitHub projects. The extension is based on a set of simple concepts related to URIs and their processing, and it is implemented as a FOXpath processor which supports the navigation of physical and various types of logical file systems.
DHW: An online introductory toolset for XML encoding
Alejandro Bia (Miguel Hernández University)
In this paper we will describe a set of online tools built initially for teaching XML encoding, though they can be used for production as well.
This set of tools comprises:
– An online platform with tools to validate, pretty-print, edit and transform XML documents.
– Tools for automatic XML-TEI markup from a lightweight markup language.
– Tools to graphically visualize and design markup vocabularies and XML document instances.
– XSLT processing.
– XPath evaluation.
These tools will be briefly showcased during the conference presentation.
A Text Structure “Epischema” for TEI
Gerrit Imsieke (le-tex publishing services GmbH)
This paper presents an underutilized mechanism for XML document grammar customization. Instead of altering the base schema or adding Schematron constraints, a second grammar-implementing schema is associated with the document. This second schema will enforce structural constraints where the basic schema is liberal. This second schema is lightweight in that it allows anything anywhere except for a certain aspect for which it adds grammatical constraints over the permissive base schema. We call this additional, sparse, aspect-oriented schema an Epischema. An example to which this concept is applied is TEI’s notoriously generic div hierarchy, where the div/@type attribute can assume arbitrary values. Generic vocabularies such as TEI and HTML are increasingly used by publishers as the primary source format. These publishers ask for prescriptive constraints to be imposed on top of basic schema conformance. An advantage of epischemas over the commonplace Schematron constraints is that they allow better context-aware markup completion in authoring systems.
CSS for Print via XSL-FO
George Bina and Dan Caprioara (Syncro Soft)
The problem with XSL-FO is that it is complex to create and modify, and people prefer to customize PDF using CSS rather than creating/modifying an XSLT stylesheet that generates XSL-FO. Thus, CSS for print has received more traction lately. There are many initiatives for various XML vocabularies to provide support for CSS for print (for example, there are two open-source projects to generate PDF from DITA using CSS).
On the other hand, there are a number of XSL-FO engines available (including the open-source Apache FOP engine) that provide reasonable support for XSL-FO to produce PDF.
In order to leverage the existing FO processors for CSS-based PDF, we can support CSS for print by implementing a conversion from XML+CSS to XSL-FO and then applying an FO processor to get the actual PDF output.
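The conversion step can be pictured with a small Python sketch. This is not the engine described in the talk: the `STYLES` table, the `to_fo_blocks` function and the sample document are assumptions for illustration, and the CSS is taken as already parsed and keyed by element name. It relies only on the fact that properties such as `font-size`, `font-weight` and `color` carry the same names in CSS and XSL-FO.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Hypothetical stylesheet, already parsed: element name -> CSS declarations.
STYLES = {
    "title": {"font-size": "18pt", "font-weight": "bold"},
    "para": {"font-size": "10pt", "color": "#333333"},
}

def to_fo_blocks(xml_text: str, styles: dict) -> ET.Element:
    """Turn each child of the root into an fo:block with its CSS declarations as attributes."""
    source = ET.fromstring(xml_text)
    flow = ET.Element(f"{{{FO_NS}}}flow", {"flow-name": "xsl-region-body"})
    for child in source:
        # Copy the declarations so the shared style table is never mutated.
        block = ET.SubElement(flow, f"{{{FO_NS}}}block", dict(styles.get(child.tag, {})))
        block.text = child.text
    return flow

flow = to_fo_blocks("<doc><title>Hello</title><para>World</para></doc>", STYLES)
fo_text = ET.tostring(flow, encoding="unicode")
```

The result is only an `fo:flow` fragment. A complete pipeline would still have to handle selectors and the cascade, wrap the flow in `fo:root` with page masters, and pass the document to an FO processor to obtain the PDF.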
We will show the anatomy of such an engine that implements CSS for print using XSL-FO as an intermediary format. We will focus on the advantages of such an approach as well as discussing challenges we encountered during implementation. Some advanced CSS level 3 and level 4 functions are essential for more advanced layouts or rendering of information. We also propose a few CSS extensions that can be very useful, such as a function to provide XPath evaluation support.
The Merger Between IDPF and W3C and the Future of EPUB
Liam Quin (W3C)
Liam Quin of W3C will present on the status of the planned merger
between IDPF and W3C and will explain why this is seen by both
organizations as a good thing for EPUB and for the ebook market. The
status of the EPUB specifications will also be covered, along with
(where it’s known and public) the future plans.