Day 2

Schedule for Friday

9:00	Registration desk opens
9:30	Opening and sponsors presentation
9:40	Invisible XML: State of Play and Future Directions	Steven Pemberton
10:10	Crane-txt2xml – an attempt to socialize XML for non-XML’ers	G. Ken Holman
10:40	XForms Implementation of a Federated Electronic Health Record, Using a Blockchain	John Chelsom and Mirek Mužný
11:10	Coffee break
11:30	Standards update	Norm Tovey-Walsh
11:45	Schematron 2025 – Technology Update	Erik Siegel
12:15	Implementing Maps for XPath 4.0	Michael Kay
13:00	Lunch
14:30	Integrating AI into XML Development Workflows	Octavian Nadolu
15:00	Publisher case study for XML-enabled quality-control techniques for journal metadata, pagination, and equations	M. Scott Dineen
15:30	XTH – An implementation agnostic Test Suite Runner	Adam Retter
16:00	Coffee break
16:30	Excel to XML for Financial Report	Tony Graham
17:00	Ant Visualiser	Ari Nordström
17:30	Comparative study of “PDF to AI format” converters	Elena Montero Maousidou
17:45	Jewels in Plain Sight	Eamonn Neylon
18:00	Closing of the day
19:00	Social Dinner – different location than the last time!!!
21:00	Music surprise 🎸

Session details

Invisible XML: State of Play and Future Directions

Steven Pemberton

Invisible XML has had a stable specification since 2022, there are currently a half dozen implementations, and typically a dozen presentations per year have recently been given at conferences. At the beginning of 2026 the first International Symposium on the technology was held, with 14 presentations and 40 or so attendees. Meanwhile there is a working group developing the language further.

This talk will summarise the topics and issues currently being discussed within the group, and will cover such issues as:

Renaming
Modularisation
Round tripping
Ambiguity
Pragmas
Lexerless parsing
Namespaces
Versioning

Crane-txt2xml – an attempt to socialize XML for non-XML’ers

G. Ken Holman

Many people are tasked with creating XML documents manually (ad hoc invoices, articles, metadata records) but find XML syntax cumbersome or intimidating. Meanwhile, Invisible XML infers structure from simple text. Crane-txt2xml is a configurable environment, implementable for certain types of schemas, that creates schema-derived iXML grammars governing text input. This opens interesting possibilities, including multilingual element and attribute labels that generate XML element and attribute names. Thus, this environment attempts to socialize XML for non-XML’ers with simple textual content editing rules for conversion to schema-valid XML.

XForms Implementation of a Federated Electronic Health Record, Using a Blockchain

John Chelsom and Mirek Mužný

Valkyrie – distributed service-oriented architecture for coordinated healthcare services – is a six-year project running from 2021 to 2027, funded by The Research Council of Norway. The project has created an architecture for a federated Electronic Health Record (EHR) to be used by mental health practitioners. According to this architecture, recording information about a clinical encounter in any existing health record in Norway triggers the generation of an encrypted token, which contains basic metadata about the encounter and a locator that can be used to access a view of the full clinical information on the source EHR.

The encrypted tokens are sent through a secure messaging channel to the Valkyrie system, where the set of tokens related to each individual patient is formed into a blockchain that represents the full (virtual) record for that patient. During an encounter between a patient and their mental health practitioner, the blockchain for that patient is retrieved and the metadata in the encrypted tokens is used to determine which parts of the patient’s clinical record (history) are relevant to the encounter. A summary of the relevant clinical history is presented, and the full view of any encounter listed in that summary can be retrieved from the original source EHR, using the locator embedded in the associated encrypted token.

This paper describes the implementation of an end-to-end prototype of the Valkyrie architecture created using XForms and the cityEHR open source electronic records system. The prototype was scale-tested using a network of four Raspberry Pi 5 computers, one running the Valkyrie EHR and three simulating the source EHR systems.
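
As a rough illustration of the per-patient token chain described in this abstract, here is a minimal Python sketch. It is not the Valkyrie implementation (which the abstract says is built with XForms and cityEHR): the `Token` fields, the `GENESIS` value, the helper names and the locator strings are all hypothetical, and encryption and secure messaging are left out entirely. It only shows how tokens holding metadata and a locator could be hash-linked, checked for tampering, and filtered for relevance.

```python
import hashlib
import json
from dataclasses import dataclass

# Hypothetical starting value for the first token in a patient's chain.
GENESIS = "0" * 64


@dataclass(frozen=True)
class Token:
    """One encounter token; every field name here is an assumption."""
    encounter_type: str  # basic metadata about the encounter
    locator: str         # where the full view lives on the source EHR
    prev_hash: str       # digest of the previous token in the chain

    def digest(self) -> str:
        payload = json.dumps(
            {"type": self.encounter_type,
             "locator": self.locator,
             "prev": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_token(chain, encounter_type, locator):
    """Return a new chain with one more token linked to the last one."""
    prev = chain[-1].digest() if chain else GENESIS
    return chain + [Token(encounter_type, locator, prev)]


def chain_is_valid(chain):
    """Check that every token points at the digest of its predecessor."""
    expected = GENESIS
    for token in chain:
        if token.prev_hash != expected:
            return False
        expected = token.digest()
    return True


def relevant_locators(chain, wanted_types):
    """Use the token metadata to pick the encounters worth retrieving."""
    return [t.locator for t in chain if t.encounter_type in wanted_types]
```

In this sketch, replacing any token in the middle of a chain changes its digest, so the following token no longer links to it and `chain_is_valid` returns `False`.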

Standards update

Norm Tovey-Walsh

Status of the work on XSLT/XPath/XQuery 4.0 and XProc standards.

Schematron 2025 – Technology Update

Erik Siegel

Schematron is a language for validating documents. It continues where schema languages like W3C XML Schema and RELAX NG stop. It allows you to check your documents against rules, usually expressed as XPath expressions, and define your own error messages. It is widely used and integrated in several XML-related products, for instance oXygen.

In September 2025, a new edition of the Schematron standard was published with many new features and enhancements. This presentation covers the most important changes. It will also cover the tooling that can be used to apply Schematron 2025.
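
To give a feel for the rule-plus-message idea described above, here is a small Python sketch. It is not Schematron and does not reflect the 2025 edition: real Schematron schemas are XML documents with patterns, rules and assertions evaluated with full XPath, whereas this sketch relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`. The `RULES` table, the `validate` function and the invoice vocabulary are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Each hypothetical rule: (context path, test path relative to the
# context node, message reported when the test finds nothing).
RULES = [
    (".//invoice", "total", "An invoice must have a total element."),
    (".//invoice", "line", "An invoice must have at least one line."),
]


def validate(xml_text, rules=RULES):
    """Return the custom messages for every context node failing a rule."""
    root = ET.fromstring(xml_text)
    messages = []
    for context_path, test_path, message in rules:
        for node in root.findall(context_path):
            # The "assertion" holds when the test path selects something.
            if node.find(test_path) is None:
                messages.append(message)
    return messages
```

The point of the sketch is the division of labour: a grammar-based schema says which elements may appear where, while rules like these express checks in path expressions and report them in the author's own words.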

Implementing Maps for XPath 4.0

Michael Kay

Maps were introduced as a new data type in XSLT 3.0 and XPath 3.1. The original motivation came from the XSLT Streaming project: if you’re going to process a large document in streaming mode, where you can’t look back at parts of the document that you’ve already skipped over, then you need to remember the data that you’ve already seen, and that needs a more versatile data structure than the atomic values and nodes of XSLT 2.0. Subsequently interest grew in allowing XSLT and XPath to process JSON, and it was realised that maps also had a big role to play there.

Maps have proved a popular addition to the language, and the draft 4.0 specifications enhance that capability in a number of ways, based on user experience. Some of those capabilities are fairly superficial, in that they provide new functionality on top of the existing data model. But others, notably the fact that maps are now ordered, have a profound effect on the design of the underlying data structures.

This paper discusses the requirements imposed by the 4.0 language design on the way that maps are implemented, and proposes solutions to these challenges.

Integrating AI into XML Development Workflows

Octavian Nadolu

XML remains a core technology for structured content, data exchange, publishing, validation, and business-rule enforcement across many enterprise systems. XML developers routinely work with schemas, XPath, XQuery, XSLT, Schematron, and related technologies that are powerful but often verbose, detail-sensitive, and expensive to maintain at scale. As XML vocabularies, transformation pipelines, and validation rules grow in size and complexity, teams face increasing pressure to improve productivity without sacrificing correctness.

Recent advances in AI, especially large language models (LLMs), have made it practical to assist XML work in day-to-day development environments. In modern XML editors and IDEs, AI can be brought directly into the authoring experience to help draft schemas, explain XPath expressions, generate XSLT templates, propose Schematron assertions, summarize unfamiliar XML structures, and review changes before they are committed. Used well, AI can accelerate routine work and reduce friction for both experts and newcomers. Used poorly, it can introduce subtle errors that pass superficial inspection. For XML teams, the opportunity is real, but so is the need for validation, governance, and human oversight.

This article is intended as a practice-oriented experience and architecture paper rather than as a benchmark study. Its contribution is threefold: it identifies the XML tasks where AI assistance is currently most useful, proposes an integration model that combines LLM-based reasoning with XML-aware tools, and distills practical lessons about validation, review, and workflow design from real editor-centered usage scenarios. The goal is not to argue that AI replaces established XML technologies, but to show how it can be incorporated into existing XML engineering practices without weakening correctness guarantees.

Publisher case study for XML-enabled quality-control techniques for journal metadata, pagination, and equations

M. Scott Dineen

Optica Publishing Group (OPG) is an innovative society publisher of 20 journals in the field of optics and photonics and founder of one of the world’s first online-only journals, Optics Express. As pressure for publication speed continues to grow, so do publisher obligations to meet accessibility requirements, funder mandates, and a variety of other commitments, many of them driven by the metadata in a research article. Here I report on three quality-control approaches OPG production has developed in the past few years to improve metadata integrity without trade-offs in speed or cost.

XTH – An implementation agnostic Test Suite Runner

Adam Retter

Alongside their published standards, the W3C XQuery and XSLT Working Groups have always published XQuery and XSLT Test Suites. These test suites allow a vendor to validate, and if desirable, promote the compliance of their XQuery and/or XSLT processor implementation(s). The test suites themselves are expressed in a custom XML grammar, with both metadata and XPath and XQuery code embedded in the XML within various test cases. The Test Suite by itself is not executable. Historically, each vendor has had to implement additional software to process the Test Suite’s catalog, execute each test case, and record the results. Great care must be taken during the development and testing of such software, so as to ensure accurate execution and reporting of test results. We introduce XTH (XML Test Harness), an open source, vendor-agnostic software implementation that can execute the W3C test suites. XTH requires only a small connector to be developed for each vendor’s XQuery/XSLT processor, but otherwise takes care of ensuring the correct execution and reporting of test results. Additionally, we demonstrate XTH running within a CI environment to continuously report on the compliance of some well-known processors.
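
The connector idea in this abstract can be pictured with a short Python sketch. This is not XTH's actual API or design, which the abstract does not describe: the `SuiteCase` type, the `Connector` signature, `run_suite` and the toy processor are all hypothetical, and real test cases come from the W3C catalog rather than a hand-written list. The sketch only shows how a shared runner can own execution and reporting while each processor supplies one small adapter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SuiteCase:
    """A simplified test case: a name, query text, and expected result."""
    name: str
    query: str
    expected: str


# The only processor-specific part (an assumption of this sketch):
# a function that takes query text and returns the serialized result.
Connector = Callable[[str], str]


def run_suite(cases: List[SuiteCase], connector: Connector) -> Dict[str, str]:
    """Run every case through the connector and record pass/fail/error."""
    results = {}
    for case in cases:
        try:
            actual = connector(case.query)
            results[case.name] = "pass" if actual == case.expected else "fail"
        except Exception:
            # A crash in the processor is reported, not allowed to stop the run.
            results[case.name] = "error"
    return results


def toy_connector(query: str) -> str:
    """Stand-in processor that only understands 'A + B' integer sums."""
    left, right = query.split("+")
    return str(int(left) + int(right))
```

Swapping `toy_connector` for an adapter around a real processor would leave `run_suite` untouched, which is the property that lets one harness report on several implementations in the same way.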

Excel to XML for Financial Report

Tony Graham

This short talk covers some of the more interesting XML and XSLT aspects of a recent project for a commercial bank to convert Excel worksheets into XML for uploading to a Central Bank portal to comply with anti-money laundering regulations.

Ant Visualiser

Ari Nordström

What do you do when trying to decode an Ant script from hell? Ten thousand lines of script linking all over the place? Ants and subants getting in the way? XML properties scattered everywhere? Well, if you are like me, you draw pictures. And if that isn’t enough, you write some XSLT to help you do it.

Comparative study of “PDF to AI format” converters

Elena Montero Maousidou

This talk presents a systematic benchmark study evaluating current approaches for converting PDF documents into structured formats such as XML and Markdown. A specially designed benchmark document simulates real-world publishing scenarios, including complex structures like chapters, footnotes, bibliographies, tables, and mathematical content. The study compares different tools and pipelines with respect to structural accuracy, completeness, and semantic quality.

Jewels in Plain Sight

Eamonn Neylon

The experience of using a large language model to help build a conformant web-based validator for the Character Repertoire Description Language, an accompanying library of over 80 character repertoire schemas, and a Schematron-based quality tool is described and reflected on. Consideration is given to the value of precise formal specifications for code generation. The potential for AI-assisted tooling to surface defects in standards under development, and the changed economics that make niche technical work newly viable are also discussed.