The provision of open educational resources (OERs) is still immature: there is currently no standard, in either content or syntax, for describing a resource. As a result, material that is suitable for, and made available for, use as an OER is often described too poorly for easy processing, especially automatic processing.
For Delores Extensions a Waypoint Proxy Document (WPD) is used to describe each resource. This document contains just enough information, in a prescribed XML format, to allow the resource to be identified uniquely together with its provenance and licensing status, and to be minimally classified.
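As an illustration only, the kind of record described above might be assembled as follows. The prescribed WPD schema is not reproduced here, so every element name in this sketch (`waypointProxyDocument`, `identifier`, `provenance`, `licence`, `classification`, `subject`) is a hypothetical stand-in, as are the helper functions `build_wpd` and `missing_fields`.

```python
import xml.etree.ElementTree as ET


def build_wpd(identifier, title, source, licence, subjects):
    """Build a minimal proxy document.

    All element names are hypothetical; the real WPD schema may differ.
    """
    root = ET.Element("waypointProxyDocument")
    ET.SubElement(root, "identifier").text = identifier
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "provenance").text = source
    ET.SubElement(root, "licence").text = licence
    classification = ET.SubElement(root, "classification")
    for subject in subjects:
        ET.SubElement(classification, "subject").text = subject
    return root


def missing_fields(wpd):
    """Return the names of required fields that are absent or empty."""
    required = ("identifier", "provenance", "licence")
    return [name for name in required
            if not (wpd.findtext(name) or "").strip()]
```

A check such as `missing_fields` would flag the records for which the top-level description was not sufficient and further interrogation of the resource is needed.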
Sometimes all of the data necessary to populate an instance of the WPD is available in the ‘top-level’ description of a resource. This, however, is fairly uncommon. Where information is missing, it is necessary to resort to more complex interrogation of the resource content and the context in which it is embedded. For example, the licensing status of a particular resource might be found only on the title page of the resource itself. Finding such information may mean not only searching the text but also transforming it from one format to another to make automatic search possible. In processing content for Delores Extensions, for instance, it has been necessary to manually select files for conversion from PDF to text, using PDFBox to achieve the format migration.
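Once a resource has been converted to plain text, the search for a licensing statement could be sketched roughly as below. This is not the procedure used for Delores Extensions: the patterns, the function name `find_licence_statements` and the assumption that the statement sits within the first few thousand characters (roughly a title page) are all illustrative choices.

```python
import re

# Hypothetical patterns; a real collection would need a broader, tuned set.
LICENCE_PATTERNS = [
    re.compile(r"creative\s+commons[^.\n]*", re.IGNORECASE),
    re.compile(r"\bCC[ -]BY(?:-(?:NC|SA|ND))*\b(?:\s+\d\.\d)?"),
    re.compile(r"(?:©|\(c\)|copyright)\s+\d{4}[^.\n]*", re.IGNORECASE),
]


def find_licence_statements(text, max_chars=5000):
    """Return candidate licensing statements from the start of the text.

    Only the first max_chars characters are searched, on the assumption
    that the statement appears on or near the title page.
    """
    head = text[:max_chars]
    hits = []
    for pattern in LICENCE_PATTERNS:
        for match in pattern.finditer(head):
            hits.append((match.start(), match.group(0).strip()))
    # Report the statements in the order they occur in the text.
    return [statement for _, statement in sorted(hits)]
```

The results are candidates only; a match still needs human confirmation before it is recorded as the licensing status of the resource.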
Often, the missing descriptive data will be found remote from the resource itself. This is frequently the case where licensing information that applies to a set of resources is quoted only once, say, on the source home page.
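One way to handle a licence that is stated once for a whole source is to let each resource inherit it when it carries none of its own. The sketch below assumes resources are held as simple dictionaries with `source` and `licence` keys and that a separate mapping records the licence found on each source home page; these structures and the name `resolve_licence` are hypothetical.

```python
def resolve_licence(resource, source_defaults):
    """Return (licence, origin) for a resource.

    origin is "resource" when the resource states its own licence,
    "source" when the licence is inherited from the source home page,
    and "unknown" when neither is available.
    """
    own = (resource.get("licence") or "").strip()
    if own:
        return own, "resource"
    inherited = (source_defaults.get(resource.get("source")) or "").strip()
    if inherited:
        return inherited, "source"
    return None, "unknown"
```

Recording where the licence came from keeps inherited statements distinguishable from those found in the resource itself.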
When assembling the Delores Extensions collection of resources, much ‘manual sleuthing’ has gone on. Much of this labour would be made redundant at a stroke were a standard for the description of OERs to be developed by the community and embraced by the OER providers.