Possible issues in the eduOER metadata aggregation workflow


This section introduces the points of failure in the metadata ingestion process, presented step by step:

  1. Content-based restrictions: The harvested metadata should describe higher education multimedia learning objects.
  2. Harvesting: At this step, most problems occur on the data providers’ side and are mostly protocol implementation issues.
    • OAI-PMH protocol issues:
      • Bad timestamp implementation. The harvester cannot recognize which records have been updated, so incremental harvesting won’t work.
      • Bad resumption token implementation. The harvester cannot harvest all the metadata records exposed by the data provider: it freezes at a certain record and can’t harvest the remaining metadata.
      • No deleted records policy. Without a deleted records policy, the harvester and the whole aggregation workflow cannot recognize which records should be deleted from the respective repository.
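
The three OAI-PMH issues above can be illustrated with a minimal harvesting loop. This is only a sketch, not the eduOER harvester: `fetch` is a hypothetical callable that performs the HTTP request and returns the response XML as a string, and the `oai_lom` metadata prefix is an assumed default. The loop sends `from` for incremental harvesting, follows resumption tokens until an empty one is returned, reports deleted records through the header `status` attribute, and stops with an error if a data provider keeps returning the same token.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, metadata_prefix="oai_lom", from_date=None, max_pages=10000):
    """Yield (identifier, datestamp, is_deleted) for every record header.

    `fetch` is a hypothetical callable: it takes a dict of OAI-PMH request
    parameters and returns the response body as an XML string.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Incremental harvesting only works if the provider's datestamps are correct.
        params["from"] = from_date
    seen_tokens = set()
    for _ in range(max_pages):
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI_NS + "header"):
            yield (
                header.findtext(OAI_NS + "identifier"),
                header.findtext(OAI_NS + "datestamp"),
                # Providers with a deleted records policy mark deletions here.
                header.get("status") == "deleted",
            )
        token = (root.findtext(".//" + OAI_NS + "resumptionToken") or "").strip()
        if not token:
            return  # empty or missing token: the list is complete
        if token in seen_tokens:
            # A repeated token would make the harvester loop on the same page.
            raise RuntimeError("resumption token repeated: " + token)
        seen_tokens.add(token)
        # Follow-up requests carry only the verb and the resumption token.
        params = {"verb": "ListRecords", "resumptionToken": token}
    raise RuntimeError("page limit reached without completing the list")
```

Raising on a repeated token is a design choice for this sketch: it turns a silent freeze into a visible error that can be reported to the data provider.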
  3. Transformation: During transformation, the main issue that might occur concerns the version of the XSLT file. The currently supported version is XSLT 2.0.
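
A stylesheet's declared version can be checked before it is handed to the transformation step. The sketch below is an assumption about how such a pre-check might look, not part of the eduOER tooling: it reads the `version` attribute of the root `xsl:stylesheet` (or `xsl:transform`) element and leaves the decision about mismatches to the caller. The function name `xslt_version` is hypothetical.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def xslt_version(stylesheet_text):
    """Return the version declared on the stylesheet's root element."""
    root = ET.fromstring(stylesheet_text)
    allowed = ("{%s}stylesheet" % XSL_NS, "{%s}transform" % XSL_NS)
    if root.tag not in allowed:
        # Simplified stylesheets (literal result elements) are not handled here.
        raise ValueError("root element is not xsl:stylesheet or xsl:transform")
    return root.get("version")


sheet = '<xsl:stylesheet version="1.0" xmlns:xsl="%s"/>' % XSL_NS
if xslt_version(sheet) != "2.0":
    print("stylesheet declares version", xslt_version(sheet))
```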
  4. Identification: If a metadata record doesn’t use any element that locates or identifies the learning object (in LOM, the technical.location and general.identifier elements), then the record describes nothing and is therefore discarded.
  5. Validation: If any of the rules defined in the OER application profile is not followed, the metadata record is considered invalid and is discarded. The OER application profile’s rules are:
    1. Based on LOM schema.
    2. Mandatory elements:
      • general.title
      • general.description
      • general.keyword
      • general.language
      • technical.location
      • technical.thumbnail
      • lifecycle.contribute.entity
      • rights.description.string
    3. All LOM schema vocabularies should be followed.
    A web validator is provided for testing purposes here. The GN4 OER profile implementation can be found here.
  6. Filtering: All mandatory elements are checked, not only for existence but also for containing text. Metadata records are filtered out in the following cases:
    • A mandatory element does not exist in the metadata record.
    • A mandatory element exists but is empty.
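
The filtering rules can be sketched as a small check over the mandatory elements of the OER application profile. The flattened record form used here (a dict mapping dotted element names to lists of text values) is a hypothetical representation chosen for illustration, and treating whitespace-only content as empty is an assumption.

```python
MANDATORY = [
    "general.title",
    "general.description",
    "general.keyword",
    "general.language",
    "technical.location",
    "technical.thumbnail",
    "lifecycle.contribute.entity",
    "rights.description.string",
]


def filter_reasons(record):
    """Return (element, reason) pairs; an empty list means the record passes.

    `record` is a hypothetical flattened form of a LOM record:
    a dict mapping dotted element names to a list of text values.
    """
    reasons = []
    for name in MANDATORY:
        values = record.get(name)
        if values is None:
            reasons.append((name, "missing"))
        elif not any(v and v.strip() for v in values):
            # Assumption: whitespace-only content counts as empty.
            reasons.append((name, "empty"))
    return reasons


record = {name: ["some text"] for name in MANDATORY}
record["general.keyword"] = [""]
print(filter_reasons(record))
```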
  7. Language Detection: As mentioned above, the language detection mechanism can be considered a second filtering step, since it filters out all metadata records containing elements whose content can’t be language detected. The elements below are language detected, so their content should not be a symbol or a sequence of symbols:
    • general.keyword
    • general.title
    • general.description
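
Data providers can pre-check their records for content that consists only of symbols. The sketch below does not perform language detection itself and makes no claim about the detector used in the workflow; it only flags values that contain no letters at all, on the assumption that such values cannot be language detected. It reuses a hypothetical flattened record form (a dict of dotted element names to lists of text values).

```python
LANGUAGE_DETECTED = ["general.keyword", "general.title", "general.description"]


def has_detectable_text(value):
    """True if the value contains at least one letter in any script."""
    return any(ch.isalpha() for ch in value)


def undetectable_elements(record):
    """List the language-detected elements that hold a symbol-only value."""
    flagged = []
    for name in LANGUAGE_DETECTED:
        for value in record.get(name, []):
            if not has_detectable_text(value):
                flagged.append(name)
                break  # one bad value is enough to flag the element
    return flagged


record = {"general.title": ["Linear algebra"], "general.keyword": ["***", "matrices"]}
print(undetectable_elements(record))
```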
  8. Link Checking: Possible issues are:
    • The LO link is not reachable (dead link).
    • The LO link is not direct but leads to a login page or some other page.
    • The LO link is not well formed.
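
The three link issues above can be told apart with a simple classification. This is a sketch under stated assumptions: the HTTP request itself is left to the caller, who passes in the response status and the final URL after redirects; the names `is_well_formed`, `classify_link` and `LOGIN_HINTS` are hypothetical; and recognizing a login page by substrings in the final URL is only a heuristic.

```python
from urllib.parse import urlparse

# Heuristic substrings that suggest a redirect to a login page (assumption).
LOGIN_HINTS = ("login", "signin", "sign-in")


def is_well_formed(url):
    """True if the URL has an http(s) scheme and a host part."""
    try:
        parts = urlparse(url.strip())
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def classify_link(url, status=None, final_url=None):
    """Classify a learning object link as malformed, dead, indirect or ok.

    `status` is the HTTP status code of the response (None if there was no
    response) and `final_url` is the URL reached after following redirects.
    """
    if not is_well_formed(url):
        return "malformed"
    if status is None or status >= 400:
        return "dead"
    landed = (final_url or url).lower()
    if landed != url.lower() and any(hint in landed for hint in LOGIN_HINTS):
        return "indirect"
    return "ok"


print(classify_link("https://example.org/video.mp4", 200,
                    "https://example.org/login?next=/video.mp4"))
```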