Unit Mapping

work in progress

One of the simplest ways of getting an overview of a unit is to generate an index or map of its major headings and subheadings.

Example of prior work: Generating Mind Maps from OU/OpenLearn Structured Authoring XML Documents

from sqlite_utils import Database

# Open database connection
dbname = "all_openlean_xml.db"
db = Database(dbname)

Let’s get the OU-XML for an arbitrary unit:

from lxml import etree
import pandas as pd

# If there are multiple units associated with H807, pick the first
h807_xml_raw = pd.read_sql("SELECT xml FROM xml WHERE code='H807'", con=db.conn).loc[0, "xml"]

# Parse the XML into an xml object
root = etree.fromstring(h807_xml_raw)

Bring in our simple utility function to help flatten elements, if required:

import unicodedata

def unpack(x):
    return etree.tostring(x)

# via http://stackoverflow.com/questions/5757201/help-or-advice-me-get-started-with-lxml/5899005#5899005
def flatten(el):
    """Utility function for flattening XML tags."""
    def _flatten(el):
        if el is None:
            return ""  # Originally returned None; any side effects of move to ''?
        result = [(el.text or "")]
        for sel in el:
            result.append(_flatten(sel))
            result.append(sel.tail or "")
        return unicodedata.normalize("NFKD", "".join(result)) or " "
    return _flatten(el).strip()
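
As a quick illustration of why flattening can matter, the sketch below compares `.text` with a flattened read of a heading that contains inline markup. It is a minimal example under stated assumptions: the sample `<Title>` element is made up, `flatten_text` is a compact restatement of the function above rather than a replacement for it, and the standard library's `xml.etree.ElementTree` stands in for `lxml` so the snippet runs without extra dependencies.

```python
import unicodedata
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)


def flatten_text(el):
    """Join an element's own text with the text of all its descendants."""
    if el is None:
        return ""
    parts = [el.text or ""]
    for child in el:
        parts.append(flatten_text(child))
        parts.append(child.tail or "")
    return unicodedata.normalize("NFKD", "".join(parts))


# Hypothetical heading containing inline markup
sample_title = ET.fromstring("<Title>Using <i>assistive</i> technology</Title>")

# .text only returns the text that precedes the first child element
text_only = sample_title.text
flattened = flatten_text(sample_title).strip()

print(repr(text_only))   # 'Using '
print(repr(flattened))   # 'Using assistive technology'
```

If headings in a unit never contain nested elements, `.text` is sufficient; the flattened read is only needed where inline markup appears.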

We can now grab all the headings and subheadings and render a simple contents list for the unit. To display the contents, we can use a simple tree widget.

Let’s start by parsing out the title of the unit:

title = root.find("ItemTitle").text
code = root.find("CourseCode").text

title, code
('Accessibility of eLearning', 'H807_1')

We can now build up our tree from session and section headings:

#%pip install ipytree
# ipytree provides access to a jstree widget
from ipytree import Tree, Node

# Create a tree object
tree = Tree()

# Create a unit title node for our tree
node1 = Node(f"{title} ({code})")

# Add the unit title node to the top of the tree
tree.add_node(node1)

sessions = root.findall('.//Unit/Session')

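# Add a node for each session, with its section headings as child nodes
for session in sessions:
    # Use a separate name so we don't overwrite the unit title;
    # flatten() copes with headings that contain inline markup
    session_title = flatten(session.find('.//Title'))
    subnode = Node(session_title)
    node1.add_node(subnode)

    subsessions = session.findall('.//Section')
    for subsession in subsessions:
        heading = flatten(subsession.find('.//Title'))
        subnode.add_node(Node(heading))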

tree

The tree widget doesn’t appear to render when I flow this document as part of a Jupyter Book, so I need to find an alternative tree display for this demo. In the meantime, here’s a screenshot to give a flavour of what you’re missing…

Screenshot of ipytree tree widget output showing the expanded table of contents for one section of an OpenLearn Unit and a collapsed view of another.
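
One possible widget-free fallback is to render the same structure as an indented plain-text outline, which should survive static publishing. The sketch below is illustrative only: `SAMPLE_XML` is a made-up document that mimics the `ItemTitle` / `CourseCode` / `Unit` / `Session` / `Section` / `Title` layout used above, `text_outline` is a hypothetical helper name, and the standard library's `xml.etree.ElementTree` is used in place of `lxml` so the example is self-contained.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Made-up document following the element layout used in this section
SAMPLE_XML = """
<Item>
  <ItemTitle>Sample unit</ItemTitle>
  <CourseCode>X000_1</CourseCode>
  <Unit>
    <Session>
      <Title>Introduction</Title>
      <Section><Title>Aims</Title></Section>
      <Section><Title>Background</Title></Section>
    </Session>
    <Session>
      <Title>Conclusion</Title>
    </Session>
  </Unit>
</Item>
""".strip()


def text_outline(root, indent="  "):
    """Return an indented outline: unit title, then sessions, then sections."""
    lines = [f"{root.findtext('ItemTitle')} ({root.findtext('CourseCode')})"]
    for session in root.findall(".//Unit/Session"):
        lines.append(f"{indent}{session.findtext('Title', default='')}")
        for section in session.findall(".//Section"):
            lines.append(f"{indent * 2}{section.findtext('Title', default='')}")
    return "\n".join(lines)


outline = text_outline(ET.fromstring(SAMPLE_XML))
print(outline)
```

Because the result is just a string, it can be printed in a code cell or written into a Markdown cell without any JavaScript support.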

It would be easy enough to generate a table containing the session and section headings across all units, and then use that to provide a heading-level search that retrieves items at that level of granularity.
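
A rough sketch of what such a heading table might look like is given below. Everything in it is an assumption made for illustration: the two sample units, the `headings` table and its columns, and the `heading_rows` helper are hypothetical, and an in-memory SQLite database with the standard library's XML parser is used rather than the `sqlite_utils` database and `lxml` used elsewhere on this page.

```python
import sqlite3
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Made-up units keyed by a hypothetical course code
SAMPLE_UNITS = {
    "A100_1": "<Item><Unit><Session><Title>Getting started</Title>"
              "<Section><Title>What is accessibility?</Title></Section>"
              "<Section><Title>Course aims</Title></Section>"
              "</Session></Unit></Item>",
    "B200_1": "<Item><Unit><Session><Title>Accessible design</Title>"
              "<Section><Title>Web standards</Title></Section>"
              "</Session></Unit></Item>",
}


def heading_rows(code, xml_text):
    """Yield (code, level, session, heading) rows for one unit."""
    root = ET.fromstring(xml_text)
    for session in root.findall(".//Unit/Session"):
        session_title = session.findtext("Title", default="")
        yield (code, "session", session_title, session_title)
        for section in session.findall(".//Section"):
            yield (code, "section", session_title,
                   section.findtext("Title", default=""))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE headings (code TEXT, level TEXT, session TEXT, heading TEXT)"
)
for code, xml_text in SAMPLE_UNITS.items():
    conn.executemany(
        "INSERT INTO headings VALUES (?, ?, ?, ?)", heading_rows(code, xml_text)
    )

# Heading-level search: SQLite's LIKE is case-insensitive for ASCII text
hits = conn.execute(
    "SELECT code, level, heading FROM headings "
    "WHERE heading LIKE ? ORDER BY code, rowid",
    ("%access%",),
).fetchall()
print(hits)
```

Keeping the parent session title on each section row means a search hit can be shown in context without a second query.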

Generating Tables of Contents Derived From Sections in Different Units

As well as generating tree listings of the session and section headings for a single unit, we can also generate table-of-contents views over sections retrieved from multiple units.

For example, TO DO: search around a term to retrieve items from multiple units and generate a “customised” unit on a topic, e.g. ordered by level.
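
Pending that work, the following is one speculative sketch of how a “customised” contents listing might be assembled from heading records that match a search term. The records, the field names (`code`, `unit`, `heading`) and the `custom_toc` helper are all hypothetical, and grouping by unit code is just one possible ordering; it is not the approach this page commits to.

```python
from itertools import groupby

# Made-up heading records, as might be retrieved from several units
HEADINGS = [
    {"code": "A100_1", "unit": "Sample unit A", "heading": "What is accessibility?"},
    {"code": "B200_1", "unit": "Sample unit B", "heading": "Accessible design"},
    {"code": "B200_1", "unit": "Sample unit B", "heading": "Web standards"},
    {"code": "A100_1", "unit": "Sample unit A", "heading": "Accessibility legislation"},
]


def custom_toc(records, term):
    """Return outline lines for headings containing `term`, grouped by unit."""
    term = term.lower()
    hits = [r for r in records if term in r["heading"].lower()]
    # Stable sort: headings keep their original relative order within a unit
    hits.sort(key=lambda r: r["code"])
    lines = []
    for code, group in groupby(hits, key=lambda r: r["code"]):
        group = list(group)
        lines.append(f"{group[0]['unit']} ({code})")
        lines.extend(f"  {r['heading']}" for r in group)
    return lines


toc = custom_toc(HEADINGS, "accessib")
print("\n".join(toc))
```

Swapping the sort key would give other orderings, for example by level, if that information were available on each record.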