Tickets, members, sales: Integrating Centaman with the Australian Museum website

Written by James Murty
Published on 2 March 2020

Tagged under:

About the author

James has two decades experience building Internet systems, from enterprise applications to highly customised websites to research projects for the CSIRO to deeply understanding emerging cloud technologies. He likes getting things done with high-quality code, and solving challenging problems for cool organisations.

Visit profile

As part of the site rebuild for the Australian Museum we integrated the website with Centaman, a system the museum uses to manage ticketing and memberships and as a customer relationship management (CRM) system. By working with this back-end system we were able to offer rich online features based on the existing processes and data museum staff are familiar with, and removed the need for staff to manually keep online interactions in sync with their core management system.

In this post we will discuss some key technical approaches we used for this successful integration, and will highlight interesting features of our Python + Django + Wagtail + PostgreSQL stack along the way.

Centaman offline to online

Centaman is a "Ticketing, Reservations, Point of Sale and CRM for Museums, galleries and cultural sites" and similar institutions. It provides a broad range of features to help institutions serve and engage with their guests and to build relationships with members.

While Centaman does many things – from the back-office management to running point of sale terminals to ticket scanning – the key features we brought online were for ticketing and memberships.

Ticketing:

Sell tickets for general entry and scheduled events
Record details of online ticket sales directly in Centaman
Send e-Ticket emails to customers to print and scan for quick entry
Show times, prices, and availability for upcoming scheduled events
Reserve tickets temporarily for events with limited availability.

Memberships:

Recognise museum members when they log in to the website, by linking their online and Centaman identities
Sell memberships to loyal visitors
Let members upgrade or switch their membership online.

All of this was made possible by the REST API provided by Centaman, which made core features and data available to the website.

From API to site features

Centaman provides a web API to fetch and store back-end data, but as with all APIs there can be a few (or many) steps to get from performing basic operations with raw data, to the end goal of exposing rich and robust features that will delight visitors to your site.

For this project, we started by building a simple Python API client since there wasn't one available already. This client translates the GET/POST/PUT actions performed on the HTTP endpoints provided by the Centaman API into less abstract functions, such as get_member_by_email() and reserve_timed_tickets(). The client does just this relatively small job, we purposefully kept it only as smart as needed.

On top of this simple API client we added a service layer. The service layer ties together the client's building block functions to make complete actions, and also mixes in extra business logic and data that is relevant to the website. This layer is where we handle more complex scenarios that involve multiple steps, such as ticket reservation where we need to check if reservations are still available for an event, and whether the user already holds a reservation, before we create a new reservation or extend an existing one.

A service layer such as this is vital to integrate the features and capabilities of a third-party API with the goals and data of the site that is using it. The integration layer for a service may need to do more or less work based on a number of factors, such as the complexity of features the website needs and how closely the API's actions and data model match the business goals of the website. But in our experience we always need at least a sprinkling of integration intelligence to work effectively with all but the most basic third-party service APIs.

Where is the truth?

A key goal of integrating the Australian Museum website with Centaman is for it to be the single, central place where data about the organisation's ticketing and membership services is kept. Centaman should be the one place where museum staff can find up-to-date and accurate information, without needing to cross-check other systems or wonder whether there is a piece of data out there somewhere that belongs in Centaman but hasn't quite made it there yet.

The truth is where the data is, and for Australian Museum the truth should be in Centaman.

However, as part of our integration we regularly fetch membership and ticketing data from the Centaman API and store it in the website's own database. This jeopardises the goal of having Centaman be the single source of truth because it is no longer the one and only place where this data lives, and if things go wrong there is the potential for website data to get out of sync with Centaman leaving museum staff with two competing truths to untangle.

So why take the risk? When we integrate a site with another "source of truth" system we consider the risks and trade-offs of copying data into multiple places. In this case there were compelling reasons to copy the data, and some tactics we use to do so safely.

Reasons to make an extra copy of data for the website:

Copying vital data to the website database reduces the number of API requests the site needs to perform, and minimises the overall load on the vital central Centaman system.
The website can show pages faster if it has the data it needs nearby, instead of needing to re-fetch data from multiple API endpoints every time a user visits a page.
When we need to combine or restructure API data for use by the website we can store the resulting data, instead of doing this work for each page view.

To safely keep copies of Centaman data on the website, we use these tactics:

We treat data from the API as read-only. We do not add to or update our copy of Centaman's data because that would leave us with data changes that would require a complex syncing and reconciliation process to manage.
When a user's action on the website updates Centaman data, we apply the update directly in Centaman through its API and then re-sync changes back to the website.
For any online action that absolutely must have the latest data we re-sync that data immediately as needed. For example we re-sync vital membership status data on every page where we check a user's membership status.
We automatically log users out of the website if the Centaman system is unavailable. This prevents users from performing actions based on information that might have become outdated, and that can only be performed using Centaman.

By copying data from Centaman for the website, but doing so carefully, we struck a balance between ensuring that Centaman is the one central system that will always have the correct and latest data, and ensuring that the website performs well when exposing Centaman's capabilities.

Documents in a relational database

With Centaman as the central source of data and truth for key features of the Australian Museum website, it is important that the website stores information from Centaman as accurately and completely as possible. To do this we needed to consider a fundamental issue in data storage: the difference between document-oriented and relational approaches.

Like most modern APIs, the one provided by Centaman is document-oriented and sends and receives data in the JSON format. JSON is easy for both people and computers to understand, and can capture a rich set of information including key/value pairs, lists, and nested data. This makes JSON-based APIs ideal for integrating systems.

In contrast to JSON documents, the Django and Wagtail technologies we used to build the Australian Museum website work best with the relational data model where information is stored in a table format similar to a spreadsheet. The rich structure of JSON documents can be quite different from the purposefully flat structure of relational data, and this can present a challenge when we want to store the JSON data from Centaman in a website's relational database.

Fortunately we use and prefer the PostgreSQL database which includes powerful support for storing and using JSON type data, and Django which lets us easily access documents stored in this way through its JSONField. With the JSON compatibility provided by JSONField combined with PostgreSQL we have the best of both worlds: we can store most website data in relational format for speed, compatibility, and flexibility, and also store JSON documents from Centaman exactly as we received them.

Here is a very basic example showing how a Django model class can store JSON data from Centaman in a JSONField along with a standard relational character field:

from django.db import models
from django.contrib.postgres.fields import JSONField


class CentamanDocumentModel(models.Model):

    centaman_data = JSONField(
        blank=True, null=True, db_index=True,
        help_text="Data synced from Centaman API")

    document_type = models.CharField(
        max_length=255, db_index=True,
        help_text="Doc type for validation and filtering")

Using a combination of JSON and relational database fields worked very well for us. We were able to store documents from the Centaman system very easily with complete fidelity, and could also query their contents and compare them as needed.

Having said that, no solution is ever perfect and in our use of JSONFields we encountered some small issues and caveats that are worth mentioning:

There is a chance that PostgreSQL will not perform as well processing JSON data as it does for simpler relational data. Although we were wary of performance problems and kept a lookout, we did not hit any such issues during this implementation
The admin features built into Django and Wagtail have only basic support for JSON fields. Unlike the relational data fields in the system, the JSON documents cannot be easily viewed, edited or filtered. This wasn't a problem for us because the Centaman JSON documents are not supposed to be edited or managed in the website admin area, only in Centaman itself, but the lack of powerful admin features for JSON documents is a shame
Django's JSONField is a relatively new addition to its object-relational mapping (ORM) database tools and not all of the usual querying features were supported while we developed the website. In particular, we had problems with ordering and comparison of fields within JSON documents using Django version 2.0.x that are fixed in later versions
The data inside JSON documents is not always strongly typed when accessed through Django's ORM, or indeed within the JSON document itself, so we sometimes needed to use query annotations and field casting to get the database to convert JSON field data into specific types to do in-database ordering or comparisons. Here is an example manager method that parses a textual date value in a JSON document to a full date type:

def with_dates(self):
    return self.annotate(
        # Fetch `JoinDate` JSON field as text and CAST to DATE
        join_date=Cast(
            KeyTextTransform('JoinDate', 'external_data'),
            models.DateField())
    )

# Annotate queryset dates to use rich ORM comparisons
qs.with_dates().filter(join_date__date__gte=yesterday)

Although we hit some minor problems we were able to work around them, and we expect issues like this will be fixed over time as the many open-source developers working on Django and PostgreSQL continue to improve support for JSON documents.

There are other approaches we could have used instead of using JSONField, but they have major drawbacks compared with the relatively minor issues we faced:

We could normalise the rich JSON data to store it as equivalent relational data, decomposing the rich structure into simpler pieces. But this generally requires a tedious and error-prone process to copy every field between the structures and to split nested data across multiple tables. It would take more work and more code, increase the chance of bugs, and result in a data structure completely different to the original from Centaman's API.
We could use two databases: one relational and an additional document-centric "NoSQL" database. A document-centric database would do a very good job of storing JSON documents, but having two databases adds a lot of complexity and an extra piece of infrastructure to manage and maintain.

Overall we are happy with our decision to use JSON data in PostgreSQL given the relative tradeoffs, and we expect it to pay off well into the future as the relative simplicity of this approach for syncing Centaman data leads to fewer bugs and an easier update path as the Centaman API evolves or changes.

Checking the digital paperwork

A final, technically interesting piece of the Centaman integration is the JSON document validation we applied to make sure the data we receive from the Centaman API has the structure and content we expect.

Because the Australian Museum website relies so much on a full understanding of the Centaman data, it is important for the code we write to be very clear about which JSON data fields we are using and how we are using them. An excellent way to do that is to run each document through a validation process that checks what fields are present and whether they have the kinds of values we expect, and complains loudly if something doesn't look right.

We did exactly this for the Australian Museum website using the Cerberus validation library for Python. Whenever the site fetches a document from the Centaman API it performs a validation check on it before it does anything else: before the document is saved to the website database, processed, or has values shown on a webpage.

This early validation step gives us a chance to enforce various rules:

We can make sure that any fields we absolutely require are present in the document.
We can enforce rules about the kinds of values we expect. For example, we can ensure that some fields have a specific value type such as a number or timestamp if we need to sort them, while we can accept any textual value at all for other fields if we merely display them.

Guaranteeing that any incoming documents comply with these rules leads to some nice benefits:

If a document breaks any of our rules, we receive an error report telling us exactly what was in the document and why it was rejected. This makes it much quicker and easier to find and fix any mistaken assumptions we have about the Centaman API.
Document validation errors tend to be clearer and easier to debug than accidental misuse of data from within a document.
If the API result documents change over time in a way we don't expect, we will find out about it straight away rather than the next time that document happens to be (mis)used by the website code.

In addition to these code-quality benefits, the validation specifications also act as extra developer-readable documentation about what various API result documents contain and which parts are important to us. This allows developers to work on the site without needing a full understanding of the Centaman API, because they can easily see which parts of the API data we do or don't use.

Cerberus works by accepting validation rules as expressed in schema documents. Although this schema format is very powerful and flexible, we found it could be fiddly to write and difficult to read. So instead of writing these schemas directly, we built a set of validation functions to help build the schema, with a focus on enforcing the kinds of rules and data checks we needed. With these helper functions we can express a validation schema that looks very similar to the real document it validates.

Here is an example validation schema for the Centaman Member document type, built using our helper functions from a validation module:

from australian_museum import validation as v

CentamanMemberSchema = v.Validator({
    'document_type': {'allowed': ['centaman-member']},
    'centaman_data': v.schema_dict({
        # Member code and number integers are required
        "MemberCode": v.validate_int(),
        "MemberNumber": v.validate_int(),
        # First & last name & email are optional
        "FirstName": v.validate_str(empty=True),
        "LastName": v.validate_str(empty=True),
        "Email": v.validate_with(
            # If email address is present, validate it
            v.validator_email, empty=True),
        # We don't use home address, be lax to permit any content
        "HomeAddress": v.lax_schema_dict({}),
    }),
})

Conclusion

Hopefully this post has given you some useful insights into the technologies and techniques we used to integrate Centaman into the new Australian Museum website. This integration project was complex and required careful work and testing, both by us at Interaction Consortium and especially from the Australian Museum staff who worked with us.

We are pleased with how this integration project turned out. Our success in making Centaman's core ticketing and membership features available directly from the website has reduced the time and effort required of Australian Museum staff to manage their online presence. We are also excited to see how it continues to develop as we build upon this strong foundation.

We encourage you to visit the Australian Museum website to see what you think. And maybe join up as well to try out some of the membership features 😉.

The Centaman integration is just one of multiple integrations we built for the Australian Museum website. Keep a lookout for other posts on this blog where we will discuss some of the other things we did. Find out about our next installments by signing up to the IC’s newsletter or follow us on Twitter.