Content modelling: What, why and how

In the past few years, content modelling and structured content have been much talked about in the content strategy world. But in fact, these aren’t new ideas. Content modelling is based on the venerable database administration practice of data modelling, while the term “structured content” has a long history in technical communications. So why have content strategists jumped on board? What’s made structured content sexy all of a sudden?

For many, the release of the iPhone was the “big bang”, with organisations suddenly realising customers wanted to be able to see their web content — all their web content — via an interface radically different from the desktop environment their websites were designed for.

But even before the big bang (yes, the metaphor breaks down here), what made structured content possible in the first place was the move from static websites with content stored in individual HTML files, to content management systems where content is stored in a database and “pages” are created dynamically as they’re requested.

The magic of databases

Content management systems have many well-known advantages over static websites: easier updates, distributed authoring, and so on. But there’s an even more fundamental benefit that's sadly often ignored in real-life CMS implementations. The fact that a CMS stores data in a database makes it possible to add structure to that content, and this gives us superpowers…if we choose to accept them.

To see what I mean, let’s think about two different ways an author could add information about an event (a very common type of content) to a CMS-based website.

In the first case, our author is using a “vanilla”, out-of-the-box CMS. They create content by “adding a new page”. The elements of this page are a title and a single body field. Our author has little choice but to paste all the information about the event straight into that body field.

Now imagine another kind of CMS implementation. In this case, when our author wants to add content, they first choose what type of content they want to add. Then they’re asked for several individual pieces of information specific to that type of content. In the case of an event, this might be a title, teaser, description, photo, location, date and time, and booking instructions.

In this second case, our event is saved in the database as a collection of discrete pieces of data (fields), rather than a single undifferentiated “blob”. And that’s good for us, because there are a lot more things we can do with information organised in this way. For instance, we now have the opportunity to:

  • Add our events to a calendar
  • Display our events as a chronological listing
  • Automatically archive events when they’re in the past
  • Display a map of the location of each event
  • Lay out event information differently on desktop and mobile devices
  • Send detailed information about our events to Google, or to another website

Some of these features might involve a fair bit of extra work installing and configuring third-party modules, but you can't even begin to think about implementing them without well-structured content.
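To make this concrete, here's a minimal sketch in Python of why fields beat a blob. The `Event` class, its field names, and the sample events are all made up for illustration; a real CMS would store these fields in its own database tables. The point is only that once the date is its own field, a chronological listing and automatic archiving become a few lines of logic rather than a text-parsing problem.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    # Hypothetical fields, loosely following the event example above
    title: str
    teaser: str
    location: str
    starts_at: datetime


def upcoming(events: list[Event], now: datetime) -> list[Event]:
    """Events that haven't started yet, earliest first."""
    return sorted((e for e in events if e.starts_at >= now), key=lambda e: e.starts_at)


def archived(events: list[Event], now: datetime) -> list[Event]:
    """Events whose start time is already in the past."""
    return [e for e in events if e.starts_at < now]


# Invented sample data
events = [
    Event("Summer picnic", "Bring a blanket", "City park", datetime(2030, 1, 20, 12, 0)),
    Event("Autumn fair", "Stalls and music", "Town hall", datetime(2029, 4, 5, 10, 0)),
    Event("Spring market", "Local produce", "Main street", datetime(2030, 9, 14, 9, 0)),
]
now = datetime(2030, 1, 1)

print([e.title for e in upcoming(events, now)])   # ['Summer picnic', 'Spring market']
print([e.title for e in archived(events, now)])   # ['Autumn fair']
```

With the "blob" approach, none of this is possible without first trying to fish a date out of free text.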

Content models and presentation models

The process of working out what types of content we need, and how they should be structured in the CMS, is known as content modelling. A content model can be thought of as a specification of the structure of content in your CMS. It includes the content types that will be available to CMS authors, and the fields that each content type will consist of.

It’s useful to distinguish between your content model — how content is structured in the database — and what I call the presentation model, which is how your content will actually appear to users on your website (what content goes where, on which page). I’ve borrowed the term "presentation model" from Jeff Eaton. Although the content model and presentation model have a lot to do with each other, it’s very useful to keep them conceptually separate, for reasons that will become more obvious to you the deeper you get into content modelling.

Working out your content types

The content types you need depend on what content you have (your content ecosystem), and on what you want to do with that content. There’s no magic formula for working out what content types you need, but a good starting point is doing an audit of your current content and looking for patterns: structures, forms of information, formatting or markup that occur over and over again.

To get you started, here are some content types that you’ll often find on websites:

  • Blog post
  • Event
  • Location
  • Product
  • Service
  • Case study
  • Publication

Most websites will also have a content type called something like “Standard page”, for those one-off pieces of content like an "About Us" page that stubbornly refuse to be shoehorned into a templated structure, or aren’t worth creating a bespoke content type for.

But a content type is not actually a type of page. One content type might be called upon for several different pages (for instance, a blog post might appear both as its own page, and on an index of blog posts). Some content types might not even get their own page, but might only appear as a component of other pages. (This is one of the differences between a content model and a presentation model.)
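As a rough illustration of that difference, the sketch below keeps one blog post (the content model) and renders it two ways (the presentation model). The class and function names are hypothetical, and real templates would of course produce HTML rather than plain strings.

```python
from dataclasses import dataclass


@dataclass
class BlogPost:
    # Content model: what is stored, independent of any page
    title: str
    teaser: str
    body: str


def render_full_page(post: BlogPost) -> str:
    """Presentation 1: the post on its own page."""
    return f"{post.title}\n\n{post.body}"


def render_index_entry(post: BlogPost) -> str:
    """Presentation 2: the same post as one line on the blog index."""
    return f"{post.title}: {post.teaser}"


post = BlogPost(
    title="Content modelling",
    teaser="What, why and how",
    body="Structured content is not a new idea.",
)

print(render_full_page(post))
print(render_index_entry(post))  # Content modelling: What, why and how
```

Nothing about the stored post changes between the two views, which is why it pays to think about the two models separately.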

Once you’ve decided on your content types, it helps to keep your thinking straight if you put them all in a diagram and map out their relationships. For instance, a diagram for a simple recipe website might look like this:

[Diagram: content types and relationships for a sample website]

Breaking content types into fields

The next step is to work out what fields you need for each of your content types. A field might be:

  • A component (or “chunk”) of the content (a piece of text, an image, an audio or video clip...)
  • An attribute of the content (location, author, category, topic...)
  • A relationship with another piece of content
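The three kinds of field can be sketched like this, assuming an Event content type that points at a separate Location content type. The field names and sample values are invented; the comments mark which kind each field is.

```python
from dataclasses import dataclass


@dataclass
class Location:
    name: str
    address: str


@dataclass
class Event:
    title: str
    description: str    # component: a chunk of the content itself
    category: str       # attribute: something we know about the content
    location: Location  # relationship: a link to another piece of content


# Two events share one Location item rather than each retyping the address
town_hall = Location("Town hall", "1 Example Street")
fair = Event("Autumn fair", "Stalls and music.", "Community", town_hall)
talk = Event("Evening talk", "A guest speaker.", "Education", town_hall)

# Correcting the address once updates it everywhere it is used
town_hall.address = "2 Example Street"
print(fair.location.address)  # 2 Example Street
print(talk.location.address)  # 2 Example Street
```

Modelling the location as a relationship instead of a text field is a design choice: it costs authors an extra step, but the address then lives in exactly one place.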

The difficult part of breaking out your content type into fields is working out how much granularity you need. With too little granularity, you risk not having the components you need to do the work you want your content to do. With too much, you risk overwhelming authors with “checkbox overload”. Good content modelling is not about finding the perfect model that describes everything in the maximum possible detail; it's about finding a balance between competing needs.

But those needs aren't always immediately obvious. The Los Angeles Times’ editorial team had its journalists adding geographical locations to their stories for several years before that data was actually put to use. Then, when the paper’s website was redesigned, it launched a feature called Neighborhoods, full of rich dynamic content about every neighbourhood in the city, which finally made use of all that location data.

Documenting your content model

Documenting a content model isn’t rocket science: the most common tool I use is a simple spreadsheet breaking down each content type into its constituent fields. If possible, talk to your CMS developers about exactly what information they need about each field. Usually, they'll want to know:

  • What each field is for (this isn’t always obvious!)
  • The type of field it is (text, number, etc.)
  • Whether or not the field is required
  • Whether multiple values are allowed
  • Any other validation rules or constraints (e.g. maximum number of characters)
  • Help text that should be shown to CMS authors

And there you have it: your first content model! But at this stage, what you really have is a draft. Content models (unlike presentation models) are difficult to change on the fly, so they need to be robust enough to handle anything that’s thrown at them. This means testing your content model with real content, including worst case scenarios, and being prepared to revise and revise again.
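If you'd like to try this kind of testing before anything is built, the spreadsheet columns above translate fairly directly into a small script. The sketch below is only one way to do it: `FieldSpec`, `check`, and the sample event fields and limits are all hypothetical, not part of any particular CMS.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSpec:
    # One row of the content model spreadsheet
    name: str
    purpose: str
    field_type: type
    required: bool = True
    multiple: bool = False
    max_length: Optional[int] = None
    help_text: str = ""


def check(content: dict, specs: list[FieldSpec]) -> list[str]:
    """Return a list of problems found when content is held up against the model."""
    problems = []
    for spec in specs:
        value = content.get(spec.name)
        if value is None or value == "" or value == []:
            if spec.required:
                problems.append(f"{spec.name}: required field is missing")
            continue
        values = value if isinstance(value, list) else [value]
        if len(values) > 1 and not spec.multiple:
            problems.append(f"{spec.name}: only one value allowed")
        for v in values:
            if not isinstance(v, spec.field_type):
                problems.append(f"{spec.name}: expected {spec.field_type.__name__}")
            elif spec.max_length is not None and isinstance(v, str) and len(v) > spec.max_length:
                problems.append(f"{spec.name}: longer than {spec.max_length} characters")
    return problems


# Invented spec for an event; the limits are arbitrary
EVENT_SPECS = [
    FieldSpec("title", "Name of the event", str, max_length=60),
    FieldSpec("teaser", "Short summary for listings", str, max_length=140),
    FieldSpec("photo", "Images of the event", str, required=False, multiple=True),
]

# A deliberately awkward piece of content
worst_case = {"title": "A" * 80, "teaser": "", "photo": ["a.jpg", "b.jpg"]}
print(check(worst_case, EVENT_SPECS))
# ['title: longer than 60 characters', 'teaser: required field is missing']
```

Running real (and deliberately awkward) content through a check like this is a cheap way to find out whether a limit is too tight or a required field is unrealistic while the model is still easy to change.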

Hopefully I've shown that you don’t need a complex technical vocabulary or a deep knowledge of your CMS to create a content model that works. What you do need is an analytical mind, a well-rounded understanding of the written word (content is not just data, it's also narrative and context), the imagination to think about how your content might be used in the future, and an ability to balance the needs of content authors, users, and developers.

Further reading

This article is based on a talk Angus gave at Content Strategy Melbourne. In March, Angus will be talking about content modelling in Drupal at DrupalSouth.

