Taxonomic Challenges

Over on SpokenWord.org we started with a set of “source” categories such as Conference, Interview, Lecture, Sermon and so on. These categories turned out to be rather useless since very few visitors really cared whether a recording was from a conference or a lecture, for example. What they cared about was whether it was about chemistry or China, which this taxonomy didn’t address.

Next we decided to go with a free-form tagging folksonomy as do many other content sites. For better or worse, we have a semi-automated source of tags: the <keyword> elements of the RSS feeds that supply most of our new programs. Tagging has worked quite well as a search mechanism: a way to actively find content. You can now search for chemistry or China and get reasonable results.
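As a rough illustration of how feed keywords could become searchable tags, here is a minimal Python sketch. The sample feed, the function names, and the assumption that one keyword element may hold several comma-separated values are all hypothetical and not taken from the SpokenWord.org code; the element name `keyword` simply follows the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment, invented for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <item>
      <title>Organic Chemistry Lecture 1</title>
      <keyword>Chemistry</keyword>
      <keyword>education</keyword>
    </item>
    <item>
      <title>Trade with China</title>
      <keyword>china, economics</keyword>
    </item>
  </channel>
</rss>"""


def extract_tags(feed_xml):
    """Return a dict mapping each item title to a sorted list of tags."""
    root = ET.fromstring(feed_xml)
    tags_by_title = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        tags = set()
        for kw in item.findall("keyword"):
            # Assumption: one element may carry several comma-separated keywords.
            for part in (kw.text or "").split(","):
                tag = part.strip().lower()  # normalize case and whitespace
                if tag:
                    tags.add(tag)
        tags_by_title[title] = sorted(tags)
    return tags_by_title


def search(tags_by_title, tag):
    """Return the titles whose tag list contains the given tag."""
    tag = tag.strip().lower()
    return sorted(t for t, tags in tags_by_title.items() if tag in tags)
```

Calling `search(extract_tags(SAMPLE_FEED), "China")` looks up the normalized tag `china`, so capitalization differences between feeds do not split the results.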

But we also want to present content in a more traditional manner. We want to proactively feature programs (particularly on the home page) in ways that will encourage first-time visitors to listen and view. So we’re thinking of re-instituting a taxonomy of categories in addition to our tags. Now comes the challenge of defining the categories. Here’s the taxonomy we have so far. We want to keep the count to no more than fifteen, so we need to combine where possible, but we want to make sure any spoken-word content fits into at least one category appropriately.

  • business and finance
  • science and technology
  • health and medicine
  • education
  • arts, entertainment, media and literature
  • energy/environment
  • food and drink
  • religion
  • government and politics (current affairs?)
  • sports, recreation & hobbies
  • travel/history
  • comedy (humor)

Anything missing? Remember, these are topical categories, not sources, media, etc.

Update: Here’s another option. We could simply adopt the categories used by iTunes for podcasts. It’s not perfect, but it has the advantage that all of our collections and feeds would be guaranteed compatible with iTunes’ taxonomy. Here’s the list from Apple:

  • Arts
    • Design
    • Fashion & Beauty
    • Food
    • Literature
    • Performing Arts
    • Visual Arts
  • Business
    • Business News
    • Careers
    • Investing
    • Management & Marketing
    • Shopping
  • Comedy
  • Education
    • Education Technology
    • Higher Education
    • K-12
    • Language Courses
    • Training
  • Games & Hobbies
    • Automotive
    • Aviation
    • Hobbies
    • Other Games
    • Video Games
  • Government & Organizations
    • Local
    • National
    • Non-Profit
    • Regional
  • Health
    • Alternative Health
    • Fitness & Nutrition
    • Self-Help
    • Sexuality
  • Kids & Family
  • Music
  • News & Politics
  • Religion & Spirituality
    • Buddhism
    • Christianity
    • Hinduism
    • Islam
    • Judaism
    • Other
    • Spirituality
  • Science & Medicine
    • Medicine
    • Natural Sciences
    • Social Sciences
  • Society & Culture
    • History
    • Personal Journals
    • Philosophy
    • Places & Travel
  • Sports & Recreation
    • Amateur
    • College & High School
    • Outdoor
    • Professional
  • Technology
    • Gadgets
    • Tech News
    • Podcasting
    • Software How-To
  • TV & Film

3 thoughts on “Taxonomic Challenges”

  1. Ted Roche

    Clay Shirky’s “Ontology is Overrated” (http://itc.conversationsnetwork.org/shows/detail470.html) literally changed the direction of a web cataloging app I was in the midst of developing. Clay’s tales of attempts at categorization at Yahoo! provide a cautionary tale.

    I think you can get away with your categories, or iTunes, if you recognize that a single item needs to be listed in more than one hierarchy. Technically, a leaf node can exist at the end of any number of branches. So, a Ruby conference presentation (Technical) on how a startup formed and succeeded (Business) at a cool new web site that allows musicians to get together for charitable social causes (Music, Society) ought to be found at the end of whichever of the paths a browsing visitor travels down.

    Note that logically, if an item can appear at the end of any number of branches, you don’t have a tree structure, but a web. And you haven’t created a taxonomic hierarchy as much as a tagged web. Which is a Good Thing.
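A minimal Python sketch of the idea in the comment above: one item filed under several categories, so it turns up whichever path a visitor browses. The `Catalog` class and its method names are hypothetical; the category labels are the ones used in the comment.

```python
from collections import defaultdict


class Catalog:
    """Many-to-many index between items and categories (illustrative only)."""

    def __init__(self):
        self._items_by_category = defaultdict(set)
        self._categories_by_item = defaultdict(set)

    def file_under(self, item, *categories):
        """List one item under any number of categories."""
        for category in categories:
            self._items_by_category[category].add(item)
            self._categories_by_item[item].add(category)

    def browse(self, category):
        """Return the items a visitor would see under one category."""
        return sorted(self._items_by_category.get(category, ()))

    def categories_of(self, item):
        """Return every category an item is listed under."""
        return sorted(self._categories_by_item.get(item, ()))


catalog = Catalog()
talk = "Ruby conference talk on a musicians' charity startup"
catalog.file_under(talk, "Technical", "Business", "Music", "Society")
```

Because the item hangs off four categories at once, the structure is a graph rather than a tree, which is the point the comment makes.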

  2. Stephen Hill

    Doug,

    Trying to reduce the scope of human activity to 15 categories is an exercise in frustration. I would argue for a more flexible, two-tier system.

    If you arbitrarily limit the number of categories to an integer based on screen real estate, you will wind up with blandly generic categories, and a lot of items that have to be put in “Other” or “Miscellaneous.” You will also have to put many items in two or more categories. In either case you will still have to look at the tags to get a better sense of the actual subject matter of the content.

    Better, I think, to have a smaller number of extremely generic categories (Business, Politics, Technology, etc.) with an easily expandable list of subcategories that can be responsive to changes in the ongoing cultural conversation.

    If you do this with rollovers and flyout submenus, it will actually take less screen real estate. Then make sure the database allows for assignment to multiple categories and subcategories.

    :: SH
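One way the two-tier suggestion above might look in a database, sketched with Python's built-in `sqlite3` module. The table names, column names, subcategory names, and sample rows are all hypothetical; this is not the site's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES category(id)  -- NULL for a top-level category
);
CREATE TABLE program (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
-- Join table: a program may be assigned to any number of categories.
CREATE TABLE program_category (
    program_id  INTEGER NOT NULL REFERENCES program(id),
    category_id INTEGER NOT NULL REFERENCES category(id),
    PRIMARY KEY (program_id, category_id)
);
""")

# Two generic top-level categories, each with one invented subcategory.
conn.executemany(
    "INSERT INTO category (id, name, parent_id) VALUES (?, ?, ?)",
    [
        (1, "Business", None),
        (2, "Technology", None),
        (3, "Startups", 1),
        (4, "Web Development", 2),
    ],
)
conn.execute("INSERT INTO program (id, title) VALUES (1, 'How we built our startup')")
conn.executemany("INSERT INTO program_category VALUES (?, ?)", [(1, 3), (1, 4)])


def programs_in(conn, category_name):
    """Titles filed directly under a category or under one of its subcategories."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.title
        FROM program p
        JOIN program_category pc ON pc.program_id = p.id
        JOIN category c ON c.id = pc.category_id
        LEFT JOIN category parent ON parent.id = c.parent_id
        WHERE c.name = ? OR parent.name = ?
        ORDER BY p.title
        """,
        (category_name, category_name),
    )
    return [title for (title,) in rows]
```

Adding a subcategory is then a single insert into `category`, so the list can grow without touching the top-level menu.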

