
Layers, Multi-Purpose Taxonomies, and the Ghost in the Machine

October 23, 3:00 PM

Challenges and Solutions

Effective, unbiased machine learning models require clean, consistent, contextual, and well-considered data.

Taxonomies are foundational knowledge organization structures that provide a single source of truth for concepts used for purposes like navigation, insight discovery, and content and data tagging, including in machine learning training sets.

Ontologies model knowledge domains, acting as the architectural underpinning and ruleset for how controlled vocabulary concepts are characterized, relate to each other, and are applied to content.

Together, taxonomies and ontologies are the semantic layer representing an organization’s subject matter expertise, shared understanding, knowledge, and viewpoints applied to content and data that power a wide variety of applications. Challenge: They risk carrying inbuilt subjectivity and bias, a ghost in the machine, that flows into other data-consuming systems and machine learning models.

Solution: Apply semantic models that reduce biases.

Challenge: How?

Solutions:

  • Processes for building and applying semantic models that represent the business while reducing the introduction of biases that can skew downstream applications.
  • Techniques for developing taxonomies and ontologies that mitigate subjective viewpoints, including peer review, diverse expert involvement, and external source consultation.
  • Strategies for managing multi-purpose taxonomies in rapidly changing business environments, balancing the need for sustainable, unbiased models with immediate business demands.

In short: Create robust semantic models that adapt to evolving business requirements while maintaining data quality and minimizing bias.