Agile Computing: Blog Post

Using Taxonomy to Drive Online Contextual Advertising with Sophializer

Classifying Web Content to the IAB Taxonomy

It’s a Big Market …
…  online advertising.  There are 10,000 stories and data points about it.  Here are two to give some context to the journey below.  First, global online ad spending is projected by ZenithOptimedia to exceed print ad spend by 2015 (note 1).  This 2015 projected spend figure for online advertising is $132.4 billion.  Second, global online ad revenue is projected by another research agency, Digital TV Research, to hit $143 billion by 2017 (note 2).

These are prodigious amounts of money for companies to spend to connect with customers.  But … surely it’s easy to connect online customers to web content featuring, or suggesting, products? And surely, online is “better”?  Where can, and do, taxonomy-based approaches add value to this dance of moving (emotional and semantic) parts between the intentful consumer poised to shop and the intentful marketer with honed content?

Online Ad Targeting is Easy … so 'They' Say …
Really?  So what might be “easy”?  And, indeed, “better”?  Let’s unbundle these simulacra that look like very fuzzy concepts, and as ontologists and knowledge engineers let’s think our way forward with the concept of “precision”.

So … online is more precise than billboards by freeways?  Lightly stated, online has advantages.  What about magazine print ads vs. online?  Online has potential advantages.  But … and this is a very big but … in both these cases (and all others) online depends on connecting potential customers to products, their features, their benefits, their attributes and so on precisely, with a precision that is repeatable and extensible rather than random (random is the most expensive way to advertise and has fallen out of favor).  And, since online copy and online ads consist of words (including the words spoken in videos) that are semantically classifiable, and since classifications can be organized into models (taxonomies and ontologies) … there are advantages to be created through the combination of semantic analysis, categorization and taxonomy.

Now, let’s connect taxonomy, classification, semantics and optimizing online ad targeting.  There are a host of holy grails currently being sought in the web/mobile/social uber-ecosystem.  Some are well found, though not perfect, and are unlikely to undergo a paradigmatic improvement.  Think ‘search’.  Others are most definitely not found (yet).  Given the size of the market outlined in the first paragraph, the rewards are huge for those with the tools and skillsets to work with semantics, taxonomy/ontology, classification of content to taxonomy, and design of taxonomies to drive online targeting.

New Approaches to Classifying to the IAB Taxonomy with Sophializer
Sophia Search is a recent entrant into this space (I have written about them before here).  Sophia Search’s tool – currently called the ‘Sophializer’ – categorizes any URL to nodes in the Interactive Advertising Bureau (IAB) taxonomy.  Sophializer can also classify the content of ads (and so create a semantic/conceptual ‘signature’ for each).  The IAB Contextual Taxonomy comprises three levels:

  • Tier 1 – 23 nodes
  • Tier 2 – 371 nodes
  • Tier 3 – unspecified and vendor specific

Given that Sophializer categorizes both sides of this content dance – web page and ad – web properties can serve ads to any page automatically using the IAB taxonomy as the cross-mapping conceptual foundation.
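
To make the idea concrete, here is a minimal sketch of matching ads to a page once both have been categorized to the same taxonomy.  It is not Sophializer's method (that is proprietary); the function name `match_ads`, the ad IDs and the node labels are all hypothetical, and the scoring is a simple count of shared nodes.

```python
from typing import Dict, List, Set


def match_ads(page_nodes: Set[str], ads: Dict[str, Set[str]]) -> List[str]:
    """Rank ad IDs by how many taxonomy nodes they share with the page."""
    scored = [(len(page_nodes & nodes), ad_id) for ad_id, nodes in ads.items()]
    # Highest overlap first; ties are broken alphabetically by ad ID.
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    # Ads that share no node with the page are not served at all.
    return [ad_id for score, ad_id in ranked if score > 0]


# Illustrative node labels only, written as "Tier 1/Tier 2".
page = {"Automotive", "Automotive/Hybrid"}
ads = {
    "ad-hybrid-suv": {"Automotive", "Automotive/Hybrid"},
    "ad-car-insurance": {"Automotive", "Personal Finance/Insurance"},
    "ad-cookbook": {"Food & Drink"},
}

ranked = match_ads(page, ads)
print(ranked)
```

The taxonomy does the work here: neither side needs to know anything about the other beyond the node labels they have in common.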

Sophializer not only classifies to Tier 1 and Tier 2; it also discovers and generates robust classifications that can be used to customize Tier 3 for individual customers.

Benefits of Using Taxonomy for Ad Targeting
Taxonomy gives a framework to this kind of semantic work.  Essentially, we are cross-mapping both partners of this content dance (content and ad) using the IAB taxonomy as a “choreographer” of sorts.  Other taxonomies could be used.  In fact, multiple taxonomies could be used, and this would be particularly powerful if these taxonomies were cross-mapped to each other.  For example, if you have ads categorized and mapped to Taxonomy A, and Taxonomy A is cross-mapped to the IAB taxonomy … then … you can propagate these ads to content that is already categorized to the IAB taxonomy.
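
A rough sketch of that propagation step, assuming a hand-built cross-map from “Taxonomy A” to IAB-style nodes.  Everything named here (`A_TO_IAB`, `to_iab`, `pages_for_ad`, the node labels and the URLs) is invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical cross-map: each Taxonomy A node points at one or more IAB-style nodes.
A_TO_IAB: Dict[str, Set[str]] = {
    "Cars/Electric": {"Automotive/Electric Vehicle"},
    "Money/Car Loans": {"Automotive", "Personal Finance"},
}


def to_iab(a_nodes: Set[str], crossmap: Dict[str, Set[str]]) -> Set[str]:
    """Translate a set of Taxonomy A nodes into the IAB-style nodes they map to."""
    iab: Set[str] = set()
    for node in a_nodes:
        # Nodes with no cross-mapping contribute nothing.
        iab |= crossmap.get(node, set())
    return iab


def pages_for_ad(ad_a_nodes: Set[str],
                 crossmap: Dict[str, Set[str]],
                 page_index: Dict[str, Set[str]]) -> List[str]:
    """Return pages (already categorized to IAB nodes) that share a node with the ad."""
    ad_iab = to_iab(ad_a_nodes, crossmap)
    return sorted(url for url, nodes in page_index.items() if nodes & ad_iab)


# Pages already categorized to IAB-style nodes (illustrative data).
page_index = {
    "/news/ev-review": {"Automotive/Electric Vehicle"},
    "/money/loans": {"Personal Finance"},
    "/food/recipes": {"Food & Drink"},
}

targets = pages_for_ad({"Money/Car Loans"}, A_TO_IAB, page_index)
print(targets)
```

Note that this sketch matches node labels exactly; a fuller version might also treat a Tier 2 node as a match for its Tier 1 parent.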

Benefits of Using Categorization Tools to Assign Marketing Content to Taxonomy Nodes
There are several methods of assigning content to nodes in any taxonomy:

  • Manually
  • Training sets of documents (training documents are most often manually selected as exemplars)
  • Categorization algorithms that work with semantic tokens

Each of these deserves a blog post of its own on methods, workflows, best practices and pitfalls.  But not here.

Sophializer utilizes patented and proprietary algorithms in the core of its categorization engine.  Two fundamental points are worth focusing on briefly.  Firstly, different categorization engines use different patented technologies, and “quality” from different categorizers is (very) variable.  This is why it is important to carry out “Proofs of Concept” when evaluating this technology.
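
One simple way to put numbers on “quality” in a Proof of Concept is to compare a categorizer's output with a small, manually labeled sample and compute precision and recall.  The sketch below assumes that setup; the function name, URLs and node labels are hypothetical, and it is not tied to any particular vendor's tool.

```python
from typing import Dict, Set, Tuple


def precision_recall(predicted: Dict[str, Set[str]],
                     gold: Dict[str, Set[str]]) -> Tuple[float, float]:
    """Micro-averaged precision and recall of predicted nodes against manual labels."""
    tp = fp = fn = 0
    for url, gold_nodes in gold.items():
        pred_nodes = predicted.get(url, set())
        tp += len(pred_nodes & gold_nodes)   # nodes the categorizer got right
        fp += len(pred_nodes - gold_nodes)   # nodes it assigned wrongly
        fn += len(gold_nodes - pred_nodes)   # nodes it missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Manually assigned nodes for a small sample of pages (illustrative data).
gold = {
    "/page-1": {"Automotive"},
    "/page-2": {"Travel", "Food & Drink"},
}
# What a candidate categorizer returned for the same pages.
predicted = {
    "/page-1": {"Automotive", "Sports"},
    "/page-2": {"Travel"},
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same labeled sample through each candidate categorizer gives a like-for-like comparison.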

Secondly, the more semantically rich the taxonomy – e.g. fully enriched with synonyms and other types of evidence terms – the better “quality” one gets with any method of associating content to taxonomy nodes.   Both of these parameters are make-or-break (literally) in using semantics to target online ads.
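
A toy illustration of the second point, assuming the simplest possible categorizer: one that counts how many of a node's evidence terms appear in the text.  Real engines are far more sophisticated; the function `classify` and the term lists are invented for this example.

```python
import re
from typing import Dict, Set


def classify(text: str, evidence: Dict[str, Set[str]]) -> Dict[str, int]:
    """Score each taxonomy node by the number of its evidence terms found in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {node: len(tokens & terms) for node, terms in evidence.items()}
    # Report only nodes with at least one matching term.
    return {node: score for node, score in scores.items() if score > 0}


# The same node, first with only its label, then enriched with synonyms.
bare = {"Automotive": {"automotive"}}
enriched = {"Automotive": {"automotive", "car", "sedan", "suv", "vehicle"}}

text = "The new sedan is a quiet, efficient car."

bare_result = classify(text, bare)
enriched_result = classify(text, enriched)
print(bare_result, enriched_result)
```

With only the node label as evidence the page goes uncategorized; with synonyms added, the same page is assigned to the node.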

Learn More 2.0
The Google Display Network is IAB Certified and complies with the top two tiers of the IAB Contextual Taxonomy.  You can read details of what Google does here, which also links to the Google mapping to the IAB taxonomy Tier 1 and Tier 2.

Sophia Search currently has a number of live engagements on the web.  For example, it is targeting ads for non-fiction books (from a major publishing house) to news stories (on a pre-eminent news site).  You can contact them for details.

This is not an empty space.  Other companies are also searching for the holy grail of taxonomy-based content targeting mediated by content categorization that works.  See, for example, ADmantX.

This whole space is an excellent example of where the application of the nexus of taxonomy, categorization and semantics will provide stratospheric business benefit.  Grails are waiting to be found here.

Note 1.  See ZenithOptimedia

The detailed ZenithOptimedia figures can be found here

Note 2.  See Hollywood Reporter

You can download the Digital TV Research press release about these figures here
