Think Data First, Platform Second – Why Data Fuels MDM



As the volume of data coming into organizations – from both internal and external sources – continues to grow and makes its way across departmental systems in many different formats, there is a critical need to create a single, holistic view of the key data entities in common use across the enterprise. Master Data Management (MDM) aims to accomplish this goal. Not surprisingly, MDM has become a significant priority for global enterprises, with the market expected to nearly triple from $9.4B to $26.8B by 2020, according to analysts.

But while everyone is investing serious cash into the tools to manage the data, few are putting any thought into the data itself. This is akin to purchasing a luxury sports car and fueling it with water. Sure it looks great, but it won’t get you very far.


The underlying concept of MDM is surprisingly simple: get everyone “on the same page” looking at the same data and ensure it is accurate. Yet, master data and its management continue to be a universal challenge across many industries.  Organizations of all shapes and sizes share similar problems related to master data and can all reap benefits from solving them. That means concentrating on the quality of the data before going shopping for the sexiest MDM platform. In essence, you must master data before you can manage it. Ensuring the quality, structure, and integrability is your responsibility; your MDM platform won’t do that for you. It’s like purchasing a top-of-the-line oven and expecting it to produce a delectable meal. You are responsible for what goes into it.

Master Data Defined

Master Data is the foundational information on customers, vendors and prospects that must be shared across all internal systems, applications, and processes in order for your commercial data, transactional reporting, and business activity to be optimized and accurate. Because individual businesses and departments have a need to plan, execute, monitor and analyze these common entities, multiple versions of the same data can reside in separate departmental systems. This results in disparate data, which is difficult to integrate across functions and quite costly to manage in terms of resources and IT development. Cross-channel initiatives, buying and planning, merger and acquisition activity, and content management all create new data silos. Major strategic endeavors, part of any business intelligence strategy, can be hampered or derailed if fundamental master data is not in place. In reality, master data is the only way to connect multiple systems and processes both internally and externally.

Master data is the most important data you have. It’s about the products you make and services you provide, the customers you sell to and the vendors you buy from. It is the basis of your business and commercial relationships. A primary focus area should be your ability to define your foundational master data elements (entities, hierarchies and types) and then the data that is needed (both to be mastered and to be accessible) to meet your business objectives. If you focus on this before worrying about the solution, you’ll be on the right course for driving success with MDM. Always remember, think data first and platform second.

4 Wishes Data-Inspired Leaders Want This Holiday


What Every Data-Inspired Leader Wants This Holiday

With the holidays in full swing, everyone is busy making their lists and checking them twice. But while electronics and toys routinely top the wish lists for most, the data-inspired leaders of the world have some unique desires that can’t easily be purchased from your favorite store.

Whether you’ve been naughty (online hookup site for married couples was breached by hacking outfit, The Impact Team, and the personal details of 37M users were made public, leaving many men sleeping on the couch) or nice (Data Science for Social Good, a program at the University of Chicago that connects data scientists with governments, is working to predict when officers are at risk of misconduct, with the goal of preventing incidents before they happen), chief data officers, data scientists and all data stewards want better and safer ways to do their jobs.

Instead of playing Santa and asking them to sit on my lap and tell me what they want for the holidays, I figured I’d simply share some of the top things we’ve heard on data leaders’ wish lists this year.

1. A Better Way to Find Truth in Data

Mark Twain famously popularized the saying, “There are three kinds of lies: lies, damned lies, and statistics.” One of the biggest problems we’re faced with every day is trying to make sense of the data we have. In a perfect world the answer to all of our questions would lie smack dab in the data itself, but that’s not the case. The premise that data can get us closer to that single version of the truth is harder to achieve than first thought. But it hasn’t stopped us from trying to form conclusions from the data that is presented. Sometimes we rush to conclusions in the face of mounting pressure from others who demand answers.

What we really need is a source of truth to compare it to, otherwise it is very hard to know what the truth actually is. Unfortunately, that is often an impossible goal – finding truth in a world of ambiguity is not as simple as looking up a word in the dictionary. If you think about Malaysia Airlines Flight 370, which tragically disappeared in 2014, there were several conflicting reports claiming to show where the downed airliner would be found. Those reports were based on various data sets which essentially led to multiple versions of proposed “truth.” Until they finally found pieces of the wreckage, searchers were looking in multiple disconnected spots because that was what the “data” said. But without anything to compare it to, there was no way to know what was true or not. This is just one example of how data can be used to get an answer we all want. This same thing happens in business every day, so the takeaway here is that we need to stop rushing to form conclusions and try to first understand the character, quality and shortcomings of data and what can be done with it. Good data scientists are data skeptics and want better ways to measure the truthfulness of data. They want a “veracity-meter” if you will, a better method to help overcome the uncertainty and doubt often found in data.

2. A Method for Applying Structure to Unstructured Data

Unstructured data, meaning information that is not organized in a pre-defined manner, is growing significantly, outpacing structured data. Experts generally agree that 80-85% of data is unstructured. As the amount of unstructured data continues to grow, so do the complexity and cost of attempting to discover, curate and make sense of this data. However, there are benefits when it is managed right.

This explosion of data is providing organizations with insights they were previously not privy to, nor that they can fully understand. When faced with looking at data signals from numerous sources, the first inclination is to break out the parts that are understood. This is often referred to as entity extraction. Understanding those entities is a first step to drawing meaning, but the unstructured data can sometimes inform new insights that were not previously seen through the structured data, so additional skills are needed.

For example, social media yields untapped opportunities to derive new insights. Social media channels that offer user ratings and narrative offer a treasure trove of intelligence, if you can figure out how to make sense of it all. At Dun & Bradstreet, we are building capabilities that give us some insight into the hidden meaning in unstructured text. Customer reviews provide new details on customer satisfaction with a business that may not previously have been seen in structured data. By understanding how to correlate negative and positive comments as well as ratings, we hope to inform future decisions about total risk and total opportunity.

With unstructured data steadily becoming part of the equation, data leaders need to find a better way to organize the unorganized without relying on the traditional methods we have used in the past, because they won’t work on all of the data. A better process or system that could manage much or all of our unstructured data is certainly at the top of the data wish list.

3. A Global Way to Share Insights

Many countries around the world are considering legislation to ensure certain types of data stay within their borders. They do this out of security concerns, which are certainly understandable. They’re worried about cyber-terrorism and spying and simply want to maintain their sovereignty. Not surprisingly, it’s getting harder and harder to know what you may permissibly do in the global arena. We must be careful not to create “silos” of information that undermine the advancement of our ability to use information while carefully controlling the behaviors that are undesirable.

There’s a method in the scientific community that when you make a discovery, you publish your results in a peer-reviewed journal for the world to see. It’s a way to share knowledge to benefit the greater good. Of course not all knowledge is shared that way. Some of it is proprietary. Data falls into that area of knowledge that is commonly not shared. But data can be very valuable to others and should be shared appropriately.

That concept of publishing data is still confusing and often debated. Open data is one example, but there are many more nuanced approaches. Sharing data globally in a permissible way requires a tremendous amount of advice and consent. The countries of the world have to mature in allowing the permissible use of data across borders in ways that do not undermine our concerns around malfeasance, but also don’t undermine the human race’s ability to move forward in using this tremendous asset that it’s creating.

4. Breeding a Generation of Analytical Thinkers

If we are going to create a better world through the power of data, we have to ensure our successors can pick up where we leave off and do things we never thought possible. As data continues to grow at an incredible rate, we’ll be faced with complex problems we can’t even conceive right now, and we’ll need the best and brightest to tackle these new challenges. For that to happen, we must first teach the next generation of data leaders how to be analytically savvy with data, especially new types of data that have never been seen before. Research firm McKinsey has predicted that by 2018, the U.S. alone may face a 50% to 60% gap between supply and demand of deep analytic talent.

Today we teach our future leaders the basics of understanding statistics. For example, we teach them regression, which is based on longitudinal data sets. Those are certainly valuable skills, but it’s not teaching them how to be analytically savvy with new types of data. Being able to look at data and tell a story takes years of training; training that is just not happening at the scale we need.

High on the wish list for all data stewards – and really organizations across the globe, whether they realize it or not – is for our educational institutions to teach students to be analytical thinkers, which means becoming proficient with methods of discovering, comparing, contrasting, evaluating and synthesizing information. This type of thinking helps budding data users see information in many different dimensions, from multiple angles. These skills are instrumental in breeding the next generation of data stewards.

Does this reflect your own data wish list? I hope many of these will come true for us in 2016 and beyond. Until then, wishing you the very best for the holiday season…

Structured Data Still Dominates Data Centers

In spite of all the attention paid to unstructured data as the raw input for Big Data initiatives, there’s a counterpoint circulating in IT circles: Structured data is still king when it comes to volumes and business relevance.

In recent years, much of the conversation about Big Data has focused on the growing amount of unstructured data: video, tweets, podcasts and so on. The stuff that can be truly fascinating, packed with information, and yet doesn’t fit neatly into a spreadsheet or database.

But as companies rush in to provide ways to capture and analyze unstructured data, Dell has issued a report stressing that good, old-fashioned structured data remains every bit as important. In fact, it may be even more critical.

The Dell report spells it out this way: Although advancements in the ability to capture, store, retrieve and analyze new forms of unstructured data have received significant attention, a survey of 300 database administrators revealed that most organizations are focused primarily on managing their structured data – and they plan to continue that focus in the foreseeable future. The survey estimates 75 percent of data managed by organizations is still structured data.

Of course, over time, the report’s authors at Unisphere Research, a division of Information Today, say unstructured data will continue to grow in importance, especially as companies develop new analytics and tools to capture and dissect this information.

Here’s another surprise: For all we hear about the exploding volumes of unstructured data, the report predicts that structured data will grow even faster. That’s because structured data comes from both external sources, such as e-commerce transactions, and internal documents and processes, such as human resources.

The prediction is based on how database administrators measure data volumes today. More than a third of respondents said their structured data is growing at a rate of 25 percent or more per year, whereas fewer than 30 percent of respondents said the same about their unstructured data.
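To put a 25 percent annual growth rate in perspective, here is a minimal sketch of the compounding arithmetic. It assumes the rate stays constant year over year, which the survey does not claim, and the function names are illustrative rather than taken from the report.

```python
import math


def years_to_double(annual_growth_rate: float) -> float:
    """Years for a data volume to double under constant compound growth."""
    if annual_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + annual_growth_rate)


def projected_volume(start: float, annual_growth_rate: float, years: int) -> float:
    """Volume after a number of years, assuming the same growth rate each year."""
    return start * (1 + annual_growth_rate) ** years


if __name__ == "__main__":
    # At 25% per year, a data store doubles in a little over three years.
    print(round(years_to_double(0.25), 1))
    # 100 TB (an arbitrary starting point) after three years of 25% growth.
    print(projected_volume(100, 0.25, 3))
```

Under that assumption, a store growing 25 percent a year roughly doubles every three years, which is why structured data cannot be treated as a solved problem.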

None of these points are meant to minimize the potential of capturing and leveraging unstructured data. This is not an either/or scenario. It’s about being able to do both things well.

The report is a great reminder, though, to think critically about bright shiny new objects. Moving forward, it’s going to be important for businesses to continue investing in traditional data systems if they want to really take full advantage of all the information available.

Image credit: Tim Morgan

Where Can Customer Insight Take You?

How about straight to some pretty compelling business results?

Today, as the economic recovery picks up steam, it’s no wonder there’s a lot of emphasis being placed on the power that customer insight brings to the business process. Quality insight has significant and tangible benefits for the business, including:



  • Sales cycle time reduced from 120 to 108 days
  • Customer retention rates increased by 4-5%
  • Days sales outstanding reduced to 2-3 days
  • Time required to resolve customer issues reduced by 3-5%

Marketing and sales executives understand that accurate, current and relevant insight into customers and prospects makes a big difference in their effectiveness. Getting it, however, has been a lot like the quest for the Holy Grail. A lot of research for list vendors, a lot of sources saying they’ve got the answer, a lot of money spent on what promises to be the solution, and a lot of disappointment in the results. The challenge is real: the Sales and Marketing Institute and D&B estimate that more than 95% of email addresses and contact data within customer files and customer relationship management (CRM) systems are partially inaccurate. The rate of change in customer information is daunting; every 30 minutes:

  • 120 business addresses change
  • 75 telephone numbers change
  • 15 company names change
  • 10 businesses close
  • 20 CEOs leave their job
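To get a feel for what those half-hourly figures mean over a full day, here is a rough back-of-the-envelope sketch. It assumes the quoted rates hold around the clock, which is a simplification, and the names used are purely illustrative.

```python
# Changes per 30 minutes, as quoted in the list above.
CHANGES_PER_30_MIN = {
    "business addresses change": 120,
    "telephone numbers change": 75,
    "company names change": 15,
    "businesses close": 10,
    "CEOs leave their job": 20,
}

# Number of 30-minute intervals in a 24-hour day.
INTERVALS_PER_DAY = 24 * 60 // 30


def per_day(changes_per_interval: dict) -> dict:
    """Scale per-30-minute counts to per-day counts (assumes a constant rate)."""
    return {
        event: count * INTERVALS_PER_DAY
        for event, count in changes_per_interval.items()
    }


if __name__ == "__main__":
    for event, count in per_day(CHANGES_PER_30_MIN).items():
        print(f"{count:>5} {event} per day")
```

On that assumption, 120 address changes every half hour works out to 5,760 a day, which shows how quickly a static customer file drifts out of date.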

The question then becomes: is it really possible to get accurate customer insight in front of all stakeholders who could act on the information, including marketing, sales, customer service, finance and more?

Yes. Absolutely, unequivocally, yes. Savvy chief marketing officers (CMOs) are asking their CRM vendors:

  • Where is the data coming from?
  • What processes are used to ensure quality, accuracy and relevance?
  • What specific data is included?
  • Is the information updated in real-time?
  • Can we score/sort/filter the data with preferences specific to our needs?

With the right insight into customers and prospects—complete, relevant, pervasive—all at your teams’ fingertips, so much more can be accomplished.  Knowing which customers to pursue, engaging them with relevant information on their industries and their own conditions makes every effort more effective.

What are you doing to make customer insight the engine that drives your business?

At the very least, make sure your CRM has access to on-demand customer and prospect data so that you have the most current and accurate information to meet your goals.

Photo credit Dana Beveridge.

Four Pitfalls to Avoid when Starting a Big Data Project

In spite of all the excitement over Big Data, many companies will struggle with their Big Data initiatives in 2014. Why’s that? Because there are some common myths and misperceptions in the market about the best ways to approach Big Data projects – and they can really trip you up.

Pitfall #1. Putting the cart before the horse. Clients often come to their Big Data projects with clear goals: to mine social networks, gather sensor data or add public data. That’s the promise of a Big Data initiative. The reality can be quite a bit less exciting. Most of the clients I talk to consistently ignore the basic practicalities of dealing with ever-growing data stores. When I examine their data architectures, it’s common to find that their data is in a poor state – riddled with errors, with records duplicated between systems. In the wild, data proliferates. And since storage is relatively inexpensive, it’s common to have data sprawl, both on premise and in the cloud, with companies keeping multiple copies of their data in different systems. The most extreme case of this I’ve seen was a client that was keeping nine copies of corporate data – not a good foundation for any Big Data initiative.

Companies with Big Data ambitions for the New Year should start by improving the quality, or data hygiene, of the data they already own. A good first step is to synchronize all copies of their corporate data.  It makes no sense to add Big Data into an unmanaged mix of corporate data that is so common in most environments.

Pitfall #2. Putting data quality at the end of the process. There’s a common myth that it’s best to address data quality at the end of the Big Data pipeline, during the later stages of processing. While it may be true that the new data added in a Big Data project, such as behavioral data or location data, can initially contain ‘junk’ data, that approach can cause serious problems when you’re dealing with your core corporate dataset.

A better approach is to apply data quality checks at the beginning of the pipeline, to improve the quality of the data you already own and to lay a strong foundation for the Big Data project. This is best done before acquiring new streams and types of data, such as external data, public data, machine to machine (M2M) data and the like. Applying data quality checks is also part of the process of weaving in new datasets, such as public data, before combining that data with existing corporate datasets.
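As a rough illustration of what a check at the beginning of the pipeline might look like, here is a minimal sketch that flags incomplete records and collapses obvious duplicates before any new data is added. The field names (`name`, `postal_code`), the suffix list and the matching rule are all assumptions made for the example; real matching products are far more sophisticated than this.

```python
import re

# Legal suffixes ignored when comparing company names (an illustrative list).
COMPANY_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation and drop common legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    tokens = [t for t in cleaned.split() if t not in COMPANY_SUFFIXES]
    return " ".join(tokens)


def find_incomplete(records, required=("name", "postal_code")):
    """Return records missing any required field (validation step)."""
    return [r for r in records if any(not r.get(f) for f in required)]


def deduplicate(records):
    """Keep the first record seen for each (normalized name, postal code) pair."""
    seen = {}
    for rec in records:
        key = (normalize_name(rec["name"]), (rec.get("postal_code") or "").strip())
        seen.setdefault(key, rec)
    return list(seen.values())


if __name__ == "__main__":
    customers = [
        {"name": "Acme, Inc.", "postal_code": "07078"},
        {"name": "ACME Inc", "postal_code": "07078 "},
        {"name": "Globex Corp", "postal_code": "10001"},
        {"name": "Initech", "postal_code": ""},
    ]
    print(len(find_incomplete(customers)))  # records needing completion
    print(len(deduplicate(customers)))      # records left after de-duplication
```

The point of the sketch is the ordering: validation and de-duplication run on the core corporate dataset first, so that anything combined with it later starts from a clean base.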

There are a variety of solutions you can use to de-duplicate, validate and complete your key data, such as your customer and company information. Vendors such as D&B offer dataset integration with familiar analyst tools, such as PowerBI in Office 365 and also in Excel, that make their Cleanse and Match and Business Verification products attractive in ‘clean-at-the-beginning’ scenarios for Big Data.

Pitfall #3. Embracing new technologies – even if they don’t solve real-world problems. There’s a lot of buzz around Hadoop, in-memory computing and actionable analytics. All these technologies have their uses. But keep in mind that tomorrow’s magic data visualization tool won’t help anyone make sense of source data that is basically unsound.

At the risk of repeating myself: Start by evaluating and cleaning your data. It’s a good idea, and it can have a side benefit. When cleaning source data, I often find a very large amount of waste in the source systems, resulting in what I call “Small Big Data” projects. These projects can run on source systems such as relational databases, because the data volumes don’t warrant use of Hadoop or NoSQL solutions. Running “Small Big Data” projects using current systems and processes can be a time and money saver, since there are relatively few (if any) training needs for your existing staff. I’ve guided teams to upgrade to SQL Server 2012 to better accommodate these types of projects, and during that process, helped them understand how to make use of Enterprise features. This route makes it much simpler for these teams to get immediate business value. I’ve also guided teams toward using MongoDB rather than Hadoop, as the adoption curve is gentler for the former when coming from a relational system. That being said, there are of course Big Data projects where Hadoop is warranted, e.g., when you are using newer technologies. Guiding factors are the commonly stated volume, variety and velocity of data – meaning how much data, of what type(s) and how fast.

Pitfall #4. Taking an IT-driven approach. Another common fallacy of Big Data projects is that they should be driven by the IT department. However, my experience is that successful Big Data projects have a strong corporate sponsor and are usually driven by the business analysts, with support from the IT group.  Analysts best understand the core business processes and executive sponsorship is always key to any successful change to the core IT structure. IT can and should play a partner role (rather than a lead) in these types of projects.

Simply put – business goals should be the primary driver and technology implementation is secondary. When IT leads Big Data initiatives, I have seen these projects produce results (i.e., a new model, cube, data store, report, etc.); however, success is measured by the value these new objects provide to the business. If they are not accepted and/or used by the business side of the house, then there is little value in the project.

To summarize, Big Data projects are best started by getting your current (data) house in order – this process can be aided by some of the D&B data services available on Microsoft’s Windows Azure Marketplace, such as Cleanse & Match, Business Verification, and many others that help you clean, enrich and glean valuable insight from your business data. Also, the integration into current technologies, such as SQL Server SSIS and Excel to name a couple of examples, further improves usability.

Wishing you a clean, rich and verified 2014!


Big Data and the Need-to-Know About Suppliers

Manufacturers have awakened to the reality that they need to know a lot more about their suppliers: a 2013 report from Supply Chain Insights found that 41 percent of surveyed supply chain professionals cited the ability to get to data on suppliers as the #1 problem they faced in 2012.

It makes sense. Economic uncertainty, natural disasters, political and social unrest are all factors with real potential for supply chain disruption. Knowing how a supplier is performing and where they are operating at any given time gives manufacturers an informed perspective that enables them to proactively respond to even the slightest tremor on the backbone of today’s brittle supply chains.

Getting the insight needed, though, isn’t really getting any easier. Fewer and fewer companies are publicly traded, making it more difficult to get real insight into leadership, locations, and financial performance. When it comes to doing business with suppliers in emerging economies, sources such as World Bank or the World Factbook don’t provide much beyond statistics and figures such as taxation, immigration, and legal aspects of doing business in a particular region or country.

Can big data help close the gap on information needed to reduce supplier risk? Supply chain executives think so, according to a 2013 report from Eye for Transport. Two-thirds of survey respondents are evaluating big data initiatives, citing lowering costs and the need for real-time decision making as two of the biggest drivers.

What’s valuable about the unstructured data is the velocity, volume and variety of the information. In many ways, harnessing big data can be like having feet-on-the-street where your suppliers operate. You can learn with lightning speed how a natural disaster may impact operations; whether there’s been a change in leadership or a court action against the company. Having this information arrive on your desktop, in a timely fashion, gives you critical lead time to evaluate options, formulate a strategy and execute with confidence.

You can start small – with the suppliers you already know that may be having problems, or those who are your most critical. Start by asking yourself:

  • Where am I most vulnerable to disruption?
  • Which suppliers are most likely to negatively impact my business if they fail?
  • Where should I look for alternatives?

Identify the unstructured – or big – data that can help you sort through those questions and integrate it with the structured, traditional information you already track. Consolidate it in a single source and you will create the deep, complete picture you need to proactively manage suppliers and risk.

Check out this white paper on Supplier Management to learn more.

Photo credit Gorilla Girl (c) 2007

Harnessing Big Data for Informed Business Decisions

With expectations high for the value that can be delivered by ‘big data’, companies are investing in solutions that can turn those ambitions into realities – more customers, lower costs, less risk and more. A recent report from Gartner found that 64% of organizations surveyed have already purchased or are planning to invest in big data solutions in 2013, compared with 58% in 2012. That 64% breaks down as follows: 30% have already invested in big data tech, 19% plan to invest within the next year and another 15% plan to invest within two years.

But are they using what they’ve built? Not so much: fewer than 8% of Gartner’s 720 respondents have actually deployed big data technology. The reasons for the slow implementations are, I suspect, the usual ones: cultural barriers to applying analytics; internal feuding over who owns the data; lack of skills; and internal disagreement over how to measure the ROI.
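For readers who want to sanity-check the survey figures quoted above, here is a small sketch of the arithmetic. It treats each percentage as a share of all respondents, which is how the numbers add up, and the variable names are illustrative only.

```python
# Shares of all respondents, as quoted from the Gartner survey above.
INVESTMENT_BREAKDOWN_PCT = {
    "already invested": 30,
    "plan to invest within the next year": 19,
    "plan to invest within two years": 15,
}

RESPONDENTS = 720
DEPLOYED_CEILING_PCT = 8  # the survey reports that under 8% have deployed


def total_committed_pct(breakdown: dict) -> int:
    """Sum the individual shares to get the headline figure."""
    return sum(breakdown.values())


def max_deployed(respondents: int, ceiling_pct: float) -> float:
    """Upper bound on the number of respondents who have actually deployed."""
    return respondents * ceiling_pct / 100


if __name__ == "__main__":
    print(total_committed_pct(INVESTMENT_BREAKDOWN_PCT))  # headline percentage
    print(max_deployed(RESPONDENTS, DEPLOYED_CEILING_PCT))  # respondent ceiling
```

The three shares sum to the 64% headline, while the deployment figure corresponds to fewer than 58 of the 720 organizations surveyed.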

Before you invest in a big technology – or if you have already and need to put it to work – here are four basic rules that can help you get what you want.

Rule #1: Figure out what business challenge you are trying to solve with all this information. For example, do you want to:

  • Target your customer base more effectively?
  • Optimize retention strategies?
  • Understand where risk in your customer or supplier portfolio lies – and reduce it?

Rule #2: Tear down the silos of information to create a single source of truth about customers, prospects and suppliers. One executive at D&B often talks about a company he’s worked with that has 140 different systems for collecting customer, sales and supplier information! These disparate databases exacerbate the disconnects between departments and build obstacles to problem-solving and opportunity-capturing instead of accelerating both as intended.

Rule #3: Use what you already have. According to a ZDNet/D&B survey of close to 600 organizations conducted in early 2013, more than 75% track operational data from finance, ERP, CRM and other internal applications, while close to 45% track transactional data such as sales, queries, etc. Odds are good that you already have a lot of insight into what you want to know. Applying rules #1 and #2 will help you get what you want out of it.

Rule #4: Integrate third party insight with your internal resources. The same survey found that fewer than 30% of respondents use data-as-a-service to enhance and validate their own information and add value to day-to-day business decisions. Odds are very good that your internal data is stagnant, and by augmenting internal information with external sources to ensure accuracy and relevance, you overcome the vulnerabilities caused by “dirty data.”

No matter what you are looking for from big data, these guideposts will help you navigate the selection, implementation and activation of your solution. You’ll get what you need much more quickly and less expensively, and with a whole lot less aggravation. Isn’t that what breakthrough innovation is supposed to bring us?

(To see the full Informed Perspective infographic, click here.)

Big Data, Big Buzz: What You Really Need to Know

Is big data such a big deal? Wondering if there’s more buzz than beef behind the noise?

Truthfully, for many businesses, ‘big data’ is a problem. The massive data sets are complex, unwieldy and can become the focal point of internal battles about who owns the responsibility to bring them under control.  And ‘big data’ is only going to get bigger.

The explosion is being driven by our seemingly insatiable desire for more information from more sources. Savvy organizations know there’s a gold mine of information about customer preferences, piques and peeves to be found in social media, warranty information and customer service feedback – most captured anecdotally and all unstructured. There’s data now collected based on sensor technology that can tell you where a product is located, how well it is performing, if the temperature is safe, etc. There’s the structured data we all know and love – forms, contracts, purchase orders – that lives in our databases and is easier to manage, at least in theory. Less common now, but arriving fast on the scene, is audio and video data captured from phone conversations, meetings and other live interactions.

It’s no wonder we’re drowning in data. The questions we all have to answer are what data matters to us, where can we find it and how can we be sure it is the best possible resource for us. The good news is that there are strategies that can help determine which data you collect, shape and manage.

And that will be covered in our next post.