
How CDOs can unlock the value of unstructured data

Emails, instant mes­sag­ing, social media posts, images, videos, employ­ee login times: all these have one thing in com­mon – they’re forms of unstruc­tured data.

“Unlike many oth­er forms of com­pa­ny data, which can be stored and col­lat­ed in a data­base, this infor­ma­tion is dis­or­gan­ised and dif­fi­cult, if not impos­si­ble, to analyse,” says Robert Ruther­ford, chief exec­u­tive of IT con­sul­tan­cy com­pa­ny QuoStar.

But the data holds incredible value. In fact, IT analyst company IDC estimates that by 2020, 37 per cent of unstructured data will be useful if properly analysed, resulting in $430 billion in productivity gains for organisations that can properly utilise it. The data can be used to improve the way an organisation operates internally, and can also help it to provide new products and services to customers or improve existing offerings.

What’s more, accord­ing to the Ver­i­tas 2017 Data Genomics Index, 16 per cent of an organisation’s data is unknown, unstruc­tured data, and this is grow­ing year on year, with the num­ber of unknown files held by organ­i­sa­tions increas­ing by more than 50 per cent between 2016 and 2017.

Categorising unstructured data

So how can the data be unlocked? Mr Ruther­ford believes that to under­stand the val­ue of data, com­pa­nies first need to know what kinds of infor­ma­tion they hold. This may seem obvi­ous, but with­out know­ing the dif­fer­ent data types and where they are in the organ­i­sa­tion, they would be incred­i­bly dif­fi­cult to mine for val­ue.

So this requires com­pa­nies to start by cat­e­goris­ing their unstruc­tured infor­ma­tion. “While this can seem like a sim­ple process, it is often a hid­den chal­lenge because sys­tems do not allow com­pa­nies to clas­si­fy their data at an incep­tion point, which means the infor­ma­tion remains unstruc­tured and hard to analyse,” Mr Ruther­ford explains.


There are artificial intelligence (AI) tools that can help organisations to streamline this process so data can be categorised quickly, but people are still needed to understand the data that is being processed. The next step is to analyse this data, in the same way that analytics already provides insight from structured data. That insight then needs to be made actionable and delivered to senior decision-makers to act on.

Jeremy Stimson, chief technology officer (CTO) at reputation risk management software company Polecat, says managing unstructured data is complicated because business stakeholders shouldn’t have to do the work of discovery themselves.

“This means being able to con­vey data in ways that would sim­pli­fy and com­mu­ni­cate insights with clar­i­ty to vast audi­ences, through sharp data visu­al­i­sa­tions, chart­ing, graphs and illus­tra­tive mod­els, for exam­ple. There’s a whole infra­struc­ture at work to turn unstruc­tured data into some­thing a CEO can actu­al­ly use,” he says.

CDOs can enable businesses to mine unstructured data

It is for this reason that there needs to be a specific C-level executive who manages this complexity of data within an organisation; someone who is not necessarily part of the IT department, but can work alongside a CTO or chief information officer, as well as the chief marketing officer and other C-level executives.

The chief data officer (CDO) is not only a position on the rise; the role is also taking on more importance. According to Gartner, more than half of CDOs report directly to a top business leader, and CDOs in general are now focused not only on data governance, data quality and regulatory drivers, but also on delivering tangible business value and enabling a data-driven culture.


The chief data officer can access the information hidden in disorganised datasets, and enable the business to mine unstructured data and incorporate it into a wider strategy. However, Nigel Vaz, chief executive of Publicis.Sapient International, points out that a business which hires a CDO needs to ensure they have real scope to make changes within the organisation.

“The CDO role can­not be a sur­ro­gate for col­lec­tive C‑suite own­er­ship of data, but must add a set of com­ple­men­tary skills found­ed on an under­stand­ing of data as a dri­ver of organ­i­sa­tion­al effi­cien­cy and, cru­cial­ly, of future cus­tomer val­ue,” he says.

Dealing with unstructured data varies from organisation to organisation

Wright­ing­ton, Wigan and Leigh NHS Foun­da­tion Trust is seek­ing this kind of indi­vid­ual at the moment. “While the NHS is slow­er than the pri­vate sec­tor, with things like GDPR [Gen­er­al Data Pro­tec­tion Reg­u­la­tion] now in force, it’s more appar­ent that there needs to be rep­re­sen­ta­tion at board lev­el to talk about data and ana­lyt­ics,” says the trust’s head of busi­ness intel­li­gence and act­ing asso­ciate direc­tor of infor­ma­tion man­age­ment and tech­nol­o­gy Mark Sin­gle­ton.

“We have inter­views for our data pro­tec­tion offi­cer who will be report­ing to some­one at a board lev­el and, depend­ing on who is recruit­ed, they may also become our CDO,” he adds.

But dif­fer­ent organ­i­sa­tions have dif­fer­ent ways of approach­ing how they deal with unstruc­tured data. For exam­ple, Adobe imple­ment­ed a new oper­at­ing mod­el, where its lead­ers agreed on a con­sis­tent data struc­ture and def­i­n­i­tions so the insights they gained from the cus­tomer jour­ney could be used to improve and per­son­alise expe­ri­ences.

How unstructured data can fit into a comprehensive data strategy

Mean­while, Hotels.com has three dif­fer­ent data-relat­ed func­tions. One leads on how data is cre­at­ed, cap­tured and man­aged, anoth­er leads on turn­ing that data into help­ful capa­bil­i­ties for its cus­tomers by using tech­nolo­gies such as machine-learn­ing and AI, and then it has a CTO who leads on how to act on this and get it in front of its cus­tomers.

“The three of us togeth­er form a tight-knit com­mu­ni­ty, an ecosys­tem and work­flow,” says Hotels.com’s chief data sci­ence offi­cer Matthew Fry­er, who leads the mid­dle func­tion.

Unstruc­tured data, there­fore, forms a large part of Mr Fryer’s role and he cat­e­goris­es the data in three dif­fer­ent groups. The first is where Hotels.com uses data to make pre­dic­tions and rec­om­men­da­tions, whether that is rec­om­mend­ing a cus­tomer the best hotel, rec­om­mend­ing them the best fil­ter or mak­ing pre­dic­tions for inter­nal fore­cast­ing.


The sec­ond group is where the com­pa­ny is try­ing to improve on what is often a frag­ment­ed and com­plex trav­el indus­try. This means try­ing to help with a customer’s entire jour­ney and their trav­el­ling plans, while keep­ing their pref­er­ences in mind.

“This is where we use some new­er inno­v­a­tive tech­niques like dis­play­ing the image from a hotel that best suits their pref­er­ences, and analysing tens of mil­lions of ver­i­fied text reviews to give us and the user more insight,” says Mr Fry­er, who adds that video analy­sis could be an area of growth in the years to come.

Speech is another form of unstructured data that Hotels.com is working to use. The idea is to enable a customer to describe everything they want from a hotel by speaking to a service such as Amazon’s Alexa, and for the system or virtual assistant to respond with clear answers.

No right or wrong when dealing with unstructured data

While Hotels.com has a clear work­flow in how it uses data and three data lead­ers with­in the organ­i­sa­tion, some organ­i­sa­tions are not hir­ing a spe­cif­ic chief data offi­cer, and instead are invest­ing in third-par­ty resources and train­ing to get their exist­ing staff to make bet­ter use of the data.

The Serious Fraud Office (SFO), for example, has to deal with unstructured data such as emails, documents and other written communications, and currently has a team of people who support its case teams in making sure they get the best out of the data systems they have, says Ben Denison, the SFO’s CTO.

There is clear­ly no right or wrong answer when deal­ing with unstruc­tured data, but organ­i­sa­tions are tasked with under­stand­ing what data they have at their dis­pos­al, defin­ing and cat­e­goris­ing it, analysing the data for insight, and then act­ing on that insight. For larg­er organ­i­sa­tions, the log­i­cal move is to employ a chief data offi­cer who can over­see this process and con­tin­ue to do so as the amount of unstruc­tured data con­tin­ues to grow.