
Using e‑disclosure for meaningful analytics

We live in the era of infor­ma­tion over­load. Every day, we gen­er­ate more and more data, much of it mean­ing­less. With the click of a but­ton, you can e‑mail 50 peo­ple with a 50-megabyte attach­ment. We rack up data on our PCs, lap­tops, smart­phones and tablets. Accord­ing to IBM, we cre­ate 2.5 quin­til­lion bytes of data every day – and 90 per cent of the data in the world was actu­al­ly cre­at­ed in the last two years. But we rarely both­er to delete any of it.

In our day-to-day lives, this does not really matter. Who cares how much useless information we are storing? But for corporates faced with a lawsuit, it suddenly becomes a problem. Within that mushrooming expanse of data, there will be files that must be handed over to the opposing party through the legal process of disclosure, and finding these is an increasingly daunting task.


The problem is often compounded in the corporate world by regulatory rules that do not allow many files to be deleted. Investment firms, for example, are obliged to retain large amounts of data, including mobile phone calls made by traders. Indeed, these days much of the data stored is not in traditional document form at all, but is video, audio and text, and increasingly sits on social media.


Discovering data

In terms of dis­clo­sure, the data being gen­er­at­ed on Face­book, Twit­ter, Insta­gram, LinkedIn and oth­ers is the next big chal­lenge.

Chris Dale, founder of the e‑Disclosure Infor­ma­tion Project, says: “Lawyers have just got their heads round the fact that e‑mail and Word files are dis­cov­er­able, but they have not yet applied their minds to all the non-tra­di­tion­al data sources. Even if they are think­ing about social media, they are only look­ing at their duty to dis­close it, but are not see­ing its poten­tial val­ue as evi­dence.”

According to Mr Dale, this type of data could be important to the litigation strategy, for example if a witness claimed not to have been at a certain place, but a photo uploaded to Facebook from their smartphone suggests otherwise.

“It might be unlike­ly to turn the case, but it could be use­ful, for exam­ple in under­min­ing the cred­i­bil­i­ty of a wit­ness,” he says.


Predictive coding

Tech­nol­o­gy cre­at­ed the prob­lem of suf­fo­cat­ing data, but it also holds the solu­tion. An e‑disclosure tech­nique known as pre­dic­tive cod­ing can reduce the dis­clo­sure pile from what could be mil­lions of files down to a man­age­able num­ber.

As Jonathan Maas, senior direc­tor at e‑disclosure con­sul­tan­cy Huron Legal, explains, pre­dic­tive cod­ing tech­nol­o­gy is like an “eager pup­py”. It first com­pletes a series of “train­ing runs” on small­ish sam­ples of doc­u­ments – not more than 2,000 – in which a lawyer will tell it what is impor­tant and what to dis­card. When it is ready, the lawyer then throws it a bone and off the pup­py bounces to per­form the same trick across the entire data set.
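The train-then-apply loop Mr Maas describes can be sketched in a few lines of Python. This is a minimal illustration only, assuming a simple naive Bayes word-count classifier; commercial predictive coding tools use their own, more sophisticated methods. The sample documents, the "relevant" and "discard" labels and the function names are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labelled_docs):
    """Learn word counts per label from (text, label) pairs coded by a lawyer."""
    word_counts = {}        # label -> Counter of words seen under that label
    doc_counts = Counter()  # label -> number of training documents
    for text, label in labelled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the most probable label for an unreviewed document."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:
                # Laplace smoothing so an unseen word never zeroes the score.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical "training run": a small sample coded by a lawyer.
training_sample = [
    ("pricing agreement with the distributor attached", "relevant"),
    ("revised contract terms and pricing schedule", "relevant"),
    ("lunch on friday anyone", "discard"),
    ("office car park closed next week", "discard"),
]
model = train(training_sample)

# The "entire data set": documents nobody has reviewed yet.
unreviewed = [
    "please see the signed pricing contract",
    "friday lunch is cancelled",
]
review_pile = [doc for doc in unreviewed if predict(model, doc) == "relevant"]
print(review_pile)
```

In this toy run only the pricing e-mail reaches the review pile. A lawyer would still read everything in that pile before any of it is disclosed.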


The end result is a man­age­able par­cel of files, neat­ly tied with a metaphor­i­cal bow and pre­sent­ed to the lawyers to be reviewed. Con­trary to mis­con­cep­tion, pre­dic­tive cod­ing does not mean hand­ing any doc­u­ments to a lit­i­ga­tion oppo­nent with­out a lawyer hav­ing eyes on them first.

For the lawyers, it does involve a leap of faith because the ini­tial sift­ing has been done by com­put­er, rather than the tra­di­tion­al team of exhaust­ed fee-earn­ers in crum­pled suits. But lawyers have a duty to keep their costs pro­por­tion­ate, which means exam­in­ing every doc­u­ment by hand is sim­ply not an option now that the vol­ume of data has been super­sized.

As Vince Neicho, litigation support manager at City law firm Allen & Overy, puts it: “Using technology means that you will miss documents, but then so will a fatigued lawyer sitting in a room.”

Lit­i­ga­tion may have been the dri­ver behind this new tech­nol­o­gy, but there is grow­ing recog­ni­tion it could be a handy tool in many oth­er fields; basi­cal­ly any task that involves pulling infor­ma­tion from very large amounts of data.

Mr Maas explains: “The tech could be used in a num­ber of the­atres of war; for exam­ple, inves­ti­ga­tions by reg­u­la­to­ry author­i­ties. It could help in inter­nal inves­ti­ga­tions – say, for insid­er deal­ing or IT theft. Or it could sim­ply be used for infor­ma­tion gov­er­nance gen­er­al­ly, for exam­ple where an organ­i­sa­tion may have count­less copies of the same thing. At the very least, you could use it to iden­ti­fy all the dupli­cates, stor­ing the mas­ter doc­u­ment in a clear­ly labelled way, and get rid of all the rest.

“One of the big by-prod­ucts of lit­i­ga­tion is that you always end up with a spank­ing clean fil­ing sys­tem with all your data in order.”
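The duplicate-hunting Mr Maas mentions is the simplest of these jobs to picture. One common approach, assumed here purely for illustration, is to fingerprint each file's contents with a hash and group the files whose fingerprints match. The file paths and contents below are invented, and picking the alphabetically first path as the "master" is an arbitrary choice for the sketch.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group byte-identical files.

    files: dict mapping a path to that file's contents as bytes.
    Returns a dict mapping each chosen master path to its duplicate paths.
    """
    groups = defaultdict(list)
    for path, content in files.items():
        # Identical contents always produce the same SHA-256 digest.
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(path)

    duplicates = {}
    for paths in groups.values():
        if len(paths) > 1:
            ordered = sorted(paths)
            # Arbitrary rule for this sketch: first path alphabetically is the master.
            duplicates[ordered[0]] = ordered[1:]
    return duplicates


# Hypothetical file store: three copies of one report and one unrelated file.
files = {
    "finance/q3_report.docx": b"Q3 revenue summary",
    "archive/q3_report_copy.docx": b"Q3 revenue summary",
    "desktop/Copy of q3_report.docx": b"Q3 revenue summary",
    "hr/holiday_policy.docx": b"Holiday policy 2015",
}
duplicates = find_duplicates(files)
print(duplicates)
```

A hash only catches exact copies. Near-duplicates, such as the same report with one word changed, need a similarity-based technique instead.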

In the merg­ers and acqui­si­tions field, if you have pur­chased a com­pa­ny, you will nor­mal­ly acquire a large amount of its data, often uncat­e­gorised and unsort­ed. E‑disclosure tech­nol­o­gy can be deployed to find any intel­lec­tu­al prop­er­ty val­ue tucked away in that mass and also the busi­ness risks that might be lurk­ing with­in.

Clustering technology

In sit­u­a­tions where you know some­thing is not quite right, but you do not know exact­ly what you are look­ing for, the lat­est clus­ter­ing tech­nol­o­gy, which can be based on con­cepts rather than key­words, can pro­vide the solu­tion.

“Say you have bought a com­pa­ny that oper­ates in Rus­sia, which is a high-risk area in terms of the Bribery Act,” says Mr Dale. “You have 15 sales­men out there and it’s a good idea to find out what they are up to. Or say secrets are leak­ing out of your organ­i­sa­tion or you start to think that some­thing doesn’t smell right in a branch office. This is where clus­ter­ing can help.”

Clus­ter­ing is a way of group­ing doc­u­ments togeth­er accord­ing to their con­tent, to cre­ate a high-lev­el visu­al map of bright­ly coloured clus­ters. These can then be dis­missed or, if some­thing looks out of place, inves­ti­gat­ed fur­ther until you reach doc­u­ment lev­el.

Mr Dale gives the concept of “Labrador” as an example. The technology would separate data into clusters about a region in north-east Canada, a dog and the Spanish for worker. If it finds a lot of things about dogs, it will then group these again, say into dog, canine, Labrador, poodle and so forth.

“That’s an exam­ple of it telling you it has found a lot of doc­u­ments that are sim­i­lar doc­u­ments, derived from the text with­in them and from the meta­da­ta, not from words you have fed it,” he explains.
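To make the “Labrador” example concrete, here is a rough Python sketch. It is a much simpler word-overlap stand-in, not the concept-based technology Mr Dale describes: it weights words by TF-IDF, so a word that appears in every document (here, "labrador") counts for nothing, then groups documents whose remaining words overlap. The six documents, the similarity threshold of 0.1 and the greedy grouping rule are all assumptions made for the illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Weight each word by its count times log(N / documents containing it)."""
    tokenised = [Counter(tokenize(doc)) for doc in docs]
    doc_freq = Counter(word for counts in tokenised for word in counts)
    n = len(docs)
    return [
        {word: tf * math.log(n / doc_freq[word]) for word, tf in counts.items()}
        for counts in tokenised
    ]


def cosine(a, b):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.1):
    """Greedy grouping: join the first cluster whose first document is similar enough."""
    vectors = tfidf_vectors(docs)
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical documents that all mention "Labrador" in different senses.
docs = [
    "The labrador puppy needs a walk and dog food",
    "Our dog, a labrador, saw the vet about puppy food",
    "Ferry to Labrador coast delayed by ice in Canada",
    "Flights to the Labrador coast of Canada resume",
    "El labrador trabaja en el campo",
    "Un labrador trabaja la tierra del campo",
]
print(cluster(docs))
```

The dog documents, the place documents and the Spanish-language documents end up in three separate groups, even though no one told the program which words to look for.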

The more technology forms a part of our work and personal lives, the faster the data mass will expand and multiply, at least until we learn to store it in a more organised way and delete duplicated or obsolete files. But the ability of e‑disclosure technology to handle those volumes is also growing, boosted by ever-increasing processing power. The technology began in the litigation context, but is about to spread its wings far wider.

As Mr Nei­cho con­cludes: “This prob­lem is not going to go away, so we need to deal with it.”

THE RIGHT TO BE FORGOTTEN

Next month sees the deadline for European Union member states to reach a possible agreement on the General Data Protection Regulation, which could affect businesses across Europe.

The new regulation is a much-needed update, given that Europe is currently operating under a set of rules created in 1995, when there were only around 23,500 websites on the internet, and social media and cloud computing did not even exist. The final version of the rules looks set to be agreed in December, following many years of consultation, and will become law in two years’ time.

It will apply not just to EU-based companies, but any business – including US corporates – that “touches” the data of an EU citizen.

The focus is very much on the data privacy rights of the citizen. But some are concerned about the potential impact on business, particularly relating to the so-called “right to be forgotten” contained in Article 17 of the rules, allowing EU individuals to demand the erasure of their personal data.

David Mose­ley from Ver­i­tas explains: “The reg­u­la­tion need­ed to hap­pen. It will har­monise how we work togeth­er, but organ­i­sa­tions need to improve their infor­ma­tion man­age­ment, gov­er­nance and dis­cov­ery of data or their IT depart­ment will become the bot­tle­neck if still using man­u­al process­es to pro­vide request­ed infor­ma­tion.

“You need to devel­op sys­tems to remove per­son­al­ly iden­ti­fi­able infor­ma­tion, and stream­line your reten­tion and clas­si­fi­ca­tion poli­cies. A com­pa­ny could be inun­dat­ed with requests under the pro­posed rul­ing, and if you’re deal­ing with lega­cy archives and frag­ment­ed loca­tions, the IT depart­ment could eas­i­ly be buried.”
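As a rough picture of what an automated erasure step might look like, the Python sketch below redacts one person's e-mail address across several data stores and reports how many hits each store contained. Everything in it is hypothetical: the store names, the records, the `[ERASED]` placeholder and the function itself. A real erasure process would also have to cover names and other identifiers, back-ups and any legal grounds for keeping the data.

```python
import re


def erase_subject(stores, subject_email, placeholder="[ERASED]"):
    """Redact one e-mail address in every store, in place.

    stores: dict mapping a store name to a list of text records.
    Returns a dict mapping each store name to the number of redactions made.
    """
    # Match the address literally, ignoring upper and lower case.
    pattern = re.compile(re.escape(subject_email), re.IGNORECASE)
    hits = {}
    for name, records in stores.items():
        count = 0
        for i, record in enumerate(records):
            redacted, replaced = pattern.subn(placeholder, record)
            records[i] = redacted
            count += replaced
        hits[name] = count
    return hits


# Hypothetical stores: a live system and a legacy archive.
stores = {
    "crm": [
        "Contact: anna@example.com, Berlin",
        "Contact: ben@example.com, Leeds",
    ],
    "legacy_archive": [
        "Invoice sent to Anna@Example.com",
        "No personal data here",
    ],
}
report = erase_subject(stores, "anna@example.com")
print(report)
print(stores)
```

The returned report gives the IT department a record of where the data was found, which is the kind of audit trail a manual search across fragmented locations struggles to produce.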

This is where e‑disclosure tech­nol­o­gy, with its abil­i­ty to search through a galaxy of data at warp speed, could make all the dif­fer­ence.

Mr Mose­ley adds: “E‑discovery tools will become a crit­i­cal busi­ness com­pet­i­tive edge. It is about hav­ing an auto­mat­ed work­flow. If you have 100 peo­ple ask­ing for the same thing, why have a man­u­al process?

“IT depart­ments are being expect­ed to do more with less. Unless you bring in the e‑tools, unfor­tu­nate­ly you will suf­fer the con­se­quences.”
