A Guide to E-Discovery Technology and Collection


I’m sure you’ve heard it before—data is growing at an unbelievable pace. The amount of data existing today will double by 2020. A few years ago a terabyte (TB) sounded like an unfathomable amount of data. Recently, I heard a company describe its data storage in terms of petabytes (1 petabyte = 1,000 terabytes).

This explosion in data growth can be attributed to the availability of larger, cheaper storage. You can buy a 2 TB external hard drive now for $100. Even in large companies, it’s fairly economical to add network storage arrays for a few thousand dollars, which could add 20 terabytes or more of storage. Assuming an average of roughly 1 MB per document, 20 TB works out to somewhere in the neighborhood of 20 million documents.
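The arithmetic behind an estimate like this can be sketched in a few lines of Python. The 1 MB average document size is an assumption made purely for illustration, and the function name is hypothetical; real collections vary widely depending on the mix of file types.

```python
# Rough estimate of how many documents fit in a given amount of storage.
# ASSUMPTION: an average document size of 1 MB, chosen for illustration only.

BYTES_PER_MB = 1_000_000
BYTES_PER_TB = 1_000_000_000_000  # decimal units, consistent with 1 PB = 1,000 TB


def estimate_document_count(storage_tb: float, avg_doc_mb: float = 1.0) -> int:
    """Return the approximate number of documents that fit in storage_tb."""
    if avg_doc_mb <= 0:
        raise ValueError("avg_doc_mb must be positive")
    total_bytes = storage_tb * BYTES_PER_TB
    return int(total_bytes // (avg_doc_mb * BYTES_PER_MB))


if __name__ == "__main__":
    # 20 TB at an assumed 1 MB per document
    print(estimate_document_count(20))
    # A smaller assumed average size means more documents in the same space
    print(estimate_document_count(20, avg_doc_mb=0.5))
```

Changing the assumed average size shows how sensitive the document count is to the kind of data being stored.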

If the growing volume of data on external and networked devices weren’t bad enough, now we need to consider cloud storage. Cloud storage requires little to no additional infrastructure, and it’s ridiculously cheap. Take a look at the new price model for Dropbox (dropbox.com): $15 per user, per month, with unlimited storage. Its ease of use makes it very desirable to employees whose strict network policies don’t allow local data to be shared easily. (Cloud storage also creates a whole new set of legal issues involving privacy and confidentiality, but that is a discussion for a whole other article.)

Big Data for Firms of All Sizes

The issue of data growth is not limited to medium or large corporations. Don’t think that just because you deal with smaller clients, you’re going to have less of a challenge with collecting electronically stored information (ESI). As a matter of fact, smaller clients may be significantly more challenging because they lack the policies, standards, and support staff that you find in larger companies.

As an example, I owned a small consulting company a few years ago—I was the only employee. I used a MacBook Pro and a Windows PC, both with a subscription to Microsoft Office 365 with Exchange e-mail. With that subscription I was allowed 10 GB of Microsoft SkyDrive, where I stored company documentation and other important materials. I also had a 100 GB Box.com account that I used to share data with clients. All my accounting records were stored online in QuickBooks. And to top it off, I had a hosted phone system, which stored all my call records and voice mail online. All told, I was probably using somewhere in the neighborhood of 250 GB of total storage across these various platforms—and that didn’t include the several external hard drives connected to my machines that had various backups from the above sources. Now, I never considered what the e-discovery implications would have been had I been involved in a legal matter—I was just looking for the best functionality at the lowest price to run my business.

Larger clients may not suffer from the same problems that small business owners face, but they have a whole different set of challenges. For instance, BYOD (bring your own device) is gaining popularity in enterprises of all sizes. According to Gartner, 38 percent of companies are planning to stop issuing devices to workers by 2016. The challenge here is that companies may have to deal with several different technology platforms and applications. Not only that, but technology is allowing users to work remotely, with some workers rarely or never working from their company offices. This makes collecting much more difficult because you’re now relying on the availability of that device and the bandwidth of its connection.

Proactive E-Discovery Solutions

Now that we’ve raised the questions, it’s time to start discussing solutions. When I look at the e-discovery technology landscape and consider the issues, the answer is very clear: We need to stop thinking of e-discovery as a reactive process. There is too much data and too many potential data sources to continue to deal with them on a matter-by-matter basis or “as the need arises.” Many of the e-discovery tools and procedures that we’ve relied on over the years are now antiquated. We need to adapt our tools and mindset to address the growing amount of data and various storage devices. In order to address today’s data challenges, we need to start taking a proactive approach to e-discovery collections.

Understanding the technology your client utilizes is an essential first step. You can’t begin to advise your clients until you fully understand their technology infrastructure and the policies that govern it. You don’t need to wait until litigation to start building this profile. Understanding the technology before litigation lets you get started much more quickly and spares you legwork at the outset of a matter that could have been done beforehand. This can be accomplished with a face-to-face meeting with the client and key members of the technology team to draft a comprehensive e-discovery response plan, or it can be as simple as a questionnaire with some high-level questions to give you a basic understanding of their technology. The approach you take will largely depend on the client’s size and type of business. Understanding where data is stored, retention policies, backup schedule, data disposition schedule, and other factors will be important components of this plan.
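One way to keep such a profile usable after the meeting or questionnaire is to record it as structured data. The following is a minimal sketch, not a standard format: the class names, fields, and sample entries are all hypothetical and would need to be adapted to each client.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    """One place where a client stores potentially relevant ESI (hypothetical schema)."""
    name: str                              # e.g. "Exchange e-mail"
    location: str                          # "local", "network", or "cloud"
    retention_days: Optional[int] = None   # None means no policy on record
    has_backups: bool = False


@dataclass
class ClientProfile:
    """A client's technology profile, built before any litigation arises."""
    client: str
    sources: List[DataSource] = field(default_factory=list)

    def undocumented_retention(self) -> List[str]:
        """Names of sources with no documented retention policy."""
        return [s.name for s in self.sources if s.retention_days is None]

    def by_location(self, location: str) -> List[str]:
        """Names of sources stored in the given kind of location."""
        return [s.name for s in self.sources if s.location == location]


if __name__ == "__main__":
    # Sample entries are invented for illustration.
    profile = ClientProfile(
        client="Example Consulting (hypothetical)",
        sources=[
            DataSource("Exchange e-mail", "cloud", retention_days=365),
            DataSource("Box.com shared files", "cloud"),
            DataSource("External backup drive", "local", has_backups=True),
        ],
    )
    print(profile.undocumented_retention())
    print(profile.by_location("cloud"))
```

A record like this makes it easy to spot follow-up questions, such as which sources have no retention policy on file.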

Once you have an understanding of your clients and their technology, you can begin to build your team of experts. This team may consist of people within your own firm or vendors that specialize in different areas of e-discovery. Developing relationships with key e-discovery partners will be essential to successful ESI collections and productions. Knowing the strengths and weaknesses of these vendors will be important when recommending a vendor for your clients, which is why it’s important to start developing these relationships ahead of time. Most e-discovery vendors are happy to provide low-cost or no-cost pre-litigation consultation to your clients. This proactive approach ensures that you have the expertise to manage your clients and meet most challenges they may encounter. I can’t stress enough the importance of finding a good, reputable e-discovery vendor that you feel comfortable working with. They’ll be part of your team and should be a trusted advisor. You can’t expect to build trust overnight, so it’s important to develop that relationship before being involved in a matter.

After you get past information gathering and establishing vendor relationships, what’s left but to wait for litigation? There’s actually a lot more that can be done! Let’s reexamine some of the issues we outlined in the beginning—large amounts of data on laptops, desktops, networked devices, mobile devices, external hard drives, and in the cloud. Traditional e-discovery collection tools exist that would allow you to collect from any of these devices reactively. The problems with a reactive approach are numerous—most notably that it’s time consuming and it’s expensive. The math is simple—the more data you have, the more time and money it’s going to take to collect and process. This also translates to significantly higher costs and longer times for downstream e-discovery processing and review.

A proactive e-discovery approach is inevitable, so why not plan for it? Tools exist today that allow you to target data more narrowly by using keywords, dates, and document types directly on end devices. There are very few situations when it’s necessary to make an exact forensic copy of an entire data source today—especially without doing some kind of proactive filtering.
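To make the idea of targeted collection concrete, here is a minimal Python sketch that filters files by document type, modification date, and keyword. It is an illustration under simplifying assumptions, not a description of any particular collection product: the function name is hypothetical, it reads files as plain text only, and it does nothing to preserve metadata or chain of custody.

```python
from datetime import datetime
from pathlib import Path
from typing import Iterable, Iterator, Set


def find_responsive_files(
    root: Path,
    keywords: Iterable[str],
    extensions: Set[str],
    start: datetime,
    end: datetime,
) -> Iterator[Path]:
    """Yield files under root that match the type, date, and keyword filters."""
    lowered = [k.lower() for k in keywords]
    for path in root.rglob("*"):
        # Document-type filter: skip directories and unwanted extensions.
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        # Date filter: compare the last-modified time to the requested range.
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if not (start <= modified <= end):
            continue
        # Keyword filter: plain-text read only (ASSUMPTION); real tools
        # extract text from Office files, PDFs, e-mail stores, and so on.
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        if any(k in text for k in lowered):
            yield path
```

For example, calling `find_responsive_files(Path("shared"), ["contract"], {".txt"}, datetime(2013, 1, 1), datetime(2013, 12, 31))` would yield only the text files under a hypothetical `shared` folder that were modified in 2013 and mention "contract", rather than everything on the drive.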

If you’re familiar with cloud storage, you’ll know that when you upload a document to Box or Dropbox, its contents are automatically indexed. This makes search and retrieval much quicker. And many of these cloud storage providers have started to build e-discovery functionality into their standard offering. These cloud providers understand that being able to retrieve documents quickly and accurately (especially for litigation or regulatory purposes) is crucial to their ongoing success. Without fast, accurate results, it would be very difficult to achieve mass adoption of these tools.

So why don’t we take the same approach when it comes to physical data stores: laptops, desktops, file shares, and e-mail servers? Isn’t it just as important, if not more important, to collect quickly and accurately from these sources? The technology exists to create a fully searchable index with documents that are easily retrieved when needed from local computers or servers. There’s no need to address these types of storage purely in a reactive mode. Also, the cost of proactively addressing this data is a mere fraction of what it would cost to address reactively. In many instances clients will see a return on investment after a single legal event.
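The general idea behind a searchable index can be shown with a small inverted index, which maps each term to the documents that contain it. This is a teaching sketch only, with hypothetical function names and in-memory sample text; it is not how any specific cloud provider or e-discovery product is implemented.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Simple tokenizer: runs of lowercase letters and digits (ASSUMPTION:
# good enough for illustration; real indexers handle far more).
TOKEN = re.compile(r"[a-z0-9]+")


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercase term to the set of document IDs containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for term in TOKEN.findall(text.lower()):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], *terms: str) -> Set[str]:
    """Return IDs of documents that contain every one of the given terms."""
    if not terms:
        return set()
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings)


if __name__ == "__main__":
    # Sample documents are invented for illustration.
    docs = {
        "memo.txt": "Draft contract for the merger",
        "notes.txt": "Merger timeline and contacts",
    }
    idx = build_index(docs)
    print(search(idx, "merger", "contract"))
```

Because the index is built once, ahead of time, each later search is a quick lookup instead of a fresh scan of every file.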

Some of the difficulty lies in convincing clients to invest in legal technology before litigation. No one wants to think about worst-case scenarios or what to do if litigation arises, but that’s exactly the mindset that needs to be adopted. Litigation is inevitable, and we need to plan accordingly.

Changing Technology, Changing Responses

I’ve worked exclusively in the legal technology space for more than 13 years. What I’ve noticed is that when a technology catches on in the legal world, there’s a tendency to overuse it and to use it well beyond its effectiveness. Ten years ago, it was perfectly acceptable to forensically image an entire hard drive, without filtering, and then process the entire drive. Back then, hard drives, on average, were 40 GB or less, and there was much less data. As users, we actually had to monitor the data we stored and purge unnecessary data from time to time. Today, computers come standard with 1 TB drives, and external drives of the same size or larger cost around $100. There’s no need to delete anything anymore, which translates into much higher e-discovery costs.

Applying the same technology used to copy drives ten years ago is no longer acceptable. It ignores the tools now available, and “because that’s the way we’ve always done it!” is no longer a sufficient argument.


Nate Latessa is vice president of customer solutions at VeDISCOVERY, LLC (vediscovery.com), a Cleveland, Ohio–based software developer of VeAGENT, a new technology for identifying and collecting loose, unstructured, or semi-structured documents and e-mails.


Previously published in ABA’s GPSolo.
