Gimmal Blog

Cleansing Your Content: What a Novel Idea

Below is the second in a series of blog posts written by Carla Mulley, Vice President of Marketing at Concept Searching. Concept Searching and Gimmal are working together to offer more intelligent records management capabilities to organizations of all sizes.

Read this post to learn why it's important to eliminate redundant, obsolete, and trivial (ROT) information.

Migrating, or I should say moving, to the cloud seems like an easy out to many organizations. It’s free, isn’t it? No. Well, it’s cheap. No again. I hate to burst bubbles, but cloud storage certainly isn’t free, and it isn’t cheap. But that’s the prevalent mentality. IBM claims that 85% of your data is unstructured, and IDG indicates it is growing at 62% per year. Look at the numbers. The compound annual growth rate for unstructured data far surpasses the rate at which storage costs are falling.
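To put rough numbers on that comparison, here is a minimal Python sketch. The 62% growth figure is the one quoted above; the 20% annual price decline is purely a hypothetical placeholder for illustration, not a figure from IBM, IDG, or any vendor, and the function names are made up for this example.

```python
def yearly_cost_multiplier(growth_rate, price_decline_rate):
    """Factor by which the storage bill changes in one year.

    Data volume grows by growth_rate while the unit price
    falls by price_decline_rate.
    """
    return (1 + growth_rate) * (1 - price_decline_rate)


def break_even_price_decline(growth_rate):
    """Annual price decline needed just to keep the bill flat."""
    return 1 - 1 / (1 + growth_rate)


DATA_GROWTH = 0.62       # 62% per year, as quoted in the post
PRICE_DECLINE = 0.20     # ASSUMPTION: hypothetical 20% yearly price drop

multiplier = yearly_cost_multiplier(DATA_GROWTH, PRICE_DECLINE)
needed = break_even_price_decline(DATA_GROWTH)

print(f"Bill changes by a factor of {multiplier:.3f} each year")
print(f"Prices would need to fall {needed:.1%} a year to break even")
```

Under that assumed price decline the bill still grows by roughly 30% a year, and prices would have to fall by about 38% annually just to stand still. Swap in your own provider's numbers to see where you land.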

In the scenario of moving content to the cloud, or if you have moved content to the cloud, the prevailing theory is that if you don’t adversely impact the data being moved or impact productivity, then the migration is judged a success. Wrong again. The majority of unstructured data is neither managed nor tracked. Less than 1% is even analyzed. Which leads us to the key question: what exactly does it contain and why on earth are you moving it, did you move it, or why are you keeping it?

Most likely you moved garbage, or ROT. All of it? Well, no. Migrating content using the forklift approach leads to moving, and ultimately paying for the storage of, unknown privacy exposures, undeclared records, and the usual assortment of duplicates and stale information, along with unknown content of value. Whatever you can imagine is most likely being moved. For those organizations that depend on the owner of the content to keep what’s business critical and archive or delete the rest, I think we all know that won’t happen. We, as humans with our possessiveness of our data, just don’t like to get rid of it – ever, myself included.

Let’s assume that your organization has no qualms about paying to store this data, regardless of where it resides. This approach carries great risk and, ultimately, the costs can far exceed just the costs of storage.

Cleansing your unstructured data before migration means you are finally in control and are proactively alleviating organizational risk. The ability to automatically generate multi-term metadata is a key enabler. Multi-term metadata can consist of up to, say, five or so terms that represent a subject, topic, or concept. Once content is auto-classified to one or more taxonomies, you have a highly granular inventory of what is being stored in your file shares, SharePoint, Exchange, ECM systems – basically any repository. Records management professionals and domain experts can then make the determination to keep it, archive it, or just plain old delete it.

It’s very straightforward. Let’s look at an example. We will assume you are looking for any privacy or sensitive information vulnerabilities that are unprotected in your corpus of content. A taxonomy is created that contains the types of privacy and sensitive phrases you are looking for. Sensitive information can be content that is defined by the organization, so there are no limits on the types of content you are seeking, and you can also use phrases. Content is then auto-classified and the exposures are identified. If you would like, you can automatically remove files from access and send them to a secure repository for disposition. This same approach can be used to identify and manage undeclared records. Compliance and governance processes can also be enforced, and defensible deletion can be provided with full audit capability. Of course, this cleansing of unstructured data also identifies duplicates, near duplicates, ROT, stale information – the usual suspects.
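For readers who like to see the idea in code, here is a deliberately simplified Python sketch of the privacy example. It uses plain phrase matching against a small taxonomy, which is a stand-in for illustration and not how the Concept Searching or Gimmal products work. The taxonomy categories, phrases, file names, and function names are all hypothetical.

```python
import re

# ASSUMPTION: a hypothetical taxonomy mapping category -> phrases to look for.
TAXONOMY = {
    "privacy/ssn": ["social security number", "ssn"],
    "privacy/health": ["patient record", "diagnosis"],
    "confidential": ["internal use only", "do not distribute"],
}


def classify(text, taxonomy=TAXONOMY):
    """Return the sorted taxonomy categories whose phrases occur in text."""
    lowered = text.lower()
    hits = []
    for category, phrases in taxonomy.items():
        for phrase in phrases:
            # Word boundaries keep "ssn" from matching inside a longer word.
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                hits.append(category)
                break  # one matching phrase is enough for this category
    return sorted(hits)


def find_exposures(documents, taxonomy=TAXONOMY):
    """Map each document name to its categories, skipping clean documents."""
    exposures = {}
    for name, text in documents.items():
        categories = classify(text, taxonomy)
        if categories:
            exposures[name] = categories
    return exposures


# Hypothetical sample corpus standing in for a file share.
documents = {
    "hr/onboarding.txt": "Please enter your Social Security Number on the form.",
    "clinic/notes.txt": "Patient record updated; diagnosis pending. Internal use only.",
    "misc/lunch.txt": "Team lunch is on Friday.",
}

exposures = find_exposures(documents)
for name, categories in exposures.items():
    print(name, "->", ", ".join(categories))
```

The resulting inventory is what a records manager would review: flagged files could then be routed to a secure repository for disposition, while clean files are left alone.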

Although this post has focused on migrating content, we recommend running this process a few times a year. One client reduced their server footprint from 56 to 4. Given the impact if these risks become a reality, leaving your content uncleansed just doesn’t seem worth it.

At the Concept Searching website, you can read how another client addressed privacy and eliminated one type of data breach, or about a healthcare client that faced strict regulatory guidelines for HIPAA compliance.

Concept Searching and Gimmal have joined forces to add multi-term metadata and classification to an already exceptional product. For more information, contact Concept Searching or Gimmal.

Carla Mulley
Carla Mulley has extensive experience from a career that includes senior management positions in marketing and sales, and she oversees all of the company’s marketing operations. Carla’s expertise includes developing results-oriented strategic and tactical marketing initiatives and creating new vertical and horizontal markets for technology solutions.
