Do Not Let Your Data Kill You – The Need for 3 R’s – Reduce, Recycle and Reuse

As the saying goes, anything in excess is a waste. Isn’t that true of information today? Information, or “data”, the four-letter word that best represents the digital world, has overwhelmed you, me and everyone else who passes through this space. It goes by various names: the popular “Big Data”, large or complex data, humongous data, and so on.

On average, company data has been growing at a rapid pace, about 100% or more every year. With social media users being highly active, data transactions have also multiplied manifold in real time. Although technical advances make it possible to store this data in large repositories, there is still a need to derive context, that is, meaningful information, in order to Reduce, Recycle and Reuse data. For example, companies would like to use their data to understand and interpret employee interactions, communications and client engagements. Data that is not used but occupies useful repository space is a costly waste and needs to be eliminated. Regulations also require companies to turn data into intelligent, statutory reports that can be audited easily if need be. Put into practice, the 3 R’s improve data management in a business environment:

Reduce:  Regulatory requirements for data, e.g. PCI data storage requirements or other information governance and compliance standards, require one to be circumspect before planning any reduction of data. Because cleaning up data is this challenging, large volumes of unused data accumulate, and users also save copies in local repositories that the IT team subsequently backs up.

So how does one reduce unused data? A Document Retention Policy, which specifies the criteria for holding or removing data, the process governing such decisions and the owners who implement and oversee it, is the first proactive step any company can take to ensure that only appropriate data is maintained. With a policy in place, the discipline to actually implement it enables a large reduction in unused data.
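To make the idea concrete, here is a minimal sketch of how retention criteria might be applied to a list of records. The categories, retention periods, the `Record` fields and the `split_by_policy` helper are all hypothetical, chosen only for illustration; a real policy would come from the company’s legal and compliance owners.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days per category (illustrative only).
RETENTION_DAYS = {"invoice": 7 * 365, "email": 2 * 365, "temp": 30}


@dataclass
class Record:
    name: str
    category: str
    last_used: date
    legal_hold: bool = False  # records on hold are never removed


def split_by_policy(records, today):
    """Return (keep, remove) lists according to RETENTION_DAYS."""
    keep, remove = [], []
    for record in records:
        limit = RETENTION_DAYS.get(record.category)
        # Unknown categories have no limit and are kept, to stay on the safe side.
        expired = limit is not None and (today - record.last_used) > timedelta(days=limit)
        if expired and not record.legal_hold:
            remove.append(record)
        else:
            keep.append(record)
    return keep, remove
```

Keeping records with an unknown category, or with a legal hold, reflects the need to be circumspect before deleting anything that a regulation might still require.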

Recycle:  Regulatory reporting is an important aspect of many industries. For example, in the US, the health industry is well regulated and its reports are mandatory, not only for the companies but also for the patients.  Taxation and financial obligations also require statutory reporting and audits. It is important for the data to be recycled and processed into useful reports for the auditors and the statutory authorities. Usually, intelligent software and ETL (extract, transform, load) techniques help in recycling such data.
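As a rough illustration of the ETL idea, the sketch below extracts rows from raw CSV text, transforms them by cleaning region names and summing amounts, and loads the result into a small report. The column names (`region`, `amount`), the sample data and the function names are assumptions made for this example and do not describe any particular tool.

```python
import csv
import io
from collections import defaultdict


def extract(csv_text):
    """Extract: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: normalize region names and total the amounts per region."""
    totals = defaultdict(float)
    for row in rows:
        region = row["region"].strip().title()
        totals[region] += float(row["amount"])
    return dict(totals)


def load(totals):
    """Load: write the totals as CSV report text, sorted by region."""
    lines = ["region,total"]
    for region in sorted(totals):
        lines.append(f"{region},{totals[region]:.2f}")
    return "\n".join(lines)


# Made-up input with inconsistent region spelling.
RAW = "region,amount\nnorth,100.50\nsouth,40\nNorth ,9.50\n"
report = load(transform(extract(RAW)))
```

Here the two differently spelled “north” rows end up in a single report line, which is the kind of clean-up an auditor-facing report typically needs.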

Reuse: The most interesting part of data management is the reuse of data. The world of Business Analytics and Business Intelligence offers options for deriving business insights from large data sets and reusing data intelligently. A new discipline, “Data Science”, has evolved in its own right and is prominently advocated by the Harvard Business Review. The HBR article by Thomas H. Davenport and D. J. Patil in fact refers to the job of a data scientist as the “sexiest job of the 21st century”.

A few terms often used for reuse of data are:

  • Data Science: A term that loosely covers a combination of computer science, analytics, statistics and data modeling. Although some companies have developed their own courses or certifications, it still needs to mature as a science with comprehensive tenets and elaborate literature.
  • Smart data: Smart data is usually a subset of Big Data with the noise filtered out. While Big Data is characterized by its variety, velocity and volume, smart data is usually characterized by velocity and value. Smart data is a key ingredient of intelligent BI reporting.
  • Predictive Analytics: Methodologies that apply machine learning techniques and statistical algorithms to data in order to predict future outcomes. Companies benefit from predictive analytics by deriving or planning important outcomes, e.g. revenue or profit, from past data.
  • Real Time Analytics: Analytics served in real time, e.g. stock prices moving up or down, updates on page views, sessions, bounce rates and page navigation, or advertisements dynamically adjusted to the type and frequency of customer usage.
  • Intelligent Decision Systems: Artificial intelligence used in association with data helps users derive the best, optimized decisions from a large number of input variables. While this area is still evolving, it can be used in a number of ways, such as building marketing systems that make offers to customers based on profile analysis, or blocking fraudulent transactions in credit card operations.
  • Data Visualization: Pictorial or graphical representation of data, presented intelligently and interactively, helps business professionals identify trends and patterns in their data, e.g. sales data by region or by customer profile.
  • Big Data Analytics: No discussion of data reuse is complete without the term Big Data. The concept of Big Data analytics has evolved from companies managing huge sets of data, such as oil or telecommunication companies, to social media such as Facebook, Twitter and LinkedIn that involve large data sets. This form of analytics helps us uncover hidden patterns, market trends, customer preferences, unknown correlations, etc.
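Of the terms above, predictive analytics is perhaps the easiest to illustrate. The sketch below fits a straight-line trend to past revenue with ordinary least squares and extends it one period ahead. The revenue figures are made up and `fit_line` is a hypothetical helper; real predictive models are usually far richer than a single trend line.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up past data: revenue (in millions) for quarters 1 to 4.
quarters = [1, 2, 3, 4]
revenue = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(quarters, revenue)
forecast_q5 = slope * 5 + intercept  # projected revenue for quarter 5
```

Because the sample data grows by exactly 2 per quarter, the fitted slope is 2 and the projection for quarter 5 is 18.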

Business data analytics is therefore in its infancy, to be nurtured, developed and evolved over the years. The attraction is immense, and so is the job of the Data Scientist!

Sudipta Choudhury
Sudipta Choudhury is a Technical / Business Writer with considerable industry experience in various domains. He can be reached at sudipta_choudhury@writeforvalue.com
