WHAT EXACTLY IS DATA SCIENCE?


DATA DYNAMICS

Anjul Bhambhri, vice president, Big Data, IBM, demystifies the emerging field of data science.


RECENT STATISTICS SHOW A SIGNIFICANT RISE IN JOBS IN DATA SCIENCE, AND HARVARD BUSINESS REVIEW HAS NAMED IT ONE OF THE TOP JOBS OF THE 21ST CENTURY. SO, WHAT EXACTLY IS DATA SCIENCE?


Data science is the mathematical mining of data for the discovery of new knowledge. It requires leveraging big data by using advanced analytical tools and machine learning algorithms to build more accurate predictive models.


WHAT IS BIG DATA?

From a technologist’s standpoint, big data is about extracting insights from data of immense volume, variety and velocity. At IBM, we define big data by three characteristics:

1) Scale of volume: petabytes and zettabytes instead of terabytes of data.
2) Wider variety: data beyond traditional sources, including machine data (logs, web logs, instrumentation data and network data) and social data.
3) High velocity: data generated by machines is multiplying quickly, and it contains valuable insights that need to be discovered.

WHAT DOES THE JOB OF A DATA SCIENTIST ENTAIL?

Data scientists are the change agents in an organisation who discover “new” knowledge by examining data that may have been inadvertently ignored and not leveraged in the decision-making process. They have sometimes been referred to as ‘renaissance men’ or even the ‘Sherlock Holmes of data’. They need the ability to take data from within and outside the enterprise, use advanced technologies to understand and process it, extract new insights from it and communicate those insights to various stakeholders. In essence, data scientists need an innate desire to explore data.

HOW IS THIS DIFFERENT FROM BUSINESS INTELLIGENCE?

Business intelligence involves database design, data warehousing, querying and reporting on data. It is all about managing, slicing and dicing, and reporting on business data in order to better manage the enterprise.
    Classic data science problems, by contrast, include risk management, fraud detection, catching outliers, detecting anomalies, predicting and preventing customer churn, getting a 360-degree view of the customer, email and call transcript analysis, optimising network operations and predicting maintenance windows, to name a few.

WHY IS THIS RELATIVELY NEW PROFESSION ATTRACTING SO MUCH INTEREST?

There has been a data explosion in the last two years. Every day, 2.5 quintillion bytes of data are generated. If you think of the world as a digital universe, then 90 per cent of the world’s data was created in the last two years. At the same time, there has been a breakthrough on the technology front, which is making it possible for businesses to analyse all of this data, not just subsets of it, and to base business decisions on it. This has created a new profession, that of the data scientist, a role that can make data the new oil for businesses.

    All industries are now hiring data scientists: the financial sector, banks, retail, insurance, healthcare, energy and utilities, marketing and media, academia, the intelligence community and so forth.


WHAT ARE THE QUALIFICATIONS ASPIRANTS NEED TO HAVE?

Data scientist skills are part science and part art. Aspirants usually require technical expertise in some scientific discipline, for example, a strong background in mathematics, statistics or machine learning. They also need subject matter expertise in the domain or business that they are working in.
    In addition, they need curiosity, along with the ability to go beneath the surface and distil a problem down into a very clear set of hypotheses. Data scientists need “out-of-the-box” thinking and the ability to look at a problem differently.
    Storytelling is another skill they need to have. This is the ability to use data to tell a story and to communicate it effectively.


ANYTHING ELSE YOU WOULD LIKE TO ADD?

The foundation for the cities of the future will be the network and the information it carries, enabling the delivery of vital services, from transportation, utilities and security to entertainment, education and healthcare. Everything will be connected, intelligent and green: offices, buildings, appliances, hospitals and schools. The list of possibilities is endless.
    Analytics on big data makes it possible to predict and solve problems that could not be solved before, and data science is the discipline that is enabling all this. A career as a data scientist will be very exciting and satisfying in the next decade.
    —As told to Ruchi Kumar



