Data science is a critical pillar of the Fourth Industrial Revolution, being a technology strand that seeks to improve our interactions with the massive amount of information available for decision making.
The field is relatively new in Kenya, but its popularity is fast spreading. Its adoption in 2020 will be inevitable for researchers, corporates, SMEs, startups and government agencies since it keeps growing into a very lucrative career path and useful analytic tool.
It is the new gold, yet related terms like Big Data, Machine Learning (ML), Deep Learning (DL) and Business Intelligence (BI) make the phrase trigger waves of migraines for many people.
But what exactly is data science?
Dr Lilian Awuor, a computer science lecturer at Maseno University’s School of Computing and Informatics, says that data science is a field that blends scientific algorithms and statistical models to unearth patterns from structured and unstructured data.
“It involves using scientific tools to explore data in order to discover patterns that wouldn’t be automatically visible. For example, a bank can implement a data science process to find out why it has high customer churn rates.
“It employs a combination of mathematics, statistics, programming, analytical skills, research and business knowledge. These are used to extract insights, learn trends and make predictions from data,” she explains, adding that aggregating, processing and cleaning data before it can be analysed is also part of data science.
But it should not be confused with Big Data, a buzzword usually misused in technology circles to mean data science.
“Big Data specialises in analysing data that is too large to be processed using a personal computer. The field has analysis tools that have been built specifically for processing humongous datasets like those held by banks, governments and telcos,” Timothy Oriedo, founder of data science training firm Predictive Analytics, expounds.
Dr Awuor says that the term Big Data simply means that there is data, and that data is big.
“It means tonnes of data stored and processed in databases, on hard disks and in the cloud. It exists in the form of text, figures, audio, images and videos.
“On cloud storage, this data is received and processed in real time, for instance, social media data that includes tweets, posts, clicks, comments, shares, reviews and time spent,” she says.
Machine Learning is another term in the mix. It is a branch of data science that applies algorithms and statistical models to automate tasks without explicitly programming the computer for each one.
“Given some input data, the algorithm learns from the patterns and is, therefore, able to generalise for new and unseen input data. For example, a sales forecasting algorithm for 2020 can be implemented using machine learning,” says Mr Oriedo.
According to Dr Awuor, it is one way of making intelligent systems or, in other words, the science of building systems that automatically learn and improve using data without following explicitly programmed commands.
ML uses sample data to build a mathematical model that ‘trains machines’ to make predictions or decisions without relying on human programming.
A term closely related to ML is deep learning, which is simply a form of machine learning that utilises artificial neural networks.
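The idea of learning from sample data can be illustrated with a minimal Python sketch of the sales forecasting example Mr Oriedo mentions. The monthly sales figures, the function names `fit_line` and `predict`, and the choice of a simple least-squares straight line are all assumptions made for illustration; a real forecasting system would use far more data and a richer model.

```python
# Hypothetical monthly sales figures, made up purely for illustration.
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def fit_line(xs, ys):
    """'Train' a straight-line model on sample data using least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(months, sales)   # learn the pattern from past months
forecast = predict(model, 7)      # generalise to month 7, never seen before
print(f"Forecast for month 7: {forecast:.1f}")
```

Nobody wrote a rule saying "add 10 each month"; the slope and intercept are worked out from the sample data, which is the sense in which the machine is said to learn.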
These neural networks are loosely modelled on the neurons of the human brain. DL is mainly used for very large datasets. Some of its applications are in computer vision for self-driving cars, text-to-speech and automatic object detection.
It works well when building intelligent systems that make use of unstructured data. In the oil and gas industry, for example, it is used to turn large quantities of seismic images into 3D maps, with the goal of making reservoir predictions more accurate.
“In Kenya, DL was used by the Meteorological Department to warn the country last month that we would experience heavier rainfall in December than in previous years. And the floods and mudslides being witnessed just prove its accuracy,” Prof Bitange Ndemo of the University of Nairobi’s Business School tells Digital Business.
A different branch of data science is Business Intelligence (BI), which encompasses a set of tools, processes, methodologies and strategies that are implemented in order to provide useful insights to an organisation.
“These insights are critical for running the day-to-day operations of a business as well as decision making. An example is the building of dashboards that show the Key Performance Indicators of an organisation,” says Mr Oriedo.
Dr Awuor says that BI is about using available data to make informed decisions in a business setting.
“Recently, we were visualising health data for a given county. One particular ward had a high infant mortality rate. Looking deeper into the data, we realised that only half of the children were being fully immunised.
“It was also realised that most homesteads were far away from health facilities, so families found it hard to bring children for immunisation. The proposed solution was to have more community health workers reaching out to the homesteads and vaccinating the children. This is BI in action.”
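The kind of drill-down Dr Awuor describes can be sketched in a few lines of Python. The records, the ward names, the 60 per cent threshold and the function names below are hypothetical and are not the county's actual data or method; the sketch only shows how a per-ward immunisation rate might be computed and used to flag wards for follow-up.

```python
from collections import defaultdict

# Hypothetical child health records, invented for illustration only.
records = [
    {"ward": "Ward A", "fully_immunised": True},
    {"ward": "Ward A", "fully_immunised": False},
    {"ward": "Ward A", "fully_immunised": True},
    {"ward": "Ward A", "fully_immunised": False},
    {"ward": "Ward B", "fully_immunised": True},
    {"ward": "Ward B", "fully_immunised": True},
    {"ward": "Ward B", "fully_immunised": True},
    {"ward": "Ward B", "fully_immunised": False},
]


def immunisation_rate_by_ward(rows):
    """Return the share of fully immunised children in each ward."""
    totals = defaultdict(int)
    immunised = defaultdict(int)
    for row in rows:
        totals[row["ward"]] += 1
        if row["fully_immunised"]:
            immunised[row["ward"]] += 1
    return {ward: immunised[ward] / totals[ward] for ward in totals}


def wards_below(rates, threshold):
    """List wards whose immunisation rate falls below the threshold."""
    return sorted(ward for ward, rate in rates.items() if rate < threshold)


rates = immunisation_rate_by_ward(records)
flagged = wards_below(rates, threshold=0.6)  # 0.6 is an assumed cut-off
print(rates)
print("Wards needing outreach:", flagged)
```

A dashboard would present the same figures as charts, but the underlying step is this sort of aggregation followed by a comparison against a target.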
The use of data science is already changing the economic lives of Kenyans in all spheres. From taxi hailing, medicare apps, fintech and e-commerce to agritech and emergency services, data science has been deployed to support Kenya’s robust digital economy.
But a gap in data regulation had led to misuse of customer data.
“The new data law means integrity in the collection and processing of individual data will be a must. Large companies that have been mining customer data illegally are now compelled to comply. Data regulation has just come at the right time in Kenya. Expect more data science adoption in 2020,” says Mr Andrew Mukabana, who has been a data scientist for seven years.
He predicts that data scientists will be in high demand as the volume of data grows in 2020 and companies struggle to handle data privacy issues.
“I foresee a situation where companies will pump millions into data science. Human resource departments will be forced to create data departments headed by a Chief Data Officer. There will be a data strategy whose implementation cost will outweigh that of marketing.”