Razor Insights

The Rise of Small Data in Artificial Intelligence and Machine Learning

Written by Jamie Hinton
Published on
Comprised of large data sets analysed to reveal patterns and associations, big data enables organisations to make predictions and identify key trends. It seems like the perfect business tool, but is it really all it’s made out to be? Although many large corporations have opted to invest in big data, could it be that the real enabler for many businesses is actually more manageable small data?

What is big data?

Big data involves collecting huge amounts of information and using expert data scientists, advanced analytical tools and financial investment to provide accurate and reliable insight that informs business decisions. This sounds like a fantastic way for businesses to gain valuable insights, but in reality many businesses don’t have the resources required to make big data work, so it’s not something they could realistically or effectively employ.

There are also a number of issues surrounding big data, which call into question its reliability. One key problem is overfitting. The information associated with big data is so vast that Artificial Intelligence (AI) is necessary in order to analyse it. In order to help find patterns and trends in data, machine learning systems are fed large amounts of information – or training data.

Overfitting occurs when these systems become over-influenced by the data they have taken in, learning its quirks and noise as well as its genuine trends, and then look for the same patterns in unseen data that they found in the training data. Because overfitting impairs the systems’ ability to generalise, they make predictions which may not be accurate or realistic. Understandably, this can cause problems, especially for businesses that rely on this information to inform their activity and decisions.

Unfortunately, when programming systems and training them to pick up trends and patterns, there’s no way of knowing how well they will perform, or whether they have been ‘overfit’, until they are tested on data they have not seen. Big data involves a lot of investment of time, money and resources, which can be a barrier for some businesses, especially when we factor in the risk of the results being inaccurate.
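To make the idea concrete, here is a minimal sketch in Python of how testing on unseen data exposes overfitting. Everything in it is an assumption made for illustration: the data is randomly generated around a straight-line trend, and the "memoriser" is a deliberately extreme model that simply repeats the answer from the closest training example.

```python
import random

def make_data(xs, rng, noise_sd=2.0):
    """Illustrative data only: a straight-line trend plus random noise."""
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, noise_sd)) for x in xs]

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memoriser(points):
    """An overfit model: return the y value of the nearest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mean_squared_error(model, points):
    """Average squared gap between the model's predictions and the data."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
train = make_data([i * 0.5 for i in range(20)], rng)
test = make_data([i * 0.5 + 0.2 for i in range(20)], rng)  # unseen data

line = fit_line(train)
memoriser = fit_memoriser(train)

for name, model in [("line", line), ("memoriser", memoriser)]:
    print(name,
          "| training error:", round(mean_squared_error(model, train), 2),
          "| test error:", round(mean_squared_error(model, test), 2))
```

The memoriser scores a perfect zero on the data it was trained on, which looks impressive until it meets the unseen test set, where its error is no longer zero. The simpler straight-line model makes some errors on both sets, and the gap between its training and test scores will typically be much smaller. Only the held-out test reveals the difference.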

A key example of big data missing the mark is Google Flu Trends. Launched in 2008, it boldly claimed to predict the flu based on how many people were searching for what were deemed to be flu-related terms or information. By tuning the search data to tracking information logged by the Centers for Disease Control and Prevention, Google believed it could effectively produce estimates of when the flu was likely to hit and how many cases there would be. In 2013, Google’s prediction was wrong by a humiliating 140%.

Big data can help to identify correlations between different sets of information, which is of course useful, but that alone isn’t enough for us to establish causation and make predictions, as Google tried to do. Google’s failure is a prime example of what can happen when big data goes wrong.
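As a rough illustration of why correlation alone is not enough, the Python sketch below measures how closely two monthly series move together. The numbers are invented for the example, and the choice of basketball searches as a seasonal but flu-unrelated term is a hypothetical: both series simply peak in winter.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up monthly figures, January to December (not real data).
flu_cases = [9, 7, 4, 2, 1, 1, 1, 2, 3, 5, 7, 9]
basketball_searches = [90, 80, 50, 20, 5, 0, 0, 5, 20, 50, 80, 95]

r = pearson(flu_cases, basketball_searches)
print(f"correlation: {r:.2f}")
```

The correlation comes out very close to 1, yet nobody would argue that basketball causes flu. Both series are driven by the time of year, so a system that relies on the pattern could fail as soon as the seasons and the searches fall out of step.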

There have also been warnings that the increased use of machine learning is generating misleading results. Many of those using machine learning are not applying the correct techniques, so results increasingly show only the trends present in the training data, rather than patterns present in the real world. If techniques do not improve, more and more inaccurate insights will be generated.

But what about small data?

Small data is often overlooked and under-adopted in favour of big data. Big data is seen as providing high-tech, wholesale change and deeper insights. Because it has been adopted by big corporations, many businesses consider it hugely transformative. But quantity doesn’t always mean quality. By employing small data, businesses can make smaller, gradual improvements and changes, and for many businesses, small data holds the most potential.

Small data is information in a volume and format small enough for human comprehension and processing. Although more manageable than big data, it still provides useful insights into usage, trends and attitudes.

Here, advanced and complex technology is not necessarily needed in order to make the information accessible, informative and actionable. Examples of small data include inventory records, search history and usage reports: information that we as humans can comprehend and use.
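As a sketch of how little technology small data can need, the Python below turns a handful of inventory records into a reorder list. The items, field names and two-week threshold are all hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical inventory records; names and figures are illustrative only.
inventory = [
    {"item": "Widget", "in_stock": 12, "weekly_sales": 5},
    {"item": "Gadget", "in_stock": 3, "weekly_sales": 4},
    {"item": "Sprocket", "in_stock": 40, "weekly_sales": 0},
    {"item": "Bracket", "in_stock": 6, "weekly_sales": 4},
]

def weeks_of_cover(record):
    """How many weeks current stock would last at the current sales rate."""
    if record["weekly_sales"] == 0:
        return float("inf")  # nothing is selling, so stock never runs out
    return record["in_stock"] / record["weekly_sales"]

# Assumed rule of thumb: reorder anything with under two weeks of stock.
REORDER_THRESHOLD_WEEKS = 2
to_reorder = [r["item"] for r in inventory
              if weeks_of_cover(r) < REORDER_THRESHOLD_WEEKS]

print("Reorder soon:", ", ".join(to_reorder))
```

The whole data set fits on one screen, the rule is one a person could check by hand, and the output is something a team can act on straight away.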

Many small or medium-sized businesses will never really need to use large databases or feel the full benefit of employing big data. However, for most businesses, small data already exists and is collected on a day-to-day basis, so it makes sense to make the most of it. When utilising your small data, there are a number of things to take into consideration to help make your organisation more data-led.

You must have the right technology in place.

Although small data is much more manageable than big data, technology is still useful in order to make the most of the information you have. We have seen this at Razor in two recent projects, one for Age UK and the other for TechQuarters. Using different software and platforms, you can harvest, extract and report data more effectively.

Transparency is important for making effective use of the data that your business collects. Collecting information obviously has implications for data protection, but by ensuring that all necessary staff have access to the data you hold, you can use it to inform business and operational decisions that improve and drive performance.

It’s also necessary to set measurable goals, not only for what kind of data you want to collect (and how much), but also for what exactly you want to achieve by collecting, using and analysing it. This will ensure that everyone in your business knows what they’re working towards, and it enables you to hold your team accountable, so that you know these insights are being used effectively, making your investment worthwhile.

There has been much more of a buzz around big data than small, due in great part to the misconception that larger amounts of data bring more meaningful, in-depth insight. However, it’s clear that for the vast majority of businesses, small data is much more useful, achievable and beneficial. As more organisations begin to realise this, will we see the rise of small data?