Historical evolution of analytics

17 September 2019

The earliest examples of analyzing statistics in business date from the late 19th century, when Frederick Winslow Taylor carried out his time management exercises. Henry Ford likewise took an interest in his assembly lines and how well they performed. The breakthrough came in the 1970s, when Edgar F. Codd’s relational model led to relational databases, which hold mostly structured data and let users retrieve it by writing SQL. These kinds of databases and SQL are still very much in use today.

With the Internet came a big need to store many different kinds of data, and this was not a strong area for relational databases, which are not very good at making unstructured data searchable. NoSQL databases followed this development with their ease of use and flexibility. NoSQL systems then led into the era of big data and everything around it. As the Internet kept growing, many organizations went online, which gave them a perfect opportunity to start collecting data. By the late 2000s there was a great need for software that could serve as a platform for all the data gathered. This is also when business analytics started to get really popular. With all this data, organizations started to improve their businesses by basing decisions on data, which is called business intelligence. Business intelligence had been known since the late 1980s, but it had not properly caught on and was waiting for the right tools.

Currently the latest analytics tools are in the cloud. Cloud providers rent their services to you rather than selling them outright. They offer data warehouses that scale almost without limit, along with tools for data mining and other analytic endeavours.

Business analytics is more about inspecting the data to find out why something happened, and even using it to predict or assume something about the future. Business intelligence is about looking at the data to see the events that led up to something, and not so much about why those events took place.
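The difference can be made concrete with a small Python sketch. The monthly sales figures are invented for illustration, and the straight-line trend is only one simple way to project forward, not a recommended forecasting method. The first part summarizes what happened (the business intelligence view). The second part fits a trend and estimates the next month (the business analytics view).

```python
# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [100, 110, 120, 130]

# Business intelligence view: describe what happened.
total_sales = sum(monthly_sales)
best_month = monthly_sales.index(max(monthly_sales)) + 1  # 1-based month number


def fit_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Business analytics view: use the trend to estimate the next month.
slope, intercept = fit_line(monthly_sales)
next_month_estimate = slope * len(monthly_sales) + intercept

print(total_sales, best_month, next_month_estimate)
```

With these made-up numbers the trend rises by 10 per month, so the estimate for the fifth month is 140.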

What is data visualization?

Visualizing the collected data is the easiest way to understand it. When an organization has a lot of data and wants to explore it or simply “see it”, visualization is the easiest way to do so. This usually means creating graphs, maps or charts. The latest ways of visualizing data tend to be rich in information: the graphs are interactive and can even update while you look at them, and some allow you to drill into deeper analysis.

Visualization can be done with the well-known Microsoft Excel, but this is a rather old and dull way of representing data. Now there are business intelligence platforms such as Tableau or Qlik, which focus heavily on visualization and on letting the user pick and choose the data that is interesting. Visualization is important to humans, and especially to data scientists, since it is easier for us to recognize from a graph whether something is going as planned than from the raw data.

Dashboards are everywhere now, and new use cases appear all the time. Some businesses might be looking at the click-through rate of a webpage, and some might be looking at the company’s performance indicators. It is increasingly popular to automate data visualization and let the system create the graphs and pictures for you. The systems behind the dashboards have started to understand different kinds of data and are usually ready to suggest the insights the user should be looking at in their dashboard.

What is machine learning?

Machine learning is a way to make systems learn from their experiences. In other words, it’s a way to use data to make predictions.

To teach a machine, one needs a lot of data. This data is fed into software that runs an algorithm, which tries to figure out the wanted outcome. The algorithm may have been given instructions about the subject matter and how to react to different cases in the data. Or it may be completely free of rules, in which case its purpose is just to find patterns in the given data.

Features in machine learning are the known facts about the subject, and output variables are the facts the model should work out from these features. An instance is a combination of the two: a single row of the training data.
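As a rough illustration, these three terms could be represented in Python as follows. The `Instance` class, the feature names and the values are all hypothetical, chosen only to show how features and an output variable sit together in one row of training data.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One row of training data: the features plus the known output."""

    features: dict  # facts about the subject, e.g. {"weight_kg": 4.0}
    output: str  # the value the model should learn to predict


# Hypothetical training data; the feature names and values are invented.
training_data = [
    Instance(features={"weight_kg": 4.0, "age_years": 3}, output="cat"),
    Instance(features={"weight_kg": 30.0, "age_years": 5}, output="dog"),
]

# Every instance shares the same feature names but has its own output.
feature_names = sorted(training_data[0].features)
labels = [instance.output for instance in training_data]
```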

For example, you could have a vast collection of pictures of animals with some data attached to them. Some are pictures of a cat and the others are pictures of a dog. The images come with their features, such as the breed of the animal, where it’s from, how old it is, its shape and its colors. The output variable is then whether the picture shows a cat or a dog. These images, their features and the output data are called the training data. The machine goes through this data and builds some kind of an answer (a model) of what a cat might look like. When you then enter another picture into the algorithm, the machine can try to guess, based on the earlier data, whether it is a picture of a cat or not.
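The idea of guessing from earlier data can be sketched in a few lines of Python. This is a deliberately simplified stand-in: it assumes each animal has already been reduced to two invented numeric features (weight and ear length) instead of a real image, and it uses a nearest-neighbour rule, which is just one of many possible algorithms. Here the "model" is simply the stored training data.

```python
import math

# Hypothetical training data: (weight in kg, ear length in cm) -> label.
training_data = [
    ((4.0, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((25.0, 12.0), "dog"),
    ((30.0, 14.0), "dog"),
]


def predict(features):
    """Return the label of the training instance closest to `features`."""
    nearest = min(training_data, key=lambda item: math.dist(item[0], features))
    return nearest[1]


# A new, unlabeled animal: small and light, so it lands next to the cats.
guess = predict((4.5, 6.5))
```

The new animal is closest to the two cat examples, so `guess` is `"cat"`. A real image classifier would learn from pixel data and need far more examples, but the flow is the same: training data goes in, and a prediction comes out for something the machine has not seen before.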

Machine learning is an important and developing part of business analytics because it is a form of artificial intelligence that can help with decision making within the business. A business might want to know what a certain customer is likely to do next on a website. The business could build a machine learning model from all of its previous users’ behavior. The next time a customer visits the website, that customer’s behavior is fed into the model, and the website guides the user towards whatever the model predicts they would want to see next. These kinds of applications make machine learning a powerful tool within business analytics.
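A very small version of such a next-step predictor could look like the sketch below. The page names and sessions are invented, and counting which page most often follows another is only one simple approach. A production recommender would typically use much richer behavior data.

```python
from collections import Counter, defaultdict

# Hypothetical browsing sessions: the pages each visitor viewed, in order.
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "blog"],
    ["products", "cart", "checkout"],
]

# "Training": count how often each page was followed by each other page.
next_page_counts = defaultdict(Counter)
for pages in sessions:
    for current, following in zip(pages, pages[1:]):
        next_page_counts[current][following] += 1


def predict_next(page):
    """Return the most common follow-up page, or None if the page is unseen."""
    if page not in next_page_counts:
        return None
    return next_page_counts[page].most_common(1)[0][0]


suggestion = predict_next("home")
```

In this made-up data, two of the three visitors who were on `home` went to `products` next, so that is the page the site would suggest.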