It is exciting to see the growth in visual analysis tools like Tableau and Microsoft PowerBI. What challenges do these tools face in the data analytics arena, and are they suitable for more interesting modelling and exploration?
The likes of Tableau, SAS and PowerBI are lowering the barriers of cost, time and expertise to producing stunning visualisations and helping to drive critical business decisions. They embody the movement towards self-service BI and drag-and-drop analytics. There is a subtle, transformative undercurrent at play here: data isn’t just for nerds! These products are fundamentally nurturing data literacy, critical thinking and curiosity among the masses. But are they perhaps limited in the types of questions they can answer?
I-sah Hsieh (SAS) recently blogged about the power and possibility of data visualisation in his post, “Big data lessons learned from visualising 27 years of international trade data”. It is a nice, practical summary that captures some of the possibilities and challenges in the world of data visualisation. I quite liked the flow from obvious observations, such as “the world tends to import most things from China” (Hsieh 2015), to the more interesting discovery of high livestock exports between Ethiopia and Somalia and the observation of a changing import/export relationship between Australia and China. These serve as great examples of the questions and insights that effective visualisation might drive.
I wonder about some of the obvious secondary questions, for example:
In these cases, simple summaries or interactive displays may not be enough to fully explore the questions, and more complex models may need to be developed. It is here, I believe, that we begin to see the difference between a reporting specialist and an analyst: an analyst will commonly explore hidden structure and trends in data to help answer such questions.
Arguably, R, Python and SAS are three tools favoured by analysts to model and explore underlying relationships. SAS holds a strong share of the enterprise market, and even Microsoft’s own internal data science team are known to use R. So where does this leave technologies like Tableau or PowerBI? Are they simply visual reporting tools? It is good to see that neither Tableau nor Microsoft has its head buried in the sand: both companies are expanding and developing products (specifically Tableau Desktop and Azure Machine Learning) to integrate seamlessly with R. In the case of Tableau, perhaps this is a move away from the intuitive, user-friendly “analytics for everyone” ethos, but I believe it is an important move if they are to continue to enjoy a strong market position.
So what are the future challenges for data analytics? Here I am deliberately avoiding the challenges of Big Data Analytics, as these are an entirely different kettle of fish. But as we consider the current state of analytics, a few key common elements stand out:
Outside of the world of truly massive data, the majority of business analytics is performed in-memory on single machines. But there is still a need to integrate with data stored in distributed systems, and to collect, clean and collate it into a working data set for analysis. In fact, this is no stretch at all, considering that there are existing facilities in Tableau and SAS to do exactly this. It will be interesting to see whether Tableau and SAS adapt their architectures to push the data processing out across distributed storage in a similar fashion to Hortonworks or MammothDB. From a technical perspective, I think this is the obvious challenge on the horizon for tools like Tableau and PowerBI. I guess we will have to wait and see…