Are your analytics siloed, as well as your data?

I love watching my clients move forward in their data journey. Recently, I've become very interested in Big Data analytics, a convergence of the strands of my consulting work: cloud architecture, Big Data, business intelligence and data visualisation. I'm always looking for new ways to help customers at a strategic level as they grow into needing Big Data analytics. For this reason, I got interested in Datameer as a tool aimed at making the Big Data journey easier for the organisation, whilst focusing on the business user. I am intrigued by Datameer's approach of being born in Hadoop, yet aimed at producing business acuity from Big Data. So, when I found out that Datameer had released Datameer 6, I took some time to find out more.

In organisations, it's commonly understood that data is mired in silos, usually within the boundaries of the lines of business (LOBs). What's less understood, however, is that analytics can be siloed in the same way as the data itself. To illustrate this issue, let's take a very basic business question: what's the impact of a 15% increase in sales across the organisation? This question is subtly deceptive, and its hidden complexity requires a horizontal analysis across the organisation, rather than a focused, vertical analysis of one department. As the data descends down the sinkhole of the LOB, so do the resulting analytics.

It's easy to blame the technology for the hidden complexity. We've all heard the trope about the difficulties in obtaining the 'single version of the truth'. It's true that business users tend to have difficulty in using complex technology to create a prism through which to view each business question. What's less clear, however, is how the technology is selected, and how far that selection is a manifestation of the business processes in place. There is a world of hype out there – Big Data? Fast Data? The industry is moving so fast, it’s hard to know what technology to throw your career behind, and business users are not usually front and centre in the technology decision-making process. In the Big Data world, there exists a strange dichotomy: tools are either written specifically for Hadoop or built for business users, but not both. Whilst the Hadoop tools give developers a great scripting experience, they do not come with the features needed for enterprise usability. On the other hand, Business Intelligence tools such as MicroStrategy, Tableau or Microsoft's Power BI have a lot more usability, but they have been retrofitted to work with Hadoop, usually via a Hive ODBC driver, or perhaps via Spark SQL. In between the Hadoop tools and the BI stack, there is a vacuum that needs to be filled to accommodate the business users' need to analyse Big Data.

The fast-moving world of data is not kind to businesses. Currently, the industry has a habit of making one engine obsolete in favour of another before it has gained traction, and the resulting array of choice, and perceived attrition, can lead to much confusion. Organisations can find that their naturally occurring data silos are compounded by the differing technology choices within each LOB, in addition to the disparate data sets which are studded throughout the business estate. For example, one LOB may use Spark for analysing data, whilst another may use Python. In addition to data sets being disjointed throughout the organisation, the disconnected technology choices occlude the horizontal view of the analytics across the enterprise.

In the world of data, we have crossed the Rubicon into territory that now includes analytics, and businesses are generally playing catch-up with themselves and each other. The glittering constellation of technology choices for Big Data can lead to data fatigue amongst decision makers. The perceived choices appear like misplaced atoms in an unknown universe of mysterious possibilities, which promise business value and insights as well as a return on investment. In this environment, nobody got fired for not making a decision, and this data fatigue can lead to decision makers waiting until the industry cools into a settled state, as if by simulated annealing, before making a decision. How is it possible to show business value in an environment where change is the only constant, whilst placing the business user front and centre of business-oriented Big Data analytics and tackling the hidden issue of siloed analytics?

Enter Datameer, which offers a middle ground where business users can meet Hadoop. Datameer has over sixty-five connectors to fuse data together, and what's more, these tools are designed to help business users self-serve analytics.

The true insight comes from Datameer's ability to remain agnostic to the Big Data engine driving the analytics, whether that is MapReduce, Tez, Spark, or perhaps some future Big Data engine. This avoids a false choice between execution engines and data preparation. Ultimately, these activities are chained together to produce insights. The true value is in joining up the dots of the insights across different Big Data engines, rather than choosing between them. Since Datameer is not coupled to any particular engine, data is released from the LOBs and businesses can start to look horizontally at their data - perhaps even for the first time.

Datameer places the business user at the forefront of the analytics process, by insulating users from the choice of which computational engine to use for each analytics job. As an additional enabler, Datameer also pleases IT departments with its focus on enterprise usability and data governance. Industry watchers will know that security came relatively late to Hadoop, and Datameer offers role-based access control in order to ensure that the right people get the right Big Data. This is particularly important as Datameer joins the dots horizontally across the organisation to release data, and its associated analytics, from the silos. Datameer 6 has a crisp, clear interface, improved over previous editions, which shows that they are in tune with the non-linear thought processes that lead to insights from data, whether these are generated within the vertical silo, or across silos.

Apache Spark has been added in Datameer 6 as an option to future-proof the technology. Business decision makers who are tempted by Spark have the opportunity to try it out alongside other computational engines that reside within the business, such as MapReduce. By using Datameer 6, business decision makers are not forced down the route of hiving off their existing engines throughout the organisation, which could alienate the LOBs who have worked so hard to understand Big Data technology. It means that siloed LOBs can independently use the Big Data technology that is best for them, whilst still enabling insights across the organisation.

To summarise, you might want to head over to Datameer for a look. Here's some more info: