Can You Reduce Big Data To A Simple Yes Or No?

Assuming you understand the question well, a simple yes or no answer isn’t very likely.

This is especially true in business, where data analysis questions usually result in stories, not answers. Intelligent people derive the answers from these stories. It’s the subject matter experts who can get at the answers.

There is confusion about data analysis. Often, people think an analytical system provides answers, which it does not.

All the time, effort, and money that go into designing, implementing, and using data analysis systems serve only to put things in order so that it’s easier for the experts to figure things out.

Graphs, charts, spreadsheets or what have you cannot tell you what to do. Ironically, I’ve been in meetings where people seemed almost frustrated with this fact.

The old adage that a tool is only as good as its user very much applies to data analysis systems, whether we call them business intelligence, big data, predictive analytics, or something else.

It must be pretty obvious that big data cannot be reduced to a simple yes or no as the title of this article posits. But I chose the obvious to point out its absurdity. Big data doesn’t contain answers – you do, and it’s not a yes or no.

What you do with big data is as much an art form as it is science. If you can come up with a simple yes or no after analyzing heaps of information then you are a magician.

Recently, there’s been some chatter about artificial intelligence “doing the thinking” for us. That’s another utopia. Even if such a thing were possible, why would we want to leave ourselves out of any part of the thought process?

The point of this article is that clients must be educated about what data can and cannot do. For example, data cannot answer questions or provide directions; only people can, and should, do that. However, data can enable people to answer questions better, faster, and more easily.

It may again sound like I am pointing out the obvious here, but I was surprised at how many clients had unreasonable expectations of data analysis systems. Questions like “why can’t it tell me if I should suspend sales of X to avoid product cannibalization?” came up in some meetings, as did other queries of a similar nature, where the misunderstanding was that “it” somehow spits out “what to do” instructions.

Implementations run a risk of failure if the users’ perception of the system is that “it” doesn’t give them what they want, even if unreasonably so. This is a matter of expectation management and education, and it ought to be done right off the bat so that the title of this article doesn’t one day become a question in the meeting room.

Predictive Enough?

Back in the early 2000s, Business Intelligence professionals were talking about the novelty of automated cross advertising. An example could be advertising greeting cards to someone who just purchased a gift, or vice versa. If you bought a present then you must need a greeting card, so how about some choices right there, presented to you before or after checkout? While this wasn’t predictive analytics, it was still based on past shopping behavior. Market baskets are a more sophisticated version of shopping behavior analysis.
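To make the greeting card idea concrete, here is a minimal sketch of the counting that sits behind this kind of cross advertising: tally which items show up in the same basket, then suggest the most frequent partner. The baskets and the function names `co_purchase_counts` and `suggest` are made up for illustration; they are not taken from any particular BI product.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair a stable order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def suggest(item, baskets, top_n=1):
    """Return the items most often bought together with `item`."""
    partners = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            partners[b] += n
        elif b == item:
            partners[a] += n
    return [name for name, _ in partners.most_common(top_n)]


# Made-up purchase history, for illustration only.
baskets = [
    ["gift", "greeting card"],
    ["gift", "greeting card", "wrapping paper"],
    ["gift", "greeting card"],
    ["greeting card", "stamps"],
]

print(suggest("gift", baskets))  # ['greeting card']
```

Real market basket analysis goes further than raw counts (support, confidence, and lift are the usual measures), but the starting point is the same tally.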

Modern predictive analytics is more comprehensive – it is analysis of a pattern across a time span aimed at predicting what you are likely to do or need next. There is, probably, an infinite number of ways to analyze various data to try to predict human behavior – past shopping data, weather pattern data, seasonal data, income level data, credit score data. Creative analysis of combinations of these data and many more factors can uncover what’s about to happen before it happens.

However, many contemporary predictive analysis examples still amount to little more than simple cross advertising that can, potentially, be achieved without the associated expense. It’s hardly necessary to go through extensive data analysis to “predict” that I am likely to need a greeting card to go with that gift I just purchased. It would be wise to understand the real business need for analytics. If the goal is to develop a competitive edge in the modern marketplace, then predictive analytics is likely the way to go, and it is a journey that takes methodology and patience.


Predictive analytics and advertising: the science behind knowing what women (and men) want

Competence & Honesty Is Hard Work

Hard as it may be to master data analysis, honest unbiased reporting is even more difficult.

We all make mistakes, and quite a few of us give in to the temptation to mislead others for personal gain. While mistakes are par for the course, deliberate manipulation is not.

Charts are one of the most common data visualization choices. It’s not uncommon for people to “distort” reality, either by mistake or intentionally, when presenting bar, line, or other charts. In my career, I have seen people spot and correct charting mistakes, and I have seen others make “clever” misrepresentations designed to help someone get ahead.

Labeling of X and Y axes can be misleading. The context of what’s being presented can be made unclear. Cherry-picking is a common practice for making things look better than they really are. Misusing charting conventions can “trick” people into seeing something that isn’t there. For example, a pie chart may seem to show a favorable picture overall when, really, it covers only a small subset and comes without an accompanying explanation.

This brings me to the point I am trying to make – be extremely vigilant with analytics. It’s just as dangerous as it is helpful. Whether you know exactly what you are doing or are simply not very good at it, the result can be a disaster just the same.


Is a chart lying to you? This video has some tips to figure it out. – Vox

Still Artificial

If Artificial Intelligence ever achieves complexity to rival that of humans, then we will be contemplating which one is superior. Maybe we will be forced to reconsider our beliefs should it become clear that humans’ own creation is more intelligent. Or else we might be tempted to play God with AI.

Thankfully, none of these outcomes are near. For the time being, people are in control of their own future. AI is taking baby steps in assisting people with narrowly scoped tasks in business. Human judgement is still key to success.


Can AI Ever Be as Curious as Humans?

Give Them What They Like And They Will Give You $

Gone are the days of spreadsheet data analysis and “blanket” marketing, or so we hope. In are the days of more individualized and more timely marketing through predictive analytics.

US Cellular found out what users of their mobile app liked and didn’t like to see. Naturally, the company gave the users more of the good stuff and less of the bad. Happier users mean better retention and, ultimately, more $ for the company.

A professional photo market service provider figured out which customers were likely to stop using the mobile app. They had a clear “stopped using the app” definition, or churn – 30 days of no app activity. This made it possible to look for various app usage patterns that suggested possible churn. High churn risk correlated to a high number of certain types of app usage events – in other words, if I did “this,” “that,” and “the other,” I am highly likely to churn. But if I only did “this” and “that” then I am medium risk, and just “this” alone is a low risk. The service provider then decided to target high and medium risk customers by sending them information about photo competitions to their mobile phones. It worked – plenty of customers re-engaged with the mobile app and, eventually, continued to spend $.
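The tiering described above can be sketched in a few lines. This is only an illustration under stated assumptions: the event names `event_a`, `event_b`, and `event_c` are hypothetical stand-ins for “this,” “that,” and “the other,” and the functions are my own, not the service provider’s actual model. The 30-day churn definition and the decision to target high and medium risk customers come from the story itself.

```python
from datetime import date, timedelta

# From the text: churn means 30 days of no app activity.
CHURN_AFTER = timedelta(days=30)

# Hypothetical event names standing in for "this", "that", and "the other".
THIS, THAT, THE_OTHER = "event_a", "event_b", "event_c"


def has_churned(last_activity, today):
    """True once a customer has gone 30 or more days without app activity."""
    return today - last_activity >= CHURN_AFTER


def churn_risk(events):
    """Map the set of observed usage events to a risk tier."""
    if {THIS, THAT, THE_OTHER} <= events:
        return "high"
    if {THIS, THAT} <= events:
        return "medium"
    if THIS in events:
        return "low"
    return "unknown"  # assumption: the text does not cover other patterns


def should_target(events):
    """The provider targeted high and medium risk customers."""
    return churn_risk(events) in {"high", "medium"}


print(churn_risk({THIS, THAT}))  # medium
print(should_target({THIS}))     # False
```

In practice the tiers would come out of analyzing real usage history rather than being hard-coded, but a clear churn definition is what makes that analysis possible in the first place.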

Data these days can reveal wants and un-wants of individual customers. That means I get my very own marketing material tailored to me instead of the generic stuff, or I am stimulated to re-engage sooner rather than later. The likelihood of me responding is higher, and that’s what you want as a marketer.

Mobile phones are everywhere, including my pocket, and that makes mobile an important channel. Mobile app usage data can tell a lot, and a smart marketing team can make good $ from it. The challenge is to understand the data.


How Do Brands Use Predictive Analytics for Mobile?

Emotion is Also a Data Point

We must strike a balance between relying on data and relying on emotion when making decisions. A glaring example of this is the 2016 presidential election here in America. Clinton relied on data and lost. Trump played on people’s emotions and won.


Tools for predictive analytics took a hit in presidential election

Keep It Simple

Simplicity is great if it works. By nature, Business Intelligence is not simple – there are lots of moving parts and a lot of coordination. However, resigning yourself to complexity is not necessary, in my opinion. There may be simpler quick wins along the way. It’s a matter of understanding customer needs and focusing on what they really expect versus “the whole nine yards.”

At a large American health care provider, the prevailing theory was that a new BI system needed to be built. The new system would address a large swath of data analysis requirements in many departments. Meanwhile, a doctor doing cancer research was manually maintaining data in a simple spreadsheet. Moreover, data arrived in plain text format with a lot of “garbage” characters in it. The doctor dutifully read through the text data and picked out what she needed for her spreadsheet. The doctor did this on a regular basis. It was a Herculean task every single time, to say the least. A piece of software to help her sift through the raw data and automatically pull out what’s relevant is all it took to get her to the grant money she was seeking.
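The kind of helper that story calls for can be very small. Below is a rough sketch, not the actual software from that engagement. It assumes that “garbage” means characters outside printable ASCII and that the values of interest sit on lines shaped like `Field: value`. Both are guesses made for illustration, and the sample text and field names are invented.

```python
import re

# Assumption: "garbage" means anything outside printable ASCII.
GARBAGE = re.compile(r"[^\x20-\x7E]+")

# Assumption: relevant values appear on lines shaped like "Field: value".
FIELD = re.compile(r"^\s*(?P<name>[A-Za-z ]+?)\s*:\s*(?P<value>.+?)\s*$")


def extract_fields(raw_text, wanted):
    """Strip garbage characters and keep only the wanted 'Field: value' pairs."""
    record = {}
    for line in raw_text.splitlines():
        clean = GARBAGE.sub("", line)
        match = FIELD.match(clean)
        if match and match.group("name").lower() in wanted:
            record[match.group("name").lower()] = match.group("value")
    return record


# Invented sample input, for illustration only.
raw = (
    "\x00\x07Patient ID: 1042\x1b\n"
    "noise \x02\x03 line\n"
    "Diagnosis:\x7f stage II\x00\n"
    "Notes: n/a\n"
)

print(extract_fields(raw, {"patient id", "diagnosis"}))
```

Each extracted record could then be appended to the spreadsheet as a row, for example with Python’s standard `csv` module.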

Sure, a hospital may need the new system in the near future, but small problems also need to be addressed, and now. The small amount of time it takes to implement a simple solution may be well worth it in terms of value to the client and the learning experience for future BI development. In fact, clients who get early simple results sometimes realize they are not ready for a full-blown BI project and do not require one. That’s a good thing because nobody wants to find out well into a major undertaking that a customer wasn’t ready. Small engagements can reveal potential blind spots.

Our experience shows us that BI projects tend to be large and that people tend to think large because of it. Whenever people think large, the small things tend to elude them. Rory Sutherland – a marketing expert – gave a very interesting TED talk some time ago discussing the paradox of grand solutions with low or no impact. We’ve had instances where BI projects were reconsidered in light of smaller and more effective proposals.

None of this is to say that BI projects have little or no value. In fact, the exact opposite is true – business intelligence, when implemented correctly, brings out the best in businesses. However, BI can sometimes be a much more successful proposition if we address the small things.

Observations on Oracle BI


Not quite in the “magic” spot within Gartner’s famous quadrant, Oracle still is a formidable competitor in the business intelligence arena.

Oracle Business Intelligence Enterprise Edition (OBIEE) Version 11g is now significantly more substantial than the versions before it. The feature set and system requirements are greatly expanded, resulting in mildly shocking sizing. Someone wondered whether Larry Ellison is sticking to the “Fusion” strategy at Oracle by weaving as many products as possible into one, thereby making it more expensive, more resource hungry, more dependent on consultant hours, and more appliance oriented. This may have been a humorous comment rather than an opinion, but there could be some truth to it.

A “small” proof-of-concept (POC) installation based on Oracle’s own cookbook came with a recommendation for 300 gigabytes of disk space – and that’s only for the software. In practice, disk usage came close to 200 gigabytes when ODI ETL was running. This was due to the number and size of temporary files that were being created during the run. Note that the database software and the data in it were on a separate server.

Another revelation came during a presentation of said POC when the laptop with 4 gigabytes of memory was unable to run the client-side Java applets needed to show off parts of the ODI user interface (BIACM). Therefore, we concluded that developers must make sure their workstations are adequately equipped.

Exalytics, Exadata, and Exalogic are Oracle’s specialty appliances configured to handle high availability. These are not cheap by any means, and software complexity lends itself very well to these pieces of hardware. It is possible to avoid the appliances, but if performance requirements are high then it may be very difficult to even come close with alternate solutions.

Constraints aside, Oracle’s differentiation is their pre-built applications – OBIA – which were, presumably, developed in collaboration with various industry leaders. For example, Student Information Analytics (SIA) is an application that could be installed, configured, and used by the Education industry out of the box with little extra work if the delivered product satisfies business requirements. Even if it does not, a fully built baseline system may be a better start than a blank slate. Currently, there are ready made applications for a number of verticals.

In short, the OBIEE 11g suite is a capable monster of a software product that takes money and knowledge to implement. It does, however, provide a wealth of analytics options, and that can lead to a good return on investment if done right.

Rumor has it that the next version of OBIEE – 12c – may include built-in visual data lineage tools.