Big Data, Big Risks, Big Opportunities

Big Data is an increasingly popular area of data analytics and research. It has been used to predict the spread of disease with greater accuracy, forecast weather patterns in fine detail, identify false Medicaid claims, optimize trading algorithms, and much more. Countless Big Data startups are in the news, and Crunchbase lists nearly 4,000 companies under the category “Big Data,” ranging from software companies to venture capital firms to accounting firms and more. Although Gartner has dropped Big Data from its “Hype Cycle for Emerging Technologies,” it’s clear that Big Data is here to stay, and that more and more companies of all sizes and from all industries are looking to Big Data for insights and competitive advantages. Big Data has huge potential for increasing our understanding of complex issues, but there are key strategic and security pitfalls that companies need to carefully manage.

What Exactly is Big Data?

Big Data refers to large, complex datasets. Today it is often used to describe data that:

  • Is significant in Volume (terabytes), has Variety (different types), and has high Velocity (generated in real time);

or,

  • Is technically not “big” enough to be called Big Data, but reflects an organization’s desire for better, faster, and more efficient decision-making and insights.

Regardless of the technicalities, Big Data is having a big impact on organizational effectiveness, efficiency, and insights into consumer behavior, supply chain and logistics, financial markets, and more.

Examples of Big Data Analytical Techniques

In the International Journal of Information Management, Gandomi and Haider describe commonly used methods of data analysis in Big Data, summarized below:

Text Analytics

Text analytics refers to techniques that extract information from textual data. Social network feeds, emails, blogs, online forums, survey responses, corporate documents, news, and call center logs are examples of textual data held by organizations. Text analytics enables businesses to convert large volumes of human-generated text into meaningful summaries, which support evidence-based decision-making.

Audio Analytics

Audio analytics analyze and extract information from unstructured audio data. When applied to human spoken language, audio analytics is also referred to as speech analytics.

Video Analytics

Video analytics, also known as video content analysis (VCA), involves a variety of techniques to monitor, analyze, and extract meaningful information from video streams.

Social Media Analytics

Social media analytics refers to the analysis of structured and unstructured data from social media channels, including social networks (Facebook, LinkedIn), blogs (Blogger, WordPress), microblogs (Twitter, Tumblr), social news (Reddit), review sites (Yelp, TripAdvisor), and more.

Predictive Analytics

Predictive analytics comprise a variety of techniques that predict future outcomes based on historical and current data. In practice, predictive analytics can be applied to almost all disciplines – from predicting the failure of jet engines based on the stream of data from several thousand sensors, to predicting customers’ next moves based on what they buy, when they buy, and even what they say on social media.

Simplified Sample Scenario – Insurance Company Customer Service Call

In this simplified scenario, a customer service agent is on the phone with a current customer, say, to renew her auto insurance policy. There are multiple potential touchpoints for Big Data analytical applications:

  • Speech analytics using a tool like Vokaturi to detect customer emotions, and then provide live guidance to the customer service agent (e.g., phrases to defuse the situation if the customer is upset).
  • Text analytics using a tool like IBM Watson to identify key concepts in the transcribed speech, and then provide live guidance (e.g., if the customer mentions “boat,” the system can prompt the agent to ask whether she needs boat insurance).
  • Predictive analytics for management reporting and decision making (e.g., likelihood of hitting specific sales targets, best prospects to pursue, etc.).
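The text analytics touchpoint above can be illustrated with a minimal Python sketch. It is only a keyword lookup, not how IBM Watson or any other commercial tool works; the keyword table and the function name `suggest_prompts` are hypothetical and exist purely for illustration.

```python
import re

# Hypothetical keyword-to-prompt table. A production system would rely on a
# trained concept-extraction service rather than a fixed list like this one.
CROSS_SELL_PROMPTS = {
    "boat": "Ask whether the customer needs boat insurance.",
    "motorcycle": "Ask whether the customer needs motorcycle insurance.",
    "new house": "Ask whether the customer needs homeowners insurance.",
}


def suggest_prompts(transcript: str) -> list[str]:
    """Return an agent prompt for each keyword found in the transcript."""
    text = transcript.lower()
    prompts = []
    for keyword, prompt in CROSS_SELL_PROMPTS.items():
        # Match whole words only, so "boat" does not fire on "boathouse".
        if re.search(rf"\b{re.escape(keyword)}\b", text):
            prompts.append(prompt)
    return prompts


if __name__ == "__main__":
    for line in suggest_prompts("We bought a boat last summer."):
        print(line)
```

Even a simple lookup like this shows the basic flow: transcribed speech goes in, and a short list of suggestions for the agent comes out. Real concept extraction would also need to handle plurals, synonyms, and context.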

Simplified Big Data Opportunity Example – Point-of-Care Intervention Opportunities

Similarly, the insights drawn from big data provide significant opportunities to improve healthcare outcomes, increasingly at the point-of-care, as shown in four simplified examples here.

Additionally, solutions like SAS can help with fraud prevention and risk adjustment. Other analytics providers, such as NICE Systems, can even help with ensuring broader population well-being with public safety solutions based on audio analytics and more. IBM’s Watson Health initiative spans numerous related technology disciplines including A.I. and machine learning, but at its core relies on Big Data as the source of its insights.

Big Data Technology Landscape

The Big Data landscape is diverse and fragmented, with companies providing a variety of related services:

  • Infrastructure
  • Analytics
  • User Applications
  • Data Sources and Interfaces

Many of the larger companies, such as SAS, Oracle, IBM, Microsoft, Google, and Amazon, are seeking to build a comprehensive ecosystem of data, analytics, machine learning, and business intelligence products.

Matt Turck from FirstMark provides a broad overview of the industry players by category, shown here. [PNG]

The companies featured in Turck’s diagram span the spectrum of early-stage and mature companies, both in their overall company existence as well as their development of Big Data solutions and services. Others, such as Knowre in the Education Application space, are on the fringes of Big Data, focusing more on machine learning algorithms with smaller data sets.

Disclaimer: Diagram may be illegible, but is meant to illustrate the complexity of the Big Data landscape, not to point to specific solution providers.

Big Risks?

The risks of Big Data are manifold, and organizations need to carefully plan for their use of Big Data solutions. These risks include strategic and business risks, such as operational impacts and cost overruns, as well as technical risks, such as data quality and security. Key risks and their mitigations are highlighted below.

Loss of Strategic Focus

Organizations that jump into Big Data without specific goals in mind, or without knowing what they are looking for from Big Data, are at risk of big investment without commensurate results, information overload, and analysis paralysis.

Mitigation: 

– Create a strategic roadmap that defines the role of big data and data analytics.

– Tie big data projects to specific strategic business goals and outcomes.

Poor Data Quality

Big Data often means Lots of Bad Data. This increases the risk of generating outputs and insights from data analysis that are wrong, or even dangerous. While Big Data’s scale and volume sometimes mean that idiosyncrasies are washed out, there are challenging problems here: in some industries, small discrepancies can mean life or death; and for many, the idiosyncrasies are the real business opportunities.

Mitigation: 

– Design, implement, and maintain an ongoing data quality assurance plan.

– Verify the sources and quality of your data, especially if it has passed through multiple systems.

– Establish a Data Governance plan to ensure accountability and response.
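As a small illustration of what an automated check in a data quality assurance plan might look like, the Python sketch below counts missing required values and duplicate record keys. The function name `quality_report` and the field names in the example are assumptions made for illustration; a real plan would cover far more, such as valid ranges, freshness, and lineage.

```python
from collections import Counter


def quality_report(records, required_fields, key_field):
    """Summarize missing required values and duplicate keys in a list of dicts."""
    missing = Counter()
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat absent, None, and empty-string values as missing.
            if value is None or value == "":
                missing[field] += 1

    # A key that appears more than once suggests duplicated records.
    key_counts = Counter(record.get(key_field) for record in records)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1 and k is not None)

    return {
        "rows": len(records),
        "missing": dict(missing),
        "duplicate_keys": duplicates,
    }


if __name__ == "__main__":
    # Hypothetical customer records with one duplicate key and two missing ZIP codes.
    sample = [
        {"id": 1, "zip": "94105"},
        {"id": 2, "zip": ""},
        {"id": 2, "zip": None},
    ]
    print(quality_report(sample, required_fields=["id", "zip"], key_field="id"))
```

Running a report like this each time data passes from one system to another gives an early signal when quality degrades.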

Cost Overruns

Like many other types of solution implementations, Big Data implementations can be expensive. They often require setup of multiple platforms and systems, including cloud infrastructure, applications, and complex integrations with existing systems. The combination of complexity and relative immaturity of these systems greatly increases the risk of cost overruns.

Mitigation: 

– Conduct an assessment to determine the ROI and business case for implementing a Big Data solution.

– Establish strong program/project management and governance for Big Data projects.

Operational Disruption

Big Data provides numerous potential insights from data; in fact, analytics companies are often judged by the novelty and quantity of insights that their algorithms can generate. However, these insights can rapidly change or even contradict others, and should be continuously tested and used as only one factor in decision-making. Companies that chase Big Data insights too aggressively are at risk of making too many high-impact and complex business process changes without being able to accurately predict outcomes.

Mitigation:

– Conduct a strategic assessment to test and ensure alignment of project goals with overall business outcomes.

– Set concrete, achievable, and measurable goals for Big Data projects, and evolve projects to change focus from insight-generation to long-term success.

Security and Privacy

The very nature of Big Data increases its security risks: Big Data uses distributed systems, which require multiple levels of protection; automated data transfers on a very frequent or real-time basis need to be constantly authenticated; data origin is often unknown or untracked because of the scale of the data; and Big Data can include sensitive Personally Identifiable Information or Personal Health Information.

Mitigation:

– Establish a Data Security Plan and Data Governance Plan.

– Ensure the CSO role provides executive-level accountability and oversight over data security.

– Anonymize data where possible.

– Frequently audit data security, privacy, and compliance; and review and update Data Security plans as needed.

– Use independent, third-party review and oversight as needed.
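One common building block for the "anonymize data where possible" mitigation is to replace direct identifiers with keyed hashes before data reaches analysts. The Python sketch below uses HMAC-SHA256 from the standard library; the function names, field names, and key are hypothetical. Strictly speaking this is pseudonymization rather than full anonymization, since anyone holding the key can re-derive the tokens, and the remaining fields may still allow re-identification.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed SHA-256 digest (hex string)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_record(record: dict, sensitive_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with sensitive fields pseudonymized."""
    return {
        field: pseudonymize(str(value), secret_key) if field in sensitive_fields else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    # Assumption: in practice the key would come from a secrets manager,
    # never from source code.
    key = b"example-key-for-illustration-only"
    customer = {"name": "Ann Example", "state": "CA", "policy_type": "auto"}
    print(scrub_record(customer, sensitive_fields={"name"}, secret_key=key))
```

Because the same input and key always produce the same token, records can still be joined across datasets without exposing the underlying identifier.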

These risks can apply to just about any Big Data project, and are not intended to be comprehensive. Depending on the specific industry, purpose, or niche, there may be additional considerations to ensure that Big Data solutions positively impact an organization. Fortunately, a growing body of knowledge and expertise exists that can help to ensure successful projects and investments.

The Importance of Project Execution in Big Data

The title of this section seems fundamentally contradictory: Big Data is an exciting and young field of interest, especially for enterprise applications, while Project Management is the mundane province of Gantt charts and issue trackers. However, Big Data implementations and data analytics capabilities must fit into the overall organizational strategy, business processes, and business constraints.

With a strong focus on program governance, data governance, and effective project management, Big Data projects can become integrated into the overall business, rather than remaining in silos. As with any project, there are several core components of project management required for success:

  • Program/Project Governance
  • Scope Management
  • Schedule Management
  • Budget Management
  • Risk Management
  • Issue Management
  • Decision and Change Management
  • Data Governance and Management
  • Quality Management
  • Communications Planning
  • Other, as required for the project (e.g., Procurement, Organizational Change Management, etc.)

More information can be found in Excardo’s feature articles about Project Management.

Want to Learn More?

If we didn’t answer all of your questions, or if you just want to chat about Big Data challenges, feel free to send us a message.