Conquering Big Data with an IBM Platform

Data volumes keep growing, and so does the number of sources that generate data. Several shifts drive this growth:

  • Multiple computing processes
  • Transitioning from transactional (OLTP) to interactive (social media) workloads
  • Switching from desktops to mobile devices

These shifts generate large volumes of data in a variety of types, which in turn call for a variety of processing mechanisms to store and retrieve it.

Size is generally the primary defining characteristic of big data. The term usually refers to data sets too large for commonly used software tools to capture, manage, and process within a tolerable time. The threshold is a constantly moving target, ranging from a few dozen terabytes to many petabytes in a single data set.

Big Data Can Be Very Small And Not All Large Datasets Are Big

The data streaming from a hundred thousand sensors on an aircraft is big data. However, the dataset is not as large as might be expected. Even a hundred thousand sensors, each producing an eight-byte reading every second, would produce less than 3 GB of data in an hour of flying (100,000 sensors x 60 minutes x 60 seconds x 8 bytes = 2.88 GB).
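The arithmetic behind the sensor example can be checked with a short sketch. The function name and constants below are illustrative only, and the result assumes one reading per sensor per second and decimal gigabytes (1 GB = 10^9 bytes).

```python
# Hypothetical helper for the aircraft sensor example; names are illustrative.
SECONDS_PER_HOUR = 60 * 60


def hourly_bytes(sensors, bytes_per_reading, readings_per_second=1):
    """Return the number of bytes produced by all sensors in one hour."""
    return sensors * bytes_per_reading * readings_per_second * SECONDS_PER_HOUR


# 100,000 sensors, each emitting one eight-byte reading per second.
total = hourly_bytes(sensors=100_000, bytes_per_reading=8)

# Convert to decimal gigabytes (assumption: 1 GB = 1e9 bytes).
print(f"{total:,} bytes = {total / 1e9:.2f} GB")
# prints "2,880,000,000 bytes = 2.88 GB"
```

Changing `readings_per_second` or `bytes_per_reading` shows how quickly the hourly volume grows when sensors report more often or send larger readings.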

A growing number of systems generate very large quantities of very simple data. For example, media streaming generates very large volumes of data with increasing amounts of structured metadata, and telecommunications companies have to track vast volumes of calls and internet connections. Even though the data volume is large, this data can be accessed quickly because it is organized in a structure.

Big Data Spans The Following Four Dimensions

1. Volume: Massive growth of transaction data volumes.

Example: Turn 12 terabytes of Tweets created each day into improved product sentiment analysis.

2. Velocity: The speed at which data arrives and must be analyzed.

Example: Scrutinize 5 million trade events created each day to identify potential fraud.

3. Variety: Structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more.

Example: Monitor hundreds of live video feeds from surveillance cameras to target points of interest.

4. Veracity: 1 in 3 business leaders don’t trust the information they use to make decisions. Establishing trust in big data presents a huge challenge as the variety and number of sources grow.

Using Big Data Creates Value For You

Big data can offer significant value by making information transparent and usable at a much higher frequency.

As organizations create and store more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days, and use it to expose variability and boost performance. Leading companies are using data collection and analysis to conduct controlled experiments and make better management decisions; others are using data for everything from basic low-frequency forecasting to high-frequency nowcasting, so that they can adjust their business levers just in time.

Big data allows ever-narrower segmentation of customers and therefore much more precisely tailored products or services. Sophisticated analytics can also substantially improve decision-making.

Big data can be used to improve the development of the next generation of products and services. For instance, manufacturers are using data obtained from sensors embedded in products to create innovative after-sales service offerings such as proactive maintenance (preventive measures that take place before a failure occurs or is even noticed).

The Big Data Platform Is Helping Enterprises Get Big Returns On Investments

  • Healthcare: 20% decrease in patient mortality by analyzing streaming patient data
  • Telco: 92% decrease in processing time by analyzing networking and call data
  • Utilities: 99% improved accuracy in placing power generation resources by analyzing 2.8 petabytes of untapped data

Breakthrough Insights From Big Data

1.   The IBM platform takes a broad and balanced view of big data and of the needs of an entire platform. The benefit is that its components are pre-integrated, which reduces your implementation time and cost.

The key platform capabilities include:

Visualization and Discovery: Discover, understand, search, and navigate federated sources of big data while leaving that data in place.

Hadoop-based Analytics: Store any data type in the low-cost, scalable Hadoop engine to lower the cost of processing and analyzing massive volumes of data.

Stream Computing: Continuously analyze massive volumes of streaming data with sub-millisecond response times to take action in real time.

Data Warehousing: Store and analyze large volumes of structured information with workload optimized systems designed for deep and operational analytics.

Text Analytics: Analyze textual content to uncover hidden meaning and insight in unstructured information.

2.   The Informatica Platform is the enabling technology that helps you combine social interaction data with enterprise transaction data to gain new insights. Leading organizations are enriching their single view of the customer with information from social networks. This way they gain insights into customer preferences and behavior that are invaluable for personalized one-to-one marketing and sentiment analysis.

3.   NoSQL (Not only SQL) open source solutions originated from the need of data-oriented companies like Google, Facebook, eBay and Yahoo! to store the massive amounts of information their systems generate. The fate of these companies lies in creating outstanding personal experiences, and they are able to use every click and every aspect of a page render in their analysis. Because commercial software proved too expensive, papers were published and companies took up development of their own solutions, which soon spread to the community under the open source model.
