Continuing from the previous post, Data Warehousing Concepts I, this post covers further high-level topics in building a data warehouse. Concepts Covered In This Post 1. Star Schema 2. Snowflake Schema 3. Data Integrity 4. What are OLAP, MOLAP, ROLAP, and HOLAP? Star Schema This is the simplest form of a data warehouse schema. In the star schema design, a single object (the fact table) sits in… read more →
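The star schema described in the teaser above can be sketched in a few lines. This is a minimal illustration, not the post's own example: the table and column names (`dim_product`, `fact_sales`, etc.) are assumptions chosen for clarity.

```python
# Minimal star schema sketch: one central fact table whose rows hold
# foreign keys into surrounding dimension tables plus numeric measures.
# All names and values here are illustrative assumptions.

# Dimension tables: descriptive attributes keyed by a surrogate id.
dim_product = {1: {"name": "Widget", "category": "Tools"}}
dim_date = {101: {"day": "2013-05-01", "quarter": "Q2"}}

# Fact table: foreign keys into each dimension, plus measures.
fact_sales = [
    {"product_id": 1, "date_id": 101, "units": 3, "revenue": 29.97},
]

# A typical star-schema query: join each fact row to its dimensions.
report = []
for row in fact_sales:
    product = dim_product[row["product_id"]]
    date = dim_date[row["date_id"]]
    report.append((product["category"], date["quarter"], row["revenue"]))
```

The point of the shape is that every query is a single join from the fact table out to each dimension, which is what makes the star schema simple to understand and fast to aggregate.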
You have probably already heard of the term big data. It seems to be the new water cooler IT term, and it is generating a lot of buzz. It’s interesting to read up on the exponentially increasing data volumes and how companies are beginning to experiment with big data analytics as we enter the era of big data, but how can your business actually leverage a big data solution to save (or make) money? This blog will provide a brief introduction to InfoSphere BigInsights, an enterprise Hadoop… read more →
Managing Metadata In The IBM Infosphere World According to the definition from Wikipedia, metadata management can be defined as the end-to-end process and governance framework for creating, controlling, enhancing, attributing, defining and managing a metadata schema, model or other structured aggregation system, either independently or within a repository and the associated supporting processes (often to enable the management of content). IBM Metadata Workbench is a software tool that is part of the InfoSphere Information Server… read more →
What is Stream Computing? Nowadays, we hear a lot of buzz around stream computing. What is stream computing? According to the definition from Wikipedia – “Stream processing is a computer programming paradigm, related to SIMD (single instruction, multiple data), that allows some applications to more easily exploit a limited form of parallel processing. In computing, the term stream is used in a number of ways, in all cases referring to a sequence of data elements… read more →
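The stream paradigm quoted above, processing a sequence of data elements as they arrive rather than storing the whole set first, can be illustrated with Python generators. This is a toy sketch under assumed inputs, not how any particular streaming product works.

```python
# Toy stream-processing sketch: each data element is consumed as it
# "arrives", and a result is emitted per element, so the full sequence
# never needs to be held in memory. The sample values are invented.

def readings():
    """Simulated stream (sequence of data elements)."""
    for value in [3, 7, 2, 9, 4]:
        yield value

def running_average(stream):
    """Process the stream one element at a time, yielding a running mean."""
    total = 0
    count = 0
    for value in stream:
        total += value
        count += 1
        yield total / count

averages = list(running_average(readings()))
```

Because `running_average` only keeps a running total and a count, the same code would work unchanged on an unbounded stream, which is the essential property the Wikipedia definition is pointing at.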
As time progresses, data volumes are increasing and the number of data-generating sources is also constantly going up. These changes stem from multiple computing processes, the transition from transactional systems (OLTP) to interactive modes (social media), and the switch from desktops to mobile devices. These shifts end up generating large volumes of data with a variety of data types, requiring various processing mechanisms to access it. Size is generally the primary defining characteristic of big data.… read more →
Update: A follow up to this post has been written. Read the Data Warehousing Concepts II post. This post will provide you with high level information about building a data warehouse (DW) and the associated maintenance. The main focus will be on the concepts around implementing a data warehouse. A future blog will provide information about some more advanced data warehousing concepts. Concepts Covered In This Post Dimensional Data Model Slowly Changing Dimension Conceptual Data… read more →
The Software Development Life Cycle (SDLC) is a conceptual model used in project management to describe the stages involved in an information system development project, from an initial feasibility study through maintenance of the completed application. Several SDLC methodologies have been developed, and each of them has its own advantages and disadvantages. It is up to the development team to adopt the most appropriate one for the project. Sometimes, a combination of the models is… read more →
For any organization, understanding the customer’s needs is essential to providing better services and building a healthy relationship with the customer. Customer analytics is a method of combining customer information (personalization data, customer interests, customer browsing behavioral data, and purchase behavioral data) to present a consolidated view of the customer and their activities. This involves segmenting customers into multiple buckets, understanding their purchase and browsing behavior, and estimating their next… read more →
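Segmenting customers into buckets by behavior, as the teaser above describes, can be sketched simply. The thresholds, bucket names, and fields below are invented assumptions for illustration only.

```python
# Hedged sketch of behavioral segmentation: assign each customer to a
# coarse bucket based on purchase and recency signals. The fields,
# thresholds, and bucket names are all illustrative assumptions.

customers = [
    {"name": "Ann", "orders": 12, "last_visit_days": 3},
    {"name": "Bob", "orders": 1, "last_visit_days": 90},
]

def segment(customer):
    """Assign a customer record to a behavioral bucket."""
    if customer["orders"] >= 10 and customer["last_visit_days"] <= 30:
        return "loyal"
    if customer["last_visit_days"] > 60:
        return "lapsed"
    return "casual"

segments = {c["name"]: segment(c) for c in customers}
```

Real customer analytics would derive segments statistically (e.g. clustering) rather than from hand-written rules, but the input shape, a consolidated per-customer view feeding a segmentation step, is the same.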
What is Data Quality? Data quality is defined as the conformance of data to business rules. Data of good quality is suitable for its intended business purpose. In other words, quality is relative to the business user or department using the data. An organization has many departments, and each department has its own business rules. So, data which is of high quality to one business user or a specific project may not fit into the… read more →
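The definition above, quality as conformance to business rules, translates directly into a rule-checking sketch. The specific rules and field names here are assumptions made up for the example.

```python
# Hedged sketch: data quality as conformance to business rules.
# Each rule is a predicate over a field; a record's quality issues are
# the fields that violate their rule. Rules and fields are illustrative.

rules = {
    "email": lambda v: isinstance(v, str) and "@" in v,
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
}

def quality_issues(record):
    """Return the fields of a record that fail the business rules."""
    return [field for field, check in rules.items()
            if field in record and not check(record[field])]

good = {"email": "a@example.com", "age": 34}
bad = {"email": "not-an-email", "age": 150}
```

Note how the post's point falls out of the structure: swap in a different department's `rules` dict and the same record may pass or fail, so "high quality" is always relative to the rules being applied.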
Why Integrate Customer Data? It is very common for organizations to interface with customers through multiple platforms/channels (stores, direct/online, contact centers, special events like sweepstakes, etc.), and each platform may have its own operational systems (OLTPs). Stores/retail businesses generally use ‘Point of Sale’ applications, whereas direct/online brands use e-commerce platforms like ATG, DAX, Ecommetry, etc. On the other hand, special events like sweepstakes may collect customer information using paper sign-ups, which may get entered in… read more →
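Integrating records from those separate channel systems into one consolidated customer view can be sketched as a merge keyed on a shared identifier. The channel names, fields, and the choice of email as the match key are assumptions for illustration; real customer data integration also involves fuzzy matching and survivorship rules.

```python
# Illustrative sketch: fold customer records from multiple channel
# systems (POS, e-commerce, etc.) into one profile per customer,
# matching on a shared identifier. All names/fields are assumptions.

store_pos = [{"email": "jo@example.com", "name": "Jo", "store_visits": 5}]
online = [{"email": "jo@example.com", "loyalty_tier": "gold"}]

def consolidate(*sources):
    """Merge records from every channel into one profile per email."""
    profiles = {}
    for source in sources:
        for record in source:
            profiles.setdefault(record["email"], {}).update(record)
    return profiles

profiles = consolidate(store_pos, online)
```

Later sources overwrite earlier ones on conflicting fields here; a production integration would apply explicit survivorship rules instead of last-writer-wins.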