An Introduction to on-Line Analytical Processing (olap)
Essay by people • February 9, 2011 • Research Paper • 7,430 Words (30 Pages) • 2,071 Views
White Paper
An Introduction to OLAP
Multidimensional Terminology and Technology
Contents
What is OLAP? ______________________________________________________________________ 2
What is Multidimensional Data? _________________________________________________________ 3
Consolidation: The Key to Consistently Fast Response _______________________________________ 5
Simple Hierarchies Within Dimensions ____________________________________________________ 7
Variables ___________________________________________________________________________ 9
Vector Arithmetic ____________________________________________________________________ 10
n-Dimensional Databases _____________________________________________________________ 10
Practical Limitations on Database Size___________________________________________________ 12
Time-Series Data Type _______________________________________________________________ 12
Sparse Data________________________________________________________________________ 14
All Dimensions are Not Created Equal ___________________________________________________ 15
Multiple Hierarchies and Classes within Dimensions ________________________________________ 16
Drilling to Relational Data _____________________________________________________________ 17
Security and Robustness______________________________________________________________ 18
"MDSQL" Multidimensional Query Language ______________________________________________ 19
Conclusion_________________________________________________________________________ 20
What is OLAP?
OLAP stands for "On-Line Analytical Processing." In contrast to the more familiar OLTP ("On-Line
Transaction Processing"), OLAP describes a class of technologies that are designed for live ad hoc data
access and analysis. While transaction processing generally relies solely on relational databases, OLAP
has become synonymous with multidimensional views of business data. These multidimensional views
are supported by multidimensional database technology. These multidimensional views provide the
technical basis for the calculations and analysis required by Business Intelligence applications.
"Having an RDBMS doesn't mean instant decision-support nirvana. As enabling as RDBMSs have been
for users, they were never intended to provide powerful functions for data synthesis, analysis, and
consolidation (functions collectively known as multidimensional data analysis)."
- E. F. Codd, Computerworld
OLTP applications are characterized by many users creating, updating, or retrieving individual records.
Therefore, OLTP databases are optimized for transaction updating. OLAP applications are used by
analysts and managers who frequently want a higher-level aggregated view of the data, such as total
sales by product line, by region, and so forth. The OLAP database is usually updated in batch, often from
multiple sources, and provides a powerful analytical back-end to multiple user applications. Hence, OLAP
databases are optimized for analysis.
While relational databases are good at retrieving a small number of records quickly, they are not good at
retrieving a large number of records and summarizing them on the fly. Slow response time and inordinate
use of system resources are common characteristics of decision support applications built exclusively on
top of relational database technology. Because of the ease with which one can issue a "run-away SQL
query," many IS shops do not give users direct access to their relational databases.
Many of the problems that people attempt to solve with relational technology are actually
multidimensional in nature. For example, SQL queries to create summaries of product sales by region,
region sales by product, and so on, could involve scanning most if not all the records in a marketing
database and could take hours of processing. An OLAP server could handle these queries in a few
seconds.
OLTP applications tend to deal with atomized "record-at-a-time" data, whereas OLAP applications usually
deal with summarized data. While OLTP applications generally do not require historical data, nearly every
OLAP application is concerned with viewing trends and therefore requires historical data. Accordingly,
OLAP databases need the ability to handle time-series data -- an attribute that will be discussed in detail
later in this paper. While OLTP applications and databases tend to be organized around specific
processes (such as order entry), OLAP applications tend to be "subject oriented," answering such
questions as "What products are selling well?" or "Where are my weakest sales offices?"
What is Multidimensional Data?
Relational databases are organized around a list of "records." Each record contains related information
that is organized into "fields." A typical example
...
...