Isom 2500 - Cheat Sheet
Essay by crystalllx • May 8, 2017 • Course Note • 2,237 Words (9 Pages) • 1,914 Views
ISOM 2500 1st Midterm Exam Revision Notes
Statistics Terms
- Population: The group of all items of interest under study
- Census: Survey that includes every member in the population
- Parameter: A numerical descriptive measure of a population
- Sample: A subset of the population
- Statistic: A numerical descriptive measure of a sample
- Variable: A characteristic of a population or a sample
- Data (singular: datum): Observed values of a variable
- Observation/Case: The set of measurements obtained for a particular element
- Elements: Entities on which data are collected
Types of Variables
- Categorical/Qualitative variable: A non-numerical variable
- Numerical/Quantitative variable: A numerical variable
- Discrete Variables: Numerical variables that take only countable values, typically integers
- Continuous Variables: Numerical variables that can take any real number within an interval
Categorical/Qualitative variable | Numerical/Quantitative variable |
Nominal Data | Interval Data |
Ordinal Data | Ratio Data |
Nominal Data: Data that are divided into different groups with no natural order (e.g. gender)
**Note that only calculations based on frequencies or percentages of occurrence are valid
Ordinal Data: Categorical data whose values can be sorted or ranked.
**Note that ordinal data may be treated as nominal but not as interval
Interval Data: Data that can be sorted and for which differences can be calculated and interpreted; the zero point is arbitrarily defined.
**Note that interval data can be treated as ordinal or nominal as well
Ratio Data: Data that can be sorted and for which both differences and ratios can be calculated and interpreted; the zero point is not arbitrary.
Type of data | Elements | Variables | Period of Time |
Cross-sectional data | Different | N/A | Same |
Time series data | Same | Same | Different |
Panel or longitudinal data | Different | Different | Different |
Describing One Categorical Variable
Tabular Display: Frequency distribution
Graphical Display: Pie chart, bar chart
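Relative frequency of a category = $\dfrac{\text{frequency of the category}}{n}$, where n is the total number of observations
Percent relative frequency of a category = relative frequency $\times$ 100%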
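As a rough illustration of the two definitions above, the following Python sketch builds a frequency distribution for one categorical variable. The function name `relative_frequencies` and the sample responses are hypothetical and not part of the course notes.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {category: (frequency, relative frequency, percent)} for a list of labels."""
    counts = Counter(values)   # frequency of each category
    n = len(values)            # total number of observations
    return {
        category: (freq, freq / n, 100 * freq / n)
        for category, freq in counts.items()
    }

# Hypothetical sample of 10 responses (made-up data for illustration only).
responses = ["A", "B", "A", "C", "A", "B", "A", "C", "B", "A"]
table = relative_frequencies(responses)

for category in sorted(table):
    freq, rel, pct = table[category]
    print(f"{category}: frequency={freq}, relative={rel:.2f}, percent={pct:.0f}%")
```

The relative frequencies always sum to 1 (and the percentages to 100%), which is a quick sanity check on any frequency table.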
Describing Two Categorical Variables
Tabular Display: Contingency table
Graphical Display: Clustered bar chart
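A contingency table simply counts how many observations fall into each combination of categories of the two variables. A minimal sketch, assuming the two variables are stored as parallel lists; the function name `contingency_table` and the data are made up for illustration.

```python
from collections import Counter

def contingency_table(rows, cols):
    """Count how often each (row category, column category) pair occurs."""
    if len(rows) != len(cols):
        raise ValueError("both variables must have one value per observation")
    return Counter(zip(rows, cols))

# Hypothetical data: two categorical variables recorded for 6 elements.
gender = ["F", "M", "F", "F", "M", "M"]
answer = ["Yes", "No", "Yes", "No", "No", "Yes"]
counts = contingency_table(gender, answer)

# Print the table with one row per gender and one column per answer.
row_labels = sorted(set(gender))
col_labels = sorted(set(answer))
print("      " + "  ".join(f"{c:>4}" for c in col_labels))
for r in row_labels:
    print(f"{r:<6}" + "  ".join(f"{counts[(r, c)]:>4}" for c in col_labels))
```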
Describing One Numerical Variable
Tabular Display: Frequency distribution / summary table
Graphical Display: Dotplot, stem-and-leaf diagram, histogram, polygon and ogive
The number of observations falling in each class is called the class frequency
Steps for constructing a frequency distribution for numerical variable:
- Sort the data
- Determine the range, where the range equals the largest observation minus the smallest observation
- Determine the number of classes k using Sturges' formula: $k = 1 + \log_2 n \approx 1 + 3.3 \log_{10} n$, where n is the sample size; round k up.
- Divide the range by k to determine the class width: $\text{class width} = \text{range} / k$
- Determine the class limits
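The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the base-2 form of Sturges' formula, starts the first class at the smallest observation, treats each class as [lower, upper) with the largest observation placed in the last class, and uses made-up data. The function name `frequency_distribution` is hypothetical.

```python
import math

def frequency_distribution(data):
    """Return (class limits, class frequencies) following the five steps."""
    values = sorted(data)                          # step 1: sort the data
    data_range = values[-1] - values[0]            # step 2: range = max - min
    if data_range == 0:
        raise ValueError("all observations are equal; no classes to build")
    k = math.ceil(1 + math.log2(len(values)))      # step 3: Sturges' formula, rounded up
    width = data_range / k                         # step 4: class width
    # Step 5: class limits, starting at the smallest observation (an assumption).
    limits = [values[0] + i * width for i in range(k + 1)]
    freqs = [0] * k
    for v in values:
        # Each class is [lower, upper); the maximum goes into the last class.
        idx = min(int((v - values[0]) / width), k - 1)
        freqs[idx] += 1
    return limits, freqs

# Hypothetical sample of n = 16 observations (made-up data).
sample = [10, 12, 15, 21, 24, 27, 29, 33, 35, 38, 41, 44, 47, 52, 55, 60]
limits, freqs = frequency_distribution(sample)
for lower, upper, f in zip(limits, limits[1:], freqs):
    print(f"{lower:5.1f} to {upper:5.1f}: {f}")
```

In practice the class width and limits are often rounded to convenient values; the sketch leaves them unrounded to stay close to the steps as written.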
Dotplot
A horizontal scale on which dots are placed to show the numerical values of the data points. If a value repeats, the dots will pile up at that location, one dot for one repetition.
**Note that the dotplot is only useful for small sample size where n is less than 30
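To make the "dots pile up" idea concrete, here is a small text-only sketch of a dotplot. It assumes small non-negative integer data so that each value gets a three-character column; the function name `text_dotplot` and the data are hypothetical.

```python
from collections import Counter

def text_dotplot(data):
    """Return the lines of a text dotplot: stacked dots above a horizontal scale."""
    counts = Counter(data)
    values = range(min(counts), max(counts) + 1)   # the horizontal scale
    height = max(counts.values())                  # tallest pile of dots
    rows = []
    for level in range(height, 0, -1):
        # Place a dot in a column only if that value occurs at least `level` times.
        row = "".join(" * " if counts[v] >= level else "   " for v in values)
        rows.append(row.rstrip())
    rows.append("".join(f"{v:^3}" for v in values).rstrip())
    return rows

# Hypothetical small sample (made-up data).
rows = text_dotplot([1, 2, 2, 3, 3, 3, 5])
print("\n".join(rows))
```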
Stem-and-Leaf Diagram
A partly tabular, partly graphical way of summarizing data. It is suitable for small to moderate data sets (usually fewer than 100 observations).
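A stem-and-leaf diagram can be built by splitting each observation into a stem and a leaf. The sketch below assumes non-negative integers, with the units digit as the leaf and the remaining leading digits as the stem; the function name `stem_and_leaf` and the scores are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each stem (tens and above) to its sorted list of leaves (units digit)."""
    plot = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)   # e.g. 47 -> stem 4, leaf 7
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical exam scores (made-up data).
scores = [52, 47, 63, 58, 41, 69, 55, 60, 47, 73]
plot = stem_and_leaf(scores)

# Print every stem between the smallest and largest, even if it has no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Unlike a histogram, the diagram keeps the individual observations visible, which is why it becomes unwieldy for large data sets.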
Histogram
A bar chart for numerical values in which the area of each bar is proportional to the relative frequency of the corresponding class. It is used when the sample size is greater than or equal to 30. Note that the shape of a histogram can be bell-shaped, positively skewed or negatively skewed.
Sample size less than 30 | Fewer than 100 observations | Sample size of 30 or more |
Dotplot | Stem-and-Leaf Diagram | Histogram |
Polygon
...
...