SPSS Data Statistical Analysis and Practice

minjd

Contributed on 2011-10-27


Chapter 4: Sampling Methods and Graphical Display of Data in SPSS
Lecturer: Associate Professor Zhou Tao (周涛)
2007-9-25
Course website: http://www.ires.cn/Courses/SPSS

Chapter contents:
I. Data types and sampling methods
II. Visual presentation of data

I. Data Types and Sampling Methods

Two Main Statistical Methods
• Descriptive statistics: collecting and describing data.
• Inferential statistics: making decisions based on sample data.

Descriptive Statistics
• Collect data, e.g. a survey
• Present data, e.g. tables and graphs
• Characterize data, e.g. the mean, $\bar{x} = \frac{\sum x_i}{n}$

Inferential Statistics
• Estimation
• Hypothesis testing
Making decisions concerning a population based on sample results.

Types of Data
• Categorical
• Numerical: discrete or continuous

Data Sources
• Primary (data collection): observation, experimentation, survey
• Secondary (data compilation): print or electronic

Types of Sampling Methods
• Non-probability samples: judgement, quota
• Probability samples: simple random, systematic, stratified, cluster

Probability Samples
Subjects of the sample are chosen based on known probabilities. The four types are simple random, systematic, stratified, and cluster samples.

Simple Random Samples
• Every individual or item from the target frame has an equal chance of being selected.
• Selection may be with replacement or without replacement.

Systematic Samples
• Decide on the sample size: n
• Divide the population of N individuals into groups of k individuals: k = N/n
• Randomly select one individual from the first group.
• Select every k-th individual thereafter.
Example: N = 64, n = 8, k = 8.

Stratified Samples
• The population is divided into two or more groups according to some common characteristic.
• A simple random sample is selected from each group.
• The two or more samples are combined into one.

Cluster Samples
• The population is divided into several "clusters", each representative of the population.
• A simple random sample is selected from each.
• The samples are combined into one.
Example: a population divided into 4 clusters.
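The systematic sampling steps described in this section (N = 64, n = 8, k = 8) can be sketched in Python. This is only an illustration, not part of the course material: the function name `systematic_sample` and the numbering of the population from 1 to 64 are assumptions made for the example, and the sketch only handles the case where N is an exact multiple of n.

```python
import random


def systematic_sample(population, n, rng=random):
    """Systematic sample of size n (hypothetical helper, not an SPSS function).

    Assumes len(population) is an exact multiple of n, as in the
    N = 64, n = 8 example.
    """
    N = len(population)
    if n <= 0 or N % n != 0:
        raise ValueError("this sketch assumes N is a positive multiple of n")
    k = N // n                    # group size: k = N / n
    start = rng.randrange(k)      # random position inside the first group
    return population[start::k]   # then every k-th individual thereafter


# Assumed population: individuals numbered 1..64
population = list(range(1, 65))
sample = systematic_sample(population, n=8, rng=random.Random(0))

print(len(sample))               # sample size n
print(sample[1] - sample[0])     # spacing between selected individuals, k
```

Because only the starting position is random, the whole sample is determined by a single draw from the first group.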
Types of Survey Errors
• Coverage error: individuals excluded from selection.
• Non-response error: follow up on non-responses.
• Sampling error: chance differences from sample to sample.
• Measurement error: bad question!

II. Visual Presentation of Data

Organizing Numerical Data (overview)
• Numerical data (raw): 41, 24, 32, 26, 27, 27, 30, 24, 38, 21
• Ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
• Stem-and-leaf display
• Frequency distributions and cumulative distributions, presented as tables, histograms, polygons, and ogives (cumulative curves)

Organizing Numerical Data:
• Data in raw form (as collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
• Data ordered from smallest to largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
• Stem-and-leaf display:
  2 | 1 4 4 6 7 7
  3 | 0 2 8
  4 | 1

[Figure: SPSS stem-and-leaf plot example]

Tabulating Numerical Data:
• Sort raw data in ascending order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find the range: 58 - 12 = 46
• Select the number of classes: 5 (usually between 5 and 15)
• Compute the class interval (width): 10 (46/5 = 9.2, then round up)
• Determine the class boundaries (limits): 10, 20, 30, 40, 50, 60
• Compute the class midpoints: 15, 25, 35, 45, 55
• Count observations and assign them to classes

Tabulating Numerical Data: Frequency Distributions
Data in ordered array: the same 20 values as above.

Class             Frequency   Relative Frequency   Percentage
10 but under 20       3             .15                15
20 but under 30       6             .30                30
30 but under 40       5             .25                25
40 but under 50       4             .20                20
50 but under 60       2             .10                10
Total                20            1                  100

Graphing Numerical Data: The Histogram
Data in ordered array: the same 20 values as above.
[Figure: histogram; x-axis: class midpoints (15, 25, 35, 45, 55); y-axis: frequency (3, 6, 5, 4, 2); no gaps between bars]

[Figure: SPSS histogram example]

Graphing Numerical Data: The Frequency Polygon
Data in ordered array: the same 20 values as above.
[Figure: frequency polygon; x-axis: class midpoints; y-axis: frequency]

Tabulating Numerical Data: Cumulative Frequency
Data in ordered array: the same 20 values as above.

Class             Cumulative Frequency   Cumulative % Frequency
10 but under 20            3                      15
20 but under 30            9                      45
30 but under 40           14                      70
40 but under 50           18                      90
50 but under 60           20                     100

Graphing Numerical Data: The Ogive (Cumulative % Polygon)
Data in ordered array: the same 20 values as above.
[Figure: ogive; x-axis: class boundaries (10 to 60); y-axis: cumulative %]

Organizing Categorical Data (univariate data)
• Tabulating data: the summary table
• Graphing data: bar charts, pie charts, Pareto diagram

Summary Table (for an investor's portfolio)
Variables are categorical.

Investment Category   Amount (in thousands $)   Percentage
Stocks                        46.5                 42.27
Bonds                         32                   29.09
CD                            15.5                 14.09
Savings                       16                   14.55
Total                        110                  100

Bar Chart (for an investor's portfolio)
[Figure: bar chart "Investor's Portfolio"; categories: Stocks, Bonds, CD, Savings; axis: amount in K$]

Pie Chart (for an investor's portfolio)
Amount invested in K$: Stocks 42%, Bonds 29%, Savings 15%, CD 14%.
Percentages are rounded to the nearest percent.

Pareto Diagram
[Figure: Pareto diagram; categories in descending order: Stocks, Bonds, Savings, CD]
The axis for the bar chart shows the % invested in each category. The axis for the line graph shows the cumulative % invested.
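Both the ogive and the line in the Pareto diagram plot a running (cumulative) percentage. The following Python sketch reproduces the class frequencies and cumulative percentages tabulated in this chapter and the cumulative percentages behind the Pareto diagram. The helper names `class_frequencies` and `cumulative_percent` are assumptions made for this illustration; they are not SPSS procedures.

```python
from itertools import accumulate


def class_frequencies(values, start, width, n_classes):
    """Count values in each class [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_classes:
            counts[idx] += 1
    return counts


def cumulative_percent(amounts):
    """Running total of amounts as a percentage of the grand total."""
    total = sum(amounts)
    return [round(100 * c / total, 2) for c in accumulate(amounts)]


# The 20-value data set from the slides, with 5 classes of width 10 from 10.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
freq = class_frequencies(data, start=10, width=10, n_classes=5)
ogive = cumulative_percent(freq)

# The investor's portfolio (thousands of dollars), largest category first,
# which is the ordering a Pareto diagram uses.
portfolio = {"Stocks": 46.5, "Bonds": 32, "CD": 15.5, "Savings": 16}
ranked = sorted(portfolio.items(), key=lambda kv: kv[1], reverse=True)
pareto = cumulative_percent([amount for _, amount in ranked])

print(freq)                          # [3, 6, 5, 4, 2]
print(ogive)                         # [15.0, 45.0, 70.0, 90.0, 100.0]
print([name for name, _ in ranked])  # ['Stocks', 'Bonds', 'Savings', 'CD']
print(pareto)                        # [42.27, 71.36, 85.91, 100.0]
```

The only difference between the two uses is the ordering: the ogive accumulates over classes in numerical order, while the Pareto diagram accumulates over categories sorted from largest to smallest.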
Organizing Categorical Data (bivariate data)

Contingency Table: Investment in Thousands of Dollars

Investment Category   Investor A   Investor B   Investor C   Total
Stocks                   46.5         55           27.5       129
Bonds                    32           44           19          95
CD                       15.5         20           13.5        49
Savings                  16           28            7          51
Total                   110          147           67         324

Organizing Categorical Data (bivariate data): Comparing Investors
[Figure: side-by-side chart; categories: Stocks, Bonds, CD, Savings; series: Investor A, Investor B, Investor C]

[Figure: SPSS pie chart]
[Figure: SPSS line chart]
[Figure: SPSS Pareto diagram]

Supplementary notes on charts: Data in Chart
1. Summaries for Groups of Cases
2. Summaries of Separate Variables
3. Values of Individual Cases

Supplementary notes on charts: line charts
1. Summaries for Groups of Cases: categories of a single variable are summarized.
2. Summaries of Separate Variables: two or more variables are summarized. Each point represents one of the variables.
3. Values of Individual Cases: a single variable is summarized. Each point represents an individual case.

Line chart examples
• Example 1: Summaries for Groups of Cases.
• Example 2: Summaries of Separate Variables, showing the same statistic for several variables.
• Example 3: Summaries of Separate Variables, showing different statistics for a single variable.
• Example 4: Values of Individual Cases, arranged in order of observation.
• Example 5: Values of Individual Cases, arranged by another variable (unsorted).
• Example 6: Values of Individual Cases, arranged by another variable (sorted).

Line chart (drop-line), Example 7
Question: does a certain company discriminate by gender in its employees' starting salaries?
Possible problem: the observed difference may well be affected by the education levels of the men and women.
The education levels of men and women do indeed differ, so the difference in starting salary may be due not to gender but to education level.
Conclusion: …………

END
