AP Statistic Review Sheet
Author: AnonTokyo
Summary: AP Statistics review sheet (frequently tested concepts + explanations)
Last modified: 2025-04-27 16:56:38.028694
Status: Published
Tags: AP, Math
AP statistic
SOP
February 16, 2025
1 Property and Interpretation of common statistics

1.1 Mean

The arithmetic mean of the data ($\mu$ for population parameters, $\bar{x}$ for sample statistics):

$$\bar{x} = \frac{1}{n} \sum_i x_i$$
1.2 Mode

The value that occurs most often.
1.3 Median

The median (Q2) is the value below which 50% of the data falls. For an ordered sequence of length $n$, calculate $\frac{n+1}{2}$ to get the index of the median. If the result is not a whole number, take the two nearest integers as indices. The median is the value at that index, or the average of the two values when there are two indices.
1.4 Q1 and Q3

Q1 is the value below which 25% of the data falls; Q3 is the value below which 75% of the data falls. For an ordered sequence of length $n$, calculate $\frac{n+1}{4}$ and $\frac{3(n+1)}{4}$ to get the indices of Q1 and Q3. For each index, if it is not a whole number, take the two nearest integers as indices. Q1 and Q3 are the values at those indices, or the average of the two values when there are two indices.
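The index rule for the median and quartiles can be sketched in Python. This is a minimal sketch of the rule as stated in this sheet, not a standard library routine: the names `value_at` and `quartiles` are hypothetical, and clamping positions to the range 1..n for very small samples is an added assumption. Calculators and statistical software often use other quartile conventions, so their results may differ slightly.

```python
import math

def value_at(sorted_data, position):
    """Value at a 1-based position; average the two neighbours if it is fractional."""
    n = len(sorted_data)
    # Clamping to 1..n is an assumption so tiny samples do not index out of range.
    lo = min(max(math.floor(position), 1), n)
    hi = min(max(math.ceil(position), 1), n)
    return (sorted_data[lo - 1] + sorted_data[hi - 1]) / 2

def quartiles(data):
    """Return (Q1, median, Q3) using positions (n+1)/4, (n+1)/2 and 3(n+1)/4."""
    xs = sorted(data)
    n = len(xs)
    q1 = value_at(xs, (n + 1) / 4)
    q2 = value_at(xs, (n + 1) / 2)
    q3 = value_at(xs, 3 * (n + 1) / 4)
    return q1, q2, q3

print(quartiles([1, 3, 5, 7, 9, 11, 13]))  # positions 2, 4, 6
print(quartiles([2, 4, 6, 8]))             # positions 1.25, 2.5, 3.75
```

For the four-value example every position is fractional, so each result is the average of two neighbouring values.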
1.5 Range

A measure of the spread of the data (max - min).
1.6 Inter-quartile Range (IQR)

A measure of the range within which the middle 50% of the data falls:

$$IQR = Q_3 - Q_1$$
1.7 Variance

A measure of the dispersion of the data ($\sigma^2$ for population parameters, $s^2$ for sample statistics):

$$s^2 = \frac{1}{n-1} \sum_i (x_i - \bar{x})^2$$
1.8 Standard Deviation

The typical distance of a data value from the mean ($\sigma$ for population parameters, $s$ for sample statistics):

$$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_i (x_i - \bar{x})^2}$$
1.9 Z-score

A standardized value that indicates how many standard deviations a particular data value is from the mean:

$$z_i = \frac{x_i - \bar{x}}{s}$$

- Positive, 0, negative → above the mean, exactly at the mean, below the mean.
- Allows comparison across different distributions with different scales or units.
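As an illustration, z-scores can be computed with Python's standard `statistics` module, whose `stdev` divides by $n-1$ like the formula for $s$. The function name `z_scores` and the data are made up for this sketch.

```python
from statistics import mean, stdev

def z_scores(data):
    """Return the z-score of every value, using the sample standard deviation s."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in data]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5
for x, z in zip(scores, z_scores(scores)):
    side = "above" if z > 0 else "below" if z < 0 else "at"
    print(f"x = {x}: z = {z:+.2f} ({side} the mean)")
```

Because each deviation is divided by the same $s$, the z-scores always sum to zero.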
1.10 Residual

A residual is the difference between the actual value and the expected (predicted) value:

$$e = y - \hat{y}$$

In linear regression, the residuals are expected to have a mean of zero, and a smaller variance of the residuals indicates a better fit.
1.11 Correlation Coefficient

Measures the strength and direction of the linear relationship between two variables:

$$r = \frac{1}{n-1} \sum_i \frac{x_i - \bar{x}}{s_x} \cdot \frac{y_i - \bar{y}}{s_y} = \frac{1}{n-1} \sum z_x \cdot z_y$$

The sign of the correlation coefficient indicates whether the correlation is positive or negative.
Correlation Strength

|r|          Linear Correlation Degree
0.0          No Correlation
0.0 ∼ 0.2    Very Weak
0.2 ∼ 0.4    Weak
0.4 ∼ 0.6    Moderate
0.6 ∼ 0.8    Strong
0.8 ∼ 1.0    Very Strong
1.0          Linear Relationship
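The z-score form of $r$ translates directly into code. The sketch below is illustrative only: the function name `correlation` and the data are made up, and it relies on `statistics.stdev` for the sample standard deviations $s_x$ and $s_y$.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """r = 1/(n-1) * sum of z_x * z_y over all paired observations."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

hours = [1, 2, 3, 4, 5]    # made-up explanatory values
points = [2, 4, 5, 4, 5]   # made-up response values
print(round(correlation(hours, points), 3))
```

For this data $r \approx 0.775$, which the table above would describe as a strong positive linear correlation.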
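1.12 Coefficient of Determination

A measure of the percentage of the variation in the response variable that can be explained by the linear relationship with the explanatory variable:

$$r^2 = 1 - \frac{SSR}{SST} = 1 - \frac{S_R^2}{S_T^2}$$

Here $SSR$ is the sum of squared residuals and $SST$ is the total sum of squares of the response values about their mean.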
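To see the formula $r^2 = 1 - SSR/SST$ at work, the sketch below fits a least-squares line and then compares the squared residuals with the total variation. The function names `least_squares` and `r_squared` and the data are hypothetical; the slope and intercept use the usual least-squares formulas, which this sheet does not spell out.

```python
def least_squares(xs, ys):
    """Slope and intercept of the line that minimizes the sum of squared residuals."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def r_squared(xs, ys):
    """1 - SSR/SST, with SSR = sum of squared residuals e = y - y_hat."""
    slope, intercept = least_squares(xs, ys)
    y_bar = sum(ys) / len(ys)
    ssr = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ssr / sst

xs = [1, 2, 3, 4, 5]   # made-up explanatory values
ys = [2, 4, 5, 4, 5]   # made-up response values
print(least_squares(xs, ys), r_squared(xs, ys))
```

With this data the fitted line is roughly $\hat{y} = 2.2 + 0.6x$ and $r^2 \approx 0.6$, so about 60% of the variation in $y$ is explained by the linear relationship with $x$.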
2 Describe/compare distribution

Use context and comparative language.

2.1 1-dimensional Data (SOCS)

• S - Shape
  - Unimodal, bimodal
  - Skewed left, skewed right [mean > median ⇒ skewed right; mean < median ⇒ skewed left], uniform, symmetric, bell-shaped (write "approximately" when not sure)
• C - Center
  - Median, mean
• S - Spread
  - IQR (Q3 - Q1), range (max - min), standard deviation
• O - Outliers
  A point $x$ is identified as an outlier if it satisfies one of the following criteria (remember to choose only one criterion):

  $$x < Q_1 - 1.5\,IQR \quad \text{or} \quad x > Q_3 + 1.5\,IQR \quad \text{(robust)}$$

  or

  $$x < \bar{x} - 2s \quad \text{or} \quad x > \bar{x} + 2s \quad \text{(not robust)}$$
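Both outlier criteria can be sketched in a few lines of Python. The function names `iqr_outliers` and `two_sd_outliers` are hypothetical, and Q1 and Q3 are passed in as arguments so that whichever quartile convention is in use can be applied.

```python
from statistics import mean, stdev

def iqr_outliers(data, q1, q3):
    """Robust rule: values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

def two_sd_outliers(data):
    """Non-robust rule: values more than 2 sample standard deviations from the mean."""
    x_bar, s = mean(data), stdev(data)
    return [x for x in data if x < x_bar - 2 * s or x > x_bar + 2 * s]

data = [1, 3, 5, 7, 9, 11, 13, 40]       # made-up data
print(iqr_outliers(data, q1=4, q3=12))   # fences at -8 and 24
print(two_sd_outliers(data))
```

Here both rules flag only the value 40. The second rule is less robust because the outlier itself inflates $\bar{x}$ and $s$, which widens the cutoffs.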
2.2 2-dimensional Data

• Direction (Positive or Negative)
• Strength (refer to Part 1: Correlation Coefficient)
• Form (linear or not)
• Unusual Features
3 Graphs

3.1 Bar Graph

• Displays categorical data
• The order of categories is not important!
• Convert to frequency before plotting
3.2 Box-plot

• Displays numerical data
• The box represents Q1, Q3 and the IQR; the line inside the box is Q2 (median); the whiskers extend to the minimum and maximum, excluding outliers (refer to Part 2 for the recognition of outliers); outliers should be marked with an asterisk (*).
• The distribution of the five most important lines (minimum, Q1, median, Q3, maximum) in the box-plot can be revealing. If the lines concentrate on the left side, the distribution is skewed to the right; if the lines concentrate on the right side, the distribution is skewed to the left.
3.3 Histogram

• The x-axis represents the intervals (bins) of the data.
• The y-axis represents the frequency (count) of data points within each bin.
• Most useful for displaying the shape of the distribution of numerical data
3.4 Scatter Plot

• Explanatory variable (usually x); response variable (usually y)
• Clearly label or identify the variables on their axes, and pay attention to units!
• By plotting dots according to their coordinates, we can find the best-fit line and calculate its slope and intercept.
• Features:
  1. Direction: Positive: points tend to rise as you move from left to right; Negative: points tend to fall as you move from left to right.
  2. Strength: refer to the Correlation Coefficient.
  3. Form (Linear, Non-Linear)
  4. Unusual Features (Influential Points):
     – Leverage: distance of the point's x-value from $\bar{x}$
     – Outlier: recognized in the residual plot as a point with a significantly bigger residual than the other points (or recognized when a point is significantly farther from the best-fit line)
3.5 Other Graphs

• Pie chart
  – Helps compare parts of a whole and quickly identify dominant categories.
• Contingency Table
  – Displays the frequency distribution of two categorical variables. Useful in analyzing the relationship between two categorical variables; often used for Chi-square tests of independence.
• Stem plot
  – Recognize or define the common part (often the tens digit) in the data, group the data by that common part, and append the distinctive information of each data point to the row of the common part that matches it. The final graph is similar to a histogram, showing the shape of the distribution of numerical data.
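The stem plot procedure can be sketched as follows. The sketch assumes non-negative whole numbers, with the tens digit (and anything above it) as the stem and the ones digit as the leaf; the function name `stem_plot` is hypothetical.

```python
from collections import defaultdict

def stem_plot(values):
    """Return the rows of a stem plot: stem = value // 10, leaf = value % 10."""
    if not values:
        return []
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    lines = []
    # Keep empty stems so that gaps in the distribution stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

for line in stem_plot([12, 15, 21, 23, 23, 40]):
    print(line)
```

Read sideways, the rows of leaves form the bars of a histogram with bins of width 10.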
4 Probability

4.1 Probability Basis

A Venn diagram is helpful.

Complementary pairs of probabilities under the same given condition sum to 1:

$$P(A \mid B) + P(A^C \mid B) = 1$$
$$P(A \mid B^C) + P(A^C \mid B^C) = 1$$

Test of Independence: If $P(A \cap B) = P(A) \times P(B)$ or $P(A \mid B) = P(A)$ is true, then Event A and Event B are independent.
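These identities can be checked on a small, equally likely sample space. The sketch below uses one roll of a fair die as a made-up example and exact fractions to avoid rounding; the function names `prob`, `cond_prob` and `independent` are hypothetical.

```python
from fractions import Fraction

def prob(event, sample_space):
    """P(event) when all outcomes in the sample space are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def cond_prob(a, b, sample_space):
    """P(A | B) = P(A and B) / P(B)."""
    return prob(a & b, sample_space) / prob(b, sample_space)

def independent(a, b, sample_space):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b, sample_space) == prob(a, sample_space) * prob(b, sample_space)

die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}

print(cond_prob(even, at_most_4, die) + cond_prob(die - even, at_most_4, die))
print(independent(even, at_most_4, die))
print(independent(even, {1, 2, 3}, die))
```

The first line shows a complementary pair summing to 1. "Even" and "at most 4" are independent because $P(A \cap B) = \frac{1}{3} = \frac{1}{2} \times \frac{2}{3}$, while "even" and "at most 3" are not.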
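4.2 Bayes Theorem

The core of Bayes Theorem:

$$\begin{cases} P(A \cap B) = P(A \mid B) \times P(B) = P(B \mid A) \times P(A) \\ P(A) = P(A \cap B) + P(A \cap B^C) \end{cases}$$

Applying the second identity with the roles of $A$ and $B$ swapped gives $P(B) = P(B \mid A) \times P(A) + P(B \mid A^C) \times P(A^C)$. So we can derive:

$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B \mid A) \times P(A) + P(B \mid A^C) \times P(A^C)}$$

Common settings: E: event, P: test positive

$$\text{Sensitivity} = P(P \mid E)$$
$$\text{Specificity} = P(P^C \mid E^C)$$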
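In the testing setting, Bayes Theorem turns sensitivity and specificity into the probability that the event is present given a positive test. The sketch below is illustrative: the function name `prob_event_given_positive` and the numbers are made up.

```python
def prob_event_given_positive(prevalence, sensitivity, specificity):
    """P(E | P) via Bayes Theorem.

    prevalence  = P(E)
    sensitivity = P(P | E)
    specificity = P(P^C | E^C), so P(P | E^C) = 1 - specificity
    """
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)

# Made-up numbers: 1% prevalence, 95% sensitivity, 90% specificity.
print(round(prob_event_given_positive(0.01, 0.95, 0.90), 4))
```

With these numbers $P(E \mid P) \approx 0.088$: even with a fairly accurate test, most positives are false positives when the event is rare.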
5 Linear Regression

Describing scattered dot plots:

• Strong/Weak/Negative
• Associations

Describe the use of linear regression: Uses an explanatory variable, $x$, to predict the response variable, $y$.

Describe least-squares regression: Minimizes the sum of the squares of the residuals.

Describe $r$ and $r^2$:
• $r$ is the coefficient of correlation, which indicates both the direction and strength of the linear relationship.
• $r^2$ is the coefficient of determination, which describes the proportion of variation in the response variable that is explained by the explanatory variable in the model.
6 Randomly assign subjects

Describe how to randomly select/assign subjects:

1. Number all subjects from 1 to $n$.
2. Use a random number generator to generate integers in the range from 1 to $n$.
3. Select the subject corresponding to each random number generated.
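The three steps can be sketched with Python's standard `random` module. The function names `select_subjects` and `assign_groups` are hypothetical, and two assumptions go beyond the text: selection is done without replacement (equivalent to ignoring repeated numbers), and assignment splits the subjects into two groups of nearly equal size.

```python
import random

def select_subjects(n, k, seed=None):
    """Randomly select k distinct subject numbers from 1..n."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(range(1, n + 1), k)

def assign_groups(n, seed=None):
    """Randomly split subjects 1..n into a treatment group and a control group."""
    rng = random.Random(seed)
    labels = list(range(1, n + 1))
    rng.shuffle(labels)
    half = n // 2
    return labels[:half], labels[half:]

print(select_subjects(20, 5, seed=1))
treatment, control = assign_groups(20, seed=1)
print(sorted(treatment), sorted(control))
```

Shuffling the whole list and cutting it in half gives every subject the same chance of ending up in either group.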
7 Design an experiment

Describe how to design an experiment:

• Determine variables (which are explanatory and which are response) (be careful with confounding variables!).
• Determine experiment method:
  – Single blind: Subjects don't know which treatment they receive.
  – Double blind: Neither the subjects nor the research members know which treatment each subject receives.
  – Block: Group similar subjects into blocks, then randomly assign treatments within each block.
  – Matched Pair: Set subjects A and B as a block, assuming A and B are similar; randomly assign one treatment to A and assign the other treatment to B.
• Determine control groups: Use a placebo or just do not give the treatment to some group of subjects.
8 Construct and interpret a confidence interval

Describe what a confidence interval is: A confidence interval is a range of values used to estimate a population parameter.
8.1 Construct confidence interval for population proportion

Conditions:

1. Random sample.
2. Sample size $n$ is less than 10% of the population.
3. Both counts of success $n\hat{p}$ and failure $n(1-\hat{p})$ are at least 10.

Where $\hat{p}$ is the sample proportion, and $n$ is the sample size.

The confidence interval for a population proportion $p$ is given by:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
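As a worked illustration, the interval can be computed with the standard `statistics.NormalDist` class, which supplies $z^*$ through its inverse CDF. The function name `proportion_ci` and the sample counts are made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """One-sample z interval for a population proportion."""
    p_hat = successes / n
    # z* leaves (1 - confidence)/2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Made-up sample: 60 successes out of 100 (both counts are at least 10).
low, high = proportion_ci(60, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

For this sample the interval is roughly (0.504, 0.696). A higher confidence level gives a larger $z^*$ and therefore a wider interval.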
8.2 Construct confidence interval for the difference of two population proportions

Conditions:

1. The two populations are independent.
2. Random sample.
3. Sample sizes $n_1$, $n_2$ are each less than 10% of their populations.
4. Both samples have counts of success $n_1\hat{p}_1$, $n_2\hat{p}_2$ and failure $n_1(1-\hat{p}_1)$, $n_2(1-\hat{p}_2)$ of at least 10.

The confidence interval for the difference of two population proportions $p_1$, $p_2$ is given by:

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Where $\hat{p}_1$, $\hat{p}_2$ are sample proportions, and $n_1$, $n_2$ are sample sizes.
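A minimal sketch of the two-proportion interval, again using `statistics.NormalDist` for $z^*$. The function name `two_proportion_ci` and the counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """z interval for p1 - p2, with each sample contributing its own variance term."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Made-up samples: 60/100 successes versus 45/100 successes.
low, high = two_proportion_ci(60, 100, 45, 100)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

Here the interval is roughly (0.013, 0.287). Because it lies entirely above 0, the data are consistent with $p_1 > p_2$ at this confidence level.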
8.3 Construct confidence interval for population means

Conditions:

1. Random sample.
2. Sample size $n$ is less than 10% of the population.
3. Sample size $n \geq 30$ OR the population is approximately normally distributed OR the sample has no strong skewness or outliers.

The confidence interval for a population mean $\mu$ is given by:

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$

Where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is the sample size, and degrees of freedom $df = n - 1$.
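The sketch below computes the interval from summary statistics. Python's standard library has no t-distribution, so $t^*$ is passed in as a value looked up in a t table; the function name `mean_ci` and the summary numbers are made up.

```python
from math import sqrt

def mean_ci(x_bar, s, n, t_star):
    """One-sample t interval; t_star must be looked up for df = n - 1."""
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Made-up summary: x_bar = 50, s = 10, n = 25, so df = 24.
# For 95% confidence and df = 24 a t table gives t* = 2.064.
low, high = mean_ci(50, 10, 25, t_star=2.064)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The margin of error is $2.064 \times 10 / \sqrt{25} = 4.128$, so the interval is (45.872, 54.128).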
8.4 Construct confidence interval for the difference of two population means

Conditions:

1. The two populations are independent.
2. Random sample.
3. Sample sizes $n_1$, $n_2$ are each less than 10% of their populations.
4. Sample sizes $n_1, n_2 \geq 30$ OR both populations are approximately normally distributed OR both samples have no strong skewness or outliers.

The confidence interval for the difference of two population means $\mu_1$, $\mu_2$ is given by:

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Where $\bar{x}_1$, $\bar{x}_2$ are sample means, $s_1$, $s_2$ are sample standard deviations, $n_1$, $n_2$ are sample sizes, and degrees of freedom $df$ = the smaller of $n_1 - 1$ and $n_2 - 1$.
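A sketch of the two-sample interval with the conservative degrees of freedom described above. As before, $t^*$ comes from a t table; the function names `conservative_df` and `two_mean_ci` and the summary numbers are made up.

```python
from math import sqrt

def conservative_df(n1, n2):
    """The smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def two_mean_ci(x_bar1, s1, n1, x_bar2, s2, n2, t_star):
    """Two-sample t interval for mu1 - mu2; t_star is looked up for conservative_df."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x_bar1 - x_bar2
    return diff - t_star * se, diff + t_star * se

# Made-up summaries: group 1 (52, 8, n=16), group 2 (47, 6, n=25).
df = conservative_df(16, 25)   # min(15, 24) = 15
# For 95% confidence and df = 15 a t table gives t* = 2.131.
low, high = two_mean_ci(52, 8, 16, 47, 6, 25, t_star=2.131)
print(df, f"({low:.2f}, {high:.2f})")
```

The standard error is $\sqrt{64/16 + 36/25} = \sqrt{5.44}$, giving an interval of roughly (0.03, 9.97).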
9 Hypothesis testing

9.1 Describe hypothesis testing

1. Assume $H_0$ is valid.
2. Calculate the probability $P$ of observing a result at least as extreme as the sample result, assuming $H_0$ (the p-value).
3. Compare $P$ with the significance level $\alpha$.
4. If $P < \alpha$, reject $H_0$. Otherwise, fail to reject $H_0$.
9.2 Hypothesis test for population proportion

Conditions:

1. Random sample.
2. Sample size $n$ is less than 10% of the population.
3. Both counts of success $np_0$ and failure $n(1-p_0)$ are at least 10.

The test statistic for a hypothesis test about a population proportion $p$ is given by:

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$

Where $\hat{p}$ is the sample proportion, $p_0$ is the hypothesized proportion under $H_0$, and $n$ is the sample size.
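The test statistic and its p-value can be sketched with `statistics.NormalDist`. The function name `one_proportion_z_test`, its `alternative` argument and the sample counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = std_normal.cdf(z)
    else:                             # Ha: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value

# Made-up sample: 60 successes out of 100, testing H0: p = 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Here $z = 2$ and the two-sided p-value is about 0.0455, so $H_0$ would be rejected at $\alpha = 0.05$ but not at $\alpha = 0.01$.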
9.3 Hypothesis test for the difference of two population proportions

Conditions:

1. The two populations are independent.
2. Random sample.
3. Sample sizes $n_1$, $n_2$ are each less than 10% of their populations.
4. Both samples have counts of success $n_1\hat{p}_1$, $n_2\hat{p}_2$ and failure $n_1(1-\hat{p}_1)$, $n_2(1-\hat{p}_2)$ of at least 10.

The test statistic for a hypothesis test about the difference of population proportions $p_1$, $p_2$ is given by:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Where $\hat{p}_1$, $\hat{p}_2$ are sample proportions, $n_1$, $n_2$ are sample sizes, and $\hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$ is the combined proportion.
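A sketch of the pooled two-proportion test. Since $n_1\hat{p}_1$ and $n_2\hat{p}_2$ are just the success counts, the combined proportion is the total number of successes over the total sample size. The function name `two_proportion_z_test` and the counts are made up, and only the two-sided p-value is shown.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p_value) for H0: p1 = p2 using the combined proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)   # same as (n1*p1 + n2*p2) / (n1 + n2)
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up samples: 60/100 successes versus 45/100 successes.
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

For these counts $\hat{p}_c = 0.525$, $z \approx 2.12$ and the p-value is about 0.034.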
9.4 Hypothesis test for population means

Conditions:

1. Random sample.
2. Sample size $n$ is less than 10% of the population.
3. Sample size $n \geq 30$ OR the population is approximately normally distributed OR the sample has no strong skewness or outliers.

The test statistic for a hypothesis test about a population mean $\mu$ is given by:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Where $\bar{x}$ is the sample mean, $\mu_0$ is the hypothesized mean under $H_0$, $s$ is the sample standard deviation, $n$ is the sample size, and degrees of freedom $df = n - 1$.
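The sketch below computes the $t$ statistic from raw data. Python's standard library has no t-distribution, so it stops at $t$ and $df$, which are then compared with a t table (or a calculator) to get the p-value. The function name `one_sample_t` and the data are made up.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """Return (t, df) for H0: mu = mu0."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1

# Made-up data with x_bar = 52 and s = sqrt(10), testing H0: mu = 50.
t, df = one_sample_t([48, 50, 52, 54, 56], mu0=50)
print(f"t = {t:.3f}, df = {df}")
```

Here $t \approx 1.414$ with $df = 4$. For a one-sided test at $\alpha = 0.05$ the table value is 2.132, so this sample would not lead to rejecting $H_0$.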
9.5 Hypothesis test for the difference of two population means

Conditions:

1. The two populations are independent.
2. Random sample.
3. Sample sizes $n_1$, $n_2$ are each less than 10% of their populations.
4. Sample sizes $n_1, n_2 \geq 30$ OR both populations are approximately normally distributed OR both samples have no strong skewness or outliers.

The test statistic for a hypothesis test about the difference of population means $\mu_1$, $\mu_2$ is given by:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Where $\bar{x}_1$, $\bar{x}_2$ are sample means, $s_1$, $s_2$ are sample standard deviations, $n_1$, $n_2$ are sample sizes, and degrees of freedom $df$ = the smaller of $n_1 - 1$ and $n_2 - 1$.
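A sketch of the two-sample statistic from summary values, using the conservative degrees of freedom. The function name `two_sample_t` and the numbers are made up; the p-value would again come from a t table or calculator.

```python
from math import sqrt

def two_sample_t(x_bar1, s1, n1, x_bar2, s2, n2):
    """Return (t, df) for H0: mu1 = mu2, with df = min(n1 - 1, n2 - 1)."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (x_bar1 - x_bar2) / se
    return t, min(n1 - 1, n2 - 1)

# Made-up summaries: group 1 (52, 8, n=16), group 2 (47, 6, n=25).
t, df = two_sample_t(52, 8, 16, 47, 6, 25)
print(f"t = {t:.3f}, df = {df}")
```

Here $t \approx 2.144$ with $df = 15$. The two-sided table value at $\alpha = 0.05$ is 2.131, so this result is only just significant at that level.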
10 Bias / Error Identification

10.1 Bias Identification

• Selection Bias (Occurs when some groups of people have a low chance of being chosen, or some people are not included in the assumed population).
• Non-response (When some groups do not respond to the research, introducing differences between people who responded and people who did not respond, probably introducing other variables such as access to the Internet).
• Voluntary Bias (When some groups are more inclined to take part in research; they might carry systematic differences in their features compared to the overall population).

How to reduce bias:

• Increase the randomness of the sampling process.
• (Double-)Blind experiments.
• Stratified sampling / Cluster sampling.
• Common Response: Increasing sample size.
10.2 Error Identification

Errors are common in hypothesis testing, and it is also important for us to recognize the potential underlying bias. Remember that the significance level ($\alpha$) defines what counts as "impossible".

                 H0 is true           H0 is false
Reject H0        Type I Error (α)     Happy ending! (power)
Not reject H0    Happy ending!        Type II Error (β)
10.2.1 Type I Error

• Probability: $\alpha$, the significance level, is the probability of this error type.
• Ways to reduce possibility:
  – Set a smaller significance level ($\alpha$).
  – Common Response: Increase sample size.
  – Common Response: Increase the number of experiments to verify the conclusion.
10.2.2 Type II Error

• Probability: $\beta = 1 - \text{power}$, where power is the probability of correctly rejecting $H_0$ when it is false.
• Ways to reduce possibility:
  – Improve the power of the test by improving data quality or by using a test with higher sensitivity.
  – Setting a bigger significance level ($\alpha$) may help to reduce Type II Error, but will consequently increase Type I Error.
  – Common Response: Increase sample size.
  – Common Response: Increase the number of experiments to verify the conclusion.
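The trade-off between $\alpha$, $\beta$ and sample size can be made concrete with a small power calculation. The sketch below assumes a one-sided one-proportion z test ($H_0: p = p_0$ against $H_a: p > p_0$) and the normal approximation; the function name `power_one_sided` and the numbers are made up.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(p0, p_true, n, alpha=0.05):
    """Approximate power of the test H0: p = p0 vs Ha: p > p0 when the truth is p_true."""
    std_normal = NormalDist()
    # Smallest sample proportion that leads to rejecting H0 at level alpha.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Probability that the sample proportion exceeds the cutoff under p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - std_normal.cdf((cutoff - p_true) / se_true)

for n, alpha in [(100, 0.05), (100, 0.10), (400, 0.05)]:
    power = power_one_sided(0.5, 0.6, n, alpha)
    print(f"n = {n}, alpha = {alpha}: power = {power:.3f}, beta = {1 - power:.3f}")
```

With $p_0 = 0.5$ and a true proportion of 0.6, both a larger $\alpha$ and a larger sample raise the power and so lower $\beta$. When the true proportion equals $p_0$, the same function returns $\alpha$, which is the Type I Error rate.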