WINETASTER ON 10/02/06 WITH 8 JUDGES AND 5 WINES BASED ON RANKS, IDENT=N
Copyright (c) 1995-2006 Richard E. Quandt, V. 1.65
FLIGHT 1:
Number of Judges = 8
Number of Wines = 5
Identification of the Wine: The judges' overall ranking:
Wine A is Grace 2002 ........ 4th place
Wine B is Cardinale 2002 ........ 3rd place
Wine C is Rudd 2002 ........ 5th place
Wine D is Heitz 2001 ........ 2nd place
Wine E is Poetry 2001 ........ 1st place
The Judges' Rankings
Judge Wine -> A B C D E
Ed 5. 3. 4. 2. 1.
Bob 5. 1. 4. 2. 3.
Mike 5. 4. 3. 2. 1.
Frank 2. 1. 5. 4. 3.
Burt 3. 4. 5. 2. 1.
Orley 5. 1. 4. 2. 3.
John 3. 4. 5. 2. 1.
Dick 1. 5. 3. 4. 2.
Table of Votes Against
Wine -> A B C D E
Group Ranking -> 4 3 5 2 1
Votes Against -> 29 23 33 20 15
( 8 is the best possible, 40 is the worst)
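The "votes against" figures are simply the column sums of the rankings table. A minimal Python sketch, assuming the ranks are transcribed as below (the names WINES, RANKS, votes_against and group_ranking are illustrative and not part of the WINETASTER program):

```python
# Ranks transcribed from the judges' rankings table (1 = best, 5 = worst).
WINES = ["A", "B", "C", "D", "E"]
RANKS = {
    "Ed":    [5, 3, 4, 2, 1],
    "Bob":   [5, 1, 4, 2, 3],
    "Mike":  [5, 4, 3, 2, 1],
    "Frank": [2, 1, 5, 4, 3],
    "Burt":  [3, 4, 5, 2, 1],
    "Orley": [5, 1, 4, 2, 3],
    "John":  [3, 4, 5, 2, 1],
    "Dick":  [1, 5, 3, 4, 2],
}


def votes_against(ranks):
    """Sum each wine's ranks over all judges (a lower total is better)."""
    return [sum(column) for column in zip(*ranks.values())]


def group_ranking(totals):
    """Place 1 goes to the lowest total. Assumes no two totals are tied."""
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    places = [0] * len(totals)
    for place, wine_index in enumerate(order, start=1):
        places[wine_index] = place
    return places


totals = votes_against(RANKS)
places = group_ranking(totals)
print(dict(zip(WINES, totals)))
print(dict(zip(WINES, places)))
```

With 8 judges, a wine ranked first by everyone would total 8 and a wine ranked last by everyone would total 40, which is where the stated bounds come from.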
Here is a measure of the correlation in the preferences of the judges which
ranges between 1.0 (perfect correlation) and 0.0 (no correlation):
W = 0.3187
The probability that random chance could be responsible for this correlation
is quite small, 0.0372. Most analysts would say that unless this
probability is less than 0.1, the judges' preferences are not strongly
related.
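The value of W can be reproduced from the rankings with the standard formula for Kendall's coefficient of concordance without ties, W = 12 S / (m^2 (n^3 - n)), where m is the number of judges, n the number of wines, and S the sum of squared deviations of the rank sums from their mean. The sketch below assumes this untied formula; the function name kendall_w is illustrative.

```python
# Each row is one judge's ranks for wines A..E (1 = best), in the order
# Ed, Bob, Mike, Frank, Burt, Orley, John, Dick.
RANK_ROWS = [
    [5, 3, 4, 2, 1],
    [5, 1, 4, 2, 3],
    [5, 4, 3, 2, 1],
    [2, 1, 5, 4, 3],
    [3, 4, 5, 2, 1],
    [5, 1, 4, 2, 3],
    [3, 4, 5, 2, 1],
    [1, 5, 3, 4, 2],
]


def kendall_w(rank_rows):
    """Kendall's W for complete rankings without ties."""
    m = len(rank_rows)        # number of judges
    n = len(rank_rows[0])     # number of wines
    rank_sums = [sum(column) for column in zip(*rank_rows)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


w = kendall_w(RANK_ROWS)
print(f"W = {w:.5f}")
```

Here the rank sums are 29, 23, 33, 20 and 15 around a mean of 24, so S = 204 and W = 12 x 204 / (64 x 120) = 0.31875, in line with the reported 0.3187.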
We now analyze how each taster's preferences are correlated with the group
preference. A correlation of 1.0 means that the taster's preferences are a
perfect predictor of the group's preferences. A 0.0 means no correlation,
while a -1.0 means that the taster has the reverse ranking of the group.
This is measured by the correlation R.
Correlation Between the Ranks of
Each Person With the Average Ranking of Others
Name of Person Correlation R Correlation Price
Ed 0.9000 -0.9000
John 0.9000 -0.9000
Burt 0.9000 -0.9000
Mike 0.7000 -0.7000
Bob 0.5000 -0.5000
Orley 0.5000 -0.5000
Frank 0.2000 -0.2000
Dick -0.1000 0.1000
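The R column can be reproduced under one plausible reading of the table heading: for each judge, sum the other seven judges' ranks per wine, convert those sums to a ranking, and take the rank correlation with the judge's own ranks. This is an assumption about the method, not a statement of how WINETASTER computes it; the helper names below are illustrative.

```python
RANKS = {
    "Ed":    [5, 3, 4, 2, 1],
    "Bob":   [5, 1, 4, 2, 3],
    "Mike":  [5, 4, 3, 2, 1],
    "Frank": [2, 1, 5, 4, 3],
    "Burt":  [3, 4, 5, 2, 1],
    "Orley": [5, 1, 4, 2, 3],
    "John":  [3, 4, 5, 2, 1],
    "Dick":  [1, 5, 3, 4, 2],
}


def average_ranks(values):
    """Rank ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; applied to ranks it gives Spearman's rho."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def corr_with_others(ranks, judge):
    """Correlate one judge's ranks with the ranking implied by all other judges."""
    other_rows = [row for name, row in ranks.items() if name != judge]
    other_sums = [sum(column) for column in zip(*other_rows)]
    return pearson(ranks[judge], average_ranks(other_sums))


r_values = {judge: round(corr_with_others(RANKS, judge), 4) for judge in RANKS}
for judge, r in sorted(r_values.items(), key=lambda item: -item[1]):
    print(f"{judge:6s} {r:7.4f}")
```

Under this reading the computed values agree with the R column above, from 0.9 for Ed, John and Burt down to -0.1 for Dick.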
The wines were preferred by the judges in the following order. When the
preferences of the judges are strong enough to permit meaningful differentiation
among the wines, they are separated by -------------------- and are judged to be
significantly different.
1. ........ 1st place Wine E is Poetry 2001
---------------------------------------------------
2. ........ 2nd place Wine D is Heitz 2001
3. ........ 3rd place Wine B is Cardinale 2002
4. ........ 4th place Wine A is Grace 2002
---------------------------------------------------
5. ........ 5th place Wine C is Rudd 2002
We now test whether the ranksums AS A WHOLE provide a significant ordering.
The Friedman Chi-square value is 10.2000. The probability that this could
happen by chance is 0.0372
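The Friedman statistic and its probability can be checked from the rank sums alone. The sketch below uses the textbook form chi2 = 12 / (m n (n + 1)) x sum(R_j^2) - 3 m (n + 1), with n - 1 degrees of freedom, and the closed-form chi-square tail that holds for an even number of degrees of freedom. Function names are illustrative.

```python
import math

# Votes against for wines A..E, from the table of votes against.
RANK_SUMS = [29, 23, 33, 20, 15]
NUM_JUDGES = 8


def friedman_chi_square(rank_sums, num_judges):
    """Friedman statistic for complete rankings without ties."""
    n = len(rank_sums)
    sum_sq = sum(r ** 2 for r in rank_sums)
    return 12 / (num_judges * n * (n + 1)) * sum_sq - 3 * num_judges * (n + 1)


def chi_square_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; valid only when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


chi2 = friedman_chi_square(RANK_SUMS, NUM_JUDGES)
p_value = chi_square_tail_even_df(chi2, df=len(RANK_SUMS) - 1)
print(f"chi-square = {chi2:.4f}, p = {p_value:.4f}")
```

The statistic also equals m (n - 1) W = 8 x 4 x 0.31875 = 10.2, which is why the probability here is the same 0.0372 quoted for W.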
We now test whether the group ranking of wines is correlated with the
prices of the wines. The rank correlation between them is -1.0000. At the
10% level of significance this would have to exceed, in absolute value, the
critical value of 0.8000 to be significant.
We now undertake a more detailed examination of the pair-wise rank correlations
that exist between pairs of judges. First, we present a table in which you
can find the correlation for any pair of judges, by finding one of the names in the
left hand margin and the other name on top of a column. A second table arranges
these correlations in descending order and marks each as significantly positive,
significantly negative, or not significant. This may allow you to find clusters
of judges whose rankings were particularly similar or particularly dissimilar.
Pairwise Rank Correlations
Correlations must exceed in absolute value 1.00 for significance at the 0.05
level and must exceed 0.90 for significance at the 0.1 level
Ed Bob Mike
Ed 1.000 0.600 0.900
Bob 0.600 1.000 0.300
Mike 0.900 0.300 1.000
Frank -0.100 0.300 -0.500
Burt 0.700 0.100 0.600
Orley 0.600 1.000 0.300
John 0.700 0.100 0.600
Dick -0.300 -0.900 -0.100
Frank Burt Orley
Ed -0.100 0.700 0.600
Bob 0.300 0.100 1.000
Mike -0.500 0.600 0.300
Frank 1.000 0.100 0.300
Burt 0.100 1.000 0.100
Orley 0.300 0.100 1.000
John 0.100 1.000 0.100
Dick -0.100 0.300 -0.900
John Dick
Ed 0.700 -0.300
Bob 0.100 -0.900
Mike 0.600 -0.100
Frank 0.100 -0.100
Burt 1.000 0.300
Orley 0.100 -0.900
John 1.000 0.300
Dick 0.300 1.000
Pairwise correlations in descending order
1.000 Bob and Orley Significantly positive
1.000 Burt and John Significantly positive
0.900 Ed and Mike Significantly positive
0.700 Ed and Burt Not significant
0.700 Ed and John Not significant
0.600 Ed and Bob Not significant
0.600 Mike and John Not significant
0.600 Ed and Orley Not significant
0.600 Mike and Burt Not significant
0.300 Bob and Frank Not significant
0.300 John and Dick Not significant
0.300 Bob and Mike Not significant
0.300 Frank and Orley Not significant
0.300 Burt and Dick Not significant
0.300 Mike and Orley Not significant
0.100 Frank and John Not significant
0.100 Bob and Burt Not significant
0.100 Bob and John Not significant
0.100 Frank and Burt Not significant
0.100 Burt and Orley Not significant
0.100 Orley and John Not significant
-0.100 Ed and Frank Not significant
-0.100 Frank and Dick Not significant
-0.100 Mike and Dick Not significant
-0.300 Ed and Dick Not significant
-0.500 Mike and Frank Not significant
-0.900 Bob and Dick Significantly negative
-0.900 Orley and Dick Significantly negative
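The pairwise table can be rebuilt with Spearman's formula rho = 1 - 6 x sum(d^2) / (n (n^2 - 1)), which is exact here because every judge gave a strict ranking with no ties. In the sketch below the labelling rule (|rho| of at least 0.90 after rounding) is an assumption inferred from the labels in the list above, and the function names are illustrative.

```python
from itertools import combinations

RANKS = {
    "Ed":    [5, 3, 4, 2, 1],
    "Bob":   [5, 1, 4, 2, 3],
    "Mike":  [5, 4, 3, 2, 1],
    "Frank": [2, 1, 5, 4, 3],
    "Burt":  [3, 4, 5, 2, 1],
    "Orley": [5, 1, 4, 2, 3],
    "John":  [3, 4, 5, 2, 1],
    "Dick":  [1, 5, 3, 4, 2],
}

# Assumed cutoff: the 0.1-level critical value quoted above the table.
CRITICAL_VALUE = 0.90


def spearman_no_ties(x, y):
    """Spearman rank correlation for two strict rankings of the same items."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def label(rho):
    """Classify a correlation; rounding guards against float noise at 0.9."""
    rho = round(rho, 3)
    if rho >= CRITICAL_VALUE:
        return "Significantly positive"
    if rho <= -CRITICAL_VALUE:
        return "Significantly negative"
    return "Not significant"


pairs = {
    (a, b): round(spearman_no_ties(RANKS[a], RANKS[b]), 3)
    for a, b in combinations(RANKS, 2)
}
for (a, b), rho in sorted(pairs.items(), key=lambda item: -item[1]):
    print(f"{rho:6.3f} {a} and {b} {label(rho)}")
```

Sorting by correlation puts Bob and Orley, and Burt and John, at the top with identical rankings, and the two pairs involving Dick at the bottom.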
COMMENT:
All the wines were quite amazing. They had a substantially similar
bouquet but did not taste identical by any means. Nevertheless, the
tasters claimed that they had difficulty in distinguishing among the
wines. On the whole, the Poetry was deemed to be significantly good and
the Rudd was thought to be significantly bad. One taster judged the
wine that was second worst in the aggregate as being first. The real
question was whether the tasters could differentiate among the very
expensive wines (ranging from $105 to $159) and the relatively inexpensive
Heitz costing only $45. In fact, the Heitz was the second highest ranked wine,
which suggests that the higher priced wines are substantially overpriced.
The tasters were asked to identify the Heitz in a
secret ballot, and only one out of eight tasters succeeded in identifying
this wine. Every taster's preferences among the wines were negatively
correlated with the wine prices except for the one contrarian taster who
ranked the Grace first.
It is worth mentioning that this is a landmark tasting in that it is the
100th tasting since we have started to record the tastings and the statistical
results in a systematic way. The only noteworthy observation we can make is that
a statistical analysis of the results of the tastings suggests, on the basis of the
Kendall W-coefficients (or rather, of the p-values corresponding to these coefficients),
that the degree of agreement among the tasters has not increased over time; that
is to say, we have not learned from each other and have not adopted over time the
tasting standards of other tasters.