Traditional sensory difference tests demonstrate whether a difference exists or not between two flavors. To determine the degree of difference requires further scaling. The two processes can be combined for perceptually small differences using difference tests based on signal detection measures.
SHORT-CUT SIGNAL DETECTION MEASURES FOR SENSORY ANALYSIS
M. A. P. D. O'MAHONY

ABSTRACT
Traditional sensory difference tests demonstrate whether a difference exists or not between two flavors; to determine the degree of difference requires further scaling. The two processes can be combined for perceptually small differences using difference tests based on signal detection measures. Such measures provide a measure of degree of difference directly which, being a probability value, is susceptible to analysis by parametric statistics. Traditional signal detection measures are complex and time consuming, but the present paper outlines a short and simple means of calculating such measures.

THE COMMONLY USED flavor difference tests (triangle, pair comparison, duo-trio, etc.; Amerine et al., 1965) are designed to determine whether a difference occurs between the flavors of two foods. Should there be a difference, it is sometimes useful to determine the degree of difference. For perceptually large differences this can be achieved using traditional scaling methods (Stevens, 1960), but for very slight differences the judges' lack of skill in using numbers may create enough variance to swamp any slight differences. Further, there is circumstantial evidence in sensory psychophysics to suggest that humans are so unskilled in their use of numbers as to call into doubt the use of parametric statistics in analysing numerical data generated by them; this latter point, however, remains controversial.

However, the degree of difference for these small differences can be measured by using the so-called Signal Detection measures (Green and Swets, 1966), which are difference tests yielding a measure of the degree of difference directly. The most general signal detection measure applicable to difference testing is the index P(A), which is sometimes denoted as R by those who use a short-cut procedure to obtain it (Brown, 1974).
The traditional techniques for finding P(A) are too lengthy for sensory analysis, but the shorter technique, developed for studies on memory (Brown, 1974), is very applicable. Basically R or P(A) can be visualised as a probability value: the probability of correctly choosing one of two samples in a pair comparison task. It has the advantage that in its determination judges are not required to generate numbers, so avoiding the reliability issue; they are merely required to state whether they are sure of their judgement or not. From these simple data can be calculated a probability value, more safely amenable to parametric analysis than scaling data.

Author O'Mahony is with the Dept. of Food Science & Technology, University of California, Davis, CA 95616. 0022-1147/79/0001-0302$02.25/0. © 1979 Institute of Food Technologists. Journal of Food Science, Vol. 44, No. 1 (1979), p. 302.

P(A) or R can be calculated simply as follows. Let us assume a judge tastes 10 samples of foodstuff B and 10 samples of a reformulation A in random order. Immediately the sensory analyst will protest the large number of samples per judge which, though desirable, can usually only be tasted in the setting of an academic laboratory. However, the required number of samples (say 20) could be obtained by spreading the load evenly over replications (taste 4 over 5 replications) and/or subjects (5 subjects, 4 replications each, etc.), thus obtaining a composite panel R-value rather than one per judge. It is advisable here that the data required for the calculation be distributed evenly over judges and replications, rather than allowing one judge or replication an unevenly large influence over the final value.

Thus the judge (or panel over replications) tastes, say, 10 samples of foodstuff B and 10 samples of a reformulation, A, in random order. He is required to rate each of the 20 samples as definitely A (A), perhaps A (A?), perhaps B (B?) or definitely B (B).
Let us suppose that when tasting A he rated eight samples A, one A? and one B?, and when tasting B he rated seven as B, two B? and one A?. The results can be summarized in the response matrix (Fig. 1).

            Response
Sample    A    A?   B?   B
A         8    1    1    0
B         0    1    2    7

Fig. 1. Response matrix.

To calculate R, we now predict from these data what might happen in a hypothetical experiment should each of the A samples be presented in pair comparison with each of the B samples. How many times would A be correctly identified (out of 10 x 10 = 100 pair comparisons)?

Let us consider the A samples rated as definitely A (A). They would be correctly identified when paired with any of the B samples which were rated A?, B? or B. Even though one of the B samples was identified as A?, it would still be chosen as B when compared to A samples rated as A. So this gives us 8 x (1 + 2 + 7) = 80 correct identifications of A, so far. The A sample rated A? would be correctly identified if compared with the two B samples rated B? or the seven samples rated B (another (2 + 7) x 1 = 9 correct), but when compared with the B sample rated A? the subject would not know which to choose, because they were both rated the same by him (so score one comparison as "don't know"). Similarly, the A sample rated B? would be correctly identified when compared with the seven B samples rated B (another 7 correct), but again the subject would be undecided with the B samples rated B? (score another 2 "don't knows").

So the predicted final tally of pair comparisons is 80 + 9 + 7 = 96 correct identifications of A and 3 "don't knows"; the one remaining pair comparison would be totally wrong, namely the A sample rated B? when compared with the B sample rated A?. It is assumed that when the subject is undecided he guesses correctly half the time. This makes the final score 96 + 3/2 = 97.5 correct identifications out of 100. This score is the R-index or P(A). Hence R = 97.5% or 0.975.
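
The tally just described can be reproduced in a few lines of Python. This is only an illustrative sketch and not part of the original paper: the function name `r_index` and the list layout (counts ordered A, A?, B?, B) are assumptions made for the example. It pairs every A sample with every B sample, scores a pair as correct when the A sample was rated nearer the "definitely A" end, as a "don't know" when both received the same rating, and as wrong otherwise, with each "don't know" counting as half a correct identification.

```python
# Rating categories, ordered from "definitely A" to "definitely B".
CATEGORIES = ("A", "A?", "B?", "B")


def r_index(a_counts, b_counts):
    """Return (correct, ties, wrong, R) for the two rows of a response matrix.

    a_counts[i] is the number of A samples given rating CATEGORIES[i];
    b_counts[i] is the same for the B samples.
    """
    if len(a_counts) != len(b_counts):
        raise ValueError("both rows must use the same rating categories")
    total = sum(a_counts) * sum(b_counts)  # hypothetical pair comparisons
    if total == 0:
        raise ValueError("each row needs at least one rated sample")

    correct = ties = wrong = 0
    for i, n_a in enumerate(a_counts):
        for j, n_b in enumerate(b_counts):
            pairs = n_a * n_b
            if i < j:       # A sample rated nearer "A" than the B sample
                correct += pairs
            elif i == j:    # same rating: the judge would have to guess
                ties += pairs
            else:           # A sample rated nearer "B": misidentified
                wrong += pairs

    # Guesses are assumed to be right half the time.
    r = (correct + ties / 2) / total
    return correct, ties, wrong, r


# Counts from the worked example in the text.
a_row = [8, 1, 1, 0]  # A samples rated A, A?, B?, B
b_row = [0, 1, 2, 7]  # B samples rated A, A?, B?, B
print(r_index(a_row, b_row))  # (96, 3, 1, 0.975)
```

The same function accepts any number of ordered rating categories, provided both rows use the same ordering.
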
This exercise is a conceptually easy way of manipulating the data so as to arrive at a value that is exactly equal to P(A) (O'Mahony, 1977). It is not the traditional method (Green and Swets, 1966) but it is equivalent and provides a simple conceptual understanding of the index.

Fig. 2. Response matrices I, II and III (response categories A, A?, B?, B; R = 100% for each matrix).

The value obtained has the advantage of being numerical, rather than the traditional difference/no difference verdict, thus providing more information. Admittedly the error rate in a traditional test is such a numerical measure, but the R-index also takes into account information regarding the judge's degree of certainty, which is ignored by the usual tests. As for the meaning of R = 97.5%, it simply means that the judge is expected to correctly differentiate A and B 97.5% of the time, which is good differentiation. To ask what exactly is the level that decides good or bad differentiation is defeating the purpose of the test. It loses information by converting the answer back into a difference/no difference situation; any such procedure is as arbitrary as choosing p < 0.01 or p < 0.05 as the statistical level of difference.

It is instructive to consider how this technique, by considering the certainty of the judge's responses, overcomes response bias, the tendency to name all samples A or all samples B. To illustrate this, consider the three response matrices of Fig. 2, which could be produced by the aforementioned experiment. It should be pointed out that these matrices are not what is usually obtained in such situations; a more even distribution of numbers across the matrix, indicating less certainty, would be more likely. However, they illustrate the point simply.

In Figure 2(I) all A samples are rated A, and all B samples are rated B.
Using a standard difference test, the samples would be distinguished, while R = 100%. In Figure 2(II) all B samples were rated B and the A samples B?. Although the samples were distinguished by the fact that the judge was unsure with some judgements, a test that merely required subjects to identify whether the samples were A or B would register no difference. The judge's response bias, to believe that all samples were B, can be overcome using the additional certainty judgements. The same is true for the third matrix, where the bias is towards calling everything A.

Naturally with these degrees of difference, the refinement of the R-index as a measure would hardly be necessary, but they illustrate the point. It should also be noted that overcoming response bias is not exclusive to signal detection; forced-choice procedures do the same for the traditional tests.

An advantage of this approach compared with traditional signal detection measurement is brevity. Using traditional signal detection techniques, around 200 readings are required to compute P(A); using this method it can be done in approximately 20. Using computations of R every 20 presentations over a series of 200, for a difference test between water and 3 mM NaCl, it was found that no significant differences occurred in R-values. Thus any error involved in using fewer readings is random; there are no systematic effects. This is also the case for traditional difference tests. It is necessary to establish this before using the test; a systematic improvement in performance would have demonstrated the need for obtaining a larger sample of readings. Certainly the test was found simple to use when six judges tested flavor differences in sherry samples, using twenty presentations of each sherry sample and a dorsal flow technique (O'Mahony and Davies, 1978).

The R-index procedure is flexible.
Instead of testing for differences between one product B and its reformulation A, two reformulations (A1 and A2) can be tested simultaneously. A random presentation of A1, A2 and B can be rated (A, A?, B?, B; A could be simply "not-B") and two R-indices calculated by comparison to B's ratings. This was found to be a simple task when judges compared differences between colas (O'Mahony et al., 1978).

R-indices could also be calculated by using a ranking technique. Ranked data, say for eight samples, can be readily transformed to rated data (1st, 2nd = A; 3rd, 4th = A?, etc.). The difference here is that the experimenter, rather than the subject, defines the difference between certainty and uncertainty. Such techniques are, at present, being compared to rating procedures. If viable, they allow the possibility of analysis of ranked data by parametric statistics.

Thus Signal Detection measures can be seen to be a viable and flexible tool for flavor measurements. They are not a panacea; they do not replace existing techniques but merely add a possible refinement to difference testing, should it be needed, while keeping the task of the judge simple. They add just one more small weapon to the arsenal of sensory testing.

REFERENCES
Amerine, M.A., Pangborn, R.M. and Roessler, E.B. 1965. Principles of Sensory Evaluation of Foods. Academic Press, New York, N.Y.
Brown, J. 1974. Recognition assessed by rating and ranking. Brit. J. Psychol. 65: 13.
Green, D.M. and Swets, J.A. 1966. Signal Detection Theory and Psychophysics. Wiley, New York, N.Y.
O'Mahony, M. 1977. Towards shorter criterion free sensitivity measures. Presented at 6th International Symposium on Olfaction and Taste, July, 1977, Orsay, Paris, France.
O'Mahony, M. and Davies, M. 1978. A signal detection approach to taste difference testing between two levels of alcohol in a flowise presented sherry stimulus. IRCS Medic. Sci. 6: 189.
O'Mahony, M., Heintz, C. and Autio, J. 1978. Signal detection difference testing of colas using a modified R-index approach. IRCS Medic. Sci. 6: 222.
Stevens, S.S. 1960. The psychophysics of sensory function. Amer. Scientist 48: 226.

Ms received 6/13/78; revised 8/31/78; accepted 9/8/78.