Looking back on my childhood, aside from the fun I had with my classmates and playmates, what I remember most vividly is loitering around the campus. I enjoyed observing the paintings along the walls of the preschool buildings; the structure, shapes, and colors of the different plants and trees around the campus; and of course the people who were walking, shouting, playing, or just jeering around. I also enjoyed the eerie feeling of walking through the dark paths and hallways of our school, and the feeling of delight and success after walking past them.
From high school up to my college days, I grew more interested in observing people and wondering about the whys of their actions. I had a certain classmate in high school. His stance suggested that he was overconfident, he showed us his biceps with a sneer, he ran like the ninjas in the anime Naruto, and his emotions flowed freely: when he was angry, you could see the hatred in his maddened face. Many people considered him ‘weird’ because of his actions, and I saw him that way too. But when I observed his little acts, I saw his deep respect for art, his chivalry toward women, and his perseverance in battling his life’s challenges. This allowed me to see him in a different light, from a different angle.
To see things others neglect to see – that is the power of observation.
An observation is made when we carefully examine something or someone, through any of our five senses, to gain information. For example, observing a certain kind of rock may yield statements like “The rock feels rough,” “The rock feels light,” “The rock is gray,” or “The rock smells like dust.”
It is almost the same with people, though they are more complex. Body movements, voice intonation, posture, and facial expressions are just some of the things we can observe. The information collected can be analyzed to form an assessment of the person observed.
For example, in an integrity test, you observe that the candidate is very focused on the exam, answering each question in roughly 7 seconds or less. Then, on item #37, he pauses for quite a while before choosing an answer. He also reacts to item #52 by frowning. After the exam, he is fidgety and sweating a little. You can take note of these and probe them in the interview that follows.
From there, the true power of observation unfolds: in observing candidates while interviewing them or giving them tests. All the information gathered from the tests, interviews, and observations is analyzed to provide a more accurate representation of an individual.
Share your thoughts!
Originally posted on LinkedIn last December 18, 2015
Image by LaVladina
You are the test administrator. You distribute the exam papers, read the instructions aloud, announce the time allotted, and watch the candidates answer the exam. Then you declare that time is up and ask them to pass their papers. More or less, this is the typical scenario in testing.
Now imagine that, instead of paper and pencils, your candidates are answering the exam on tablets or iPads.
That would create quite a different scenario, wouldn’t it?
Coupled with online testing platforms, tablets, iPads, notebooks, and other mobile online devices offer various improvements to the testing experience of test-takers, test administrators, and test users alike:
Less Physical Material
Having these devices gets rid of the tower-like piles of paper on desks and shelves. Along with the testing platforms developed for online assessments, the devices make it easier to check, compile, and interpret the data gathered without reducing the effectiveness of the exams. In the study of Davis, Kong, and McBride, results showed no significant difference between administering exams on computers and administering them on tablets or iPads. Also, in comparing paper and computer administration, Al-Amri reported no significant effect on the overall validity and reliability of exams across the two testing modes. In the same study, the students’ computer familiarity had no effect on exam results. So whether standardized assessments are answered on paper, on a computer, or on a mobile device, the results would be relatively the same.
Most online platforms and computerized assessments offer almost instant report generation and built-in interpretation after the candidate has taken the exam. Some platforms can also show the candidate the results immediately after the exam, aiding both the candidate and the test user in the assessment process. Traditional paper-and-pencil exams require manual scoring and interpretation of the results. Doing this for a large group would be tedious and tiring for both the scorer and the interpreter, and would consume much time. Imagine the time saved by the platforms and computerized assessments. The only thing left to worry about is comparing the results with those of other methods of assessment: interviews, observations, etc.
Behavioral Observation is Possible
One of the problems posed by computer-based testing, as listed in the book Psychological Testing and Assessment by Cohen, Swerdlik, and Sturman, is that it is harder to observe the behavior of test takers. Testing on smaller devices solves this problem without losing the perks of computerized assessment. As with traditional pencil-and-paper exams, answering on tablets, iPads, or notebooks allows behavior (facial expressions, body movements, stances, etc.) to be observed and, generally, lets the administrator build rapport better. Observation is also possible with people who answer exams on desktop computers, but it is limited, since they are blocked by the whole unit and move less than when answering on paper or on small devices.
Candidates can answer on-site, at home, or basically anywhere with an internet connection. When testing on these devices, the test can also take on the capabilities of the device itself. The devices’ webcams, microphones, and speakers can be integrated into the assessment process, depending on the platform acquired.
Alongside all these benefits, there are, of course, some issues to consider in the use of these devices:
In the same study of Davis, Kong, and McBride, the students were asked to comment on their experiences with the device, and their comments were turned into word clouds. The theme of the “tablet dislikes” had to do with the physical impact of using the device: glare, neck, finger, uncomfortable, sitting, standing, etc.
The size of the device should also be considered. The PARCC recommends a 9.5-inch tablet for assessment, and the study of Davis, Kong, and McBride reported problems in pointing at answers due to the 7-inch size of the tablet.
In the study of Noyes and Garland, one of the problems of taking tests on computers is the technical glitches that may arise during the assessment. This will pose little problem if the development behind the platform is robust.
Even with these issues, assessment using these devices is cost-effective and benefits the organization. Share your thoughts!
Originally posted on LinkedIn last December 15, 2015
Image by Sean MacEntee
I have a spoon, and I want to know if it is an excellent spoon. Well, I could do one of two things:
Ask other people who have used the same brand of spoon for their opinions or analyses of it, and see if those apply to the spoon I have, or
Have experts cite the criteria or standards an excellent spoon should meet, and test my spoon against them.
As you can see, the former scores the spoon by how its performance relates to the performance of other spoons; the latter scores the spoon by whether its performance reaches a certain standard.
Apart from spoons, we experience these kinds of interpretation in our schools, industries, and communities: reaching a certain percentage to pass an exam, or assessing an employee by how many calls he or she can make in comparison with the group he or she is in. We are tested and interpreted in two ways: through norm-referenced tests and criterion-referenced tests.
Norm-referenced Tests
A norm-referenced test basically determines one’s performance in relation to the performance of a certain group. In high school, we have quiz bees. In these contests, the quiz master looks at where you place in comparison with the other contestants. We also have the NMAT, which gives the test taker percentile ranks. For example, if your overall percentile rank is 90, you scored as well as or better than about 90% of those who took the exam. As another example, a candidate takes a 16PF exam and receives a score, which is then compared to the norm set of choice. Of course, this gives little information on how much the individual knows or how much of the measured attribute the candidate has. We now go to the second one:
Criterion-referenced Tests
For criterion-referenced tests, performance is determined by how you score against a certain criterion or standard. In schools we have passing grades, and when applying to a company there is a certain level of competency to be achieved; this is independent of the performance of others.
Of course, these criteria depend on the school, company, or organization one belongs to. For example, in School A you need to get 50% of the items right to pass, while School B requires at least 70%. The same goes for companies: if you do not reach, let us say, 60% of the items in their exam for coders, you will not be allowed to proceed to the next step in recruitment.
When do we use them? Well, it depends on the need of the company or organization. For example, why do schools use criterion-referenced interpretations, with cutoffs for passing or failing? It is because some schools want students to be graded on how much they know and how they perform against the standard set, rather than on how they perform in comparison with other students. With that in mind, if a teacher or professor in that school wants to determine how the class performs in his or her subject, and who the high, mid, and low scorers are, he or she would use a norm-referenced interpretation alongside the standard set by the school.
Whether norms or criteria are used as references depends on how one wants to interpret the data for analysis and assessment.
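To make the contrast concrete, here is a minimal Python sketch of the two interpretations applied to the same raw score. The class scores, the 50-item exam, and the function names are all made up for illustration, and the percentile rank here uses the "at or below" definition, which is only one of several in use.

```python
from bisect import bisect_right


def percentile_rank(score, group_scores):
    """Norm-referenced: percent of the group scoring at or below `score`."""
    ordered = sorted(group_scores)
    at_or_below = bisect_right(ordered, score)
    return 100.0 * at_or_below / len(ordered)


def passes(score, total_items, cutoff_percent):
    """Criterion-referenced: compare the score to a fixed standard."""
    return 100.0 * score / total_items >= cutoff_percent


# Hypothetical scores of ten classmates on a 50-item exam.
class_scores = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
my_score = 30

print(percentile_rank(my_score, class_scores))  # 80.0
print(passes(my_score, 50, 50))  # School A, 50% cutoff: True
print(passes(my_score, 50, 70))  # School B, 70% cutoff: False
```

The same score of 30 places high in the class, passes in School A, and fails in School B, which is why the choice of reference matters.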
Bond, L. (1996). Norm- and criterion-referenced testing. Practical Assessment, Research & Evaluation, 5(2)
Boyd, N. (n.d.). Types of tests: Norm-referenced vs. criterion-referenced. Retrieved from http://study.com/academy/lesson/types-of-tests-norm-referenced-vs-criterion-referenced.html
Huitt, W. (n.d.). Measurement and evaluation: Criterion- versus norm-referenced testing. Retrieved from http://www.edpsycinteractive.org/topics/measeval/crnmref.html
Afolayan, A. (n.d.). The problems and potentials of criterion-referenced testing. Retrieved from http://www.unilorin.edu.ng/journals/education/ije/feb1981/THE%20PROBLEMS%20AND%20POTENTIALS%20OF%20CRITERION%20REFERENCED%20TESTING.pdf
Originally posted on LinkedIn last December 1, 2015
Image by Todd Porter
Let me start off by stating one of the reasons why I need to do cardio-respiratory exercises:
Kare kare (pictured above) is a traditional Philippine stew complemented with a thick, savory peanut sauce. It uses oxtail, pork leg, tripe, or beef slices as meat ingredients, along with slices of eggplant, string beans, and pechay (Chinese cabbage), and is paired with bagoong. It is my favorite dish.
If I am served a bowl of kare kare, after tasting it I will be either very satisfied, a little satisfied, or less satisfied.
I have a standard for the deliciousness of kare kare based on my preferences, so when another variation of the dish comes, I compare its flavor to the kare kares I have had before. If it tastes the way I like my kare kare cooked, I will be very satisfied; if it deviates a little from the level of sweetness or saltiness that I like, I will be only a little satisfied; and if it tastes too salty or too sweet, I will be less satisfied. Of course, if another variation comes, I will taste it and decide again according to my preferences, and the cycle continues. Imagine that I have tasted 100 varieties of kare kare: the 101st will be judged against my previous experiences of the dish and will naturally fall somewhere in my standards as a very satisfying, a little satisfying, or a less satisfying kare kare.
Basically, that is how norms work! Generally, norms are standards to be complied with or reached. Zooming in, norms, in Angoff’s (1984) book Scales, Norms, and Equivalent Scores, are the scores of the members of a particular population, against which a particular score is compared. For example, I have an instrument that measures sales potential skills. A candidate takes that instrument and gets a score of 4 out of 10.
Is it high or low? I cannot be sure. What are the standards for “high” and “low” performance? In their article, The Dutch review process for evaluating the quality of psychological tests: history, procedure and results, Evers, Sijtsma, Lucassen, & Meijer (2010) explained that a raw score is difficult to interpret and is unsuited for practical use.
Questions now arise: To what should I compare it? What perspective should I take? In what way should I interpret the data? And how can I be sure that the 4 out of 10 score shows where, relatively, the performance of the candidate truly falls?
This is where norms come in. Norms help the administrator give meaning to the raw score by showing where it falls on the curve of a certain group, as bounded by that group’s demographics, so that the tendency of the candidate’s performance is accurately predicted by his or her score when compared to the representative performance, or scores, of the members of that group. This gives the administrator the confidence to say that the individual really performed well or poorly.
So then, I could compare the individual’s score to a norm group made up of the national population, a foreign organization, a certain industry, or a group of a certain age, gender, or other demographic, as explained in the research letter of Schuhfried, in light of sales potential skills. It really depends on what is needed, and on which standard the tester or the organization would like to bank on.
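As a rough illustration of how the choice of norm group changes the meaning of the same raw score, the Python sketch below compares the 4 out of 10 against two norm groups. Both groups and their scores are invented for this example, and the z-score and "percent at or below" are just two common ways of locating a score within a group, not a prescribed procedure.

```python
from statistics import mean, pstdev


def z_score(raw, norm_scores):
    """How many standard deviations `raw` lies from the norm group's mean."""
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


def percent_at_or_below(raw, norm_scores):
    """Percent of the norm group scoring at or below `raw`."""
    return 100.0 * sum(s <= raw for s in norm_scores) / len(norm_scores)


# Hypothetical norm groups for a 10-point sales potential instrument.
national = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7]
sales_industry = [4, 5, 6, 6, 7, 7, 8, 8, 9, 10]

raw = 4  # the candidate's raw score

for name, group in [("national", national), ("sales industry", sales_industry)]:
    print(name, round(z_score(raw, group), 2), percent_at_or_below(raw, group))
```

Against the made-up national group the candidate sits near the middle, while against the made-up sales industry group the same score sits near the bottom.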
Norms are important in determining how accurately a score reflects a performance that is likely to recur, whether it is given by a human, an animal, an event, a phenomenon, or even just a kare kare.
Angoff, W. (1984). Scales, Norms, and Equivalent Scores. Educational Testing Service: Princeton, NJ.
Evers, A., Sijtsma, K., Lucassen, W., & Meijer, R. (2010). The Dutch review process for evaluating the quality of psychological tests: history, procedure and results. International Journal of Testing, 10, 295-317.
Schuhfried. (n.d.). Working with the norm: standardization as the basis of psychological diagnosis. Vienna Test System Neuro. Retrieved from http://schuhfried.at/fileadmin/content/2_Letter_en/NL_RL_NEURO_Norms.en.pdf
Originally posted on LinkedIn last November 12, 2015