## Statistical Analysis of Test and Test Scores


1.0 Introduction

As a student of B.Ed. (Hons) at the Institute of Education and Research, I took a course named Practicum, in which I completed a three-and-a-half-month internship as a trainee teacher at Engineering University Girls School. I took two classes each day: Social Science of class Six (B) and Social Science of class Eight (A). As a teacher I administered several tests in both classes to determine the pupils’ common trends and their individual strengths and weaknesses, to give feedback, and to fulfill my practicum requirements. In this report, however, my objective is not to describe the pupils’ individual trends, as my course does not require it (although the T-scores determined in this report show the students’ relative individual positions). Rather, I shall try to analyze the tests I have given and their results, compare the test results, and comment on the general trend of the class. I shall briefly compare the results of the 1st semester and 2nd semester social science examinations of the respective classes in order to determine whether my teaching brought about any improvement in the pupils (I taught in the period between the 1st and 2nd semester exams) and whether the school teachers’ comments about their pupils are right or wrong (they think their students are very dull, not meritorious at all, and that their improvement is very difficult because of their poor social and economic background). Moreover, I intend to make some suggestions in the concluding remarks for improving the existing assessment system of Engineering University Girls School.

This report includes only the
analysis of class Six’s test and test results.

2.0 Review of related literature

2.1 Statistical Analysis

Statistical analysis of a particular test or test result includes the collection, presentation and analysis of the scores, together with a meaningful interpretation of, and decisions drawn from, that analysis. Statistical analysis follows specific and systematic steps, methods and measurements.

2.2 Item Specification Table

An item specification table is used to determine the degree to which pupils have achieved the desired behavioral change. It is prepared before the evaluation activity is carried out, based on the importance of the content and of the learning objectives in terms of behavioral domains. This work plan is made in order to prepare the items of the test.

2.3 Reliability of test

A test is called reliable when its scores are stable and trustworthy.

$$ r_1 = \frac{n}{n-1}\left[1 - \frac{M(n-M)}{n\sigma^2}\right] $$

Here, n = Number of items of the test; M = Arithmetic mean of the scores; σ = Standard deviation.

2.4 Frequency Distribution

Frequency distribution helps to
classify or sort ungrouped data in a systematic way according to the frequency.
It enables the researcher to make the data meaningful.

2.5 Range

Range is the difference between the
highest and the lowest scores.

2.6 Class Interval

The scores have to be arranged in
some small groups. The distance or difference of such a class is called class
interval.

# Number of the class = (Range/class Interval) +1

2.7 Tabulation of scores

This is done in two steps. The first is to tally the scores in their proper intervals and the second is to count the tallies and find the frequency of each class interval. The sum of the frequencies is called N.
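The two tabulation steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the short score list is illustrative, and each class is taken to run half a unit beyond its stated limits (so 6 – 10 covers 5.5 up to, but not including, 10.5).

```python
def frequency_distribution(scores, first_lower, width, n_classes):
    """Tally scores into class intervals and return (label, frequency) pairs.

    Assumption: a class written as '6 - 10' has true limits 5.5 to 10.5,
    and a score exactly on an upper true limit goes to the next class.
    """
    table = []
    for k in range(n_classes):
        lower = first_lower + k * width
        upper = lower + width - 1
        # Step 1 (tally) and step 2 (count) collapse into one count here.
        count = sum(1 for s in scores if lower - 0.5 <= s < upper + 0.5)
        table.append((f"{lower} - {upper}", count))
    return table


# Illustrative scores only, not the full class data.
sample_scores = [27, 22, 16.5, 19, 11, 16, 25, 23, 14, 21, 9]
table = frequency_distribution(sample_scores, first_lower=6, width=5, n_classes=6)
n_total = sum(freq for _, freq in table)  # N, the sum of the frequencies

for label, freq in table:
    print(label, freq)
print("N =", n_total)
```

Every score lands in exactly one class, so N equals the number of scores supplied.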

2.8 Mid Point of the Class Interval

# Mid Point = Lower limit of the class + (upper limit of the class – lower limit of the class)/2

2.9 Measures of Central Tendency

Measures of central tendency give the researcher a convenient way of describing a set of data with a single number. The number resulting from the computation of a measure of central tendency represents the average or typical score attained by a group of subjects. Measures of central tendency are:

2.9.1 Mean

The mean is the arithmetic average of
the scores and is the most frequently used measure of central tendency.

# True Mean = A.M.+(Σfd/N)i

Here, A.M. = Assumed Mean; f = Frequency; d = Deviation of a class interval from the class interval containing the Assumed Mean, counted in class intervals; Σfd = Summation of the products of f and d; N = Total frequency; i = Interval length.

2.9.2 Median

The median is that point in a
distribution above and below which are 50% of the scores; in other words, the
median is the midpoint.

# Median = L+{(N/2-cfu)/fm}i

Here, L = Lower point of the median class; cfu = Cumulative frequency of the classes below the median class; fm = Frequency of the median class; i = Interval length; N = Total frequency.

2.9.3 Mode

The mode is
the score that is attained by more subjects than any other score.

# Mode = 3 median – 2 mean
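The three formulas of this section can be sketched in Python as below. The function names are illustrative, not part of any library, and the example data are the Test-1 frequencies tabulated later in this report (class width 5, assumed mean 18, true lower limits 5.5, 10.5 and so on).

```python
def grouped_mean(midpoints, freqs, assumed_mean, width):
    """True Mean = A.M. + (sum(f*d) / N) * i, with d counted in class intervals."""
    n = sum(freqs)
    sum_fd = sum(f * (m - assumed_mean) / width for m, f in zip(midpoints, freqs))
    return assumed_mean + (sum_fd / n) * width


def grouped_median(lower_limits, freqs, width):
    """Median = L + ((N/2 - cfu) / fm) * i for the class that contains N/2."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes below the current one
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cum + f >= n / 2:
            return lower + ((n / 2 - cum) / f) * width
        cum += f
    raise ValueError("empty distribution")


def empirical_mode(mean, median):
    """Mode = 3 * median - 2 * mean (the empirical relation used in the text)."""
    return 3 * median - 2 * mean


freqs = [7, 15, 16, 13, 6, 1]
midpoints = [8, 13, 18, 23, 28, 33]
lower_limits = [5.5, 10.5, 15.5, 20.5, 25.5, 30.5]

mean = grouped_mean(midpoints, freqs, assumed_mean=18, width=5)
median = grouped_median(lower_limits, freqs, width=5)
mode = empirical_mode(mean, median)
print(round(mean, 1), round(median, 1), round(mode, 1))
```

Because the sketch carries unrounded intermediate values, its mode can differ in the last decimal from a hand calculation that rounds the mean and median first.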

2.10 Measures of Variability

Measures of central tendency have some limitations. Even an ideal average can represent a series only as well as a single figure can. Measures of central tendency fail to reveal the entire story of the phenomenon.

2.10.1 Range

The range is simply the difference between the highest and
lowest score in a distribution and is determined by subtraction.

# Range = Highest score – Lowest score

2.10.2 Quartile Deviation

The quartile deviation is one half of the difference between the upper quartile (the 75th percentile) and the lower quartile (the 25th percentile) in a distribution.

# Q = (Q3 – Q1)/2

Here, Q = Quartile deviation; Q3 = 75th percentile; Q1 = 25th percentile.

# Q1 = L1 + {(N/4 – F1)/fq1}i

Here, L1 = Lower point of the 25th percentile class interval; F1 = Cumulative frequency of the classes below the 25th percentile class; fq1 = Frequency of the 25th percentile class interval; i = Length of the class interval; N = Total frequency.

# Q3 = L3 + {(3N/4 – F3)/fq3}i

Here, L3 = Lower point of the 75th percentile class interval; F3 = Cumulative frequency of the classes below the 75th percentile class; fq3 = Frequency of the 75th percentile class interval; i = Length of the class interval; N = Total frequency.
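A minimal Python sketch of the quartile formulas follows. The function names and the small frequency table are hypothetical and serve only to show how the percentile class is located before the formula is applied.

```python
def grouped_quantile(lower_limits, freqs, width, fraction):
    """Return L + ((fraction*N - F) / fq) * i for the class containing fraction*N."""
    n = sum(freqs)
    target = n * fraction
    cum = 0  # F: cumulative frequency of the classes below the current one
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cum + f >= target:
            return lower + ((target - cum) / f) * width
        cum += f
    raise ValueError("empty distribution")


def quartile_deviation(lower_limits, freqs, width):
    """Q = (Q3 - Q1) / 2."""
    q1 = grouped_quantile(lower_limits, freqs, width, 0.25)
    q3 = grouped_quantile(lower_limits, freqs, width, 0.75)
    return (q3 - q1) / 2


# Hypothetical distribution: four classes of width 10, N = 20.
example_limits = [0.5, 10.5, 20.5, 30.5]
example_freqs = [4, 8, 6, 2]

q1 = grouped_quantile(example_limits, example_freqs, 10, 0.25)
q3 = grouped_quantile(example_limits, example_freqs, 10, 0.75)
q = quartile_deviation(example_limits, example_freqs, 10)
print(q1, q3, q)
```

The key step is that L, F and fq must all come from the class in which the cumulative frequency first reaches N/4 (or 3N/4).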

2.10.3 Mean Deviation

Mean deviation (M.D.) is the arithmetic mean of the deviations of all the scores in a series taken from their mean (occasionally from the mode or median), disregarding signs.

$$ M.D. = \frac{\sum |fd|}{N} $$

Here, d = Distance between the midpoint of the class and the mean; f = Frequency of that class; N = Sum of the frequencies.

2.10.4 Standard Deviation

The standard deviation is the most stable measure of variability and takes into account each and every score. It differs from M.D. in several aspects, such as:

In computing M.D., we disregard signs, whereas in finding S.D., we avoid the difficulty of signs by squaring the separate deviations.

The squared deviations used in computing S.D. are always taken from the mean.

$$ \sigma = i\sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} $$

Here, d = Deviation; f = Frequency; i = Length of the class interval; N = Total frequency.
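The mean deviation and standard deviation formulas can be sketched in Python as below. The function names are illustrative; the data are the Test-1 frequencies used later in this report, with deviations d counted in class intervals from the assumed-mean class and a rounded mean of 17.9.

```python
import math


def grouped_sd(freqs, deviations, width):
    """S.D. = i * sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2)."""
    n = sum(freqs)
    sum_fd = sum(f * d for f, d in zip(freqs, deviations))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, deviations))
    return width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


def grouped_mean_deviation(midpoints, freqs, mean):
    """M.D. = sum(|f * (X - M)|) / N, with signs disregarded."""
    n = sum(freqs)
    return sum(f * abs(m - mean) for m, f in zip(midpoints, freqs)) / n


freqs = [7, 15, 16, 13, 6, 1]
deviations = [-2, -1, 0, 1, 2, 3]      # in class-interval steps
midpoints = [8, 13, 18, 23, 28, 33]

sd = grouped_sd(freqs, deviations, width=5)
md = grouped_mean_deviation(midpoints, freqs, mean=17.9)
print(round(sd, 1), round(md, 2))
```

Squaring removes the signs in the S.D. calculation, while the M.D. calculation removes them with an absolute value, which is the contrast drawn in the text.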

2.11 Measures of relative position

Measures of relative position indicate where a score is in relation to all other scores in a distribution. A major advantage of such measures is that they make it possible to compare the performance of an individual in two or more different tests. Major measures of relative position are:

2.11.1 Percentile Ranks

A percentile rank indicates the percentage of scores that
fall at or below a given score. It shows the comparative position of a student
in a class.

# Percentile Rank (P.R) = {fc + ((X – L)/i) × q} × 100/N

Here, fc = Cumulative frequency of the classes below the class that contains that particular score; X = The score of which the P.R has to be determined; L = Lower limit of the class which contains that particular score; i = Length of the class interval; q = The frequency of that class; N = Total frequency.
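As a rough illustration, the percentile rank formula can be written in Python as follows. The function name is hypothetical, the frequencies are those of the Test-1 distribution tabulated later in this report, and the score of 18 is chosen only as an example.

```python
def percentile_rank(score, lower_limits, freqs, width):
    """P.R = (fc + ((X - L) / i) * q) * 100 / N for the class containing X."""
    n = sum(freqs)
    cum_below = 0  # fc: cumulative frequency of the classes below the current one
    for lower, f in zip(lower_limits, freqs):
        if lower <= score < lower + width:
            return (cum_below + (score - lower) / width * f) * 100 / n
        cum_below += f
    raise ValueError("score lies outside the tabulated range")


lower_limits = [5.5, 10.5, 15.5, 20.5, 25.5, 30.5]
freqs = [7, 15, 16, 13, 6, 1]

pr = percentile_rank(18, lower_limits, freqs, width=5)
print(round(pr, 2))
```

The formula assumes the scores inside a class are spread evenly across it, so the result is an interpolation rather than an exact count.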

2.11.2 Standard score (Z – Score)

# Standard score (Z – Score), Z = (X – M)/σ

X = Score of the student

M = Mean score of the class

σ = Standard deviation of the scores

2.11.3 Standard score (T – Score)

A T – Score is nothing more than a Z
– Score expressed in a different form.
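Both standard scores can be sketched in a few lines of Python. The function names are illustrative; the conversion T = 50 + 10Z is the one used in the results section of this report, and the sample values (score 30, mean 17.9, S.D. 6.2) are taken from the Test-1 analysis.

```python
def z_score(x, mean, sd):
    """Z = (X - M) / S.D."""
    return (x - mean) / sd


def t_score(z):
    """T = 50 + 10Z, which removes negative values and most decimals."""
    return 50 + 10 * z


z = z_score(30, mean=17.9, sd=6.2)
t = t_score(z)
print(round(z, 2), round(t, 1))
```

A Z-score of 0 therefore corresponds to a T-score of 50, and each standard deviation is worth 10 T-score points.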

2.12 Measures of relationship

Degree of relationship is expressed as a correlation coefficient, which is computed from two sets of scores. If a change in one set of scores is accompanied by a change in the other set of scores, then this phenomenon is called correlation. The most appropriate measure of correlation is:

$$ r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left\{N\sum X^2 - \left(\sum X\right)^2\right\}\left\{N\sum Y^2 - \left(\sum Y\right)^2\right\}}} $$

Here, X = Score of test 1; Y = Score of test 2; N = Total frequency.
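A minimal Python sketch of the product-moment correlation from raw score sums is given below. The function name and the two short score lists are hypothetical and only illustrate the calculation.

```python
import math


def pearson_r(xs, ys):
    """r = (N*sum(XY) - sum(X)*sum(Y)) / sqrt((N*sum(X^2) - sum(X)^2) * (N*sum(Y^2) - sum(Y)^2))."""
    if len(xs) != len(ys):
        raise ValueError("both tests need the same number of scores")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical scores of four pupils on two tests.
test_1 = [1, 2, 3, 4]
test_2 = [2, 4, 6, 8]
r_same = pearson_r(test_1, test_2)          # scores rise together
r_opposite = pearson_r(test_1, test_2[::-1])  # one rises as the other falls
print(r_same, r_opposite)
```

The coefficient ranges from -1 to +1; it describes how closely the two sets of scores move together, not whether one causes the other.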


2.14 Normal Distribution and Normal Probability Curve

If a variable is normally distributed, that is, forms a normal curve, then several things are true:

1. 50% of the scores are above the mean and 50% are below
the mean.
2. The mean, the median, and the mode are the same.
3. Most scores are near the mean and the further from the
mean a score is, the fewer the number of subject who attained that score.

· Mean – 3.0 S.D = Approximately the 0.1 Percentile

· Mean – 2.0 S.D = Approximately the 2nd Percentile

· Mean – 1.0 S.D = Approximately the 16th Percentile

· Mean = Approximately the 50th Percentile

· Mean + 1.0 S.D = Approximately the 84th Percentile

· Mean + 2.0 S.D = Approximately the 98th Percentile

· Mean + 3.0 S.D = Approximately the 99+ Percentile

Fig: The normal probability curve. The areas under the curve between successive S.D. marks, from below Mean – 3.0 S.D to above Mean + 3.0 S.D, are 0.13%, 2.15%, 13.59%, 34.13%, 34.13%, 13.59%, 2.15% and 0.13%.

2.15 Abnormal Distribution
2.15.1 Skewness

# Skewness = 3 (mean – median)/S.D
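The skewness formula can be checked with a very small Python sketch. The function name is illustrative, and the sample values (mean 17.9, median 17.7, S.D. 6.2) are the Test-1 results reported later in this document.

```python
def skewness(mean, median, sd):
    """Skewness = 3 * (mean - median) / S.D."""
    return 3 * (mean - median) / sd


sk = skewness(mean=17.9, median=17.7, sd=6.2)
print(round(sk, 1))
```

A positive value means the mean lies above the median (a tail towards the high scores); a value near zero suggests a roughly symmetrical distribution.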

2.15.2 Kurtosis

Kurtosis indicates to what degree a
distribution is pointed or flat in comparison to normal probability curve. It
can be of three types:

2.15.2.1 Leptokurtic

It is more pointed than the normal
curve.

2.15.2.2 Platykurtic

It is flatter than a normal curve.

2.15.2.3 Mesokurtic

If the curve is neither too pointed nor too flat, then the curve is mesokurtic or normal.

2.16 Grading System

The grading system is based on an integration of the traditional absolute marking system and the grading of student performance assuming a normal distribution. It is a two-tier evaluation system, which ensures gradation of the group as a whole in a course and of individuals within the group in comparison with one another.

3.0 General considerations

In preparing the essay type test, the following points were maintained as far as possible:

i. It is a 15-mark test. Total item no. is 6.

ii. The duration of the test is 30 minutes. Time per mark is 2 minutes.

iii. The test has been constructed mainly with short essay type questions rather than broad essays in order to cover more areas of the content.

iv. Short essays enabled an increase in the number of items in order to cover more areas of content.

v. Overlapping of content between the essay type and MCQ type questions has been avoided.

vi. Effort was made to use proper language so that the pupils can easily understand the requirements of the test. A simple-to-complex sequence was maintained as far as possible.

vii. Mixing other types of questions with the essay type test has been avoided.

viii. Optional questions are not given.

ix. Item and sub-item wise number distribution was shown in the test and key words were underlined so that the students can prepare their answers according to the number distribution.

3.0.1 Test and Examination – Explanation of the situation

1. To fulfill my practicum requirement, I had to administer a 50-mark essay type test and a 50-mark MCQ test. But the school authority did not allow me to give a test of two hours or more. So I had to give the essay type test in 4 parts:

Essay Type Test

| Class Test no | Date | Major Subject | Syllabus | Duration | Total Marks |
|---|---|---|---|---|---|
| 1 | 19/05/09 | Economics | Two fundamentals of Economics, Economic activities of human being | 35 Min | 10 |
| 2 | 27/05/09 | Geography | The Asia continent | 40 Min | 05 |
| 3 | 03/06/09 | History | The Gupta dynasty in Indian sub-continent, The Gupta and Mourjo dynasty in Bengal, Bangladesh in ancient period and ancient settlements in Bengal | 35 Min | 08 |
| 4 | 08/07/09 | Sociology | Evolution of human society-stone age and ancient society | 40 Min | 07 |
| Total | | | | 150 Min | 30 |

QUESTION-1

4.1.1 Analysis of Test-1

4.1.1.2 Subject wise Distribution

Integrated Social Science comprises six major subjects:

1. Sociology
2. History
3. Geography
4. Political Science
5. Economics
6. Population Education

But Political Science and Population Education were not included for the 2nd semester. So, I had to teach and give tests without those subjects.

The essay type test has included all the other subjects. Subject wise number distribution is shown below:

| Subject | Subject range | Distributed Marks | Percentage (%) |
|---|---|---|---|
| Sociology | 7 pages | 12 | 24 |
| History | 8 pages | 12 | 24 |
| Geography | 9 pages | 14 | 28 |
| Economics | 7 pages | 12 | 24 |
| Total | 31 pages | 50 | 100 |

###### Comment

Distribution of marks differed because the ranges of the subjects differ. This table shows that an overall equivalency of number distribution according to the subject range has been maintained as far as possible in sociology, geography, history and economics. Number per page is 1.61 (approx.), that is, 50 marks over 31 pages. If 100% equivalency of number distribution according to page range were to be maintained, the marks required would be (7×1.61)=11.27 for sociology, (8×1.61)=12.88 for history, (9×1.61)=14.49 for geography and (7×1.61)=11.27 for economics. These figures are not too far from the marks actually distributed.

4.1.1.3 Item Specification

The item specification according to the domains of objectives of the essay type test is shown below. Each cell gives mark/no. of items.

| Objective | Knowledge | Comprehension | Application | Analysis | Synthesis | Evaluation | Total mark |
|---|---|---|---|---|---|---|---|
| Percentage | 32% | 20% | 22% | 8% | 10% | 8% | |
| Sociology | 0/0 | 4/1 | 0/0 | 4/1 | 0/0 | 4/1 | 10 |
| History | 7/3 | 0/0 | 0/0 | 0/0 | 5/1 | 0/0 | 05 |
| Geography | 6/2 | 2/1 | 6/1 | 0/0 | 0/0 | 0/0 | 08 |
| Economics | 3/1 | 4/2 | 5/2 | 0/0 | 0/0 | 0/0 | 07 |
| Total | 16/6 | 10/4 | 11/3 | 4/1 | 5/1 | 4/1 | 30 |

Comment

A good test covers all the domains of learning such as knowledge, comprehension, application, analysis, synthesis, and evaluation. This test fulfills this requirement by covering all the domains of learning. A test should not be too easy or too difficult. 32% of the marks have been given to the knowledge sub-domain; items have also been taken from higher sub-domains such as synthesis, evaluation, etc.

Measures of difficulty level and discrimination power are appropriate for MCQ or objective type tests, not for essay type tests. So these two parameters are not measured here for the essay type test.

A good test requires validity, reliability and objectivity. Objectivity can’t be statistically determined. I don’t have sufficient data to measure the validity of the test. Reliability can be measured following the ‘Kuder and Richardson Method’. The reliability of the test is determined below:

4.1.1.4 Reliability

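$$ r_1 = \frac{n}{n-1}\left[1 - \frac{M(n-M)}{n\sigma^2}\right] $$

Here, n = 16; M = 17.9 (determined later on); σ = 6.2 (determined later on).

$$ r_1 = \frac{16}{15}\left[1 - \frac{17.9\,(16 - 17.9)}{16 \times 6.2^2}\right] = 1.13 \text{ (approx.)} $$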
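The Kuder and Richardson substitution can be reproduced with a short Python sketch. The function name is illustrative; the formula is the one stated in this report, which is commonly referred to as KR-21, and the inputs are the values quoted above for Test-1.

```python
def kuder_richardson_21(n_items, mean, sd):
    """r1 = n/(n-1) * [1 - M(n - M) / (n * sd^2)]."""
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (1 - mean * (n_items - mean) / (n_items * variance))


r1 = kuder_richardson_21(n_items=16, mean=17.9, sd=6.2)
print(round(r1, 2))
```

Note that with these inputs the mean (17.9) is larger than n (16), so the term M(n – M) is negative and the coefficient comes out above 1; the formula is normally applied with n at least as large as the highest possible score.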

Fig: Subject wise comparative number distribution

Fig: Domain wise comparative number distribution

QUESTION-2

4.2.1.2 General considerations

In preparing the Multiple Choice Questions (MCQ), the following points were maintained as far as possible:

1. It is a 50-mark test. Total item no. is 50. Mark per item is 1. The duration of the test is 30 minutes, so the time for each item is 36 seconds. Students were required to put a tick mark (√) on the correct answer. 50 minutes were not given as it was not a circle fill-up test, so less time was required. All 36 seconds could be used for thinking.
2. Four options are given for each item. Only one is the correct answer.
3. To make guessing difficult, the options of a specific item are similar in length; no option is much longer or much shorter than the others.
4. The correct answers are arranged so that students can’t guess them from the arrangement. For example, among 50 items, 14 correct answers were placed in option (Ka), 13 in option (Kha), 12 in option (Ga) and 11 in option (Gha). The correct answers are not arranged serially; rather, their positions are random. So it was difficult to answer correctly by guessing.
5. Distractors of a specific item are plausible and homogeneous, not heterogeneous options that are automatically eliminated.
6. Distractors such as ‘All of the above’ and ‘None of the above’ are not given. These distractors weaken the test.
7. Negatively worded distractors have been avoided.
8. Stems and distractors have been clearly written so that they make clear sense and avoid misconception.
9. Unfamiliar words are not used. Key words were underlined to make the concept clear.
10. No irrelevant clue is given in the stem or distractors.
11. Overlapping distractors are not given. A simple-to-complex sequence was followed as far as possible.
12. The MCQ is prepared mainly from areas of the content that were not included in the essay type test. So the essay type and MCQ tests together have covered a large area of the content to assess the pupils’ overall performance in the subject.

4.2.1.3 Subject wise Distribution

The MCQ test has included all the subjects. Subject wise number distribution is shown below:

| Subject | Subject range | Distributed Marks | Percentage (%) |
|---|---|---|---|
| Sociology | 7 pages | 10 | 24 |
| History | 8 pages | 07 | 24 |
| Geography | 9 pages | 08 | 28 |
| Economics | 7 pages | 05 | 24 |
| Total | 31 pages | 30 | 100 |

###### Comment

Distribution of marks differed because the ranges of the subjects differ. This table shows that an overall equivalency of number distribution according to the subject range has been maintained as far as possible. Number per page is 1.61 (approx.). If 100% equivalency of number distribution according to page range were to be maintained, the marks required would be (7×1.61)=11.27 for sociology, (8×1.61)=12.88 for history, (9×1.61)=14.49 for geography and (7×1.61)=11.27 for economics. These figures are not too far from the marks actually distributed. So the distribution of numbers is generally equivalent.

4.2.1.4 Item Specification

The item specification according to the domains of objectives of the MCQ test is shown below. Each cell gives mark/no. of items.

| Objective | Knowledge | Comprehension | Application | Analysis | Synthesis | Evaluation | Total mark |
|---|---|---|---|---|---|---|---|
| Percentage | 68% | 6% | 8% | 4% | 4% | 10% | |
| Sociology | 9/9 | 1/1 | 0/0 | 0/0 | 1/1 | 1/1 | 10 |
| History | 10/10 | 0/0 | 0/0 | 0/0 | 0/0 | 2/2 | 07 |
| Geography | 13/13 | 1/1 | 0/0 | 0/0 | 0/0 | 0/0 | 08 |
| Economics | 2/2 | 1/1 | 4/4 | 2/2 | 1/1 | 2/2 | 05 |
| Total | 34/34 | 3/3 | 4/4 | 2/2 | 2/2 | 5/5 | 30 |

Comment

A good test covers all the domains of learning such as knowledge, comprehension, application, analysis, synthesis, and evaluation. This test fulfills this requirement by covering all the domains of learning. A test should not be too easy or too difficult. In this regard, 68% of the items have been given from the knowledge sub-domain; items have also been taken from higher sub-domains such as synthesis, evaluation, etc. This has been done in order to make the test neither too easy nor too difficult and to cover all learning domains as well. But the test has a limitation. Usually, the higher the level of the domain, the lower the mark distributed for that domain. This test couldn’t follow this rule properly. For example, 3 marks have been distributed for the comprehension sub-domain, whereas 5 marks were given to assess evaluation power. This inconsistency can also be observed in the application sub-domain. Another limitation is that each subject area couldn’t cover every domain of learning. As a test maker I find it quite difficult to prepare an MCQ test that meets all the criteria. Doing so can make the test too difficult, especially for students who have always faced traditional knowledge-based tests, and can increase their fear of exams. A fully standardized test can be applied to them step by step and only after providing proper learning situations.

4.2.1.5 Determining Difficulty level

Even though a test is prepared so that it is neither too easy nor too difficult, a certain group of pupils may find it normal, too difficult or too easy according to their merit level, learning experience, test experience, social and economic background, etc. As a teacher I tried to make a test that is neither too easy nor too difficult and designed my teaching-learning activities to be relevant to the objectives and the test. But the students may feel differently about the test because of the factors mentioned above. So it is necessary to determine whether the pupils find the items too difficult or too easy. For this reason, determining the difficulty level and discrimination power of the items is necessary.

# Difficulty Level, D = (R/N) × 100

R = H + L = Number of pupils who answered the item correctly.

H = Number of pupils of the high score group (upper 27%) who answered the item correctly.

L = Number of pupils of the low score group (lower 27%) who answered the item correctly.

N = Total number of pupils who tried the item.

58 students attended the MCQ test. 27% of 58 is 15.66, so the upper 27% and the lower 27% each correspond to about 16 students. I shall therefore count the scores of the upper 16 students and the lower 16 students. So here:

H = Number of pupils of the high score group (upper 16) who answered the item correctly.

L = Number of pupils of the low score group (lower 16) who answered the item correctly.

N = 16 + 16 = 32, if all the students of both the upper and lower groups try the item.

The item wise values of H, L and N are substituted directly in the table below; they are not shown separately here.
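The difficulty level calculation can be sketched in Python as below. The function name is illustrative, and the default of 32 attempts assumes, as stated above, that all 16 upper-group and 16 lower-group pupils tried the item.

```python
def difficulty_level(high_correct, low_correct, n_attempted=32):
    """D = ((H + L) / N) * 100, the percentage of the two groups answering correctly."""
    return (high_correct + low_correct) / n_attempted * 100


# H and L for items 1 and 4 of the MCQ test, as given in the table of this section.
d_item_1 = difficulty_level(9, 3)
d_item_4 = difficulty_level(13, 11)
print(d_item_1, d_item_4)
```

A higher value means more pupils answered correctly, so under this definition a high difficulty level indicates an easier item.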

| Serial no. of item | Manipulation of data according to formula | Determined difficulty level |
|---|---|---|
| 1 | {(9 + 3) / 32} × 100 | 37.5% |
| 2 | {(7 + 8) / 32} × 100 | 46.9% |
| 3 | {(4 + 6) / 32} × 100 | 31.3% |
| 4 | {(13 + 11) / 32} × 100 | 75.0% |
| 5 | {(8 + 6) / 32} × 100 | 43.8% |
| 6 | {(9 + 6) / 32} × 100 | 46.9% |
| 7 | {(13 + 9) / 32} × 100 | 68.8% |
| 8 | {(16 + 12) / 32} × 100 | 87.5% |
| 9 | {(8 + 3) / 32} × 100 | 34.4% |
| 10 | {(8 + 5) / 32} × 100 | 40.6% |
| 11 | {(14 + 8) / 32} × 100 | 68.8% |
| 12 | {(6 + 4) / 32} × 100 | 31.3% |
| 13 | {(5 + 4) / 32} × 100 | 28.1% |
| 14 | {(10 + 6) / 32} × 100 | 50.0% |
| 15 | {(6 + 2) / 32} × 100 | 25.0% |
| 16 | {(12 + 9) / 32} × 100 | 65.6% |
| 17 | {(9 + 9) / 32} × 100 | 56.3% |
| 18 | {(6 + 2) / 32} × 100 | 25.0% |
| 19 | {(16 + 10) / 32} × 100 | 81.3% |
| 20 | {(7 + 5) / 32} × 100 | 37.5% |
| 21 | {(7 + 4) / 32} × 100 | 34.4% |
| 22 | {(16 + 9) / 32} × 100 | 78.1% |
| 23 | {(8 + 7) / 32} × 100 | 46.9% |
| 24 | {(6 + 6) / 32} × 100 | 37.5% |
| 25 | {(16 + 16) / 32} × 100 | 100.0% |
| 26 | {(16 + 14) / 32} × 100 | 93.8% |
| 27 | {(12 + 8) / 32} × 100 | 62.5% |
| 28 | {(10 + 6) / 32} × 100 | 50.0% |
| 29 | {(16 + 13) / 32} × 100 | 90.6% |
| 30 | {(12 + 6) / 32} × 100 | 56.3% |
| 31 | {(5 + 4) / 32} × 100 | 28.1% |
| 32 | {(11 + 10) / 32} × 100 | 65.6% |
| 33 | {(16 + 8) / 32} × 100 | 75.0% |
| 34 | {(10 + 7) / 32} × 100 | 53.1% |
| 35 | {(15 + 5) / 32} × 100 | 62.5% |
| 36 | {(10 + 3) / 32} × 100 | 40.6% |
| 37 | {(10 + 5) / 32} × 100 | 46.9% |
| 38 | {(7 + 9) / 32} × 100 | 50.0% |
| 39 | {(15 + 8) / 32} × 100 | 71.9% |
| 40 | {(5 + 3) / 32} × 100 | 25.0% |
| 41 | {(6 + 4) / 32} × 100 | 31.3% |
| 42 | {(4 + 5) / 32} × 100 | 28.1% |
| 43 | {(6 + 3) / 32} × 100 | 28.1% |
| 44 | {(15 + 9) / 32} × 100 | 75.0% |
| 45 | {(7 + 2) / 32} × 100 | 28.1% |
| 46 | {(9 + 6) / 32} × 100 | 46.9% |
| 47 | {(9 + 3) / 32} × 100 | 37.5% |
| 48 | {(9 + 6) / 32} × 100 | 46.9% |
| 49 | {(9 + 7) / 32} × 100 | 50.0% |
| 50 | {(9 + 4) / 32} × 100 | 40.6% |

Comment

A good MCQ test contains more items with a difficulty level near 50. The table below shows the number of items in each difficulty level range:

| Range of Difficulty Level | Number of items |
|---|---|
| 0-10 | 0 |
| 11-20 | 1 |
| 21-30 | 8 |
| 31-40 | 9 |
| 41-50 | 12 |
| 51-60 | 3 |
| 61-70 | 6 |
| 71-80 | 5 |
| 81-90 | 3 |
| 91-100 | 3 |
| 0-100 | 50 |

It is seen that there are 34 items between difficulty levels 30 and 70, and 15 items between difficulty levels 41 and 60. So the test has generally followed the rule of difficulty level.

4.2.1.6 Discriminating Power

# Discriminating Power, D.P = (H – L) / (N/2)

H = No. of correct responses from the upper 16

L = No. of correct responses from the lower 16

N = Total number of pupils who tried the item = 16 + 16 = 32; N/2 = 32/2 = 16
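A minimal Python sketch of the discriminating power formula follows. The function name is illustrative, and the default of 32 attempts again assumes that every pupil in both groups tried the item.

```python
def discriminating_power(high_correct, low_correct, n_attempted=32):
    """D.P = (H - L) / (N / 2)."""
    return (high_correct - low_correct) / (n_attempted / 2)


# H and L for items 1 and 2 of the MCQ test, as given in the table of this section.
dp_item_1 = discriminating_power(9, 3)
dp_item_2 = discriminating_power(7, 8)
print(round(dp_item_1, 2), round(dp_item_2, 2))
```

A positive value means the upper group did better than the lower group on the item; a negative value means the lower group did better, which is why such items are treated as very weak.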

| Serial no. of item | Manipulation of data according to formula | Determined discriminating power |
|---|---|---|
| 1 | (9 – 3) / 16 | +0.38 |
| 2 | (7 – 8) / 16 | -0.06 |
| 3 | (4 – 6) / 16 | -0.13 |
| 4 | (13 – 11) / 16 | +0.13 |
| 5 | (8 – 6) / 16 | +0.13 |
| 6 | (9 – 6) / 16 | +0.19 |
| 7 | (13 – 9) / 16 | +0.25 |
| 8 | (16 – 12) / 16 | +0.25 |
| 9 | (8 – 3) / 16 | +0.31 |
| 10 | (8 – 5) / 16 | +0.19 |
| 11 | (14 – 8) / 16 | +0.38 |
| 12 | (6 – 4) / 16 | +0.13 |
| 13 | (5 – 4) / 16 | +0.06 |
| 14 | (10 – 6) / 16 | +0.25 |
| 15 | (6 – 2) / 16 | +0.25 |
| 16 | (12 – 9) / 16 | +0.19 |
| 17 | (9 – 9) / 16 | 0 |
| 18 | (6 – 2) / 16 | +0.25 |
| 19 | (16 – 10) / 16 | +0.38 |
| 20 | (7 – 5) / 16 | +0.13 |
| 21 | (7 – 4) / 16 | +0.19 |
| 22 | (16 – 9) / 16 | +0.44 |
| 23 | (8 – 7) / 16 | +0.06 |
| 24 | (6 – 6) / 16 | 0 |
| 25 | (16 – 16) / 16 | 0 |
| 26 | (16 – 14) / 16 | +0.13 |
| 27 | (12 – 8) / 16 | +0.25 |
| 28 | (10 – 6) / 16 | +0.25 |
| 29 | (16 – 13) / 16 | +0.19 |
| 30 | (12 – 6) / 16 | +0.38 |
| 31 | (5 – 4) / 16 | +0.06 |
| 32 | (11 – 10) / 16 | +0.06 |
| 33 | (16 – 8) / 16 | +0.50 |
| 34 | (10 – 7) / 16 | +0.19 |
| 35 | (15 – 5) / 16 | +0.63 |
| 36 | (10 – 3) / 16 | +0.44 |
| 37 | (10 – 5) / 16 | +0.31 |
| 38 | (7 – 9) / 16 | -0.13 |
| 39 | (15 – 8) / 16 | +0.44 |
| 40 | (5 – 3) / 16 | +0.13 |
| 41 | (6 – 4) / 16 | +0.13 |
| 42 | (4 – 5) / 16 | -0.06 |
| 43 | (6 – 3) / 16 | +0.19 |
| 44 | (15 – 9) / 16 | +0.38 |
| 45 | (7 – 2) / 16 | +0.31 |
| 46 | (9 – 6) / 16 | +0.19 |
| 47 | (9 – 3) / 16 | +0.38 |
| 48 | (9 – 6) / 16 | +0.19 |
| 49 | (9 – 7) / 16 | +0.13 |
| 50 | (9 – 4) / 16 | +0.31 |

Comment

Items with a discrimination power of +0.40 and above are considered very good items, items between +0.20 and +0.40 are considered satisfactory items, items between 0 and +0.20 are considered weak items, and items with a negative value are considered very weak items. The table below shows the number of items according to their discrimination power:

| Range of D.P. | Number of items | Comment |
|---|---|---|
| +0.40 and above | 5 | Very good item |
| +0.19 to +0.39 | 26 | Satisfactory item |
| 0 to below +0.19 | 15 | Weak item |
| D.P. of negative value | 4 | Very weak item |

It is to be noted that, as many items have a discriminating power of +0.19, this value has been counted as satisfactory (it is very near to +0.20). 31 items of the test are either very good or satisfactory, which is 62% of the total items. That is a good sign. But a good test should avoid items of negative value. 4 items of the test have negative values. That is a limitation of the test.

A good test requires validity,
reliability and objectivity. As it is an MCQ test, it must be objective. I
don’t have sufficient data to measure the validity of the test. Reliability can
be measured following the ‘Kuder and Richardson Method’. The reliability of the
test is determined below:

4.2.1.7 Reliability

$$ r_1 = \frac{n}{n-1}\left[1 - \frac{M(n-M)}{n\sigma^2}\right] $$

Here, n = 50; M = 22.75 (determined later on); σ = 3.65 (determined later on).

r1 = 0.08 (Apprx)

##### Reliability = 0.08 (Apprx); Test’s Reliability is low

4.2.1.8 Graphic presentation of Test – 2 analysis

Fig: Subject wise comparative number distribution

Fig: Domain wise comparative number distribution

Fig: Number of items according to Difficulty Level

Fig: Sorting of items according to Discriminating Power

4.3 Graphic comparison between Test-1 and Test-2

Fig: Subject wise comparative number distribution

Fig: Comparison between Reliability

Comment

The subject wise number distributions of Test-1 and Test-2 are exactly similar. The domain wise number distributions of Test-1 and Test-2 are not similar: the distribution of marks differs largely in the knowledge, comprehension, application and synthesis sub-domains. The reliability of Test-1 is higher than that of Test-2. Usually a descriptive essay type test doesn’t require item analysis, so it is not done here.

5.0 Test Result Analysis

5.1 Analysis of Test-1 results

Student’s score in Test-1

The total number of students in the class is 67. Here, to make a sound analysis and comparison, only the scores of the students who attended both the essay type and MCQ tests are given.

| Roll No. | Obtained Mark (out of 30) | Roll No. | Obtained Mark (out of 30) |
|---|---|---|---|
| 01 | 27 | 20 | 17.5 |
| 02 | 22 | 21 | 18 |
| 03 | 16.5 | 22 | 11.5 |
| 04 | 19 | 23 | 10 |
| 05 | 11 | 24 | 19.5 |
| 06 | 16 | 25 | Absent |
| 07 | 25 | 26 | 18 |
| 08 | 23 | 27 | 13 |
| 09 | 14 | 28 | 20 |
| 10 | 21 | 29 | Absent |
| 11 | 09 | 30 | 19 |
| 12 | 19 | 31 | 12 |
| 13 | 21.5 | 32 | 12 |
| 14 | Absent | 33 | 18 |
| 15 | 11 | 34 | 18 |
| 16 | 26.5 | 35 | 15 |
| 17 | 12.5 | 36 | 23 |
| 18 | 22 | 37 | 19 |
| 19 | 9 | 38 | 23.5 |
| 39 | 22 | 43 | 11.5 |
| 40 | 15 | 44 | 22 |
| 41 | 26 | 45 | 28.5 |
| 42 | 12.5 | | |

Tabulation of scores

# Number of the class = (Range/class Interval) + 1

= (22.5/5) + 1

= 4.5 + 1

= 5.5

Frequency Distribution Table

| Class Interval (C.I.) | Tallies | Frequency |
|---|---|---|
| 6 – 10 | | 7 |
| 11 – 15 | \|\|\|\| \|\|\|\| \|\|\|\| | 15 |
| 16 – 20 | \|\|\|\| \|\|\|\| \|\|\|\| \| | 16 |
| 21 – 25 | \|\|\|\| \|\|\|\| \|\|\| | 13 |
| 26 – 30 | \|\|\|\| \| | 6 |
| 31 – 35 | \| | 1 |

N = 58

5.1.1 Measures of Central Tendency

5.1.1.1 The Mean

# True Mean = A.M.+(Σfd/N)i

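| C.I. | Mid point | Frequency (f) | Deviation (d) | Product (fd) |
|---|---|---|---|---|
| 6 – 10 | 8 | 7 | – 2 | – 14 |
| 11 – 15 | 13 | 15 | – 1 | – 15 |
| 16 – 20 | 18 | 16 | 0 | 0 |
| 21 – 25 | 23 | 13 | + 1 | + 13 |
| 26 – 30 | 28 | 6 | + 2 | + 12 |
| 31 – 35 | 33 | 1 | + 3 | + 3 |
| Total | | N = 58 | | Σfd = – 1 |

Here, A.M. = 18; Σfd = – 1; N = 58; i = 5.

True Mean = A.M. + (Σfd/N)i = 18 + (– 1/58) × 5 = 17.9

Mean = 17.9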

5.1.1.2 The Median

# Median = L+{(N/2-cfu)/fm}i

| C.I. | Lower and upper limit of C.I. | Frequency (f) | Cumulative Frequency (cfu) |
|---|---|---|---|
| 6 – 10 | 5.5 – 10.5 | 7 | 7 |
| 11 – 15 | 10.5 – 15.5 | 15 | 22 |
| 16 – 20 | 15.5 – 20.5 | 16 | 38 |
| 21 – 25 | 20.5 – 25.5 | 13 | 51 |
| 26 – 30 | 25.5 – 30.5 | 6 | 57 |
| 31 – 35 | 30.5 – 35.5 | 1 | 58 |

Here, L = 15.5; cfu = 22; fm = 16; i = 5; N = 58.

Median = 15.5 + {(29 – 22)/16} × 5

Median = 17.7

### 5.1.1.3 The Mode

Mode = 3 median – 2 mean = 3(17.7) – 2(17.9)

Mode = 17.3

Comment

The values of the mean, median and mode are 17.9, 17.7 and 17.3 respectively, so they are near to each other. The 16 – 20 class interval contains the maximum frequency, and the mean, median and mode all belong to that class interval. The 11 – 15, 16 – 20 and 21 – 25 class intervals together contain 44 scores, that is, 75.9% of the total scores. This shows that the scores have a very high central tendency.

5.1.2 Measures of Variability

5.1.2.1 The Range

Range = 22.5

5.1.2.2 The Quartile Deviation

# Q1 = L1 + {(N/4 – F1)/fq1} i

Here, N/4 = 14.5, so the 25th percentile falls in the class interval 11 – 15: L1 = 10.5; F1 = 7; fq1 = 15; i = 5; N = 58.

Q1 = 10.5 + {(14.5 – 7)/15} × 5

= 10.5 + (0.5 × 5)

= 10.5 + 2.5

= 13.0

Q1 = 13.0 (Apprx.)

# Q3 = L3 + {(3N/4 – F3)/fq3} i

Here, 3N/4 = 43.5, so the 75th percentile falls in the class interval 21 – 25: L3 = 20.5; F3 = 38; fq3 = 13; i = 5; N = 58.

Q3 = 20.5 + {(43.5 – 38)/13} × 5

= 20.5 + (0.42 × 5)

= 20.5 + 2.12

= 22.62

Q3 = 22.62 (Apprx.)

Quartile Deviation, Q = (Q3 – Q1)/2

= (22.62 – 13.0)/2

= 9.62/2

= 4.81

Quartile Deviation = 4.81 (Apprx.)

### 5.1.2.3 Mean Deviation

$$ M.D. = \frac{\sum |fd|}{N} $$

| C.I. | Mid point (X) | Mean (M) | f | d = X – M | fd |
|---|---|---|---|---|---|
| 6 – 10 | 8 | 17.9 | 7 | – 9.9 | – 69.3 |
| 11 – 15 | 13 | 17.9 | 15 | – 4.9 | – 73.5 |
| 16 – 20 | 18 | 17.9 | 16 | + 0.1 | + 1.6 |
| 21 – 25 | 23 | 17.9 | 13 | + 5.1 | + 66.3 |
| 26 – 30 | 28 | 17.9 | 6 | + 10.1 | + 60.6 |
| 31 – 35 | 33 | 17.9 | 1 | + 15.1 | + 15.1 |

Here, Σ|fd| = 286.4; N = 58.

M.D. = Σ|fd|/N

= 286.4/58

= 4.94

Mean Deviation = 4.94 (Apprx.)

5.1.2.4 Standard Deviation

$$ \sigma = i\sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} $$

| C.I. | f | Deviation (d) | fd | fd² |
|---|---|---|---|---|
| 6 – 10 | 7 | – 2 | – 14 | 28 |
| 11 – 15 | 15 | – 1 | – 15 | 15 |
| 16 – 20 | 16 | 0 | 0 | 0 |
| 21 – 25 | 13 | + 1 | + 13 | 13 |
| 26 – 30 | 6 | + 2 | + 12 | 24 |
| 31 – 35 | 1 | + 3 | + 3 | 9 |

Here, ∑fd = – 1; ∑fd² = 89; i = 5; N = 58.

σ = √{89/58 – (– 1/58)²} × 5

= √{1.53 – (0.017)²} × 5

= √(1.53 – 0.000289) × 5

= √(1.53) × 5

= 1.24 × 5

= 6.2

Standard Deviation = 6.2 (Apprx.)

###### Comment

The value of range is 22.5,Quartile
Deviation is 4.41, Mean Deviation is 4.94 and Standard Deviation is 6.2.As
Standard Deviation is the most consistent measure of variability, considering
it, it can be said that lower scores are not much dispersed, they remain nearer
to the mean;. From the analysis of Measures of Central Tendency
&Variability some findings can be noted:

• 29 scores (50%) stand below the mean and 29 scores (50%) stand above the mean, so the distribution is taken to be normal.
• The mean, median and mode are very close to one another, but Mean > Median > Mode.
• Most scores are near the mean, and the further a score is from the mean, the fewer the subjects who attained it.
• Mean ± 1.0 S.D., that is, 11.7 – 24.1, contains a total of 39 scores (67.24%), so the distribution is taken to be normal.
• These findings show that although the distribution within Mean ± 1.0 S.D. is close to normal, the distribution within Mean ± 2.0 S.D. and Mean ± 3.0 S.D. is too far from the normal curve. The distribution is therefore said to be normal but with some kurtosis.
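
The mean deviation and standard deviation quoted in this comment can be cross-checked from the grouped frequencies. The following is a minimal sketch, assuming every score in a class sits at the class midpoint; the names `TEST1`, `grouped_mean`, `mean_deviation` and `standard_deviation` are illustrative and not part of the report. Because it works with the unrounded mean, its results may differ from the hand-worked figures in the second decimal place.

```python
from math import sqrt

# Test-1 grouped data as (class midpoint, frequency), copied from the tables above.
TEST1 = [(8, 7), (13, 15), (18, 16), (23, 13), (28, 6), (33, 1)]


def grouped_mean(table):
    """Mean of grouped data, treating each score as its class midpoint."""
    n = sum(f for _, f in table)
    return sum(x * f for x, f in table) / n


def mean_deviation(table):
    """Average absolute deviation of the midpoints from the grouped mean."""
    n = sum(f for _, f in table)
    m = grouped_mean(table)
    return sum(f * abs(x - m) for x, f in table) / n


def standard_deviation(table):
    """Population standard deviation of the grouped data."""
    n = sum(f for _, f in table)
    m = grouped_mean(table)
    return sqrt(sum(f * (x - m) ** 2 for x, f in table) / n)


print(round(grouped_mean(TEST1), 2))
print(round(mean_deviation(TEST1), 2))
print(round(standard_deviation(TEST1), 2))
```

The standard deviation computed this way is algebraically the same as the short-cut formula with an assumed mean, so it can be used to check that formula's arithmetic.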

The distribution is very near to normal but very slightly positively skewed.

Fig: Positive Skewness

$$\text{Skewness of the distribution} = \frac{3(\text{mean} - \text{median})}{\text{S.D.}} = 0.1 \text{ (Apprx.)}$$

The value is very low, which indicates that the distribution is very near to the normal distribution.

The distribution is very near to normal but is very slightly platykurtic.

Leptokurtic: where more scores than in a normal distribution stand within mean ± 2.0 and 3.0 S.D.

5.1.3 Measures of Relative Positions

The major measures of relative position are percentile ranks, Z-scores and T-scores. I shall determine only the Z-score and T-score here, as they are more reliable than percentile ranks.

$$\text{Standard score (Z-score), } Z = \frac{X - M}{\sigma}$$

$$T = 50 + 10Z$$

| Student’s Roll No. | Z-score (Z = (X - M) / σ) | T-score (T = 50 + 10Z) |
|---|---|---|
| 01 | (30 - 17.9) / 6.2 = 1.95 | 69.5 |
| 05 | (22 - 17.9) / 6.2 = 0.66 | 56.6 |
| 08 | (16.5 - 17.9) / 6.2 = -0.23 | 47.7 |
| 09 | (19 - 17.9) / 6.2 = 0.18 | 51.8 |
| 10 | (11 - 17.9) / 6.2 = -1.11 | 38.9 |
| 11 | (16 - 17.9) / 6.2 = -0.30 | 47 |
| 12 | (14 - 17.9) / 6.2 = -0.63 | 43.7 |
| 13 | (16 - 17.9) / 6.2 = -0.30 | 47 |
| 14 | (14 - 17.9) / 6.2 = -0.63 | 43.7 |
| 15 | (21 - 17.9) / 6.2 = 0.50 | 55 |
| 16 | (9 - 17.9) / 6.2 = -1.44 | 35.6 |
| 17 | (19 - 17.9) / 6.2 = 0.18 | 51.8 |
| 18 | (31.5 - 17.9) / 6.2 = 2.19 | 71.9 |
| 19 | (24 - 17.9) / 6.2 = 0.98 | 59.8 |
| 20 | (11 - 17.9) / 6.2 = -1.11 | 38.9 |
| 21 | (26.5 - 17.9) / 6.2 = 1.39 | 63.9 |
| 22 | (12.5 - 17.9) / 6.2 = -0.87 | 41.3 |
| 23 | (22 - 17.9) / 6.2 = 0.66 | 56.6 |
| 25 | (9 - 17.9) / 6.2 = -1.44 | 35.6 |
| 26 | | 43.4 |
| 27 | (15 - 17.9) / 6.2 = -0.47 | 45.3 |
| 28 | (26 - 17.9) / 6.2 = 1.30 | 63 |
| 29 | (12.5 - 17.9) / 6.2 = -0.87 | 41.3 |
| 32 | (17.5 - 17.9) / 6.2 = -0.06 | 49.4 |
| 33 | (8 - 17.9) / 6.2 = -1.60 | 34 |
| 34 | (11.5 - 17.9) / 6.2 = -1.03 | 39.7 |
| 36 | (10 - 17.9) / 6.2 = -1.27 | 37.3 |
| 37 | (19.5 - 17.9) / 6.2 = 0.26 | 52.6 |
| 38 | (30 - 17.9) / 6.2 = 1.95 | 69.5 |
| 39 | (18 - 17.9) / 6.2 = 0.02 | 50.2 |
| 40 | (13 - 17.9) / 6.2 = -0.80 | 42 |
| 41 | (20 - 17.9) / 6.2 = 0.34 | 53.4 |
| 42 | (24.5 - 17.9) / 6.2 = 1.06 | 60.6 |
| 43 | (19 - 17.9) / 6.2 = 0.18 | 51.8 |
| 44 | (12 - 17.9) / 6.2 = -0.95 | 40.5 |
| 45 | (12 - 17.9) / 6.2 = -0.95 | 40.5 |

Comment

From this table the relative position of any student, compared with another student or with the group as a whole, can easily be determined. The T-score shown above is itself a comprehensive presentation of relative position. Scores above 50 are above average; scores below 50 are below average.
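
The two conversions used in the table can be written as a pair of small functions. This is only an illustrative sketch: the function names are assumptions, and the mean (17.9) and S.D. (6.2) are the Test-1 values reported above. As in the table, Z is rounded to two decimals before T is computed.

```python
def z_score(x, mean, sd):
    """Standard score: distance of x from the mean in S.D. units."""
    return (x - mean) / sd


def t_score(z):
    """T-score: Z rescaled to a mean of 50 and an S.D. of 10."""
    return 50 + 10 * z


MEAN, SD = 17.9, 6.2  # Test-1 values from the report

# Two marks taken from the table above (roll 01 and roll 16).
for roll, mark in [("01", 30), ("16", 9)]:
    z = round(z_score(mark, MEAN, SD), 2)
    print(roll, z, round(t_score(z), 1))
```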

5.1.4 Grading of Test-1 Results

Norm of Grading

| Norm | Number Range (Apprx.) | Grade | GPA |
|---|---|---|---|
| Mean + 2.5 S.D. and above | 33 and above | A | 4.00 |
| Mean + 1.5 S.D. up to Mean + 2.5 S.D. | 27 – 32 | B | 3.50 |
| Mean + 0.5 S.D. up to Mean + 1.5 S.D. | 21 – 26 | C | 3.00 |
| Mean - 0.5 S.D. up to Mean + 0.5 S.D. | 14 – 19 | D | 2.50 |
| Below Mean - 0.5 S.D. | 0 – 13 | F | - |

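Grading of students

| Roll No. | Obtained Mark | Grade | GPA |
|---|---|---|---|
| 01 | 30 | B | 3.50 |
| 02 | 22 | C | 3.00 |
| 03 | 16.5 | D | 2.50 |
| 04 | 19 | D | 2.50 |
| 05 | 11 | F | - |
| 06 | 16 | D | 2.50 |
| 07 | 14 | D | 2.50 |
| 08 | 16 | D | 2.50 |
| 09 | 14 | D | 2.50 |
| 10 | 21 | C | 3.00 |
| 11 | 9 | F | - |
| 12 | 19 | D | 2.50 |
| 13 | 31.5 | B | 3.50 |
| 14 | 24 | C | 3.00 |
| 15 | 11 | F | - |
| 16 | 26.5 | B | 3.50 |
| 17 | 12.5 | F | - |
| 18 | 22 | C | 3.00 |
| 19 | 9 | F | - |
| 20 | 22 | C | 3.00 |
| 21 | 15 | D | 2.50 |
| 22 | 26 | C | 3.00 |
| 23 | 12.5 | F | - |
| 24 | 17.5 | D | 2.50 |
| 25 | 18 | D | 2.50 |
| 26 | 11.5 | F | - |
| 27 | 10 | F | - |
| 28 | 19.5 | D | 2.50 |
| 29 | 30 | B | 3.50 |
| 30 | 18 | D | 2.50 |
| 31 | 13 | F | - |
| 32 | 20 | D | 2.50 |
| 33 | 24.5 | C | 3.00 |
| 34 | 19 | D | 2.50 |
| 35 | 12 | F | - |
| 36 | 12 | F | - |
| 37 | 18 | D | 2.50 |
| 38 | 18 | D | 2.50 |
| 39 | 15 | D | 2.50 |
| 40 | 23 | C | 3.00 |
| 41 | 19 | D | 2.50 |
| 42 | 23.5 | C | 3.00 |
| 43 | 11.5 | F | - |
| 44 | 22 | C | 3.00 |
| 45 | 28.5 | B | 3.50 |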
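
Applying the norm to a mark amounts to a simple threshold lookup. The sketch below assumes the lower bound of each approximate number range is the cut-off, so a half mark that falls between two ranges (for example 26.5) is given the lower grade; that treatment is an assumption of this sketch and may not match every entry graded by hand. `GRADE_BANDS` and `grade` are illustrative names.

```python
# (lower bound of the number range, grade, GPA), highest band first.
# Bounds are the approximate Test-1 ranges from the norm table.
GRADE_BANDS = [
    (33, "A", 4.00),
    (27, "B", 3.50),
    (21, "C", 3.00),
    (14, "D", 2.50),
]


def grade(mark):
    """Return (grade, GPA) for a mark; marks below every band get F."""
    for lower, letter, gpa in GRADE_BANDS:
        if mark >= lower:
            return letter, gpa
    return "F", None


for mark in (30, 22, 16.5, 11):
    print(mark, grade(mark))
```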

It is to be noted that this grading was not used in the school’s test result; it is done only for the purpose of this statistical analysis report. The combined score of the MCQ and essay-type tests was used for grading the students, and that result was submitted to the school. The score was then converted to a mark out of 15 and added as the class test mark.

Percentage of pass and fail

| P/F | No. of students | Percentage |
|---|---|---|
| Pass | 40 | 69% |
| Fail | 18 | 31% |

Percentage of grading

| Grade | No. of students | Percentage |
|---|---|---|
| A | 0 | 0% |
| B | 6 | 10.34% |
| C | 14 | 24.14% |
| D | 20 | 34.48% |
| F | 18 | 31.04% |
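
The percentages follow directly from the grade counts. A short sketch of the arithmetic, using the counts from the table above (variable names are illustrative):

```python
# Number of students per grade in Test-1, from the table above.
grade_counts = {"A": 0, "B": 6, "C": 14, "D": 20, "F": 18}

total = sum(grade_counts.values())

# Share of each grade, as a percentage of all students.
percentages = {g: round(100 * c / total, 2) for g, c in grade_counts.items()}

# Every grade except F counts as a pass.
pass_pct = round(100 * (total - grade_counts["F"]) / total, 2)

print(total, percentages, pass_pct)
```

Rounding each share separately can make the column add up to slightly more or less than 100%, which is why the last figure in a hand-built table is sometimes adjusted.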

5.1.5 Graphic presentation of Test-1 result analysis

5.2 Analysis of Test-2 results (MCQ test)

Students’ scores in Test-2

| Roll No. | Name | Obtained Mark (out of 50) |
|---|---|---|
| 01 | Parvin Akter Sumi | 26 |
| 05 | Tania Akter | 29 |
| 08 | Rawnok Sultana | 29 |
| 09 | Farhana Akter | 22 |
| 10 | Afroza Akter Anika | 22 |
| 11 | Farzana Abedin Chadni | 18 |
| 12 | Shahana Akter Runa | 22 |
| 13 | Mahmuda Akter Asha | 23 |
| 14 | Ivy Moni | 21 |
| 15 | Aesha Akter Lucky | 22 |
| 16 | Moni Akter | 20 |
| 17 | Sabina Yasmin Boishakhi | 23 |
| 18 | Shompa Akter | 27 |
| 19 | Shakil Hossain | 25 |
| 20 | Soleman Hossain | 20 |
| 21 | Aesha Akter Liza | 27 |
| 22 | Surma Akter | 20 |
| 23 | Md. Rubel | 23 |
| 25 | Kohinur Akter Munni | 19 |
| 26 | Eti Rani Das Brishti | 24 |
| 27 | Shamsun Nahar | 24 |
| 28 | Kamrul Hasan Meraj | 26 |
| 29 | Nur Fatema | 19 |
| 32 | Tania Akter | 16 |
| 33 | Bithi Akter Nadia | 22 |
| 34 | Jasim Uddin | 21 |
| 36 | Shima Akter Shimu | 21 |
| 37 | Rita Rani Roy | 23 |
| 38 | Maksuda Akter Priya | 28 |
| 39 | Mominur Parvez Rafil | 20 |
| 40 | Shahadat Hossain | 22 |
| 41 | Sazzad Hossain | 18 |
| 42 | Alamin Mia | 23 |
| 43 | Miraj Hossain Shuvo | 22 |
| 44 | Shongita Rani Roy | 27 |
| 45 | Nargis Akter | 20 |

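Tabulation of scores

$$\text{Number of classes} = \frac{\text{Range}}{\text{class interval}} + 1 = \frac{16}{5} + 1 = 3.2 + 1 = 4.2 \approx 4$$

Frequency Distribution Table

| Class Interval (C.I.) | Tallies | Frequency |
|---|---|---|
| 11 – 15 | \| | 1 |
| 16 – 20 | \|\|\|\| \|\|\|\| \|\|\|\| | 14 |
| 21 – 25 | \|\|\|\| \|\|\|\| \|\|\|\| \|\|\|\| \|\|\|\| \|\|\|\| | 30 |
| 26 – 30 | \|\|\|\| \|\|\|\| \|\|\| | 13 |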
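
Tallying can also be done mechanically. The sketch below counts how many marks fall in each class interval, using only the 36 Test-2 marks listed in the score table above; the report's frequency table covers N = 58, so the counts printed here are for that listed subset and will not match the table. `frequency_table` and its parameters are illustrative names.

```python
# The 36 Test-2 marks listed in the score table, in roll-number order.
TEST2_SCORES = [26, 29, 29, 22, 22, 18, 22, 23, 21, 22, 20, 23,
                27, 25, 20, 27, 20, 23, 19, 24, 24, 26, 19, 16,
                22, 21, 21, 23, 28, 20, 22, 18, 23, 22, 27, 20]


def frequency_table(scores, first_lower, width, n_classes):
    """Count scores per class; boundaries sit half a mark outside the limits."""
    table = []
    for k in range(n_classes):
        lower = first_lower + k * width
        upper = lower + width - 1
        freq = sum(1 for s in scores if lower - 0.5 <= s < upper + 0.5)
        table.append((lower, upper, freq))
    return table


for lower, upper, freq in frequency_table(TEST2_SCORES, 11, 5, 4):
    print(f"{lower} - {upper}: {freq}")
```

Using the class boundaries (10.5, 15.5, and so on) rather than the written limits means that half marks, which occur in Test-1, would still land in exactly one class.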

5.2.1 Measures of Central Tendency

5.2.1.1 The Mean

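$$\text{True Mean} = \text{A.M.} + \frac{\sum fd}{N} \times i$$

| C.I. | Mid point | Frequency (f) | Deviation (d) | Product (fd) |
|---|---|---|---|---|
| 11 – 15 | 13 | 1 | -2 | -2 |
| 16 – 20 | 18 | 14 | -1 | -14 |
| 21 – 25 | 23 | 30 | 0 | 0 |
| 26 – 30 | 28 | 13 | +1 | +13 |

N = 58, Σfd = -3

Here: A.M. = 23, N = 58, Σfd = -3, i = 5

$$\text{True Mean} = 23 + \left(\frac{-3}{58}\right) \times 5 = 23 + (-0.05) \times 5 = 23 - 0.25 = 22.75$$

True Mean = 22.75 (Apprx.)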
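
The assumed-mean method can be sketched in a few lines. The function name and argument layout are illustrative assumptions; the data are the Test-2 midpoints and frequencies from the table above. Because the code does not round Σfd/N before multiplying, its result may differ from a hand calculation in the second decimal place.

```python
def mean_by_assumed_mean(table, assumed_mean, width):
    """Grouped mean via A.M. + (sum of f*d / N) * i, with d in class units."""
    n = sum(f for _, f in table)
    sum_fd = sum(f * (x - assumed_mean) / width for x, f in table)
    return assumed_mean + (sum_fd / n) * width


# Test-2 grouped data as (class midpoint, frequency).
TEST2 = [(13, 1), (18, 14), (23, 30), (28, 13)]

print(round(mean_by_assumed_mean(TEST2, assumed_mean=23, width=5), 2))
```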

5.2.1.2 The Median

$$\text{Median} = L + \frac{N/2 - cf_u}{f_m} \times i$$

| C.I. | Lower and upper limit of C.I. | Frequency (f) | Cumulative Frequency (cfu) |
|---|---|---|---|
| 11 – 15 | 10.5 – 15.5 | 1 | 1 |
| 16 – 20 | 15.5 – 20.5 | 14 | 15 |
| 21 – 25 | 20.5 – 25.5 | 30 | 45 |
| 26 – 30 | 25.5 – 30.5 | 13 | 58 |

Here: L = 20.5, cfu = 15, fm = 30, i = 5, N = 58

Median = L + {(N/2 - cfu)/fm} × i

= 20.5 +{(29 – 15)/ 29} 5

= 20.5 + (14/29) 5

 Median = 22.9
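
The interpolation behind the median formula generalises to any quantile of grouped data, which gives a way to cross-check the hand calculation. This is a minimal sketch under the usual assumption that scores are spread evenly within a class; `grouped_quantile` is an illustrative name, and its result may differ slightly from the hand-worked figure depending on rounding.

```python
def grouped_quantile(classes, p):
    """Quantile p (0 < p <= 1) of grouped data by linear interpolation.

    classes: list of (lower score limit, upper score limit, frequency).
    """
    n = sum(f for _, _, f in classes)
    target = p * n          # position of the wanted score, e.g. N/2
    below = 0               # cumulative frequency below the current class
    for lo, hi, f in classes:
        if f > 0 and below + f >= target:
            lower_boundary = lo - 0.5
            width = hi - lo + 1
            return lower_boundary + (target - below) / f * width
        below += f
    raise ValueError("p must lie between 0 and 1")


# Test-2 frequency distribution from the table above.
TEST2_CLASSES = [(11, 15, 1), (16, 20, 14), (21, 25, 30), (26, 30, 13)]

print(round(grouped_quantile(TEST2_CLASSES, 0.5), 2))
```

Calling the same function with p = 0.25 and p = 0.75 gives the quartiles Q1 and Q3 needed for the quartile deviation.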

5.2.1.3 The Mode

$$\text{Mode} = 3 \times \text{Median} - 2 \times \text{Mean} = 3(22.9) - 2(22.75) = 68.7 - 45.5 = 23.2$$

Mode = 23.2

Comment

The values of the mean, median and mode are 22.75, 22.9 and 23.2 respectively, so they lie close to one another. The class interval 21 – 25 contains the highest frequency, and the mode, mean and median all belong to that class interval. The class intervals 16 – 20, 21 – 25 and 26 – 30 together contain 57 scores, that is 98.28% of all scores. This shows that the scores have a very high central tendency.

5.2.2 Measures of Variability

5.2.2.1 The Range

Range = 16

5.2.2.2 The Quartile Deviation

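$$Q_1 = L_1 + \frac{N/4 - F_1}{f_{q1}} \times i$$

Here N/4 = 58/4 = 14.5, which falls in the class interval 16 – 20, so L1 = 15.5, F1 = 1, fq1 = 14, i = 5, N = 58.

$$Q_1 = 15.5 + \frac{14.5 - 1}{14} \times 5 = 15.5 + \frac{13.5}{14} \times 5 = 15.5 + 0.964 \times 5 = 15.5 + 4.82 = 20.32$$

Q1 = 20.32 (Apprx.)

$$Q_3 = L_3 + \frac{3N/4 - F_3}{f_{q3}} \times i$$

Here 3N/4 = 43.5, which falls in the class interval 21 – 25, so L3 = 20.5, F3 = 15, fq3 = 30, i = 5, N = 58.

$$Q_3 = 20.5 + \frac{43.5 - 15}{30} \times 5 = 20.5 + 0.95 \times 5 = 20.5 + 4.75 = 25.25$$

Q3 = 25.25

$$\text{Quartile Deviation, } Q = \frac{Q_3 - Q_1}{2} = \frac{25.25 - 20.32}{2} = \frac{4.93}{2} \approx 2.46$$

Quartile Deviation = 2.46 (Apprx.)

5.2.2.3 Mean Deviation

$$\text{Mean Deviation} = \frac{\sum |fd|}{N}$$

| C.I. | Mid point (X) | Mean (M) | f | d = X - M | fd |
|---|---|---|---|---|---|
| 11 – 15 | 13 | 22.75 | 1 | -9.75 | -9.75 |
| 16 – 20 | 18 | 22.75 | 14 | -4.75 | -66.5 |
| 21 – 25 | 23 | 22.75 | 30 | +0.25 | +7.5 |
| 26 – 30 | 28 | 22.75 | 13 | +5.25 | +68.25 |

Here: Σ|fd| = 152, N = 58

$$\frac{\sum |fd|}{N} = \frac{152}{58} = 2.62$$

Mean Deviation = 2.62 (Apprx.)

5.2.2.4 Standard Deviation

$$\sigma = i \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$$

Here d is the deviation of each class from the class 21 – 25, measured in class-interval units.

| C.I. | f | Deviation (d) | fd | fd² |
|---|---|---|---|---|
| 11 – 15 | 1 | -2 | -2 | 4 |
| 16 – 20 | 14 | -1 | -14 | 14 |
| 21 – 25 | 30 | 0 | 0 | 0 |
| 26 – 30 | 13 | +1 | +13 | 13 |

Here: Σfd = -3, Σfd² = 31, i = 5, N = 58

$$\sigma = 5 \sqrt{\frac{31}{58} - \left(\frac{-3}{58}\right)^2} = 5 \sqrt{0.53 - (0.05)^2} = 5 \sqrt{0.53 - 0.0025} = 5 \sqrt{0.53} = 5 \times 0.73 = 3.65$$

Standard Deviation = 3.65 (Apprx.)

Comment

The value of the range is 16, the Quartile Deviation is 2.46, the Mean Deviation is 2.62 and the Standard Deviation is 3.65. As the Standard Deviation is the most consistent measure of variability, judging by it, it can be said that the lower scores are not much dispersed; they remain near the mean. From the analysis of the Measures of Central Tendency and Variability, some findings can be noted: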

• 30 scores (51.72%) stand below the mean and 28 scores (48.28%) stand above the mean, so the distribution is nearly normal.

• The mean, median and mode are not the same: Mean < Median < Mode.

• Most scores are near the mean, and the further a score is from the mean, the fewer the subjects who attained it.

• Mean ± 1.0 S.D., that is, 19.1 – 26.4, contains a total of 38 scores (65.52%), so the distribution is nearly normal.
• These findings show that the distribution is not far from the normal distribution, although it is slightly negatively skewed.
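
The comparison with the normal curve rests on a known property: in a normal distribution about 68% of scores lie within 1 S.D. of the mean, about 95% within 2 S.D. and about 99.7% within 3 S.D. The sketch below computes these reference shares with the standard library and sets the observed Test-2 share for Mean ± 1.0 S.D. (38 of 58 scores, from the findings above) beside them; `normal_share_within` is an illustrative name.

```python
from math import erf, sqrt


def normal_share_within(k):
    """Share of a normal distribution lying within k S.D. of the mean."""
    return erf(k / sqrt(2))


# Reference shares for 1, 2 and 3 S.D.
for k in (1, 2, 3):
    print(k, round(100 * normal_share_within(k), 2))

# Observed share within Mean +/- 1.0 S.D. in Test-2: 38 of 58 scores.
observed = 38 / 58
print(round(100 * observed, 2))
```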

The distribution is very near to normal and very slightly negatively skewed.

Fig: Negative Skewness

$$\text{Skewness of the distribution} = \frac{3(\text{mean} - \text{median})}{\text{S.D.}} = \frac{3(22.75 - 22.9)}{3.65} = -0.12$$

The value of skewness is very low and can therefore be ignored.
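
The skewness measure used here is Pearson's coefficient based on the mean and the median. A minimal sketch, evaluated with the Test-2 mean, median and S.D. reported in this section (the function name is illustrative):

```python
def pearson_skewness(mean, median, sd):
    """Pearson's median-based skewness: 3 * (mean - median) / S.D."""
    return 3 * (mean - median) / sd


# Test-2 values reported above.
print(round(pearson_skewness(22.75, 22.9, 3.65), 2))
```

A negative result means the mean lies below the median, that is, the longer tail is on the low-score side.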

The distribution is very nearly normal.

Fig: Normal Distribution

5.2.3 Measures of Relative Positions

The major measures of relative position are percentile ranks, Z-scores and T-scores. I shall determine only the Z-score and T-score here, as they are more reliable than percentile ranks.

$$\text{Standard score (Z-score), } Z = \frac{X - M}{\sigma}$$

$$T = 50 + 10Z$$

| Student’s Roll No. | Z-score (Z = (X - M) / σ) | T-score (T = 50 + 10Z) |
|---|---|---|
| 01 | (26 - 22.75) / 3.65 = 0.89 | 58.9 |
| 05 | (29 - 22.75) / 3.65 = 1.71 | 67.1 |
| 08 | (29 - 22.75) / 3.65 = 1.71 | 67.1 |
| 09 | (22 - 22.75) / 3.65 = -0.20 | 48 |
| 10 | (22 - 22.75) / 3.65 = -0.20 | 48 |
| 11 | (18 - 22.75) / 3.65 = -1.30 | 37 |
| 12 | (22 - 22.75) / 3.65 = -0.20 | 48 |
| 13 | (23 - 22.75) / 3.65 = 0.07 | 50.7 |
| 14 | (21 - 22.75) / 3.65 = -0.48 | 45.2 |
| 15 | (22 - 22.75) / 3.65 = -0.20 | 48 |
| 16 | (20 - 22.75) / 3.65 = -0.75 | 42.5 |
| 17 | (23 - 22.75) / 3.65 = 0.07 | 50.7 |
| 18 | (27 - 22.75) / 3.65 = 1.16 | 61.6 |
| 19 | (25 - 22.75) / 3.65 = 0.62 | 56.2 |
| 20 | (20 - 22.75) / 3.65 = -0.75 | 42.5 |
| 21 | (27 - 22.75) / 3.65 = 1.16 | 61.6 |
| 22 | (20 - 22.75) / 3.65 = -0.75 | 42.5 |
| 23 | (23 - 22.75) / 3.65 = 0.07 | 50.7 |
| 25 | (19 - 22.75) / 3.65 = -1.03 | 39.7 |
| 26 | (24 - 22.75) / 3.65 = 0.34 | 53.4 |
| 27 | (24 - 22.75) / 3.65 = 0.34 | 53.4 |
| 28 | (26 - 22.75) / 3.65 = 0.89 | 58.9 |
| 29 | (19 - 22.75) / 3.65 = -1.03 | 39.7 |
| 32 | (16 - 22.75) / 3.65 = -1.85 | 31.5 |
| 33 | (22 - 22.75) / 3.65 = -0.20 | 48 |
| 34 | (21 - 22.75) / 3.65 = -0.48 | 45.2 |
| 36 | (21 - 22.75) / 3.65 = -0.48 | 45.2 |
| 37 | (23 - 22.75) / 3.65 = 0.07 | 50.7 |
| 38 | (28 - 22.75) / 3.65 = 1.44 | 64.4 |
| 39 | (20 - 22.75) / 3.65 = -0.75 | 42.5 |
| 40 | (22 - 22.75) / 3.65 = -0.20 | 48 |
| 41 | (18 - 22.75) / 3.65 = -1.30 | 37 |
| 42 | (23 - 22.75) / 3.65 = 0.07 | 50.7 |
| 43 | (22 - 22.75) / 3.65 = -0.20 | 48 |
| 44 | (27 - 22.75) / 3.65 = 1.16 | 61.6 |
| 45 | (20 - 22.75) / 3.65 = -0.75 | 42.5 |

Comment

From this table the relative position of any student, compared with another student or with the group as a whole, can easily be determined. The T-score shown above is itself a comprehensive presentation of relative position. Scores above 50 are above average; scores below 50 are below average.

5.2.4 Grading of Test-2 Results

Norm of Grading

| Norm | Number Range (Apprx.) | Grade | GPA |
|---|---|---|---|
| Mean + 3 S.D. and above | 34 and above | A | 4.00 |
| Mean + 1.5 S.D. up to Mean + 3 S.D. | 28 – 33 | B | 3.50 |
| Mean + 0.5 S.D. up to Mean + 1.5 S.D. | 23 – 27 | C | 3.00 |
| Mean - 1.5 S.D. up to Mean | 17 – 22 | D | 2.50 |
| Below Mean - 1.5 S.D. | 0 – 16 | F | - |

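Grading of students

| Roll No. | Obtained Mark | Grade | GPA |
|---|---|---|---|
| 01 | 26 | C | 3.00 |
| 05 | 29 | B | 3.50 |
| 08 | 29 | B | 3.50 |
| 09 | 22 | D | 2.50 |
| 10 | 22 | D | 2.50 |
| 11 | 18 | D | 2.50 |
| 12 | 22 | D | 2.50 |
| 13 | 23 | C | 3.00 |
| 14 | 21 | D | 2.50 |
| 15 | 22 | D | 2.50 |
| 16 | 20 | D | 2.50 |
| 17 | 23 | C | 3.00 |
| 18 | 27 | C | 3.00 |
| 19 | 25 | C | 3.00 |
| 20 | 20 | D | 2.50 |
| 21 | 27 | C | 3.00 |
| 22 | 20 | D | 2.50 |
| 23 | 23 | C | 3.00 |
| 25 | 19 | D | 2.50 |
| 26 | 24 | C | 3.00 |
| 27 | 24 | C | 3.00 |
| 28 | 26 | C | 3.00 |
| 29 | 19 | D | 2.50 |
| 32 | 16 | F | - |
| 33 | 22 | D | 2.50 |
| 34 | 21 | D | 2.50 |
| 36 | 21 | D | 2.50 |
| 37 | 23 | C | 3.00 |
| 38 | 28 | B | 3.50 |
| 39 | 20 | D | 2.50 |
| 40 | 22 | D | 2.50 |
| 41 | 18 | D | 2.50 |
| 42 | 23 | C | 3.00 |
| 43 | 22 | D | 2.50 |
| 44 | 27 | C | 3.00 |
| 45 | 20 | D | 2.50 |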

It is to be noted that this grading was not used in the school’s test result; it is done only for the purpose of this statistical analysis report. The combined score of the MCQ and essay-type tests was used for grading the students, and that result was submitted to the school. The score was then converted to a mark out of 15 and added as the class test mark.

Percentage of pass and fail

| P/F | No. of students | Percentage |
|---|---|---|
| Pass | 55 | 94.83% |
| Fail | 3 | 5.17% |

Percentage of grading

| Grade | No. of students | Percentage |
|---|---|---|
| A | 0 | 0% |
| B | 4 | 6.9% |
| C | 23 | 39.65% |
| D | 28 | 48.27% |
| F | 3 | 5.17% |

$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\{N\sum X^2 - (\sum X)^2\}\{N\sum Y^2 - (\sum Y)^2\}}}$$

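| Roll No. | Score of Test-1 (X) | X² | Score of Test-2 (Y) | Y² | XY |
|---|---|---|---|---|---|
| 01 | 30 | 900 | 26 | 676 | 780 |
| 05 | 22 | 484 | 29 | 841 | 638 |
| 08 | 16.5 | 272.25 | 29 | 841 | 478.5 |
| 09 | 19 | 361 | 22 | 484 | 418 |
| 10 | 11 | 121 | 22 | 484 | 242 |
| 11 | 16 | 256 | 18 | 324 | 288 |
| 12 | 14 | 196 | 22 | 484 | 308 |
| 13 | 16 | 256 | 23 | 529 | 368 |
| 14 | 14 | 196 | 21 | 441 | 294 |
| 15 | 21 | 441 | 22 | 484 | 462 |
| 16 | 9 | 81 | 20 | 400 | 180 |
| 17 | 19 | 361 | 23 | 529 | 437 |
| 18 | 31.5 | 992.25 | 27 | 729 | 850.5 |
| 19 | 24 | 576 | 25 | 625 | 600 |
| 20 | 11 | 121 | 20 | 400 | 220 |
| 21 | 26.5 | 702.25 | 27 | 729 | 715.5 |
| 22 | 12.5 | 156.25 | 20 | 400 | 250 |
| 23 | 22 | 484 | 23 | 529 | 506 |
| 25 | 9 | 81 | 19 | 361 | 171 |
| 26 | 22 | 484 | 24 | 576 | 528 |
| 27 | 15 | 225 | 24 | 576 | 360 |
| 28 | 26 | 676 | 26 | 676 | 676 |
| 29 | 12.5 | 156.25 | 19 | 361 | 237.5 |
| 32 | 17.5 | 306.25 | 16 | 256 | 280 |
| 33 | 18 | 324 | 22 | 484 | 396 |
| 34 | 11.5 | 132.25 | 21 | 441 | 241.5 |
| 36 | 10 | 100 | 21 | 441 | 210 |
| 37 | 19.5 | 380.25 | 23 | 529 | 448.5 |
| 38 | 30 | 900 | 28 | 784 | 840 |
| 39 | 18 | 324 | 20 | 400 | 360 |
| 40 | 13 | 169 | 22 | 484 | 286 |
| 41 | 20 | 400 | 18 | 324 | 360 |
| 42 | 24.5 | 600.25 | 23 | 529 | 563.5 |
| 43 | 19 | 361 | 22 | 484 | 418 |
| 44 | 12 | 144 | 27 | 729 | 324 |
| 45 | 12 | 144 | 20 | 400 | 240 |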

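N = 58, ΣX = 1038, ΣX² = 20677, ΣY = 1313, ΣY² = 30441, ΣXY = 24178.5

$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\{N\sum X^2 - (\sum X)^2\}\{N\sum Y^2 - (\sum Y)^2\}}}$$

$$r = \frac{58 \times 24178.5 - 1038 \times 1313}{\sqrt{\{58 \times 20677 - (1038)^2\}\{58 \times 30441 - (1313)^2\}}}$$

$$= \frac{1402353 - 1362894}{\sqrt{(1199266 - 1077444)(1765578 - 1723969)}}$$

$$= \frac{39459}{\sqrt{121822 \times 41609}} = +0.55 \text{ (Apprx.)}$$

Correlation = +0.55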

Comment

The value of the correlation between Test-1 and Test-2 is +0.55. This indicates that the tests have a positive, substantial and marked relationship: the two sets of scores tend to vary in the same direction. For example, a student who did well in Test-1 has a high chance of doing well in Test-2, and a student who did not do well in Test-1 has a high chance of scoring low in Test-2, and vice versa.
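
The product-moment formula can be evaluated directly from the six summary figures. The sketch below uses the sums reported above with N = 58, the value used in the worked calculation; the function name and argument order are illustrative assumptions.

```python
from math import sqrt


def pearson_r(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson correlation coefficient from raw-score sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Sums for Test-1 (X) and Test-2 (Y) as reported in this section.
r = pearson_r(n=58, sum_x=1038, sum_y=1313,
              sum_x2=20677, sum_y2=30441, sum_xy=24178.5)
print(round(r, 2))
```

Working from sums avoids computing each deviation from the mean, which is why the raw-score form of the formula is convenient for hand calculation.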

5.4 Graphic comparison between Test-1 & Test-2 results

1. PATEL R.N., Educational Evaluation, New Delhi: Himalaya Publishing House, 1985.
2. BLOMMERS Paul and LINDQUIST E.F., Elementary Statistical Methods, University of London Press Ltd.
3. THORNDIKE R.L. and ELIZABETH H.H., Measurement and Evaluation in Psychology and Education, Wiley Eastern Pvt. Ltd., New Delhi.
4. GARRETT H.E., Statistics in Psychology and Education.
5. TAPAN SHAJAHAN and HOSSAIN MONIRA, Educational Evaluation, School of Education, Bangladesh Open University, 1998.