Introduction
Educational assessment is not a peripheral procedure added to teaching after learning has taken place. It is one of the main mechanisms through which educational systems define what counts as legitimate knowledge, what forms of reasoning are valued, and what kinds of learner performance are recognised. In examination-oriented systems, the cognitive demand of assessment questions is particularly decisive because it guides teachers’ instructional practices and learners’ study strategies. When national examinations repeatedly privilege recall, learners are indirectly trained to reproduce information; when they require interpretation, transfer, argumentation, and problem-solving, they encourage a deeper relationship with knowledge.
This issue is especially important in educational systems that have formally adopted a competency-based curriculum. In such systems, assessment is expected to move beyond the verification of isolated knowledge and to measure the learner’s ability to mobilise knowledge, methods, and attitudes in meaningful situations. Algerian curriculum reform, particularly the adoption of the competency-based approach, has repeatedly emphasised the need to develop learners’ autonomy, critical thinking, problem-solving skills, and capacity to participate actively in social life. However, the effectiveness of such reforms depends on the degree of alignment between curricular discourse, classroom practice, and national examinations.
Middle school exit examinations occupy a strategic position within this alignment. They function simultaneously as certification instruments, selection mechanisms, and signals to teachers, learners, and families regarding the cognitive expectations of the educational system. For disciplines such as History, Geography, Islamic Education, and Civic Education, the stakes are not merely informational. These subjects are meant to cultivate historical consciousness, spatial reasoning, ethical judgement, civic responsibility, deliberative capacity, and the ability to interpret social realities. Their assessment should therefore be capable of measuring more than memorised definitions, dates, or ready-made formulations.
The present study addresses the following problem: to what extent do Algerian middle school exit examination questions reflect the higher-order cognitive skills promoted by the official curriculum? More precisely, the study seeks to identify the distribution of questions across the levels of Bloom’s Revised Taxonomy and to determine whether the examinations sufficiently assess analysing, evaluating, and creating. The study therefore examines both the quantitative distribution of question types and the pedagogical meaning of this distribution within the broader framework of competency-based education.
Two research questions structure the inquiry. First, how are examination questions in History, Geography, Islamic Education, and Civic Education distributed across the six levels of Bloom’s Revised Taxonomy? Second, what does this distribution reveal about the relationship between national assessment practices and the stated objective of developing higher-order thinking skills among middle school learners? By answering these questions, the study contributes to the analysis of assessment design in Algeria and provides practical orientations for improving the cognitive quality of examination items.
1. Theoretical framework and literature review
1.1. Educational assessment and cognitive demand
Educational assessment may be defined as a systematic process of collecting, interpreting, and using information about learners’ achievements to support learning, certify attainment, and improve educational decision-making. From a contemporary perspective, assessment is not limited to the production of marks. It includes the design of tasks, the formulation of criteria, the interpretation of learner responses, and the use of results for instructional regulation (Nitko & Brookhart, 2014; Pellegrino et al., 2001). In this sense, the cognitive level of an examination question is not a purely technical matter; it embodies a pedagogical conception of learning.
A distinction must therefore be made between questions that verify the availability of information and questions that require the learner to transform, mobilise, or evaluate that information. The first category generally involves recognition, recall, or simple explanation. The second category demands transfer, comparison, causal reasoning, argument evaluation, or the construction of a reasoned response. In national examinations, this distinction becomes crucial because repeated exposure to low-cognitive-demand tasks encourages surface learning, whereas carefully designed high-cognitive-demand tasks foster deeper comprehension and intellectual autonomy.
The notion of constructive alignment is useful here. A curriculum that claims to promote problem-solving, analysis, and critical thinking must be aligned with teaching methods and assessment instruments that actually require these operations (Biggs & Tang, 2011). Otherwise, the system produces an inconsistency between its declared aims and its evaluative reality. In such a context, learners may be officially expected to develop competencies while being practically rewarded for memorising pre-established answers.
1.2. Bloom’s Revised Taxonomy as an analytical framework
Bloom’s taxonomy, originally formulated for the cognitive domain and later revised by Anderson and Krathwohl, remains one of the most widely used frameworks for classifying educational objectives and assessment tasks (Anderson & Krathwohl, 2001; Bloom et al., 1956). The revised version distinguishes six cognitive processes: remembering, understanding, applying, analysing, evaluating, and creating. Although the taxonomy should not be treated as a mechanical hierarchy to be applied without contextual interpretation, it offers a useful language for examining the cognitive expectations embedded in examination questions.
In the present study, remembering and understanding are treated as lower-order cognitive processes. Applying occupies an intermediate position because it may range from routine procedural use to more complex transfer. Analysing, evaluating, and creating are considered higher-order processes insofar as they require the learner to reorganise information, examine relationships, formulate judgments, or generate coherent responses. The absence or scarcity of these latter levels in examination papers is therefore pedagogically significant: it indicates that learners are rarely required to display critical or creative performance under formal assessment conditions.
Bloom’s Revised Taxonomy is particularly appropriate for analysing written examination questions because it links question wording, expected learner operations, and educational objectives. Nevertheless, the coding of questions cannot rely solely on surface verbs. A verb such as explain may signal understanding in one context and analysis in another, depending on the complexity of the task and the type of evidence required. This study therefore treats verbs as indicators but interprets them in relation to the full question and the expected response.
Table 1. Bloom’s Revised Taxonomy: operational levels and indicative verbs
| Level | Operational description | Typical verbs and task indicators |
| --- | --- | --- |
| Remembering | Retrieval of previously learned facts, names, dates, definitions, or rules. | define, identify, list, name, recall, enumerate, complete, match |
| Understanding | Construction of meaning through explanation, paraphrase, comparison, or interpretation. | explain, summarise, compare, classify, interpret, justify, infer |
| Applying | Use of learned knowledge or procedures in familiar or moderately transformed situations. | apply, use, calculate, solve, employ, extract, connect, draw |
| Analysing | Decomposition of information into parts and identification of relations, causes, structures, or assumptions. | analyse, distinguish, examine, infer, prove, test, scrutinise |
| Evaluating | Judgment based on explicit or implicit criteria, argumentation, and critical appraisal. | evaluate, argue, judge, defend, critique, justify a decision |
| Creating | Production of a new, coherent, or original response by reorganising elements into a novel pattern. | design, propose, compose, generate, plan, formulate, create |
Source: Adapted from Bloom et al. (1956), Anderson and Krathwohl (2001), and the author’s analytical grid.
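The point that verbs are indicators rather than classifiers can be illustrated with a small lookup built from the indicative verbs of Table 1. The following Python sketch is only an illustration: the names `INDICATIVE_VERBS`, `candidate_levels`, and `ambiguous_verbs` are hypothetical, and the phrase "justify a decision" is shortened to "justify" so that it can be matched as a single verb. It shows that some verbs belong to more than one level and therefore cannot be coded without reading the whole question.

```python
# Indicative verbs transcribed from Table 1, ordered from the lowest to the
# highest cognitive level. "justify a decision" is shortened to "justify"
# (an assumption made for this sketch only).
INDICATIVE_VERBS = {
    "remembering": ["define", "identify", "list", "name", "recall",
                    "enumerate", "complete", "match"],
    "understanding": ["explain", "summarise", "compare", "classify",
                      "interpret", "justify", "infer"],
    "applying": ["apply", "use", "calculate", "solve", "employ",
                 "extract", "connect", "draw"],
    "analysing": ["analyse", "distinguish", "examine", "infer",
                  "prove", "test", "scrutinise"],
    "evaluating": ["evaluate", "argue", "judge", "defend", "critique",
                   "justify"],
    "creating": ["design", "propose", "compose", "generate", "plan",
                 "formulate", "create"],
}


def candidate_levels(verb):
    """Return every level whose indicative list contains the verb."""
    verb = verb.lower()
    return [level for level, verbs in INDICATIVE_VERBS.items() if verb in verbs]


def ambiguous_verbs():
    """Return the verbs listed under more than one level, in alphabetical order."""
    all_verbs = {verb for verbs in INDICATIVE_VERBS.values() for verb in verbs}
    return sorted(verb for verb in all_verbs if len(candidate_levels(verb)) > 1)


print(candidate_levels("infer"))  # two candidate levels, so context must decide
print(ambiguous_verbs())
```

A lookup of this kind can only propose candidate levels; the final assignment still depends on the task context and the expected response.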
1.3. Competency-based curricula and assessment in Algeria
Official Algerian educational documents assign assessment a central role in curriculum implementation. The competency-based approach requires learners not only to acquire knowledge but also to apply it in situations that involve interpretation, problem-solving, communication, and decision-making. The National Education Orientation Law and the General Reference for Curricula emphasise learners’ ability to understand their environment, adapt to change, participate in civic life, and develop personal and professional projects (Ministry of National Education, 2008, 2009). These orientations imply a form of assessment capable of capturing complex cognitive and social performances.
The General Reference for Curricula explicitly calls for a shift away from encyclopaedic assessment centred on memorisation and retrieval. It foregrounds cognitive processes such as synthesis, induction, criticism, and regulation of learning (Ministry of National Education, 2009). In principle, this orientation is compatible with a taxonomy-based analysis because it requires assessment instruments to include tasks that go beyond knowledge reproduction. The issue, however, is whether examination papers actually operationalise this curricular ambition.
In disciplines such as History and Geography, higher-order thinking may involve comparing sources, interpreting maps, explaining causal relations, or evaluating the consequences of historical and spatial processes. In Civic Education, it may involve analysing social situations, justifying civic choices, or debating rights and responsibilities. In Islamic Education, it may involve ethical reasoning, the application of principles to real-life situations, and the evaluation of conduct in light of values. A balanced examination should therefore include a range of cognitive operations adapted to the epistemological nature of each discipline.
1.4. Previous studies
Previous research on examination questions in Arab and regional educational contexts generally points to a recurrent dominance of lower cognitive levels. Al-Sweidan’s analysis of geography textbook questions for the first year of secondary education in Syria showed that questions tend to concentrate on the lower levels of Bloom’s taxonomy. Jasim’s evaluation of sixth-grade primary examination questions similarly found a predominance of remembering, understanding, and application. Giousi’s study of final examination questions at Palestine Technical University also revealed that lower levels occupy a substantial proportion of assessment items.
In Algeria, Luqman’s content analysis of university achievement tests at Oum El Bouaghi University reached comparable conclusions: assessment questions focused mainly on lower cognitive levels and paid limited attention to higher-order skills. Boualgamah and Haywani’s analysis of primary school certificate examination questions in Arabic, French, and Mathematics for the 2005–2016 period showed that the first three levels of Bloom’s taxonomy were dominant, while higher levels were absent or marginal. These findings suggest that the problem is not limited to a single educational level or subject. It reflects a broader assessment culture in which knowledge reproduction often remains more visible than analysis, judgment, or creation.
The present study extends this line of inquiry to middle school exit examinations in four key subjects over an eleven-year period. Its contribution lies in combining a subject-specific analysis with a cross-disciplinary reading of cognitive demand, thereby allowing a more precise evaluation of the relationship between national examination design and the stated objectives of the competency-based curriculum.
2. Methodology
This section specifies the methodological architecture of the study. In order to avoid fragmentation, the research design, corpus, unit of analysis, coding procedure, reliability check, and limits of the study are presented as integrated components of a single content-analysis protocol. The purpose is to make the classification procedure transparent and reproducible.
2.1. Research design, corpus, and unit of analysis
The study adopts a descriptive-analytical design based on content analysis. This approach is appropriate because the object of inquiry is not learner performance but the cognitive structure of examination questions. Content analysis makes it possible to classify items according to explicit criteria, quantify their distribution, and interpret their pedagogical implications. The unit of analysis is the individual question item, understood as the smallest assessable task requiring a distinct learner response.
The corpus covers middle school exit examination questions in History, Geography, Islamic Education, and Civic Education for the period 2015–2025. After editorial consolidation of the subject-specific coding tables, the corpus comprises 279 question items. This total is retained because it equals the sum of the subject totals in the detailed coding tables (60 + 43 + 122 + 54 = 279). Table 2 summarises the number of coded items by year and subject.
Table 2. Number of examination question items by year and subject
| Year | History | Geography | Islamic Education | Civic Education | Total |
| --- | --- | --- | --- | --- | --- |
| 2015 | 7 | 4 | 12 | 4 | 27 |
| 2016 | 7 | 4 | 11 | 7 | 29 |
| 2017 | 10 | 5 | 11 | 6 | 32 |
| 2018 | 6 | 3 | 12 | 5 | 26 |
| 2019 | 5 | 5 | 10 | 4 | 24 |
| 2020 | 4 | 4 | 12 | 4 | 24 |
| 2021 | 4 | 4 | 11 | 5 | 24 |
| 2022 | 5 | 4 | 10 | 5 | 24 |
| 2023 | 4 | 3 | 11 | 4 | 22 |
| 2024 | 4 | 3 | 11 | 5 | 23 |
| 2025 | 4 | 4 | 11 | 5 | 24 |
| Total | 60 | 43 | 122 | 54 | 279 |
Source: Author’s corpus, editorially consolidated from the detailed coding tables.
2.2. Coding grid, reliability, and statistical treatment
The analysis is guided by Bloom’s Revised Taxonomy. Each item was assigned to one dominant cognitive level based on the expected mental operation required to answer it. When a question included several sub-operations, the dominant level was identified according to the highest operation explicitly required for a correct response. This methodological decision is important because an item may include a recall component while ultimately requiring interpretation, transfer, or judgment.
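The dominant-level rule described above can be sketched in a few lines of Python. The function name `dominant_level` and the example item are hypothetical; the sketch assumes that the coder has already listed the operations a question explicitly requires, which remains a matter of judgement rather than computation.

```python
# Levels of the revised taxonomy, ordered from lowest to highest.
LEVELS = ["remembering", "understanding", "applying",
          "analysing", "evaluating", "creating"]


def dominant_level(required_operations):
    """Return the highest level among the operations a question requires.

    Raises ValueError if the list is empty or contains an unknown level.
    """
    if not required_operations:
        raise ValueError("at least one required operation is needed")
    # LEVELS.index gives the rank of each operation; max keeps the highest.
    return max(required_operations, key=LEVELS.index)


# Hypothetical item: recall a fact, explain it, then judge a decision.
print(dominant_level(["remembering", "understanding", "evaluating"]))
```

Under this rule an item with a recall component is still coded at the higher level whenever a correct answer cannot be produced without the higher operation.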
The coding grid consisted of six categories corresponding to the revised taxonomy: remembering, understanding, applying, analysing, evaluating, and creating. The operational definitions were established before coding and were linked to indicative verbs and expected response types. However, verbs were not used mechanically: each item was examined in relation to the task context and the cognitive operation necessary to produce the expected answer.
To ensure reliability, the examination questions were coded twice, with a three-week interval between the first and second coding. Agreement was calculated using Holsti’s formula, and the reliability coefficient reached 0.89. This coefficient indicates a satisfactory level of stability in the classification procedure. Frequencies and percentages were then calculated for each subject and for the corpus as a whole.
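Holsti's coefficient is conventionally written CR = 2M / (N1 + N2), where M is the number of coding decisions on which the two codings agree and N1 and N2 are the numbers of decisions made in each coding. A minimal Python sketch of this calculation follows. The function name and the ten-item example are hypothetical and do not reproduce the study's coding data.

```python
def holsti_coefficient(first_coding, second_coding):
    """Compute Holsti's coefficient CR = 2M / (N1 + N2) for two codings.

    Both arguments are lists of level labels for the same items in the
    same order, so N1 and N2 are equal here.
    """
    if len(first_coding) != len(second_coding):
        raise ValueError("both codings must cover the same items")
    if not first_coding:
        raise ValueError("at least one coded item is needed")
    agreements = sum(a == b for a, b in zip(first_coding, second_coding))
    return 2 * agreements / (len(first_coding) + len(second_coding))


# Hypothetical example: ten items coded twice, with one disagreement.
first = ["remembering"] * 5 + ["understanding"] * 3 + ["applying"] * 2
second = ["remembering"] * 5 + ["understanding"] * 2 + ["applying"] * 3
print(holsti_coefficient(first, second))  # 9 agreements out of 10 items
```

Because the coefficient is a raw agreement ratio, it does not correct for agreement that could occur by chance.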
2.3. Scope and methodological limitations
The study focuses on the cognitive level of examination questions and does not analyse learners’ actual responses, scoring rubrics, or classroom preparation practices. It therefore evaluates the cognitive demand of the assessment instrument, not the full assessment process. Moreover, Bloom’s taxonomy provides a useful analytical framework but does not, by itself, capture all qualitative dimensions of a good question, such as authenticity, disciplinary accuracy, fairness, linguistic clarity, or cultural relevance.
These limits do not weaken the value of the study; rather, they define its scope. By identifying the cognitive architecture of national examination questions, the analysis provides a necessary basis for future work on item quality, rubric design, teacher training, and alignment between curriculum objectives and assessment practices.
3. Results
This section presents the empirical results of the content analysis. The findings are first reported by subject in order to preserve disciplinary specificity, then consolidated across the whole corpus. Each table is introduced before presentation so that the reader can identify the purpose of the data display before examining the numerical distribution. Percentages are rounded to one decimal place in the synthesis table and to whole numbers in subject tables when this does not affect interpretation.
3.1. Islamic Education
Table 3 reports the distribution of Islamic Education items across Bloom’s cognitive levels. This subject contains the largest number of coded items in the corpus and therefore has substantial influence on the general profile of the examinations.
Table 3. Distribution of Islamic Education questions according to cognitive levels (2015–2025)
| Year | Remembering | Understanding | Applying | Analysing | Evaluating | Creating | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2015 | 5 | 2 | 3 | 0 | 2 | 0 | 12 |
| 2016 | 5 | 2 | 3 | 0 | 1 | 0 | 11 |
| 2017 | 4 | 3 | 2 | 0 | 2 | 0 | 11 |
| 2018 | 4 | 3 | 5 | 0 | 0 | 0 | 12 |
| 2019 | 4 | 3 | 2 | 0 | 1 | 0 | 10 |
| 2020 | 6 | 2 | 1 | 0 | 3 | 0 | 12 |
| 2021 | 3 | 3 | 4 | 0 | 1 | 0 | 11 |
| 2022 | 6 | 1 | 2 | 0 | 1 | 0 | 10 |
| 2023 | 4 | 5 | 1 | 0 | 1 | 0 | 11 |
| 2024 | 5 | 4 | 1 | 0 | 1 | 0 | 11 |
| 2025 | 4 | 4 | 1 | 0 | 2 | 0 | 11 |
| Total | 50 | 32 | 25 | 0 | 15 | 0 | 122 |
| % | 41 | 26 | 21 | 0 | 12 | 0 | 100 |
Source: Author’s coding.
Islamic Education contains 122 coded question items. Remembering is the dominant level, representing 50 items, or 41% of the total. Understanding comprises 32 items (26%), while applying comprises 25 items (21%). Evaluating accounts for 15 items (12%). No item was classified at the analysing or creating levels. This distribution indicates that Islamic Education examinations remain strongly oriented toward recall and controlled comprehension, even though the subject could legitimately include ethical reasoning, situational interpretation, and argumentative evaluation.
3.2. History
Table 4 presents the cognitive distribution of History questions. It is important to examine this subject separately because historical understanding is expected to involve chronology, causality, comparison, and source-based reasoning.
Table 4. Distribution of History questions according to cognitive levels (2015–2025)
| Year | Remembering | Understanding | Applying | Analysing | Evaluating | Creating | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2015 | 2 | 0 | 4 | 0 | 1 | 0 | 7 |
| 2016 | 3 | 1 | 2 | 0 | 1 | 0 | 7 |
| 2017 | 4 | 3 | 1 | 0 | 2 | 0 | 10 |
| 2018 | 2 | 2 | 1 | 0 | 1 | 0 | 6 |
| 2019 | 1 | 0 | 1 | 0 | 3 | 0 | 5 |
| 2020 | 2 | 1 | 1 | 0 | 0 | 0 | 4 |
| 2021 | 1 | 2 | 1 | 0 | 0 | 0 | 4 |
| 2022 | 0 | 2 | 2 | 0 | 1 | 0 | 5 |
| 2023 | 1 | 2 | 0 | 1 | 0 | 0 | 4 |
| 2024 | 1 | 1 | 2 | 0 | 0 | 0 | 4 |
| 2025 | 1 | 2 | 1 | 0 | 0 | 0 | 4 |
| Total | 18 | 16 | 16 | 1 | 9 | 0 | 60 |
| % | 30 | 27 | 27 | 1 | 15 | 0 | 100 |
Source: Author’s coding.
History presents a more diversified but still insufficiently demanding cognitive profile. The 60 coded items fall mainly under remembering (18 items, 30%), understanding (16 items, 27%), applying (16 items, 27%), and evaluating (9 items, 15%). Only one item was classified as analysing, and no item reached the creating level. The presence of evaluative questions is noteworthy, yet the near-absence of analysis is problematic for a discipline whose epistemological core involves causal explanation, source interpretation, chronology, comparison, and historical argumentation.
3.3. Geography
Table 5 displays the distribution of Geography questions. In this subject, understanding and application are particularly relevant because learners are often asked to interpret maps, spatial data, and socio-environmental phenomena.
Table 5. Distribution of Geography questions according to cognitive levels (2015–2025)
| Year | Remembering | Understanding | Applying | Analysing | Evaluating | Creating | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2015 | 1 | 1 | 2 | 0 | 0 | 0 | 4 |
| 2016 | 2 | 1 | 1 | 0 | 0 | 0 | 4 |
| 2017 | 0 | 2 | 2 | 0 | 1 | 0 | 5 |
| 2018 | 1 | 2 | 0 | 0 | 0 | 0 | 3 |
| 2019 | 0 | 2 | 3 | 0 | 0 | 0 | 5 |
| 2020 | 1 | 2 | 1 | 0 | 0 | 0 | 4 |
| 2021 | 0 | 3 | 1 | 0 | 0 | 0 | 4 |
| 2022 | 0 | 1 | 1 | 0 | 2 | 0 | 4 |
| 2023 | 1 | 1 | 1 | 0 | 0 | 0 | 3 |
| 2024 | 0 | 2 | 1 | 0 | 0 | 0 | 3 |
| 2025 | 1 | 2 | 1 | 0 | 0 | 0 | 4 |
| Total | 7 | 19 | 14 | 0 | 3 | 0 | 43 |
| % | 16 | 44 | 33 | 0 | 7 | 0 | 100 |
Source: Author’s coding.
Geography shows a comparatively stronger orientation toward understanding and application. Out of 43 items, understanding accounts for 19 items (44%), application for 14 items (33%), remembering for 7 items (16%), and evaluating for 3 items (7%). Analysing and creating are absent. The predominance of understanding and application is consistent with tasks involving maps, data extraction, or interpretation of spatial phenomena. Nevertheless, the absence of analysis-level items limits learners’ opportunities to explain spatial relations, compare territorial configurations, or reason about socio-environmental processes.
3.4. Civic Education
Table 6 summarises the distribution of Civic Education items. The profile of this subject is particularly significant because civic learning requires not only knowledge of institutions but also deliberation, justification, and situated judgment.
Table 6. Distribution of Civic Education questions according to cognitive levels (2015–2025)
| Year | Remembering | Understanding | Applying | Analysing | Evaluating | Creating | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2015 | 1 | 1 | 1 | 0 | 1 | 0 | 4 |
| 2016 | 1 | 3 | 2 | 0 | 1 | 0 | 7 |
| 2017 | 3 | 1 | 1 | 0 | 1 | 0 | 6 |
| 2018 | 2 | 1 | 0 | 0 | 2 | 0 | 5 |
| 2019 | 1 | 1 | 2 | 0 | 0 | 0 | 4 |
| 2020 | 0 | 2 | 2 | 0 | 0 | 0 | 4 |
| 2021 | 1 | 2 | 2 | 0 | 0 | 0 | 5 |
| 2022 | 1 | 1 | 2 | 0 | 1 | 0 | 5 |
| 2023 | 0 | 2 | 1 | 0 | 1 | 0 | 4 |
| 2024 | 1 | 2 | 1 | 0 | 1 | 0 | 5 |
| 2025 | 1 | 3 | 1 | 0 | 0 | 0 | 5 |
| Total | 12 | 19 | 15 | 0 | 8 | 0 | 54 |
| % | 22 | 35 | 28 | 0 | 15 | 0 | 100 |
Source: Author’s coding.
Civic Education includes 54 coded items. Understanding represents the largest share, with 19 items (35%), followed by applying with 15 items (28%), remembering with 12 items (22%), and evaluating with 8 items (15%). Analysing and creating are entirely absent. This profile is pedagogically ambivalent. On the one hand, the reduced weight of remembering is positive because the subject aims to develop civic awareness and practical understanding. On the other hand, the absence of analysis and creation prevents the examination from fully measuring deliberation, civic problem-solving, proposal formulation, or reasoned judgment in social situations.
3.5. Cross-subject synthesis
Table 7 consolidates the distributions of all four subjects. This synthesis makes it possible to move beyond disciplinary variation and identify the overall cognitive architecture of middle school exit examinations.
Table 7. Overall distribution of coded question items according to Bloom’s Revised Taxonomy
| Cognitive level | Frequency | Weighted percentage |
| --- | --- | --- |
| Remembering | 87 | 31.2% |
| Understanding | 86 | 30.8% |
| Applying | 70 | 25.1% |
| Analysing | 1 | 0.4% |
| Evaluating | 35 | 12.5% |
| Creating | 0 | 0.0% |
| Total | 279 | 100% |
Source: Author’s coding; percentages calculated on the consolidated total of 279 items.
Across the corpus as a whole, remembering and understanding account for 173 items, or 62.0% of the total. Applying represents 70 items, or 25.1%. Together, these three levels represent 87.1% of the corpus. Higher-order thinking skills remain marginal: analysing appears only once (0.4%), evaluating accounts for 35 items (12.5%), and creating is absent. The findings therefore reveal a structural imbalance in the cognitive architecture of middle school exit examinations. The examinations do not simply underrepresent creative tasks; they almost entirely exclude the forms of analysis through which learners could demonstrate deeper conceptual understanding and critical reasoning.
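The consolidated figures can be recomputed from the subject totals reported in Tables 3 to 6. The Python sketch below is offered as a transparency check rather than as the procedure actually used in the study; the names `SUBJECT_TOTALS`, `consolidate`, and `percentages` are hypothetical.

```python
LEVELS = ["Remembering", "Understanding", "Applying",
          "Analysing", "Evaluating", "Creating"]

# Column totals of Tables 3 to 6, in the order of LEVELS.
SUBJECT_TOTALS = {
    "Islamic Education": [50, 32, 25, 0, 15, 0],
    "History": [18, 16, 16, 1, 9, 0],
    "Geography": [7, 19, 14, 0, 3, 0],
    "Civic Education": [12, 19, 15, 0, 8, 0],
}


def consolidate(subject_totals):
    """Sum the per-subject frequencies level by level."""
    return [sum(column) for column in zip(*subject_totals.values())]


def percentages(frequencies):
    """Express each frequency as a percentage of the total, to one decimal."""
    total = sum(frequencies)
    return [round(100 * frequency / total, 1) for frequency in frequencies]


frequencies = consolidate(SUBJECT_TOTALS)
shares = percentages(frequencies)
for level, frequency, share in zip(LEVELS, frequencies, shares):
    print(f"{level:<14}{frequency:>4}{share:>7.1f}%")

# Share of the two lower-order levels (remembering and understanding).
lower_order = round(100 * (frequencies[0] + frequencies[1]) / sum(frequencies), 1)
print("Lower-order share:", lower_order)
```

Because the percentages are computed on the pooled 279 items, subjects with more items, Islamic Education in particular, weigh more heavily in the overall profile.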
4. Discussion
The results confirm a persistent misalignment between the declared aims of competency-based education and the cognitive demands embedded in national examinations. If the curriculum seeks to develop autonomy, problem-solving, critical thinking, and active citizenship, examination tasks must provide learners with opportunities to demonstrate these capacities. The present data show that such opportunities remain limited. Most items require retrieval, explanation, or routine application, while tasks demanding analysis, argumentation, judgment, or production are rare or absent.
This imbalance has pedagogical consequences. In examination-driven contexts, teachers and learners tend to adapt their practices to what is rewarded by formal assessment. If national examinations mainly assess recall and simple understanding, classroom preparation is likely to reproduce the same emphasis, even when curriculum documents call for competencies. Assessment therefore becomes a powerful regulator of pedagogy. The scarcity of higher-order items may indirectly reinforce memorisation strategies and reduce incentives for discussion, problem-solving, inquiry, source analysis, and structured argumentation.
The results also show that the four subjects do not have identical profiles. Islamic Education is more strongly associated with remembering, whereas Geography and Civic Education show a relatively greater emphasis on understanding and application. This variation suggests that improvement is possible. The examination system already contains some movement toward tasks requiring interpretation and use of knowledge. However, this movement remains incomplete because it rarely reaches analysis and never reaches creation. In other words, the issue is not the absence of all reform effects; it is the insufficiency of their translation into high-level assessment design.
History deserves particular attention. A history examination that does not substantially assess analysis risks reducing the discipline to the reproduction of events, dates, and ready-made explanations. Yet historical thinking depends on the ability to establish relations between causes and consequences, compare periods, interpret sources, identify continuity and rupture, and construct reasoned explanations. Similarly, Civic Education cannot be limited to knowledge of rights and duties; it should also assess learners’ ability to deliberate, justify choices, and propose solutions to civic problems.
The absence of creating-level tasks should not be interpreted as a call to impose artificial creativity on every examination. Rather, it points to the need for tasks that invite learners to produce organised responses: a justified proposal, a civic action plan, a short argumentative text, a map-based explanation, or a historically grounded interpretation. Creation, in the revised taxonomy, does not necessarily mean artistic originality. It refers to the ability to reorganise elements into a coherent whole. Such tasks are fully compatible with middle school assessment when they are carefully scaffolded and criteria-based.
The findings are consistent with previous studies in the region, which have documented the predominance of lower cognitive levels in textbook and examination questions. They also support the argument that assessment reform cannot be reduced to general policy statements. It requires operational tools: validated item banks, training in cognitive-level specification, moderation procedures, model rubrics, and post-examination analyses. Without such mechanisms, competency-based discourse may remain disconnected from actual examination practices.
A balanced examination does not mean that all cognitive levels should receive equal percentages in all subjects. Remembering and understanding remain necessary, especially at the end of compulsory education. The central issue is proportionality and alignment. Lower-order questions should provide a foundation, but they should not dominate to the extent that higher-order thinking becomes exceptional. A more balanced design would ensure that every subject includes a meaningful proportion of analysis and evaluation items, and that selected tasks invite learners to construct responses rather than merely identify or reproduce information.
5. Recommendations
The following recommendations translate the results into operational measures for assessment reform. They are not intended as general statements of principle; each recommendation targets a specific stage of examination design, from item writing to post-examination evaluation. Their shared objective is to make higher-order thinking explicit, teachable, and assessable within national examination practice.
- Establish an official cognitive-specification table for each subject, indicating the expected distribution of questions across Bloom’s levels before examination papers are validated.
- Train examination writers to distinguish between surface verbs and genuine cognitive demand, with examples of well-designed items at each level.
- Develop subject-specific question banks that include validated items targeting analysing, evaluating, and creating, adapted to the age and curriculum level of middle school learners.
- Introduce moderation procedures in which examination drafts are reviewed for cognitive balance, linguistic clarity, disciplinary validity, and fairness.
- Design rubrics for open-ended tasks so that higher-order questions can be assessed reliably and transparently.
- Conduct annual post-examination cognitive analyses and publish summary reports to support evidence-based improvement of national assessment practices.
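The first and fourth recommendations could be supported by a simple check of a draft paper against a cognitive-specification table. The Python sketch below is purely illustrative: the minimum shares in `HYPOTHETICAL_MIN_SHARE` are invented for the example and are not values proposed by this study, and the function name `unmet_requirements` is likewise hypothetical. The example paper reuses the 2023 History row of Table 4.

```python
# Hypothetical specification: minimum share of items per higher-order level.
# These thresholds are illustrative assumptions, not recommended values.
HYPOTHETICAL_MIN_SHARE = {"analysing": 0.15, "evaluating": 0.15, "creating": 0.05}


def unmet_requirements(item_levels, min_share):
    """Return the levels whose share in the paper falls below the minimum.

    item_levels is the list of coded levels, one entry per question item.
    The result maps each deficient level to its observed share.
    """
    total = len(item_levels)
    if total == 0:
        raise ValueError("the paper must contain at least one item")
    gaps = {}
    for level, minimum in min_share.items():
        share = item_levels.count(level) / total
        if share < minimum:
            gaps[level] = round(share, 3)
    return gaps


# 2023 History paper as coded in Table 4: one remembering item,
# two understanding items, and one analysing item.
history_2023 = ["remembering", "understanding", "understanding", "analysing"]
print(unmet_requirements(history_2023, HYPOTHETICAL_MIN_SHARE))
```

A check of this kind only flags imbalance in the coded distribution; judging whether each item genuinely demands the level assigned to it remains the task of the moderation panel.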
Conclusion
This study analysed Algerian middle school exit examination questions in History, Geography, Islamic Education, and Civic Education for the period 2015–2025 through the lens of Bloom’s Revised Taxonomy. The results show that the cognitive structure of these examinations remains dominated by remembering, understanding, and routine application. Higher-order thinking skills, particularly analysing and creating, are either marginal or absent. The overall pattern reveals a significant gap between the objectives of competency-based curriculum reform and the assessment instruments used to certify learning at the end of middle school.
The study does not argue for the elimination of lower cognitive levels. Remembering and understanding are indispensable components of learning, and application remains essential for the mobilisation of knowledge. The issue is that these levels should not monopolise the assessment space. If examinations are to support the development of critical, analytical, and creative learners, they must include tasks that require interpretation, justification, evaluation, problem-solving, and coherent production.
Improving the cognitive quality of examinations is therefore not a secondary technical matter. It is a condition for making curriculum reform effective. A national assessment system aligned with competency-based education must make higher-order thinking visible, teachable, and assessable. Such alignment would help transform examinations from instruments of knowledge reproduction into levers for deeper learning, intellectual autonomy, and responsible citizenship.
