IB DP · Thinka-original Practice Paper

2025 IB DP Computer Science Practice Paper with Answers

Thinka Nov 2025 HL IB Diploma Programme-Style Mock — Computer Science

195 marks · 270 mins · 2025
An original Thinka practice paper modelled on the structure and difficulty of the Nov 2025 HL IB Diploma Programme Computer Science paper. Not affiliated with or reproduced from IB.

Paper 1 Section A

Answer all questions. Section A comprises short-answer and calculations.
10 Questions · 25 marks
Question 1 · Short Answer
2.5 marks
Explain one disadvantage of a "direct changeover" (immediate integration) compared to a "phased changeover" when deploying a new database system in a hospital.

Worked solution

In a direct changeover, the old system is stopped immediately and the new system is started. If a critical failure occurs in the new database, there is no operational backup to revert to, risking complete system downtime. In a high-stakes environment like a hospital, this downtime can delay treatments or lose vital patient records. In contrast, a phased changeover introduces modules step-by-step, minimizing total system vulnerability.

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for identifying the lack of fallback/backup system if failure occurs.
- [1 mark] for explaining the consequence (system downtime, data loss, or high risk).
- [0.5 mark] for linking the risk directly to the hospital context (e.g., impact on patient safety or staff adaptation stress).
Question 2 · Short Answer
2.5 marks
Outline the role of the Memory Address Register (MAR) and the Memory Data Register (MDR) during a write operation to RAM.

Worked solution

During a write operation, the CPU must specify both where the data is to be written and what is to be written. The destination address is placed in the MAR, which presents it to RAM over the address bus. The data itself is placed in the MDR and travels to RAM over the data bus. The control unit then issues a write signal, causing the contents of the MDR to be copied to the RAM location addressed by the MAR.
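The write sequence can be pictured with a minimal Java sketch. The class name `MemoryWriteDemo`, the 16-cell `ram` array and the `write` method are illustrative assumptions, not part of any real CPU model; the sketch only mirrors the order of steps described above.

```java
// Hypothetical toy model of a memory write; names and sizes are assumptions.
class MemoryWriteDemo {
    int[] ram = new int[16]; // primary memory with 16 addressable cells
    int mar;                 // Memory Address Register
    int mdr;                 // Memory Data Register

    void write(int address, int value) {
        mar = address;   // the target address is placed in the MAR
        mdr = value;     // the data to be stored is placed in the MDR
        ram[mar] = mdr;  // write signal: MDR contents copied to the cell addressed by MAR
    }

    public static void main(String[] args) {
        MemoryWriteDemo cpu = new MemoryWriteDemo();
        cpu.write(5, 42);
        System.out.println(cpu.ram[5]); // prints 42
    }
}
```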

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for correctly describing the MAR's role (holding/pointing to the target RAM address).
- [1 mark] for correctly describing the MDR's role (holding the actual data payload to be written).
- [0.5 mark] for explaining the coordinated write signal action linking the two registers to RAM.
Question 3 · Short Answer
2.5 marks
Describe how packets are reassembled at the destination in a packet-switching network, and explain why they may arrive out of order.

Worked solution

In packet switching, the original message is divided into smaller packets. Each packet header contains metadata, including sequence numbers. Upon arrival, the destination host reads these sequence numbers to reconstruct the packets in their original order. Packets often travel via different routes through various routers based on real-time network conditions (such as traffic load or links going down), resulting in different travel times and out-of-order arrivals.
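As an illustration of reassembly by sequence number, the following Java sketch sorts packets that arrived out of order and joins their payloads. The `Packet` class and its two fields are hypothetical simplifications; real headers carry far more information.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical packet: only a sequence number and a fragment of the message.
class Packet {
    final int sequenceNumber;
    final String payload;

    Packet(int sequenceNumber, String payload) {
        this.sequenceNumber = sequenceNumber;
        this.payload = payload;
    }
}

class PacketReassembly {
    // Order the received packets by sequence number, then join the payloads.
    static String reassemble(List<Packet> received) {
        List<Packet> ordered = new ArrayList<>(received);
        ordered.sort(Comparator.comparingInt((Packet p) -> p.sequenceNumber));
        StringBuilder message = new StringBuilder();
        for (Packet p : ordered) {
            message.append(p.payload);
        }
        return message.toString();
    }

    public static void main(String[] args) {
        // Arrival order differs from sending order because routes differed.
        List<Packet> received = new ArrayList<>();
        received.add(new Packet(2, "LO "));
        received.add(new Packet(1, "HEL"));
        received.add(new Packet(3, "IB"));
        System.out.println(reassemble(received)); // prints "HELLO IB"
    }
}
```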

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for stating that sequence numbers inside packet headers are used for reassembly.
- [0.5 mark] for explaining the destination's process of ordering packets using these sequence numbers.
- [1 mark] for explaining that dynamic, independent routing across varying paths with different congestion levels causes out-of-order arrival.
Question 4 · Short Answer
2.5 marks
State the primary precondition required for a binary search algorithm to operate correctly on an array, and explain why a linear search does not require this precondition.

Worked solution

For binary search, the array must be sorted (in either ascending or descending order) because the algorithm compares the target with the midpoint element and discards half of the search space at each step. If the array is unsorted, nothing can be inferred about the elements on either side of the midpoint, so the search may discard the half that actually contains the target. Conversely, a linear search checks elements one by one from index 0 to the end, so it is guaranteed to find the element (if it exists) regardless of the order of the elements.
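The difference can be seen in a short Java sketch. The class and method names are illustrative assumptions; the binary search shown assumes ascending order. Run on an unsorted array, it misses a value that the linear search finds.

```java
class SearchDemo {
    // Linear search: examines every element in turn, so order is irrelevant.
    static int linearSearch(int[] data, int target) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == target) {
                return i;
            }
        }
        return -1; // not found
    }

    // Binary search: only correct if the array is sorted in ascending order.
    static int binarySearch(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (sorted[mid] == target) {
                return mid;
            } else if (sorted[mid] < target) {
                low = mid + 1;   // discard the lower half
            } else {
                high = mid - 1;  // discard the upper half
            }
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] unsorted = {12, 15, 14, 10, 18, 11};
        int[] sorted = {10, 11, 12, 14, 15, 18};
        System.out.println(linearSearch(unsorted, 10)); // prints 3
        System.out.println(binarySearch(sorted, 10));   // prints 0
        System.out.println(binarySearch(unsorted, 10)); // prints -1 (wrongly discards the half holding 10)
    }
}
```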

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for identifying that the array must be sorted for binary search.
- [1 mark] for explaining that linear search examines elements sequentially (one-by-one).
- [0.5 mark] for concluding that spatial arrangement or order has no impact on sequential verification.
Question 5 · Short Answer
2.5 marks
Contrast the conceptual processing order of elements in a Queue versus a Stack, referencing their operational acronyms.

Worked solution

A queue simulates a real-world line where elements are added to the back (enqueue) and removed from the front (dequeue), enforcing a First-In-First-Out (FIFO) order. A stack is like a stack of plates where elements are both added (push) and removed (pop) from the top, enforcing a Last-In-First-Out (LIFO) order. This contrasts chronological arrival processing (Queue) with reverse chronological processing (Stack).
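A small Java sketch makes the contrast concrete. It uses `java.util.ArrayDeque` for both structures; the class and method names are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class QueueVersusStack {
    // FIFO: items are added at the back and removed from the front.
    static String drainAsQueue(String[] items) {
        Deque<String> queue = new ArrayDeque<>();
        for (String item : items) {
            queue.addLast(item);               // enqueue
        }
        StringBuilder order = new StringBuilder();
        while (!queue.isEmpty()) {
            order.append(queue.removeFirst()); // dequeue
        }
        return order.toString();
    }

    // LIFO: items are added to and removed from the top.
    static String drainAsStack(String[] items) {
        Deque<String> stack = new ArrayDeque<>();
        for (String item : items) {
            stack.push(item);                  // push
        }
        StringBuilder order = new StringBuilder();
        while (!stack.isEmpty()) {
            order.append(stack.pop());         // pop
        }
        return order.toString();
    }

    public static void main(String[] args) {
        String[] arrivals = {"A", "B", "C"};
        System.out.println(drainAsQueue(arrivals)); // prints "ABC"
        System.out.println(drainAsStack(arrivals)); // prints "CBA"
    }
}
```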

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for defining Queue behavior with the FIFO acronym explained.
- [1 mark] for defining Stack behavior with the LIFO acronym explained.
- [0.5 mark] for a clear contrasting statement highlighting the chronological order difference.
Question 6 · Short Answer
2.5 marks
Explain how virtual memory is utilized by the operating system when physical RAM is fully allocated, and identify one performance drawback of this process.

Worked solution

Virtual memory acts as an extension of primary memory. When RAM is saturated, the OS swaps inactive memory blocks (pages) out of RAM onto the hard drive or SSD (secondary storage). When these pages are needed again by a running process, they are swapped back into RAM, displacing other inactive pages. The bottleneck is the speed difference: reading/writing to secondary storage is orders of magnitude slower than RAM, leading to latency and potentially 'thrashing' if swapping is excessive.

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for explaining how inactive memory pages are moved from RAM to secondary storage.
- [0.5 mark] for noting the reciprocal swap-back process when the data is requested again.
- [1 mark] for identifying the performance penalty (slowdown/disk thrashing) caused by the slower read/write speeds of secondary storage compared to RAM.
Question 7 · Short Answer
2.5 marks
With reference to an automated greenhouse ventilation system, explain how a closed-loop feedback control system differs from an open-loop system.

Worked solution

In a closed-loop system, the output (current temperature) is measured by a sensor and fed back to the controller. The controller compares it to the desired setpoint and adjusts the ventilation motor to close the gap. In an open-loop system, there is no sensory feedback loop; the vents might open for 10 minutes every hour. If the greenhouse becomes unexpectedly hot or cold due to external weather, the open-loop system cannot self-correct.
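The distinction can be sketched as two decision functions in Java. The names, the simple "open if hotter than the setpoint" rule and the 10-minute timer are illustrative assumptions only; a real controller would be more sophisticated.

```java
class VentControl {
    // Closed loop: the decision depends on the measured output (temperature).
    static boolean closedLoopVentOpen(double measuredTemp, double setpoint) {
        return measuredTemp > setpoint;
    }

    // Open loop: the decision depends only on the clock, never on the temperature.
    static boolean openLoopVentOpen(int minuteOfHour) {
        return minuteOfHour < 10; // open for the first 10 minutes of every hour
    }

    public static void main(String[] args) {
        // Unexpected heat at minute 30: only the closed-loop system reacts.
        System.out.println(closedLoopVentOpen(35.0, 25.0)); // prints true
        System.out.println(openLoopVentOpen(30));           // prints false
    }
}
```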

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for explaining how closed-loop control uses sensor data as feedback to dynamically adjust output to match a setpoint.
- [1 mark] for explaining that an open-loop system operates strictly on preset inputs or timers without examining output.
- [0.5 mark] for contextualizing with the greenhouse example (e.g., actual internal temperature vs. a static timer).
Question 8 · Short Answer
2.5 marks
Distinguish between inheritance and polymorphism in object-oriented programming, providing a brief conceptual distinction.

Worked solution

Inheritance establishes an 'is-a' relationship between classes, enabling a child class to inherit fields and methods from a parent class, promoting code reuse. Polymorphism (specifically dynamic method overriding) allows a program to process objects differently depending on their data type or class, meaning the same method signature can produce different outputs depending on the runtime object instance.

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for a clear definition of inheritance (parent/child relationship, acquisition of properties/methods, code reuse).
- [1 mark] for a clear definition of polymorphism (one interface, multiple implementations/behaviors via overriding/overloading).
- [0.5 mark] for distinguishing the focus (inheritance is about structural hierarchy, while polymorphism is about behavioral flexibility at runtime).
Question 9 · Short Answer
2.5 marks
In the context of system design, outline one advantage of prototyping [1.5 marks] and identify one scenario where a direct changeover installation method is preferred over a parallel changeover method [1 mark].

Worked solution

Prototyping helps illustrate system requirements to stakeholders early in the development cycle. Users can interact with the mock-up, identify missing features, and give practical feedback, preventing costly re-designs later. Direct changeover is preferred when the organization lacks the resources (financial, technological, or staff) to operate both systems concurrently, or when the database formats and structures are so vastly different that parallel runs are technically impossible.

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for identifying a valid advantage of prototyping (e.g., active user feedback, early error detection, requirement clarification, cost-efficiency in the long run).
- [0.5 mark] for an appropriate expansion or explanation of this advantage.
- [1 mark] for identifying a suitable scenario for direct changeover (e.g., limited budget/staff to run duplicate systems, old and new systems are completely incompatible, the new system is low-risk/not mission-critical).
Question 10 · Short Answer
2.5 marks
Explain the interaction between the Program Counter (PC) and the Memory Address Register (MAR) during the fetch phase of the machine instruction cycle.

Worked solution

During the fetch phase, the CPU must retrieve an instruction from primary memory (RAM). The Program Counter (PC) holds the memory address of the next instruction to be executed. This address is copied/transferred to the Memory Address Register (MAR). The MAR then uses this address to locate the specific instruction in primary memory, preparing it to be read into the Memory Data Register (MDR).
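The register transfers in the fetch phase can be mimicked with a toy Java model. The class `FetchDemo`, the four-word `ram` array and its contents are assumptions for illustration. The sketch also increments the PC, a standard step of the fetch phase that the solution above does not spell out.

```java
// Hypothetical toy model of the fetch phase; names and values are assumptions.
class FetchDemo {
    int[] ram = {0x1A, 0x2B, 0x3C, 0x4D}; // pretend these are stored instructions
    int pc = 0; // Program Counter: address of the next instruction
    int mar;    // Memory Address Register
    int mdr;    // Memory Data Register

    void fetch() {
        mar = pc;        // address copied from the PC to the MAR
        mdr = ram[mar];  // instruction at that address read into the MDR
        pc = pc + 1;     // PC now points to the following instruction
    }

    public static void main(String[] args) {
        FetchDemo cpu = new FetchDemo();
        cpu.fetch();
        System.out.println(cpu.mar); // prints 0
        System.out.println(cpu.mdr); // prints 26 (0x1A)
        System.out.println(cpu.pc);  // prints 1
    }
}
```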

Marking scheme

Award up to [2.5 marks] as follows:
- [1 mark] for explaining that the Program Counter (PC) contains/holds the address of the next instruction to be executed.
- [1 mark] for explaining that this address is copied or transferred from the PC to the Memory Address Register (MAR).
- [0.5 mark] for describing that the MAR uses this address to reference/locate the instruction in primary memory (RAM) or to load it onto the address bus.

Paper 1 Section B

Answer all questions. Section B consists of structured scenario-based tasks and complex pseudocode algorithms.
5 Questions · 75 marks
Question 1 · Structured
15 marks
A sensor monitoring system tracks hourly temperatures in a greenhouse.

(a) Describe two advantages of using modular programming (sub-programs) when developing algorithms for this system. [4 marks]

(b) Construct a trace table for the following pseudocode segment with the input array TEMP = [12, 15, 14, 10, 18, 11].

TEMP = [12, 15, 14, 10, 18, 11]
MAX_TEMP = TEMP[0]
MIN_TEMP = TEMP[0]
I = 1
loop while I < 6
    if TEMP[I] > MAX_TEMP then
        MAX_TEMP = TEMP[I]
    end if
    if TEMP[I] < MIN_TEMP then
        MIN_TEMP = TEMP[I]
    end if
    I = I + 1
end loop

Your trace table should track the variables I, TEMP[I], MAX_TEMP, and MIN_TEMP at each iteration. [5 marks]

(c) Construct a pseudocode algorithm that processes an array TEMP of size N and finds the maximum drop in temperature between any two consecutive hours (i.e. TEMP[I-1] - TEMP[I]). Your algorithm should output the maximum temperature drop found. If there are no drops, output 0. [6 marks]

Worked solution

Part (a):
1. Code Reusability: Sub-programs can be written once and called multiple times within the greenhouse monitoring system (e.g., read_sensor() or print_stats()).
2. Ease of Debugging and Maintenance: Individual modules can be tested independently of the main program, making it easier to isolate errors in specific functions like temperature validation.

Part (b):
Trace Table:
Initial state: MAX_TEMP = 12, MIN_TEMP = 12, I = 1
- Iteration 1: I = 1, TEMP[I] = 15. Since 15 > 12, MAX_TEMP = 15. MIN_TEMP remains 12. I becomes 2.
- Iteration 2: I = 2, TEMP[I] = 14. Neither condition met. MAX_TEMP = 15, MIN_TEMP = 12. I becomes 3.
- Iteration 3: I = 3, TEMP[I] = 10. Since 10 < 12, MIN_TEMP = 10. MAX_TEMP remains 15. I becomes 4.
- Iteration 4: I = 4, TEMP[I] = 18. Since 18 > 15, MAX_TEMP = 18. MIN_TEMP remains 10. I becomes 5.
- Iteration 5: I = 5, TEMP[I] = 11. Neither condition met. MAX_TEMP = 18, MIN_TEMP = 10. I becomes 6.
Loop terminates because I is no longer < 6.
Final output state: MAX_TEMP = 18, MIN_TEMP = 10.
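The trace can be checked mechanically. The Java sketch below is a direct translation of the pseudocode that records one row per iteration (I, TEMP[I], MAX_TEMP, MIN_TEMP); the class and method names are assumptions.

```java
class TraceDemo {
    // Returns one line per loop iteration: "I TEMP[I] MAX_TEMP MIN_TEMP".
    static String trace(int[] temp) {
        StringBuilder rows = new StringBuilder();
        int maxTemp = temp[0];
        int minTemp = temp[0];
        int i = 1;
        while (i < temp.length) {
            if (temp[i] > maxTemp) {
                maxTemp = temp[i];
            }
            if (temp[i] < minTemp) {
                minTemp = temp[i];
            }
            // Row shows the values after both comparisons, before I is incremented.
            rows.append(i).append(' ').append(temp[i]).append(' ')
                .append(maxTemp).append(' ').append(minTemp).append('\n');
            i = i + 1;
        }
        return rows.toString();
    }

    public static void main(String[] args) {
        System.out.print(trace(new int[]{12, 15, 14, 10, 18, 11}));
        // prints:
        // 1 15 15 12
        // 2 14 15 12
        // 3 10 15 10
        // 4 18 18 10
        // 5 11 18 10
    }
}
```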

Part (c):
Pseudocode Algorithm:
MAX_DROP = 0
I = 1
loop while I < N
    DROP = TEMP[I-1] - TEMP[I]
    if DROP > MAX_DROP then
        MAX_DROP = DROP
    end if
    I = I + 1
end loop
output MAX_DROP
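For readers who want to run the algorithm, here is an equivalent Java sketch. The class and method names are assumptions, and the array length plays the role of N.

```java
class MaxDropDemo {
    // Largest fall between consecutive readings; 0 if the temperature never falls.
    static int maxDrop(int[] temp) {
        int maxDrop = 0;
        for (int i = 1; i < temp.length; i++) {
            int drop = temp[i - 1] - temp[i]; // positive when the temperature fell
            if (drop > maxDrop) {
                maxDrop = drop;
            }
        }
        return maxDrop;
    }

    public static void main(String[] args) {
        System.out.println(maxDrop(new int[]{12, 15, 14, 10, 18, 11})); // prints 7 (18 to 11)
        System.out.println(maxDrop(new int[]{10, 11, 12}));             // prints 0 (no drops)
    }
}
```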

Marking scheme

Part (a): [4 marks]
- 1 mark for identifying an advantage (e.g., modularity, reusability, team collaboration, simplified debugging).
- 1 mark for explaining it in the context of the scenario.
- (Repeat for the second advantage, up to 4 marks total).

Part (b): [5 marks]
- 1 mark for correct initialization (MAX_TEMP=12, MIN_TEMP=12, I=1) and first iteration (TEMP[1]=15, MAX_TEMP=15, MIN_TEMP=12).
- 1 mark for correct tracking of variables at I=2 and I=3.
- 1 mark for correct tracking of variables at I=4 and I=5.
- 1 mark for loop termination check at I=6.
- 1 mark for correct final values of MAX_TEMP (18) and MIN_TEMP (10).

Part (c): [6 marks]
- 1 mark for initializing MAX_DROP to 0.
- 1 mark for starting the loop index at 1 (or correct boundary controls).
- 1 mark for correct loop termination condition (I < N or equivalent).
- 1 mark for correctly calculating the drop as TEMP[I-1] - TEMP[I].
- 1 mark for comparing current drop with MAX_DROP and updating it.
- 1 mark for incrementing loop index and correctly outputting MAX_DROP.
Question 2 · Structured
15 marks
A customer helpdesk uses a linear queue of capacity 100 implemented as an array of strings named QUEUE to manage customer support tickets. Two integer variables, HEAD and TAIL, are used where HEAD points to the front customer and TAIL points to the next available slot at the rear of the queue. Initially, HEAD = 0 and TAIL = 0.

(a) Define the term 'static data structure' and outline one disadvantage of using a static array to implement this customer queue. [4 marks]

(b) Construct a pseudocode algorithm for the method enqueue(customerName) that inserts a customer at the rear of the queue. Your algorithm must check for overflow conditions and output appropriate error messages. [6 marks]

(c) Explain how a circular queue resolves the 'creeping queue' issue associated with linear queues without relocating elements. [5 marks]

Worked solution

Part (a):
A static data structure is one whose size is fixed at the time of declaration/compile-time and cannot change dynamically during runtime.
Disadvantage: If the helpdesk experiences an unexpected surge in tickets exceeding 100, the system will crash or reject customers due to Queue Overflow. Alternatively, if only 2 tickets are active, 98 slots of memory are wasted.

Part (b):
method enqueue(customerName)
    if TAIL >= 100 then
        output "Error: Queue Overflow. Cannot add customer."
    else
        QUEUE[TAIL] = customerName
        TAIL = TAIL + 1
    end if
end method
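A runnable Java counterpart of this method might look as follows. The class name `LinearQueue` and the boolean return value are assumptions added so that the outcome can be checked; the pseudocode itself only outputs a message.

```java
class LinearQueue {
    static final int CAPACITY = 100;
    String[] queue = new String[CAPACITY];
    int head = 0; // index of the front customer
    int tail = 0; // index of the next free slot at the rear

    // Returns true if the customer was added, false on overflow (assumed return value).
    boolean enqueue(String customerName) {
        if (tail >= CAPACITY) {
            System.out.println("Error: Queue Overflow. Cannot add customer.");
            return false;
        }
        queue[tail] = customerName;
        tail = tail + 1;
        return true;
    }

    public static void main(String[] args) {
        LinearQueue helpdesk = new LinearQueue();
        for (int i = 0; i < CAPACITY; i++) {
            helpdesk.enqueue("Customer " + i);
        }
        System.out.println(helpdesk.enqueue("One too many")); // prints the error, then false
    }
}
```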

Part (c):
In a linear queue, each dequeue increases HEAD, so the array slots before HEAD can no longer be used. Even if the queue is mostly empty, TAIL will eventually move past the last index (99), causing a false overflow condition.
A circular queue solves this by wrapping the HEAD and TAIL pointers back to index 0 when they exceed the boundaries (e.g., using modulo arithmetic: TAIL = (TAIL + 1) % MAX). This reuses empty memory locations at the beginning of the array, preventing the queue from 'creeping' indefinitely.
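The wrap-around can be sketched in Java as follows. The class name `CircularQueue` and the extra `count` field (used here to tell a full queue from an empty one) are assumptions; other designs leave one slot unused instead.

```java
class CircularQueue {
    private final String[] slots;
    private int head = 0;  // index of the front element
    private int tail = 0;  // index of the next free slot
    private int count = 0; // number of stored elements (assumed helper field)

    CircularQueue(int capacity) {
        slots = new String[capacity];
    }

    boolean enqueue(String name) {
        if (count == slots.length) {
            return false;                  // genuinely full
        }
        slots[tail] = name;
        tail = (tail + 1) % slots.length;  // wrap back to index 0 at the end
        count++;
        return true;
    }

    String dequeue() {
        if (count == 0) {
            return null;                   // empty
        }
        String name = slots[head];
        slots[head] = null;
        head = (head + 1) % slots.length;  // wrap back to index 0 at the end
        count--;
        return name;
    }

    public static void main(String[] args) {
        CircularQueue q = new CircularQueue(3);
        q.enqueue("A");
        q.enqueue("B");
        q.enqueue("C");
        System.out.println(q.dequeue());    // prints A, freeing index 0
        System.out.println(q.enqueue("D")); // prints true: D reuses index 0
    }
}
```

No element is moved at any point; only the two indices change.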

Marking scheme

Part (a): [4 marks]
- 2 marks: Definition of static data structure (1 mark for fixed size, 1 mark for declared at compile-time/cannot change during execution).
- 2 marks: Outline of disadvantage (1 mark for identifying overflow risk or memory wastage, 1 mark for contextualizing to the helpdesk scenario).

Part (b): [6 marks]
- 1 mark for correct method header accepting customerName.
- 2 marks for checking the boundary condition correctly (TAIL >= 100 or TAIL == 100).
- 1 mark for outputting an appropriate error message.
- 1 mark for assigning customerName to QUEUE[TAIL].
- 1 mark for incrementing TAIL.

Part (c): [5 marks]
- 1 mark for defining 'creeping queue' (unused spaces at the start of the array as elements are dequeued).
- 2 marks for explaining the mechanism of a circular queue (modulo arithmetic or wrapping pointers back to 0).
- 2 marks for explaining how this avoids relocation of elements and allows memory reuse.
Question 3 · Structured
15 marks
A university is restructuring its local area network (LAN) to support high-throughput hybrid learning environments.

(a) Explain how a router directs data packets from a laptop in the engineering department to a web server in the administrative department. [4 marks]

(b) Describe the process of packet switching, referencing headers, routing, and reassembly. [5 marks]

(c) Discuss the technical benefits and drawbacks of implementing virtual local area networks (VLANs) across the university campus. [6 marks]

Worked solution

Part (a):
1. The laptop sends packets to its local default gateway (the router interface for the engineering subnet).
2. The router extracts the destination IP address from the packet header.
3. The router consults its internal routing table to determine the optimal next hop or physical interface to reach the administrative department's subnet.
4. The packet is forwarded through that interface to reach the web server.
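The lookup in steps 2 to 4 can be approximated by a very small Java sketch. Everything concrete in it is hypothetical: the subnet addresses, the /24 mask, the interface names and the two-entry "routing table" written as if-statements. A real router holds many routes and chooses the most specific match.

```java
class RouterSketch {
    // Convert dotted-decimal IPv4 text (e.g. "10.0.2.15") to a 32-bit integer.
    static int toInt(String ip) {
        int value = 0;
        for (String part : ip.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }

    // Pick the outgoing interface for a destination address.
    static String nextInterface(String destinationIp) {
        int mask = 0xFFFFFF00;                      // assumed /24 subnets
        int network = toInt(destinationIp) & mask;  // network part of the destination
        if (network == toInt("10.0.1.0")) {
            return "eth0 (engineering)";            // hypothetical subnet and interface
        }
        if (network == toInt("10.0.2.0")) {
            return "eth1 (administration)";         // hypothetical subnet and interface
        }
        return "default route";
    }

    public static void main(String[] args) {
        // Laptop in engineering sends to a web server at an assumed admin address.
        System.out.println(nextInterface("10.0.2.15")); // prints "eth1 (administration)"
    }
}
```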

Part (b):
1. Segmentation: The data payload is divided into small chunks called packets.
2. Header Attachment: Each packet gets a header containing control information, including source IP, destination IP, packet size, and a sequence number.
3. Independent Routing: Each packet is transmitted independently through the network. Routers determine the best path for each individual packet, meaning packets may take different routes.
4. Reassembly: At the target destination, the receiving system uses the sequence numbers in the headers to reassemble the packets in the correct order, requesting retransmission for any lost packets.

Part (c):
Benefits of VLANs:
- Logical Segmentation: Students and administrative staff can be placed on separate logical networks regardless of their physical location on campus, reducing broadcast traffic (improving network efficiency).
- Enhanced Security: Sensitive admin data is isolated from the student network, making unauthorized cross-network access harder.

Drawbacks of VLANs:
- Increased Complexity: VLAN configuration, maintenance, and inter-VLAN routing require skilled network administrators.
- Equipment Costs: Requires more expensive managed switches and routers capable of handling IEEE 802.1Q tagging.

Marking scheme

Part (a): [4 marks]
- 1 mark for identifying the router acts as the default gateway.
- 1 mark for mentioning the inspection of the IP header / destination address.
- 1 mark for referencing the routing table lookup.
- 1 mark for forwarding the packet to the correct output port / interface.

Part (b): [5 marks]
- 1 mark for mentioning the division of data into smaller packets.
- 1 mark for detailing header components (source, destination, sequence number).
- 1 mark for independent routing of packets based on network traffic conditions.
- 1 mark for sequential reassembly at the receiving node.
- 1 mark for handling packet loss / error detection.

Part (c): [6 marks]
- Up to 3 marks for benefits (1 mark for identifying, up to 2 marks for clear explanation/context of broadcast domains or security).
- Up to 3 marks for drawbacks (1 mark for identifying, up to 2 marks for clear explanation of technical complexity or hardware costs).
Question 4 · Structured
15 marks
A community hospital is migrating its legacy database of electronic health records (EHR) to a modern secure cloud platform.

(a) Compare the 'direct changeover' and 'phased introduction' implementation methods, recommending the best option for this medical scenario. [6 marks]

(b) Explain the importance of providing both user documentation and training seminars during this system migration. [4 marks]

(c) Explain the role of data validation during the data migration process and identify two validation checks that would ensure high data quality. [5 marks]

Worked solution

Part (a):
Direct Changeover involves completely shutting down the legacy system and immediately starting the new cloud system at a specific time.
- Advantage: Cheaper, fast, no duplication of efforts.
- Disadvantage: Extremely risky. If the new system fails, patient care is compromised, potentially risking lives.

Phased Introduction involves introducing parts of the new system (e.g., billing first, then medical records) in stages.
- Advantage: Lower risk, issues can be resolved in one module before deploying others, staff adapt progressively.
- Disadvantage: Takes longer, requires managing data compatibility between legacy and new modules.

Recommendation: Phased introduction is highly recommended here, as patient safety is paramount and a complete system failure during direct changeover cannot be tolerated.

Part (b):
- User Documentation: Provides a permanent, searchable reference manual that staff can consult when experiencing issues during daily tasks, reducing dependency on active IT support.
- Training Seminars: Offer hands-on experience and direct Q&A, allowing clinical staff to ask specific workflow questions, reducing data-entry errors and operational anxiety during the cutover.

Part (c):
Data validation ensures that the migrated records are complete, accurate, and in the correct format, preventing corruption of patient files.
1. Format Check: Ensures that data matches a specific template (e.g., Patient ID must be in the format 'AAA-1111').
2. Range Check: Confirms that numeric values fall within realistic parameters (e.g., Patient Body Temperature must be between 30.0 and 45.0 degrees Celsius).
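Both checks are straightforward to express in Java. In the sketch below, the pattern assumes that 'AAA-1111' means three uppercase letters, a hyphen and four digits; that reading, and the class and method names, are assumptions.

```java
import java.util.regex.Pattern;

class ValidationChecks {
    // Assumed interpretation of the 'AAA-1111' template.
    private static final Pattern PATIENT_ID = Pattern.compile("[A-Z]{3}-\\d{4}");

    // Format check: the whole value must match the template.
    static boolean isValidPatientId(String id) {
        return id != null && PATIENT_ID.matcher(id).matches();
    }

    // Range check: the value must lie within realistic limits (inclusive).
    static boolean isValidBodyTemperature(double celsius) {
        return celsius >= 30.0 && celsius <= 45.0;
    }

    public static void main(String[] args) {
        System.out.println(isValidPatientId("ABC-1234"));  // prints true
        System.out.println(isValidPatientId("AB-12345"));  // prints false
        System.out.println(isValidBodyTemperature(36.6));  // prints true
        System.out.println(isValidBodyTemperature(366.0)); // prints false (likely a missing decimal point)
    }
}
```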

Marking scheme

Part (a): [6 marks]
- 2 marks: Explanation of Direct Changeover (including 1 advantage and 1 disadvantage).
- 2 marks: Explanation of Phased Introduction (including 1 advantage and 1 disadvantage).
- 2 marks: Justified recommendation choosing Phased (or Parallel, if argued convincingly) over Direct to guarantee patient safety.

Part (b): [4 marks]
- 2 marks: Clear explanation of the role and value of User Documentation (reference tool, troubleshooting).
- 2 marks: Clear explanation of the role and value of Training Seminars (hands-on, reducing errors, confidence building).

Part (c): [5 marks]
- 1 mark for explaining the role of data validation in migrations (preventing corrupted or unusable data).
- 2 marks for the first validation check (1 mark for identification, 1 mark for application in a medical context).
- 2 marks for the second validation check (1 mark for identification, 1 mark for application in a medical context).
Question 5 · Structured
15 marks
An online bookstore wants to model its inventory system. A Book class holds specific book details, and a Bookstore class holds an array of up to 1000 Book objects.

(a) Distinguish between aggregation and inheritance relationships, explaining which is appropriate for modeling the relationship between the Bookstore and the Book class. [4 marks]

(b) Construct the class definition for Book in pseudocode, containing:
- Private attributes: title (String), price (double), and isDigital (boolean).
- A constructor that initializes these attributes.
- Accessor methods for price and isDigital, and a mutator method for price. [5 marks]

(c) Write a pseudocode method calculateTotalValue() within the Bookstore class that iterates through the internal array inventory of length numBooks and returns the combined price of all non-digital books. [6 marks]

Worked solution

Part (a):
- Inheritance is an "is-a" relationship where a subclass inherits attributes and behaviors from a parent class (e.g., EBook is a subclass of Book).
- Aggregation is a "has-a" relationship where one object contains or consists of other independent objects (e.g., Bookstore has Books).
- Selection: Aggregation is correct because the Bookstore contains a collection of Book objects, and books can exist independently of the bookstore.

Part (b):
class Book
    private title : String
    private price : double
    private isDigital : boolean

    constructor(newTitle, newPrice, newIsDigital)
        title = newTitle
        price = newPrice
        isDigital = newIsDigital
    end constructor

    method getPrice()
        return price
    end method

    method getIsDigital()
        return isDigital
    end method

    method setPrice(newPrice)
        price = newPrice
    end method
end class

Part (c):
method calculateTotalValue()
    total = 0.0
    loop I from 0 to numBooks - 1
        currentBook = inventory[I]
        if currentBook.getIsDigital() == false then
            total = total + currentBook.getPrice()
        end if
    end loop
    return total
end method
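Parts (b) and (c) can be combined into one runnable Java sketch. The `addBook` helper is hypothetical, added only so that the inventory can be filled for the demonstration; it performs no capacity check.

```java
class Book {
    private String title;
    private double price;
    private boolean isDigital;

    Book(String title, double price, boolean isDigital) {
        this.title = title;
        this.price = price;
        this.isDigital = isDigital;
    }

    double getPrice() { return price; }
    boolean getIsDigital() { return isDigital; }
    void setPrice(double newPrice) { price = newPrice; }
}

class Bookstore {
    private Book[] inventory = new Book[1000];
    private int numBooks = 0;

    // Hypothetical helper, not part of the question.
    void addBook(Book book) {
        inventory[numBooks] = book;
        numBooks++;
    }

    // Sum of the prices of all non-digital books in the inventory.
    double calculateTotalValue() {
        double total = 0.0;
        for (int i = 0; i < numBooks; i++) {
            Book currentBook = inventory[i];
            if (!currentBook.getIsDigital()) {
                total = total + currentBook.getPrice();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Bookstore store = new Bookstore();
        store.addBook(new Book("Printed A", 10.0, false));
        store.addBook(new Book("E-book B", 5.0, true));
        store.addBook(new Book("Printed C", 2.5, false));
        System.out.println(store.calculateTotalValue()); // prints 12.5
    }
}
```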

Marking scheme

Part (a): [4 marks]
- 1 mark: Define Inheritance (is-a relationship).
- 1 mark: Define Aggregation (has-a relationship).
- 1 mark: State that Aggregation is the appropriate relationship.
- 1 mark: Explain the choice (Bookstore contains Books; Books exist independently of the store).

Part (b): [5 marks]
- 1 mark for declaring attributes as private with appropriate data types.
- 1 mark for correct constructor structure initializing all attributes.
- 1 mark for correct accessor getPrice().
- 1 mark for correct accessor getIsDigital().
- 1 mark for correct mutator setPrice().

Part (c): [6 marks]
- 1 mark for initializing a running total to 0 (as a real/double value).
- 1 mark for correct loop syntax iterating up to numBooks - 1.
- 1 mark for accessing the element inside the array (inventory[I]).
- 1 mark for checking the isDigital flag using the accessor getIsDigital() == false.
- 1 mark for accumulating the price via getPrice().
- 1 mark for returning the total counter.

Paper 2 Option D

Answer all questions from the Object-Oriented Programming (OOP) Option.
5 Questions · 65 marks
Question 1 · Structured
13 marks
A library system categorizes all media using an inheritance hierarchy. A superclass MediaItem contains attributes title (String) and id (String). Two subclasses, Book (which has an additional attribute author of type String) and CD (which has an additional attribute artist of type String), inherit from MediaItem.

(a) State two advantages of using inheritance in the design of this library system. [2]

(b) Explain how polymorphism applies to a getDetails() method defined in MediaItem and overridden in Book and CD. [3]

(c) Construct Java-like code for the Book class, including its constructor (which must call the superclass constructor) and its getDetails() method. Assume MediaItem has a constructor that accepts title and id, and a getDetails() method returning a String. [5]

(d) Describe why declaring instance variables as private and methods as public is an example of encapsulation. [3]

Worked solution

(a) Advantages of using inheritance:
1. Code reuse: Common attributes (title, id) and methods do not need to be duplicated in Book and CD classes.
2. Extensibility: New media types (e.g., DVD) can easily be added to the system by inheriting from MediaItem without modifying existing code.

(b) Polymorphism allows a uniform interface (the getDetails() method) to behave differently depending on the actual subclass object being referred to at runtime (dynamic binding). If a collection contains mixed MediaItem references, calling getDetails() on a Book object will dynamically invoke the Book class's overridden version of the method, whereas calling it on a CD object will invoke the CD class's version.

(c) Class Implementation:
```java
public class Book extends MediaItem {
    private String author;

    public Book(String title, String id, String author) {
        super(title, id);
        this.author = author;
    }

    @Override
    public String getDetails() {
        return super.getDetails() + ", Author: " + this.author;
    }
}
```
```
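To see the dynamic binding described in part (b) in action, the following self-contained sketch adds a possible `MediaItem` and `CD` around the `Book` class. The body of `MediaItem.getDetails()`, the `CD` class and the sample data are assumptions, since the question only states the method's return type.

```java
class MediaItem {
    private String title;
    private String id;

    MediaItem(String title, String id) {
        this.title = title;
        this.id = id;
    }

    // Assumed output format; the question does not specify it.
    public String getDetails() {
        return "Title: " + title + ", ID: " + id;
    }
}

class Book extends MediaItem {
    private String author;

    Book(String title, String id, String author) {
        super(title, id);
        this.author = author;
    }

    @Override
    public String getDetails() {
        return super.getDetails() + ", Author: " + this.author;
    }
}

class CD extends MediaItem {
    private String artist;

    CD(String title, String id, String artist) {
        super(title, id);
        this.artist = artist;
    }

    @Override
    public String getDetails() {
        return super.getDetails() + ", Artist: " + this.artist;
    }
}

class LibraryDemo {
    public static void main(String[] args) {
        // One array of MediaItem references holding different subclass objects.
        MediaItem[] items = {
            new Book("Sample Book", "B001", "A. Author"),
            new CD("Sample Album", "C001", "A. Artist")
        };
        for (MediaItem item : items) {
            System.out.println(item.getDetails()); // version chosen at runtime
        }
        // prints:
        // Title: Sample Book, ID: B001, Author: A. Author
        // Title: Sample Album, ID: C001, Artist: A. Artist
    }
}
```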

(d) Declaring instance variables as private prevents direct unauthorized access or modification from outside classes (data hiding). Declaring methods as public provides a controlled, standardized interface through which external classes interact with this hidden data, enforcing validation rules and preserving object integrity.

Marking scheme

(a) Award up to [2] marks:
- Award [1] mark for stating code reuse/reducing redundancy.
- Award [1] mark for stating ease of maintenance/extensibility.

(b) Award up to [3] marks:
- Award [1] mark for explaining that overriding allows subclasses to have customized implementations.
- Award [1] mark for identifying dynamic/runtime binding (resolving method call at runtime based on actual object type).
- Award [1] mark for linking to the specific example (calling getDetails() produces different outputs for Book vs CD).

(c) Award up to [5] marks:
- Award [1] mark for correct class header with inheritance keyword (extends MediaItem).
- Award [1] mark for correct private instance variable (private String author).
- Award [1] mark for correct constructor signature and call to super class (super(title, id)).
- Award [1] mark for correct initialization of subclass instance variable (this.author = author).
- Award [1] mark for correct method overriding of getDetails() calling super.getDetails() to retrieve parent data.

(d) Award up to [3] marks:
- Award [1] mark for defining encapsulation / data hiding concept.
- Award [1] mark for describing the function of private variables (restricting direct mutation).
- Award [1] mark for describing public methods as the safe accessor/modifier interface.
Question 2 · Structured
13 marks
A fitness club booking system uses a class FitnessClass to manage club members booking a specific session. A Member class contains attributes memberID (String) and name (String). The FitnessClass class maintains an array of Member objects representing the registered participants.

(a) Distinguish between 'aggregation' and 'composition' relationships, referring to the relationship between FitnessClass and Member. [4]

(b) Write a Java method registerMember(Member newMember) inside the FitnessClass class. The method must check if the class is full (the maximum capacity is 20, tracked by an integer variable numRegistered) and if the member is already registered in the array members. If there is space and the member is not already registered, add the member to the array, increment numRegistered, and return true. Otherwise, return false. [6]

(c) Explain the purpose of the keyword 'this' in Java when used within a constructor or a method. [3]

Worked solution

(a) Aggregation is a weak 'has-a' relationship where the child object can exist independently of the parent container class (its life cycle is independent). Composition is a strong 'has-a' relationship where the child object's lifecycle is bound to the parent container class (if parent is destroyed, child is also destroyed). The relationship between FitnessClass and Member is aggregation because if a FitnessClass is cancelled/deleted, the Member objects continue to exist in the system.

(b) Method Implementation:
```java
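public boolean registerMember(Member newMember) {
    // Reject the booking when the class has reached its capacity of 20.
    if (this.numRegistered >= 20) {
        return false;
    }
    // Scan only the filled slots; slots beyond numRegistered are still null.
    for (int i = 0; i < this.numRegistered; i++) {
        if (this.members[i].getMemberID().equals(newMember.getMemberID())) {
            return false; // already registered
        }
    }
    // Store the member in the next free slot and update the count.
    this.members[this.numRegistered] = newMember;
    this.numRegistered++;
    return true;
}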
```

(c) The keyword 'this' is a reference to the current object, that is, the instance on which the constructor or method is running. It is used to:
1. Resolve naming conflicts (shadowing) when local variables or parameters share the same name as instance variables (e.g., this.name = name).
2. Pass the current object as an argument to other methods or call other constructors within the same class (using this()).
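
Both uses can be seen in a short sketch. The one-argument constructor, the "Unknown" default name and the rename() method are assumptions added for illustration and are not part of the question's Member class.

```java
class Member {
    private String memberID;
    private String name;

    // this(...) calls the two-argument constructor of the same class.
    // "Unknown" is a hypothetical default value.
    Member(String memberID) {
        this(memberID, "Unknown");
    }

    Member(String memberID, String name) {
        // The parameters shadow the fields; this.memberID and this.name are the fields.
        this.memberID = memberID;
        this.name = name;
    }

    // Hypothetical method: returns the current object so that calls can be chained.
    Member rename(String name) {
        this.name = name;
        return this;
    }

    String getMemberID() { return memberID; }
    String getName() { return name; }
}
```

Without the this. prefix, the statement name = name would assign the parameter to itself and leave the field unchanged.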

Marking scheme

(a) Award up to [4] marks:
- Award [1] mark for defining aggregation (independent life cycles).
- Award [1] mark for defining composition (dependent life cycles / cascading delete).
- Award [1] mark for correctly identifying that FitnessClass and Member share an aggregation relationship.
- Award [1] mark for justifying the relationship (Members continue to exist even if FitnessClass is deleted).

(b) Award up to [6] marks:
- Award [1] mark for correct method header (public boolean registerMember(Member newMember)).
- Award [1] mark for checking if active capacity limits are exceeded (numRegistered >= 20).
- Award [1] mark for implementing a loop that iterates up to numRegistered (not the whole array size to avoid NullPointerException).
- Award [1] mark for comparing member IDs correctly using .equals().
- Award [1] mark for inserting newMember at index numRegistered and incrementing count.
- Award [1] mark for returning correct booleans (false for duplicate/overflow, true for successful registration).

(c) Award up to [3] marks:
- Award [1] mark for identifying 'this' as reference to the current object instance.
- Award [1] mark for explaining parameter disambiguation / instance variable shadowing.
- Award [1] mark for providing a clear code/logical example of its application.
Question 3 · Structured
13 marks
A weather station uses an array of WeatherReading objects to keep track of daily climate conditions. Each WeatherReading object has private instance variables temperature (double) and humidity (double).

(a) Explain the purpose and behavior of a 'static' variable in a class, using an example such as keeping a count of the total number of WeatherReading objects instantiated. [3]

(b) Construct a Java method calculateAverageTemp(WeatherReading[] readings) that takes an array of WeatherReading objects as a parameter and returns the average temperature as a double. Assume each WeatherReading object has a getter method getTemperature(). If the array is empty or null, the method should return 0.0. [6]

(c) Discuss one disadvantage of using an array of fixed size to store these readings and suggest how a dynamic data structure might resolve this. [4]

Worked solution

(a) A 'static' variable belongs to the class itself rather than to any individual instance of that class. This means only a single copy of the variable exists in memory, shared by all instances. In this example, a static variable 'readingCount' can be incremented inside the WeatherReading constructor, successfully tracking the total number of instances created across the entire program execution.
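
As an illustration, the counter might be written as below. The name readingCount follows the worked solution; the getReadingCount() accessor is an assumed helper that the question does not define.

```java
class WeatherReading {
    // One copy of this variable exists for the whole class, shared by every instance.
    private static int readingCount = 0;

    private double temperature;
    private double humidity;

    WeatherReading(double temperature, double humidity) {
        this.temperature = temperature;
        this.humidity = humidity;
        readingCount++; // runs once for each object created
    }

    // Hypothetical accessor; static, so it is called on the class, not on an object.
    static int getReadingCount() { return readingCount; }

    double getTemperature() { return temperature; }
    double getHumidity() { return humidity; }
}
```

Calling WeatherReading.getReadingCount() after creating two readings returns a value two higher than before, regardless of which objects were created.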

(b) Method Implementation:
```java
public double calculateAverageTemp(WeatherReading[] readings) {
    // Guard against a missing or empty array.
    if (readings == null || readings.length == 0) {
        return 0.0;
    }
    double total = 0.0;
    int validCount = 0;
    for (int i = 0; i < readings.length; i++) {
        // Skip unfilled slots so getTemperature() is never called on null.
        if (readings[i] != null) {
            total += readings[i].getTemperature();
            validCount++;
        }
    }
    // Avoid dividing by zero when every slot was null.
    if (validCount == 0) {
        return 0.0;
    }
    return total / validCount;
}
```

(c) Disadvantage of fixed-size arrays: They are initialized with a set size that cannot change dynamically at runtime. If the weather station records more readings than the array size, an ArrayIndexOutOfBoundsException will occur or readings will be lost. Conversely, if the array is oversized to prevent this, memory is wasted.

Resolution: A dynamic data structure (such as ArrayList in Java) resizes itself as data is added or removed. It allocates more storage only when needed, which prevents overflow and avoids reserving a large, mostly empty array in advance.
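
A sketch of the dynamic alternative, assuming a hypothetical ReadingLog class that stores only temperatures to keep the example short. It relies on java.util.ArrayList, whose add() method enlarges the internal storage automatically when more room is needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: not part of the question.
class ReadingLog {
    private final List<Double> temperatures = new ArrayList<>();

    void record(double temperature) {
        temperatures.add(temperature); // no capacity check needed; the list grows itself
    }

    int size() { return temperatures.size(); }

    double average() {
        if (temperatures.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (double t : temperatures) {
            total += t;
        }
        return total / temperatures.size();
    }
}
```

No maximum size is declared anywhere, so the station can keep recording without risking an ArrayIndexOutOfBoundsException.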

Marking scheme

(a) Award up to [3] marks:
- Award [1] mark for stating static variables belong to the class, not individual instances.
- Award [1] mark for explaining that static variables share a single memory space among all instances.
- Award [1] mark for explaining how incrementing it in the constructor keeps track of total created instances.

(b) Award up to [6] marks:
- Award [1] mark for checking if the parameter is null or length is zero.
- Award [1] mark for initializing accumulation variable (total/sum) to 0.0.
- Award [1] mark for setting up a loop to traverse the entire array.
- Award [1] mark for defensive checking to skip potential null array indices.
- Award [1] mark for retrieving the temperature using getTemperature() and accumulating the sum.
- Award [1] mark for performing correct division and returning the double average.

(c) Award up to [4] marks:
- Award [1] mark for identifying fixed capacity limitations of arrays.
- Award [1] mark for discussing consequences (overflow exceptions or memory wastage from overallocation).
- Award [1] mark for suggesting a dynamic alternative (such as ArrayList or Linked List).
- Award [1] mark for explaining how dynamic resizing manages memory efficiently.
Question 4 · Structured
13 marks
An online payment gateway uses an object-oriented design to process payments. It defines an abstract class Payment and an interface Refundable.

(a) Distinguish between 'method overloading' and 'method overriding'. [4]

(b) Explain the difference between an abstract class and an interface, referencing how each is used in program design. [4]

(c) Write the Java class definition for a CreditCardPayment class that extends the abstract class Payment and implements the Refundable interface. Define only the class header, private instance variables (cardNumber of type String, and cardType of type String), and the signature of the constructors and methods required (you do not need to write the actual implementation bodies of the methods). Assume Payment has abstract method process() and Refundable requires refund(double amount). [5]

Worked solution

(a) Method overloading occurs within the same class, where multiple methods share a name but differ in their parameter lists (number, types, or order of parameters); the compiler selects the version at compile time. Method overriding occurs between a subclass and a superclass, where the subclass redefines an inherited method using the same name and parameter list and a compatible return type; the version that runs is chosen at runtime from the actual type of the object.
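
The contrast can be shown in a few lines. BasicPayment, CardPayment and describe() are hypothetical names chosen for this sketch; they are not the Payment hierarchy defined in the question.

```java
// Hypothetical classes used only to contrast the two mechanisms.
class BasicPayment {
    // Overloading: two methods named describe with different parameter lists.
    String describe(double amount) {
        return "Payment of " + amount;
    }

    String describe(double amount, String currency) {
        return "Payment of " + amount + " " + currency;
    }
}

class CardPayment extends BasicPayment {
    // Overriding: same name and parameter list as the superclass method.
    @Override
    String describe(double amount) {
        return "Card payment of " + amount;
    }
}

class DispatchDemo {
    public static void main(String[] args) {
        BasicPayment p = new CardPayment();          // declared type differs from actual type
        System.out.println(p.describe(10.0));        // prints Card payment of 10.0
        System.out.println(p.describe(10.0, "EUR")); // prints Payment of 10.0 EUR
    }
}
```

The first call runs the CardPayment version because the object is a CardPayment at runtime; the second call is matched to the two-parameter method by the compiler from the argument list.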

(b) An abstract class can have both abstract (unimplemented) and concrete (fully implemented) methods, as well as instance variables of any access level. It represents an 'is-a' relationship. Before Java 8, an interface could contain only abstract methods and constants; since Java 8 it may also declare default and static methods, but it still cannot hold instance state. It represents a capability ('can-do' relationship). A class can extend only one class (abstract or not) but can implement multiple interfaces.
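
One way the two types in this question could look is sketched below. Only process() and refund(double amount) come from the question; the processed flag, markProcessed(), the Auditable interface and the VoucherPayment class are hypothetical additions that show shared state in the abstract class and a class implementing more than one interface.

```java
abstract class Payment {
    private boolean processed = false;      // instance state is allowed here

    public abstract void process();         // no body: every subclass must supply one

    protected void markProcessed() {        // concrete method inherited by all subclasses
        processed = true;
    }

    public boolean isProcessed() { return processed; }
}

interface Refundable {
    void refund(double amount);             // implicitly public and abstract
}

// Hypothetical second capability, to show multiple interface implementation.
interface Auditable {
    String auditTag();
}

// Hypothetical subclass: one superclass, two interfaces.
class VoucherPayment extends Payment implements Refundable, Auditable {
    private double refundedTotal = 0.0;

    @Override
    public void process() {
        markProcessed();
    }

    @Override
    public void refund(double amount) {
        refundedTotal += amount;
    }

    @Override
    public String auditTag() {
        return "VOUCHER";
    }

    double getRefundedTotal() { return refundedTotal; }
}
```

A VoucherPayment can be handled through a Payment, Refundable or Auditable reference, depending on which role the calling code cares about.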

(c) Class Definition Code Outline:
```java
public class CreditCardPayment extends Payment implements Refundable {
    private String cardNumber;
    private String cardType;

    public CreditCardPayment(String cardNumber, String cardType) {
        // constructor body
    }

    @Override
    public void process() {
        // abstract method from Payment
    }

    @Override
    public void refund(double amount) {
        // interface method from Refundable
    }
}
```

Marking scheme

(a) Award up to [4] marks:
- Award [1] mark for defining overloading (same name, different parameter signature in same class).
- Award [1] mark for stating overloading is resolved at compile-time.
- Award [1] mark for defining overriding (same name and signature in subclass to customize parent behaviour).
- Award [1] mark for stating overriding is resolved at runtime (polymorphic dispatch).

(b) Award up to [4] marks:
- Award [1] mark for identifying abstract classes can contain concrete methods and instance variables, while interfaces historically contain only abstract method declarations.
- Award [1] mark for noting single inheritance limitations (can only inherit from one abstract class) vs multiple implementation capabilities (can implement multiple interfaces).
- Award [1] mark for stating abstract class represents a structural 'is-a' hierarchy.
- Award [1] mark for stating interfaces represent functional 'can-do' behaviors or contracts.

(c) Award up to [5] marks:
- Award [1] mark for correct class header with correct extends and implements keywords.
- Award [1] mark for declaring private instance variables (cardNumber, cardType).
- Award [1] mark for correct constructor declaration containing parameters.
- Award [1] mark for including abstract method process() signature.
- Award [1] mark for including interface method refund(double amount) signature.
Question 5 · Structured
13 marks
An e-commerce platform manages user carts using a ShoppingCart class that contains an array items of Item objects and an integer count representing the current number of items.

(a) Explain how the OOP principles of modularity and reduced coupling improve the maintenance and extensibility of this e-commerce system. [4]

(b) Write a Java method removeItem(String itemID) inside the ShoppingCart class. If an Item with the specified itemID exists in the items array (search up to count), remove it from the array, shift all subsequent items one position to the left to fill the empty slot, decrement count, and return true. If the item is not found, throw an ItemNotFoundException. Assume the items array has no null spaces between elements up to count. [6]

(c) Describe how Java's Automatic Garbage Collection manages memory when an Item object is removed from the ShoppingCart and no other references to that Item exist. [3]

Worked solution

(a) Modularity divides the system into distinct self-contained units (like ShoppingCart, Item, Payment), allowing developers to write, test, and debug each class independently. Reduced coupling ensures classes are minimally dependent on each other's internal structure. If the internal representation of Item changes (e.g. how the price is stored), ShoppingCart is unaffected as long as it uses only Item's public methods, making the system easier to maintain and extend.
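
A minimal sketch of low coupling, under stated assumptions: the priceInCents field and the getPrice(), addItem() and total() methods are hypothetical and are not given in the question. ShoppingCart talks to Item only through getPrice(), so the way the price is stored can change without touching the cart.

```java
class Item {
    private final String itemID;
    private final long priceInCents; // hypothetical internal representation, hidden from other classes

    Item(String itemID, long priceInCents) {
        this.itemID = itemID;
        this.priceInCents = priceInCents;
    }

    String getItemID() { return itemID; }

    // Public contract: callers receive a price and never see how it is stored.
    double getPrice() { return priceInCents / 100.0; }
}

class ShoppingCart {
    private final Item[] items = new Item[10];
    private int count = 0;

    // Hypothetical helper with no capacity check.
    void addItem(Item item) { items[count++] = item; }

    double total() {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += items[i].getPrice(); // depends only on Item's public method
        }
        return sum;
    }
}
```

If Item later stored the price in a different type, only getPrice() would need rewriting and ShoppingCart.total() would compile and run unchanged.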

(b) Method Implementation:
```java
public boolean removeItem(String itemID) throws ItemNotFoundException {
    // Linear search through the filled part of the array.
    int indexToRemove = -1;
    for (int i = 0; i < this.count; i++) {
        if (this.items[i].getItemID().equals(itemID)) {
            indexToRemove = i;
            break;
        }
    }
    if (indexToRemove == -1) {
        throw new ItemNotFoundException("Item not found: " + itemID);
    }
    // Shift every later item one position to the left to close the gap.
    for (int j = indexToRemove; j < this.count - 1; j++) {
        this.items[j] = this.items[j + 1];
    }
    // Clear the now-duplicated last slot so no stale reference remains.
    this.items[this.count - 1] = null;
    this.count--;
    return true;
}
```

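(c) Java's automatic garbage collector runs in the background at times chosen by the JVM, not continuously. When an Item is removed from the items array and no other chain of references from a live root (such as a local variable on a thread's stack or a static field) leads to it, the object becomes 'unreachable'. In a mark-and-sweep collector, the mark phase flags every object that is still reachable; the sweep phase then reclaims the heap memory of everything left unmarked, including that Item. The programmer never frees memory explicitly, which prevents the leaks caused by forgotten deallocation.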

Marking scheme

(a) Award up to [4] marks:
- Award [1] mark for defining modularity (independent classes/components).
- Award [1] mark for linking modularity to system benefits (easier localized debugging/testing).
- Award [1] mark for defining coupling (degree of dependency between classes).
- Award [1] mark for explaining that low coupling protects classes from changes in another class's internal design.

(b) Award up to [6] marks:
- Award [1] mark for correct search loop up to count.
- Award [1] mark for correct ID comparison using .equals() on string values.
- Award [1] mark for throwing custom exception (throw new ItemNotFoundException(...)) if element is not found.
- Award [1] mark for shifting elements properly using a loop starting from indexToRemove to count - 1.
- Award [1] mark for resetting the final element of the array to null and decrementing count.
- Award [1] mark for correct method signature containing throws ItemNotFoundException and returning true.

(c) Award up to [3] marks:
- Award [1] mark for defining 'unreachable' objects (no active references on stack/heap).
- Award [1] mark for stating that the GC identifies and sweeps these objects from the heap.
- Award [1] mark for explaining that this is automated, freeing developer from manual memory management and preventing leaks.

Paper 3

Answer all questions based on the Case Study: The perfect chatbot.
4 Questions · 30 marks
Question 1 · Recall & Extended Response Essay
7.5 marks
Explain how Reinforcement Learning from Human Feedback (RLHF) is utilized to align a Large Language Model (LLM)-based chatbot with human safety standards. Discuss one major limitation of RLHF when trying to eliminate bias completely.

Worked solution

Reinforcement Learning from Human Feedback (RLHF) is a crucial technique for training modern chatbots. It involves:
1. **Supervised Fine-Tuning (SFT)**: The base model is trained on curated high-quality conversational prompts and answers.
2. **Reward Model Training**: Human evaluators review multiple generated outputs from the chatbot for a single prompt and rank them based on helpfulness, accuracy, and safety. A separate 'reward model' is trained on this ranking data to predict human-like evaluation scores.
3. **Reinforcement Learning (PPO)**: The conversational model is fine-tuned using Proximal Policy Optimization (PPO). The reward model scores each generated response, returning a higher scalar reward for aligned behavior and a lower (or negative) reward for toxic/unsafe behavior, and the chatbot's parameters are updated to maximize that reward.
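
The reward model training step can be made concrete with a small numerical sketch. The worked solution does not state which loss is used; the pairwise form below, -log(sigmoid(r_preferred - r_rejected)), is a common choice for learning from rankings and is an assumption here, as are the class and method names.

```java
// Hypothetical helper class: illustrates a pairwise preference loss, not a full training loop.
class PreferenceLoss {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Small when the human-preferred answer already receives the higher reward,
    // large when the reward model ranks the pair the wrong way round.
    static double pairwiseLoss(double rewardPreferred, double rewardRejected) {
        return -Math.log(sigmoid(rewardPreferred - rewardRejected));
    }
}
```

Minimizing this loss over many ranked pairs pushes the reward model to score responses in the same order as the human evaluators did.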

**Limitation**: The main limitation is the subjectivity and demographic skew of human annotators. The 'alignment' of the model is restricted to the values of the specific subgroup of annotators chosen for the evaluation tasks. Consequently, minority perspectives can be silenced, and cultural biases are inevitably baked into the reward model, making a truly unbiased 'perfect chatbot' unachievable through RLHF alone.

Marking scheme

**RLHF Process (Up to 4 marks):**
- **1 mark**: Explaining the initial Supervised Fine-Tuning (SFT) stage where models learn basic dialogue structures.
- **1 mark**: Explaining the collection of human preference data (ranking model responses based on safety/helpfulness).
- **1 mark**: Explaining the training of a surrogate Reward Model based on these human rankings.
- **1 mark**: Explaining the optimization loop where the policy (LLM) is updated using reinforcement learning (PPO) to maximize rewards.

**Limitation of RLHF (Up to 3.5 marks):**
- **1 mark**: Identifying a specific limitation (e.g., human annotator demographic skew, reward hacking, or subjectivity).
- **1.5 marks**: Explaining clearly how this limitation leads to persistent or localized bias (e.g., the reward model learns the specific cultural biases/norms of a small, homogenous group of annotators, which are then enforced globally on all users).
- **1 mark**: Critical insight/synthesis showing why this prevents the chatbot from being universally safe or unbiased.
Question 2 · Recall & Extended Response Essay
7.5 marks
Chatbots rely on a finite 'context window' to maintain the flow of a conversation. Discuss the trade-offs between the size of this context window, system latency, and computational costs. Suggest and explain one optimization strategy to mitigate these issues.

Worked solution

The context window is the maximum number of tokens (sub-word units of text) that a transformer model can take into account in a single inference step.

**Trade-offs:**
1. **Size vs. Latency**: The self-attention mechanism in the standard Transformer architecture scales quadratically \(O(N^2)\) with the prompt length. As the context window grows, the time to compute key-value states increases, causing higher latency in generating the first token (Time-To-First-Token, TTFT).
2. **Memory/Computational Cost**: Longer contexts require storing more key-value (KV) states in high-speed GPU VRAM (KV caching). This limits the number of concurrent users a single GPU can support, increasing deployment costs.
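
A rough calculation makes both trade-offs concrete. The sketch assumes standard multi-head attention in which every layer caches one key vector and one value vector of width hiddenSize per token; the model dimensions used in the usage note (32 layers, hidden size 4096, 2 bytes per value) are hypothetical and do not describe any particular chatbot.

```java
// Hypothetical helper class for back-of-the-envelope estimates.
class ContextCost {
    // Query-key score entries per attention head in one layer: grows as N squared.
    static long attentionScores(long tokens) {
        return tokens * tokens;
    }

    // Approximate KV cache size in bytes for one conversation: grows linearly with N.
    // The factor 2 accounts for storing both keys and values.
    static long kvCacheBytes(long tokens, long layers, long hiddenSize, long bytesPerValue) {
        return 2L * layers * tokens * hiddenSize * bytesPerValue;
    }
}
```

Under these assumptions, doubling the context from 2,000 to 4,000 tokens quadruples the score entries from 4,000,000 to 16,000,000, and kvCacheBytes(4096, 32, 4096, 2) evaluates to 2,147,483,648 bytes (2 GiB) for a single user.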

**Optimization Strategy (e.g., Conversation Summarization):**
To mitigate these trade-offs, developers can implement an automated summarizing module. Instead of passing the entire raw historical chat log back to the model, a background task periodically summarizes older turns of the conversation into a concise bullet-point list. Only the active context (e.g., the last 3-4 turns) and the dense summary are sent. This maintains essential context while keeping the overall token length significantly lower, saving memory and decreasing computational latency.
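
The strategy can be sketched as a small prompt builder. This is an illustration under stated assumptions: ContextBuilder and its methods are hypothetical, and summarize() is a placeholder that simply truncates text, where a real system would call a summarization model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class: keeps the newest turns verbatim and folds older ones into a summary.
class ContextBuilder {
    private final int recentTurnsToKeep;
    private final List<String> recentTurns = new ArrayList<>();
    private final StringBuilder summary = new StringBuilder();

    ContextBuilder(int recentTurnsToKeep) {
        this.recentTurnsToKeep = recentTurnsToKeep;
    }

    void addTurn(String turn) {
        recentTurns.add(turn);
        while (recentTurns.size() > recentTurnsToKeep) {
            // The oldest turn leaves the verbatim window and becomes a summary bullet.
            summary.append("- ").append(summarize(recentTurns.remove(0))).append('\n');
        }
    }

    // Placeholder only: keeps at most the first 20 characters of the turn.
    private String summarize(String turn) {
        return turn.length() <= 20 ? turn : turn.substring(0, 20) + "...";
    }

    String buildPrompt() {
        StringBuilder prompt = new StringBuilder();
        if (summary.length() > 0) {
            prompt.append("Summary of earlier conversation:\n").append(summary);
        }
        for (String turn : recentTurns) {
            prompt.append(turn).append('\n');
        }
        return prompt.toString();
    }

    int recentTurnCount() { return recentTurns.size(); }
}
```

The prompt sent to the model therefore contains a bounded number of verbatim turns plus short summary bullets, so its length grows far more slowly than the full chat log.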

Marking scheme

**Trade-offs Analysis (Up to 4 marks):**
- **1 mark**: Defining the context window and its role in maintaining conversation coherence.
- **1 mark**: Explaining why larger context windows lead to higher system latency (referencing the quadratic \(O(N^2)\) complexity of attention algorithms).
- **1 mark**: Explaining the memory/hardware impact (high VRAM usage for KV cache on GPUs).
- **1 mark**: Linking these constraints to financial/operational scaling challenges for hosting the chatbot application.

**Optimization Strategy (Up to 3.5 marks):**
- **1 mark**: Proposing a valid technical optimization strategy (e.g., rolling context window, conversation summarization, semantic cache, or sliding-window attention).
- **1.5 marks**: Detailed explanation of how this strategy works technically to reduce token burden or computation.
- **1 mark**: Clear connection showing how the strategy improves latency or lowers costs without completely destroying the conversational history.
Question 3 · Recall & Extended Response Essay
7.5 marks
Describe how Retrieval-Augmented Generation (RAG) is implemented in a chatbot architecture to prevent 'hallucinations'. In your answer, explain the role of vector databases and semantic search in this process.

Worked solution

Retrieval-Augmented Generation (RAG) resolves the issue of 'hallucination' (where a model generates plausible-sounding but false statements) by grounding the LLM in actual external data.

**The RAG Pipeline:**
1. **Ingestion & Embedding**: Documents are broken into small text chunks. An embedding model converts these text chunks into high-dimensional numerical vectors, which capture the semantic meaning of the text.
2. **Vector Database**: These vectors are stored in a vector database designed for fast multidimensional indexing.
3. **Query & Retrieval**: When the user enters a prompt, the system converts the prompt into a vector using the same embedding model. It queries the vector database using distance metrics (like Cosine Similarity or Euclidean Distance) to retrieve the top-'k' most semantically similar text chunks.
4. **Augmented Generation**: The system merges the retrieved text chunks with the original user query inside a predefined template (e.g., 'Answer the user query using only the following facts: [Retrieved Chunks]'). This context-rich prompt is sent to the LLM, which synthesizes a coherent response constrained to the provided facts.
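
The retrieval and prompt-building steps can be sketched with an in-memory stand-in for a vector database. Everything here is illustrative: TinyVectorStore is a hypothetical class, the search is a brute-force scan (real vector databases use specialised indexes), and the embeddings are supplied by the caller because no embedding model is included. Vectors are assumed to be non-zero and of equal length.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory store: illustrates semantic search by cosine similarity.
class TinyVectorStore {
    private final List<String> chunks = new ArrayList<>();
    private final List<double[]> vectors = new ArrayList<>();

    void add(String chunk, double[] embedding) {
        chunks.add(chunk);
        vectors.add(embedding);
    }

    // Cosine similarity: dot product divided by the product of the vector lengths.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the k chunks whose embeddings are most similar to the query embedding.
    List<String> topK(double[] query, int k) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            order.add(i);
        }
        // Negating the similarity sorts from most similar to least similar.
        order.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, vectors.get(i))));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, order.size()); i++) {
            result.add(chunks.get(order.get(i)));
        }
        return result;
    }

    // Merges the retrieved facts and the user query into one prompt for the LLM.
    static String buildPrompt(String question, List<String> facts) {
        return "Answer the user query using only the following facts:\n"
                + String.join("\n", facts) + "\nQuery: " + question;
    }
}
```

With three stored chunks embedded as (1, 0, 0), (0, 1, 0) and (0.9, 0.1, 0), a query embedded as (1, 0, 0) retrieves the first and third chunks for k = 2, and only those two facts are placed in the prompt.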

Marking scheme

**RAG Architecture Concept (Up to 3 marks):**
- **1 mark**: Explaining that RAG grounds the chatbot in an authoritative external data source to bypass reliance on static parametric model knowledge.
- **1 mark**: Explaining how the retrieved context is appended directly to the system/user prompt.
- **1 mark**: Explaining how this constrains the LLM, substantially mitigating or stopping hallucinated answers.

**Vector Database and Semantic Search (Up to 4.5 marks):**
- **1.5 marks**: Describing vector embeddings (conversion of natural language to mathematical vectors representing semantic/conceptual meaning, rather than just raw keyword strings).
- **1.5 marks**: Explaining how the vector database performs semantic search (using metric algorithms like cosine similarity to match the conceptual meaning of the query with stored text chunks).
- **1.5 marks**: Explaining the operational cycle (Query -> Embedded Query -> Semantic Match -> Context Retrieval -> Prompt Injection) showing how these elements interact seamlessly in real-time.
Question 4 · Recall & Extended Response Essay
7.5 marks
Evaluate the ethical and technical challenges of using conversation logs from live user interactions to continuously fine-tune a chatbot. Refer specifically to data privacy regulations such as GDPR.

Worked solution

Continuous fine-tuning on live user logs helps the chatbot adapt to changing trends and user behaviors, but presents significant challenges:

**1. Technical and Ethical Challenges:**
- **PII Leakage**: LLMs are known to 'memorize' training samples. If a user shares their credit card number, medical diagnosis, or password with the chatbot, this private data may be memorized during fine-tuning and later outputted to an entirely different user.
- **Consent and Purpose Limitation**: Users frequently chat without expecting their text to become training data. Ethical compliance requires clear opt-in mechanics and distinct user understanding.

**2. Legal Challenges (GDPR Compliance):**
- **Right to Erasure (Article 17)**: GDPR gives users the right to have their personal data deleted (the 'Right to be Forgotten'). Once a user's data has influenced the billions of interconnected weights of a neural network during training, it is technically difficult and computationally expensive to selectively 'unlearn' that specific user's contribution without retraining the model from scratch.
- **Mitigations**: Developers can implement rigorous preprocessing pipelines that scrub PII (such as emails, addresses, names) using Named Entity Recognition (NER) models before the logs reach the fine-tuning dataset. Alternatively, they can train with differential privacy (for example DP-SGD), which mathematically bounds how much any single training example can influence the final network weights and so limits what can be extracted about an individual user, usually at some cost in model accuracy.
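
A scrubbing step might look like the sketch below. It uses regular expressions as a deliberately simpler stand-in for the NER models mentioned above, so it only catches email addresses and runs of 13 to 16 digits that resemble card numbers; names and addresses would pass through untouched. The PiiScrubber class, both patterns and the placeholder tags are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper: regex-based scrubbing, not a substitute for NER.
class PiiScrubber {
    // Simplified email pattern: local part, @, domain, dot, top-level domain.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // 13 to 16 digits, optionally separated by single spaces or hyphens.
    private static final Pattern CARD =
            Pattern.compile("\\b\\d(?:[ -]?\\d){12,15}\\b");

    // Replaces matches with placeholder tags before the text is stored for training.
    static String scrub(String text) {
        String withoutEmails = EMAIL.matcher(text).replaceAll("[EMAIL]");
        return CARD.matcher(withoutEmails).replaceAll("[CARD]");
    }
}
```

For example, scrub("Email jane.doe@example.com, card 4111 1111 1111 1111 please") returns "Email [EMAIL], card [CARD] please", while text without matching patterns is returned unchanged.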

Marking scheme

**Ethical & Technical Challenges (Up to 3.5 marks):**
- **1 mark**: Identifying the issue of PII leakage (how neural networks memorize and can accidentally output sensitive user training data in subsequent inference sessions).
- **1 mark**: Highlighting the challenge of obtaining informed, unambiguous user consent in a continuous deployment cycle.
- **1.5 marks**: Discussing how training on user logs can inadvertently capture and reinforce negative/toxic patterns if the incoming user inputs are malicious (adversarial pollution/poisoning).

**GDPR and Legal/Mitigation Evaluation (Up to 4 marks):**
- **1 mark**: Explicitly referencing GDPR standards, specifically the 'Right to be Forgotten' (Article 17) or principles of data minimization.
- **1.5 marks**: Explaining why compliance is difficult in machine learning (the technical impossibility of easily 'deleting' a specific data point's mathematical influence from deep neural network parameters).
- **1.5 marks**: Proposing and evaluating a realistic mitigation technique (such as automatic PII scrubbing pipelines or utilizing Differential Privacy frameworks) and recognizing its performance-accuracy trade-offs.
