Friday, August 15, 2025

CST334: Week 8 (Week 32)

I learned a lot this semester, but I had four major takeaways.

1) I got a better understanding of what's happening between the userspace applications that we usually interact with and the underlying hardware (CPU, memory, storage, etc.). We don't always think about whether we are using a character device, a block device, or a network device. We don't think about whether it's interrupt driven or if it uses polling. We don't always think about how the browser window stays active while it simultaneously streams audio or what happens when we open a file on our local disk. This class gives perspective on the aspects of computers that we don't interact with directly.

2) I saw another aspect of computing where efficient algorithms are the centerpiece. Operating systems, like AI, networking, and various search and pattern-recognition technologies, are built around well-developed algorithms, especially when it comes to scheduling and caching. I looked up some of the problems that people have solved and some that are in active research. There is a lot of interesting work being done, and this class forms a good foundation for at least a fundamental understanding of it.

3) I saw how important it is for multiple things to happen in a system at once and for specialized components to drive performance. Concurrency and multithreading are critical in modern computing, and this class did a good job covering them. For example, PA5 could lead you to imagine what it would be like if applications blocked while waiting for I/O or network operations to complete. On the hardware side, the performance boosts we get from purpose-specific hardware like MMUs and DMA controllers are something else I think about. Multiple tasks, multiple threads, multiple processing components, all working in concert.

4) I see how important it is to recognize and consider trade-offs. For example, is there a sweet spot for the size of the different layers of cache? Does it depend on the targeted application, say, a Xeon processor in a server versus a Core i9 in a laptop? Is it worth the cost for additional fast cache? Additional cache, cores, memory, etc. come at the price of power consumption, heat, and space, and an increase in price. This class helped expand my understanding of those trade-offs.

Tuesday, August 12, 2025

CST334: Week 7 (Week 31)

This week we learned about topics relating to device I/O, disks and persistent storage, and file system structure and operations.

With regard to I/O devices, we covered different ways of handling data transfers between the CPU and peripheral devices. Polling (programmed) I/O involves the CPU constantly checking a device's status, i.e., polling the device. It's simple but inefficient due to wasted CPU cycles. Interrupt-driven I/O is more efficient since devices interrupt the CPU only when they need attention; that way the CPU can work on other tasks while waiting. DMA-based (Direct Memory Access) I/O is better in most cases. It allows I/O devices to transfer data directly to and from memory without involving the CPU, minimizing CPU overhead. It requires dedicated hardware called a DMA controller.
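To make the cost of polling concrete, here is a toy sketch in Python. The `FakeDevice` class and its `ready()` flag are hypothetical stand-ins for real device registers, not any actual driver API; the point is just that a polling loop burns CPU cycles checking status until the device is ready.

```python
import time

class FakeDevice:
    """Hypothetical device that becomes ready after a delay (not a real driver API)."""
    def __init__(self, ready_after):
        self.t0 = time.monotonic()
        self.ready_after = ready_after

    def ready(self):
        return time.monotonic() - self.t0 >= self.ready_after

def polled_read(dev):
    polls = 0
    while not dev.ready():   # the CPU spins here, wasting cycles
        polls += 1
    return polls

dev = FakeDevice(ready_after=0.01)   # "ready" after 10 ms
wasted_checks = polled_read(dev)
print(f"CPU polled the device {wasted_checks} times before it was ready")
```

With interrupt-driven I/O, those wasted checks disappear: the CPU does other work and the device raises an interrupt when it needs attention.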

We also covered hard drives, their basic structure, and performance implications. Hard drives are made up of spinning platters, usually in a stacked configuration. The platters have magnetic coatings for storing data. Tracks are concentric rings on the platter, one inside the other, expanding outward from the center. The tracks are divided into sectors. A head moves across the platters to read or write data at commanded locations. Access time is the time it takes to retrieve data. It's impacted by seek time, the time it takes for the head to move to the correct track, and by rotational delay, the time it takes for the targeted sector to rotate under the head.
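A quick back-of-envelope calculation shows how seek time and rotational delay combine. The 7200 RPM spindle speed and 4 ms average seek time below are illustrative numbers I picked, not figures from the course; on average the target sector is half a rotation away.

```python
# Back-of-envelope disk access time (illustrative numbers, not from the course).
rpm = 7200
avg_seek_ms = 4.0

ms_per_rotation = 60_000 / rpm                 # ~8.33 ms per full rotation
avg_rotational_delay_ms = ms_per_rotation / 2  # on average, half a rotation away
access_time_ms = avg_seek_ms + avg_rotational_delay_ms

print(f"{avg_rotational_delay_ms:.2f} ms average rotational delay")  # ~4.17 ms
print(f"{access_time_ms:.2f} ms average access time")                # ~8.17 ms
```

Milliseconds sound small until you compare them with DRAM access times measured in nanoseconds, which is why caching and request scheduling matter so much.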

Lastly, we covered the basics of files and directories. At a basic level, files are just sequences of bits/bytes. Directories organize the files (and other directories) hierarchically. A superblock holds important file system information, like its size and the number of available blocks. Inodes contain metadata about files and directories (ownership, permissions, etc.) and pointers to their data blocks. Blocks are the basic unit of storage for file contents. The metadata describes files and directories and is used for file system management and access.
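Some of that inode metadata is easy to inspect from userspace with Python's `os.stat()`, which surfaces the inode number, size, permissions, and ownership that the file system keeps for each file (a throwaway temp file is used here just for illustration):

```python
import os
import stat
import tempfile

# Create a throwaway file so we have something to inspect.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)
print("inode number:", info.st_ino)              # which inode describes this file
print("size (bytes):", info.st_size)             # stored in the inode, not the data blocks
print("permissions:", stat.filemode(info.st_mode))  # e.g. -rw-------
print("owner uid:", info.st_uid)

os.unlink(path)  # clean up
```

Note that none of this required reading the file's contents: the metadata lives in the inode, separate from the data blocks it points to.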

Tuesday, August 5, 2025

CST334: Week 6 (Week 30)

This week we continued learning about concurrency. We learned about atomicity, semaphores, signaling, condition variables, and some of the pitfalls of multithreaded implementations like deadlocks, starvation, livelocks, etc.

We learned that semaphores are better suited for signaling between threads, whereas mutexes are better suited for resource locking. A semaphore represents an integer value that is incremented or decremented via atomic operations. It is initialized to some integer value specified by the programmer. For example, a software interface for a device with N transmit queues may use a semaphore initialized to N. Each time a thread successfully requests a queue, the semaphore is decremented, until it reaches 0. At that point, future requesters must wait. When a thread is done transmitting, it increments the semaphore, thus signaling waiting threads that a queue is available. The behavior seems similar to mutexes, but mutexes are better suited to protecting a shared resource in a critical section than they are for signaling.
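The N-transmit-queue scenario can be sketched with Python's `threading.Semaphore`. The device with three queues is hypothetical; eight threads contend for it, and the `peak` counter verifies that no more than three ever transmit at once:

```python
import threading
import time

N_QUEUES = 3                              # hypothetical device with 3 transmit queues
queues = threading.Semaphore(N_QUEUES)    # counting semaphore initialized to N

in_use = 0
peak = 0
lock = threading.Lock()                   # protects the bookkeeping counters

def transmit():
    global in_use, peak
    with queues:                          # acquire: decrement, block if at 0
        with lock:
            in_use += 1
            peak = max(peak, in_use)
        time.sleep(0.01)                  # pretend to transmit
        with lock:
            in_use -= 1
    # leaving the 'with queues' block releases: increment, wake a waiter

threads = [threading.Thread(target=transmit) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent transmitters:", peak)  # never exceeds 3
```

The plain `Lock` around the counters, meanwhile, is exactly the mutex-style use case: it protects a small critical section rather than signaling.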

It's interesting that for semaphores and mutexes to work, the instruction set has to provide an atomic read-modify-write operation. Otherwise, the lock/unlock or increment/decrement operations may be interrupted before they complete. It just shows how specialized some instructions need to be to provide the higher-level features we take for granted.
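What goes wrong without atomicity can be shown deterministically. An increment is really read, modify, write; the sketch below simulates by hand the bad interleaving where both "threads" read the counter before either writes it back, so one update is lost:

```python
# Deterministic simulation of a lost update. Each increment is really
# read -> modify -> write; here both "threads" read before either writes.
counter = 0

t1_read = counter       # thread 1 reads 0
t2_read = counter       # thread 2 reads 0 (interleaved before thread 1 writes)
counter = t1_read + 1   # thread 1 writes back 1
counter = t2_read + 1   # thread 2 also writes back 1, clobbering the update

print(counter)  # 1, not the expected 2
```

An atomic read-modify-write instruction (e.g. test-and-set or compare-and-swap) makes that interleaving impossible, which is what lock and semaphore implementations rely on.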

We also learned how to use condition variables, which allow waiting threads to be notified when a condition is met. For example, a condition variable may be used to notify a thread that a buffer is empty so that it can exit its wait loop and lock the buffer before queueing new buffer entries.
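That buffer-is-empty example can be sketched with Python's `threading.Condition` (the buffer contents and thread roles here are my own illustration). Note the `while` loop around `wait()`: the thread re-checks the condition after waking, which guards against spurious wakeups and also handles the case where the buffer was already empty before it started waiting.

```python
import threading

buffer = ["old1", "old2"]
cond = threading.Condition()

def refiller():
    # Waits until the buffer is empty, then queues new entries under the lock.
    with cond:
        while buffer:               # re-check the condition in a loop
            cond.wait()             # releases the lock while blocked
        buffer.extend(["new1", "new2"])

t = threading.Thread(target=refiller)
t.start()

with cond:
    buffer.clear()                  # drain the buffer...
    cond.notify()                   # ...and signal the waiting thread

t.join()
print(buffer)  # ['new1', 'new2']
```

Either ordering works: if the notify fires before the refiller ever waits, the while-loop check sees the buffer is already empty and skips the wait entirely.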

We also learned about the risks that come with multithreaded programming, like deadlocks, starvation, and livelock. When we are dealing with a lot of shared resources and many actors contending for access, it's easy to imagine how things can go wrong. Circular dependencies, ill-conceived priority and preemption algorithms, and other failings can wreak havoc. That's why we were also shown design patterns that use locks, semaphores, and condition variables in ways that help avoid some of the pitfalls.

I have a lot of respect for the researchers and engineers who devise these techniques.

CST370: Week 7 (Week 58)

 This week we covered non-comparison sorting, dynamic programming, Warshall's algorithm, Floyd's algorithm, Greedy Technique, and Pr...