Notes on 📕Principled Programming by 🐻Tim Teitelbaum

These are things I found interesting while reading Principled Programming: Introduction to Coding in Any Imperative Language

The notes are still in progress as I have not yet finished reading the book

Preface

The book is aimed at the level of a first computer class. However, it seems to fit better at a second class level, which is why it is a suggested resource in a second computer class, 🐻CS 2110

1. Introduction

The book advises writing it right the first time, not assuming it will be fixed later, and avoiding debugging as much as possible. This is something that I try to do, but I’ve noticed a lot of other people don’t
There’s a great example on how to think methodically about code in this chapter

2. Prerequisites

The book includes the program counter in the definition of state, then defines effect as a change in state. This implies that stepping through the code is an effect, which is not how most people think about effects

3. Specifications & Implementations

The book advises to think of specifications in two directions: inwards in defining what the implementation will do, and outwards in defining how the specification will be used by the rest of the program
The book bemoans that specifications contained in comments are essential to programming but are not used by the compiler or other tools. Perhaps it’s worth exploring ways to change that
The book advises that specifications should detail what to do and why you did it (don’t assume you’ll remember anything later), not how to do it–requirements, not processes
Information hiding is defined in Wikipedia as “the principle of segregation of the design decisions in a computer program that are most likely to change, thus protecting other parts of the program from extensive modification if the design decision is changed”
The book recommends that programs should explicitly complain when inputs violate preconditions, to avoid bugs where the user of the specification is unaware that bad inputs are being sent
State space is defined in Wikipedia as “a discrete space representing the set of all possible configurations of a system”
A representation invariant is defined in Wikipedia as “a set of invariant properties that remain uncompromised regardless of the state of the object”

Stepwise refinement / top-down approach is defined in Wikipedia as “an overview of the system is formulated, specifying, but not detailing, any first-level subsystems. Each subsystem is then refined in yet greater detail, sometimes in many additional subsystem levels, until the entire specification is reduced to base elements”
The book warn to be wary of a stepwise refinement situation where you’ve divided the problem into subproblems that may be harder to solve then the overall problem
The book points out that a specification of going from a precondition to a postcondition, is satisfied by an implementation that goes from a weakened version of the precondition (a superset of states allowed by the precondition) to a stronger version of the postcondition (a subset of states required by the postcondition). This way of thinking about specifications is incredibly useful
The book points out that one reason pulling code into a helper method is useful is that it is easy to tell which variables do not get passed to the helper function, and thus are not accessed or changed by the helper function’s functionality
Loop variant is defined in Wikipedia as “a mathematical function defined on the state space of a computer program whose value is monotonically decreased with respect to a (strict) well-founded relation by the iteration of a while loop under some invariant conditions, thereby ensuring its termination”
Loop invariant is defined in Wikipedia as “a property of a program loop that is true before (and after) each iteration…expressed by formal predicate logic and used to prove properties of loops and by extension algorithms that employ loops (usually correctness properties)”
The book advises that to determine a loop invariant, examine the postcondition and weaken it. For example, if {C^I} body {I}, then {I} while(C){body} {!C^I} where !C^I satisfies the postcondition you are trying to achieve. This is a great way to think about loop conditions and invariants
The book divides code into types: sequential, case analysis (conditional statements), iterative (loops), recursive, and library of previously solved patterns. This is an interesting way of thinking about how to break down problems
The book suggests coding a loop in the following order: body, termination, initialization, finalization, boundary conditions (first & last iteration). I’ve never consciously thought about the order–it may be worth trying in the future

5. Online Algorithms

Online algorithm is defined in Wikipedia as “process its input piece-by-piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start”. The keys are that you cannot store the input as it comes in, only update summary values, and you have to make decisions for each input without being able to look at the up-coming input.
The book suggests that loop invariants are particularly visible, and crucial to specify correctly, in online algorithms

6. Enumeration Patterns

Enumeration is defined in Wikipedia as “complete, ordered listing of all the items in a collection”
The book points out two commonly off-by-one errors: the index associated with the nth object is n-1, and the number of objects between index m and n is m-n+1
The BigInteger class is described in the Java documentation as “immutable arbitrary-precision integers”
In the Python documentation, “integers have unlimited precision”
The book presents a typical indeterminate enumeration: while ( i <= maxi && condition ) then check at end if i > maxi
The books recommends using for loops for determinate enumerations and while loops for indeterminate enumerations. This is often good advice–I’ve seen example code use for loops with break statements inside that are confusing. However, it means you have to use Iterators instead of for-each loops for indeterminate enumerations, which seems less clean
The book points out that to subtract 1 in arithmetic mod N, add N-1 to keep the result positive
The book advises that if performing a potentially unnecessary harmless operation takes as long as checking to see whether it should be done, don’t bother checking and just simplify the code by doing it. In general, the book advises using making the boundary case special code as small as possible–which seems a variant of the “premature optimization” advice. It’s also not great on multi-dimensional arrays, which are presented akwardly later in the chapter
The book advises against performing arithmetic on indices, which would seem to make the use of for-each loops preferable
The BitSet class is described in the Java documentation as “implements a vector of bits that grows as needed” and may be preferable to using a simpler boolean array if space is an issue
The difference between row- and column-major order is described in Wikipedia as “in row-major order, the consecutive elements of a row reside next to each other, whereas the same holds true for consecutive elements of a column in column-major order
The book describes picking two indices from a set as using the lower-triangular region ((2,1),(3,1),(3,2)…). I find it more natural to think of the upper-trangular region ((1,2),(1,3),(2,3)), but the former is easier to code
Aleph-zero is defined in Wikipedia as “ℵ₀ (aleph-nought, aleph-zero, or aleph-null) is the cardinality of the set of all natural numbers, and is an infinite cardinal. The set of all finite ordinals, called ω or ω₀ (where ω is the lowercase Greek letter omega), also has cardinality ℵ₀. A set has cardinality ℵ₀ if and only if it is countably infinite, that is, there is a bijection (one-to-one correspondence) between it and the natural numbers”

7. Sequential Search

Linear / sequential search is defined in Wikipedia as “sequentially checks each element of the list until a match is found or the whole list has been searched”
Gestalt psychology, gestaltism, or configurationism is defined by Wikipedia as “school of psychology and a theory of perception that emphasises the processing of entire patterns and configurations, and not merely individual components”
The book advises to develop an algorithm systematically, sticking in well-known patterns as you proceed, rather than relying on overall pattern intuition. This is because an algorithm cannot look at an entire array with intuition, but must look at it one step at a time
The book implies that sometimes it’s more natural to figure out when the loop should end, then negate that for use as the loop condition
Sentinel value is defined in Wikipedia as “special value in the context of an algorithm which uses its presence as a condition of termination, typically in a loop or recursive algorithm”

8. Binary Search

The book has a bit of fun with the idea that “looking up a word in the dictionary” is no longer a thing that people physically do
Binary search is defined in Wikipedia as “compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array”
The book points out that a while loop consists of four parts, while a for loop consists of six parts
The book’s algorithm finds the leftmost instance of the key by searching left is the middle element is the key. This a stronger result then the specification of find-an-index, and an optimization of stopping if the middle element contains the key could have been used
Sometimes indices like j, left, and right are inclusive and sometimes they are exclusive. The choice sometimes makes the algorithm easier or harder. Be very careful and document which is which to avoid off-by-one errors
java.util.Arrays.binarysearch is described in the Java documentation as “Searches the specified array for the specified object using the binary search algorithm. The array must be sorted into ascending order according to the natural ordering of its elements (as by the sort(Object[]) method) prior to making this call… If the array contains multiple elements equal to the specified object, there is no guarantee which one will be found…. Returns: index of the search key, if it is contained in the array; otherwise, (-(insertion point) - 1). The insertion point is defined as the point at which the key would be inserted into the array: the index of the first element greater than the key, or a.length if all elements in the array are less than the specified key”

9. One-Dimensional Array Rearrangements

The book advises using left and right indices when reversing an array, to make it easier to code and understand. The book advises against index arithmetic in general. Then again, the next example uses index arithmetic so 🤷
Aliasing is defined in Wikipedia as “data location in memory can be accessed through different symbolic names in the program”
The book advises that when trying to figure out something like a for loop limit, look at how it depends on variables x and y (positively or negatively), then write “+x -y +constant” and try to figure out the constant
The book has a clever in-place algorithm that has a rotation using 3 reverses (-a-> -b->) -1-> (<-a- -b->) -2-> (<-a- <-b-) -3-> (-b-> -a->)
The book points out that locality affects the choice of algorithm. Jumping around a large array may not be the best choice. In addition, the amount of extra space needed affects the choice of algorithm
Dutch national flag problem is defined by Wikipedia as “Given balls of these three colors arranged randomly in a line (it does not matter how many balls there are), the task is to arrange them such that all balls of the same color are together and their collective color groups are in the correct order. The solution to this problem is of interest for designing sorting algorithms; in particular, variants of the quicksort algorithm that must be robust to repeated elements may use a three-way partitioning function that groups items less than a given key (red), equal to the key (white) and greater than the key (blue)… One algorithm is to have the top group grow down from the top of the array, the bottom group grow up from the bottom, and keep the middle group just above the bottom. The algorithm indexes three locations, the bottom of the top group, the top of the bottom group, and the top of the middle group. Elements that are yet to be sorted fall between the middle and the top group”. See also the Wikipedia page for repeated elements in quicksort: “To solve the Lomuto partition scheme problem (sometimes called the Dutch national flag problem), an alternative linear-time partition routine can be used that separates the values into three groups: values less than the pivot, values equal to the pivot, and values greater than the pivot”
If there are 3 choices for an item, having an unknown area that can shrink from the left and from the right is useful
The book notes that the partition value does not have to be in the array
Merge is defined in Wikipedia as “take multiple sorted lists as input and produce a single list as output, containing all the elements of the inputs lists in sorted order”. It typically has 3 steps: merge a and b into c, move remaining elements of a into c, and move remaining elements of b into c (only one of the last two steps will actually be executed)

Notes on 📕Principled Programming by 🐻Tim Teitelbaum

Preface

1. Introduction

2. Prerequisites

3. Specifications & Implementations

4. Stepwise Refinement

5. Online Algorithms

6. Enumeration Patterns

7. Sequential Search

8. Binary Search

9. One-Dimensional Array Rearrangements

10. Counting