I use these notes to track thoughts I had, and things I found interesting, while reading *Object-Oriented Design and Data Structures*.

The notes are still in progress, as I have not yet finished reading the book.

2. Introduction

  • The book starts off with the great observation: “One of the joys of working with computers is that it is relatively easy to create new things…The constraints of the real world weigh much less heavily on software developers than on engineers in other disciplines.” This is often expressed in technology social media as “you can just do things”

3. Values, Types, and Variables

  • There are different definitions of a strongly-typed language, but the one used in this book is “run-time type errors are not possible.” Wikipedia points to an article by Luca Cardelli with a similar definition: “the absence of unchecked run-time type errors”
  • The book observes that Java, unlike Python, is whitespace-insensitive, but in practice everyone uses it in a whitespace-sensitive manner, with line breaks and indentation, for readability
  • The book states “Local variables are in scope from the point of declaration until the end of the block in which they are declared.” This is different from languages like Python, which have function scope rather than block scope. In JavaScript, variables defined with var have function scope, but variables defined with the modern let and const behave more like Java’s
  • I believe the only variable shadowing that Java disallows is a local variable shadowing another local variable. Everything else is fair game and, as the book points out, the cause of various errors. As far as I know, there’s no option to make the Java compiler stricter about shadowing and print warnings (see the sketch after this list)
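
A minimal sketch of these shadowing rules (the Shadow class and its names are my own hypothetical example, not from the book):

```java
// Sketch: shadowing Java allows (a parameter shadowing a field) versus
// local-over-local shadowing, which does not compile.
public class Shadow {
    private int x = 1; // field

    void demo(int x) {          // the parameter shadows the field
        int y = x;              // refers to the parameter, not the field
        // int x = 2;           // would not compile: x is already defined here
        System.out.println(this.x + " " + y); // this.x reaches the shadowed field
    }

    public static void main(String[] args) {
        new Shadow().demo(42);  // prints "1 42"
    }
}
```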

4. Java Execution Model: Arrays, Strings, Autoboxing

  • The book points out that because Java arrays are subtypes of Object, you can create an array of Objects, then point the first element of that array at the array itself
  • The book points out that autoboxing uses Integer.valueOf: “This method will always cache values in the range -128 to 127, inclusive, and may cache other values outside of this range” (both corners are sketched after this list)
  • The book advises using the obscure corners of a language as little as possible
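
A small sketch of both corners (my own example; per the quote above, whether values outside -128..127 are cached is implementation-dependent):

```java
public class Corners {
    public static void main(String[] args) {
        // Autoboxing goes through Integer.valueOf, which caches -128..127,
        // so reference comparison on boxed values is inconsistent.
        Integer a = 127, b = 127;
        Integer c = 128, d = 128;
        System.out.println(a == b);      // true: both refer to the cached object
        System.out.println(c == d);      // false (typically): distinct objects
        System.out.println(c.equals(d)); // true: compares values, not references

        // Arrays are subtypes of Object, so an Object[] can contain itself.
        Object[] self = new Object[1];
        self[0] = self;
        System.out.println(self[0] == self); // true
    }
}
```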

5. Representing Java Values

  • The book advises that though you can program while only knowing the top-level abstraction that the language provides, it’s useful to understand the internals. This advice, and the “don’t use obscure corners of a language” advice from the previous chapter, should be complementary, not contradictory
  • The book points out that variables and fields of type short, byte, and char all use 32 bits of memory in Java, like an int, but that arrays pack the smaller data types together more tightly. It also suggests using a BitSet for large lists of boolean values
  • Exponents are biased, according to Wikipedia, because “exponents have to be signed values in order to be able to represent both tiny and huge values, but two’s complement, the usual representation for signed values, would make comparison harder”
  • The book cautions about the subtleties of representing decimals as floating point numbers. One alternative in the documentation is “scaling up so the monetary value is an integer — for example, multiplying by 100 if the value is denominated in cents or multiplying by 1000 if the value is denominated in mills — and then storing that scaled value in an integer type” (see the sketch after this list)
  • The book advises to use the type double unless there is a particular reason to use float
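
A quick sketch of the pitfall and of the scaled-integer alternative (my own example, not the book’s):

```java
public class Money {
    public static void main(String[] args) {
        // Binary floating point cannot represent most decimal fractions exactly.
        System.out.println(0.1 + 0.2);        // 0.30000000000000004
        System.out.println(0.1 + 0.2 == 0.3); // false

        // Scaling to a whole number of cents sidesteps the problem.
        long cents = 10 + 20; // $0.10 + $0.20, denominated in cents
        System.out.printf("$%d.%02d%n", cents / 100, cents % 100); // $0.30
    }
}
```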

6. Encapsulation and information hiding

  • Abstraction is defined in Wikipedia as “the process of generalizing concrete details, such as attributes, away from the study of objects and systems to focus attention on details of greater importance”
  • Coupling is defined in Wikipedia as “the degree of interdependence between software modules, a measure of how closely connected two routines or modules are, and the strength of the relationships between modules”
  • Separation of concerns is defined in Wikipedia as “a design principle for separating a computer program into distinct sections. Each section addresses a separate concern, a set of information that affects the code of a computer program”
  • Information hiding is defined in Wikipedia as “the principle of segregation of the design decisions in a computer program that are most likely to change, thus protecting other parts of the program from extensive modification if the design decision is changed”
  • Encapsulation is defined in Wikipedia as “the bundling of data with the mechanisms or methods that operate on the data”
  • An abstract data type is defined in Wikipedia as “a mathematical model for data types, defined by its behavior (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations”
  • The BigInteger class represents “immutable arbitrary-precision integers” (see the sketch after this list)
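
A brief sketch of that immutability (my own example): every operation returns a new BigInteger and never mutates the receiver.

```java
import java.math.BigInteger;

public class Big {
    public static void main(String[] args) {
        BigInteger two = BigInteger.valueOf(2);
        BigInteger big = two.pow(100);      // arbitrary precision: 2^100
        System.out.println(big);            // 1267650600228229401496703205376
        System.out.println(big.testBit(0)); // false: 2^100 is even
        System.out.println(two);            // still 2: the receiver is unchanged
    }
}
```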

7. Interfaces and subtyping

  • The book points out that downcasting can be used to get around information hiding – avoid doing this
  • Structural subtyping is defined in Wikipedia as “the structure of two types determines whether or not one is a subtype of the other”, and nominal subtyping as “only types declared in a certain way may be subtypes of each other”
  • Behavioral subtyping is defined in Wikipedia as “the principle that subclasses should satisfy the expectations of clients accessing subclass objects through references of superclass type, not just as regards syntactic safety (such as the absence of “method-not-found” errors) but also as regards behavioral correctness”
  • The Java new operator requires the class of the created object to be explicitly specified at compile time. Factory methods allow the object creation process to be polymorphic, and the class of the created object to be specified elsewhere, at runtime (see the sketch after this list).
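
A minimal factory-method sketch (the Shape hierarchy and makeShape are hypothetical, not from the book):

```java
interface Shape { double area(); }

record Circle(double radius) implements Shape {
    public double area() { return Math.PI * radius * radius; }
}

record Square(double side) implements Shape {
    public double area() { return side * side; }
}

public class Shapes {
    // The class of the created object is chosen at run time from data;
    // a bare `new` expression cannot do this.
    static Shape makeShape(String kind, double size) {
        return switch (kind) {
            case "circle" -> new Circle(size);
            case "square" -> new Square(size);
            default -> throw new IllegalArgumentException(kind);
        };
    }

    public static void main(String[] args) {
        System.out.println(makeShape("circle", 1).area()); // 3.141592653589793
    }
}
```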

8. Designing and documenting interfaces

  • The book points out that there are disadvantages to making interfaces too wide and too narrow, but in practice, most problematic interfaces are too wide
  • According to the Java documentation, abstract classes “can declare fields that are not static and final, and define public, protected, and private concrete methods”, while interfaces require that “all fields are automatically public, static, and final, and all methods that you declare or define (as default methods) are public”
  • The principle of least surprise is defined in Wikipedia as “a component of a system should behave in a way that most users will expect it to behave, and therefore not astonish or surprise users”
  • Command-query separation is defined in Wikipedia as “every method should either be a command that performs an action, or a query that returns data to the caller, but not both”
  • The book advises that it may be desirable to inherit the Object.equals method for reference equality if the class is mutable, in order to represent Leibniz equality rather than state equality. It’s desirable to implement a state-equality equals method for immutable classes, as it will prevent a client from differentiating different implementations
  • The book advises treating null values passed in as arguments to a method, or returned from a method, as exceptional conditions that must be documented if possible
  • The book advises being aware of a potential lack of symmetry between s.equals(t) and t.equals(s) if S is a subclass of T, or in other similar situations (see the sketch after this list)
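
A sketch of that asymmetry using the classic Point / ColorPoint example (hypothetical classes; each equals uses its own instanceof check):

```java
class Point {
    final int x;
    Point(int x) { this.x = x; }
    @Override public boolean equals(Object o) {
        return o instanceof Point p && p.x == x;
    }
    @Override public int hashCode() { return x; }
}

class ColorPoint extends Point {
    final String color;
    ColorPoint(int x, String color) { super(x); this.color = color; }
    @Override public boolean equals(Object o) {
        return o instanceof ColorPoint p && p.x == x && p.color.equals(color);
    }
    @Override public int hashCode() { return 31 * x + color.hashCode(); }
}

public class Asymmetry {
    public static void main(String[] args) {
        Point t = new Point(1);
        ColorPoint s = new ColorPoint(1, "red");
        System.out.println(t.equals(s)); // true: s is a Point with equal x
        System.out.println(s.equals(t)); // false: t is not a ColorPoint
    }
}
```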

9. Exceptions

  • The purpose of the finally block is to perform cleanup even if control does not return to the code directly after the try-catch statement (the catch block returns, or continues or breaks out of a loop, or the exception is uncaught, or the catch block throws another exception…)
  • The unchecked exception classes are defined in the Java documentation as “the run-time exception classes and the error classes”, and the checked as “all exception classes other than the unchecked exception classes”. The book contrasts unchecked exceptions as programmer errors with checked exceptions as unusual conditions. If the unusual condition should never occur, the book advises catching it and then throwing an unchecked exception like Error
  • A try-with-resources statement is defined in the Java documentation as “a try statement that declares one or more resources…ensures that each resource is closed at the end of the statement” (see the sketch after this list)
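
A minimal try-with-resources sketch (the file name is hypothetical); the reader is closed whether the body returns normally or throws:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Resources {
    static String firstLine(String path) throws IOException {
        // The declared resource is closed automatically at the end of
        // the statement, as if by a finally block calling in.close().
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLine("notes.txt")); // hypothetical file
    }
}
```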

10. Inheritance and the specialization interface

  • The book provides a Java inheritance example that forks a game implementation, changes it, then merges changes from the original, which sounds very similar to a source control story. I don’t know what insights to take from that yet, but it’s interesting
  • Dynamic dispatch is defined in Wikipedia as “the process of selecting which implementation of a polymorphic operation (method or function) to call at run time” (see the sketch after this list)
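
A minimal dynamic-dispatch sketch (hypothetical classes): the implementation that runs is chosen by the object’s run-time class, not the variable’s static type.

```java
class Animal {
    String speak() { return "..."; }
}

class Dog extends Animal {
    @Override String speak() { return "woof"; }
}

public class Dispatch {
    public static void main(String[] args) {
        Animal a = new Dog();          // static type Animal, dynamic type Dog
        System.out.println(a.speak()); // "woof": resolved at run time
    }
}
```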

11. Modular design and implementation

  • The book advises to assume by default that fields cannot be null, and note in the specifications if they can be null
  • Behavioral subtyping is defined by Wikipedia (also quoted under chapter 7 above) as “the principle that subclasses should satisfy the expectations of clients accessing subclass objects through references of superclass type, not just as regards syntactic safety (such as the absence of “method-not-found” errors) but also as regards behavioral correctness”
  • The book advises to use top-down design on the parts where you’re worried about matching client expectations and overall structure, and bottom-up on the parts where you’re worried about feasibility and performance

12. Testing

  • Continuous testing is defined in Wikipedia as “the process of executing automated tests as part of the software delivery pipeline to obtain immediate feedback on the business risks associated with a software release candidate”
  • Regression testing is defined in Wikipedia as “re-running functional and non-functional tests to ensure that previously developed and tested software still performs as expected after a change”
  • Black-box testing is defined in Wikipedia as “a method of software testing that examines the functionality of an application without peering into its internal structures or workings”. It’s useful for helping design the specification in the first place
  • Test coverage is defined in Wikipedia as “the percentage of software requirements that are tested by black-box testing for a system or application”
  • Glass-box testing is defined in Wikipedia as “a method of software testing that tests internal structures or workings of an application, as opposed to its functionality”
  • Code / test coverage is defined in Wikipedia as “a percentage measure of the degree to which the source code of a program is executed when a particular test suite is run”
  • Fuzzing is defined in Wikipedia as “an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks…. Typically, a fuzzer is considered more effective if it achieves a higher degree of code coverage…. A white-box fuzzer leverages program analysis to systematically increase code coverage or to reach certain critical program locations… A gray-box fuzzer leverages instrumentation rather than program analysis to glean information about the program”
  • Reference implementation is defined in Wikipedia as “a program that implements all requirements from a corresponding specification…. A reference implementation may or may not be production quality”
  • The book alleges that if there is a bug, there is usually a small counterexample that exposes it. So it may be worthwhile searching shallowly but comprehensively, rather than deeply but nonexhaustively
  • The book points out that formal proofs of code correctness are an area of future research
  • Symbolic execution is defined in Wikipedia as “a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for inputs rather than obtaining actual inputs as normal execution of the program would. It thus arrives at expressions in terms of those symbols for expressions and variables in the program, and constraints in terms of those symbols for the possible outcomes of each conditional branch”

13. Recursion

  • Recursive data type is defined in Wikipedia as “a data type for values that may contain other values of the same type”
  • Sentinel node is defined in Wikipedia as “a specifically designated node used with linked lists and trees as a traversal path terminator”
  • The book presents the canonical list iteration: while (n != null) { /* do something */ n = n.next; }. A for loop version is also presented, but I prefer not to use for loops when the length is unspecified
  • Tail call is defined in Wikipedia as “a subroutine call performed as the final action of a procedure. If the target of a tail is the same subroutine, the subroutine is said to be tail recursive…. Tail calls can be implemented without adding a new stack frame to the call stack…. Tail recursion can be related to the while statement, an explicit iteration”
  • The book points out that Java does not implement tail call optimization. I believe one reason is that the Java security mechanism does stack inspection (see the sketch after this list)
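
A sketch contrasting a tail-recursive list traversal with the while loop it corresponds to (Node is a hypothetical cell class; in Java the recursive version still grows the stack):

```java
class Node {
    int value;
    Node next;
    Node(int value, Node next) { this.value = value; this.next = next; }
}

public class Lists {
    // Tail recursive: the recursive call is the final action, but Java
    // allocates a stack frame anyway, so long lists risk StackOverflowError.
    static int length(Node n, int acc) {
        if (n == null) return acc;
        return length(n.next, acc + 1);
    }

    // The mechanical translation to iteration, safe for any length.
    static int lengthIterative(Node n) {
        int acc = 0;
        while (n != null) { acc++; n = n.next; }
        return acc;
    }

    public static void main(String[] args) {
        Node list = new Node(1, new Node(2, new Node(3, null)));
        System.out.println(length(list, 0));       // 3
        System.out.println(lengthIterative(list)); // 3
    }
}
```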

14. Linked Lists

  • The Java documentation describes List implementations as follows: “ArrayList - Resizable array implementation of the List interface (an unsynchronized Vector). The best all-around implementation of the List interface. ArrayDeque - Efficient, resizable array implementation of the Deque interface. LinkedList - Doubly-linked list implementation of the List interface. Provides better performance than the ArrayList implementation if elements are frequently inserted or deleted within the list. Also implements the Deque interface. When accessed through the Queue interface, LinkedList acts as a FIFO queue” (see the sketch after this list)
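
A small usage sketch (my own): both classes serve as FIFO queues through the Queue interface.

```java
import java.util.ArrayDeque;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

public class Queues {
    public static void main(String[] args) {
        // ArrayDeque is usually the faster queue; LinkedList also
        // offers List operations such as insertion in the middle.
        Queue<String> fast = new ArrayDeque<>();
        Queue<String> linked = new LinkedList<>();
        for (Queue<String> q : List.of(fast, linked)) {
            q.add("a");
            q.add("b");
            System.out.println(q.remove()); // "a": first in, first out
        }
    }
}
```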

15. Parametric polymorphism (generics)

  • Subtype polymorphism is defined in Wikipedia as “a form of type polymorphism. A subtype is a datatype that is related to another datatype (the supertype) by some notion of substitutability, meaning that program elements (typically subroutines or functions), written to operate on elements of the supertype, can also operate on elements of the subtype”
  • Parametric polymorphism is defined in Wikipedia as “allows a single piece of code to be given a ‘generic’ type, using variables in place of actual types, and then instantiated with particular types as needed”
  • The book has a clever contrast of subtype polymorphism, where a client can use multiple implementations of an interface, with parametric polymorphism, where an interface can be used by multiple clients in different ways
  • Type inference is defined in the Java documentation as “a Java compiler’s ability to look at each method invocation and corresponding declaration to determine the type argument (or arguments) that make the invocation applicable. The inference algorithm determines the types of the arguments and, if available, the type that the result is being assigned, or returned. Finally, the inference algorithm tries to find the most specific type that works with all of the arguments”
  • The Java documentation discusses generics and subtypes: “Given two concrete types A and B…, MyClass<A> has no relationship to MyClass<B>, regardless of whether or not A and B are related. The common parent of MyClass<A> and MyClass<B> is Object”
  • Variance is defined in Wikipedia as “the category of possible relationships between more complex types and their components’ subtypes”
  • The Java documentation discusses wildcards: “An “in” variable serves up data to the code…. An “out” variable holds data for use elsewhere…. An “in” variable is defined with an upper bounded wildcard, using the extends keyword. An “out” variable is defined with a lower bounded wildcard, using the super keyword”. This is basically covariance and contravariance (see the sketch after this list)
  • Type erasure is defined in the Java documentation as “To implement generics, the Java compiler applies type erasure to: - Replace all type parameters in generic types with their bounds or Object if the type parameters are unbounded. The produced bytecode, therefore, contains only ordinary classes, interfaces, and methods…”
  • Raw type is defined in the Java documentation as “the name of a generic class or interface without any type arguments…. When using raw types, you essentially get pre-generics behavior”
  • Lambda expressions are defined in the Java documentation as “let you express instances of single-method classes more compactly”
  • Method references are defined in the Java documentation as “compact, easy-to-read lambda expressions for methods that already have a name”
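
A sketch of the “in” / “out” rule, often summarized as PECS (producer extends, consumer super); the copy method is my own example:

```java
import java.util.ArrayList;
import java.util.List;

public class Pecs {
    // src produces T values ("in"), dst consumes them ("out").
    static <T> void copy(List<? extends T> src, List<? super T> dst) {
        for (T t : src) dst.add(t);
    }

    public static void main(String[] args) {
        List<Integer> ints = List.of(1, 2, 3);
        List<Number> nums = new ArrayList<>();
        // List<Integer> is not a subtype of List<Number>, but the
        // wildcards make this copy legal anyway.
        copy(ints, nums);
        System.out.println(nums); // [1, 2, 3]
    }
}
```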

16. Asymptotic Complexity

  • L’Hôpital’s rule is defined in Wikipedia as “a mathematical theorem that allows evaluating limits of indeterminate forms using derivatives”. In certain cases, lim_(x→c) f(x)/g(x) = lim_(x→c) f′(x)/g′(x) (worked example below)
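
As a worked example (mine, not the book’s) of how this applies to comparing growth rates:

```latex
\lim_{n \to \infty} \frac{\ln n}{n}
  \stackrel{\text{L'H}}{=} \lim_{n \to \infty} \frac{1/n}{1} = 0
```

so ln n grows strictly more slowly than n, which is one way to justify ranking O(log n) below O(n).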

17. Trees

  • Branching factor is defined in Wikipedia as “number of children at each node, the outdegree”
  • Height is defined in Wikipedia as “length of the longest downward path to a leaf from that node. The height of the root is the height of the tree”, while depth is “length of the path to its root”
  • Binary search tree is defined in Wikipedia as “rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node’s left subtree and less than the ones in its right subtree”
  • The Java documentation specifies the difference between the Comparable interface: “imposes a total ordering on the objects of each class that implements it. This ordering is referred to as the class’s natural ordering, and the class’s compareTo method is referred to as its natural comparison method. Lists (and arrays) of objects that implement this interface can be sorted automatically by Collections.sort (and Arrays.sort)” and the Comparator interface: “a comparison function, which imposes a total ordering on some collection of objects. Comparators can be passed to a sort method (such as Collections.sort or Arrays.sort) to allow precise control over the sort order”
  • The book explains that many tree operations are expressed naturally as recursive functions
  • In tree algorithms, there are usually variations in whether null is checked for before making a recursive call on a child, or at the start of each recursive call
  • TreeMap is defined in the Java documentation as “Red-Black tree based NavigableMap implementation. The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used”
  • Randomized algorithm is defined in Wikipedia as “typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the “average case” over all possible choices of random determined by the random bits; thus either the running time, or the output (or both) are random variables”
  • Fisher–Yates shuffle is defined in Wikipedia as “takes a list of all the elements of the sequence, and continually determines the next element in the shuffled sequence by randomly drawing an element from the list until no elements remain. The algorithm produces an unbiased permutation: every permutation is equally likely”
  • The book notes that converting non-tail-recursive tree algorithms to iterative versions can use either parent pointers, if available, or an explicit stack (see the sketch after this list)
  • Tree sort is defined in Wikipedia as “builds a binary search tree from the elements to be sorted, and then traverses the tree (in-order) so that the elements come out in sorted order. Its typical use is sorting elements online”
  • The book explains that range queries can be done by performing an in-order traversal that skips the unneeded parts of the tree
  • Tree traversals are contrasted on Wikipedia: “pre-order traversal is a topologically sorted one, because a parent node is processed before any of its child nodes is done…. Post-order traversal can be useful to get postfix expression of a binary expression tree…. in-order traversal retrieves the keys in ascending sorted order”
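
A sketch of the explicit-stack conversion for in-order traversal (a hypothetical Node class, not the book’s code):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class InOrder {
    static class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key; this.left = left; this.right = right;
        }
    }

    // In-order traversal without recursion: the stack plays the role
    // of the call stack in the recursive version.
    static void inOrder(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        Node n = root;
        while (n != null || !stack.isEmpty()) {
            while (n != null) { stack.push(n); n = n.left; } // descend left
            n = stack.pop();
            System.out.println(n.key);                       // visit node
            n = n.right;                                     // then right subtree
        }
    }

    public static void main(String[] args) {
        // BST with keys 1, 2, 3; prints them in ascending order.
        inOrder(new Node(2, new Node(1, null, null), new Node(3, null, null)));
    }
}
```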

18. Grammars and parsing

  • Parse tree is defined in Wikipedia as “ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar”
  • Syntax error is defined in Wikipedia as “error in the syntax of a sequence of characters that is intended to be written in a particular programming language”
  • Formal grammar is defined in Wikipedia as “describes which strings from an alphabet of a formal language are valid according to the language’s syntax”
  • Context-free grammar is defined in Wikipedia as “formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context”. For example, aⁿbⁿ is context-free, but aⁿbⁿcⁿ is not
  • Symbols are defined in Wikipedia as “lexical elements used in specifying the production rules constituting a formal grammar”
  • Lexical tokenization is defined by Wikipedia as “conversion of a text into (semantically or syntactically) meaningful lexical tokens belonging to categories defined by a ‘lexer’ program”
  • Production is defined in Wikipedia as “rewrite rule specifying a symbol substitution that can be recursively performed to generate new symbol sequences”
  • Formal language is defined in Wikipedia as “words whose letters are taken from an alphabet and are well-formed according to a specific set of rules called a formal grammar”
  • Recursive grammar is defined in Wikipedia as “production rules that are recursive, meaning that expanding a non-terminal according to these rules can eventually lead to a string that includes the same non-terminal again”
  • Ambiguous grammar is defined in Wikipedia as “context-free grammar for which there exists a string that can have more than one leftmost derivation or parse tree”
  • Recursive-descent parser is defined in Wikipedia as “top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent) where each such procedure implements one of the nonterminals of the grammar”, and predictive parser as “a recursive descent parser that does not require backtracking” (see the sketch after this list)
  • Abstract syntax tree is defined in Wikipedia as “tree representation of the abstract syntactic structure of text (often source code) written in a formal language…. The syntax is ‘abstract’ in the sense that it does not represent every detail appearing in the real syntax, but rather just the structural or content-related details”
  • Left recursion is defined in Wikipedia as “special case of recursion where a string is recognized as part of a language by the fact that it decomposes into a string from that same language (on the left) and a suffix (on the right)…. Left recursion often poses problems for parsers, either because it leads them into infinite recursion (as in the case of most top-down parsers) or because they expect rules in a normal form that forbids it (as in the case of many bottom-up parsers). Therefore, a grammar is often preprocessed to eliminate the left recursion”
  • Metasyntax is defined in Wikipedia as “syntax used to define the syntax of a programming language or formal language. It describes the allowable structure and composition of phrases and sentences of a metalanguage, which is used to describe either a natural language or a computer programming language”
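
A minimal recursive-descent sketch (my own) for the toy grammar expr → term ('+' term)*, term → DIGIT, with one method per nonterminal; a single character of lookahead decides every choice, so it is also a predictive parser:

```java
public class Parser {
    private final String input;
    private int pos = 0;

    Parser(String input) { this.input = input; }

    // expr -> term ('+' term)*
    int expr() {
        int value = term();
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;           // consume '+'
            value += term();
        }
        return value;
    }

    // term -> DIGIT
    int term() {
        char c = input.charAt(pos);
        if (!Character.isDigit(c)) {
            throw new IllegalArgumentException("expected digit at " + pos);
        }
        pos++;
        return c - '0';
    }

    public static void main(String[] args) {
        System.out.println(new Parser("1+2+3").expr()); // 6
    }
}
```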

19. Hash tables

  • Map is defined in Wikipedia as “abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection”
  • Partial function is defined in Wikipedia as “binary relation over two sets that associates to every element of the first set at most one element of the second set”
  • Lookup table is defined in Wikipedia as “array that replaces runtime computation of a mathematical function with a simpler array indexing operation, in a process termed as direct addressing”
  • Injective function is defined in Wikipedia as “maps distinct elements of its domain to distinct elements of its codomain”
  • Hash functions are discussed on Wikipedia: “A good hash function should map the expected inputs as evenly as possible over its output range. That is, every hash value in the output range should be generated with roughly the same probability. The reason for this last requirement is that the cost of hashing-based methods goes up sharply as the number of collisions—pairs of inputs that are mapped to the same hash value—increases”
  • Separate chaining is defined in Wikipedia as “building a linked list with key–value pair for each search array index”
  • Open addressing / closed hashing / probing is defined in Wikipedia as “searching through alternative locations in the array (the probe sequence) until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table”
  • Double hashing is defined in Wikipedia as “using a secondary hash of the key as an offset when a collision occurs”
  • Lazy deletion is defined in Wikipedia as “deletions are done by marking an element as deleted, rather than erasing it entirely. Deleted locations are treated as empty when inserting and as occupied during a search. The deleted locations are sometimes referred to as tombstones”
  • The book advises to use chaining over probing
  • Perfect hash function is defined in Wikipedia as “maps distinct elements in S to a set of m integers, with no collisions…. Perfect hash functions may be used to implement a lookup table with constant worst-case access time”
  • Simple Uniform Hashing Assumption is defined in Wikipedia as “a hypothetical hashing function will evenly distribute items into the slots of a hash table. Moreover, each item to be hashed has an equal probability of being placed into a slot, regardless of the other elements already placed”
  • Random oracle is defined in Wikipedia as “oracle (a theoretical black box) that responds to every unique query with a (truly) random response chosen uniformly from its output domain. If a query is repeated, it responds the same way every time that query is submitted…. Random oracles are typically used as an idealised replacement for cryptographic hash functions in schemes where strong randomness assumptions are needed of the hash function’s output”
  • Diffusion is defined in Wikipedia as “diffusion is hiding the plaintext statistics by spreading it over a larger area of ciphertext…. Diffusion means that if we change a single bit of the plaintext, then about half of the bits in the ciphertext should change, and similarly, if we change one bit of the ciphertext, then about half of the plaintext bits should change”
  • java.util.HashSet is described in the Java documentation as “Hash table implementation of the Set interface. The best all-around implementation of the Set interface” and java.util.HashMap as “Hash table implementation of the Map interface (an unsynchronized Hashtable that supports null keys and values). The best all-around implementation of the Map interface”
  • Object.hashCode is described in the Java documentation as “Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified” [this is important for mutable objects]; “If two objects are equal according to the equals method, then calling the hashCode method on each of the two objects must produce the same integer result” [without this requirement, collision resolution won’t work]; “It is not required that if two objects are unequal according to the equals method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables” [for collision avoidance of similar objects]; and “As far as is reasonably practical, the hashCode method defined by class Object returns distinct integers for distinct objects”
  • The book recommends keeping Object.hashCode for mutable objects, because they’re not really equal unless they’re the same object since they can change
  • Cyclic redundancy check is defined in Wikipedia as “an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to digital data”
  • Cryptographic hash function is described in Wikipedia as having the following property: “finding any pair of different messages that yield the same hash value (a collision) is also infeasible”
  • java.security.MessageDigest is described in the Java documentation as “provides applications the functionality of a message digest algorithm, such as SHA-1 or SHA-256. Message digests are secure one-way hash functions that take arbitrary-sized data and output a fixed-length hash value”
  • java.security.SecureRandom is described in the Java documentation as “provides a cryptographically strong random number generator (RNG)”
  • The book recommends taking the String.hashCode of the result of your object’s toString method, if the toString method has the properties described above in the Object.hashCode section
  • java.util.Arrays.hashCode is described in the Java documentation as “returns a hash code based on the contents of the specified array. If the array contains other arrays as elements, the hash code is based on their identities rather than their contents”
  • java.util.Objects.hash is described in the Java [documentation](https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/util/Objects.html#hashCode(java.lang.Object)) as “Returns a hash code for a sequence of input values. The hash code is generated as if all the input values were placed into an array, and that array were hashed by calling Arrays.hashCode(Object[]). This method is useful for implementing Object.hashCode() on objects containing multiple fields” (see the sketch after this list)
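
A sketch of an equals / hashCode pair for an immutable class (my own example), hashing exactly the fields that equals compares, via Objects.hash:

```java
import java.util.Objects;

public final class Name {
    private final String first, last;

    public Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override public boolean equals(Object o) {
        return o instanceof Name n
            && first.equals(n.first) && last.equals(n.last);
    }

    // Equal objects must produce equal hash codes, so hash the same
    // fields that equals compares.
    @Override public int hashCode() { return Objects.hash(first, last); }

    public static void main(String[] args) {
        Name a = new Name("Ada", "Lovelace");
        Name b = new Name("Ada", "Lovelace");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true, as required
    }
}
```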

20. Loop invariants

  • Loop invariant is defined in Wikipedia as “property of a program loop that is true before (and after) each iteration…. Because of the similarity of loops and recursive programs, proving partial correctness of loops with invariants is very similar to proving the correctness of recursive programs via induction. In fact, the loop invariant is often the same as the inductive hypothesis to be proved for a recursive program equivalent to a given loop”
  • If binary search needs to find the first instance of an item, it iterates on the left half (including the midpoint) when it finds the item. If binary search needs to find any instance of an item, it stops immediately if it finds the item at the midpoint (see the sketch after this list)
  • The book advises writing down what the loop invariant is, and using assertions to check it if possible
  • The book points out that postcondition = loop-invariant && !loop-condition
  • Partial correctness is defined in Wikipedia as “if an answer is returned it will be correct”, while total correctness “additionally requires that an answer is eventually returned, i.e. the algorithm terminates”
  • Loop variant is defined in Wikipedia as “mathematical function defined on the state space of a computer program whose value is monotonically decreased with respect to a (strict) well-founded relation by the iteration of a while loop under some invariant conditions, thereby ensuring its termination”
  • The book describes a total correctness proof with four steps: establishment / initialization (the loop invariant is true before entering the loop), postcondition (if the loop invariant is true and the loop condition is false, then the postcondition is true), preservation / maintenance (if the loop invariant and loop condition are true when entering the loop body, then the loop invariant is true when exiting the loop body), and termination (a loop variant decreases on each iteration, until the loop guard is false)
  • The book advises that loop invariants that are too weak usually fail the postcondition proof, while loop invariants that are too strong usually fail the preservation proof
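
A sketch of the find-the-first-instance binary search (my own code) with the invariant and variant written out as comments:

```java
public class FirstIndex {
    // Returns the first index of key in sorted array a, or -1.
    // Invariant: a[0..lo) < key and a[hi..a.length) >= key.
    static int firstIndex(int[] a, int key) {
        int lo = 0, hi = a.length;
        while (lo < hi) {                 // variant: hi - lo decreases each iteration
            int mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
            if (a[mid] < key) lo = mid + 1;
            else hi = mid;                // keep mid: it may be the first key
        }
        // Postcondition = invariant && !(lo < hi): lo is the first
        // position holding a value >= key.
        return (lo < a.length && a[lo] == key) ? lo : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIndex(new int[]{1, 2, 2, 3}, 2)); // 1
    }
}
```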

21. Sorting

  • java.util.Comparator<T> is defined in the Java documentation as “comparison function, which imposes a total ordering on some collection of objects”. A lambda expression can be used to implement the compare function. There are also useful static helper methods that create and combine Comparators (see the first sketch after this list)
  • The book has an interesting way of defining a sort: a sorted list of values was permuted, and a sorting algorithm must undo that permutation
  • Insertion sort is defined in Wikipedia as “at each iteration, insertion sort removes one element from the input data, finds the location it belongs within the sorted list, and inserts it there. It repeats until no input elements remain. Sorting is typically done in-place, by iterating up the array, growing the sorted list behind it… While some divide-and-conquer algorithms such as quicksort and mergesort outperform insertion sort for larger arrays, non-recursive sorting algorithms such as insertion sort or selection sort are generally faster for very small arrays (the exact size varies by environment and implementation, but is typically between 7 and 50 elements). Therefore, a useful optimization in the implementation of those algorithms is a hybrid approach, using the simpler algorithm when the array has been divided to a small size”
  • Stable is defined in Wikipedia as “sort equal elements in the same order that they appear in the input”
  • The book points out that spreadsheet sorting by multiple columns in sequence relies on stable sorting
  • Selection sort is defined in Wikipedia as “proceeds by finding the smallest (or largest, depending on sorting order) element in the unsorted sublist, exchanging (swapping) it with the leftmost unsorted element (putting it in sorted order), and moving the sublist boundaries one element to the right…. While selection sort is preferable to insertion sort in terms of number of writes (n−1 swaps versus up to n(n−1)/2 swaps, with each swap being two writes)… This can be important if writes are significantly more expensive than reads, such as with EEPROM or Flash memory, where every write lessens the lifespan of the memory”. The book points out that it is not stable because of the swaps
  • Decision tree is defined in Wikipedia as “decision support recursive partitioning structure that uses a tree-like model of decisions and their possible consequences… It is one way to display an algorithm that only contains conditional control statements”
  • Stirling’s formula is defined in Wikipedia as ln(n!) = n ln n - n + O(ln n)
  • The book points out that generic (comparison-based) sorting algorithms cannot be faster than n lg n. This is not to say that sorting algorithms that use information about the structure of the problem, like counting sort, cannot be faster
  • Divide and conquer is defined in Wikipedia as “recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem”
  • Merge sort is defined in Wikipedia as “Divide the unsorted list into n sub-lists, each containing one element (a list of one element is considered sorted). Repeatedly merge sublists to produce new sorted sublists until there is only one sublist remaining”
  • The book’s merge sort algorithm uses an optimization: values remaining in the right subproblem never need to be copied into the temp array, since they can stay in place in the original array. There are a lot of ways to get off-by-one errors in a merge algorithm, so you have to be particularly careful (see the merge sketch after this list)
  • System.arraycopy is described in the Java documentation as “Copies an array from the specified source array, beginning at the specified position, to the specified position of the destination array… The number of components copied is equal to the length argument”
  • java.util.Arrays.sort for Objects is defined in the Java documentation as “stable, adaptive, iterative mergesort that requires far fewer than n lg(n) comparisons when the input array is partially sorted, while offering the performance of a traditional mergesort when the input array is randomly ordered…. The implementation was adapted from Tim Peters’s list sort for Python”
  • java.util.Arrays.parallelSort for Objects is described in the Java documentation as “parallel sort-merge that breaks the array into sub-arrays that are themselves sorted and then merged. When the sub-array length reaches a minimum granularity, the sub-array is sorted using the appropriate Arrays.sort method”
  • Quicksort is defined in Wikipedia as “selecting a “pivot” element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot… The sub-arrays are then sorted recursively. This can be done in-place, requiring small additional amounts of memory to perform the sorting…. Since at least one instance of the pivot is present, most partition routines ensure that the value that ends up at the point of division is equal to the pivot, and is now in its final position (but termination of quicksort does not depend on this, as long as sub-ranges strictly smaller than the original are produced)”. Note that values equal to the pivot can end up on either side
  • The book uses a variant of the Hoare partition scheme, which is described in Wikipedia as “two pointers (indices into the range) that start at both ends of the array being partitioned, then move toward each other, until they detect an inversion: a pair of elements, one greater than the pivot at the first pointer, and one less than the pivot at the second pointer; if at this point the first pointer is still before the second, these elements are in the wrong order relative to each other, and they are then exchanged…. While there is no reason to exchange elements equal to the pivot, this change allows tests on the pointers themselves to be omitted, which are otherwise needed to ensure they do not run out of range”
  • Lomuto partition scheme is defined in Wikipedia as “[In] most formulations this scheme chooses as the pivot the last element in the array. The algorithm maintains index i as it scans the array using another index j such that the elements at lo through i-1 (inclusive) are less than the pivot, and the elements at i through j (inclusive) are equal to or greater than the pivot. As this scheme is more compact and easy to understand, it is frequently used in introductory material, although it is less efficient than Hoare’s original scheme e.g., when all elements are equal”
  • java.util.Arrays.sort and parallelSort for primitive types are described in the Java documentation as “Dual-Pivot Quicksort…offers O(n log(n)) performance on all data sets, and is typically faster than traditional (one-pivot) Quicksort implementations”. A lot of online guides claim sort and parallelSort are different for primitive types, but I don’t believe that is the case
  • kth order statistic of a statistical sample is defined in Wikipedia as “its kth-smallest value”
  • Quickselect is defined in Wikipedia as “uses the same overall approach as quicksort, choosing one element as a pivot and partitioning the data in two based on the pivot, accordingly as less than or greater than the pivot. However, instead of recursing into both sides, as in quicksort, quickselect only recurses into one side – the side with the element it is searching for. This reduces the average complexity from O(n log n) to O(n), with a worst case of O(n²)…. It is also an in-place algorithm, requiring only constant memory overhead if tail call optimization is available, or if eliminating the tail recursion with a loop”
  • The book’s algorithm doesn’t check for n == k because of its partitioning scheme. This may be an allowed optimization with certain partitioning schemes.
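
A small sketch of those Comparator helpers (my own example; Person is hypothetical):

```java
import java.util.Arrays;
import java.util.Comparator;

public class Sorting {
    record Person(String last, String first) {}

    public static void main(String[] args) {
        Person[] people = {
            new Person("Gries", "Paul"),
            new Person("Myers", "Andrew"),
            new Person("Gries", "David"),
        };
        // Static helpers build and combine Comparators: order by last
        // name, breaking ties by first name.
        Arrays.sort(people, Comparator.comparing(Person::last)
                                      .thenComparing(Person::first));
        System.out.println(Arrays.toString(people));
        // [Person[last=Gries, first=David], Person[last=Gries, first=Paul],
        //  Person[last=Myers, first=Andrew]]
    }
}
```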
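
And a sketch of the merge optimization described above (my own code, assuming int arrays): only the left run is copied into the temp array, so any values remaining in the right run are already in their final place.

```java
import java.util.Arrays;

public class Merge {
    // Merges the sorted runs a[lo..mid) and a[mid..hi).
    static void merge(int[] a, int lo, int mid, int hi, int[] tmp) {
        int n = mid - lo;
        System.arraycopy(a, lo, tmp, 0, n); // copy only the left run
        int i = 0, j = mid, k = lo;
        while (i < n && j < hi) {
            a[k++] = (tmp[i] <= a[j]) ? tmp[i++] : a[j++]; // <= keeps it stable
        }
        while (i < n) a[k++] = tmp[i++]; // drain the copied left run;
                                         // right-run leftovers are in place
    }

    public static void main(String[] args) {
        int[] a = {1, 4, 7, 2, 3, 9};
        merge(a, 0, 3, 6, new int[3]);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 7, 9]
    }
}
```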

22. Graphs and graph representations

  • Graph is defined in Wikipedia as “graph is a structure consisting of a set of objects where some pairs of the objects are in some sense “related”. The objects are represented by abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge… The edges may be directed or undirected…. A weighted graph or a network is a graph in which a number (the weight) is assigned to each edge…. A binary relation R on a set X defines a directed graph. An element x of X is a direct predecessor of an element y of X if and only if xRy”
  • Garbage collection is defined by Wikipedia as “form of automatic memory management. The garbage collector attempts to reclaim memory that was allocated by the program, but is no longer referenced”
  • Reachability is defined in Wikipedia as “ability to get from one vertex to another within a graph”
  • The book points out that many variants of graph problems are NP-complete
  • Vertex is defined in Wikipedia as “vertex (plural vertices) or node is the fundamental unit of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges (unordered pairs of vertices), while a directed graph consists of a set of vertices and a set of arcs (ordered pairs of vertices)…. The degree of a vertex, denoted 𝛿(v) in a graph is the number of edges incident to it… In a directed graph, one can distinguish the outdegree (number of outgoing edges), denoted 𝛿+(v), from the indegree (number of incoming edges), denoted 𝛿−(v); a source vertex is a vertex with indegree zero, while a sink vertex is a vertex with outdegree zero”
  • Adjacency matrix is defined in Wikipedia as “a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph”
  • Dense graph is defined in Wikipedia as “graph in which the number of edges is close to the maximal number of edges (where every pair of vertices is connected by one edge). The opposite, a graph with only a few edges, is a sparse graph”
  • Adjacency list is defined in Wikipedia as “each unordered list… describes the set of neighbors of a particular vertex in the graph”
  • Transpose is defined in Wikipedia as “another directed graph on the same set of vertices with all of the edges reversed compared to the orientation of the corresponding edges in G”. The book adds the term “dual” to describe this graph, but I don’t think it’s commonly used. The book advises that storing transpose graphs enables quickly finding the nodes that point to a given node (see the sketch after this list)
  • The book advises that there are ways to speed up certain operations by using hash tables in the implementation of adjacency lists. Hash tables can also be used for similar reasons in implementations of adjacency matrices
  • Path is defined in Wikipedia as “finite or infinite sequence of edges which joins a sequence of vertices which, by most definitions, are all distinct (and since the vertices are distinct, so are the edges)…. A walk is a finite or infinite sequence of edges which joins a sequence of vertices…. A path is a trail in which all vertices (and therefore also all edges) are distinct…. Some authors do not require that all vertices of a path be distinct and instead use the term simple path to refer to such a path where all vertices are distinct”
  • Cycle is defined in Wikipedia as “non-empty trail in which only the first and last vertices are equal”
  • Directed acyclic graph is defined in Wikipedia as “directed graph with no directed cycles”
  • Topological sort is defined in Wikipedia as “linear ordering of its vertices such that for every directed edge (u,v) from vertex u to vertex v, u comes before v in the ordering…. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering”
  • Total order is defined in Wikipedia as “partial order [reflexive, antisymmetric, and transitive] in which any two elements are comparable”
  • Hamiltonian path is defined in Wikipedia as “path in an undirected or directed graph that visits each vertex exactly once. A Hamiltonian cycle (or Hamiltonian circuit) is a cycle that visits each vertex exactly once. A Hamiltonian path that starts and ends at adjacent vertices can be completed by adding one more edge to form a Hamiltonian cycle, and removing any edge from a Hamiltonian cycle produces a Hamiltonian path. The computational problems of determining whether such paths and cycles exist in graphs are NP-complete”
  • Travelling salesman problem is defined in Wikipedia as “‘Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?’ It is an NP-hard problem…. Even though the problem is computationally difficult, many heuristics and exact algorithms are known…. An equivalent formulation in terms of graph theory is: Given a complete weighted graph (where the vertices would represent the cities, the edges would represent the roads, and the weights would be the cost or distance of that road), find a Hamiltonian cycle with the least weight”
  • Graph coloring is defined in Wikipedia as “methodic assignment of labels traditionally called “colors” to elements of a graph…. A coloring using at most k colors is called a (proper) k-coloring. The smallest number of colors needed to color a graph G is called its chromatic number…. Graph coloring is computationally hard…. Vertex coloring models to a number of scheduling problems”
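
A sketch of an adjacency-list representation and its transpose (my own example): in the result, rev.get(v) lists exactly the vertices that point at v.

```java
import java.util.ArrayList;
import java.util.List;

public class Graphs {
    // adj.get(u) lists the successors of vertex u.
    static List<List<Integer>> transpose(List<List<Integer>> adj) {
        int n = adj.size();
        List<List<Integer>> rev = new ArrayList<>();
        for (int v = 0; v < n; v++) rev.add(new ArrayList<>());
        for (int u = 0; u < n; u++) {
            for (int v : adj.get(u)) {
                rev.get(v).add(u); // edge u -> v becomes v -> u
            }
        }
        return rev;
    }

    public static void main(String[] args) {
        // Edges: 0 -> 1, 0 -> 2, 1 -> 2
        List<List<Integer>> adj = List.of(List.of(1, 2), List.of(2), List.of());
        System.out.println(transpose(adj)); // [[], [0], [0, 1]]
    }
}
```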

23. Graph traversals