## Colorado State University — Pueblo, Fall 2018 Math 345, Algorithms and Data Structures Course Schedule & Homework Assignments

Here is a link back to the course syllabus/policy page.

This schedule will change frequently; please check it at least every class day, and before starting work on any assignment (in case the content of the assignment has changed).

If you see the symbol below, it means that class was videoed and you can get a link by e-mailing me. Note that if you know ahead of time that you will miss a class, you should tell me and I will be sure to video that day for you.

#### Week 1

• :
• A lot of bureaucracy and introductions.
• Some content we discussed:
• Ideas of what an algorithm is
• Mention of the origin of the word algorithm in the name of the great Persian scholar Al-Khwarizmi, mathematician and astronomer at the House of Wisdom in the early 9th century, located in what is now Baghdad. [One of the words, al-jabr, in the title of Al-Khwarizmi's book on solving linear and quadratic equations also gave rise to the modern term algebra.]
• An example of a very naïve algorithm to scan through a list of courses in the university, with the number of students enrolled in each, to compute the average class size for each department on campus, described in pseudocode:
Input:
• A list $\Cc$ of classes, each entry saying to which department the class belongs and how many students are registered.
• A list $\Dd$ of all departments at the university.
Output:
• A list $\Aa$ consisting of pairs: a department name and the average number of students in that department's classes.
Naïve algorithm:
• set $\Aa$ to be the empty list
• for each d in $\Dd$:
• total_number_students = 0
• number_classes = 0
• for each c in $\Cc$:
• if c.department == d:
• total_number_students += c.students
• number_classes += 1
• append to $\Aa$ the pair [d, total_number_students/number_classes]
[Note: the notation x += y is shorthand for x = x + y .]
• Starting discussion of how long it would take for this algorithm to run, at least by counting the number of steps
• :
• Some content we discussed:
• We have a plan for the content of the week, which is:
• Different ideas of efficiency for algorithms: time complexity, in which we compute how long it will take for an algorithm to run [in some idealized sense where each simple statement, or piece thereof, in a computer language is thought of as taking a certain unit of time; note this is not completely realistic, since some operations actually take much more clock time than others], and space complexity, which measures the amount of computer memory an algorithm would use. We shall most frequently be concerned with time complexity in this class.
• When discussing time complexity, we usually have these considerations in mind:
• We want to know how the time complexity [=running time!] depends upon the input size $n$.
• We are interested in having an upper bound, which will tell us that the running time will be no more than a certain amount — which will be a formula in terms of $n$.
• We are also interested in the behavior of the upper bound as the input size $n$ tends to infinity. That is, we don't care so much how much time the algorithm will take for very small problem sizes. Instead we care what might happen if we have to solve the problem for a huge input size... and then even if we're running on a huge input, what if it gets even bigger? In short, we want to know the behavior in the limit as $n\to\infty$.
• When giving such a formula for an upper bound, we tend not to care about multiplicative constants in the formula. That is, $n^2$ and $\frac12n^2$ would be pretty much the same bound. The reason for this is that practical experience tells us that many implementation details, such as which computer we use, what language, and the intricacies of how the algorithm is realized in code, will change the run time by such a constant factor. Therefore, the time complexity of the underlying algorithm (and not its concrete realization in code) should be described in some way that doesn't care about multiplicative constants. This motivates the following definition (in which we should think of the $f(n)$ as the actual, complicated formula for how long the algorithm takes for input of size $n$ and $g(n)$ is some generalized measure of complexity which we are using as an upper bound).
• Definition: For two positive functions $f(n)$ and $g(n)$, we say that $f(n)=O(g(n))$, pronounced "$f(n)$ is big Oh of $g(n)$", if there exist some numbers $k$ and $C$ such that for all $n>k$ the inequality $f(n)\le C g(n)$ is true.
In other words, for all sufficiently large $n$ (those bigger than the cut-off $k$), $f(n)$ is bounded above by a constant $C$ times $g(n)$ (which constant $C$ doesn't depend upon $n$).
• Examples:
• $x+1=O(x)$; similarly, $100x+17=O(x)$.
• $x=O(x^2)$; although this is a pretty stupid bound to use for $x$, since $x=O(x)$, also.
• Common time complexities in computer science include $O(1)$, $O(n)$, $O(n^{1+\epsilon})$, $O(n^2)$, $O(2^n)$, $O(n\log n)$, ....
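As a quick check of the definition against the earlier example $100x+17=O(x)$ (a worked sketch, using one possible choice of the constants $k$ and $C$ — they are not unique):

$$100x + 17 \le 100x + x = 101x \quad \text{for all } x > 17,$$

so the definition is satisfied with $k=17$ and $C=101$.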
• Some discussion of the median in statistics. Remember, it is the middle number in a dataset, after it has been put in increasing [or decreasing] order if there are an odd number of data values; if there are an even number of data values, the median is the average of the middle two values of the dataset, again after it has been put in increasing [or decreasing] order.
• So the importance of the process of putting a dataset in numerical order appears for the first time in this course: let us start to SORT!
• A very naïve algorithm for SORT would do the following:
1. Find the minimum element in the list of values and switch it with the first element of the list.
2. Now repeat the above with the part of the list from the second element to the last, then the third to last, etc.
Here is pseudocode for that algorithm:
Input:
• A list $\Aa=\{a_1,\dots,a_n\}$ of numerical values (let's say real numbers or integers), in no particular order.
Output:
• A reordering of $\Aa$ into the list $\Bb=\left(b_1,\dots,b_n\right)$ satisfying $b_1\le\dots\le b_n$.
Naïve algorithm:
• for i from 1 to n:
• find the minimum value in $\left(a_i,\dots,a_n\right)$ — say it is at location j
• switch contents of locations i and j
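A Python transcription of this pseudocode might look like the following (a sketch only: the function name is a choice, and the indices are shifted to Python's 0-based lists):

```python
def naive_sort(a):
    """Sort the list a in place, smallest values first."""
    n = len(a)
    for i in range(n):
        # find the location j of the minimum value in a[i], ..., a[n-1]
        j = i
        for k in range(i + 1, n):
            if a[k] < a[j]:
                j = k
        # switch the contents of locations i and j
        a[i], a[j] = a[j], a[i]
    return a
```

For example, naive_sort([3, 1, 2]) gives [1, 2, 3].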
• Please send your instructor an email at jonathan@poritz.net to show that we can communicate by email. Please send it from an account you use regularly so that I will then have in my contacts a good way to get in touch with you.
• :
• Read before class any (or all, if you like) of the following for definitions of the "big Oh" notation: [In many ways, the last one is the best; others are pretty good, too, though.]
Note: In any of these sources, you don't need to look at the other, related, notations like little Oh, omega, etc.
• Some content we discussed:
• Note that $f(n)=O(1)$ means that, from a certain point onward (on the $x$-axis), the function $f(n)$ is bounded above by a constant.
• Useful Python syntax:
from random import *
allows you then to use things like the program
randint(begin,end)
which generates one integer in the range from begin to end (inclusive!). Of course, if you just wanted this one program randint, you could instead use
from random import randint
• More useful Python syntax:
[<some expression, optionally using the variable i> for i in list]
which basically runs the loop
for i in list:
<some expression, optionally using the variable i>

and puts the values into a list. Note: often, this is used with the list being range(n) for some n. For example:
[1 for i in range(10)]
makes a vector of ten 1's, while
[i**2 for i in range(10)]
makes the vector
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
and
[randint(0,5) for i in range(3)]
makes a vector whose three entries are randomly chosen integers in the range from 0 to 5.
• We wrote Python code to implement the naïve algorithm from this Monday which computed the average class sizes for each department at a university.
For this implementation, we assumed the departments were given small integer code numbers, not names. So the list $\Dd$ of all departments was of the form range(n) where n was the number of departments, and their code numbers were from 0 to n-1.
We were then able to generate toy data [i.e., randomly chosen data which could be used to test the code] by the command
classlist=[[randint(0,4),randint(1,100)] for i in range(100)]
making 100 "class records" with department code numbers from 0 to 4 and class sizes from 1 to 100.
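Putting these pieces together, the implementation might have looked something like this (a sketch, not the exact code from class; n = 5 departments matches the toy data above):

```python
from random import randint

n = 5  # number of departments, coded 0 to n-1

# toy data: 100 "class records" of the form [department code, class size]
classlist = [[randint(0, 4), randint(1, 100)] for i in range(100)]

averages = []  # will hold pairs [department code, average class size]
for d in range(n):
    total_number_students = 0
    number_classes = 0
    for c in classlist:
        if c[0] == d:  # this class belongs to department d
            total_number_students += c[1]
            number_classes += 1
    averages.append([d, total_number_students / number_classes])
```

(With this much toy data, every department almost surely gets at least one class; real code should guard against number_classes being 0 before dividing.)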
• But that approach is absurdly slow: it goes through the class list way too many times! Can you think of a way to do this which only goes through the class list once? Does your new method have the same space complexity?
• :
• We will be using matplotlib to do basic plotting. If you don't have it on your personal computer, the easiest way to install it seems to be with the program pip; I don't know much about it, though, so asking your classmates for suggestions is probably your best bet — along with, of course, just searching the Internet.
• Once matplotlib is installed, you will find that this is a nice tutorial about how to use it for basic plotting.
• Here is the most basic thing we will do with this package for plotting:
• First, do
import matplotlib.pyplot as plt
• Put the $x$ coordinates of some interesting data values into a list x and the corresponding $y$ coordinates into y. [It is important that len(x) is the same as len(y).]
• Create a basic plot of the data with
plt.plot(x,y)
• Display the plot with
plt.show()
• Some basic formatting notes:
• That basic command just makes lines connecting the points, in the order they appear in the lists of $x$ and $y$ coordinates.
• If you want just to show the data points themselves, without the connecting line, do something like
plt.plot(x,y,"o")
which will make round dots at the locations of the data points. You can also use "+", "*", and "^" to get data points represented as a plus, a star, or a triangle, respectively.
• In addition to specifying the format of the markers of data points, you can specify the type of the lines connecting them: giving a format string without any mention of lines will make a plot without lines; using "-" will make normal (solid) lines, while ":" will make dotted lines.
• You can specify the color of the point markers and/or lines by adding a color specifier to the format string, where "b" is blue, "k" is black, "r" is red, "g" is green, etc.
• If you call the plt.plot function more than once, it will display the separate datasets, with whatever formatting you specify, on the same graph. If the formatting does not specify colors, then successive calls to plt.plot will be automatically assigned different colors.
• Therefore, the commands
plt.plot(x,y,"go")
plt.plot(x,y,"r:")
will first plot the data points with green [round] dots, then connect those dots with a dotted red line.
• To clear the current plot, so you can start building up an interesting new one, you can use
plt.clf()
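Putting the plotting commands above together into one runnable example (the data here are arbitrary, just to have something to draw):

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [n**2 for n in x]   # len(x) must equal len(y)

plt.plot(x, y, "go")    # green round dots at the data points
plt.plot(x, y, "r:")    # dotted red line connecting the same points
plt.show()              # display the figure

plt.clf()               # clear the plot, ready to build a new one
```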
• A basic way to get timing information about a program in Python is to save the current time, run the program, save the current time again, and print out the difference of the two times. A useful tool for this is the time.time() program, which returns the number of seconds since the Unix epoch, which was 00:00:00 UTC on 1 January 1970. You would use this as follows:
import time
start=time.time()
<do some work>
end=time.time()
print(end-start)
• :
• Some content we discussed:
• An issue we haven't talked about very much so far, but which is obviously very important, is that the algorithms we design and then implement in code should be correct. We often take a fairly mathematical approach in which we essentially give a proof of program correctness.
• Proving program correctness obviously becomes harder the more complex the program is, particularly its control flow. We look first at one of the simplest types of program statements which are not simple declaratory or computational ones, the loop. A common approach for loops is to find a loop invariant, that is, a [logical or mathematical] statement which is true about the state of [all the variables of] the program before and after each iteration of the loop.
• Why do loop invariants work? Because of The Principle of Mathematical Induction.
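A tiny example of a loop invariant in action (not one of our sorting algorithms; the function name is made up for illustration): the loop below adds up $1+2+\dots+n$, and the invariant "after processing $i$, total equals $i(i+1)/2$" is true after every iteration. The assert checks the invariant at run time; the Principle of Mathematical Induction is what lets us conclude it holds through the whole loop.

```python
def sum_to(n):
    """Compute 1 + 2 + ... + n with a loop, checking a loop invariant."""
    total = 0
    for i in range(1, n + 1):
        total += i
        # loop invariant: total == 1 + 2 + ... + i == i*(i+1)/2
        assert total == i * (i + 1) // 2
    return total
```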
• Some notation we shall use in class, and future assigned readings may use:
• $\forall x\ P(x)$ — means that the statement $P(x)$ is true for all values of $x$; the $\forall$ is pronounced "for all". Often there is some restriction, like $\forall x>17\ P(x)$, which means $P(x)$ is true for all $x$ which are greater than 17.
• $\sum_{i=a}^b f(i)$ or $\displaystyle\sum_{i=a}^b f(i)$ — this formula computes the same value as the variable sum after running this code snippet:

sum=0
for i in range(a,b+1):
sum += f(i)
• Discussion of how it would be great to send your HW solutions, even the ones which have some calculations and special [mathematical] symbols to your instructor by e-mail. The way to include interesting typesetting in an email is to use the notation of the LATEX mathematics typesetting system. Here is a nice introduction to LATEX, while here is a reasonable tutorial — but there are many other free LATEX resources, including the program itself and many thousands of additional packages to make it do additional, wonderful things.
• Today [Friday] is the last day to add classes.

#### Week 2

• :
• The divide and conquer approach to algorithm design, which has a quite simple structure.
To solve a complex problem — or, at least, a problem which seems computationally expensive on large input — do the following:
1. break up the original problem into smaller problems which are similar to the original one;
2. or, if the problem actually is small enough to solve immediately by consideration of some special case, just report the solution;
3. solve those smaller problems as new instances of the whole algorithm;
4. combine the solutions of the smaller problems into one which solves the original, larger instance.
• Very often, we use recursion rather than looping to solve problems with divide and conquer. With recursion, the program which implements an algorithm calls itself ... which seems like a recipe for an infinite loop, but in fact is not: every time the program calls itself, the new version is running on an instance of the problem which is strictly smaller than the original. Since
• the starting input was not infinitely large,
• each layer of the recursion is applied to a problem which is smaller than the previous by an amount which is not getting smaller and smaller, and
• a particular instance of the problem will not generate a new layer of recursion once it is down to a small enough size that the special case immediate solution applies,
the program will terminate in a finite time.
If the above considerations did not apply, then either one would have to deal with getting infinitely far with finite steps, or else with an instance of Zeno's Paradox. Therefore, it is important to check that they are true in any real-world problem and proposed algorithm/implementation.
• Digression on Zeno's Paradox, just because it's fun.
• Examples of the divide and conquer approach in action, including:
1. Computing values which are in fact defined recursively, such as the factorial or the Fibonacci sequence.
2. Finding the minimum of a numerical dataset.
• Actually, this one is so simple it is unclear if recursion or looping is best.
3. The "naïve sorting algorithm" from last Monday can be thought of as a [very simple version of] a divide and conquer approach.
4. Starting merge sort.
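A sketch of example 2 above in code (the function name is a choice): split the list in half, recursively find the minimum of each half, and combine by taking the smaller of the two; a one-element list is the special case solved immediately.

```python
def recursive_min(a):
    """Minimum of a nonempty list, by divide and conquer."""
    if len(a) == 1:                         # small enough: solve immediately
        return a[0]
    mid = len(a) // 2
    left = recursive_min(a[:mid])           # solve the smaller problems...
    right = recursive_min(a[mid:])
    return left if left < right else right  # ...and combine the solutions
```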
• HW1: either hand in or email to your instructor the following:
1. Do any single problem from the Exercises for §1.1 here.
2. Suppose $f(n)=O(g(n))$ and $g(n)=O(f(n))$. Does that mean that $f(n)=g(n)$ for all $n\in\NN$? If so, explain why. If not, give a counter-example.
3. Write a Python program which implements each of the naïve algorithms discussed in Week 1: the average class size computation from Monday and the sorting algorithm from Tuesday. Please make readable code, which has comments and reasonable variable names. Include a way of generating both small toy data (for which the correctness can be checked by eye) and large toy data (for which we can see if the program takes an impossibly long amount of time).
Email the code to your instructor at jonathan@poritz.net.
4. As part of a proof of correctness of the naïve sorting algorithm we've been discussing, find a loop invariant for the main loop of the pseudocode we used to describe it.
Extra credit: Use the loop invariant method and the Principle of Mathematical Induction to prove that the algorithm we described does indeed correctly sort a collection of $n$ numbers, for any $n\in\NN$.
Note: if you don't finish every part of this homework assignment, that's ok, you can turn in the rest as part of HW2, next Monday.
• :
• Read §11.1.1 here and/or Merge Sort [ignore the section "Analysis" in the second of these references]
• Some hands-on coding of merge sort, of some ways to collect timing data on running programs, and graphing the results.
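One common way to code merge sort in Python is sketched below (details such as slicing versus explicit index bookkeeping differ between the readings, so treat this as just one possible version):

```python
def merge_sort(a):
    """Return a new, sorted list containing the elements of a."""
    if len(a) <= 1:                  # base case: already sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])       # recursively sort each half
    right = merge_sort(a[mid:])
    # merge: repeatedly move the smaller of the two front elements
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])          # at most one of these two
    merged.extend(right[j:])         # has anything left to add
    return merged
```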
• :
• There are lots of tutorials on how to use matplotlib, just use your favorite search engine to get one if you need a little help with this. — Or, email your instructor for some personalized suggestions!
• Start working on HW2, due this coming Monday — it's quite long and you can use these class times to work on it (even working together, which is fine, so long as you each turn in your own work).
• :
• Keep working on HW2. Two thoughts:
1. You should make sure you have a Python environment in which you can work at home, just as you do at school. So, spend a little time to install matplotlib on your personal device, if it is not already installed, and to make sure that you can develop and run things both at school and at home.
2. If you get stuck at any point on the HW, don't hesitate to email your instructor. You may not get a particularly speedy response, particularly during the week, but at least on the weekend there should be faster help available this way.
• :
• Keep working on HW2. Wouldn't it be nice to be finished even before the weekend?

#### Week 3

• :
• HW2: either hand in or email to your instructor the following:
1. It's time we stopped simply calling the SORT algorithm discussed above on the second day of this class the naïve sorting algorithm — in fact, this is called selection sort.
In this exercise, you will study the time complexity of selection sort both theoretically and in your Python implementation.
1. Look over your Python code for selection sort and derive as precise a formula as you can for the theoretical time complexity of your implementation. That is, look at your code, line by line, and see how much computation it has to do. You may assume that each basic operation takes one unit of time. This means
a=b
takes one unit of time, while
a=b[j]
takes two units — one for the determination of the memory location defined by b[j], which usually involves addition of j times the memory size of an integer to the address in memory of the beginning of the list b, and one for the assignment to the variable a; similarly,
a=b[j+k]
takes three units of time, because it also has to do the addition j+k.
Building on this, derive your careful formula for the time complexity of your program.
Note that when there are if statements in your code, the program may take a different amount of time depending upon the result of that test. So you should actually find two time complexity formulæ:
• A worst case complexity, where you just assume that each time there is an if statement, it comes out in whichever way makes for the most computation.
• A best case complexity, where you just assume that each time there is an if statement, it comes out in whichever way makes for the least computation.
2. Take your two complexity formulæ (best and worst case) and then for each give the simplest function $f(n)$ you can find with the property that your careful complexity formula is $O(f(n))$. Explain how you know this simple formula works as desired.
3. Extra credit: Give an example of an input of size $n$ to your selection sort program which would make it have the worst case running time. Next, give an example which would make it have the best case running time. Of course, explain how you know that your answers do what you claim they do.
4. Next write a program that takes inputs $n$ and $m$ and does the following: $m$ times it generates a random input of toy data of size $n$, does selection sort on the toy data, and computes the average over those $m$ runs of the seconds it took to do the sorting. Be careful: when adding up the times of doing those $m$ sorts, make sure you only count the time of sorting, not the time it took to generate the toy data.
5. At this point, pick a pretty large value for $m$ (the number of runs you are averaging together to get the average running time on input size $n$), such as 100 or 1000 (don't pick one which makes this run so long you have to wait around a while for it to finish), then make two lists:
x, being the values of $n$ for which you run the average running time on inputs of size $n$ ... use something like x=[5, 10, 15, 20, ...] up as high as you have patience for (it might be nice to have a couple of very large values, like 100 or 500 or 1000) and y, the corresponding average runtime for inputs of those sizes. x and y should both be lists of around 20 or 30 values.
Make a graph of y vs x.
6. Finally, on the same graph you just made, graph the function $f(n)$ that you found above, multiplied by some constant so that it does show how the running time is $O(f(n))$.
2. Do exactly the same thing you just did for selection sort [with all the steps and parts!], only now for merge sort.
3. For selection sort, do a careful examination to figure out a precise formula for the space complexity of your program: that is, a formula telling how much memory it uses on inputs of size $n$.
Is there a difference between best and worst case space complexity? If so, give two different formulæ; if not, explain why not.
Give a simple formula $f(n)$ so that the space complexity you just found is $O(f(n))$ or, if you had different best and worst case space complexities, give two such functions, $f(n)$ and $g(n)$.
4. Do exactly the same thing you just did for the space complexity of selection sort, only now for merge sort.
5. Extra, extra credit: Can you prove that the merge sort algorithm we've been using does what it is supposed to do? You can't quite use a loop invariant, because we're using recursion rather than looping, but a quite similar approach should work.
• Some discussions of HW2 issues and working to get matplotlib working on one student's personal laptop — for which this site was very helpful.
• Today [Monday] is the last day to drop classes without a grade being recorded.
• :
• Some individualized discussion of asymptotics of recursive algorithms — in particular, merge sort.
• :
• Review of big-O notation, discussion of related HW1 problem.
• Discussion of big-$\Omega$ and big-$\Theta$ notations.
• Mentioning The Master Theorem for asymptotics of recursive algorithms. Here is a nice reference for this theorem [or here is another place to find that same document, if the first one should happen to disappear from the 'net]. But note that your instructor mentioned in class that it is hard [and not so important] to keep the precise details of the theorem in your head. However:
• It is a good idea to be able to know how to check if the theorem does apply in a particular situation, and then to be able to state what conclusion comes from that application.
• E.g.: check for yourself if the Master Theorem applies to the work we've been doing for merge sort and, if so, what it tells us.
• :
• Read the resource linked yesterday about The Master Theorem and read (or at least skim) either this resource, this one, this one, or this one, on object-oriented programming in Python.
• In math and computer science, it is best to go for a mixture of text and equations/diagrams/code. Too many words misses out on the precision of equations — e.g., better than saying "yadda-yadda happens for all integers" is to say "$\forall k\in\ZZ$ we have $S(k)$" [where $S(k)$ is the precise statement you are going for, particularized to the integer $k$, and probably having or being an equation].
• Explanations in math and computer science tend to be structured more like recipes in cookbooks than like good essays or other well-written prose. For example, an induction proof should have certain pieces in a certain order (stating clearly what the pieces are and why!), and that's just fine.
• In this class, at least, you must always justify your work. E.g., if the assignment is "give an example of an even prime number," the answer "2" is woefully incomplete, while "2, because it is even, being clearly divisible by 2, and a prime, since the only natural numbers it has as divisors are 1 and 2, simply because they are clearly divisors and there is no room for other divisors $d$ satisfying $1<d<2$" is a complete answer.
• In order to make sure that the work we did yesterday about the time complexity of merge sort was valid, we looked fairly carefully at the steps it must perform, following the pseudocode here. Everything seemed to agree with what we did yesterday, except for the possible case of appending an element to the end of a list, and deleting an element from the beginning of another list.
If both of these processes take $O(1)$ time, we're OK; if it somehow takes more and more time to append to a larger and larger list, or to delete entries from a very large list, then we would be in trouble.
Therefore we wrote code to test this. That is, it should:
• Loop up to some very large number $N$.
• At each step, it should time the operation of appending one small item [like a small integer] to the end of a single list [which would therefore be getting very large by the end of the main loop].
• Make a plot using matplotlib of the time to append an item (as $y$) against the size of the list to which that item is being appended (as $x$).
• Joey pointed out that the time for a single append might be so small that it simply always registered as zero. So another thing to do would be to time how long it takes to append 100 items, or 1000, or some number like that, to the list which is getting longer and longer, with a small loop inside the main loop.
• You should get a picture like
But make sure you can do that yourself!
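A sketch of one way to run this experiment, using Joey's suggestion of timing batches of appends so the measured times are not all zero (the values of N and batch here are arbitrary choices):

```python
import time
import matplotlib.pyplot as plt

N = 200        # number of timed batches
batch = 1000   # appends per batch

big_list = []
x = []         # size of big_list at the start of each batch
y = []         # seconds to append `batch` items at that size
for i in range(N):
    x.append(len(big_list))
    start = time.time()
    for j in range(batch):
        big_list.append(0)
    end = time.time()
    y.append(end - start)

plt.plot(x, y, "o")   # time per batch of appends vs. current list size
plt.show()
```

If appending is $O(1)$, the dots should stay roughly flat as the list grows.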
• :
• Discussion about how it is a very nice feature of what some call "rapid prototyping languages" like Python that one can very quickly produce small bits of running code, to test new ideas and to see how well some approach performs. The timing graph from yesterday's class is an example.
In order to be able to take advantage of this kind of rapid prototyping, it is important to have a fairly high comfort level with Python syntax and a few other tools (such as some methods in the random and time modules), as well as with the structures in which you are interested (such as different sorting algorithms in this unit of the class).
If you do not feel you have that comfort level with any of those necessary pieces, please find some time to sit down with your instructor, soon, to figure out how to get there!
• Syntax to switch variable values in Python: oddly enough,
x,y=y,x
works, as does
a[i],a[j]=a[j],a[i]
as might appear in one of our sorting algorithms. This is very convenient, use it!
• Some object-oriented work in Python; see the links from yesterday. In particular, discussing making singly linked lists, and the complexity of adding to or deleting from the head or tail of the list.
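A minimal sketch of a singly linked list as a pair of Python classes (the class and method names here are choices, not a standard), illustrating why operations at the head are cheap while a plain append to the tail is not:

```python
class Node:
    """One cell of a singly linked list."""
    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedList:
    """Singly linked list which only remembers its head."""
    def __init__(self):
        self.head = None

    def push_front(self, value):
        """Add at the head: O(1), no matter how long the list is."""
        node = Node(value)
        node.next = self.head
        self.head = node

    def pop_front(self):
        """Delete and return the head value: also O(1)."""
        node = self.head
        self.head = node.next
        return node.value

    def append(self, value):
        """Add at the tail: O(n), since we must walk to the end."""
        node = Node(value)
        if self.head is None:
            self.head = node
            return
        current = self.head
        while current.next is not None:
            current = current.next
        current.next = node
```

Keeping an extra pointer to the tail would make append $O(1)$ as well, at the cost of a little extra bookkeeping in every method that changes the list.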
• Insertion sort, yay!

#### Week 4

• :
• We've somehow gotten way behind on doing T&Qs!
Recall from the syllabus that these are short emails you should send to your instructor about the current readings you've been assigned and/or the material we've been discussing in class.
Their content should be a thought or question you might have about the topic in question, expressed in just a few sentences (or even one, very well crafted sentence!).
There is no requirement about depth or sophistication of your T&Qs, other than that they show real engagement with the material. For example, something like
"I think American politics suck."
does not show that engagement, while
"This article says that the Electoral College has its roots in the racist origins of the US Constitution, and while I can see that is important, I don't see any path to changing the way we elect presidents in the near future."
does.
Also, don't be embarrassed if your T&Q is a question and not a statement. If there is something in the assigned T&Q topic which you don't understand, state clearly and explicitly what that confusing bit is. E.g.,
"This article talks about the problem with the Electoral College being that one person does not have one vote, but I don't understand the reasoning there, or the example of big states and small ones — don't big states have more Electoral College votes, so what's the problem with that?"
would be fine [on the same current event topic].
Each T&Q is graded 0 or 1 ... you should expect to get pretty much all 1s on your T&Qs, so long as you are actually engaging with the readings and class material, and keeping on schedule with them.
With that understood, please do the following T&Qs (send each in its own email to your instructor) before class on Monday (Tuesday is also OK, if this seems like too much, too soon, or if you have a question you want to ask in class about T&Qs on Monday):
• HW3: Implement insertion sort in Python.
• Use plain Python lists — no object-oriented monkey business with linked lists, as we had discussed in class on Friday.
• Write also code to collect data and display a graph of data on the time-complexity of your implementation [you should be able to steal a bit of your own code from HW2 for the complexity piece!].
• Send your code to your instructor as a <something>.py file attached to an email. As always, when sending code to someone else, make sure your variable names are informative and you have comments in your code. Your instructor should be able to execute your code and see the complexity graph[s], so there is no need for any other text or explanation beyond the code itself and its output.
• Some content we discussed:
• Using LATEX a little more seriously:
• Consider Overleaf, an online LATEX editor, if you don't have your own LATEX on your personal computer.
• It is very easy to put LATEX on web pages, just sneak the line
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/latest.js?config=TeX-MML-AM_CHTML' async></script>
between the <head> and </head> at the beginning of the HTML file. Then you can use LATEX snippets in the <body> of your HTML file, and it will get automagically processed into beautiful mathematics, courtesy of the MathJax display engine. E.g., this very page used
\$\forall x P(x)\$ ...
where it said (above, on the first Friday of the semester)
$\forall x P(x)$ ....
• Going over a bunch of the issues that came up in T&Qs.
• Quick description of bubble sort
• :
• Read either this resource, this one, or this [much longer] one, or this one-page cheat sheet and submit T&Q6 on this topic.
• Some content we discussed:
• Using numpy: installing, basic use and features.
• Comparing numpy with other kinds of data structure, such as native Python lists and our own linked lists made with Python classes.
• In-class activity: take your code for insertion sort and change it to use numpy arrays, rather than native Python lists. Then do a timing graph (as we've done lots of times in the past) of the time complexity of this version as a function of input size, with different-sized (larger and larger) toy input data.
• Please email whatever you have at the end of class to your instructor — code, graphs, code that makes graphs, etc.
• :
• Selecting elements in numpy arrays
• In-class activity: follow these steps:
1. Make a numpy array of integers from 2 to some limit $n$.
2. Select all elements of the array which are either in location less than or equal to 0 or are not multiples of what is in location 0 (which should be the integer 2), and update the array to have as its new value that subset of values.
3. Now do the same thing with the new version of the array, but for location 1.
4. Now make a loop that does this all the way through the array. What is a good criterion for when to stop?
5. What have you just done? Do you know what it's called?
6. Write some code which makes a time complexity graph of the above program as a function of the input number $n$.
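The procedure in steps 1–4 is, in fact, the Sieve of Eratosthenes. A minimal sketch of how it might look using numpy boolean selection (the function and variable names here are my own, not part of the assignment):

```python
import numpy as np

def sieve(n):
    """Return an array of the primes up to n, by repeated boolean selection."""
    a = np.arange(2, n + 1)   # step 1: the integers from 2 to n
    loc = 0                   # the location whose value we sieve by
    # stop once a[loc]**2 > n: any composite <= n has a factor <= sqrt(n)
    while loc < len(a) and a[loc] * a[loc] <= n:
        # keep entries at or before loc, plus entries not divisible by a[loc]
        a = a[(np.arange(len(a)) <= loc) | (a % a[loc] != 0)]
        loc += 1
    return a
```

For example, `sieve(30)` yields the primes 2 through 29. The stopping criterion answers the question in step 4: once the value at the current location exceeds $\sqrt{n}$, every remaining entry is already prime.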
• :
• Read this resource, this one, or this one, on operator overloading and submit T&Q7 on this subject.
• In-class (or at home, or in a room elsewhere on campus: your choice, since your instructor will not be physically present in the classroom today!) activity: make your own Python code for singly and doubly linked lists, then make versions of insertion sort which use each of these structures instead of native Python lists. Then build a time complexity graph (as we have done many times before!) for the performance of these new versions as you make the toy input data larger and larger.
• You will eventually have to email email whatever you make to your instructor — code, graphs, code that makes graphs, etc. — so make sure you keep a copy.
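As a hedged starting point for this activity, here is one minimal way a singly linked list with insertion-sort-style inserting might look (the names and structure are my own; the assignment asks you to build your own versions, including a doubly linked one):

```python
class Node:
    """One cell of a singly linked list."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def sorted_insert(head, value):
    """Insert value into an already-sorted linked list; return the new head."""
    if head is None or value < head.data:
        return Node(value, head)
    cur = head
    while cur.next is not None and cur.next.data <= value:
        cur = cur.next
    cur.next = Node(value, cur.next)
    return head

def insertion_sort_linked(values):
    """Build a sorted linked list from an iterable, then read it back out."""
    head = None
    for v in values:
        head = sorted_insert(head, v)
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out
```

Note that, unlike with Python lists or numpy arrays, inserting into the middle of a linked list never shifts other elements, which is exactly the behavior whose effect on the timing graph is interesting here.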
• :
• No class meeting — a good opportunity to get caught up on things which will need to be handed in.

#### Week 5

• :
• Read the first sections of this article on the stack — only up to (and not including) the section labeled Applications of stacks, and submit T&Q9 on this subject.
• HW4 is due, consisting of:
1. On Tuesday last week you made (or, at least, started to make) a version of insertion sort using numpy arrays, along with code to test it for larger and larger input sizes and to make the resulting time complexity graph. Send this code to your instructor now!
2. Then on Thursday last week you made (or, at least, started to make) a version of insertion sort using linked lists, along with code to test it for larger and larger input sizes and to make the resulting time complexity graph. Send this code to your instructor as well.
3. Make one additional graph that superimposes all time complexities you have computed for insertion sort: with Python lists, with numpy arrays, and with your own linked lists. Send the code for this to your instructor.
4. On Wednesday last week you made (or, at least, started to make) code which implements the Sieve of Eratosthenes, and also to collect and graph time complexity information about this program for different input sizes. Send this code to your instructor.
Notes:
• Make sure that whenever you send code, it has good comments and variable names, code to generate toy test cases and graphs of time complexities. Your instructor should be able to cut and paste your code into a Python environment and then both fiddle with the basic code and, by running some program you have supplied, generate the graphs you were asked to make.
• You might want to make a legend when graphing several different things on the same graph. The way to do this is to put the argument "label='some text'" in each call to plot and, have an additional command plt.legend(loc='upper left') before your plt.show(). Here is a simple example:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,20,1000)
y1 = np.sin(x)
y2 = np.cos(x)
plt.plot(x, y1, '-b', label='sine')
plt.plot(x, y2, '-r', label='cosine')
plt.legend(loc='upper left')
plt.show()
• Discussion of some points which came up in T&Qs.
• Discussion of some uses of stacks, particularly to be a place to put local variables in instances of programs when they recurse.
• A thematic move: our course will move forward a bit in the themes of material we cover, in that we will start to discuss searching, having been working on sorting for a while.
• As an example of a searching problem, here is a bit of code to do a guessing game which searches for a hidden random number:
import random

def guess(n=1023):
    if n < 1:
        print("Input must be a natural number")
        return
    number = random.randint(0, n)
    guesses = 0
    while True:
        guess_text = input("Guess the number from 0 to %d: " % n)
        if guess_text == "I give up":
            print("Give up, eh?  Well, my number was %d." % number)
            return
        guess = int(guess_text)
        if guess < 0 or guess > n:
            print("Your guesses must be integers in the range from 0 to %d." % n)
            continue
        guesses += 1
        if guess > number:
            print("Too high")
        elif guess < number:
            print("Too low")
        else:
            print("Got it (in %d guesses)!!" % guesses)
            return
Try running this to play the game — come up with a good strategy!
• In class, we discussed the binary search strategy, which repeatedly guesses the midpoint of the remaining possible region, then recurses to work on the lower or upper remaining half, depending upon whether the answer to the guess is "too high" or "too low."
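A sketch of that strategy in code (the three-valued answer convention — $-1$ for "too low," $+1$ for "too high," $0$ for "got it" — is my own choice here):

```python
def binary_search(lo, hi, answer):
    """Search for a hidden number in the integer range [lo, hi].

    answer(g) should return -1 if g is too low, +1 if too high,
    and 0 if g is the hidden number.
    """
    while lo <= hi:
        mid = (lo + hi) // 2       # guess the midpoint
        a = answer(mid)
        if a == 0:
            return mid             # found it
        elif a == -1:
            lo = mid + 1           # too low: discard the lower half
        else:
            hi = mid - 1           # too high: discard the upper half
    return None                    # no such number in the range
```

For example, with a secret value of 37 hidden in $[0,100]$, `binary_search(0, 100, lambda g: (g > 37) - (g < 37))` returns 37 after only a handful of midpoint guesses.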
• We mentioned how the time complexity of binary search is $O(\log n)$. This is in contrast with the most naïve searching approach — which is just to go through all the possible values and ask if each is correct — which has time complexity $O(n)$.
• In-class activity: write code which implements binary search. You may use the following class:
class Quizzer:
    def __init__(self, max_val):
        import random
        self.n = random.randint(0, max_val)
        self.max_val = max_val
    def try_guess(self, k):
        if k < 0 or k > self.max_val:
            return 17
        if k < self.n:
            return -1
        elif k > self.n:
            return 1
        else:
            return 0
    def __str__(self):
        return str(self.n)
For example, it can be used as in the following
q = Quizzer(65535)
for i in range(65536):
    ans = q.try_guess(i)
    if ans == -1:
        print("Too low, trying again")
    elif ans == +1:
        print("Too high, how did that happen?")
    elif ans == 0:
        print("Got it!  Answer was %d" % i)
        break
    else:
        print("weird.  quitting")
        break
which implements the naïve searching algorithm.
But you should implement binary search! Do so, and send your nice code to your instructor.
• :
• Discussing worst case, best case, and average case complexity, in particular for the two searching methods we discussed yesterday: the naïve and binary search algorithms.
• Important topic (with examples): When talking about average case complexity, it is very important to think about average over what population, or, even more precisely, average over what probability distribution.
• Adapting our binary search strategy from last class to situations when we're not searching among all integers from $0$ to $n$, but rather in a more random or sparse set of values. What changes need to be made?
• :
• In-class activity to test your binary search algorithm on input which is not just all integers from $0$ to $n$.
• Write some code which takes input $n$ and $p$ and generates $n$ numbers which lie in some randomly chosen interval of integers $[a,b]$, where $(b-a)/n=p$ (approximately).
• Write code which picks a random element from the above "sparse interval."
• Combine with your binary search to see if it all works.
• Make a time complexity graph.
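One possible sketch of the first three steps above (the names, the choice of starting point, and the use of random.sample are my own; note this assumes $p \ge 1$, so that the interval actually contains at least $n$ integers):

```python
import random

def make_sparse_interval(n, p):
    """Pick a random interval [a, b] with (b - a)/n approximately p,
    then sample n distinct integers from it, returned sorted."""
    a = random.randint(0, 10**6)   # arbitrary starting point (my assumption)
    b = a + int(n * p)             # so (b - a)/n is approximately p
    return sorted(random.sample(range(a, b + 1), n))

def binary_search_list(vals, target):
    """Binary search in a sorted list of values; return an index or None."""
    lo, hi = 0, len(vals) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if vals[mid] == target:
            return mid
        elif vals[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

# step 2-3: pick a random element from the sparse interval and search for it
vals = make_sparse_interval(100, 5)
x = random.choice(vals)
assert vals[binary_search_list(vals, x)] == x
```

The key change from last class is that we now bisect on the *index* into the sorted list of values, not on the values themselves.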
• Starting to talk about binary trees.
• :
• Read, in Open Data Structures, the introduction to Chapter 6, here, and its continuation as § 6.1, and submit T&Q10 on this subject.
• In class we talked about many of the things we will code tomorrow.
• :
• The depth of a node in a tree is the number of links from the root to that node.
• The height of a node in a tree is the largest number of links from that node to a leaf.
• The height of a tree is the height of its root. Equivalently, it is the maximum depth of any node of that tree.
• It's time to make some code for trees. [This will be a large part of the homework due next Tuesday.]
In particular, you could use this for the basic structure of a tree node:
class BSTnode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right
    def is_leaf(self):
        return self.left is None and self.right is None
This code includes the possibly convenient method that tells if a node is a leaf, is_leaf().
Note that sometimes it is convenient to have each node "know" its parent node. That would be with a slightly improved class definition:
class BSTnode:
    def __init__(self, data, parent=None, left=None, right=None):
        self.data = data
        self.parent = parent
        self.left = left
        self.right = right
    def is_leaf(self):
        return self.left is None and self.right is None
    def is_root(self):
        return self.parent is None
This version of the code includes the possibly convenient method that tells if a node is the root of the tree, is_root().
Now please write the following methods for that class:
• insert inserts a new data value into a tree
• max_data_val finds the maximum value of a subtree starting at a certain node
• min_data_val finds the minimum value of a subtree starting at a certain node
• verify checks if a binary search tree is well-formed
• delete deletes a node
• depth_first_print prints data values in depth-first order
• ordered_print prints data values in increasing order
• depth returns the depth of a node
• height returns the height of a node [recall that the height of the whole tree is then just the height of its root]
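To show the flavor of what is being asked, here is a hedged sketch of just the first method, insert, for the parent-tracking version of the class (one possible approach, not the required solution — you should still write the rest yourself):

```python
class BSTnode:
    def __init__(self, data, parent=None, left=None, right=None):
        self.data = data
        self.parent = parent
        self.left = left
        self.right = right

    def insert(self, data):
        """Insert a new value into the subtree rooted at this node,
        keeping the binary-search-tree ordering: smaller values go
        left, everything else goes right."""
        if data < self.data:
            if self.left is None:
                self.left = BSTnode(data, parent=self)
            else:
                self.left.insert(data)
        else:
            if self.right is None:
                self.right = BSTnode(data, parent=self)
            else:
                self.right.insert(data)
```

Note the pattern: each method naturally recurses into the left or right subtree, which is how most of the other requested methods will work as well.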

#### Week 6

• :
• No in-person class meeting.
• Work on HW5, due tomorrow (see below).
• :
• HW5 is due, consisting of:
1. Send your instructor the code which implements (including testing with toy data and making a pretty graph) the binary searching of "sparse intervals" which we did in class last Wednesday.
2. Send your instructor the code which implements (including testing with toy data) the various algorithms for binary search trees which we did in class last Friday.
3. Write code which scans through an entire binary search tree (this is called tree traversal, and you can do it in the same way we did depth-first printing of the data values in a tree, only doing something different at each node rather than simply printing that node's data value) and counts how many nodes there are at each depth. Then make a histogram of these values.
• See, e.g., this site or this one for help making histograms. The basic idea is that if x is a list of data values, then
import matplotlib.pyplot as plt
plt.hist(x)
plt.show()

will show a histogram of the values in x
Starting with an empty tree, insert a bunch of random integers, and then make this depth histogram. What would you expect this histogram to look like? What would you like it to look like? Make a bunch of such histograms and save some interesting ones.
Send your code for this, and some interesting depth histograms (along with the data values which were inserted, in the order you used, to get each histogram) to your instructor, along with your answers to the two "what" questions just asked, above.
4. Make another histogram: fix a value of $n$, then generate $n$ random integers in random order, insert them into an empty binary search tree, and compute the height of that tree. Do this many times (maybe 100), and find the max, min, and average value of the height of the resulting tree. What would you expect the max, min, and average to be, in terms of $n$? Make a graph of the maxes against $n$, the mins against $n$, and the averages against $n$.
Send code, the graphs, and your answers to the "what would you expect" question to your instructor.
5. Extra credit: Print out the contents of a binary search tree in breadth-first order, meaning that all the nodes at a particular height are all printed before the next height is printed. This is surprisingly hard! (Searching around on the 'net gives some hints, but still there is a fair bit of work to do).
NOTE: this HW assignment is too long! Do a substantial portion of it, particularly parts you find most interesting, but don't kill yourself to do it all!
• Actually, some students are behind, so this HW will not be counted as late if it comes in on Thursday.
• Discussed breadth-first traversal. Code will be posted here soon, but in the meantime, you might look at one of these resources for some ideas:
• :
• This week, we're going to calculate the entropy of some data sources. Here are the formalities:
• A finite probability distribution is the assignment of a probability (a real number in the range from zero to one) to each of a finite number of outcomes of some experiment. Usually, we denote those outcomes as $1,\dots,n$, and the associated probabilities as $p_1,\dots,p_n$.
• The conditions for a collection of real numbers $\{p_1,\dots,p_n\}$ to be a finite probability distribution are $$0\le p_i\le1\ \ \forall i=1,\dots,n$$ and $$\sum_{i=1}^n p_i = 1\ \ .$$
• The entropy of a finite probability distribution is the value $$H = -\sum_{i=1}^n p_i\,\log_2(p_i)\ \ .$$
• This kind of entropy was invented by the amazing mathematician Claude Shannon, who used it for a number of purposes. For example, the higher the entropy of some data, the less compression you can do of that data (if you are trying to re-code it to take less space so that you can, for example, transmit it to someone without using up too much bandwidth). This work of Shannon's is part of a very beautiful area of mathematics he invented called Information Theory, which is in many ways the mathematics which underlies the Internet.
• In practice, one has a sample of data and wants to compute its entropy. To do this, one takes the data as if it were a random sample from some data source, and uses that sample to compute the [approximate] probabilities of the outcomes. For example, if you have a very long string of letters in the English language, you can count the number of times the letter a occurs, the number of times b occurs, etc., calling these $n_0$, $n_1$, etc. Then the approximate probability of the $i^{th}$ letter will be given by $p_i = n_i/N$, where the string is of total length $N$. These numbers $n_i$ are called (quite sensibly) the frequency counts.
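Putting the entropy formula together with the frequency counts, the whole computation fits in a few lines; a minimal sketch (the function name is mine):

```python
import math

def entropy_from_counts(counts):
    """Entropy H = -sum_i p_i * log2(p_i) of the empirical distribution
    given by a list of frequency counts n_i, where p_i = n_i / N."""
    N = sum(counts)
    H = 0.0
    for n_i in counts:
        if n_i > 0:              # convention: 0 * log2(0) contributes 0
            p = n_i / N
            H -= p * math.log2(p)
    return H
```

A quick sanity check: a uniform distribution over four outcomes, counts [1, 1, 1, 1], has entropy exactly 2 bits, while a distribution with only one possible outcome has entropy 0.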
• To compute frequency counts of letters in a string, we can use a Python dictionary. E.g.:
def print_letter_counts(string):
    counts = {}
    for i in string:
        if i in counts:
            counts[i] += 1
        else:
            counts[i] = 1
    for i in counts:
        print("Letter " + i + " occurred with frequency " + str(counts[i]))
• Actually, a much faster way to do this would be to associate to all letters a number in the range 0 to 25. Then given a string, one would make an array of zeros, go through the letters of the string and increment the entry of the array by one at the location given by the number corresponding to that letter. That way there would be no searching in a table for a particular key, the computer would just be going to a particular location in the array. So, potentially, much faster! This could look like this:
def print_letter_counts_array(string):
    import numpy as np
    counts = np.zeros(26, dtype=int)
    for i in string:
        counts[ord(i) - ord('a')] += 1
    for i in range(26):
        print("Letter " + chr(i + ord('a')) + " occurred with frequency " + str(counts[i]))
• This way of just using the key as an index into an array is very fast, but it requires a way to convert keys into integers. A function that converts keys into integers is called a hash function.
• One way to use this approach would be to make an array which had the size of the number of all possible keys, filled with zeros, and then use a hash function. That's likely to be way too huge, so instead we use a smaller size, and a hash function which takes all keys and gives integers only in the range from 0 to one less than the size of this smaller array.
• But then it will happen that there are different keys which hash down to the same index! This is called a [hash] collision.
• What do we do with collisions? One thing to do is not to put the actual counts (or whatever the data values are) into the table, but instead the heads of linked lists. Then when a key is given, one computes the hash of the key, then does a linear search of that linked list to find the entry with that key value. As long as there are not too many collisions, so that the linear searches stay quite short, this method gives very fast access.
• A table of data which you access with a hash function and some way of handling collisions is called a hash table. The collision-resolution approach mentioned immediately above, where the table contains lists of elements with a particular hash value, results in what is called a chained hash table.
• A Python dictionary is a hash table which is built into the language. But we will code our own hash tables on our way to computing frequency counts of words in English text and, therefore, the entropy of English.
• :
• Remember, HW5 modified due date is today.
• Read any introduction to Python dictionaries, such as this one, this one, or this one, and submit T&Q11 on this subject.
• The goal today is to make a working version of a chained hash table, where the keys will be strings and the data values will be integers. Here are some tools:
• To compute the hash of a string, there is a built-in function in Python called hash; it makes a reasonable choice for hash tables, but not for cryptographic applications of hash functions.
• Hash tables should keep track of how many keys they have and how big the table is, and should grow the table (usually by a factor of two) when that ratio gets to be bigger than 1.
• Methods we will need:
• __init__(self, k) creates an empty hash table of size $2^k$
• __len__(self) returns the number of distinct keys currently in the table, which allows the syntax len(table)
• load(self) returns the fraction of entries in the array part of the hash table which are full; this value is len(self)/2**k
• __getitem__(self, key), which gets the data value associated with that key, allowing the syntax print(table["key"])
• Should throw an IndexError if that key is not already in the table. The way to do this is with the code
raise IndexError("Key not found in this hash table")
• __setitem__(self, key, value), which sets the data value associated with that key, allowing the syntax table["key"]=value
• This method should look at the location given by hash(key) in the array part of the table.
• If the value is None, it should be replaced by a new doubly linked list with one node whose data value is the given value, and the variable keeping track of number of distinct keys should be incremented.
• Otherwise, a linear search should be done in the doubly linked list whose head is in that array location, looking for the key. If the key is found, the corresponding data value should be changed to the new value.
• If the key is not found in that doubly linked list, a new list node should be added to the end, whose data field is given the value value. Also, the variable keeping track of number of distinct keys should be incremented.
• If the variable keeping track of number of distinct keys was just changed, the load should be recomputed. If the load is greater than 1, the size of the array part of the table should be doubled, all of the keys and corresponding values taken out of the old table and inserted into the new one.
• remove(key) which removes that key and its associated data value from the table
• get_keys() return a Python list of all keys in the table
• increment(key) — increment the data value associated with that key value, or set it to 1 if it was not there already (i.e., implicitly, if a key has not yet been seen, that's the same thing as if its count was zero, which increments to 1 [obviously]).
• __contains__(self, key) returns True if that key is already in the table, False otherwise; enables the syntax key in table which is equivalent to self.__contains__(key)
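Here is a minimal skeleton showing the overall shape of such a table. To keep the sketch short, it uses plain Python lists of [key, value] pairs for the chains, whereas the assignment above asks for your own doubly linked lists; names other than the specified methods are my own:

```python
class ChainedHashTable:
    """Sketch of a chained hash table: keys are strings, values integers.
    Chains are plain Python lists here, for brevity only."""

    def __init__(self, k=3):
        self.size = 2 ** k              # size of the array part
        self.table = [None] * self.size
        self.n_keys = 0                 # number of distinct keys

    def __len__(self):
        return self.n_keys

    def load(self):
        return self.n_keys / self.size

    def __contains__(self, key):
        chain = self.table[hash(key) % self.size]
        return chain is not None and any(k == key for k, v in chain)

    def __getitem__(self, key):
        chain = self.table[hash(key) % self.size]
        if chain is not None:
            for k, v in chain:
                if k == key:
                    return v
        raise IndexError("Key not found in this hash table")

    def __setitem__(self, key, value):
        i = hash(key) % self.size
        if self.table[i] is None:
            self.table[i] = []
        for pair in self.table[i]:      # linear search of the chain
            if pair[0] == key:
                pair[1] = value
                return
        self.table[i].append([key, value])
        self.n_keys += 1
        if self.load() > 1:             # grow the array part by a factor of 2
            old = self.table
            self.size *= 2
            self.table = [None] * self.size
            self.n_keys = 0
            for chain in old:
                if chain is not None:
                    for k, v in chain:
                        self[k] = v     # re-insert into the bigger table

    def increment(self, key):
        """Add one to the count for key, treating a missing key as zero."""
        try:
            self[key] = self[key] + 1
        except IndexError:
            self[key] = 1
```

The remaining specified methods (remove, get_keys) follow the same pattern of hashing to a chain and then searching linearly within it.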
• :
• Read this resource or this one and submit T&Q12 on this subject.
• Each student will need to choose one of the following algorithms as the topic of a (slightly) larger project in the next weeks. We might therefore talk briefly about each of them so you could pick one with some idea of what you're getting into.
• Quicksort
• Heap Sort
• Continue with the in-class work from yesterday. If you finish with that, please start the next part, to compute the entropy of files of English text.
• We're going to compute the entropy of the distribution of words in various pieces of English text. To do this, we will need to get the words out of a file. Here is how to read lines of a file:
with open('filename', 'r') as f:
    for line in f:
        print(line)  # or do anything you want to do with that line
• Another thing it helps to do is to break up a string into words, that is, pieces separated by spaces or other "whitespace" like tabs, etc. The method split() returns a list of the words in a string, so is very useful for this.
• You will also want to consider words to be the same if they are at the beginning of a sentence — so, capitalized — or anywhere else in the sentence — not capitalized. The easiest way to do this is simply to pre-process every word to make all the letters one case. For this, lower() and upper() are useful string methods, which make all letters the respective case.
• Finally, you will want to remove all non-letter characters such as punctuation. One way to do this is to look at lines one character at a time, and to use the string method isalpha() which tells if the string is all letters or not.
• [There is also a way to do this with regular expressions which is a fair bit more complex, but quite effective.]
• Once the words are out of the source file, they can be used as keys into the hash table you just built (the increment method is very useful here!) to get all the frequency counts for words in the source document.
• Having the frequency counts, you can then compute the entropy, as described in the notes above for this Wednesday
• This coding task will form most of the homework for Monday.
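Putting the pieces above together, here is a hedged sketch of extracting cleaned-up words from a file (the function name and the choice to write it as a generator are mine):

```python
def words_from_file(filename):
    """Read a text file and yield cleaned-up words: split on whitespace,
    lowercased, with non-letter characters stripped out."""
    with open(filename, 'r') as f:
        for line in f:
            for raw in line.split():
                # keep only the letters, all in one case
                word = ''.join(c for c in raw.lower() if c.isalpha())
                if word:          # skip tokens that were all punctuation
                    yield word
```

Each word it yields can then be fed straight to your hash table's increment method to build the frequency counts.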

#### Week 7

• :
• HW6 is due, consisting of:
1. The general chained hash table code from last Thursday. Make sure you have nice comments and variable names. Include some code which does simple tests of all of the features you have implemented.
2. The code from last Friday which finds word frequency counts from a text file, using your chained hash table, and then computes the entropy. Again, include some test cases to show it works the way we want.
3. Extra credit: Do some work to show the shape of your hash table as it does a lot of work, e.g., in the above entropy computation when processing a very large text file. What this means is, every $k^{th}$ time you insert something new into the hash table (where $k$ is 1 or 10 or 100, depending upon how big your data file is), make a histogram which shows the length of the list attached to each entry in the primary array of the hash table. This should start out mostly as zero, then there will be a few bars of height one, then, as there are more and more collisions, some (many?) of the lists will get longer and longer. Then, when you resize the hash table's array, it will flatten out and lower, presumably....? It would be nice to have a bunch of histograms showing this behavior.
• [This should be quite easy to implement: whenever you make or resize a hash table, make a parallel integer array of the same size, filled with zeros. Then, when you add an element to the linked list at some hash index into the hash table's array, increment the corresponding integer array entry. Then simply make a bar chart with that array of integers whenever you want.]
• Skim the Wikipedia pages (links below) for the three algorithms and write an associated T&Q13 on one of those sources.
Also, email your instructor the list of those three algorithms, sorted by your preference for doing your project on that algorithm (make it clear if the list is best-to-worst or worst-to-best!).
• Introduction to graph theory:
• A directed graph $\Gg=(V,E)$ consists of two finite sets:
• a set $V$ of vertices; and
• a set $E$ of edges, where $E\subseteq V\times V$, meaning that $E$ is a set of [ordered!] pairs of vertices.
We shall write an edge as $e=(v_1,v_2)\in E$, and refer to the first element of the pair as the source of the edge and the second element as the destination. That is, $e=(s,d)\in E\iff\text{source}(e)=s\in V\text{ and dest}(e)=d\in V$.
• In a graph $\Gg=(V,E)$, a path from $a\in V$ to $b\in V$ is a finite sequence of edges $e_1,\dots,e_n\in E$ such that $\text{source}(e_1)=a$, $\text{dest}(e_1)=\text{source}(e_2)$, ... , $\text{dest}(e_{n-1})=\text{source}(e_n)$, $\text{dest}(e_n)=b$.
• A graph $\Gg=(V,E)$ is connected if it is possible to get from any vertex to any other by a path. A subgraph of a graph is a connected component if it is a connected graph on its own and no strictly larger subgraph of $\Gg$ is connected.
• Sometimes graphs can have decorations [you might say] such as, for example, a number assigned to each edge [these are then usually called weights]. Some examples:
1. The weight might represent the distance you need to travel to get from one city to another, the vertices being cities.
2. Or the vertices might be email addresses and the number on an edge from address $A$ to address $B$ would be the number of times $A$ sent an email to $B$.
3. Or perhaps the vertices represent people, the edges stand for the relationship "person $A$ has exhaled and person $B$ inhaled, after less than five minutes, some of that air," and then we might label [decorate] the vertices with one of two values: "sick" and "not sick."
• The Traveling Salesman Problem [one of the most famous problems in computer science] is: given a weighted graph as in the example number 1 immediately above, find a path which visits every vertex in the graph and which minimizes, among all paths which visit every vertex, the sum of the weights along the edges of the path. This is a famously hard problem!
• Another problem with graphs is, given two vertices, to find a path with the fewest number of steps (edges) between those two vertices. This is an incredibly important problem! Many communications networks (such as the Internet) confront this problem many, many times (per second, maybe!), as they try to route information from a server to a client.
• Representing graphs in Python: We will want to build Python code to represent graphs, so that we can then implement algorithms which solve important graph problems.
• It makes sense to use a Python set to represent both the set of vertices and the set of edges of a graph, since we want something that doesn't care about ordering, but for which we can test membership and iterate through all of the elements. E.g.:
• x in S is a Boolean which tells us about set membership of the element x in the set S.
• for i in S: starts a loop which has the variable i run through all of the elements in S, each element appearing exactly once [albeit in an unpredictable order].
• It will be convenient [and fun] to use a lot of operator overloading to make our Graph class easy to use.
• It will also be convenient [and fun] to develop quickly a bit of code which makes a pretty picture of a graph. This is actually fairly hard to do optimally, but fairly easy to do in a pretty nice [and useful] way.
• :
• Read A Gentle Introduction To Graph Theory and submit T&Q14 on this subject.
• OK, modification of what we said yesterday:
• Let's use the term digraph [short for "directed graph"] for the thing we defined yesterday. We should [in agreement with most authors] reserve the term graph for the case where the edges are unordered pairs of vertices. So an edge in a graph will be a set of vertices with two elements ... but, actually, we will also allow an edge to be a set of vertices with one element, to indicate one of those edges which are little loops from a vertex to itself.
• In an undirected graph $\Gg=(V,E)$, a path from $a\in V$ to $b\in V$ is a finite sequence of edges $e_1,\dots,e_n\in E$ such that one of the vertices of $e_1$ is $a$, the other is one of the vertices of $e_2$, the other vertex of $e_2$ being one of the vertices of $e_3$, ... , and the vertex of $e_n$ which does not equal one of the vertices of $e_{n-1}$ is equal to $b$.
• The terms connected and connected component for an undirected graph are defined in exactly the same way as for a digraph.
• Today we start an implementation of code for handling graphs. What this means is that we will build classes Graph and Digraph which have the methods
• making new [di]graphs: __init__(self, vertices=set(), edges=[])
[we'd really like to give the defaults here as "vertices={}," but that makes an empty dictionary, not an empty set, while set() does make an empty set]
make an empty [di]graph, and then add the vertices and edges specified [if any are]
Let's always identify vertices as strings
For graphs, let's identify edges as sets which have exactly two strings, being the names of vertices — except if we are trying to specify an edge from vertex v to itself, in which case the set will just be {v}; the vertices specified must already be in the set of known vertices, otherwise throw a ValueError("Vertex not in graph.")
For digraphs, let's identify edges as tuples which always have exactly two strings, each being the name of a vertex in the graph — if a vertex is not already in the graph, throw a ValueError("Vertex not in graph.")
• overloading the addition of both vertices and edges: can be achieved by this code [for digraphs]:
def __iadd__(self, item):
    if type(item) == str:
        pass   # ...code to add the vertex goes here...
    elif type(item) == tuple:
        pass   # ...code to add the edge goes here...
    else:
        raise ValueError("Can only add a vertex [=str] or edge [=tuple].")
    return self
[To do the same thing for graphs rather than digraphs, replace every occurrence of "tuple" in that code with "set". Note the return self at the end: __iadd__ must return the updated object for += to work.]
This will allow both the syntax graph += "new vertex" and the syntax graph += ("source", "dest") [for digraphs, or graph += {"a", "b"} for graphs].
• overloading the test of vertices and edges in a graph: __contains__(self, item)
This can use the same approach of testing the type of the input item to test either for existence of a vertex or edge in a [di]graph. This will enable the syntax item in graph where when item is a string it tests if that vertex is a known vertex and if item is a tuple [for digraphs, or set, for graphs], tests if that edge is in the [di]graph.
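Filling in the skeleton above for the digraph case, one possible (hedged) version of the class with both overloads might look like this; the internal attribute names are my own:

```python
class Digraph:
    """Minimal digraph sketch: vertices are strings, edges are
    (source, dest) tuples of vertex names, as specified above."""

    def __init__(self, vertices=None, edges=None):
        self.vertices = set()
        self.edges = set()
        for v in (vertices or set()):
            self += v           # add vertices first...
        for e in (edges or []):
            self += e           # ...so the edges can refer to them

    def __iadd__(self, item):
        if type(item) == str:                    # adding a vertex
            self.vertices.add(item)
        elif type(item) == tuple:                # adding an edge
            if not all(v in self.vertices for v in item):
                raise ValueError("Vertex not in graph.")
            self.edges.add(item)
        else:
            raise ValueError("Can only add a vertex [=str] or edge [=tuple].")
        return self

    def __contains__(self, item):
        if type(item) == str:
            return item in self.vertices
        return item in self.edges
```

With this in hand, `g += "a"` adds a vertex, `g += ("a", "b")` adds an edge, and `("a", "b") in g` tests for that edge, exactly as described above.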
• basic graph-theory methods: define methods which compute basic things like order and size of a graph, the degree of a vertex in a graph or in-degree and out-degree of a vertex in a digraph, etc.; see this Glossary of graph theory terms. We will also eventually want methods which find paths between given vertices, if such exist, and find the connected components of a graph, etc.; some of those tasks are quite complicated and we will discuss algorithms to do them in class.
• displaying graphs: display(self, circular=False)
Make a picture of a [di]graph. This can be done as follows:
1. If circular is true, assign the vertices, in a random order, locations spread evenly around the unit circle of the xy-plane. Otherwise, assign them random locations in some portion of the plane. Keep the locations you have assigned the vertices in a dictionary.
2. Use matplotlib to display big dots at the locations where you have chosen to put the vertices; use plt.scatter(x,y,s=nice_size) to do this.
3. For digraphs, plot arrows from source to destination, between chosen vertex locations (from the dictionary) corresponding to each edge in the graph; see this page for how to do that.
4. For graphs, you could use the above code as well and simply set the head_width and head_length to zero, to make line segments instead of arrows.
• :
• Read Introduction to Graph Theory, pages 1 through 4, paying particular attention to the Exercises on pages 3 and 4. As your T&Q15 for today, look through those Exercises and see if you have an idea about how to answer all of them, some of them, or none of them, and say why.
• Continuing with the coding of Graph and Digraph from yesterday.
• :
• Read this Wikipedia page on planar graphs [which are graphs that can be drawn on the plane without any of the arcs used to draw edges crossing each other; skip anything where they use terminology that you don't understand — or, if it piques your curiosity, follow the links (as technical terms in such Wikipedia pages usually have) and see what the fuss is about] and submit T&Q16 responding to this reading.
• Continuing with the coding of Graph and Digraph from Tuesday. ...But here are some other things we will want to start coding:
• The description above of how to make a (far from ideal, but probably nevertheless fun to look at!) visual representation of a graph or digraph didn't suggest what to do with the edges from one vertex back to itself. There are two ways you might do that:
1. Simply make all of the dots corresponding to the vertices of one color, except for the vertices which have a little self-looping edge: make those of some other color. Then [particularly if there is a key which explains this convention] someone looking at the image would know how to interpret it [by mentally visualizing the extra looping edges at those differently colored vertices].
2. Use plt.Circle [that is, matplotlib.patches.Circle] to get a little circle in your graphical representation of your [di]graph.
• It is also a nice thing to use annotate to put the names of the vertices near the dots you have made to represent each one.
• Here is some example code from which you might get useful ideas, to do the self-looping edges and vertex labels:
import matplotlib.pyplot as plt
import math

def dots_around_the_circle(n, self_loop_locs):  # n here is the number of dots;
                                                # self_loop_locs is a list of
                                                # the indices of the self-loops
    x = []        # will be the x-coords of the vertices
    y = []        # will be the y-coords of the vertices
    label_x = []  # will be the x-coords of the labels
    label_y = []  # will be the y-coords of the labels
    labels = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # will be the label values; not a
                                           # good way to do this! will not work
                                           # if more than 26 vertices; use
                                           # the real labels in your code
    for i in range(n):
        x.append(math.cos(2*i*math.pi/n))
        label_x.append(1.2*math.cos(2*i*math.pi/n))
        y.append(math.sin(2*i*math.pi/n))
        label_y.append(1.2*math.sin(2*i*math.pi/n))
    ax = plt.gca()
    ax.set_xlim((-1.5, 1.5))
    ax.set_ylim((-1.5, 1.5))
    ax.set_aspect('equal')    # set aspect ratio, so circle will be round not oval
    plt.scatter(x, y, s=200)  # plot the vertices
    for i in range(n):
        ax.annotate(labels[i], (label_x[i], label_y[i]))  # label each vertex
        if i in self_loop_locs:  # plot the selected self-loop edges
            loop = plt.Circle([1.2*x[i], 1.2*y[i]], .2, color='b', fill=False)
            ax.add_patch(loop)   # the circle must be added to the axes to show
    plt.show()
Try running dots_around_the_circle(6,[1,3,4]), for example.
Note that instead of scaling up the points on the unit circle by multiplying them by 1.2, to get the centers of the small self-loops (which in turn have radii .2), when you use the method of random placement of vertices, you might want to pick some small, random displacement from the vertex location for the location of the center of the self-loop.
• Here's a new(ish) definition: Given a digraph $\Gg=(V,E)$, put the vertices in some order so they can be written as $V=\left\{v_1,\dots,v_n\right\}$. Then the adjacency matrix $M$ is the $n\times n$ matrix with entries $m_{ij}=\begin{cases} 1 & \text{if }(v_i,v_j)\in E\\ 0 & \text{otherwise} \end{cases}$ If $\Gg$ is a graph, then we need only change the term "$(v_i,v_j)\in E$" to "$\{v_i,v_j\}\in E$."
Note that the adjacency matrix of a graph will always be a symmetric matrix [meaning that $m_{ij}=m_{ji}\ \forall i,j$].
• You might ask yourself what matrix powers of the adjacency matrix tell us about a [di]graph....
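Here is one answer worth exploring: the $(i,j)$ entry of the $k^\text{th}$ power of the adjacency matrix counts the walks of length $k$ from $v_i$ to $v_j$. A small, self-contained illustration [the digraph here is just a made-up example]:

```python
# A small digraph on vertices 0..3 (a made-up example)
n = 4
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)]

# Build the adjacency matrix as a list of lists
M = [[0] * n for _ in range(n)]
for (i, j) in edges:
    M[i][j] = 1

def mat_mult(A, B):
    """Multiply two n x n matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# The (i,j) entry of M^k counts the walks of length k from v_i to v_j.
M2 = mat_mult(M, M)
print(M2[0][3])  # 1: the only length-2 walk from v_0 to v_3 is 0 -> 2 -> 3
```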
• Write code that initializes a graph or digraph from a matrix of 0s and 1s, resulting in a [di]graph with that matrix as its adjacency matrix. [This can be used to make random graphs by feeding this initialization code random matrices.]
• Write code that computes the adjacency matrix of a [di]graph.
• It was mentioned above that you could write code which would compute the degree of a vertex in a graph [or in- and out-degree of a vertex in a digraph]. It would be interesting to make a histogram of the degrees of all of the vertices of a graph.
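For example, the degrees and their histogram might be computed along these lines [the edge list here is a made-up example, and the usual convention that a self-loop adds 2 to its vertex's degree is assumed]:

```python
import matplotlib.pyplot as plt

# A made-up graph: edges are sets of endpoints, a one-element set
# being a self-loop, as in our Graph class
vertices = {"A", "B", "C", "D", "E"}
edges = [{"A", "B"}, {"B", "C"}, {"A", "C"}, {"C", "D"}, {"C"}]

# Degree of each vertex; by the usual convention a self-loop
# contributes 2 to the degree of its vertex
degree = {v: sum(2 if len(e) == 1 else 1 for e in edges if v in e)
          for v in vertices}

plt.hist(list(degree.values()), bins=range(max(degree.values()) + 2))
plt.xlabel("degree")
plt.ylabel("number of vertices")
plt.show()
```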
• Let's compute the distance between two vertices $v_1$ and $v_2$ in a [di]graph; that is, the smallest number of edges in a path from $v_1$ to $v_2$. Here is pseudocode to do this:
Input:
• The sets $V$ of vertices and $E$ of edges in a digraph $\Gg$.
• Two particular vertices $a,b\in V$.
Output:
• The number $i$ of edges in the shortest path from $a$ to $b$.
Algorithm:
• set $S$ [for "seen already"] to be the empty set
• set $i$ = 0
• set $C$ [for "current vertices"] to be the set $\{a\}$
• while $C\neq \emptyset$:
• $i$ += 1
• set $C^\prime$ [building up new version of $C$] to be the empty set
• for each $v$ in $C$:
• add $v$ to $S$ (if it's not already there)
• for all edges of the form $(v,v^\prime)\in E$:
• if $v^\prime$ == $b$, return $i$
• if $v^\prime\notin S$, add $v^\prime$ to $C^\prime$
• replace $C$ by $C^\prime$
• return $\infty$
Note that each time through the main while loop in this algorithm, the set $C$ consists of those vertices which are at distance exactly $i$ from $a$. Effectively, $C$ is the circle of radius $i$ centered at $a$.
It would be interesting to make a graph of the size of these sets $C$ versus $i$, for $i$ from zero to the size of the graph, and for different choices of the points $a$.
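Here is a direct Python translation of this pseudocode; a sketch only, assuming the digraph's edges are given as a list of (source, destination) tuples [and adding a check for the case $a=b$, which the pseudocode above does not treat]:

```python
def distance(edges, a, b):
    """Fewest edges in a path from a to b in a digraph whose edges are
    given as a list of (source, destination) tuples; follows the
    pseudocode above.  Returns float("inf") if b is unreachable."""
    if a == b:
        return 0
    seen = set()       # S in the pseudocode
    current = {a}      # C in the pseudocode
    i = 0
    while current:
        i += 1
        next_current = set()   # C' in the pseudocode
        for v in current:
            seen.add(v)
            for (src, dst) in edges:
                if src == v:
                    if dst == b:
                        return i
                    if dst not in seen:
                        next_current.add(dst)
        current = next_current
    return float("inf")

edges = [("A", "B"), ("B", "C"), ("C", "D")]
print(distance(edges, "A", "D"))  # 3
print(distance(edges, "D", "A"))  # inf
```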
• :
• No in-person class meeting.
• Spend your time working on Major Project 1:
• You have been assigned an algorithm.
• Please research that algorithm by searching on the Internet, getting references from the library, and talking to your instructor.
• Your study and work on this algorithm should culminate in three work products:
1. A report, written up in LATEX, talking about your algorithm. This should include at least the following information:
1. A discussion (as good English text, not just pseudocode or something like that) of the key ideas which make the algorithm work.
2. A clear description of the algorithm in pseudocode.
3. A discussion of the strengths and weaknesses of this algorithm, comparing to other algorithms we've studied or you have read about. [Always explaining "why" for everything you assert.]
4. A discussion of the cases where your algorithm applies and does not apply. [And why, of course.]
5. A discussion of the time and space complexity of the algorithm. Give average and worst-case complexities, and examples of the kind of thing which leads to the worst-case performance. [And why....]
2. A carefully written and tested bit of Python code which does the following:
1. Implements the algorithm.
2. Tests your implementation with some toy data.
3. Generates enough toy data to display an empirical time-complexity graph [which graph should be well labeled].
4. Is clearly written, with comments and suggestive variable names.
5. Is able to stand on its own, in the sense that your instructor should be able to cut-and-paste it into a Python environment and get the toy data to run, generate the complexity plot, and even try the algorithm on some new toy data at will.
3. Give a 10-15 minute presentation to the class of your findings. This presentation might simply consist of projecting your report and your [running] Python code, walking through both to talk about the most interesting things, the key points of the ideas, and any key issues or interesting discoveries you made while writing the code.
• [It might actually make sense to make a more succinct, presentation-style document to project, instead of your report. You might consider using the Beamer style for LATEX to do this; it would be a good thing to learn how to do, since it is the standard method of doing serious scientific presentations in mathematics, physics, and computer science. If you use the Overleaf online LATEX tool, look for the option to make a "presentation," one of the formats for which will be Beamer.
But note that, since we first talked about it in class, Overleaf seems to have started to require registration. You can register for a free, personal account, though, and it probably makes sense to do so, if you do not have your own LATEX installation, with which you are comfortable, on your personal device.]

#### Week 8

• :
• :
• Individual consultations on the status of students' work for Major Project 1.
• :
• No in-person class meeting.
• Working on Major Project 1. If you are feeling it is in really good shape and/or you need a break from that topic, please work on implementing more of the graph structures and algorithms we discussed last Tuesday and Thursday.
• :
• No in-person class meeting.
• Working on Major Project 1 and, if time permits, the graph structures and algorithms we discussed last Tuesday and Thursday.
• :

#### Week 9

• :
• :
• Working time to catch up coding the graph data structures and algorithms from previous classes. To remind you, we want code which implements the following:
• __init__(self, vertices=set(), edges=[])
• __contains__(self, item)
• display(self, circular=False)
• a way to initialize a [di]graph from an adjacency matrix
• distance(self, a, b) (to compute the number of edges in the shortest path from a to b)
• :
• Review of the algorithm to find the shortest path, in the sense of path with the fewest edges, between two vertices in a graph.
• Discussion of Dijkstra's Algorithm, which finds the shortest path, in the sense of minimum total of the edge lengths, in a graph where each edge is labeled with a positive real number referred to as its length.
• :
• Work on the graph methods we've been discussing (see a list of the methods we need to implement above in this page, at the entry for Tuesday this week). In addition, start building an implementation of Dijkstra's algorithm, as follows:
• Make new versions of the Graph and Digraph classes with (positive!) real number weights on the edges, call them WtGraph and WtDigraph. These should have the same methods as the unweighted versions. The difference will be in the edges. In both WtGraphs and WtDigraphs, we should make edges be tuples [or you could use lists, it doesn't make that much difference], the first element of which will be a set, for a WtGraph, or a tuple, for a WtDigraph — i.e., what the edges in Graphs and Digraphs were — and the second element of which will be a positive real number.
• The above new kind of edges has consequences for the add_edge and __contains__ methods and, indirectly, the display method (which could use the approach above from Thursday, two weeks ago, to display the weights near the line segments or loops showing edges).
• The adjacency matrix in a weighted graph generally has the weights as the values, rather than 1s and 0s, in the matrix locations corresponding to edges. This will have implications for your method adjacency and the method you wrote to build a [di]graph from an adjacency matrix.
• Finally, use Dijkstra's algorithm to make your method distance for WtGraphs and WtDigraphs. Follow the pseudocode from this page — if you are ambitious, you could add a method that shows the actual shortest path, which that pseudocode explains how to do.
• Python coding note: In Dijkstra's algorithm, following the pseudocode linked to above, it would be convenient to have a value of infinity to use as a variable value. It turns out that Python does have this capability: x=float("inf") gives the variable x a value which behaves like $\infty$ in many situations. For example, the following Booleans are all true for the x:
x == 2*x
1234567890 < x
x == x + 1
-x < -123123123123
This can make for much cleaner code than using, say, None as an estimated distance value when none is yet known.
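As a preview of how this plays out in Dijkstra's algorithm, here is a minimal sketch using float("inf") for "no estimate yet" [this is not the pseudocode from the linked page; for simplicity it assumes a weighted digraph given as a list of (source, destination, weight) triples rather than our WtDigraph edge format]:

```python
def dijkstra_distance(vertices, edges, a, b):
    """Smallest total weight of a path from a to b, where edges are
    (source, destination, weight) triples with positive weights.
    A sketch of Dijkstra's algorithm using float("inf") for
    'no distance estimate yet'."""
    dist = {v: float("inf") for v in vertices}
    dist[a] = 0
    unvisited = set(vertices)
    while unvisited:
        # visit the unvisited vertex with the smallest tentative distance
        v = min(unvisited, key=lambda u: dist[u])
        if dist[v] == float("inf"):
            break                   # remaining vertices are unreachable
        unvisited.remove(v)
        for (src, dst, w) in edges:
            if src == v and dst in unvisited:
                dist[dst] = min(dist[dst], dist[v] + w)
    return dist[b]

vertices = {"A", "B", "C", "D"}
edges = [("A", "B", 1), ("B", "C", 2), ("A", "C", 5), ("C", "D", 1)]
print(dijkstra_distance(vertices, edges, "A", "D"))  # 4, via A-B-C-D
```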
• :
• Talked about some basic ideas of dynamic programming, which is an algorithmic strategy that relies upon
• memoization: the process of saving, rather than recomputing many times on different branches of a recursion tree, intermediate calculations; the archetypical example of this is how it would be better to store values Fib(k) of the Fibonacci sequence for small $k$ rather than recomputing them every time down many recursion branches when using the defining formula Fib(n) = Fib(n-1) + Fib(n-2).
• optimal subproblems: this happens when the large problem is some sort of optimization task, and it has some specific relation (not simply equivalence in both directions, though, usually) with optimization of smaller problems; length-minimizing paths as in Dijkstra's algorithm are examples of this situation.
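The Fibonacci example of memoization can be written in just a few lines; here is one sketch, storing already-computed values in a dictionary:

```python
def fib(n, memo=None):
    """Fibonacci with memoization: each Fib(k) is computed once, stored
    in memo, and simply looked up on every later request, so the
    exponential recursion tree collapses to linearly many calls."""
    if memo is None:
        memo = {0: 0, 1: 1}
    if n not in memo:
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]

print(fib(40))  # 102334155, instantly; the naive recursion takes ages
```

[Python's standard library can also do this bookkeeping for you: decorating a recursive function with functools.lru_cache memoizes it automatically.]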
• Today [Friday] is the last day to withdraw (with a W) from classes.

#### Week 10

• :
• Work on HW7, consisting of writing [and testing] code to do the following things for graphs, digraphs, weighted graphs, and weighted digraphs:
1. __init__(self, vertices=set(), edges=[]) — our vertices will be strings, each edge will be
• a set with one element — for a looping edge from that vertex to itself in a graph
• a set with two elements — for a non-directed edge in a graph
• a list with one element — for a looping edge from that vertex to itself in a digraph
• a list with two elements — for a directed edge in a digraph
• OR, in the weighted versions of graph and digraph, a list consisting of one of the above as the first element and a positive real number as the second element
2. add_vertex(self, v) — add a new (string) vertex to the [di]graph
3. remove_vertex(self, v) — check if the proposed vertex to be removed is used in any edges in the [di]graph and throw an error if so
4. add_edge(self, e) — you might set this up for the weighted versions so that if a weight is not specified, it is defaulted to the value 1; always check if proposed edges used in this method are between vertices which are already in this [di]graph and throw an error if not
5. remove_edge(self, e) — for weighted [di]graphs, the edge to be removed here should be specified just by its terminating vertices [its endpoints], no need to specify the weight
6. set_weight(self, e, w) — sets the weight to w in a weighted [di]graph for an edge e specified merely by its terminating vertices [endpoints]
7. get_weight(self, e) — gets the weight in a weighted [di]graph for an edge e specified merely by its terminating vertices [endpoints]
8. __iadd__(self, item) — should check the type of the item and do either add_vertex or add_edge, as appropriate
9. __contains__(self, item) — should check the type of the item and then search either the vertices or the edges for that item. In weighted graphs and digraphs, you probably want to search on just the non-weight part of the edge
10. display(self, circular=True) — a fancy version with weighted graphs might show the weights near the line segment or arrow for the edge
11. get_adjacency(self) — compute the adjacency matrix of a [di]graph; for weighted graphs and digraphs, the location m[i,j] of the adjacency matrix m should tell the weight of that edge (and not merely be the value 1 for an edge).
12. set_adjacency(self, m) — first test if the [di]graph is empty and throw an error if not; also throw an error if the input matrix m is not symmetric and you are trying to make a graph; the versions of this for weighted [di]graphs should take the values for the weights out of the matrix.
13. distance(self, a, b) — compute the number of edges in the shortest path from a to b using the algorithm described a couple of weeks ago on a Thursday, in the case of unweighted [di]graphs, and compute the smallest total weight along a path from a to b using Dijkstra's Algorithm using the pseudocode from this page, in the case of weighted graphs.
14. rand_adj_symmat(n, p=.5, allow_diagonals=True) — generate an $n\times n$ symmetric 0/1 matrix, usable as the adjacency matrix of a graph, in which each entry is 1 with probability p; allow_diagonals tells whether 1s are allowed on the diagonal
15. rand_adj_mat(n, p=.5, allow_diagonals=True) — generate an $n\times n$ 0/1 matrix in the same way, but not necessarily (not likely, in fact) symmetric, so usable as the adjacency matrix of a digraph
16. Extra credit:
• dijkstra(self, a, b) — for weighted [di]graphs, returns a list containing first the value of self.distance(a,b) and next a list of the vertices visited, in order, along the shortest path found by Dijkstra's algorithm [the pseudocode here actually includes what you need to do this]
• shortest_path(self, a, b) — for unweighted [di]graphs, returns a list containing first the value of self.distance(a,b) and next a list of the vertices visited, in order, along a path which realizes this distance [you can use a modification of the approach from the Dijkstra's algorithm path recording, above, to do this]
• Python coding notes:
1. Don't forget to consider using the value $\infty$ for Python variables, as described in the Python coding note from last Thursday.
2. When debugging graphics code using matplotlib, it can be annoying that the figure persists and new things get added on top of old ones. To clear the current figure, you can type plt.clf(). Then you can do other plotting commands, followed by a plt.show(), and you will not see anything left over from before the clf().
3. Another thing to do when debugging code with plotting is to use the command plt.figure(), which creates a whole new plotting panel in which future plotting output will appear. This is useful to compare two plots.
4. One last plotting technique is to call the show command with block=False, or, in full plt.show(block=False). This allows you to go back to Python to do more work without first killing the plot window.
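The random-adjacency-matrix generators [items 14 and 15 above] might be sketched like this; one possible implementation, using only the standard random module:

```python
import random

def rand_adj_symmat(n, p=.5, allow_diagonals=True):
    """Random symmetric n x n 0/1 matrix: a possible adjacency matrix
    for a graph.  Each entry on or above the diagonal is 1 with
    probability p and mirrored below; diagonal entries (self-loops)
    are filled in only if allow_diagonals is true."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if i == j and not allow_diagonals:
                continue
            m[i][j] = m[j][i] = 1 if random.random() < p else 0
    return m
```

Dropping the mirroring step `m[j][i] = ...` and letting j range over all of range(n) would give the not-necessarily-symmetric rand_adj_mat for digraphs.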
• :
• No in-person class meeting.
• Keep working on HW7.
• :
• Working together to finish code and debugging for HW7. [Did you know that the word "debug" was popularized by Grace Hopper?]
• HW7 is due. Send all of your code — please make it easy to read and thoroughly tested — along with some test cases that show it works well, to your instructor by email.
• :
• Please implement the following (before class!):
• is_connected() for [di]graph classes, which returns True or False depending upon whether you can get from any node to any other by some path
• degree(v) for graph (not digraph) classes, which returns the degree (the number of edges connecting to that vertex v)
• in_degree(v) and out_degree(v) for digraph classes, which return the number of edges terminating or originating, respectively, at that vertex v
• Definition: An Eulerian circuit is a closed path which includes every edge in a [di]graph exactly once.
• Naïve approaches to finding Eulerian circuits.
• Python coding note: Suppose you have a method which is conceptually associated with some class, but doesn't need a particular instance of the class to be useful. For example, we had a method rand_adj_mat which generates a random matrix that can be used as the adjacency matrix of a graph — it is naturally associated with the classes we made for graphs, and in fact should be slightly different for Graph and Digraph, since in the former case the matrices created should be symmetric, while they need not be in the latter. You can achieve this by putting the special Python decorator @staticmethod on a line before the def methodname line, at the same level of indentation, inside the class declaration. Then, even though the method is defined inside that class, it has no self argument. You would then call the method without using an instance of the class, instead doing something like class_name.method_name(arguments).
For example, suppose our graph class definition included
class Graph:
    ...
    @staticmethod
    def rand_adj_mat(n, p=.5, allow_diagonals=True):
        ...
Then you would use this with something like
gr = Graph()
# ... stuff with gr ...
# now to make an adjacency matrix:
m = Graph.rand_adj_mat(5)
# note that rand_adj_mat was called with Graph and not gr;
# you could even have done that before gr was defined:
#   no specific instance of Graph need exist before
#   using rand_adj_mat, since it is a staticmethod
• :
• Announcements:
• There is an exploration project to do for today, described immediately below, which is (a large) part of the homework which will be due on Monday.
• It's time to start thinking about Major Project 2! This will be quite similar in general structure to Major Project 1, except the topic will be graph algorithms rather than MP1's searching algorithms. More information will be on this page this weekend. MP2 will be due a week from Monday, i.e., ten days from now.
• No in-person class meeting.
• Definition: In a [di]graph, a path which includes each edge exactly once is called an Eulerian path or, if it returns to the same vertex where it started, an Eulerian ciruit.
• So, the question is: when do [di]graphs have Eulerian paths/circuits and how do we find them when they exist? Obviously, this is only on the honor system, but please don't research this [e.g., on the Internet] until you have done the explorations below.
• Let's think about a greedy algorithm to find an Eulerian path. Recall how greedy algorithms usually work: solve a problem step by step, at each step doing the thing which seems best in the most immediate, local sense, hoping that this will magically be the best thing globally as well. This works in rather rare, special situations ... but sometimes, for very difficult problems (e.g., ones which have no particularly efficient solutions), it is the best one can do.
• Greedy algorithms usually have some sort of objective function which is what is optimized locally at each step, in the hopes of producing a globally optimal value. For the Eulerian circuit problem, there is no obvious objective function, but at least we can choose at each step not to do an obviously wrong thing. So here is one possible greedy approach, in pseudocode:
Input:
• A graph $\Gg=(V,E)$.
Output:
• An Eulerian path $P$ for $\Gg$.
Naïve algorithm:
1. Pick a random starting vertex $v\in V$.
2. Set $P=[\ ]$.
3. Loop as long as possible doing:
1. Pick any edge $e$ starting at $v$ for which $e$ is not in $P$. If none exists, exit the loop.
2. Append $e$ to $P$.
3. Set $v \leftarrow w$, where $w$ is the other end of the edge $e$.
4. If $P$ contains every edge in $E$, output $P$. Otherwise, admit failure.
This is greedy mostly in the sense that we aren't picking an edge, at 3a, which has already been used: we're not locally optimizing an objective function, we're simply not doing a stupid thing of making a local choice which would ruin the Eulerian-ness.
• Here is some code we wrote together in class yesterday implementing this algorithm [it should go in the class Graph], starting at some chosen vertex $v$:
def greedy_euler(self, v):
    path = []
    while True:
        successful = False
        for e in self.edges:
            if v in e:
                if e in path:
                    continue
                if len(e) == 1:       # a self-loop: stay at v
                    w = v
                else:                 # find the other endpoint of e
                    list_version = list(e)
                    for a in list_version:
                        if a != v:
                            w = a
                            break
                path.append(e)
                v = w
                successful = True
        if not successful:
            break
    if len(path) != len(self.edges):
        return ("Failed", path)
    return path
• Please use this [probably tinkering with it as seems appropriate] to explore. In particular, make lots of examples, and try to answer the questions:
• Does this algorithm work?
• Always?
• Sometimes?
• When does it work? Give conditions on $\Gg$ and $v$ which make it work.
• Can you improve the above algorithm — maybe add another condition in the loop? loop in a different way? If so, give new answers to the above exploration questions.
• [Think of this exploration as being what Euler would have done if he had a laptop and Python, back in 1730.]

#### Week 11

• :
• HW8 is due, consisting of the following:
1. Implement the is_connected method mentioned this past Thursday. Note that this takes no arguments (except for self, as it is inside the class definition): it should be a property of the whole graph and not just of two particular vertices. Make two versions of this code:
1. One version which simply loops through all pairs of vertices in the graph and computes the distance between them, and says the graph is connected if all those distances are finite, while it says the graph is not connected if one (or more) of the distances is inf.
• The algorithmic approach where you simply try all possibilities to see if something is true or to solve some problem is called brute force.
For example, if you try to break into someone's account by trying all possible passwords, that would be a brute force attack.
This approach to is_connected() uses a brute force approach. Its time complexity is pretty bad, since it involves doing $n\cdot(n-1)$ distance computations, each of which is quite complex.
2. A slightly faster method was suggested by Joey in class: pick one vertex, call it $v$, and loop through all the others, seeing if the distances to them are all finite. If so, that would mean you could get from any vertex $a$ to any vertex $b$, by going from $a$ to $v$ and then from $v$ to $b$ [reversing the path from $b$ to $v$]. The time complexity here is still pretty serious, doing $n-1$ fairly expensive distance computations.
3. The second required version should be more intelligent and much faster... using some approach you dream up yourself! [But, hint: you can use quite a bit of the structure of your distance method.]
Explain the idea of your more efficient approach, give pseudocode, actual code, and some examples of it running correctly. Discuss its time complexity. [Send this stuff to your instructor.]
2. Implement the degree(v), in_degree(v), and out_degree(v) methods mentioned this past Thursday. Send the code and some examples of it running correctly to your instructor.
3. Do the exploration suggested above, last Friday, at the end. Send to your instructor any code you use, examples which guided your exploration, any new code you write, any conclusions you come to about which graphs have Eulerian paths and how to find them.
One thing that might help your exploration would be to make a version of your display method in the [di]graph class which takes a path [a list of edges] as an optional argument. If that argument is present, it should do the usual circular graph display, and then over-write the edges from the path in another color. E.g.: this bit of Python3 dialog
>>> from graph import *
>>> m
array([[0, 1, 0, 1, 0],
       [1, 0, 1, 0, 0],
       [0, 1, 1, 0, 1],
       [1, 0, 0, 0, 1],
       [0, 0, 1, 1, 1]])
>>> g=Graph()
>>> g.display([{"A","D"},{"D","E"},{"E"},{"E","C"}])
would make the image

Note that the version of display which made the image here used two extra wrinkles:
1. an extra argument color='m' — m stands for "magenta" — in the ax.arrow command that draws the edges, and an ax.annotate command which adds the index in the path of each edge, when drawing in the path
2. the block of code
ax = plt.gca()
ax.set_xlim((-1.6,1.6))
plt.setp(ax.get_xticklabels(), visible=False)
ax.xaxis.set_tick_params(size=0)
ax.set_ylim((-1.6,1.6))
plt.setp(ax.get_yticklabels(), visible=False)
ax.yaxis.set_tick_params(size=0)
ax.set_aspect('equal')
to set up the axes without tick marks or numbers [because they are irrelevant to the graph information] and the aspect ratio
If you get stuck on any part of this assignment, don't hesitate to email your instructor for help or to ask for bits of useful Python code.
• In class, we discussed some basics of matrices and vectors, including definitions and basic operations including
• matrix multiplication
• multiplication of a matrix and a vector
• the transpose of a matrix
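For concreteness, these operations on tiny examples [pure Python, with matrices as lists of lists]:

```python
A = [[1, 2],
     [3, 4]]
v = [5, 6]

# matrix times vector: (A v)_i = sum_j A[i][j] v[j]
Av = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
print(Av)   # [17, 39]

# transpose: (A^T)[i][j] = A[j][i]
At = [[A[j][i] for j in range(2)] for i in range(2)]
print(At)   # [[1, 3], [2, 4]]

# matrix multiplication: (A B)[i][j] = sum_k A[i][k] B[k][j]
AAt = [[sum(A[i][k] * At[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]
print(AAt)  # [[5, 11], [11, 25]] -- note that A A^T is symmetric
```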
• :
• Some content we discussed:
• eigenvalue and eigenvector of a matrix
• importance of the dimension of the set of eigenvectors with a particular eigenvalue
• The set of vectors consisting of all eigenvectors for one particular eigenvalue $\lambda$, together with also the zero vector $\vec{0}=\left(\begin{matrix}0\\\vdots\\0\end{matrix}\right)$, is called the eigenspace for eigenvalue $\lambda$.
• The eigenspace for the eigenvalue $0$, if $0$ happens to be an eigenvalue, is also called the nullspace of the matrix.
• Fact: $n\times n$ symmetric matrices [remember: that means a matrix $A$ which equals its transpose $A^T$; also, the adjacency matrices of graphs — not digraphs! — are always symmetric] have nice properties with respect to eigenstuff:
• They have $n$ real eigenvalues (well, when you count each of them with its multiplicity, which is the dimension of the corresponding eigenspace).
• Eigenvectors corresponding to distinct eigenvalues are always orthogonal [which is just the fancy way of saying perpendicular].
• The degree matrix of a graph with $n$ vertices is the $n\times n$ matrix which has zeros everywhere except on the diagonal and whose $i^\text{th}$ diagonal entry is the degree of the $i^\text{th}$ vertex.
• E.g., the degree matrix of this graph is $$D=\begin{pmatrix}2&0&0&0&0\\0&2&0&0&0\\0&0&4&0&0\\0&0&0&2&0\\0&0&0&0&4\end{pmatrix}$$ (numbering the vertices in alphabetic order).
• The Laplacian matrix of a graph is the difference $L=D-A$, where $D$ is the degree matrix and $A$ is the adjacency matrix of the graph.
• E.g., the Laplacian matrix of the graph above on Monday this week is $$L=\begin{pmatrix}2&-1&0&-1&0\\-1&2&-1&0&0\\0&-1&2&0&-1\\-1&0&0&2&-1\\0&0&-1&-1&2\end{pmatrix}$$ (numbering the vertices in alphabetic order).
• We did a(n extremely) small amount of exploration, and stumbled upon this conjecture: The dimension of the eigenspace corresponding to the eigenvalue $0$ of the Laplacian matrix of a graph tells us the number of connected components of the graph.
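One way to test this conjecture without numerical eigenvalue software: the dimension of the eigenspace for eigenvalue 0 (the nullspace) is $n$ minus the rank of $L$, and the rank can be found by exact Gaussian elimination. A sketch [the graph in the example, a triangle plus a separate edge, is made up]:

```python
from fractions import Fraction

def nullspace_dim(M):
    """Dimension of the nullspace of a square integer matrix, via
    exact Gaussian elimination: it is n minus the rank of M."""
    m = [[Fraction(x) for x in row] for row in M]
    n = len(m)
    rank = 0
    for col in range(n):
        # find a pivot row for this column, if any
        pivot = next((r for r in range(rank, n) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # clear this column in every other row
        for r in range(n):
            if r != rank and m[r][col] != 0:
                f = m[r][col] / m[rank][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return n - rank

# Laplacian of a graph with two components: a triangle on vertices
# 0,1,2 and a single edge between vertices 3 and 4
L = [[ 2, -1, -1,  0,  0],
     [-1,  2, -1,  0,  0],
     [-1, -1,  2,  0,  0],
     [ 0,  0,  0,  1, -1],
     [ 0,  0,  0, -1,  1]]
print(nullspace_dim(L))  # 2, matching the two connected components
```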
• :
• Networks were in today's Math with Bad Drawings:
• We discussed Major Project 2 and problems were assigned! The three projects which were assigned (in some random permutation) to students were:
1. The traveling salesperson problem: find the tour of $n$ cities which minimizes the total travel distance, where the distance is known for all pairs of cities. It is known that this is a very hard problem, but please code both
• A brute-force approach which tries all possibilities — which will be impractical for collections of even a moderate number of cities; and
• some greedy heuristic approach, to see if it is much better, and when it works.
Here are some resources which are useful in understanding this problem and algorithms to solve it:
• this is a basic place to start
• here is one which outlines also a dynamic programming approach
• here is some real-world data to test your code on!
2. The problem of finding a minimal spanning tree in a given graph, where we are using the definitions:
• A tree is a graph which is connected and has no cycles, i.e., no closed, non-trivial paths.
• A spanning tree is a subgraph of a given graph which is a tree and which includes all of the vertices of the original graph; i.e., it throws away enough edges to get rid of all cycles, but remains connected and touching all vertices.
Here are some resources which are useful in understanding this problem and algorithms to solve it: ...or find your own!
3. The problem of ordering the vertices of a (possibly weighted) graph by some prioritization scheme. Two to include for sure are:
• The Google PageRank algorithm, which gives each vertex some kind of prestige by giving it parts of the prestige of the other vertices which link to it. This is used to present to searchers the most prestigious page among the thousands (usually) which match their search terms. The success of this algorithm is part of why Alphabet, the parent company of Google, is worth as much as it is.
• Betweenness centrality measures how much vertices would act as conduits for flows of information or resources if the graph were a network of social relationships between people or organizations (e.g., finding key individuals in a terrorist cell when their phone records are known).
• At least one approach to PageRank (and not the one whose Python code is in the Wikipedia page!)
• Computation of the betweenness centrality of vertices — feel free to use a brute-force approach which will be quite slow on large graphs, or, if you like, code a more complex algorithm of your own devising or one you find on the 'net.
Here are some resources which are useful in understanding this problem and algorithms to solve it:
• this is a place to start for PageRank while this is good for betweenness centrality
• this has some high-level descriptions for centrality
• while this scientific paper describes a faster algorithm for betweenness centrality
• this is another nice description of PageRank
• while the original paper on PageRank is quite accessible.
Whichever algorithm you work on, the format of Major Project 2 is quite similar to MP1:
1. you should write up (in LaTeX) a description of the problem, issues of efficiency, and the solution(s) you will be coding
2. you should code at least two approaches to solving the problem (see the problem descriptions, above, for some specifics for two of the three problems)
3. you should prepare a 10-15 minute presentation on your problem, its solution, and showing (some of) your code in action
Please email your write-up (LaTeX and PDF), code, and slides (LaTeX and PDF) to your instructor on the due date, next Monday.
• :
• In-class time devoted to working on and consulting with each other and instructor about your Major Project 2.
• :
• In-class time devoted to working on and consulting with each other and instructor about your Major Project 2.

#### Week 12

• :
• Group work, discussion, debugging, feedback to improve student work on Major Project 2.
• :
• More group work, discussion, debugging, feedback to improve student work on Major Project 2.
• :
• Major Project 2 is due. Send to your instructor the parts listed in last Wednesday's entry, above.
• Student presentations for Major Project 2 will happen during class time.
• :
• No in-person class meeting.
• :
• Starting a unit on cryptology! Basic terms we mentioned (definitions in the reading for Monday) were:
• encryption
• decryption
• cryptosystem
• key
• cleartext
• ciphertext
• keyspace
• We discussed one particular cryptosystem, the Cæsar Cipher, which works by moving every letter in the cleartext 3 letters down the alphabet (if you are Julius Cæsar; if you are Augustus Cæsar, you move 25 letters down the alphabet).
• Here is some code to make a version clean_string consisting of all, and only, the letters, put in lower case, taken from the characters from an original string input_string:
clean_string = ""
for c in input_string.lower():
    if c.isalpha():
        clean_string += c
This may be useful in implementing cryptosystems in Python.
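Building on that cleaning loop, the Cæsar shift itself takes only a few more lines. Here is a minimal sketch (function and variable names are illustrative):

```python
# Minimal sketch of the Caesar cipher: clean the string as above, then
# shift each letter `key` places down the alphabet, wrapping around at 'z'.
def caesar(input_string, key):
    clean_string = ""
    for c in input_string.lower():
        if c.isalpha():
            clean_string += c
    result = ""
    for c in clean_string:
        # map 'a'..'z' to 0..25, shift mod 26, map back to a letter
        result += chr((ord(c) - ord('a') + key) % 26 + ord('a'))
    return result
```

With key 3 this reproduces Julius Cæsar's cipher; with key 25 it reproduces Augustus's.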

#### Week 13

• :
• Read the introduction to Chapter 4 and §§4.1 & 4.2, to be found starting here, and submit T&Q20 responding to this reading.
• HW9 is due, consisting of the following:
1. A function caesar(string, key) which returns the Cæsar encryption of the given string using the given key (in the rôle of the 3 and the 25 that Julius and Augustus Cæsar used). Note you should first remove the non-letter characters from the string and make it entirely lower case.
2. A function we will use in class on Monday to build an automated Cæsar decrypter: let_freqs(filename). This should
• make an array counts of size 26 filled with zeros
• open the file filename for reading
• process the open file line by line (see example code above for how to do this)
• for each line, process each character
• for each character, skip it if it is not a letter
• convert it to lower case if necessary
• convert it to a number by 'a'=0, 'b'=1, etc.
• increase by one the value in the counts array at the location given by that character number.
• when done processing the file, total up all the values in the counts array
• return an array of floats containing the values in the counts array, each divided by the total
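The steps above can be sketched as follows (a rough implementation, not necessarily the exact one expected for credit):

```python
# Sketch of let_freqs following the steps above: returns the relative
# frequency of each letter 'a'..'z' in the named file.
def let_freqs(filename):
    counts = [0] * 26                  # one slot per letter, all zero
    with open(filename) as f:
        for line in f:                 # process the file line by line
            for c in line:
                if c.isalpha():        # skip anything that is not a letter
                    # lower-case it and convert 'a'=0, 'b'=1, ...
                    counts[ord(c.lower()) - ord('a')] += 1
    total = sum(counts)
    return [count / total for count in counts]
```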
• We discussed cracking the Cæsar cipher, which goes like this:
• Start by computing E, the letter frequency distribution of standard English, by using the code from HW9 on some large piece of English text.
• Input: c, a ciphertext which was Cæsar-encrypted with an unknown key
• Do the following:
1. generate all possible decrypted plaintexts p by using all possible keys k
2. for each, compute its letter frequency distribution f
3. compute the distance d between f and E
4. report the key k and plaintext p which have the smallest d
• Output: that k and p.
• The last piece needed to make the above Cæsar cracker work is a way of computing the distance between two frequency distributions. One easy way is to use the square of the Euclidean distance in 26-dimensional space. That is, for lists $a=[a_0,\dots,a_{25}]$ and $b=[b_0,\dots,b_{25}]$, use the (squared) distance $$\sum_{i=0}^{25} \left(a_i-b_i\right)^2\ .$$
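Putting the pieces together, the whole cracker might look like this (a sketch: english_freqs stands for the distribution E computed from a large English text, and small helper functions are inlined rather than imported from HW9):

```python
# Sketch of the automated Caesar cracker described above.  Assumes the
# ciphertext is already lower-case letters only.

def shift(text, key):
    """Caesar-shift each letter of `text` by `key` places (key may be negative)."""
    return "".join(chr((ord(c) - ord('a') + key) % 26 + ord('a')) for c in text)

def freqs(text):
    """Relative frequency of each letter 'a'..'z' in `text`."""
    counts = [0] * 26
    for c in text:
        counts[ord(c) - ord('a')] += 1
    return [n / len(text) for n in counts]

def crack_caesar(ciphertext, english_freqs):
    """Try all 26 keys; return the (key, plaintext) pair closest to English."""
    best = None
    for k in range(26):
        p = shift(ciphertext, -k)      # decrypting is just shifting back
        f = freqs(p)
        # squared Euclidean distance in 26-dimensional space
        d = sum((fi - ei) ** 2 for fi, ei in zip(f, english_freqs))
        if best is None or d < best[0]:
            best = (d, k, p)
    return best[1], best[2]
```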
• We also discussed the Vigenère cryptosystem, which is basically just $n$ copies of Cæsar used in parallel. That is, given a key which is a list $k=[k_0,\dots,k_{n-1}]$ of length $n$, we do Cæsar encryption of the first letter of the plaintext with key $k_0$, the second with key $k_1$, etc., moving down the plaintext and the key together. When we run out of key, we simply start over at the beginning of the key and keep going down key and plaintext together again.
• How to break the Vigenère cryptosystem? Well, just break up the ciphertext into $n$ pieces, consisting of every $n^{th}$ letter starting with the first, or the second, or so on, and do automated Cæsar cracking on each piece on its own.
The problem with this is that one doesn't know what $n$ to use! So in practice, one loops through the values $n=1$ then $2$, $3$, etc., each time trying to crack as just described.
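A minimal sketch of Vigenère encryption under these conventions (plaintext assumed already cleaned to lower-case letters; the key given as a list of shift amounts):

```python
# Sketch of Vigenere encryption as parallel Caesar ciphers: letter i of
# the (cleaned, lower-case) plaintext is shifted by key[i % len(key)],
# so the key wraps around automatically when the plaintext is longer.
def vigenere(plaintext, key):
    """key: a list of shift amounts, e.g. [3, 1, 4]."""
    result = ""
    for i, c in enumerate(plaintext):
        k = key[i % len(key)]
        result += chr((ord(c) - ord('a') + k) % 26 + ord('a'))
    return result
```

For cracking, note that the "every $n^{th}$ letter" pieces are just the slices `ciphertext[0::n]`, `ciphertext[1::n]`, ..., each of which is an ordinary Cæsar-encrypted stream.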
• The large-key limit of this process is when the key size is exactly the same size as the message, in which case we call it a one-time pad. One-time pads are annoying to use because you need to share a lot of key digits with your communications partner -- and you must never use the same bit of pad again (hence the name) or else you run the risk of both messages being easily cracked -- but one-time pads are completely secure! So sometimes it is worth it.
• :
• Skim this page [skim only! it is way too long and complicated for close reading at this point!] and submit T&Q21 responding to this reading.
• We discussed some background and basic examples of (actual, currently used) symmetric cryptosystems.
• Mentioned how often such cryptosystems deal with chunks of plaintext at a time, so are called block ciphers.
• When working with a block cipher, if the message size is not a perfect multiple of the block size, we often have to pad the last chunk of message to make a full block -- this has some surprising intricacies which need to be carefully addressed if the padding is not to decrease the security of the cryptosystem.
• In addition, it makes sense for the encryption of the separate blocks not to be entirely independent of each other. (Encrypting each block completely independently is called Electronic Codebook Mode [ECB].) So instead one tends to cut pieces of blocks and combine them in various ways with following blocks, so that the encryption will be interconnected; this is called block chaining.
• Mentioned also how having the same plaintext always encrypt to the same ciphertext can be a security problem, so one tends to put a little bit of randomness — called cryptographic salt — into every encryption. For example, one can start the block chaining with a bit of randomness called the initialization vector.
• Discussed some of the history of symmetric crypto, from the first official, US-government recognized one DES (Data Encryption Standard), through the modest improvement of 3DES ("triple DES"), to the current AES (Advanced Encryption Standard). Others which are considered pretty good include Blowfish and RC2.
• Apparently, students can use pip to install a package called pycryptodomex, which provides a module called Cryptodome with implementations of several symmetric cryptosystems. See this page for installation hints if you get stuck. In fact, that whole site has lots of useful documentation.
• Please install that package and encrypt a message of your choosing (maybe ask a question, so you can get a reply) to your instructor using AES. The mode you should use is ECB, and once you import AES from the Cryptodome package, you make an AES object by
from Cryptodome.Cipher import AES
k = b'asdfjkl;asdfjkl;asdfjkl;asdfjkl;'
obj = AES.new(k, AES.MODE_ECB)
ciphertext = obj.encrypt(plaintext)
Yes, please use that stupid key b'asdfjkl;asdfjkl;asdfjkl;asdfjkl;' when you encrypt — that way, your instructor will be able to decrypt the message.
Note the hardest parts of doing this may be to get the plaintext into the form of a byte string (something that looks like that key), have its length be a multiple of the AES block size of 16 bytes (the 32-byte key makes this AES-256, but the block size is still 16 bytes), and get the ciphertext into an email to your instructor. (For that last part, you might open a file, do a raw write of the bytes to the file, and then send the file as an attachment to your instructor.)
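One way to handle the block-size requirement is a small PKCS#7-style padding helper, sketched here in pure Python (this is for getting the exercise working, not a vetted security implementation):

```python
# PKCS#7-style padding sketch: append k copies of the byte value k, where
# k is the number of bytes needed to reach a multiple of the block size.
# Note k is always at least 1, so a full-block message gains a whole
# extra block -- that is what makes unpadding unambiguous.
def pad(data, block_size=16):
    k = block_size - len(data) % block_size   # always in 1..block_size
    return data + bytes([k]) * k

def unpad(padded):
    # the last byte tells us how many padding bytes to strip
    return padded[:-padded[-1]]
```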
• :
• No in-person class meeting. Instead:
• Please use this time to finish your implementation of your automated Cæsar cracker, the basic Vigenère encryption, and, if you have time and desire, a Vigenère cracker.
• Please send your finished Cæsar cracker and Vigenère encryption to your instructor before you leave for Thanksgiving break. Include your Vigenère cracker, too, if you did one, for extra credit.
• For reference: §4.3, to be found starting here, has more detailed discussion of Vigenère cracking.
• The above will count as our last homework, HW10.
• :
• Discussed basics of asymmetric cryptosystems, giving in particular the example of the RSA cryptosystem.
• Also briefly mentioned how an asymmetric cryptosystem can be used to put a digital signature onto a message: that's another piece of data on a message that a recipient can use -- together with the sender's public key and the message itself -- to verify that the sender indeed is who they said they were (or, at least, they are someone who has access to that person's private key).
• :
• Read §§4.4 & 4.5 to be found starting here, and submit T&Q22 responding to this reading.
• More on the RSA cryptosystem, particularly putting together the ingredients for a toy implementation:
• You will need to test if some natural numbers are relatively prime, meaning that they have no non-trivial common factors. The way to test this for two integers a and b in Python is
from math import gcd

if gcd(a, b) == 1:
    print("They're relatively prime.")
else:
    print("They are NOT relatively prime.")
since having a non-trivial common factor means their greatest common factor is not just 1.
• Act first as Bob: Pick a pair of primes p and q. Define n=p*q and phi=(p-1)*(q-1). Pick a number e in range(phi) which is relatively prime to phi [it actually doesn't matter if e has very few 1 bits in its base-2 expression, since we won't be doing any big computation for which we need to use clever, fast algorithms for modular exponentiation]. Print out n and e.
• Still as Bob: to find the decryption key, we need to find the inverse of e mod phi. Do this by brute force (with large primes, you would have to be much more clever): loop through all of range(phi) and find a number d such that (d*e)%phi == 1 — that will be the decryption key d.
• Now act as Alice: take a plaintext and break it into pieces which can each be treated as integers in range(n). The easiest way to do this is to make the pieces simply be the characters of the string taken one at a time, and turned into a number by doing ord(c). That will work if n is bigger than 256; if n is bigger than 26 but smaller than 256, you might have to skip non-letters and use ord(c.lower())-ord('a'). If n is less than 27, you should probably start over.
• Still as Alice, for each piece m of the plaintext, compute (m**e)%n (which is the Python for $m^e\pmod{n}$). Print out (or raw write to a file, if you prefer -- may be easier with non-printable characters) all of these encrypted pieces of the plaintext, which together form the ciphertext.
• Go back to being Bob and decrypt the message. To do this, you must break up the ciphertext in the same way Alice did, and for each piece (converted to a number) c, do (c**d)%n (which is, of course, $c^d\pmod{n}$). Reassemble these pieces to get the complete plaintext back.
• Please make a file which shows all of the steps of doing the above process in Python, and send it to your instructor ... some time before the end of the semester, not necessarily today, if you are not finished.
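Here is one possible end-to-end toy run of those steps, using the small textbook primes 61 and 53 (illustrative values only; real RSA uses enormous primes):

```python
# Toy RSA walk-through following the steps above, with tiny primes.
from math import gcd

# Bob's key generation
p, q = 61, 53
n = p * q                 # 3233
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # chosen so that gcd(e, phi) == 1
assert gcd(e, phi) == 1

# Brute-force search for the decryption key: d with (d*e) % phi == 1
d = next(d for d in range(1, phi) if (d * e) % phi == 1)

# Alice encrypts character by character (n > 256, so ord(c) always fits)
message = "hi bob"
ciphertext = [(ord(c) ** e) % n for c in message]

# Bob decrypts and reassembles; pow(c, d, n) is (c**d) % n done efficiently
plaintext = "".join(chr(pow(c, d, n)) for c in ciphertext)
```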
• We also talked about Man-in-the-Middle Attacks and how to prevent them with a PKI or web of trust.

#### Week 15

• :
• Some content we discussed:
• :
• Some content we discussed:
• :
• Some content we discussed:
• :