As I wrote just the other day, our study group has encountered the first really meaty part of SICP. Thanks to the combination of lecture notes, the books, and the 20-year-old videos of G. J. Sussman counting his parentheses, I have come to a few interesting conclusions about writing algorithmic programs.
I have no formal CS education. I am an autodidact. So some of the concepts and a lot of the methodology I’m encountering are unfamiliar. Some people claim that self-learning leads to narrowmindedness, but I believe just the opposite. Anyone who takes on SICP without the benefit of a classroom setting is willing to endure a lot of hard and often lonely work to win a broader understanding of programming and computer science.
Here’s what’s interesting (to me) about the week two material:
Numbers numbers numbers
I’ve spent too much time thinking computer programming was primarily about building websites on top of relational databases. Week 2 drives home that it’s really all about math, about numbers. It’s about F(n) and F(n+1), O(n), O(n^2), O(log n). This is strange and unfamiliar, because even though I’ve encountered these ideas before, they seem so far away form the work most web programmers do.
If you’re working with a framework, then that framework probably makes algorithmic decisions for you and hides it all behind its API and classes. Your regression testing is about demonstrating that your
Person class can take a value of “Bob Foo” for
name, not that your method for finding the billionth Fibonacci number is the greatest. You’re testing that your sort put “Bob Bar” ahead of “Bob Foo”, not whether it gets the job done in O(n) time.
That’s kind of sad, actually. Once I got over my dread of tackling two dozen exercises about problems in mathematics that people got famous for solving, I began to appreciate how lovely it is to work on these hard but beautiful problems. Employing mathematical induction to prove an algorithm works for all legal values of
n feels a lot more substantial (and more “proofy”) than writing a fleet of unit tests to check all the values I imagine could be passed as inputs for some interactive widget.
Does it have to be this way? Is there a way to make our programming more mathematical, more numeric — in concept at least — and hence more immediately suited to tools like induction, order of growth analysis, and so on? Can that be done in a way that preserves our ability to operate at a high level of abstraction. Or is this level of analysis limited to the design of systems and frameworks that the majority of other programmers will simply take and use as black boxes? I’m going to keep these questions in mind as I venture forth, and maybe I’ll find an answer.
Design really matters, but not up front
Speaking of theta of n, isn’t it remarkable how much difference the right algorithm design makes? When I tried to find the 650th member of the 1000th row of Pascal’s triangle using a procedure that employed tree recursion, I watched my mighty Mac Book Pro stand still for over a minute before I stopped the procedure.
But you aren’t expected to design the perfect algorithm up front. In fact, a number of the exercises for Section 1.2 ask you employ a recursive process first and then find an iterative way. Once you understand how the recursive process works, it’s much easier to modify it into an iterative process. If the recursive process for evaluating
b^n defers multiplication, then why not perform the multiplication prior to recursion? Once you’ve seen the line
(* b (exp b (- n 1)))
you can imagine a coefficient named
a and recast the problem as
a starts as 1 and is multiplied by
b in each iteration:
(exp-iter (* a b) b (- n 1))
This is better, but we’re not expected to think of it first, unless we’ve tackled this kind of problem before. The recursive process is more intuitive, and for a new kind of problem, it’s the proper starting place.
This step-by-step problem-solving method relates to a concept form the lectures that I like very much: wishful thinking. When approaching a new kind of problem, allow yourself to conceive of a construct, like a helper procedure, that could handle a reduction of the problem, and then proceed on planning the reduction of the problem as though the construct actually existed. This allows you to defer the challenge of actually implementing such a construct until later, when you’ll have had time to refine your understanding of exactly what your construct has to do. This kind of thinking is itself kind of iterative, and it also mirrors the design strategy of breaking big procedures down into smaller ones.
The fixed point
When I posted about the videos before, someone on reddit thought I was making fun of Sussman’s blackboard performance; but I’m actually really impressed with his teaching skills, and I’m loving the videos. From lecture 2a, the third in the series, comes Sussman’s exploration of the fixed point, or “the place where the value you give to a function equals the value that the function passes back.” As an example, he points out what happens when you put a calculator in radians mode and hit the cosine button repeatedly — the value on the calculator gets closer and closer to a specific number until it stops changing. That’s the fixed point. Some functions, like the cosine function, have the capacity to approach a fixed point if called recursively. Which is great when you have a problem whose solution lies at the fixed point: all you have to do is recursively call the function that homes in on that point until it gets close enough.
This is the how the square root method that the book, lectures, and videos put forward works, by recursively approaching this fixed point. Every time I hear “fixed point”, the letter Y appears in my thoughts, since Y is a “fixed-point combinator.” I’m now becoming dimly aware of the connection. When you supply the Y combinator with a function, it computes the fixed point of your function.
The fixed point illustrates the place of procedures in scheme as first-class data objects. You can pass procedures as arguments to other procedures. Learning how to employ this practice, I am convinced, is a milestone in the growth of any programmer. Still, it’s not immediately obvious how this works, how you employ it, and what its implications are. I’ve kind of just been “winging it” for a long time. I’m looking forward to getting a real grounding in it from this course.