Review of “Racing the Beam” by MIT Press

The difficulty is legendary, yet I can’t say I ever truly “understood” the troubles in programming an Atari Video Computer System, or Atari 2600 as it is better known. MIT Press’ new series “Platform Studies” aims to choose a target system and examine how the construction of that platform dictates the growth and maturation of its software. Great artists struggle within the confines of their chosen medium, and this is just as true in software as in physical media.

The series’ first examination is the Atari 2600 and its “Television Interface Adaptor” (TIA) in the book Racing the Beam by Nick Montfort and Ian Bogost. To say I was enthralled is a complete understatement. I found the book a quick, enjoyable read that didn’t get bogged down in the arcane, while not shying away from digging deeply into the machinations of the TIA to show the challenges faced by developers on that system.

One would be hard pressed to find a developer today who really understands what is happening behind the code she writes. We’re so accustomed to writing to software APIs which themselves are written upon a software API which is itself built upon a software API which is built upon an OS which is built upon a kernel. The 2600 and its TIA were basically a direct link between program and television electron beam. The system had no operating system and so each game essentially became a custom-tuned, handcrafted series of assembly language calls that did one thing and one thing only: play one particular game.

Reuse of concepts was important, but reuse of code seemed to be almost an impossibility. Some code relied upon happenstances of other code to feed registers appropriate information at appropriate times. When you have a system with 128 bytes of RAM (!) the programmer must essentially try to outrun the television scan-line, feeding it the color and position of pixels in a kind of JIT manner.

As the system could not hold an entire line of data in memory, let alone an entire screen, we must surely appreciate the efforts that looked at these limitations as challenges to overcome, not barriers to creativity. Consider the game Pitfall! (one of six games covered in the book) and its 255-screen-wide map of exploration. Now consider that the entire map (which is location-consistent, not random), its graphics, its gameplay, its sound effects… all of it fit into 4K.

Also consider that another of the profiled games, Adventure, fit in 4K as well and only had a map of about 30 screens (warning, linked map graphic at 30K is roughly 7.5x the size of the entire original game). Comparing the world size and graphics fidelity, we can clearly see that programmers gained more and more mastery (and by mastery here I do truly mean a deep understanding and ability) of the machine. Racing the Beam seeks to illuminate this process for us and does so very well. Perhaps the highest compliment I can pay the book is to say that it sparked in me a desire to tinker with some Atari 2600 code. Not so much because I wish to make a game for it, but more to appreciate that as a programmer I am always programming a machine, no matter how abstracted that level has become for me. To appreciate that, I feel I must actually do such a thing at least once.

For understanding the machine behind the code, I must also recommend the No Starch Press series Write Great Code. I found the books to be well-written and also found myself challenged to reconsider some of my daily programming challenges in terms of the machine. Not from an implementation point-of-view, but more in the sense of, “Do I actually know what the code I’m writing does?” As an intellectual pursuit, considering how I would cram the state of a chessboard into as few bits (I do mean bits, not bytes) as possible proved almost liberating in its tightly focused scope.
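To make the chessboard exercise concrete, here is a minimal sketch of one naive packing, not the tightest possible one. It assumes a hypothetical 4-bit code per square (one colour bit plus three piece bits), so 64 squares fit in 256 bits, or 32 bytes. It ignores side to move, castling rights, and en passant, and the names `pack_board` and `unpack_board` are made up for this illustration.

```python
# Hypothetical 4-bit code per square: low 3 bits = piece type, high bit = black.
PIECES = ".PNBRQK"  # index 0 is an empty square, 1-6 are pawn through king

def encode_square(ch):
    """Map one board character ('.', 'P', 'n', ...) to a 4-bit code."""
    if ch == ".":
        return 0
    code = PIECES.index(ch.upper())
    return code | 0x8 if ch.islower() else code  # lowercase means black

def pack_board(rows):
    """Pack 8 strings of 8 characters into a single integer of at most 256 bits."""
    bits = 0
    for row in rows:
        for ch in row:
            bits = (bits << 4) | encode_square(ch)
    return bits

def unpack_board(bits):
    """Reverse pack_board, returning 8 strings of 8 characters."""
    squares = []
    for i in range(64):
        code = (bits >> (4 * (63 - i))) & 0xF
        ch = PIECES[code & 0x7]
        squares.append(ch.lower() if code & 0x8 else ch)
    return ["".join(squares[r * 8:(r + 1) * 8]) for r in range(8)]

START = [
    "rnbqkbnr", "pppppppp", "........", "........",
    "........", "........", "PPPPPPPP", "RNBQKBNR",
]
packed = pack_board(START)
```

The fun starts when you notice how wasteful this still is: most squares are empty and only 13 of the 16 codes are ever used, which is exactly the kind of slack a tighter encoding would go after.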

I also felt that Racing the Beam helped me put into words what I find so boring about video games and media these days. It really boils down to there not being any limits or boundaries to consider during the creation process. There is no longer any real reason to create art that looks abstract when it is so EASY to make it look real. Just take some digital photos, scan in some 3D geometry, map the textures and you’re done. We can see this lazy approach by just looking at the plethora of look-alike games and look-alike graphics. The “gritty, urban” environments that dominate the game worlds, the “gritty, urban, stubble-headed” characters that populate them, the orchestral soundtracks, the lifelike sound effects… all of this stuck onto a Blu-ray disc that holds 25GB or more. There’s not really any constraint in storage, and while the Cell processor of the PS3 is difficult to program, it isn’t really presenting “limitations” unless absolute reality is your goal (and it often is, and so it is seen as being challenging).

Basically, we’re at a point in game design where if you can think of it, it can be done. Why put much deep thought into a project concept when your first concept is instantly possible? There is nothing to struggle against. Nothing to harden yourself against. No boundary to constrain your vision. No limit. This seems like such a great idea, and yet Barry Schwartz’ The Paradox of Choice builds a compelling argument that it may not be all it’s cracked up to be from the consumer level. Perhaps this also holds true at the creator level?

And yet, on a machine like the Atari 2600 with its many limitations, entire genres were born and an entire language for gameplay was created. I can’t help but wonder what a generation raised on games that are literal interpretations of reality will build in the future. When I can’t tell the difference between Madden 2010 and a televised game, I have to wonder if this is actually progress. Isn’t going back to reality the regressive state? I am all too aware of the games that try to do something different. We all know this short list by heart (Rez, Space Channel 5, Portal, World of Goo, Ico, Shadow of the Colossus, et al.) precisely because it is so short.

I would love to see an iPad enhanced version of this book, taking cues from The Elements for the iPad and AppStar Games’ Technical Wizardry series for the iPhone. Seeing living representations of the machine and its code, giving the reader an opportunity to tinker and tweak register settings, then seeing live changes in a running game would really get some of the more technical points across. All said, it is a great book and a must-read for anyone who programs and has an interest in video gaming. If you’re older, you’ll recall the system fondly and learn a lot about a machine you thought you knew. If you’re younger, seeing the roots of gaming’s visual vocabulary may kick your thinking off in new directions and build an appreciation for the deep, rich history this young hobby already enjoys.

Meeting the “Minimum Acceptable User Experience”

While working on file_wrangler_2 I’m constantly amazed at the number of new technologies I need to learn. There is a bar that is yea-high (when I say “yea”, I’m putting my hand about eye-level) that effectively sets the level of expectation Macintosh users have for their software. Apple has raised that bar considerably by adding polish and sheen to every tiny aspect of the user experience, often creating new concepts in user interaction for a standalone program. We look at GarageBand vs. iTunes and see that they are not afraid to let function dictate design, even if it is unlike anything we’ve seen before. I don’t really have a problem with this.

What I can say, as an independent, solo developer, is that giving Macintosh users the experience they have grown to expect out of their software requires one person to know a LOT of different things. This both fascinates and frustrates me as a Cocoa developer, as a few key companies and products (let’s start with Apple on that list) have raised the minimum acceptable user experience (MAUE) to almost absurd levels. Macintosh and iPhone/iPad users expect a beautiful, polished, “lickable” icon, multi-lingual support, and an interface that is intuitive and even a little bit fun. Objects are expected to look “real”, so if there is a tuner knob it should look like a photorealistic tuner knob, and wood should look shiny and hand-crafted without repeating its grain pattern, and the pages of books should bend and fold like real paper. Like real paper! So now, if I want to make a killer e-book reader with features I’ve not seen in any other program, I have to know the physics of page turning just to compete with my lowest competitor. In this regard, the graphical polish has become a key feature.

This, for me, is a bit of a problem. (Interestingly enough I would not say users expect “robust documentation”, which I believe illustrates my point exactly.)

Think about how much time/talent that takes for the graphic assets alone. Oftentimes these user expectations are not easy (and sometimes impossible) to meet with Apple’s pre-built interface tools. Look at the scrollbar in iTunes. That isn’t built into the developer kit; it is a custom graphic set applied to the NSScroller class. I now need to understand this class and learn how to create custom graphics and stitch them together if I want that in my own app. This is a non-trivial amount of research and experimentation for a one-man operation. More to the point, using ONLY Apple’s pre-built interface tools can make an application look rigid, stuffy and stuck in the past.

This, for me, is a bit of a problem.

It isn’t that I don’t have a great deal of fun digging into multiple APIs to find out how to animate, filter, draw, preview, etc., nor do I dislike the learning process (as evidenced by my last blog post). However, meeting the MAUE does significantly increase time-to-develop and if you’re one man working alone, that time is incredibly important. Can I really afford to let a four-month development cycle balloon to six months in an effort to meet the MAUE? That’s the difference between three software releases a year and two, which could be the difference between “barely surviving” and “drowning”. I can’t afford NOT to meet the MAUE, but it also severely restricts my ability to get products into my users’ hands in a timely manner.

This, for me, is a bit of a problem.

It also raises the question, “Can a one-man operation succeed any longer?” Can a two-man operation succeed? There are examples of this happening, even today, but without a big hit (like “Delicious Library” or “World of Goo”) it can be more-than-a-little daunting. Over time I’m certain I will develop internal code libraries and have a deeper understanding of the “important” APIs to reduce the time of reaching the MAUE. However, I am surprised almost daily by the breadth of knowledge necessary to achieve what seems to be such a narrow process as “renaming a batch of files.”

Three Left Turns Make a Right: Learning to Learn

This has been a very productive week for me as a programmer, and I mean that in the gestaltist sense of the word. Java skills? Level up! AppleScript skills? Level up! Program design skills? Level up! Cocoa skills? BIG level up! Perhaps the aspect of programming that I enjoy the most is the iterative process of:

  1. Have an idea.
  2. Can’t figure out how to execute the idea.
  3. Study the concepts behind the idea.
  4. Test my understanding.
  5. Implement the idea.

It really is a constant learning experience, and that is what I find enjoyable. The other day I was asked by someone new to Cocoa, “How do you remember all of those methods? There’s thousands of them!” He seemed frustrated by the perceived vertical wall of learning ahead of him. The truth of the matter is, I DON’T remember all of the Cocoa API calls, nor the Core Foundation calls, nor the Core Image calls. What I enjoy about programming is that I oftentimes don’t need to remember all of these things nor do I feel that a good programmer should be expected to recall arcane API calls at the drop of a hat.

What is more important, in my opinion, is learning how to discover what I need to solve a problem. In other words, learning how to learn. In this way, it is more important to think about my programs in terms of objects, to consider Apple’s use of design patterns, to read sample code, and to develop my own style. By internalizing these concepts I begin to understand the shared language that programmers use to discuss programming and subsequently learn how to convert my thoughts into the appropriate question. I don’t mean Java vs. Objective-C vs. C# when I say language. I mean using terms like “notification” when talking about “reducing coupling” within a Cocoa application. Consider that I first need to know about the existence of the concept of coupling. Then I need to know that “coupling is bad” from an object-oriented design sense (please, allow me that over-simplification for a moment). Then I need to know that the “observer” design pattern addresses the problem of tight coupling through its publish-and-subscribe methodology. Then I need to know that Apple implements the NSNotification related classes as a robust Cocoa implementation of this pattern.
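For anyone who has not met the observer pattern yet, the idea fits in a few lines. This is a toy sketch in Python, not Cocoa: the `NotificationCenter` class, its method names, and the `"FiltersDidChange"` notification name are all invented for illustration and are not Apple’s API.

```python
class NotificationCenter:
    """Toy publish-and-subscribe hub (illustrative only, not NSNotificationCenter)."""

    def __init__(self):
        self._observers = {}  # notification name -> list of callbacks

    def add_observer(self, name, callback):
        """Register a callback to run whenever `name` is posted."""
        self._observers.setdefault(name, []).append(callback)

    def post(self, name, **info):
        """Call every callback registered for `name`, passing the info dict."""
        for callback in list(self._observers.get(name, [])):
            callback(info)

center = NotificationCenter()
received = []

# The subscriber (say, a file list view) does not know who will post.
center.add_observer("FiltersDidChange", lambda info: received.append(info["count"]))

# The publisher (say, a filter well) does not know who is listening.
center.post("FiltersDidChange", count=3)
center.post("SomethingNobodyObserves")  # harmless: no observers registered
```

Neither side holds a reference to the other; both only know the hub and an agreed-upon name. That is the whole point of the “reducing coupling” vocabulary.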

Of course, that is not typically the order we learn about such things, is it? These days, many will read Aaron Hillegass’ book because they want to cash in on iPod/iPad mania. They learn about NSNotificationCenter and NSNotification objects, while getting a nice overview of Apple’s design philosophy behind Cocoa, but Aaron very much makes it clear that his book teaches Cocoa, not programming. Then vaguely-worded questions are posted to the newsgroups or Stack Overflow while the beginner attempts to formulate his questions, which are then clarified and re-clarified and answered and re-answered and, well, I think we all understand the learning process.

I was reminded this week, while working on file_wrangler_2, just how much there is to know and understand. Not just about the Cocoa APIs but about the process of program design. In developing the base and subclasses for the FWFilter objects in the program, I found that the suppositions I had about how it would work in the interface were challenged at every design decision. Ultimately I came up with a new programming model that I feel closely mirrors the user’s mental model of what would happen, but the trip to get to that point required a lot of research, a lot of consideration, and a lot of mistakes.

The FWFilter class allows one to target a subset of files and folders that have been added to the interface. So, one may drag in a giant folder of stuff, but is only interested in all Microsoft Word documents created after April 1, 2009. This seemed to me to have a fairly obvious solution whereby the user would add an “extension filter”, and a “date filter” to the filter well. After doing research on robust filtering in Cocoa, NSPredicate seemed the way to go. Here’s a case where I understood enough about Cocoa to ask the right question and found this class to be perfect. Each filter would create an NSPredicate representing user choices in the interface. The full chain of filters would be combined into an NSCompoundPredicate and “Bob’s your uncle” as they say.

NSCompoundPredicate allows us to create AND, OR, and NOT compound predicates. It seemed fairly obvious that the user would want those files that intersected the “.doc” extension and the modification date and so I created an AND predicate. All was right with the file_wrangler_2 world until I decided to add two date filters to the interface.

Suddenly the assumptions I had made about the filter class were challenged and even upended.

If I have a date filter for everything created April 10 and another for April 15, but also only want to see Microsoft Word documents, from a predicate boolean point of view this means we want everything in ((creation date = April 10 OR creation date = April 15) AND extension = .doc). What if I put in two date filters, one set to creation date later than April 1 and another set to earlier than May 1? Clearly we’re trying to define a range, so now I want ((creation date > April 1 AND creation date < May 1) AND extension = .doc). The logic behind AND’ing at certain times and OR’ing at other times was getting very confusing and the more I tried to code my way around the matter, the more confusing the whole process became.

This is where being a scrum of ONE has some drawbacks. On paper it all made so much sense, and the code worked beautifully (including the notification system in place to keep the filtered file list accurate) but in practice it made little sense at all. Luckily, I am nothing if not tenacious and set about tackling the problem at a deeper level. First I had to learn that a search for a date is not a specific thing. If someone wants files created on February 1, what she REALLY wants is everything created from the time 00:00:00 to 23:59:59 of February 1. In other words, she’s looking for a range, even in the cases where she specifies an “exact date.” Less than just means from [NSDate distantPast] to the target date.  Greater than means the target date to [NSDate distantFuture].
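The “every date is really a range” idea can be sketched outside of Cocoa. The following is a rough Python model under a few assumptions of mine: `datetime.min` and `datetime.max` stand in for distantPast and distantFuture, the function names are hypothetical, and I use a half-open interval (start of day up to, but not including, the start of the next day) instead of 23:59:59 so that fractions of the last second are not missed. I also assume “before” and “after” exclude the target day itself.

```python
from datetime import date, datetime, time, timedelta

DISTANT_PAST = datetime.min    # stand-in for [NSDate distantPast]
DISTANT_FUTURE = datetime.max  # stand-in for [NSDate distantFuture]

def day_bounds(day):
    """Return (midnight of `day`, midnight of the next day)."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)

def date_range(op, day):
    """Turn a user-facing date choice into a (start, end) range."""
    start, end = day_bounds(day)
    if op == "on":
        return (start, end)             # the whole target day
    if op == "before":
        return (DISTANT_PAST, start)    # everything up to the target day
    if op == "after":
        return (end, DISTANT_FUTURE)    # everything after the target day ends
    raise ValueError(f"unknown operator: {op}")

def in_range(moment, bounds):
    """True when start <= moment < end."""
    start, end = bounds
    return start <= moment < end

feb_first = date_range("on", date(2010, 2, 1))
```

With this shape, “exactly February 1”, “before February 1”, and “after February 1” are all the same kind of object, so the filter code only ever has to test membership in a range.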

Second, I had to redesign the DateFilter.xib file to accommodate a range. I thought I would be able to handle this by simply adding two filters, but it proved not to be true. By designing the interface to handle ranges, I didn’t have to worry about trying to “guess” whether a user intended a range or not.

Then, suddenly, the mechanism fell into place. When we consider user intent on the question, “Why is she adding two date ranges?” the intention, it seems to me, is to be inclusive. I want everything made in February and I want everything made in April and everything made yesterday. (We must be very careful here not to confuse the boolean AND with the user model of “and”; in boolean terms these ranges are OR’ed.) The model assumes a list that starts with nothing and gradually grows with each filter of the same type.

When adding filters of differing types, the intention is to be exclusive. In other words, “Within that group of files from my date choices, ONLY show me those things that are .doc files and also show me .psd files.” In some of these cases, the boolean OR applies and so we can write her intentions thusly:

((creationDate >= Feb 1 AND creationDate <= Feb 28) OR (creationDate > March 31 AND creationDate < May 1) OR (creationDate >= midnight today AND creationDate <= 23:59:59 today))
AND ((extension == .doc) OR (extension == .psd))

I know that looks confusing, but it boils down to this. Every date filter does an internal AND for its individual predicate. Every date filter in the user interface is then OR’ed against one another and combined into a total, composite date filter compound. This is done for each type of filter in the well where every filter of a like type is OR’ed against one another to create their individual compound predicates. For the full chain of filtering we AND every unique compound.
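The rule described above (OR within a filter type, AND across types) can be modelled in a few lines. This is a Python sketch with plain functions standing in for predicates; the helper names and the dictionary shape of a “file” are assumptions of mine. In Cocoa the grouping would presumably be done with NSCompoundPredicate’s `orPredicateWithSubpredicates:` and `andPredicateWithSubpredicates:`.

```python
from collections import defaultdict
from datetime import datetime

def date_between(start, end):
    """A date filter: internally ANDs its two bounds (start <= created < end)."""
    return ("date", lambda f: start <= f["created"] < end)

def has_extension(ext):
    """An extension filter, matched case-insensitively."""
    return ("extension", lambda f: f["name"].lower().endswith(ext))

def combine(filters):
    """OR filters of the same type together, then AND the per-type groups."""
    groups = defaultdict(list)
    for kind, predicate in filters:
        groups[kind].append(predicate)

    def matches(f):
        # any() is the OR inside one type; all() is the AND across types.
        return all(any(p(f) for p in preds) for preds in groups.values())

    return matches

filters = [
    date_between(datetime(2010, 2, 1), datetime(2010, 3, 1)),  # February
    date_between(datetime(2010, 4, 1), datetime(2010, 5, 1)),  # April
    has_extension(".doc"),
    has_extension(".psd"),
]
files = [
    {"name": "report.doc", "created": datetime(2010, 2, 14)},
    {"name": "logo.psd", "created": datetime(2010, 4, 20)},
    {"name": "notes.txt", "created": datetime(2010, 4, 20)},
    {"name": "old.doc", "created": datetime(2010, 3, 15)},
]
matches = combine(filters)
selected = [f["name"] for f in files if matches(f)]
```

Here `selected` ends up holding report.doc and logo.psd: notes.txt falls in a date range but fails both extension filters, and old.doc has the right extension but falls between the two date ranges.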

So, the new filtering mechanism works like a champ and I believe accurately reflects user expectations. However, this was a prime example of how my own understanding of these concepts and API usages had to evolve and grow. I also had to throw away a lot of code as I went off on more than one dead-end path of exploration. But, as in life, it is only through implementation that the truth of a problem can be revealed. What is the real problem I’m trying to solve?

It’s been said before, but I’ll say it again: there is really only one way to learn these things, and that is to write code. Over time you’ll find that there are many APIs you fall back on time and again, and they become loyal companions that you know intimately. Then, you’ll start to see patterns in how a class you’ve never used before feels comfortable because the method names and objects so closely mimic another class (like in my earlier post on QuickLook). Then you’ll start to expect those similarities and grow to rely on using this knowledge to minimize your time of discovery on new problems. But even a program as conceptually simple as file_wrangler_2 reveals the need to continually research, iterate, and try new things every single day.

Encapsulating the Worker: An Object-Oriented Approach to Business Workflow

Let me clarify for those who aren’t familiar with object-oriented programming principles that “encapsulating” in the title of this blog is a software term, not a suggestion to put your workers into “boxes.” A somewhat obtuse description can be found in its Wikipedia article. A more friendly, illustrated writeup may be found here.

The fragility of a company’s workflow can be easily determined by asking yourself, with each employee in turn, “What if he quit tomorrow?” Envision what kind of chaos would ensue. Think about the knowledge-base that would go with him. What aspects of the workflow and business are known by that person alone? What documentation would teach the next person how to do that job?

An important consideration in evaluating a business’ workflow is in understanding that the creation of information is up to the humans, the storage and manipulation is up to the computers. Understanding how to manipulate the knowledge as it processes through a workflow should not be locked away in anyone’s brain. This is where natural, easy-to-understand software to capture that knowledge comes into play. It is critically important that we separate those tasks that are easy for humans but hard for computers from those things that are easy for computers but hard for humans. I want to be clear here that I’m really talking about internal, custom-built software that facilitates business workflow. A process that requires that “the worker must be proficient at Photoshop” is simply identifying a skill that has become commonplace in many industries.

Much like good software design, good workflow design should follow a general principle of separating data from its interface. In the case of bad software injected into a workflow, the interface (i.e. – how to use the software) has become inextricably entwined into the data (i.e. – the knowledge the worker brings to the process). The shakier the software, the more these things are tangled and the more critically important that particular person becomes to the workflow. We all like to think of ourselves as being important, and we are, but things happen. Life happens. Change happens. This kind of scenario is completely inflexible to unexpected life events, and necessarily leads to fragility.

Suppose some obtuse, incomprehensible, held-together-by-duct-tape piece of custom software has become a critical point of failure in a workflow. “Mary uses that software every day, and she knows all of its quirks. Our workflow, while not ideal, continues to chug along,” tends to be the attitude of many businesses. The key consideration here is that the logic behind the attitude of “If it ain’t broke, don’t fix it” has a major fallacy. It is broken, but Mary is holding it together like the little boy sticking his finger in the dike and we only have the illusion of being “not broken.” Yes, we’re not leaking water, but what happens when a tiger eats that little boy (stranger things have happened)?

Consider also how difficult it is to document what Mary does for the business. I seriously doubt that any company has documented all of the little tweaks and quirks and non-obvious uses of internally built software that Mary has discovered. We can describe the intent of her job, but we cannot ever fully know the minutiae she performs hourly. In other words, we know from her job description what data Mary holds, and if the workflow has any semblance of structure (that’s a big if) we know her public interface, but we don’t know her private interface. The problem here is that the company has forced the data and private interface to become one.

From an object-oriented point of view, we don’t want to know her private interface whatsoever. As Mary is using company-developed software, the company has injected process into the data at a very low level, then abandoned that for Mary to work out. This is exactly the opposite of what we want for a modular, contingency-based workflow. So, what are we to do? How can we separate these concepts such that if Mary wins the lottery and quits tomorrow, someone of equal skill level can step in and take over Mary’s position immediately?

In this case, our worker needs to be able to discover her private interface. A well written piece of software will make it very clear to the new worker which pieces of data are important, how to retrieve that data, and what the software will provide in return. The software should be portable (i.e. – not dependent on a particular computer system, if possible) with a clear interface that makes sense to someone with a particular set of knowledge or skills.

This also means steering away from cutesy or dated names for files, folders, software, or hardware. Don’t call the FileMaker database “General Tracker” or something equally vague, making it a kind of “god class” managing all things FileMaker. Don’t name the printers after the Dharma Initiative stations. At Macy’s West we had printers named after Alvin, Simon and Theodore and after 7 years I still couldn’t remember which printer was in which room. I understand the desire to “personalize” the organization and make things fun and less “corporate.” There are plenty of other ways this can be achieved, with the software icon, desktop wallpaper, the writing style in training manuals, and the general corporate culture. Giving things names that make sense to the workflow is not a suppression of creativity. Rather, it frees the worker from having to constantly remember details that should be self-documenting. “That printer is in room 4-012 because the printer name starts with 4-012,” makes much more efficient use of a valuable worker’s mental energies.

Now we can feel free to hire a sharp employee who possesses the data creation tools (i.e. – the intuition and creativity) we need and provide a piece of software that allows her to slot into the organization like a Lego block. Well-written software helps her encapsulate her knowledge, protecting the organization’s business continuity during change and protecting the new worker from stumbling on the job, despite her excellence at her skills.

Now, let us think about this from the gestalt perspective. Does your company encapsulate your workers at all? Can you define the public interface for any given worker? Can you draw a map that shows, without knowing HOW the work is done or using individual people’s names, what your company does? Going back to the question I posed at the beginning of this blog, think about each employee and consider, “What if she quit tomorrow?”

The implication of this approach, of course, is that the entire workflow must be analyzed and encapsulated. Tools must be developed that allow the business to continue working despite the most dramatic turnover of personnel. Some of those tools will be part of the public interface for the worker (i.e. – all incoming JPG files should be checked in using the company-built software called Deliver_To_Art_Department) and some for her private interface (a drag-and-drop script on her desktop called WidgetCorp_Intranet_JPG_Converter). In fact, sometimes it may be as simple as just renaming an existing application.

As is all-too-often the case, I can hear the cries now, “But we don’t have the time. We’re TOO BUSY. When can we stop what we’re doing to reimplement our entire workflow?” There is no easy answer to this, but the first thing to consider is that you don’t have to do it all in one fell swoop. It is not an all-or-nothing proposition. It is a process that can be evaluated, considered, planned, constructed, tested, and implemented in phases.

What we can immediately understand is what the end result of NOT doing it can be. Continual process improvement is a mantra that has been shouted from the rooftops for many years now, but I can honestly say I don’t think many are listening to the message. It is easy to say and hard to do, but a good development team and a good business analyst can find ways to bring about change in an organic way that minimizes the short-term impact on business continuity while greatly increasing the longevity of the business. It is easy to think that such a thing is impossible until you’ve talked to experts in this realm.

Once implemented, the organization’s structure will be such that the entire process will not need to be re-engineered ever again. Once the business modules are defined and implemented, those modules can be reworked, reprocessed, and restructured to our hearts’ content without affecting the workflow chain as a whole. There is an implication here that a module may even be identified as redundant and removed entirely. As there is a human being inside that module, this is something to consider; however, there is also a supremely good chance that that employee is doing work that does not leverage her skills and may be able to contribute more meaningfully in other workflow bottleneck areas.

Short Hiatus

As we are all aware, there are bills and such to be paid. file_wrangler_2 development is on a one-week hiatus while I help a client repair their workflow software. I’ll try to make progress at night, but there are only so many hours in a day.

For now, rest assured that some very nice things have been implemented thus far and I’m very happy with the direction the project is going and the speed I’m seeing in the interface.