Wednesday, December 06, 2017

Error handling in Upspin

The Upspin project uses a custom package, upspin.io/errors, to represent error conditions that arise inside the system. These errors satisfy the standard Go error interface, but are implemented using a custom type, upspin.io/errors.Error, that has properties that have proven valuable to the project.

Here we will demonstrate how the package works and how it is used. The story holds lessons for the larger discussion of error handling in Go.

Motivations

A few months into the project, it became clear we needed a consistent approach to error construction, presentation, and handling throughout the code. We decided to implement a custom errors package, and rolled one out in an afternoon. The details have changed a bit but the basic ideas behind the package have endured. These were:

  • To make it easy to build informative error messages.
  • To make errors easy to understand for users.
  • To make errors helpful as diagnostics for programmers.

As we developed experience with the package, some other motivations emerged. We'll talk about these below.

A tour of the package

The upspin.io/errors package is imported with the package name "errors", and so inside Upspin it takes over the role of Go's standard "errors" package.

We noticed that the elements that go into an error message in Upspin are all of different types: user names, path names, the kind of error (I/O, permission, etc.) and so on. This provided the starting point for the package, which would build on these different types to construct, represent, and report the errors that arise.

The center of the package is the Error type, the concrete representation of an Upspin error. It has several fields, any of which may be left unset:

  type Error struct {
      Path upspin.PathName
      User upspin.UserName
      Op  Op
      Kind Kind
      Err error
  }

The Path and User fields denote the path and user affected by the operation. Note that these are both strings, but have distinct types in Upspin to clarify their usage and to allow the type system to catch certain classes of programming errors.

The Op field denotes the operation being performed. It is another string type and typically holds the name of the method or server function reporting the error: "client.Lookup", "dir/server.Glob", and so on.

The Kind field classifies the error as one of a set of standard conditions (Permission, IO, NotExist, and so on). It makes it easy to see a concise description of what sort of error occurred, but also provides a hook for interfacing to other systems. For instance, upspinfs uses the Kind field as the key to translation from Upspin errors to Unix error constants such as EPERM and EIO.

The last field, Err, may hold another error value. Often this is an error from another system, such as a file system error from the os package or a network error from the net package. It may also be another upspin.io/errors.Error value, creating a kind of error trace that we will discuss later.

Constructing an Error

To facilitate error construction, the package provides a function named E, which is short and easy to type.

  func E(args ...interface{}) error

As the doc comment for the function says, E builds an error value from its arguments. The type of each argument determines its meaning. The idea is to look at the types of the arguments and assign each argument to the field of the corresponding type in the constructed Error struct. There is an obvious correspondence: a PathName goes to Error.Path, a UserName to Error.User, and so on.

Let's look at an example. In typical use, calls to errors.E will arise multiple times within a method, so we define a constant, conventionally called op, that will be passed to all E calls within the method:

  func (s *Server) Delete(ref upspin.Reference) error {
    const op errors.Op = "server.Delete"
     ...

Then through the method we use the constant to prefix each call (although the actual ordering of arguments is irrelevant, by convention op goes first):

  if err := authorize(user); err != nil {
    return errors.E(op, user, errors.Permission, err)
  }

The String method for E will format this neatly:

  server.Delete: user ann@example.com: permission denied: user not authorized

If the errors nest to multiple levels, redundant fields are suppressed and the nesting is formatted with indentation:

  client.Lookup: ann@example.com/file: item does not exist:
          dir/remote("upspin.example.net:443").Lookup:
          dir/server.Lookup

Notice that there are multiple operations mentioned in this error message (client.Lookup, dir/remote, dir/server). We'll discuss this multiplicity in a later section.

As another example, sometimes the error is special and is most clearly described at the call site by a plain string. To make this work in the obvious way, the constructor promotes arguments of literal type string to a Go error type through a mechanism similar to the standard Go function errors.New. Thus one can write:

   errors.E(op, "unexpected failure")

or

   errors.E(op, fmt.Sprintf("could not succeed after %d tries", nTries))

and have the string be assigned to the Err field of the resulting Err type. This is a natural and easy way to build special-case errors.

Errors across the wire

Upspin is a distributed system and so it is critical that communications between Upspin servers preserve the structure of errors. To accomplish this we made Upspin's RPCs aware of these error types, using the errors package's MarshalError and UnmarshalError functions to transcode errors across a network connection. These functions make sure that a client will see all the details that the server provided when it constructed the error.

Consider this error report:

  client.Lookup: ann@example.com/test/file: item does not exist:
         dir/remote("dir.example.com:443").Lookup:
         dir/server.Lookup:
         store/remote("store.example.com:443").Get:
         fetching https://storage.googleapis.com/bucket/C1AF...: 404 Not Found

This is represented by four nested errors.E values.

Reading from the bottom up, the innermost is from the package upspin.io/store/remote (responsible for taking to remote storage servers). The error indicates that there was a problem fetching an object from storage. That error is constructed with something like this, wrapping an underlying error from the cloud storage provider:

  const op errors.Op = `store/remote("store.example.com:443").Get`
  var resp *http.Response
  ...
  return errors.E(op, errors.Sprintf("fetching %s: %s", url, resp.Status))

The next error is from the directory server (package upspin.io/dir/server, our directory server reference implementation), which indicates that the directory server was trying to perform a Lookup when the error occurred. That error is constructed like this:

  const op errors.Op = "dir/server.Lookup"
  ...
  return errors.E(op, pathName, errors.NotExist, err)

This is the first layer at which a Kind (errors.NotExist) is added.

The Lookup error value is passed across the network (marshaled and unmarshaled along the way), and then the upspin.io/dir/remote package (responsible for talking to remote directory servers) wraps it with its own call to errors.E:

  const op errors.Op = "dir/remote.Lookup"
  ...
  return errors.E(op, pathName, err)

There is no Kind set in this call, so the inner Kind (errors.NotExist) is lifted up during the construction of this Error struct.

Finally, the upspin.io/client package wraps the error once more:

  const op errors.Op = "client.Lookup"
  ...
  return errors.E(op, pathName, err)

Preserving the structure of the server's error permits the client to know programmatically that this is a "not exist" error and that the item in question is "ann@example.com/file". The error's Error method can take advantage of this structure to suppress redundant fields. If the server error were merely an opaque string we would see the path name multiple times in the output.

The critical details (the PathName and Kind) are pulled to the top of the error so they are more prominent in the display. The hope is that when seen by a user the first line of the error is usually all that's needed; the details below that are more useful when further diagnosis is required.

Stepping back and looking at the error display as a unit, we can trace the path the error took from its creation back through various network-connected components to the client. The full picture might help the user but is sure to help the system implementer if the problem is unexpected or unusual.

Users and implementers

There is a tension between making errors helpful and concise for the end user versus making them expansive and analytic for the implementer. Too often the implementer wins and the errors are overly verbose, to the point of including stack traces or other overwhelming detail.

Upspin's errors are an attempt to serve both the users and the implementers. The reported errors are reasonably concise, concentrating on information the user should find helpful. But they also contain internal details such as method names an implementer might find diagnostic but not in a way that overwhelms the user. In practice we find that the tradeoff has worked well.

In contrast, a stack trace-like error is worse in both respects. The user does not have the context to understand the stack trace, and an implementer shown a stack trace is denied the information that could be presented if the server-side error was passed to the client. This is why Upspin error nesting behaves as an operational trace, showing the path through the elements of the system, rather than as an execution trace, showing the path through the code. The distinction is vital.

For those cases where stack traces would be helpful, we allow the errors package to be built with the "debug" tag, which enables them. This works fine, but it's worth noting that we have almost never used this feature. Instead the default behavior of the package serves well enough that the overhead and ugliness of stack traces are obviated.

Matching errors

An unexpected benefit of Upspin's custom error handling was the ease with which we could write error-dependent tests, as well as write error-sensitive code outside of tests. Two functions in the errors package enable these uses.

The first is a function, called errors.Is, that returns a boolean reporting whether the argument is of type *errors.Error and, if so, that its Kind field has the specified value.

  func Is(kind Kind, err error) bool

This function makes it straightforward for code to change behavior depending on the error condition, such as in the face of a permission error as opposed to a network error:

  if errors.Is(errors.Permission, err) { ... }

The other function, Match, is useful in tests. It was created after we had been using the errors package for a while and found too many of our tests were sensitive to irrelevant details of the errors. For instance, a test might only need to check that there was a permission error opening a particular file, but was sensitive to the exact formatting of the error message.

After fixing a number of brittle tests like this, we responded by writing a function to report whether the received error, err, matches an error template:

  func Match(template, err error) bool

The function checks whether the error is of type *errors.Error, and if so, whether the fields within equal those within the template. The key is that it checks only those fields that are non-zero in the template, ignoring the rest.

For our example described above, one can write:

  if errors.Match(errors.E(errors.Permission, pathName), err) { … }

and be unaffected by whatever other properties the error has. We use Match countless times throughout our tests; it has been a boon.

Lessons

There is a lot of discussion in the Go community about how to handle errors and it's important to realize that there is no single answer. No one package or approach can do what's needed for every program. As was pointed out elsewhere, errors are just values and can be programmed in different ways to suit different situations.

The Upspin errors package has worked out well for us. We do not advocate that it is the right answer for another system, or even that the approach is right for anyone else. But the package worked well within Upspin and taught us some general lessons worth recording.

The Upspin errors package is modest in size and scope. The original implementation was built in a few hours and the basic design has endured, with a few refinements, since then. A custom error package for another project should be just as easy to create. The specific needs of any given environment should be easy to apply. Don't be afraid to try; just think a bit first and be willing to experiment. What's out there now can surely be improved upon when the details of your own project are considered.

We made sure the error constructor was both easy to use and easy to read. If it were not, programmers would resist using it.

The behavior of the errors package is built in part upon the types intrinsic to the underlying system. This is a small but important point: No general errors package could do what ours does. It truly is a custom package.

Moreover, the use of types to discriminate arguments allowed error construction to be idiomatic and fluid. This was made possible by a combination of the existing types in the system (PathName, UserName) and new ones created for the purpose (Op, Kind). Helper types made error construction clean, safe, and easy. It took a little more work—we had to create the types and use them everywhere, such as through the "const op" idiom—but the payoff was worthwhile.

Finally, we would like to stress the lack of stack traces as part of the error model in Upspin. Instead, the errors package reports the sequence of events, often across the network, that resulted in a problem being delivered to the client. Carefully constructed errors that thread through the operations in the system can be more concise, more descriptive, and more helpful than a simple stack trace.

Errors are for users, not just for programmers.

by Rob Pike and Andrew Gerrand

Thursday, October 26, 2017

The Upspin manifesto: On the ownership and sharing of data

Here follows the original "manifesto" from late 2014 proposing the idea for what became Upspin. The text has been lightly edited to remove a couple of references to Google-internal systems, with no loss of meaning.

I'd like to thank Eduardo Pinheiro, Eric Grosse, Dave Presotto and Andrew Gerrand for helping me turn this into a working system, in retrospect remarkably close to the original vision.

Augie
Augie image Copyright © 2017 Renee French


Manifesto



Outside our laptops, most of us today have no shared file system at work. (There was a time when we did, but it's long gone.) The world took away our /home folders and moved us to databases, which are not file systems. Thus I can no longer (at least not without clumsy hackery) make my true home directory be where my files are any more. Instead, I am expected to work on some local machine using some web app talking to some database or other external repository to do my actual work. This is mobile phone user interfaces brought to engineering workstations, which has its advantages but also some deep flaws. Most important is the philosophy it represents.

You don't own your data any more. One argument is that companies own it, but from a strict user perspective, your "apps" own it. Each item you use in the modern mobile world is coupled to the program that manipulates it. Your Twitter, Facebook, or Google+ program is the only way to access the corresponding feed. Sometimes the boundary is softened within a company—photos in Google+ are available in other Google products—but that is the exception that proves the rule. You don't control your data, the programs do.

Yet there are many reasons to want to access data from multiple programs. That is, almost by definition, the Unix model. Unix's model is largely irrelevant today, but there are still legitimate ways to think about data that are made much too hard by today's way of working. It's not necessarily impossible to share data among programs (although it's often very difficult), but it's never natural. There are workarounds like plugin architectures and documented workflows but they just demonstrate that the fundamental idea—sharing data among programs—is not the way we work any more.

This is backwards. It's a reversal of the old way we used to work, with a file system that all programs could access equally. The very notion of "download" and "upload" is plain stupid. Data should simply be available to any program with authenticated access rights. And of course for any person with those rights. Sharing between programs and people can be, technologically at least, the same problem, and a solved one.

This document proposes a modern way to achieve the good properties of the old world: consistent access, understandable data flow, and easy sharing without workarounds. To do this, we go back to the old notion of a file system and make it uniform and global. The result should be a data space in which all people can use all their programs, sharing and collaborating at will in a consistent, intuitive way.

Not downloading, uploading, mailing, tarring, gzipping, plugging in and copying around. Just using. Conceptually: If I want to post a picture on Twitter, I just name the file that holds it. If I want to edit a picture on Twitter using Photoshop, I use the File>Open menu in Photoshop and name the file stored on Twitter, which is easy to discover or even know a priori. (There are security and access questions here, and we'll come back to those.)

Working in a file system.

I want my home directory to be where all my data is. Not just my local files, not just my source code, not just my photos, not just my mail. All my data. My "phone" should be able to access the same data as my laptop, which should be able to access the same data as the servers. (Ignore access control for the moment.) $HOME should be my home for everything: work, play, life; toy, phone, work, cluster.

This was how things worked in the old single-machine days but we lost sight of that when networking became universally available. There were network file systems and some research systems used them to provide basically this model, but the arrival of consumer devices, portable computing, and then smartphones eroded the model until every device is its own fiefdom and every program manages its own data through networking. We have a network connecting devices instead of a network composed of devices.

The knowledge of how to achieve the old way still exists, and networks are fast and ubiquitous enough to restore the model. From a human point of view, the data is all we care about: my pictures, my mail, my documents. Put those into a globally addressable file system and I can see all my data with equal facility from any device I control. And then, when I want to share with another person, I can name the file (or files or directory) that holds the information I want to share, grant access, and the other person can access it.

The essence here is that the data (if it's in a single file) has one name that is globally usable to anyone who knows the name and has the permission to evaluate it. Those names might be long and clumsy, but simple name space techniques can make the data work smoothly using local aliasing so that I live in "my" world, you live in your world (also called "my" world from your machines), and the longer, globally unique names only arise when we share, which can be done with a trivial, transparent, easy to use file-system interface.

Note that the goal here is not a new file system to use alongside the existing world. Its purpose is to be the only file system to use. Obviously there will be underlying storage systems, but from the user's point of view all access is through this system. I edit a source file, compile it, find a problem, point a friend at it; she accesses the same file, not a copy, to help me understand it. (If she wants a copy, there's always cp!).

This is not a simple thing to do, but I believe it is possible. Here is how I see it being assembled. This discussion will be idealized and skate over a lot of hard stuff. That's OK; this is a vision, not a design document.

Everyone has a name.

Each person is identified by a name. To make things simple here, let's just use an e-mail address. There may be a better idea, but this is sufficient for discussion. It is not a problem to have multiple names (addresses) in this model, since the sharing and access rights will treat the two names as distinct users with whatever sharing rights they choose to use.

Everyone has stable storage in the network.

Each person needs a way to make data accessible to the network, so the storage must live in the network. The easiest way to think of this is like the old network file systems, with per-user storage in the server farm. At a high level, it doesn't matter what that storage is or how it is structured, as long as it can be used to provide the storage layer for a file-system-like API.

Everyone's storage server has a name identified by the user's name.

The storage in the server farm is identified by the user's name.

Everyone has local storage, but it's just a cache.

It's too expensive to send all file access to the servers, so the local device, whatever it is—laptop, desktop, phone, watch—caches what it needs and goes to the server as needed. Cache protocols are an important part of the implementation; for the point of this discussion, let's just say they can be designed to work well. That is a critical piece and I have ideas, but put that point aside for now.

The server always knows what devices have cached copies of the files on local storage. 

The cache always knows what the associated server is for each directory file in its cache and maintains consistency within reasonable time boundaries.

The cache implements the API of a full file system. The user lives in this file system for all the user's own files. As the user moves between devices, caching protocols keep things working.

Everyone's cache can talk to multiple servers.

A user may have multiple servers, perhaps from multiple providers. The same cache and therefore same file system API refers to them all equivalently. Similarly, if a user accesses a different user's files, the exact same protocol is used, and the result is cached in the same cache the same way. This is federation as architecture.

Every file has a globally unique name.

Every file is named by this triple: (host address, user name, file path). Access rights aside, any user can address any other user's file by evaluating the triple. The real access method will be nicer in practice, of course, but this is the gist.

Every file has a potentially unique ACL.

Although the user interface for access control needs to be very easy, the effect is that each file or directory has an access control list (ACL) that mediates all access to its contents. It will need to be very fine-grained with respect to each of users, files, and rights.

Every user has a local name space.

The cache/file-system layer contains functionality to bind things, typically directories, identified by such triples into locally nicer-to-use names. An obvious way to think about this is like an NFS mount point for /home, where the remote binding attaches to /home/XXX the component or components in the network that the local user wishes to identify by XXX. For example, Joe might establish /home/jane as a place to see all the (accessible to Joe) pieces of Jane's world. But it can be much finer-grained than that, to the level of pulling in a single file.

The NFS analogy only goes so far. First, the binding is a lazily-evaluated, multi-valued recipe, not a Unix-like mount. Also, the binding may itself be programmatic, so that there is an element of auto-discovery. Perhaps most important, one can ask any file in the cached local system what its triple is and get its unique name, so when a user wishes to share an item, the triple can be exposed and the remote user can use her locally-defined recipe to construct the renaming to make the item locally accessible. This is not as mysterious or as confusing in practice as it sounds; Plan 9 pretty much worked like this, although not as dynamically.

Everyone's data becomes addressable.

Twitter gives you (or anyone you permit) access to your Twitter data by implementing the API, just as the regular, more file-like servers do. The same story applies to any entity that has data it wants to make usable. At some scaling point, it becomes wrong not to play.

Everyone's data is secure.

It remains to be figured out how to do that, I admit, but with a simple, coherent data model that should be achievable.

Is this a product?

The protocols and some of the pieces, particularly what runs on local devices, should certainly be open source, as should a reference server implementation. Companies should be free to provide proprietary implementations to access their data, and should also be free to charge for hosting. A cloud provider could charge hosting fees for the service, perhaps with some free or very cheap tier that would satisfy the common user. There's money in this space.

What is this again?

What Google Drive should be. What Dropbox should be. What file systems can be. The way we unify our data access across companies, services, programs, and people. The way I want to live and work.

Never again should someone need to manually copy/upload/download/plugin/workflow/transfer data from one machine to another. 

Thursday, September 21, 2017

Go: Ten years and climbing



Drawing Copyright ©2017 Renee French


This week marks the 10th anniversary of the creation of Go.


The initial discussion was on the afternoon of Thursday, the 20th of September, 2007. That led to an organized meeting between Robert Griesemer, Rob Pike, and Ken Thompson at 2PM the next day in the conference room called Yaounde in Building 43 on Google's Mountain View campus. The name for the language arose on the 25th, several messages into the first mail thread about the design:

Subject: Re: prog lang discussion From: Rob 'Commander' Pike Date: Tue, Sep 25, 2007 at 3:12 PM To: Robert Griesemer, Ken Thompson i had a couple of thoughts on the drive home. 1. name 'go'. you can invent reasons for this name but it has nice properties. it's short, easy to type. tools: goc, gol, goa. if there's an interactive debugger/interpreter it could just be called 'go'. the suffix is .go ...

(It's worth stating that the language is called Go; "golang" comes from the web site address (go.com was already a Disney web site) but is not the proper name of the language.)


The Go project counts its birthday as November 10, 2009, the day it launched as open source, originally on code.google.com before migrating to GitHub a few years later. But for now let's date the language from its conception, two years earlier, which allows us to reach further back, take a longer view, and witness some of the earlier events in its history.


The first big surprise in Go's development was the receipt of this mail message:

Subject: A gcc frontend for Go
From: Ian Lance Taylor Date: Sat, Jun 7, 2008 at 7:06 PM To: Robert Griesemer, Rob Pike, Ken Thompson One of my office-mates pointed me at http://.../go_lang.html . It seems like an interesting language, and I threw together a gcc frontend for it. It's missing a lot of features, of course, but it does compile the prime sieve code on the web page.

The shocking yet delightful arrival of an ally (Ian) and a second compiler (gccgo) was not only encouraging, it was enabling. Having a second implementation of the language was vital to the process of locking down the specification and libraries, helping guarantee the high portability that is part of Go's promise.


Even though his office was not far away, none of us had even met Ian before that mail, but he has been a central player in the design and implementation of the language and its tools ever since.


Russ Cox joined the nascent Go team in 2008 as well, bringing his own bag of tricks. Russ discovered—that's the right word—that the generality of Go's methods meant that a function could have methods, leading to the http.HandlerFunc idea, which was an unexpected result for all of us. Russ promoted more general ideas too, like the the io.Reader and io.Writer interfaces, which informed the structure of all the I/O libraries.


Jini Kim, who was our product manager for the launch, recruited the security expert Adam Langley to help us get Go out the door. Adam did a lot of things for us that are not widely known, including creating the original golang.org web page and the build dashboard, but of course his biggest contribution was in the cryptographic libraries. At first, they seemed disproportionate in both size and complexity, at least to some of us, but they enabled so much important networking and security software later that they become a crucial part of the Go story. Network infrastructure companies like Cloudflare lean heavily on Adam's work in Go, and the internet is better for it. So is Go, and we thank him.


In fact a number of companies started to play with Go early on, particularly startups. Some of those became powerhouses of cloud computing. One such startup, now called Docker, used Go and catalyzed the container industry for computing, which then led to other efforts such as Kubernetes. Today it's fair to say that Go is the language of containers, another completely unexpected result.


Go's role in cloud computing is even bigger, though. In March of 2014 Donnie Berkholz, writing for RedMonk, claimed that Go was "the emerging language of cloud infrastructure". Around the same time, Derek Collison of Apcera stated that Go was already the language of the cloud. That might not have been quite true then, but as the word "emerging" used by Berkholz implied, it was becoming true.


Today, Go is the language of the cloud, and to think that a language only ten years old has come to dominate such a large and growing industry is the kind of success one can only dream of. And if you think "dominate" is too strong a word, take a look at the internet inside China. For a while, the huge usage of Go in China signaled to us by the Google trends graph seemed some sort of mistake, but as anyone who has been to the Go conferences in China can attest, the measurements are real. Go is huge in China.


In short, ten years of travel with the language have brought us past many milestones. The most astonishing is at our current position: a conservative estimate suggests there are at least half a million Go programmers. When the mail message naming Go was sent, the idea of there being half a million gophers would have sounded preposterous. Yet here we are, and the number continues to grow.


Speaking of gophers, it's been fun to watch how Renee French's idea for a mascot, the Go gopher, became not only a much loved creation but also a symbol for Go programmers everywhere. Many of the biggest Go conferences are called GopherCons as they gather together gophers from all over the world.


Gopher conferences are taking off. The first one was only three years ago, yet today there are many, all around the world, plus countless smaller local "meetups". On any given day, there is more likely than not a group of gophers meeting somewhere in the world to share ideas.


Looking back over ten years of Go design and development, it is astounding to reflect on the growth of the Go community. The number of conferences and meetups, the long and ever-increasing list of contributors to the Go project, the profusion of open source repositories hosting Go code, the number of companies using Go, some exclusively: these are all astonishing to contemplate.


For the three of us, Robert, Rob, and Ken, who just wanted  to make our programming lives easier, it's incredibly gratifying to witness what our work has started.

What will the next ten years bring?

- Rob Pike, with Robert Griesemer and Ken Thompson

Saturday, February 25, 2017

The power of role models

I spent a few days a while back in a board meeting for a national astronomy organization and noticed a property of the population in that room: Out of about 40 people, about a third were women. And these were powerful women, too: professors, observatory directors and the like. Nor were they wallflowers. Their contributions to the meeting exceeded their proportion.

In my long career, I had never before been in a room like that, and the difference in tone, conversation, respect, and professionalism was unlike any I have experienced. I can't prove it was the presence of women that made the difference - it could just be that astronomers are better people all around, a possibility I cannot really refute - but it seemed to me that the difference stemmed from the demographics.

The meeting was one-third women, but of course in private conversation, when pressed, the women I spoke to complained that things weren't equal yet. We all have our reference points.

But let's back up for a moment and think about the main point: In a room responsible for overseeing the budget and operation of major astronomical observatories, including things like the Hubble telescope, women played a major role. The contrast with computing is stark.

It really got me thinking. At dinner I asked some of the women to speak to me about this, how astronomy became so (relatively) egalitarian. And one topic became clear: role models. Astronomy has a long history of women active in the field, going all the way back to Caroline Herschel in the early 19th century. Women have made huge contributions to the field. Dava Sobel just wrote a book about the women who laid the foundations for the discovery of the expansion of the universe. Just a couple of weeks ago, papers ran obituaries of Vera Rubin, the remarkable observational astronomer who discovered the evidence for dark matter. I could mention Jocelyn Bell, whose discovery of pulsars got her advisor a Nobel (sic).

The most famous astronomer I met growing up was Helen Hogg, the (adopted) Canadian astronomer at David Dunlap Observatory outside Toronto, who also did a fair bit of what we now call outreach.

The women at the meeting spoke of this, a history of women contributing, of role models to look up to, of proof that women can make major contributions to the field.

What can computing learn from this?

It seems we're doing it wrong. The best way to improve the representation of women in the field is not to recruit them, important though that is, but to promote them. To create role models. To push them into positions of influence. Women leave computing in large numbers because they don't see a path up, or because the culture makes them unwelcome. More women excelling in the field, famous women, brilliant women, would be inspiring.

Men have the power to help fix those things, but they also should have the courage to cede the stage to women more often, to fight the stupid bias that keeps women from excelling in the field. It may take proactive behavior, like choosing a women over a man when growing your team, just because, or promoting women more freely.

But as I see it, without something being done to promote female role models, the way things are going computing will still be backwards a hundred years from now.

What We Got Right, What We Got Wrong

  This is my closing talk ( video ) from the GopherConAU conference in Sydney, given November 10, 2023, the 14th anniversary of Go being lau...