blarg?

May 27, 2016

Developers Are The New Mainframes

Filed under: documentation,future,interfaces,lunacy,mozilla,science,weird,work — mhoye @ 3:20 pm

This is another one of those rambling braindump posts. I may come back for some fierce editing later, but in the meantime, here’s some light weekend lunacy. Good luck getting through it. I believe in you.

I said that thing in the title with a straight face the other day, and not without reason. Maybe not good reasons? I like the word “reason”, I like the little sleight-of-hand it does by conflating “I did this on purpose” and “I thought about this beforehand”. It may not surprise you to learn that in my life at least those two things are not the same at all. In any case this post by Moxie Marlinspike was rattling around in the back of my head when somebody asked me on IRC why it’s hard-and-probably-impossible to make a change to a website in-browser and send a meaningful diff back to the site’s author, so I rambled for a bit and ended up here.

This is something I’ve asked for in the past myself: something like dom-diff and dom-merge, so site users could share changes back with creators. All the “web frameworks” I’ve ever seen are meant to make development easier and more manageable but at the end of the day what goes over the wire is a pile of minified angle-bracket hamburger that has almost no connection the site “at rest” on the filesystem. The only way share a usable change with a site author, if it can be done at all, is to stand up a containerized version of the entire site and edit that. This disconnect between the scale of the change and the work needed to make it is, to put it mildly, a huge barrier to somebody who wants to correct a typo, tweak a color or add some alt-text to an image.

I ranted about this for a while, about how JavaScript has made classic View Source obsolete and how even if you had dom-diff and dom-merge you’d need a carefully designed JS framework underneath designed specifically to support them, and how it makes me sad that I don’t have the skill set or free time to make that happen. But I think that if you dig a little deeper, there are some cold economics underneath that whole state of affairs that are worth thinking about.

I think that the basic problem here is the misconception that federation is a feature of distributed systems. I’m pretty confident that it’s not; specifically, I believe that federated systems are a byproduct of computational scarcity.

Building and deploying federated systems has a bunch of hard tradeoffs around development, control and speed of iteration that people are stuck with when computation is so expensive that no single organization can have or do enough of it to give a service global reach. Usenet, XMPP, email and so forth were products of this mainframe-and-minicomputer era; the Web is the last and best of them.

Protocol consensus is hard, but not as hard or expensive as a room full of $40,000 or $4,000,000 computers, so you do that work and accept the fact that what you gain in distributed stability you lose in iteration speed and design flexibility. The nature of those costs means the pressure to get it pretty close to right on the first try is very high, because real opportunities to revisit will be rare and costly. You’re fighting your own established success at that point, and nothing in tech has more inertia than a status quo whose supporters think is good enough. (See also: how IPV6 has been “right around the corner” for 20 years.)

But that’s just not true anymore. If you need a few thousand more CPUs, you twiddle the dials on your S3 page and go back to unified deployment, rapid experimental iteration and trying to stay ahead of everyone else who’s doing the same. That’s how WhatsApp can deploy end to end encryption with one software update, just like that. It’s how Facebook can update a billion users’ experiences whenever they feel like it, and presumably how Twitter does whatever the hell Twitter’s doing this week. They don’t ask permission or seek consensus because they don’t have to; they deploy, test and iterate.

So the work that used to enable, support and improve federated systems now mostly exists where domain-computation is still scarce and expensive: the development process itself. Specifically the inside of developers heads, developers who stubbornly and despite our best efforts remain expensive, high-maintenance and relatively low-bandwidth, with lots of context and application-reasoning locked up in their heads and poorly distributed.

Which is to say: developers are the new mainframes.

Right now great majority of what they’re “connected” to from a development-on-device perspective are de-facto dumb terminals. Apps, iPads, Android phones. Web pages you can’t meaningfully modify for values of “meaningful” that involve upstreaming a diff. From a development perspective those are the endpoints of one-way transmissions, and there’s no way to duplex that line to receive development-effort back.

So, if that’s the trend – that is, if in general centralized-then-federated systems get reconsolidated in socially-oriented verticals, (and that’s what issue trackers are when compared to mailing lists) – then development as a practice is floating around the late middle step, but development as an end product – via cheap CPU and hackable IoT devices – that’s just getting warmed up. The obvious Next Thing in that space will be a resurgence of something like the Web, made of little things that make little decisions – effectively distributing, commodifying and democratizing programming as a product, duplexing development across those newly commodified development-nodes.

That’s the real revolution that’s coming, not the thousand-dollar juicers or the bluetooth nosehair trimmers, but the mess of tiny hackable devices that start to talk to each other via decentralized, ultracommodified feedback loops. We’re missing a few key components – bug trackers aren’t quite source-code-managers or social-ey, IoT build tools aren’t one-click-to-deploy and so forth, but eventually there will be a single standard for how these things communicate and run despite everyone’s ongoing efforts to force users into the current and very-mainframey vendor lock-in, the same way there were a bunch of proprietary transport protocols before TCP/IP settled the issue. Your smarter long-game players will be the ones betting on JavaScript to come out on top there, though it’s possible there will be other contenders.

The next step will be the social one, though “tribal” might be a better way of putting it – the eventual recentralization of this web of thing-code into cultural-preference islands making choices about how they speak to the world around them and the world speaks back. Basically a hardware scripting site with a social aspect built in, communities and trusted sources building social/subscriber model out for IoT agency. What the Web became and is still in a lot of ways becoming as we figure the hard part – the people at scale part, out. The Web of How Stuff Works.

Anyway, if you want to know what the next 15-20 years will look like, that’s the broad strokes. Probably more like 8-12, on reflection. Stuff moves pretty quick these days, but like I said, building consensus is hard. The hard part is always people. This is one of the reasons I think Mozilla’s mission is only going to get more important for the foreseeable future; the Web was the last and best of the federated systems, worth fighting for on those grounds alone, and we’re nowhere close to done learning everything it’s got to teach us about ourselves, each other and what it’s possible for us to become. It might be the last truly open, participatory system we get, ever. Consensus is hard and maybe not necessary anymore, so if we can’t keep the Web and the lessons we’ve learned and can still learn from it alive long enough to birth its descendants, we may never get a chance to build another system like it.

[minor edits since first publication. -mhoye]

July 24, 2015

Hostage Situation

(This is an edited version of a rant that started life on Twitter. I may add some links later.)

Can we talk for a few minutes about the weird academic-integrity hostage situation going on in CS research right now?

We share a lot of data here at Mozilla. As much as we can – never PII, not active security bugs, but anyone can clone our repos or get a bugzilla account, follow our design and policy discussions, even watch people design and code live. We default to open, and close up only selectively and deliberately. And as part of my job, I have the enormous good fortune to periodically go to conferences where people have done research, sometimes their entire thesis, based on our data.

Yay, right?

Some of the papers I’ve seen promise results that would be huge for us. Predicting flaws in a patch prereview. Reducing testing overhead 80+% with a four-nines promise of no regressions and no loss of quality.

I’m excitable, I get that, but OMFG some of those numbers. 80 percent reductions of testing overhead! Let’s put aside the fact that we spend a gajillion dollars on the physical infrastructure itself, let’s only count our engineers’ and contributors’ time and happiness here. Even if you’re overoptimistic by a factor of five and it’s only a 20% savings we’d hire you tomorrow to build that for us. You can have a plane ticket to wherever you want to work and the best hardware money can buy and real engineering support to deploy something you’ve already mostly built and proven. You want a Mozilla shirt? We can get you that shirt! You like stickers? We have stickers! I’ll get you ALL THE FUCKING STICKERS JUST SHOW ME THE CODE.

I did mention that I’m excitable, I think.

But that’s all I ask. I go to these conferences and basically beg, please, actually show me the tools you’re using to get that result. Your result is amazing. Show me the code and the data.

But that never happens. The people I talk to say I don’t, I can’t, I’m not sure, but, if…

Because there’s all these strange incentives to hold that data and code hostage. You’re thinking, maybe I don’t need to hire you if you publish that code. If you don’t publish your code and data and I don’t have time to reverse-engineer six years of a smart kid’s life, I need to hire you for sure, right? And maybe you’re not proud of the code, maybe you know for sure that it’s ugly and awful and hacks piled up over hacks, maybe it’s just a big mess of shell scripts on your lab account. I get that, believe me; the day I write a piece of code I’m proud of before it ships will be a pretty good day.

But I have to see something. Because from our perspective, making a claim about software that doesn’t include the software you’re talking about is very close to worthless. You’re not reporting a scientific result at that point, however miraculous your result is; you’re making an unverifiable claim that your result exists.

And we look at that and say: what if you’ve got nothing? How can we know, without something we can audit and test? Of course, all the supporting research is paywalled PDFs with no concomitant code or data either, so by any metric that matters – and the only metric that matters here is “code I can run against data I can verify” – it doesn’t exist.

Those aren’t metrics that matter to you, though. What matters to you is either “getting a tenure-track position” or “getting hired to do work in your field”. And by and large the academic tenure track doesn’t care about open access, so you’re afraid that actually showing your work will actively hurt your likelihood of getting either of those jobs.

So here we are in this bizarro academic-research standoff, where I can’t work with you without your tipping your hand, and you can’t tip your hand for fear I won’t want to work with you. And so all of this work that could accomplish amazing things for a real company shipping real software that really matters to real people – five or six years of the best work you’ve ever done, probably – just sits on the shelf rotting away.

So I go to academic conferences and I beg people to publish their results and paper and data open access, because the world needs your work to matter. Because open access plus data/code as a minimum standard isn’t just important to the fundamental principles of repeatable experimental science, the integrity of your field, and your career. It’s important because if you want your work to matter to people, then you’d better put it somewhere that people can see it and use it and thank you for it and maybe even improve on it.

You did this as an undergrad. You insist on this from your undergrads, for exactly the same reasons I’m asking you to do the same: understanding, integrity and plain old better results. And it’s a few clicks and a GitHub account for you to do the same now. But I need you to do it one last time.

Full marks here isn’t “a job” or “tenure”. Your shot at those will be no worse, though I know you can’t see it from where you’re standing. But they’re still only a strong B. An A is doing something that matters, an accomplishment that changes the world for the better.

And if you want full marks, show your work.

October 3, 2014

Rogue Cryptocurrency Bootstrapping Robots

Cuban Shoreline

I tried to explain to my daughter why I’d had a strange day.

“Why was it strange?”

“Well… There’s a thing called a cryptocurrency. ‘Currency’ is another word for money; a cryptocurrency is a special kind of money that’s made out of math instead of paper or metal.”

That got me a look. Money that’s made out of made out of math, right.

“… and one of the things we found today was somebody trying to make a new cryptocurrency. Now, do you know why money is worth anything? It’s a coin or a paper with some ink on it – what makes it ‘money’?”

“… I don’t know.”

“The only answer we have is that it’s money if enough people think it is. If enough people think it’s real, it becomes real. But making people believe in a new kind of money isn’t easy, so what this guy did was kind of clever. He decided to give people little pieces of his cryptocurrency for making contributions to different software projects. So if you added a patch to one of the projects he follows, he’d give you a few of these math coins he’d made up.”

“Um.”

“Right. Kind of weird. And then whoever he is, he wrote a program to do that automatically. It’s like a little robot – every time you change one of these programs, you get a couple of math coins. But the problem is that we update a lot of those programs with our robots, too. Our scripts run, our robots, and then his robots try to give our robots some of his pretend money.”

“…”

“So that’s why my day was weird. Because we found somebody else’s programs trying to give our programs made-up money, in the hope that this made-up money would someday become real.”

“Oh.”

“What did you to today?”

“I painted different animals and gave them names.”

“What kind of names?”

“French names like zaval.”

“Cheval. Was it a good day?”

“Yeah, I like painting.”

“Good, good.”

(Charlie Stross warned us about this. It’s William Gibson’s future, but we still need to clean up after it.)

October 22, 2013

Citation Needed

I may revisit this later. Consider this a late draft. I’m calling this done.

“Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper consideration.” — Stan Kelly-Bootle

Sometimes somebody says something to me, like a whisper of a hint of an echo of something half-forgotten, and it lands on me like an invocation. The mania sets in, and it isn’t enough to believe; I have to know.

I’ve spent far more effort than is sensible this month crawling down a rabbit hole disguised, as they often are, as a straightforward question: why do programmers start counting at zero?

Now: stop right there. By now your peripheral vision should have convinced you that this is a long article, and I’m not here to waste your time. But if you’re gearing up to tell me about efficient pointer arithmetic or binary addition or something, you’re wrong. You don’t think you’re wrong and that’s part of a much larger problem, but you’re still wrong.

For some backstory, on the off chance anyone still reading by this paragraph isn’t an IT professional of some stripe: most computer languages including C/C++, Perl, Python, some (but not all!) versions of Lisp, many others – are “zero-origin” or “zero-indexed”. That is to say, in an array A with 8 elements in it, the first element is A[0], and the last is A[7]. This isn’t universally true, though, and other languages from the same (and earlier!) eras are sometimes one-indexed, going from A[1] to A[8].

While it’s a relatively rare practice in modern languages, one-origin arrays certainly aren’t dead; there’s a lot of blood pumping through Lua these days, not to mention MATLAB, Mathematica and a handful of others. If you’re feeling particularly adventurous Haskell apparently lets you pick your poison at startup, and in what has to be the most lunatic thing I’ve seen on a piece of silicon since I found out the MIPS architecture had runtime-mutable endianness, Visual Basic (up to v6.0) featured the OPTION BASE flag, letting you flip that coin on a per-module basis. Zero- and one-origin arrays in different corners of the same program! It’s just software, why not?

All that is to say that starting at 1 is not an unreasonable position at all; to a typical human thinking about the zeroth element of an array doesn’t make any more sense than trying to catch the zeroth bus that comes by, but we’ve clearly ended up here somehow. So what’s the story there?

The usual arguments involving pointer arithmetic and incrementing by sizeof(struct) and so forth describe features that are nice enough once you’ve got the hang of them, but they’re also post-facto justifications. This is obvious if you take the most cursory look at the history of programming languages; C inherited its array semantics from B, which inherited them in turn from BCPL, and though BCPL arrays are zero-origin, the language doesn’t support pointer arithmetic, much less data structures. On top of that other languages that antedate BCPL and C aren’t zero-indexed. Algol 60 uses one-indexed arrays, and arrays in Fortran are arbitrarily indexed – they’re just a range from X to Y, and X and Y don’t even need to be positive integers.

So by the early 1960’s, there are three different approaches to the data structure we now call an array.

  • Zero-indexed, in which the array index carries no particular semantics beyond its implementation in machine code.
  • One-indexed, identical to the matrix notation people have been using for quite some time. It comes at the cost of a CPU instruction or disused word to manage the offset; usability isn’t free.
  • Arbitrary indices, in which the range is significant with regards to the problem you’re up against.

So if your answer started with “because in C…”, you’ve been repeating a good story you heard one time, without ever asking yourself if it’s true. It’s not about *i = a + n*sizeof(x) because pointers and structs didn’t exist. And that’s the most coherent argument I can find; there are dozens of other arguments for zero-indexing involving “natural numbers” or “elegance” or some other unresearched hippie voodoo nonsense that are either wrong or too dumb to rise to the level of wrong.

The fact of it is this: before pointers, structs, C and Unix existed, at a time when other languages with a lot of resources and (by the standard of the day) user populations behind them were one- or arbitrarily-indexed, somebody decided that the right thing was for arrays to start at zero.

So I found that person and asked him.

His name is Dr. Martin Richards; he’s the creator of BCPL, now almost 7 years into retirement; you’ve probably heard of one of his doctoral students Eben Upton, creator of the Raspberry Pi. I emailed him to ask why he decided to start counting arrays from zero, way back then. He replied that…

As for BCPL and C subscripts starting at zero. BCPL was essentially designed as typeless language close to machine code. Just as in machine code registers are typically all the same size and contain values that represent almost anything, such as integers, machine addresses, truth values, characters, etc. BCPL has typeless variables just like machine registers capable of representing anything. If a BCPL variable represents a pointer, it points to one or more consecutive words of memory. These words are the same size as BCPL variables. Just as machine code allows address arithmetic so does BCPL, so if p is a pointer p+1 is a pointer to the next word after the one p points to. Naturally p+0 has the same value as p. The monodic indirection operator ! takes a pointer as it’s argument and returns the contents of the word pointed to. If v is a pointer !(v+I) will access the word pointed to by v+I. As I varies from zero upwards we access consecutive locations starting at the one pointed to by v when I is zero. The dyadic version of ! is defined so that v!i = !(v+I). v!i behaves like a subscripted expression with v being a one dimensional array and I being an integer subscript. It is entirely natural for the first element of the array to have subscript zero. C copied BCPL’s approach using * for monodic ! and [ ] for array subscription. Note that, in BCPL v!5 = !(v+5) = !(5+v) = 5!v. The same happens in C, v[5] = 5[v]. I can see no sensible reason why the first element of a BCPL array should have subscript one. Note that 5!v is rather like a field selector accessing a field in a structure pointed to by v.

This is interesting for a number of reasons, though I’ll leave their enumeration to your discretion. The one that I find most striking, though, is that this is the earliest example I can find of the understanding that a programming language is a user interface, and that there are difficult, subtle tradeoffs to make between resources and usability. Remember, all this was at a time when everything about the future of human-computer interaction was up in the air, from the shape of the keyboard and the glyphs on the switches and keycaps right down to how the ones and zeros were manifested in paper ribbon and bare metal; this note by the late Dennis Ritchie might give you a taste of the situation, where he mentions that five years later one of the primary reasons they went with C’s square-bracket array notation was that it was getting steadily easier to reliably find square brackets on the world’s keyboards.

“Now just a second, Hoye”, I can hear you muttering. “I’ve looked at the BCPL manual and read Dr. Richards’ explanation and you’re not fooling anyone. That looks a lot like the efficient-pointer-arithmetic argument you were frothing about, except with exclamation points.” And you’d be very close to right. That’s exactly what it is – the distinction is where those efficiencies take place, and why.

BCPL was first compiled on an IBM 7094here’s a picture of the console, though the entire computer took up a large room – running CTSS – the Compatible Time Sharing System – that antedates Unix much as BCPL antedates C. There’s no malloc() in that context, because there’s nobody to share the memory core with. You get the entire machine and the clock starts ticking, and when your wall-clock time block runs out that’s it. But here’s the thing: in that context none of the offset-calculations we’re supposedly economizing are calculated at execution time. All that work is done ahead of time by the compiler.

You read that right. That sheet-metal, “wibble-wibble-wibble” noise your brain is making is exactly the right reaction.

Whatever justifications or advantages came along later – and it’s true, you do save a few processor cycles here and there and that’s nice – the reason we started using zero-indexed arrays was because it shaved a couple of processor cycles off of a program’s compilation time. Not execution time; compile time.

Does it get better? Oh, it gets better:

IBM had been very generous to MIT in the fifties and sixties, donating or discounting its biggest scientific computers. When a new top of the line 36-bit scientific machine came out, MIT expected to get one. In the early sixties, the deal was that MIT got one 8-hour shift, all the other New England colleges and universities got a shift, and the third shift was available to IBM for its own use. One use IBM made of its share was yacht handicapping: the President of IBM raced big yachts on Long Island Sound, and these boats were assigned handicap points by a complicated formula. There was a special job deck kept at the MIT Computation Center, and if a request came in to run it, operators were to stop whatever was running on the machine and do the yacht handicapping job immediately.

Jobs on the IBM 7090, one generation behind the 7094, were batch-processed, not timeshared; you queued up your job along with a wall-clock estimate of how long it would take, and if it didn’t finish it was pulled off the machine, the next job in the queue went in and you got to try again whenever your next block of allocated time happened to be. As in any economy, there is a social context as well as a technical context, and it isn’t just about managing cost, it’s also about managing risk. A programmer isn’t just racing the clock, they’re also racing the possibility that somebody will come along and bump their job and everyone else’s out of the queue.

I asked Tom Van Vleck, author of the above paragraph and also now retired, how that worked. He replied in part that on the 7090…

“User jobs were submitted on cards to the system operator, stacked up in a big tray, and a rudimentary system read, loaded, and ran jobs in sequence. Typical batch systems had accounting systems that read an ID card at the beginning of a user deck and punched a usage card at end of job. User jobs usually specified a time estimate on the ID card, and would be terminated if they ran over. Users who ran too many jobs or too long would use up their allocated time. A user could arrange for a long computation to checkpoint its state and storage to tape, and to subsequently restore the checkpoint and start up again.

The yacht handicapping job pertained to batch processing on the MIT 7090 at MIT. It was rare — a few times a year.”

So: the technical reason we started counting arrays at zero is that in the mid-1960’s, you could shave a few cycles off of a program’s compilation time on an IBM 7094. The social reason is that we had to save every cycle we could, because if the job didn’t finish fast it might not finish at all and you never know when you’re getting bumped off the hardware because the President of IBM just called and fuck your thesis, it’s yacht-racing time.

There are a few points I want to make here.

The first thing is that as far as I can tell nobody has ever actually looked this up.

Whatever programmers think about themselves and these towering logic-engines we’ve erected, we’re a lot more superstitious than we realize. We tell and retell this collection of unsourced, inaccurate stories about the nature of the world without ever doing the research ourselves, and there’s no other word for that but “mythology”. Worse, by obscuring the technical and social conditions that led humans to make these technical and social decisions, by talking about the nature of computing as we find it today as though it’s an inevitable consequence of an immutable set of physical laws, we’re effectively denying any responsibility for how we got here. And worse than that, by refusing to dig into our history and understand the social and technical motivations for those choices, by steadfastly refusing to investigate the difference between a motive and a justification, we’re disavowing any agency we might have over the shape of the future. We just keep mouthing platitudes and pretending the way things are is nobody’s fault, and the more history you learn and the more you look at the sad state of modern computing the the more pathetic and irresponsible that sounds.

Part of the problem is access to the historical record, of course. I was in favor of Open Access publication before, but writing this up has cemented it: if you’re on the outside edge of academia, $20/paper for any research that doesn’t have a business case and a deep-pocketed backer is completely untenable, and speculative or historic research that might require reading dozens of papers to shed some light on longstanding questions is basically impossible. There might have been a time when this was OK and everyone who had access to or cared about computers was already an IEEE/ACM member, but right now the IEEE – both as a knowledge repository and a social network – is a single point of a lot of silent failure. “$20 for a forty-year-old research paper” is functionally indistinguishable from “gone”, and I’m reduced to emailing retirees to ask them what they remember from a lifetime ago because I can’t afford to read the source material.

The second thing is how profoundly resistant to change or growth this field is, and apparently has always been. If you haven’t seen Bret Victor’s talk about The Future Of Programming as seen from 1975 you should, because it’s exactly on point. Over and over again as I’ve dredged through this stuff, I kept finding programming constructs, ideas and approaches we call part of “modern” programming if we attempt them at all, sitting abandoned in 45-year-old demo code for dead languages. And to be clear: that was always a choice. Over and over again tools meant to make it easier for humans to approach big problems are discarded in favor of tools that are easier to teach to computers, and that decision is described as an inevitability.

This isn’t just Worse Is Better, this is “Worse Is All You Get Forever”. How many off-by-one disasters could we have avoided if the “foreach” construct that existed in BCPL had made it into C? How much more insight would all of us have into our code if we’d put the time into making Michael Chastain’s nearly-omniscient debugging framework – PTRACE_SINGLESTEP_BACKWARDS! – work in 1995? When I found this article by John Backus wondering if we can get away from Von Neumann architecture completely, I wonder where that ambition to rethink our underpinnings went. But the fact of it is that it didn’t go anywhere. Changing how you think is hard and the payoff is uncertain, so by and large we decided not to. Nobody wanted to learn how to play, much less build, Engelbart’s Violin, and instead everyone gets a box of broken kazoos.

In truth maybe somebody tried – maybe even succeeded! – but it would cost me hundreds of dollars to even start looking for an informed guess, so that’s the end of that.

It’s hard for me to believe that the IEEE’s membership isn’t going off a demographic cliff these days as their membership ages, and it must be awful knowing they’ve got decades of delicious, piping-hot research cooked up that nobody is ordering while the world’s coders are lining up to slurp watery gruel out of a Stack-Overflow-shaped trough and pretend they’re well-fed. You might not be surprised to hear that I’ve got a proposal to address both those problems; I’ll let you work out what it might be.

July 25, 2013

Algorithmically Marginalized

Filed under: digital,fail,future,interfaces,science,work — mhoye @ 1:03 pm

I wouldn’t have thought that mathematics or signal processing would have a cultural bent, but I just sat through a conference call where everyone was reasonably clear except for one guy, with a pronounced central-African accent, whose voice was getting audibly butchered by the noise cancellation algorithm on the line. The beginning of every sentence, and every pause, was punctuated by a sort of wierd, static-and-squarewave tug-of-war with the background noise.

I think it’s some combination of his accent and cadence of his speech, and it was really weird to notice the trend. On reflection, it makes perfect sense – algorithms optimized for the majority, as defined by the people who wrote them, would of course have a cultural impact on people at the margins – it just hadn’t occurred to me how that would work until just now.

June 8, 2013

Crypto Is Not A Panacea

Filed under: academic,digital,doom,future,interfaces,science,vendetta,work — mhoye @ 9:36 am

Bricks

I was going to write this to an internal mailing list, following this week’s PRISM excitement, but I’ve decided to put it here instead. It was written (and cribbed from other stuff I’ve written elsewhere) in response to an argument that encrypting everything would somehow solve a scary-sounding though imprecisely-specified problem, a claim you may not be surprised to find out I think is foolish.

I’ve written about this elsewhere, so forgive me, but: I think that it’s a profound mistake to assume that crypto is a panacea here.

Backstory time: in 1993, the NSA released SHA, the Secure Hashing Algorithm; you’ve heard of it, I’m sure. Very soon afterwards – months, I think? – they came back and said no, stop, don’t use that. Use SHA-1 instead, here you go.

No explanation, nothing. But nobody else could even begin to make a case either way, so SHA-1 it is.

It’s 2005 before somebody manages to generate one, just one, collision in what’s now called SHA-0, and they do that by taking a theoretical attack that gets you close to a collision, generalizing it and running it for around 80,000 CPU hours or so on a machine with 256 Itanium-2 processors running this one job flat out for two weeks.

That hardware straight up didn’t exist in 1993. That was the year the original Doom came out, for what it’s worth, so it’s very likely that the “significant weakness” they found was found by a person or team of people scribbling on a whiteboard. And, note, they found the weaknesses in that algorithm in the weeks after publication when those holes – or indeed “any holes at all” – would take the public-facing crypto community more than a decade to discover were a theoretical possibility.

Now, wash that tender morsel down with this quote from an article in Wired quoting James Bamford, longtime writer about all things NSA:

“According to another top official also involved with the program, the NSA made an enormous breakthrough several years ago in its ability to cryptanalyze, or break, unfathomably complex encryption systems employed by not only governments around the world but also many average computer users in the US. The upshot, according to this official: “Everybody’s a target; everybody with communication is a target.”

“Many average computer users in the US”? Welp. That’s SSL, then.

So odds are good that what we here in the public and private sectors consider to be strong crypto isn’t much more of an impediment for the NSA than ROT-13. In the public sector AES-128 is considered sufficient for information up to level “secret” only; AES-256 is for “top secret”, and both are part of the NSA’s Suite B series of cryptographic algorithms, outlined here.

Suite A is unlikely to ever see the light of day, not even so much as their names. The important thing that this suggests is that the NSA may internally have a class break for their recommended Series B crypto algorithms, or at least an attack that makes decryption computationally feasible for a small set of people that includes themselves, and indeed for anything weaker, or with known design flaws.

The problem that needs to be addressed here is a policy problem, not a technical one. And that’s actually great news, because if you’re getting into a pure-math-and-computational-power arms race with the NSA, you’re gonna have a bad time.

February 18, 2013

That’s Too Much Machine For You

Filed under: awesome,documentation,future,interfaces,irc,linux,science,toys — mhoye @ 11:10 am

Keep This Area Clear

Man, how awful is it to see people broken by the realization that they are no longer young. Why are you being cantankerous, newly-old person? It’s totally OK not to be 17 or 23, things are still amazing! Kids are having fun! You may not really understand it, but just roll with it! The stuff you liked when you were 17 isn’t diminished by your creeping up on 40!

This has been making the rounds, a lazy, disappointing article from Wired about the things we supposedly “learned about hacking” from the 1995 almost-classic, Hackers. It’s a pretty unoriginal softball of an article, going for a few easy smirks by cherrypicking some characters’ sillier idiosyncrasies while making the author sound like his birthday landed on him like a cartoon piano.

We need a word for this whole genre of writing, where the author tries far too hard to convince you of his respectable-grownup-hood by burning down his youth. It’s hard to believe that in fifteen years the cycle won’t repeat itself, with this article being the one on the pyre; you can almost smell the smoke already, the odor of burning Brut and secret regrets.

The saddest part of the article, really, is how much it ignores. Which is to say: just about everything else. There’s plenty of meat to chew on there, so I don’t really understand why; presumably it has something to do with deadlines or clickthroughs or word-counts or column inches or something, whatever magic words the writers at Wired burble as they pantomime their editor’s demands and sob into their dwindling Zima stockpile.

I’ve got quite a soft spot in my heart and possibly also my brain for this movie, in part because it is flat-out amazing how many things Hackers got exactly right:

  • Most of the work involves sitting in immobile concentration, staring at a screen for hours trying to understand what’s going on? Check.
  • It’s usually an inside job from a disgruntled employee? Check.
  • A bunch of kids who don’t really understand how severe the consequences of what they’re up to can be, in it for kicks? Check.
  • Grepping otherwise-garbage swapfiles for security-sensitive information? Almost 20 years later most people still don’t get why that one’s a check, but my goodness: check.
  • Social-engineering for that one piece of information you can’t get otherwise, it works like a charm? Check.
  • Using your computer to watch a TV show you wouldn’t otherwise be able to? Golly, that sounds familiar.
  • Dumpster-diving for source printouts? I suspect that for most of my audience “line printers” fit in the same mental bucket as “coelecanth”, and printing anything at all, much less code, seems kind of silly and weird by now, so you’ll just have to take my word for it when I say: very much so, check.
  • A computer virus that can affect industrial control systems, causing a critical malfunction? I wonder where I’ve heard that recently.
  • Abusive prosecutorial overreach, right from the opening scene? You’d better believe, check.

So if you haven’t seen it, Hackers is a remarkable artefact of its time. It’s hardly perfect; the dialog is uneven, the invented slang aged as well as invented-slang always does. Moore’s Law has made anything with a number on the side look kind of quaint, and there’s plenty of that horrible neon-cars-on-neon-highways that directors seem to fall back on when they need to show you what the inside of a computer is doing. But really: Look at that list. Look at it.

For all its flaws, sure, Hackers may not be something you’d hold aloft as a classic. But it’s good fun and it gets an awful lot more right than wrong, and that’s not nothing.

June 4, 2012

Today, In Orbital Panopticon News

Filed under: doom,future,interfaces,lunacy,science,toys — mhoye @ 3:23 pm

This is really astounding, though perhaps it shouldn’t be. The Department of Defence has given NASA a gift of two better-than-Hubble telescopes it built but never used, because despite this quote describing them…

They have 2.4-meter (7.9 feet) mirrors, just like the Hubble. They also have an additional feature that the civilian space telescopes lack: A maneuverable secondary mirror that makes it possible to obtain more focused images. These telescopes will have 100 times the field of view of the Hubble, according to David Spergel, a Princeton astrophysicist and co-chair of the National Academies advisory panel on astronomy and astrophysics.

… it considers them to be outdated. That’s right – 100 times the field of view of the Hubble, more maneuverable and able to take far more accurate pictures, hugely better than any instrument available to any civilian anywhere, and apparently an antique. As The Atlantic notes:

“That’s right. Our military had two, unflown, better-than-Hubble space telescopes just sitting around. […] This is the state of our military-industrial-scientific complex in miniature: The military has so much money that it has two extra telescopes better than anything civilians have; meanwhile, NASA will need eight years to find enough change in the couches at Cape Canaveral to turn these gifts into something they can use. Anyone else find anything wrong with this state of affairs?”

Maybe just the fact that those cameras were intended to be pointed down, not up.

The issue’s not whether you’re paranoid, Lenny, I mean look at this shit, the issue is whether you’re paranoid enough.

Strange Days, 1995.

May 14, 2012

Bring Your Daughter To Work Day

Filed under: awesome,beauty,life,parenting,science,toys — mhoye @ 9:11 pm

Switching Tires

Work

Seen here wearing her favorite Rocket Shirt, this is Maya is helping me change the tires on my car. It’s important to get kids started early on this sort of work, I think.

Early Signs Of Anthro

Filed under: documentation,interfaces,parenting,science — mhoye @ 9:05 pm

Smug

Noted psychoanalyst Jacques Lacan has an interesting theory about infant development, something called the “Mirror Stage”. The idea is that at some point very early in a child’s development they will, on seeing themselves and a parent in the mirror, look from themselves to their parent back to themselves, in shock and laughter; this is the infant’s discovery of the Self, and the moment of differentiation from the Other, the forming brain’s earliest discovery that they are in the world, and differentiated from the other within it.

Like most early theories about psychological development, it’s bunk; it bears no relation to empirically obtained results, casual scrutiny or even common sense. The congenitally blind, for example, still believe themselves to be individuals and humans antedate silvered glass by more than a few years. But it’s really compellingly-told bunkum, and makes for good stories that are easy to retell; much like Freud, it survives in popular narrative long after it’s been deprecated or debunked by the professional community.

The main reason bunkum like Freud or Lacan’s – the urban myths of the human psyche, really – is stories like the one I’m about to tell you get told all the time.

Coincidentally about a week and a half ago and for no particular reason, I started showing Carter himself in a mirror before putting him in the bath. And in the space of a week, Carter has gone from being a wad of fussy protohuman cookie dough to tracking faces, interacting with noises and just generally acting like the early stages of an actual human. It’s really remarkable how quickly that happens, like (a lot like, I bet) something in his head just finished self-assembling and turned itself on. Carter and I have conversations, now – he makes a noise, and I respond, and he seems – how can I know for sure, really, but he seems – to understand that if we make eye contact, he can make a little noise, and I’ll make one back, and then he makes another. He tracks my face, though with about a third of a second in lag.

I’ve felt that rush of understanding, staring at code; suddenly the pieces all seem to coalesce, and the solution to the problem I’ve been staring at for hours or days is just there, intact. I can’t imagine what it must be like for an infant, to go from random shapes and noises to other people. It must be a hell of a thing.

But that thing with the mirrors is kind of silly, really.

« Newer PostsOlder Posts »

Powered by WordPress