blarg?

October 23, 2019

80×25

Every now and then, my brain clamps on to obscure trivia like this. It takes so much time. “Because the paper beds of banknote presses in 1860 were 14.5 inches by 16.5 inches, a movie industry cartel set a standard for theater projectors based on silent film, and two kilobytes is two kilobytes” is as far back as I have been able to push this, but let’s get started.

In August of 1861, by order of the U.S. Congress and in order to fund the Union’s ongoing war efforts against the treasonous secessionists of the South, the American Banknote Company started printing what were then called “Demand Notes”, which were soon widely known as “greenbacks”.

It’s difficult to research anything about the early days of American currency on Wikipedia these days; that space has been thoroughly colonized by the goldbug/sovcit cranks. You wouldn’t notice it from a casual examination, which is of course the plan; that festering rathole is tucked away down in the references, where articles will fold a seemingly innocuous line somewhere into the middle, tagged with an exceptionally dodgy reference. You’ll learn that “the shift from demand notes to treasury notes meant they could no longer be redeemed for gold coins[1]” – which is strictly true! – but if you chase down that footnote you wind up somewhere with a name like “Lincoln’s Treason – Fiat Currency, Maritime Law And The U.S. Treasury’s Conspiracy To Enslave America”, which I promise I am only barely exaggerating about.

It’s not entirely clear if this is a deliberate exercise in coordinated crank-wank or just years of accumulated flotsam from the usual debate-club dead-enders hanging off the starboard side of the Overton window. There’s plenty of idiots out there that aren’t quite useful enough to work the k-cups at the Heritage Institute, and I guess they’re doing something with their time, but the whole thing has a certain sinister elegance to it that the Randroid crowd can’t usually muster. I’ve got my doubts either way, and I honestly don’t care to dive deep enough into that sewer to settle them. Either way, it’s always good to be reminded that the goldbug/randroid/sovcit crank spectrum shares a common ideological klancestor.

Mercifully that is not what I’m here for. I am here because these first Demand Notes, and the Treasury Notes that came afterwards, were – on average, these were imprecise times – 7-3/8” wide by 3-1/4” tall.

I haven’t been able to precisely answer the “why” of that – I believe, but do not know, that this is because of the specific dimensions of the presses they were printed on. Despite my best efforts I haven’t been able to find the exact model and specifications of that device. I’ve asked the U.S. Congressional Research Service for some help with this, but between them and the Bureau of Engraving and Printing, we haven’t been able to pin it down. From my last correspondence with them:

Unfortunately, we don’t have any materials in the collection identifying the specific presses and their dimension for early currency production. The best we can say is that the presses used to print currency in the 1860s varied in size and model. These presses went by a number of names, including hand presses, flat-bed presses, and spider presses. They also were capable of printing sheets of paper in various sizes. However, the standard size for printing securities and banknotes appears to have been 14.5 inches by 16.5 inches. We hope this bit of information helps.

… which is unfortunate, but it does give us some clarity. A 16.5″ by 14.5″ printing sheet lets you print eight 7-3/8” by 3-1/4″ notes to size, with a fraction of an inch on either side for trimming.
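
Just to check that the cutting arithmetic holds up, here’s a quick sketch; the two-across, four-down layout is my assumption about how the sheet was divided, not something from the historical record:

    # One plausible imposition for the sheet, to check the arithmetic:
    # two notes across and four down on a 16.5" x 14.5" printing sheet.
    SHEET_W, SHEET_H = 16.5, 14.5
    NOTE_W, NOTE_H = 7 + 3/8, 3 + 1/4   # 7.375" x 3.25"

    across, down = int(SHEET_W // NOTE_W), int(SHEET_H // NOTE_H)
    print(across * down)              # 8 notes per sheet
    print(SHEET_W - across * NOTE_W)  # 1.75" left over for gutters and trim
    print(SHEET_H - down * NOTE_H)    # 1.5" left over for gutters and trim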

The answer to that question starts to matter about twenty years later, on the heels of the 1880 American Census. Mandated to be performed once a decade, the census found that the United States population had grown some 30% since the previous count, and even with enormous effort the final tabulations weren’t finished until 1888, an unacceptable delay.

One of the 1880 Census’ early employees was a man named Herman Hollerith, a recent graduate of the Columbia School of Mines who’d been invited to join the Census efforts early on by one of his professors. The Census was one of the most important social and professional networking exercises of the day, and Hollerith correctly jumped at the opportunity:

The absence of a permanent institution meant the network of individuals with professional census expertise scattered widely after each census. The invitation offered a young graduate the possibility to get acquainted with various members of the network, which was soon to be dispersed across the country.

As an aside, that invitation letter is one of the most important early documents in the history of computing for lots of reasons, including this one:

The machine in that picture was the third generation of the “Hollerith Tabulator”, notable for the replaceable plugboard that made it reprogrammable. I need to find some time to dig further into this, but that might be the first multipurpose, if not “general purpose” as we’ve come to understand it, electromechanical computation device. This is another piece of formative tech that emerged from this era, one that led directly to the removable panels (and ultimately the general componentization) of later computing hardware.

Well before the model 3, though, was the original 1890 Hollerith Census Tabulator that relied on punchcards much like this one.

Hollerith took the inspiration for those punchcards from the “punch photographs” used by some railways at the time to make sure that tickets belonged to the passengers holding them. You can see a description of one patent for them here dating to 1888, but Hollerith relates the story from a few years earlier:

One thing that helped me along in this matter was that some time before I was traveling in the west and I had a ticket with what I think was called a punch photograph. When the ticket was first presented to a conductor he punched out a description of the individual, as light hair, dark eyes, large nose etc. So you see I only made a punch photograph of each person.

Tangentially: this is the birth of computational biometrics. And as you can see from this extract from The Railway News (Vol. XLVIII, No. 1234, published Aug. 27, 1887), people have been concerned about being harassed over unfair assessments by the authorities from day one:

[image: punch-photograph extract from The Railway News]

After experimenting with a variety of card sizes, Hollerith decided that to save on production costs he’d use the same boxes the U.S. Treasury was using for the currency of the day: the Demand Note. Punch cards stayed about that shape, punched with devices that looked a lot like this, for about 20 years, until Thomas Watson Sr. (IBM’s first CEO, from whom the Watson computer gets its name) asked Clair D. Lake and J. Royden Peirce to develop a new, higher-data-density card format.

Tragically, this is the part where I need to admit an unfounded assertion. I’ve got data, the pictures line up and the numbers work, but I don’t have a citation. I wish I did.

Take a look at “Type Design For Typewriters: Olivetti”, written by Maria Ramos Silva. (You can see a historical talk from her on the history of typefaces here that’s also pretty great.)

Specifically, take a look on page 46 at the Mikron Piccolo and Mikron Condensed. The fonts don’t precisely line up – see the different “4”, for example, when comparing them to the typesetting of IBM’s cards – but the size and spacing do. In short: a line of 80 characters, each separated by a space, is the largest round number of digits that the tightest typesetting of the day – a 20-point condensed font – would fit on a single 7-3/8” wide card.

I can’t find a direct citation for this; that’s the only disconnect here. But the spacing all fits, the numbers all work, and I’d bet real money on this: that when Watson gave Lake the task of coming up with a higher information-density punch card, Lake looked around at what they already had on the shelf – a typewriter with the highest-available character density of the day, on cards they could manage with existing and widely-available tooling – and put it all together in 1928. The fact that a square hole – a radical departure from the standard circular punch – was a patentable innovation at the time was just icing on the cake.
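A back-of-the-envelope check, for the curious: the column pitch in IBM’s later published card specs is 0.087″, and if you assume – and it is an assumption – that the 1928 layout used the same pitch, the numbers land right where you’d want them:

    # Column-pitch arithmetic for the 80-column card. The 0.087" pitch is
    # from IBM's later published card spec; assuming 1928 matched it.
    CARD_W = 7 + 3/8    # inches
    PITCH = 0.087       # inches per printed column

    print(80 * PITCH)            # 6.96" of columns across the card
    print(CARD_W - 80 * PITCH)   # ~0.415" left over for the margins
    print(int(CARD_W // PITCH))  # 84 columns would physically fit; 80 is
                                 # the largest round number under that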

The result of that work is something you’ll certainly recognize, the standard IBM punchcard, though of course there’s a lot more to it than that. Witness the full glory of the Card Stock Acceptance Procedure, the protocol for measuring folding endurance, air resistance and smoothness, and for evaluating the ash content, moisture content and pH of the paper, among many other things.

At one point sales of punchcards and related tooling constituted a completely bonkers 30% of IBM’s annual profits, so you can understand that IBM had a lot invested in getting that consistently, precisely correct.

At around this time John Logie Baird invented the first “mechanical television”; like punchcards, the first television cameras were hand-cranked devices that relied on something called a Nipkow disk, a mechanical tool for separating images into sequential scan lines, a technique that survives in electronic form to this day. By linearizing the image signal Baird could transmit the image’s brightness levels via a simple radio signal and in 1926 he did just that, replaying that mechanically encoded signal through a CRT and becoming the inventor of broadcast television. He would go on to pioneer colour television – originally called Telechrome, a fantastic name I’m sad we didn’t keep – but that’s a different story.

Baird’s original “Televisor” showed its images on a 7:3 aspect ratio vertically oriented cathode ray tube, intended to fit the head and shoulders of a standing person, but that wouldn’t last.

For years previously, silent films had been shot on standard 35mm stock, but the addition of a physical audio track to 35mm film stock didn’t leave enough space for the visual area. So – after years of every movie studio having its own preferred aspect ratio, which required its own cameras, projectors, film stock and tools (and and and) – in 1929 the movie industry agreed to settle on the Society of Motion Picture Engineers’ proposed standard of 0.8 inches by 0.6 inches, what became known as the Academy Ratio, or as we better know it today, 4:3.

Between 1932 and 1952, when widescreen for cinemas came into vogue as a differentiator from standard television, just about all the movies made in the world were shot in that aspect ratio, and just about every cathode ray tube made came in that shape, or one that could display it reliably. In 1953 studios started switching to the wider “CinemaScope”, to aggressively differentiate themselves from television, but by then television already had a large, thoroughly entrenched install base, and 4:3 remained the standard for in-home displays – and CRT manufacturers – until widescreen digital television came to market in the 1990s.

As computers moved from teleprinters – like, physical, ink-on-paper line printers – to screens, one byproduct of that standardization was that if you wanted to build a terminal, you either used that aspect ratio or you started making your own custom CRTs, a huge barrier to market entry. You can do that if you’re IBM, and you’re deeply reluctant to if you’re anyone else. So when DEC introduced their VT50 terminal, a successor to the earlier VT05, that’s what they shipped, and with only 1KB of display RAM (one kilobyte!) it displayed only twelve rows of widely-spaced text. Math is unforgiving, and 80×12=960; even one more row breaks the bank. The VT52 and its successor the VT100, though, doubled that capacity, giving users the opulent luxury of two entire kilobytes of display memory, laid out with a font that fit nicely on that 4:3 screen. The VT100 hit the market in August of 1978, and DEC sold more than six million of them over the product’s lifespan.

You even got a whole extra line! Thanks to the magic of basic arithmetic, 80×25 just sneaks under that opulent 2KB limit, with 48 bytes to spare.
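
If you want the budget spelled out – assuming, as these terminals did, one byte of display memory per character cell:

    # The display-memory arithmetic, spelled out (one byte per cell):
    print(80 * 12)         # 960  -- twelve 80-column rows fit in 1KB (1024 bytes)
    print(80 * 13)         # 1040 -- one more row breaks the bank
    print(80 * 25)         # 2000 -- twenty-five rows fit in 2KB (2048 bytes)
    print(2048 - 80 * 25)  # 48   -- bytes to spare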

This is another point where direct connections get blurry, because 1976 to 1984 was an incredibly fertile time in the history of computing. After a brief period where competing terminal standards effectively locked software to the hardware it shipped on, the VT100 – being the first terminal to market fully supporting the recently codified ANSI standard control and escape sequences – quickly became the de facto standard, and soon afterwards the de jure one, codified in ANSI X3.64/ECMA-48. CP/M, soon to be replaced with PC-DOS and then MS-DOS, came from this era, with ANSI.SYS being the way DOS programs talked to the display from DOS 2.0 through to the beginning of Windows. Then in 1983 the Apple IIe was introduced, the first Apple computer to natively support an 80×24 text display, doubling the 40×24 default of their earlier hardware. The original XTerm, first released in 1984, was also created explicitly for VT100 compatibility.
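
To make “ANSI standard control and escape sequences” concrete, here’s a minimal sketch. These particular sequences – ED, CUP and SGR – are among the ones the VT100 shipped with and ECMA-48 codified, and they still work in essentially every terminal emulator today:

    # A few ECMA-48/VT100 control sequences in action. ESC [ is the
    # Control Sequence Introducer (CSI); the final letter picks the function.
    CSI = "\x1b["

    print(CSI + "2J" + CSI + "H", end="")  # ED: erase display; CUP: cursor home
    print(CSI + "7m" + "80x25" + CSI + "0m")  # SGR: reverse video, then reset
    print(CSI + "10;20H" + "hello from row 10, column 20")  # CUP with parameters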

Fascinatingly, the early versions of the ECMA-48 standard make clear that it isn’t solely meant for displays, specifying that “examples of devices conforming to this concept are: an alpha-numeric display device, a printer or a microfilm output device.”

A microfilm output device! This exercise dates to a time when microfilm output was a design constraint! I did not anticipate that cold-war spy-novel flavor while I was dredging this out, but it’s there and it’s magnificent.

It also dates to a time when the market was shifting quickly from mainframes and minicomputers to microcomputers – or, as we call them today, “computers” – reasonably affordable desktop machines that humans might actually own and that companies might buy in large numbers, meaning this is also where the spectre of backcompat starts haunting the industry. This moment in a talk from the Microsoft developers working on the Windows Subsystem for Linux gives you a sense of the scale of that burden even today. In fact, it wasn’t until the fifth edition of ECMA-48 was published in 1991, more than a decade after the VT100 hit the market, that the formal specification for terminal behavior even admitted the possibility (Appendix F) that a terminal could be resized at all, meaning that the existing defaults were effectively graven in stone during what was otherwise one of the most fertile and formative periods in the history of computing.

As a personal aside, my two great frustrations with doing any kind of historical CS research remain the incalculable damage that academic paywalls have done to the historical record, and the relentless insistence this industry has on justifying rather than interrogating the status quo. This is how you end up on Stack Overflow spouting unresearched nonsense about how “4 pixel wide fonts are untidy-looking”. I’ve said this before, and I’ll say it again: whatever we think about ourselves as programmers and these towering logic-engines we’ve erected, we’re a lot more superstitious than we realize, and by telling and retelling these unsourced, inaccurate just-so stories without ever doing the work of finding the real truth, we’re betraying ourselves, our history and our future. But it’s pretty goddamned difficult to convince people that they should actually look things up instead of making up nonsense when actually looking things up, even for a seemingly simple question like this one, can cost somebody on the outside edge of an academic paywall hundreds or thousands of dollars.

So, as is now usual in these things:

  • There are technical reasons,
  • There are social reasons,
  • It’s complicated, and
  • Open access publication or GTFO.

But if you ever wondered why just about every terminal in the world is eighty characters wide and twenty-five characters tall, there you go.

September 11, 2019

Duty Of Care

A colleague asked me what I thought of this Medium article by Owen Bennett on the application of the UK’s Duty of Care laws to software. I’d had… quite a bit of coffee at that point, and this (lightly edited) was my answer:

I think the point Bennett makes early on about the shortcomings of analogy is an important one: however critical analogy is as a conceptual bridge, it is not a valuable endpoint. To some extent analogies are all we have when something is new; this has been true ever since the first person who saw fire had to explain it to somebody else: it warms like the sun but it is not the sun, it stings like a spear but it is not a spear, it eats like an animal but it is not an animal. But after we have seen fire, once we know fire, we can say: we can cage it like an animal, like so; we can warm ourselves by it like the sun, like so. “Analogy” moves from conceptual, where it is only temporarily useful, to functional and structural, where the utility endures.

I keep coming back to something Bryan Cantrill said at the beginning of an old DTrace talk – https://www.youtube.com/watch?v=TgmA48fILq8 – (even before he gets into the DTrace implementation details, the first 10 minutes or so of this talk are amazing) – that analogies between software and literally everything else eventually break down. Is software an idea, or is it a machine? It’s both. Unlike almost everything else.

(Great line from that talk – “Does it bother you that none of this actually exists?”)

But: The UK has some concepts that really do have critical roles as functional- and structural-analogy endpoints for this transition. What is your duty of care here as a developer, and an organization? Is this software fit for purpose?

Given the enormous potential reach of software, those concepts absolutely do need to survive as analogies that are meaningful and enforceable in software-afflicted outcomes, even if the actual text of (the inevitable) regulation of software needs to recognize software as being its own, separate thing, that in the wrong context can be more dangerous than unconstrained fire.

With that in mind, and particularly bearing in mind that the other places the broad “duty of care” analogy extends go well beyond immediate action, and cover stuff like industrial standards, food safety, water quality and the million other things that make modern society work at all, I think Bennett’s argument that “Unlike the situation for ‘offline’ spaces subject to a duty of care, it is rarely the case that the operator’s act or omission is the direct cause of harm accruing to a user — harm is almost always grounded in another user’s actions” incorrectly omits an enormous swath of industrial standards and societal norms that have already made the functional-analogy leap so effectively as to be presently invisible.

Put differently, when Toyota recalls hundreds of thousands of cars for potential defects in which exactly zero people were harmed, we consider that responsible stewardship of their product. And when the people working at Uber straight up murder a person with an autonomous vehicle, they’re allowed to say “but software”. Because much of software as an industry, I think, has been pushing relentlessly against the notion that the industry and people in it can or should be held accountable for the consequences of their actions, which is another way of saying that we don’t have and desperately need a clear sense of what a “duty of care” means in the software context.

I think that the concluding paragraph – “To do so would twist the law of negligence in a wholly new direction; an extremely risky endeavour given the context and precedent-dependent nature of negligence and the fact that the ‘harms’ under consideration are so qualitatively different than those subject to ‘traditional’ duties.” – reflects a deep genuflection to present day conceptual structures, and their specific manifestations as text-on-the-page-today, that is (I suppose inevitably, in the presence of this Very New Thing) profoundly at odds with the larger – and far more noble than this article admits – social and societal goals of those structures.

But maybe that’s just a superficial reading; I’ll read it over a few times and give it some more thought.

March 27, 2019

Defined By Prosodic And Morphological Properties

Filed under: academia,academic,awesome,interfaces,lunacy,science — mhoye @ 5:09 pm


I am fully invested in these critical advances in memetic linguistics research:

[…] In this paper, we go beyond the aforementioned prosodic restrictions on novel morphology, and discuss gradient segmental preferences. We use morphological compounding to probe English speakers’ intuitions about the phonological goodness of long-distance vowel and consonant identity, or complete harmony. While compounding is a central mechanism for word-building in English, its phonology does not impose categorical vowel or consonant agreement patterns, even though such patterns are attested cross-linguistically. The compound type under investigation is a class of insult we refer to as shitgibbons, taking their name from the most prominent such insult which recently appeared in popular online media (§2). We report the results of three online surveys in which speakers rated novel shitgibbons, which did or did not instantiate long-distance harmonies (§3). With the ratings data established, we follow two lines of inquiry to consider their source: first, we compare shitgibbon harmony preferences with the frequency of segmental harmony in English compounds more generally, and conclude that the lexicon displays both vowel and consonant harmony (§4); second, we attribute the lack of productive consonant harmony in shitgibbons to the attested cross-linguistic harmonies, which we implement as a locality bias in MaxEnt grammar (§5).

You had me at “diddly-infixation”.

August 15, 2018

Time Dilation

Filed under: academic,digital,documentation,interfaces,lunacy,mozilla,science,work — mhoye @ 11:17 am


[ https://www.youtube.com/embed/JEpsKnWZrJ8 ]

I riffed on this a bit over at twitter some time ago; this has been sitting in the drafts folder for too long, and it’s incomplete, but I might as well get it out the door. Feel free to suggest additions or corrections if you’re so inclined.

You may have seen this list of latency numbers every programmer should know, and I trust we’ve all seen Grace Hopper’s classic description of a nanosecond at the top of this page, but I thought it might be a bit more accessible to talk about CPU-scale events in human-scale transactional terms. So: if a single CPU cycle on a modern computer was stretched out as long as one of our absurdly tedious human seconds, how long do other computing transactions take?

If a CPU cycle is 1 second long, then (there’s a short script after this list, if you want to check the math):

  • Getting data out of L1 cache is about the same as getting your data out of your wallet; about 3 seconds.
  • At 9 to 10 seconds, getting data from L2 cache is roughly like asking your friend across the table for it.
  • Fetching data from the L3 cache takes a bit longer – it’s roughly as fast as having an Olympic sprinter bring you your data from 400 meters away.
  • If your data is in RAM you can get it in about the time it takes to brew a pot of coffee; this is how long it would take a world-class athlete to run a mile to bring you your data, if they were running backwards.
  • If your data is on an SSD, though, you can have it in six to eight days, equivalent to having it delivered from the far side of the continental U.S. by bicycle, about as fast as that has ever been done.
  • In comparison, platter disks are delivering your data by horse-drawn wagon, over the full length of the Oregon Trail. Something like six to twelve months, give or take.
  • Network transactions are interesting – platter disk performance is so poor that fetching data from your ISP’s local cache is often faster than getting it from your platter disks; at two to three months, your data is being delivered to New York from Beijing, via container ship and then truck.
  • In contrast, a packet requested from a server on the far side of an ocean might as well have been requested from the surface of the moon, at the dawn of the space program – about eight years, from the beginning of the Apollo program to Armstrong, Aldrin and Collins’ successful return to earth.
  • If your data is in a VM, things start to get difficult – a virtualized OS reboot takes about the same amount of time as has passed between the Renaissance and now, so you would need to ask Leonardo Da Vinci to secretly encode your information in one of his notebooks, and have Dan Brown somehow decode it for you in the present? I don’t know how reliable that guy is, so I hope you’re using ECC.
  • That’s all if things go well, of course: a network timeout is roughly comparable to the elapsed time between the dawn of the Sumerian Empire and the present day.
  • In the worst case, if a CPU cycle is 1 second, cold booting a racked server takes approximately all of recorded human history, from the earliest Indonesian cave paintings to now.
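
If you’d like to check those numbers or plug in your own, here’s a small sketch of the scaling; it assumes a ~3GHz part (one cycle ≈ 0.33ns), and the latencies are the usual rules of thumb rather than measurements of any particular machine:

    # Stretch one CPU cycle into one second and scale the rule-of-thumb
    # latencies to match. All of these figures are approximations.
    CYCLE = 0.33e-9  # seconds per cycle, assuming a ~3GHz clock

    latencies = {
        "L1 cache reference": 1e-9,
        "L2 cache reference": 4e-9,
        "Main memory reference": 100e-9,
        "SSD random read": 150e-6,
        "Platter disk seek": 10e-3,
        "Transoceanic packet round trip": 150e-3,
    }

    def human(seconds):
        """Render a duration in the largest convenient unit."""
        for unit, size in [("years", 31557600), ("days", 86400),
                           ("hours", 3600), ("minutes", 60)]:
            if seconds >= size:
                return f"{seconds / size:.1f} {unit}"
        return f"{seconds:.1f} seconds"

    for name, t in latencies.items():
        print(f"{name}: {human(t / CYCLE)}")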

June 8, 2017

A Security Question

To my shame, I don’t have a certificate for my blog yet, but as I was flipping through some referer logs I realized that I don’t understand something about HTTPS.

I was looking into the fact that sometimes – about 1% of the time – I see non-S HTTP referers from Twitter’s t.co URL shortener, which I assume means that somebody’s getting man-in-the-middled somehow, and there’s not much I can do about it. But then I realized the implications of my not having a cert.

My understanding of how this works, per RFC 7231, is that:

A user agent MUST NOT send a Referer header field in an unsecured HTTP request if the referring page was received with a secure protocol.

Per the W3C as well:

Requests from TLS-protected clients to non-potentially trustworthy URLs, on the other hand, will contain no referrer information. A Referer HTTP header will not be sent.

So, if that’s true and I have no certificate on my site, then in theory I should never see any HTTPS entries in my referer logs? Right?

Except: I do. All the time, from every browser vendor, feed reader or type of device, and if my logs are full of this then I bet yours are too.

What am I not understanding here? It’s not possible, there is just no way for me to believe that it’s two thousand and seventeen and I’m the only person who’s ever noticed this. I have to be missing something.

What is it?

FAST UPDATE: My colleagues refer me to this piece of the puzzle I hadn’t been aware of, and Francois Marier’s longer post on the subject. Thanks, everyone! That explains it.

SECOND UPDATE: Well, it turns out it doesn’t completely explain it. Digging into the data and filtering out anything referred via Twitter, Google or Facebook, I’m left with two broad buckets. The first is almost entirely made up of feed readers; it turns out that most, and maybe almost all, feed aggregators do the wrong thing here. I’m going to have to look into that, because it’s possible I can solve this problem at the root.
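
For the curious, the filtering pass is roughly this – a minimal sketch assuming your logs are in the standard Apache/nginx “combined” format, which puts the referer in the second-to-last quoted field:

    # Pull HTTPS referers out of "combined"-format access logs on stdin,
    # skipping the big referrers we already understand.
    import re
    import sys

    # In combined format each line ends with: "referer" "user-agent"
    REFERER = re.compile(r'"([^"]*)" "[^"]*"$')
    IGNORE = ("twitter.com", "t.co/", "google.", "facebook.com")

    for line in sys.stdin:
        m = REFERER.search(line.strip())
        if m:
            ref = m.group(1)
            if ref.startswith("https://") and not any(d in ref for d in IGNORE):
                print(ref)

Something like zcat access.log* | python referers.py | sort | uniq -c | sort -rn is how those buckets fall out.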

The second is one really persistent person using Firefox 15. Who are you, guy? Why don’t you upgrade? Can I help? Email me if I can help.

December 15, 2016

Even the dedication to reason and truth might, for all we know, change drastically.

Filed under: academic,documentation,doom,future,interfaces,vendetta — mhoye @ 12:19 pm

The following letter, written by Carl Sagan, is one of the appendices of the “Expert Judgement on Markers To Deter Inadvertent Human Intrusion into the Waste Isolation Pilot Plant” document, completed in 1993.

It’s on page 331, and it hurts to read.

Dr. D. Richard Anderson
Performance Assessment Division
6342 Sandia National Laboratories
Albuquerque, New Mexico 87185

Dear Dr. Anderson:

Many thanks for your kind invitation to participate in the panel charged with making recommendations on signing to the far future about the presence of dangerous long-lived radioactive waste repositories (assuming the waste hasn’t all leached out by then). It is an interesting and important problem, and I’m sorry that my schedule will not permit me to participate. But I can, in a few sentences, tell you my views on the matter; perhaps you would be kind enough to pass them on to the members of the panel:

Several half-lives of the longest-lived radioisotopes in question constitute a time period longer than recorded human history. No one knows what changes that span of time will bring. Social institutions, artistic conventions, written and spoken language, scientific knowledge and even the dedication to reason and truth might, for all we know, change drastically. What we need is a symbol invariant to all those possible changes. Moreover, we want a symbol that will be understandable not just to the most educated and scientifically literate members of the population, but to anyone who might come upon this repository. There is one such symbol. It is tried and true. It has been used transculturally for thousands of years, with unmistakable meaning. It is the symbol used on the lintels of cannibal dwellings, the flags of pirates, the insignia of SS divisions and motorcycle gangs, the labels of bottles of poisons — the skull and crossbones. Human skeletal anatomy, we can be reasonably sure, will not unrecognizably change in the next few tens of thousands of years. You might very well wish also to include warnings in major human languages (being careful not to exclude Chinese and Arabic), and to attach a specification of the radioisotopes in question — perhaps by circling entries in a periodic table with the appropriate isotopic atomic numbers emphasized. It might be useful to include on the signs their own radioactive markers so that the epoch of radioactive waste burial can be calculated (or maybe a sequence of drawings of the Big Dipper moving around the Pole Star each year so that, through the precession of the equinoxes, the epoch of burial, modulo 26,000 years, could be specified). But all this presumes much about future generations. The key is the skull and crossbones.

Unless a more powerful and more direct symbol can be devised, I think the only reason for not using the skull and crossbones is that we believe the current political cost of speaking plainly about deadly radioactive waste is worth more than the well-being of future generations.

With best wishes,

      Cordially,

      Carl Sagan

September 22, 2016

Falsehoods Programmers Believe About Economics

Filed under: academic,documentation,doom,interfaces,want — mhoye @ 8:38 am


Late update: This post has been added to this excellent list of falsehoods programmers believe, and I’m pretty proud of that.

Two similar jokes rolled past me late last week, the first when I mentioned that running a Java program in a JVM in a Linux VM in a container on AWS is a very inefficient way of generating waste heat, and that I could save a lot of time and effort by cutting out the middleman and just setting money on fire.

For the second, a friend observed that startups are an extremely inefficient way of transferring wealth from venture capitalists to bay-area landlords; there’s a disruptive opportunity here to shortcut that process and just give venture capital directly to the bay-area rentier class for a nominal service fee. I suggested he call his startup “olygarchr”, or maybe “plutocrysii”; you heard it here first, in two years YCombinator will be obsolete.

With that in mind and in the spirit of the now-classic Falsehoods Programmers Believe About Time and Falsehoods Programmers Believe About Names, I asked for this on Twitter the other day and got some pretty good feedback. But I guess if I want something written up, I’ll be the one writing it up.

I may add some more links to this over the next little while, but for now here you go.

Falsehoods Programmers Believe About Economics

  1. Economics is simple.
  2. Econ-101 is a comprehensive overview of the field.
  3. Economics is morally neutral.
  4. Economics is racially- and gender-neutral.
  5. The efficient markets hypothesis is true.
  6. Classical economics is empirically grounded.
  7. Politics is an entirely unrelated field.
  8. Externalities are the same as inefficiencies.
  9. Pareto efficiency exists.
  10. Information symmetry exists.
  11. People are rational actors.
  12. OK, sure, people, I get it. But I’m a rational actor.
  13. “Rational” to me is the same as “rational” to everyone else.
  14. Rational actors exist at all.
  15. Advertising doesn’t influence or distort markets.
  16. Ok, fine, but advertising doesn’t influence me.
  17. Just-so stories make predictive economic models.
  18. Just-so stories make effective public policy.
  19. Price is an indication of cost.
  20. Price is an indication of value.
  21. The system works for me, therefore the system works for everyone.
  22. Wealth is an indication of worth.

November 15, 2015

The Thousand Year Roadmap

Filed under: academic,documentation,future,interfaces,lunacy,mozilla,work — mhoye @ 10:33 am

I made this presentation at Seneca’s FSOSS a few weeks ago; some of these ideas have been rattling around in my brain for a while, but it was the first time I’d even run through it. I was thoroughly caffeinated at the time so all of my worst verbal tics are on display, right as usual um right pause right um. But if you want to have my perspective on why free and open source software matters, why some institutions and ideas live and others die out, and how I think you should design and build organizations around your idea so that they last a few hundred years, here you go.

There are some mistakes I made, and now that I’m watching it – I meant to say “merchants” rather than “farmers”, there’s a handful of others I may come back here to note later. But I’m still reasonably happy with it.

August 30, 2015

Opaque Symbology

Filed under: academic,documentation,interfaces,lunacy — mhoye @ 8:22 pm

A collection of highway traffic signage unused pending an economical symbolic representation:

  • Warning: Ornery Local Stereotype
  • Completely Unremarkable Natural Phenomenon: 2 KM
  • Something Will Happen And The Right Decision Will Seem Obvious In Hindsight But Nobody In The Car Will Ever Let You Live Down What You Did, Next 500 M
  • If There Were No Eternal Consciousness In The Next 10 KM, If At The Bottom Of Everything There Were Only A Wild Ferment, A Power That Twisting In Dark Passions Produced Everything Great Or Inconsequential, If An Unfathomable, Insatiable Emptiness Lay Hid Beneath Everything, What Would The Next 10 KM Be But Despair
  • Meese
  • Duck Season Rabbit Season Duck Season Rabbit Season Duck Season Rabbit Season Duck Season Rabbit Season Duck Season Rabbit Season Duck Season Rabbit Season Duck Season Rabbit Season Duck Season Rabbit Season Duck Season Rabbit Season
  • Iield, Yield, Theild. Hield, Shield, Wield
  • Jarring And Inappropriate Pop-Culture References Next 12 Parsecs
  • I Don’t Care For Your Tone Young Man
  • Now Entering Eldritch Nether Regions
  • I Wouldn’t Call Them Slow Children Playing But Honestly They’re Not The Brightest Of Sparks Rachel It’s A Highway Who Lets Their Kids Do That Somebody Should Call Somebody My God
  • Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Red Rum Turn Right Next Exit
  • Property Is Theft Therefore Theft Is Property Therefore This Road Sign Is Mine Now No You Shut Up That Is How It Works Doug
  • OMG LIEK WOAH NEXT 3 KM OMG OMG
  • Yo Dog I Heard You Like Hairpin Turns So We Put A Hairpin Turn In Your Hairpin Turn So You Can Die While You Die
  • Locus Of Shame
  • Caution But Telling You Why Would Ruin The Surprise
  • Slow: Ennui

July 24, 2015

Hostage Situation

(This is an edited version of a rant that started life on Twitter. I may add some links later.)

Can we talk for a few minutes about the weird academic-integrity hostage situation going on in CS research right now?

We share a lot of data here at Mozilla. As much as we can – never PII, not active security bugs, but anyone can clone our repos or get a bugzilla account, follow our design and policy discussions, even watch people design and code live. We default to open, and close up only selectively and deliberately. And as part of my job, I have the enormous good fortune to periodically go to conferences where people have done research, sometimes their entire thesis, based on our data.

Yay, right?

Some of the papers I’ve seen promise results that would be huge for us. Predicting flaws in a patch pre-review. Reducing testing overhead 80+% with a four-nines promise of no regressions and no loss of quality.

I’m excitable, I get that, but OMFG some of those numbers. 80 percent reductions of testing overhead! Let’s put aside the fact that we spend a gajillion dollars on the physical infrastructure itself, let’s only count our engineers’ and contributors’ time and happiness here. Even if you’re overoptimistic by a factor of five and it’s only a 20% savings we’d hire you tomorrow to build that for us. You can have a plane ticket to wherever you want to work and the best hardware money can buy and real engineering support to deploy something you’ve already mostly built and proven. You want a Mozilla shirt? We can get you that shirt! You like stickers? We have stickers! I’ll get you ALL THE FUCKING STICKERS JUST SHOW ME THE CODE.

I did mention that I’m excitable, I think.

But that’s all I ask. I go to these conferences and basically beg, please, actually show me the tools you’re using to get that result. Your result is amazing. Show me the code and the data.

But that never happens. The people I talk to say I don’t, I can’t, I’m not sure, but, if…

Because there’s all these strange incentives to hold that data and code hostage. You’re thinking, maybe I don’t need to hire you if you publish that code. If you don’t publish your code and data and I don’t have time to reverse-engineer six years of a smart kid’s life, I need to hire you for sure, right? And maybe you’re not proud of the code, maybe you know for sure that it’s ugly and awful and hacks piled up over hacks, maybe it’s just a big mess of shell scripts on your lab account. I get that, believe me; the day I write a piece of code I’m proud of before it ships will be a pretty good day.

But I have to see something. Because from our perspective, making a claim about software that doesn’t include the software you’re talking about is very close to worthless. You’re not reporting a scientific result at that point, however miraculous your result is; you’re making an unverifiable claim that your result exists.

And we look at that and say: what if you’ve got nothing? How can we know, without something we can audit and test? Of course, all the supporting research is paywalled PDFs with no concomitant code or data either, so by any metric that matters – and the only metric that matters here is “code I can run against data I can verify” – it doesn’t exist.

Those aren’t metrics that matter to you, though. What matters to you is either “getting a tenure-track position” or “getting hired to do work in your field”. And by and large the academic tenure track doesn’t care about open access, so you’re afraid that actually showing your work will actively hurt your likelihood of getting either of those jobs.

So here we are in this bizarro academic-research standoff, where I can’t work with you without your tipping your hand, and you can’t tip your hand for fear I won’t want to work with you. And so all of this work that could accomplish amazing things for a real company shipping real software that really matters to real people – five or six years of the best work you’ve ever done, probably – just sits on the shelf rotting away.

So I go to academic conferences and I beg people to publish their results – papers, code and data – open access, because the world needs your work to matter. Because open access plus data and code as a minimum standard isn’t just important to the fundamental principles of repeatable experimental science, the integrity of your field, and your career. It’s important because if you want your work to matter to people, then you’d better put it somewhere that people can see it and use it and thank you for it and maybe even improve on it.

You did this as an undergrad. You insist on this from your undergrads, for exactly the same reasons I’m asking it of you now: understanding, integrity and plain old better results. And it’s a few clicks and a GitHub account for you to do the same today. But I need you to do it one last time.

Full marks here isn’t “a job” or “tenure”. Your shot at those will be no worse, though I know you can’t see it from where you’re standing. But they’re still only a strong B. An A is doing something that matters, an accomplishment that changes the world for the better.

And if you want full marks, show your work.

