blarg?

November 26, 2020

Punching Holes

[Image: an early encoding proposal]

As always, I am inexplicably carrying a deep-seated personal grudge against anyone incurious enough to start with “because in C” when you ask them why computers do anything, but bear with me here. I know that a surprising amount of modern computing is definitely Dennis Ritchie’s fault, I get it, but even he existed in a context and these are machines made out of people’s decisions, not capricious logic-engine fae that spring to life when you rub a box of magnets with a copy of SICP. Quit repeating the myths you’ve been spoon-fed and do the research. A few hours in a library can save you a few decades in the dark.

Anyway, on a completely unrelated note: Today in wildly-unforeseeable-consequences news, I have learned where null-terminated strings come from. Not only are they older than Unix, they’re older than transistors and might be older than IBM. It turns out at least one of the roads to hell is paved with yesterday’s infrastructure.

“Coded Character Sets, History & Development” by Charles E. Mackenzie is an amazing technical artifact, approximately 75% mechanical tedium and 25% the deepest-cut nerd history lesson I’ve seen in a while. I’ve gone on about how important Herman Hollerith and his card-readers were in the history of computing, but this book takes that to a much lower level, walking meticulously through the decision-making processes by which each character’s bit sequence was determined, what constraints and decisions arose and how they were resolved, and where it isn’t boring as hell it is absolutely fascinating.

As an aside, it continues to be really unfortunate how much historical information about the decisions that led up to modern computing is either hidden, lost or (I suspect most commonly) just ignored. There is a lot to learn from the Whys of so many of these decisions – lessons about process that transcend the implementation details, lost in favour of easily-retellable falsehoods – and there’s an entire lost generation of programmers out there who found ESR’s version of the Jargon File before they found their own critical faculties, never quite recovered from that, and will never learn those lessons.

(I mention it because I’m going to be talking about EBCDIC, and if your gut reaction to seeing that acronym is a snide dismissal for reasons you can’t really elaborate, I’m talking about you. Take it personally. Like a lot of Raymond’s work and indeed the man himself, his self-interested bastardization of the jargon file is superficially clever, largely wrong and aging very badly. Fortunately the Steele-1983 version is still out there and true to its moment in history, you’re still here to read it and a better future is still possible. I believe in all but one of you.)

On a less sarcastic note, there is a lot in here.

I didn’t realize quite how old ASCII is, for one. I mean, I knew the dates involved but I didn’t grasp the context, how completely disconnected the concerns of modern computing were from the creation of the ASCII standard. The idea of software – of decision-making implemented in software – as a relevant consumer of these encodings is almost completely a non-issue, mentioned in passing and quickly brushed off. Far and away the most important concerns – as with EBCDIC, BCDIC and PTTC before it, and dating back to the turn of the century – were about the efficiency of collating punch cards and fast line printing on existing tooling.

Printing, collating and backwards compatibility. In terms of importance, nothing else even came close. The idea of “code” as an information control-flow mechanism barely enters into it; that compilers exist at all gets a brief nod at the start of chapter 25 of 27, but otherwise it’s holes-in-cardboard all the way down. You can draw a straight line back in time from Unicode through a century of evolving punchcard standards all the way to the Hollerith Census Tabulator of 1890; Hollerith has cast an impossibly long shadow over this industry, and backwards compatibility with the form, machinery and practices of punch cards is entirely the name of this game, and has been forever.

Early glyphs

It’s also amazing how many glyphs in varying degrees of common use across languages and systems were adopted, reconsidered and discarded for a wild variety of reasons as encodings evolved; the “cent” symbol giving way to a square bracket, various useful symbols like logical-not getting cut without any obvious replacement. Weird glyphs I’ve never seen on any keyboard in my life getting adopted and then abandoned because they would have caused a specific model of long-established tape storage system to crash. The strangely durable importance of the lozenge character, and the time a late revision of BCDIC just … forgot “+”. Oops?

I had no idea that for a while there we were flirting with lowercase numbers. We were seriously debating whether or not computers needed a lowercase zero. That was a real thing.

Another thing I didn’t realize is how much of a dead end ASCII is, not just as a character set but as a set of practices that character set enables: *char++, I’m looking at you and all your footgun friends. I was never much of a student, but am I misremembering all that time I spent sorting and manipulating strings with tools that have wound up somewhere on the “merely obsolete” to “actively dangerous” spectrum, unsafe and unportable byproducts of a now-senescent encoding that nobody uses by choice anymore?
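Concretely – and this sketch is mine, not something from the book – the classic C copy idiom knows how to do exactly one thing: keep going until it hits the zero byte.

    #include <stdio.h>

    /* The classic null-terminated copy idiom: keep copying until we
       hit the 0 byte. Nothing here knows or cares how big dst
       actually is; that's the footgun. */
    static void copy(char *dst, const char *src) {
        while ((*dst++ = *src++) != '\0')
            ;
    }

    int main(void) {
        char buf[8];
        copy(buf, "punch");   /* fine: six bytes, terminator included */
        /* copy(buf, "punched card collation"); would overrun buf,
           because the loop only stops at the 0 byte, wherever in
           memory that happens to land. */
        printf("%s\n", buf);
        return 0;
    }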

It’s really as though at some point before about 1975, people just… hadn’t fully come to terms with the fact that the world is big. There’s a long chapter here about the granular implementation details of an industry struggling to accept that Europe and Asia actually exist and use computers, and that even if they didn’t, sorting human text is a subtle problem and encodings aren’t the place it’s going to get solved – only for that discussion to get shunted aside as the ghosts of compatibilities past shamble around the text rattling their chains.

There’s also a few pages in there about “Decimal ASCII” – basically “what if ASCII, but cursed” – a proposal with such powerful Let Us Never Speak Of This Again energy that almost no modern references to it exist, putting it high on the list of mercifully-dodged bullets scattered throughout this document.

But maybe the most interesting thing in here is how much effort went into sorting out the difference between blank, space, null, zero and minus zero, which turns out to be a really difficult problem for all sorts of reasons. And the most incredible part of that is this:

An excerpt reading: How would one provide the traditional capability of leaving certain card columns unpunched (blank card columns) during keypunching to be filled with punched data on subsequent card punching operations? Such card columns would in fact have to be created by punching the Zero character that is equated to blank card column. In normal keypunching operations, such card columns are created by spacing, skipping or ejecting. Under this proposal, then, the relatively fast card motion of skipping or ejecting would be replaced by the relatively slow motion of manual keying by an operator. As in the previous argument, key punching productivity would be substantially reduced.

Null-terminated strings were “produced by the .ASCIZ directive of the PDP-11 assembly languages and the ASCIZ directive of the MACRO-10 macro assembly language for the PDP-10”, per Wikipedia, before manifesting themselves in C. But that’s not where they come from.

In fact, null-terminated strings existed long before C. Using a column of unpunched entries in a Hollerith card – a null column, a convention that apparently dates to the earliest punchcard collating and sorting machines – to signal that the machine could terminate the card’s scan was a fast, lightweight way to facilitate punch-card re-use and efficient data entry and re-entry, back when that data was entered by punching it into cards.
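The resemblance to strlen() is not a coincidence. As a toy model of that convention – my own sketch, with a made-up bitmask standing in for the real Hollerith column encoding – a card scan looks like this:

    #include <stdio.h>

    /* Toy model of the null-column convention: scan a card's 80
       columns and stop at the first unpunched (all-zero) column,
       the same way strlen() walks bytes until it hits 0. The
       bitmask-per-column encoding here is a stand-in, not real
       Hollerith code. */
    enum { COLUMNS = 80 };

    static int punched_columns(const unsigned short card[COLUMNS]) {
        int i = 0;
        while (i < COLUMNS && card[i] != 0)  /* 0: no holes, stop scanning */
            i++;
        return i;
    }

    int main(void) {
        unsigned short card[COLUMNS] = { 0x204, 0x102, 0x081 };  /* rest unpunched */
        printf("scan terminates after %d columns\n", punched_columns(card));
        return 0;
    }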

That is to say, we somehow built the foundations of what would become a longstanding security exploit vector decades before anyone could build the operating systems it could exploit. Likely before the invention of the transistor, even; I’m trying to track down a citation for this now.

It’s sort of amazing that anything ever works at all.

October 8, 2020

Control Keys Redux

Filed under: arcade,digital,documentation,interfaces,life,linux,science,toys — mhoye @ 5:29 pm

A long overdue followup.

One of my favourite anecdotes in Kernighan’s “Unix, A History And A Memoir” is the observation that the reason early Unix commands are so often truncated – rm, mv, ls, and so on – was that the keyboards of the day were so terrible that they hurt to type on for any length of time.

I wish more people thought about keyboards. This is the primary interface to these devices we spend so much of our time on, and it baffles me that people just stick with whatever ten-dollar keyboard came in the box. It makes as much sense to me as a runner buying one-size-fits-all shoes.

There are a lot of people who do think about keyboards, of course, but even so what I’m aiming for isn’t part of that conversation, and often feels like lonely work. Most of the mechanical keyboard fetishists I can find are in it for the aesthetics, assembling these switches and those keycaps, and while the results can be beautiful they aren’t structurally all that different – still not quite something built to be truly personal. Kailh bronze switches are made of joy, sure, but if my wrists are still contorting to use the keyboard, that fantastic popcorn keyspring texture isn’t going to be durably great for me.

I’ve had this plan in mind for a while now, and have finally gotten around to setting up a keyboard the way I’ve long intended to thanks to a friend who introduced me to Cardellini ball clamps and mounting plates. Those were the missing pieces I needed to set up the keyboard I’m typing this on now, a Kinesis Ergo Edge split mechanical keyboard.

Some minor gripes about this specific device include:

    • Manufacturers of split keyboards absolutely refuse, for reasons I cannot figure out, to allow the halves of the keyboard to overlap. I want the 6TGB line and 7YHN columns on both halves! I’d much rather have that than macros or illumination gimmicks.
    • The stands you can order for it, like the wrist rests in the box, are a waste of time. I’m using neither so it’s not a big deal, but seeing a nice product ship with cheap plastic greebling is always a shame.
    • The customization software that comes with it is… somewhat opaque. I’ll find a use for those macro keys, but for now meh. Remapping that ridiculous panic button thing in the upper left to “lock my screen” was straightforward enough, which was nice.
    • Keys on these split keyboards are never ortholinear – meaning, never in a regular old, non-offset grid, like you’d expect on a tool being used by people without diagonal fingers. Standard keyboard layouts make zero sense and haven’t in fifty years; we don’t need to genuflect to a layout forced on us by mechanical typewriter levers and haven’t since before Unix was invented! Get it together, manufacturers! But here we are.

But the nice things about it – the action on these delightfully clicky Cherry MX Blue switches,  the fact that most of the keys are in the right places, the split cable being elegantly tucked away – they outweigh all of the gripes, and so far I’m reasonably happy with the setup, but that’s not really because of what came in the box.

It’s because the setup is this:


[Image: Ergonomics – the split keyboard ball-clamped to the underside of a standing desk]

Like I say, I’ve had this in mind for a while – an A-shape hanging off the front of the desk, each half of the keyboard with a ball head sticking out of the plate on its underside, and a third ball head at about belt level, bolted into the underside of my standing desk. All of it is held together with a surprisingly rigid three-way ball clamp – the film industry doesn’t like having lights or cameras just topple to the floor for no reason, funny story – and the result is a standing desk where I can type with my hands in a very relaxed, natural position all day, without craning my wrists or resting them awkwardly on anything. The key surface is all facing away from me, which takes some getting used to, but hooking my thumbs on the sides of the spacebars gives me a good enough home-key experience that my typing error rate is getting back down to the usual “merely poor” levels I’m long accustomed to.

It’s a good feeling so far, even if I’m making microadjustments all the time and sort of reteaching myself how to type. I’m starting to suspect that any computer-related ergonomics setup that presumes a desk and chair is starting from an unrecoverable condition of sin; humans are shaped like neither of those things, and tools should be made to fit humans.

Update: A few people have asked me for a parts list. It is:


February 18, 2020

Dexterity In Depth

Filed under: a/b,academic,documentation,interfaces,mozilla,science,vendetta,work — mhoye @ 10:50 am


I’m exactly one microphone and one ridiculous haircut away from turning into Management Shingy when I get rolling on stuff like this, because it’s just so clear to me how much this stuff matters and how little sense I might be making at the same time. Is your issue tracker automatically flagging your structural blind spots? Do your QA and UX team run your next reorg? Why not?

This all started life as a rant on Mastodon, so bear with me here. There are two empirically-established facts that organizations making software need to internalize.

The first is that by a wide margin the most significant predictive indicator that there will be a future bug in a piece of software is the relative orgchart distance of the people working on it. Two people who are working on a shared codebase in the same room but report to different VPs are wildly more likely to introduce errors into that codebase than two people who are on opposite sides of the planet and speak different first languages but report to the same manager.

The second is that the number one predictor that a bug will be resolved is if it is triaged correctly – filed in the right issue tracker, against the right component, assigned to the right people – on the first try.

It’s fascinating that neither of the strongest predictive indicators for the most important parts of a bug’s lifecycle – birth and death – actually takes place at the developer’s desk, but it’s true. In terms of predictive power, nothing else in the software lifecycle comes close.

Taken together, these facts give you the tools to roughly predict the effectiveness of collaborating teams; analyzing trends among bugs that are frequently re-assigned or re-triaged can give you a lot of foresight into how, where and why a company needs to retrain or reorganize those teams. You might have read Agile As Trauma recently, in which Dorian Taylor describes agile development as an allergic reaction to prior bad management:

The Agile Manifesto is an immune response on the part of programmers to bad management. The document is an expression of trauma, and its intellectual descendants continue to carry this baggage. While the Agile era has brought about remarkable advancements in project management techniques and development tools, it remains a tactical, technical, and ultimately reactionary movement.

This description is strikingly similar to – and in obvious tension with – Clay Shirky’s description of bureaucracy as the extractive mechanism of complexity and an allergic reaction to previous institutional screwups.

Bureaucracies temporarily suspend the Second Law of Thermodynamics. In a bureaucracy, it’s easier to make a process more complex than to make it simpler, and easier to create a new burden than kill an old one.

… which sounds an awful lot like the orgchart version of “It’s harder to read code than to write it”, doesn’t it?

I believe both positions are correct. But that tension scribes the way forward, I think, for an institutional philosophy that is responsive, flexible and empirically grounded, one in which being deliberate about the scale, timing and importance of different feedback cycles gives an organization the freedom to treat scaling like a tool, and in which the signals from different contexts can inform change as a continuum between the macro and micro levels of organizational structure and practice. Wow, that’s a lot of words in a strange order, but hear me out.

It’s not about agile, or even agility. Agility is just the innermost loops, the smallest manifestation of a wide possible set of tightly-coupled feedback mechanisms. And outside the agile team, adjacent to the team, those feedback loops may or may not exist however much they need to, up and down the orgchart (though there’s not often much “down” left in the orgchart, I’ve noticed, where most agile teams live…) but more importantly with the adjacent and complementary functions that agile teams rely on.

It is self-evident that how teams are managed profoundly affects how they deliver software. But agile development (and every other modern developer-cult I’m aware of) doesn’t close that loop, and in failing to do so agile teams are reduced to justifying their continued existence through work output rather than informing positive institutional change. And I don’t use “cult” lightly, there; the current state of empirical evaluation of agile as a practice amounts to “We agiled and it felt good and seemed to work!” And feeling good and kinda working is not nothing! But it’s a long way from being anything more than that.

If organizations make software, then starting from a holistic view of what “development” and “agility” mean and could be, and looking carefully at where feedback loops in an organization exist, where they don’t, and what information they circulate, suggests that there is a reliable set of empirical, analytic tools for looking at not just developer practice but the organizational processes around it – and for assessing, in some measurable, empirical way, the real and sustainable value of different software development schools and methodologies.

But honestly, if your UX and QA teams aren’t informing your next reorg, why not?

December 26, 2019

Intrasective Subversions

I often wonder where we’d be if Google had spent their don’t-be-evil honeymoon actually interviewing people for some sort of moral or ethical framework instead of teaching a generation of new hires that the important questions are all about how many piano tuners play ping pong on the moon.

You might have seen the NYTimes article on hypertargeted product placement, one of those new magical ideas that look totally reasonable in an industry where CPU cycles are cheap and principles are expensive.

I just wanted to make sure we all understood that one extremely intentional byproduct of this will be to breathe new life into the old document-canary trick – tailoring sensitive text with unique punctuation or phrasing in particularly quotable passages to identify leakers – this time purpose-built as a way to precision-target torrent seeders or anyone else who shares media. “We only showed this combination of in-product signal to this specific person, therefore they’re the guilty party” is where this is going, and that’s not an accident.

The remedy, of course, is going to be cooperation. Robust visual diffs, scene hashes and smart muting (be sure to refer to They Live for placeholder inspiration) will be more than enough to fuzz out discoverability for even a moderately-sized community. As it frequently is, the secret ingredient is smart people working together.
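As a sketch of why that works – mine, not anyone’s production tooling, and assuming frames have already been shrunk to 8×8 grayscale thumbnails the way the perceptual-hash family of tools does it – even a trivial average hash will match two copies of a frame that differ only in a small per-copy region, which is exactly what makes those regions diffable:

    #include <stdio.h>
    #include <stdint.h>

    /* Average hash: one bit per pixel, set if that pixel is brighter
       than the frame's mean. A small localized difference (say, one
       per-copy placed-product region) flips few or no bits, so
       near-identical frames from different copies still land on the
       same hash -- and the regions that differ become discoverable
       by diffing the matched frames. */
    static uint64_t ahash(const uint8_t px[64]) {
        unsigned sum = 0;
        for (int i = 0; i < 64; i++)
            sum += px[i];
        uint8_t mean = (uint8_t)(sum / 64);
        uint64_t h = 0;
        for (int i = 0; i < 64; i++)
            if (px[i] > mean)
                h |= (uint64_t)1 << i;
        return h;
    }

    int main(void) {
        uint8_t a[64] = { 0 }, b[64] = { 0 };
        for (int i = 0; i < 32; i++) a[i] = b[i] = 200;  /* same scene */
        b[40] = 90;  /* one "placed product" pixel differs per copy */
        printf("hashes %s\n", ahash(a) == ahash(b) ? "match" : "differ");
        return 0;
    }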

In any case, I’m sure that all right thinking people can agree that ads are the right place to put graffiti. So I’m looking forward to all the shows that are turned into hijacked art-project torrents the moment they’re released, and seeing

THEY LIVE
THEY LAUGH
THEY LOVE

in the background of the pirated romcoms of 2021.

October 23, 2019

80×25

Every now and then, my brain clamps on to obscure trivia like this. It takes so much time. “Because the paper beds of banknote presses in 1860 were 14.5 inches by 16.5 inches, a movie industry cartel set a standard for theater projectors based on silent film, and two kilobytes is two kilobytes” is as far back as I have been able to push this, but let’s get started.

In August of 1861, by order of the U.S. Congress and in order to fund the Union’s ongoing war efforts against the treasonous secessionists of the South, the American Banknote Company started printing what were then called “Demand Notes”, but soon widely known as “greenbacks”.

It’s difficult to research anything about the early days of American currency on Wikipedia these days; that space has been thoroughly colonized by the goldbug/sovcit cranks. You wouldn’t notice it from a casual examination, which is of course the plan; that festering rathole is tucked away down in the references, where articles will fold a seemingly innocuous line somewhere into the middle, tagged with an exceptionally dodgy reference. You’ll learn that “the shift from demand notes to treasury notes meant they could no longer be redeemed for gold coins[1]” – which is strictly true! – but if you chase down that footnote you wind up somewhere with a name like “Lincoln’s Treason – Fiat Currency, Maritime Law And The U.S. Treasury’s Conspiracy To Enslave America”, which I promise I am only barely exaggerating about.

It’s not entirely clear if this is a deliberate exercise in coordinated crank-wank or just years of accumulated flotsam from the usual debate-club dead-enders hanging off the starboard side of the Overton window. There’s plenty of idiots out there that aren’t quite useful enough to work the k-cups at the Heritage Institute, and I guess they’re doing something with their time, but the whole thing has a certain sinister elegance to it that the Randroid crowd can’t usually muster. I’ve got my doubts either way, and I honestly don’t care to dive deep enough into that sewer to settle them. Either way, it’s always good to be reminded that the goldbug/randroid/sovcit crank spectrum shares a common ideological klancestor.

Mercifully that is not what I’m here for. I am here because these first Demand Notes, and the Treasury Notes that came afterwards, were – on average, these were imprecise times – 7-3/8” wide by 3-1/4” tall.

I haven’t been able to precisely answer the “why” of that. I believe, but do not know, that it comes down to the specific dimensions of the presses they were printed on; despite my best efforts I haven’t been able to find the exact model and specifications of that device. I’ve asked the U.S. Congressional Research Service for some help with this, but between them and the Bureau of Engraving and Printing, we haven’t been able to pin it down. From my last correspondence with them:

Unfortunately, we don’t have any materials in the collection identifying the specific presses and their dimension for early currency production. The best we can say is that the presses used to print currency in the 1860s varied in size and model. These presses went by a number of names, including hand presses, flat-bed presses, and spider presses. They also were capable of printing sheets of paper in various sizes. However, the standard size for printing securities and banknotes appears to have been 14.5 inches by 16.5 inches. We hope this bit of information helps.

… which is unfortunate, but it does give us some clarity. A 16.5″ by 14.5″ printing sheet lets you print eight 7-3/8″ by 3-1/4″ notes to size, with a fraction of an inch on either side for trimming.
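The arithmetic works out, assuming the obvious two-across, four-down layout (my assumption, but the only one that fits):

    2 notes across: 2 × 7-3/8″ = 14-3/4″, inside the 16.5″ edge
    4 notes down:   4 × 3-1/4″ = 13″, inside the 14.5″ edge
    2 × 4 = 8 notes per sheet, with margin left over to trim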

The answer to that question starts to matter about twenty years later on the heels of the 1880 American Census. Mandated to be performed once a decade, the United States population had grown some 30% since the previous census, and even with enormous effort the final tabulations weren’t finished until 1888, an unacceptable delay.

One of the 1880 Census’ early employees was a man named Herman Hollerith, a recent graduate of the Columbia School of Mines who’d been invited to join the Census efforts early on by one of his professors. The Census was one of the most important social and professional networking exercises of the day, and Hollerith correctly jumped at the opportunity:

The absence of a permanent institution meant the network of individuals with professional census expertise scattered widely after each census. The invitation offered a young graduate the possibility to get acquainted with various members of the network, which was soon to be dispersed across the country.

As an aside, that invitation letter is one of the most important early documents in the history of computing for lots of reasons, including this one:

The machine in that picture was the third generation of the “Hollerith Tabulator”, notable for the replaceable plugboard that made it reprogrammable. I need to find some time to dig further into this, but that might be the first multipurpose, if not “general purpose” as we’ve come to understand it, electronic computation device. This is another piece of formative tech that emerged from this era, one that led directly to the removable panels (and ultimately the general componentization) of later computing hardware.

Well before the model 3, though, was the original 1890 Hollerith Census Tabulator that relied on punchcards much like this one.

Hollerith took the inspiration for those punchcards from the “punch photographs” used by some railways at the time to make sure that tickets belonged to the passengers holding them. You can see a description of one patent for them here dating to 1888, but Hollerith relates the story from a few years earlier:

One thing that helped me along in this matter was that some time before I was traveling in the west and I had a ticket with what I think was called a punch photograph. When the ticket was first presented to a conductor he punched out a description of the individual, as light hair, dark eyes, large nose etc. So you see I only made a punch photograph of each person.

Tangentially: this is the birth of computational biometrics. And as you can see from this extract from The Railway News (Vol. XLVIII, No. 1234 , published Aug. 27, 1887) people have been concerned about harassment because of unfair assessment by the authorities from day one:

[Image: punch-photograph – an extract from The Railway News]

After experimenting with a variety of card sizes Hollerith decided that to save on production costs he’d use the same boxes the U.S. Treasury was using for the currency of the day: the Demand Note. Punch cards stayed about that shape, punched with devices that looked a lot like this for about 20 years until Thomas Watson Sr. (IBM’s first CEO, from whom the Watson computer gets its name) asked Clair D. Lake and J. Royden Peirce to develop a new, higher data-density card format.

Tragically, this is the part where I need to admit an unfounded assertion. I’ve got data, the pictures line up and numbers work, but I don’t have a citation. I wish I did.

Take a look at “Type Design For Typewriters: Olivetti”, written by Maria Ramos Silva. (You can see a historical talk from her on the history of typefaces here that’s also pretty great.)

Specifically, take a look on page 46 at Mikron Piccolo, Mikron Condensed. The fonts don’t precisely line up – see the different “4”, for example, when comparing it to the typesetting of IBM’s cards – but the size and spacing do. In short: a line of 80 characters, each separated by a space, is the largest round number of digits that the tightest typesetting of the day would allow to be fit on a single 7-3/8” wide card: a 20-point condensed font.

I can’t find a direct citation for this; that’s the only disconnect here. But the spacing all fits, the numbers all work, and I’d bet real money on this: that when Watson gave Lake the task of coming up with a higher information-density punch card, Lake looked around at what they already had on the shelf – a typewriter with the highest-available character density of the day, on cards they could manage with existing and widely-available tooling – and put it all together in 1928. The fact that a square hole – a radical departure from the standard circular punch – was a patentable innovation at the time was just icing on the cake.
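For what it’s worth, the back-of-the-envelope numbers hold up; the 0.087″ column pitch below is the figure usually quoted for IBM’s 80-column cards, and the rest of the arithmetic is mine:

    80 columns × 0.087″ pitch ≈ 6.96″ of punch field
    7-3/8″ (7.375″) card width − 6.96″ ≈ 0.4″ left over for margins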

The result of that work is something you’ll certainly recognize – the standard IBM punchcard – though of course there’s a lot more to it than that. Witness the full glory of the Card Stock Acceptance Procedure: the protocol for measuring folding endurance, air resistance and smoothness, and for evaluating the ash content, moisture content and pH of the paper, among many other things.

At one point sales of punchcards and related tooling constituted a completely bonkers 30% of IBM’s annual profits, so you can understand that IBM had a lot invested in getting all of that consistently, precisely correct.

At around this time John Logie Baird invented the first “mechanical television”; like punchcards, the first television cameras were hand-cranked devices that relied on something called a Nipkow disk, a mechanical tool for separating images into sequential scan lines, a technique that survives in electronic form to this day. By linearizing the image signal Baird could transmit the image’s brightness levels via a simple radio signal and in 1926 he did just that, replaying that mechanically encoded signal through a CRT and becoming the inventor of broadcast television. He would go on to pioneer colour television – originally called Telechrome, a fantastic name I’m sad we didn’t keep – but that’s a different story.
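A sketch of what “linearizing” means here – dimensions and brightness values made up for illustration – because the whole job is just to walk a two-dimensional field of samples in scan-line order and emit one sequential signal:

    #include <stdio.h>

    /* What a Nipkow disk does mechanically, in miniature: turn a 2D
       field of brightness samples into a single sequential signal,
       one scan line after another. Dimensions and values here are
       illustrative only. */
    enum { ROWS = 3, COLS = 4 };

    int main(void) {
        unsigned char image[ROWS][COLS] = {
            { 10, 200, 200, 10 },
            { 10,  30, 200, 10 },
            { 10, 200, 200, 10 },
        };
        for (int r = 0; r < ROWS; r++)        /* each hole sweeps one scan line */
            for (int c = 0; c < COLS; c++)
                printf("%u ", image[r][c]);   /* "transmit" brightness in sequence */
        printf("\n");
        return 0;
    }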

Baird’s original “Televisor” showed its images on a 7:3 aspect-ratio, vertically oriented cathode ray tube, intended to fit the head and shoulders of a standing person, but that wouldn’t last.

For years previously, silent films had been shot on standard 35MM stock, but the addition of a physical audio track to 35MM film didn’t leave enough space for the visual area. So – after years of every movie studio having its own preferred aspect ratio, which required its own cameras, projectors, film stock and tools (and and and) – in 1929 the movie industry agreed to settle on the Society of Motion Picture And Television Engineers’ proposed standard of 0.8 inches by 0.6 inches, what became known as the Academy Ratio or, as we better know it today, 4:3.

Between 1932 and 1952, when widescreen for cinemas came into vogue as a differentiator from standard television, just about all the movies made in the world were shot in that aspect ratio, and just about every cathode ray tube made came in that shape, or one that could display it reliably. In 1953 studios started switching to a wider “Cinemascope”, to aggressively differentiate themselves from television, but by then television already had a large, thoroughly entrenched install base, and 4:3 remained the standard for in-home displays – and CRT manufacturers – until widescreen digital television came to market in the 1990s.

As computers moved from teleprinters – like, physical, ink-on-paper line printers – to screens, one byproduct of that standardization was that if you wanted to build a terminal, you either used that aspect ratio or you started making your own custom CRTs, a huge barrier to market entry. You can do that if you’re IBM, and you’re deeply reluctant to if you’re anyone else. So when DEC introduced their VT50 terminal, successor to the earlier VT05, that’s what they shipped, and with only 1Kb of display RAM (one kilobyte!) it displayed only twelve rows of widely-spaced text. Math is unforgiving, and 80×12=960; even one more row breaks the bank. The VT52 and its successor the VT100, though, doubled that capacity, giving users the opulent luxury of two entire kilobytes of display memory, laid out with a font that fit nicely on that 4:3 screen. The VT100 hit the market in August of 1978, and DEC sold more than six million of them over the product’s lifespan.

You even got a whole extra line! Thanks to the magic of basic arithmetic, 80×25 just sneaks under that opulent 2k limit with 48 bytes to spare.
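Spelled out, because the arithmetic really is doing all the work here:

    80 × 12 = 960 bytes — fits 1Kb (1,024 bytes); a thirteenth row would need 1,040
    80 × 24 = 1,920 bytes — the doubled, two-kilobyte generation
    80 × 25 = 2,000 bytes — still under 2,048, with 48 bytes to spare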

This is another point where direct connections get blurry, because 1976 to 1984 was an incredibly fertile time in the history of computing. After a brief period where competing terminal standards effectively locked software to the hardware it shipped on, the VT100 – being the first terminal to market fully supporting the recently codified ANSI standard control and escape sequences – quickly became the de-facto standard, and soon afterwards the de-jure, codified in ANSI-X3.64/ECMA-48. CP/M – soon to be replaced by PC-DOS and then MS-DOS – came from this era, with ANSI.SYS being the way DOS programs talked to the display from DOS 2.0 through to the beginning of Windows. Then in 1983 the Apple IIe was introduced, the first Apple computer to natively support an 80×24 text display, doubling the 40×24 default of their earlier hardware. The original XTerm, first released in 1984, was also created explicitly for VT100 compatibility.
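Those control and escape sequences are still how programs talk to terminals today; a minimal example, using two bog-standard ECMA-48 sequences that any VT100 descendant honors:

    #include <stdio.h>

    /* Two ECMA-48 control sequences the VT100 made ubiquitous:
       ESC [ 2 J erases the display, ESC [ row ; col H moves the
       cursor. Everything from ANSI.SYS to xterm understands these. */
    int main(void) {
        printf("\x1b[2J");     /* erase the whole screen */
        printf("\x1b[1;1H");   /* cursor to row 1, column 1 */
        printf("Hello from 1978.\n");
        return 0;
    }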

Fascinatingly, the early versions of the ECMA-48 standard specify that this standard isn’t solely meant for displays, specifying that “examples of devices conforming to this concept are: an alpha-numeric display device, a printer or a microfilm output device.”

A microfilm output device! This exercise dates to a time when microfilm output was a design constraint! I did not anticipate that cold-war spy-novel flavor while I was dredging this out, but it’s there and it’s magnificent.

It also dates to a time when the market was shifting quickly from mainframes and minicomputers to microcomputers – or, as we call them today, “computers” – reasonably priced desktop machines that humans might actually afford and that companies might own in large numbers, meaning this is also where the spectre of backcompat starts haunting the industry. This moment in a talk from the Microsoft developers working on the Windows Subsystem for Linux gives you a sense of the scale of that burden even today. In fact, it wasn’t until the fifth edition of ECMA-48 was published in 1991, more than a decade after the VT100 hit the market, that the formal specification for terminal behavior even admitted the possibility (Appendix F) that a terminal could be resized at all, meaning that the existing defaults were effectively graven in stone during what was otherwise one of the most fertile and formative periods in the history of computing.

As a personal aside, my two great frustrations with doing any kind of historical CS research remain the incalculable damage that academic paywalls have done to the historical record, and the relentless insistence this industry has on justifying rather than interrogating the status quo. This is how you end up on Stack Overflow spouting unresearched nonsense about how “4 pixel wide fonts are untidy-looking”. I’ve said this before, and I’ll say it again: whatever we think about ourselves as programmers and these towering logic-engines we’ve erected, we’re a lot more superstitious than we realize, and by telling and retelling these unsourced, inaccurate just-so stories without ever doing the work of finding the real truth, we’re betraying ourselves, our history and our future. But it’s pretty goddamned difficult to convince people that they should actually look things up instead of making up nonsense when actually looking things up, even for a seemingly simple question like this one, can cost somebody on the outside edge of an academic paywall hundreds or thousands of dollars.

So, as is now the usual in these things:

  • There are technical reasons,
  • There are social reasons,
  • It’s complicated, and
  • Open access publication or GTFO.

But if you ever wondered why just about every terminal in the world is eighty characters wide and twenty-five characters tall, there you go.

August 7, 2019

Ten More Simple Rules


The Public Library of Science’s Ten Simple Rules series can be fun reading; they’re introductory papers intended to provide novices or non-domain-experts with a set of quick, evidence-based guidelines for dealing with common problems in and around various fields, and it’s become a pretty popular, accessible format as far as scientific publication goes.

Topic-wise, they’re all over the place: protecting research integrity, creating a data-management plan and taking advantage of Github are right there next to developing good reading habits, organizing an unconference or drawing a scientific comic, and lots of them are kind of great.

I recently had the good fortune to be a co-author on one of them, one that’s right in my wheelhouse and has recently been accepted for publication: Ten Simple Rules for Helping Newcomers Become Contributors to Open Projects. They are, as promised, simple:

  1. Be welcoming.
  2. Help potential contributors evaluate if the project is a good fit.
  3. Make governance explicit.
  4. Keep knowledge up to date and findable.
  5. Have and enforce a code of conduct.
  6. Develop forms of legitimate peripheral participation.
  7. Make it easy for newcomers to get started.
  8. Use opportunities for in-person interaction – with care.
  9. Acknowledge all contributions, and
  10. Follow up on both success and failure.

You should read the whole thing, of course; what we’re proposing are evidence-based practices, and the details matter, but the citations are all there. It’s been a privilege to have been a small part of it, and to have done the work that’s put me in the position to contribute.

June 29, 2019

Blitcha

[Image: Blit]

March 27, 2019

Defined By Prosodic And Morphological Properties

Filed under: academia,academic,awesome,interfaces,lunacy,science — mhoye @ 5:09 pm


I am fully invested in these critical advances in memetic linguistics research:

[…] In this paper, we go beyond the aforementioned prosodic restrictions on novel morphology, and discuss gradient segmental preferences. We use morphological compounding to probe English speakers’ intuitions about the phonological goodness of long-distance vowel and consonant identity, or complete harmony. While compounding is a central mechanism for word-building in English, its phonology does not impose categorical vowel or consonant agreement patterns, even though such patterns are attested cross-linguistically. The compound type under investigation is a class of insult we refer to as shitgibbons, taking their name from the most prominent such insult which recently appeared in popular online media (§2). We report the results of three online surveys in which speakers rated novel shitgibbons, which did or did not instantiate long-distance harmonies (§3). With the ratings data established, we follow two lines of inquiry to consider their source: first, we compare shitgibbon harmony preferences with the frequency of segmental harmony in English compounds more generally, and conclude that the lexicon displays both vowel and consonant harmony (§4); second, we attribute the lack of productive consonant harmony in shitgibbons to the attested cross-linguistic harmonies, which we implement as a locality bias in MaxEnt grammar (§5).

You had me at “diddly-infixation”.

December 13, 2018

Looking Skyward

Filed under: awesome,beauty,documentation,flickr,future,life,science — mhoye @ 12:43 pm

[Images: PC050781, PC050776 – Space]

November 9, 2018

The Evolution Of Open

Filed under: digital,future,interfaces,linux,losers,mozilla,science,toys,vendetta,work — mhoye @ 5:00 pm

This started its life as a pair of posts to the Mozilla governance forum, about the mismatch between private communication channels and our principles of open development. It’s a little long-winded, but I think it broadly applies not just to Mozilla but to open source in general. This version of it interleaves those two posts into something I hope is coherent, if kind of rambly. Ultimately the only point I want to make here is that the nature of openness has changed, and while it doesn’t mean we need to abandon the idea as a principle or as a practice, we can’t ignore how much has changed or stay mired in practices born of a world that no longer exists.

If you’re up for the longer argument, well, you can already see the wall of text under this line. Press on, I believe in you.

Even though open source software has essentially declared victory, I think that openness as a practice – not just code you can fork but the transparency and accessibility of the development process – matters more than ever, and is in a pretty precarious position. I worry that if we – the Royal We, I guess – aren’t willing to grow and change our understanding of openness and the practical realities of working in the open, and to build tools to help people navigate those realities, it won’t be long until we’re worse off than we were when this whole free-and-open-source-software idea got started.

To take that a step further: if some of the aspirational goals of openness and open development are the ideas of accessibility and empowerment – that reducing or removing barriers to participation in software development, and granting people more agency over their lives thereby, is self-evidently noble – then I think we need to pull apart the different meanings of the word “open” that we use as if the same word meant all the same things to all the same people. My sense is that a lot of our discussions about openness are anchored in the notion of code as speech, of people’s freedom to move bits around and about the limitations placed on those freedoms, and I don’t think that’s enough.

A lot of us got our start when an internet connection was a novelty, computation was scarce and state was fragile. If you – like me – are a product of this time, “open” as in “open source” is likely to be a core part of your sense of personal safety and agency; you got comfortable digging into code, standing up your own services and managing your own backups pretty early, because that was how you maintained some degree of control over your destiny, how you avoided the indignities of data loss, corporate exploitation and community collapse.

“Open” in this context inextricably ties source control to individual agency. The checks and balances of openness in this context are about standards, data formats, and the ability to export or migrate your data away from sites or services that threaten to go bad or go dark. This view has very little to say about – and is often hostile to the idea of – granular access restrictions and the ability to impose them, those being the tools of this worldview’s bad actors.

The blind spots of this worldview are the products of a time where someone on the inside could comfortably pretend that all the other systems that had granted them the freedom to modify this software simply didn’t exist. Those access controls were handled, invisibly, elsewhere; university admission, corporate hiring practices or geography being just a few examples of the many, many barriers between the network and the average person.

And when we’re talking about blind spots and invisible social access controls, of course, what we’re really talking about is privilege. “Working in the open”, in a world where computation was scarce and expensive, meant working in front of an audience that was lucky enough to go to university or college, whose parents could afford a computer at home, who lived somewhere with broadband or had one of the few jobs whose company opened low-numbered ports to the outside world; what it didn’t mean was doxxing, cyberstalking, botnets, gamergaters, weaponized social media tooling, carrier-grade targeted-harassment-as-a-service and state-actor psy-op/disinformation campaigns rolling by like bad weather. The relentless, grinding day-to-day malfeasance that’s the background noise of this grudgefuck of a zeitgeist we’re all stewing in just didn’t inform that worldview, because it didn’t exist.

In contrast, a more recent turn on the notion of openness is one of organizational or community openness; that is, openness viewed through the lens of the accessibility and the experience of participation in the organization itself, rather than unrestricted access to the underlying mechanisms. Put another way, it puts the safety and transparency of the organization and the people in it first, and considers the openness of work products and data retention as secondary; sometimes (though not always) the open-source nature of the products emerges as a consequence of the nature of the organization, but the details of how that happens are community-first, code-second (and sometimes code-sort-of, code-last or code-never). “Openness” in this context is about accessibility and physical and emotional safety, about the ability to participate without fear. The checks and balances are principally about inclusivity, accessibility and community norms; codes of conduct and their enforcement.

It won’t surprise you, I suspect, to learn that environments that champion this brand of openness are much more accessible to women, minorities and otherwise marginalized members of society that make up a vanishingly small fraction of old-school open source culture. The Rust and Python communities are doing good work here, and the team at Glitch have done amazing things by putting community and collaboration ahead of everything else. But a surprising number of tool-and-platform companies, often in “pink-collar” fields, have taken the practices of open community building and turned themselves into something that, code or no, looks an awful lot like the best of what modern open source has to offer. If you can bring yourself to look past the fact that you can’t fork their code, Salesforce – Salesforce, of all the damn things – has one of the friendliest, most vibrant and supportive communities in all of software right now.

These two views aren’t going to be easy to reconcile, because the ideas of what “accountability” looks like in both contexts – and more importantly, the mechanisms of accountability built in to the systems born from both contexts – are worse than just incompatible. They’re not even addressing something the other worldview is equipped to recognize as a problem. Both are in some sense of the word open, both are to a different view effectively closed and, critically, a lot of things that look like quotidian routine to one perspective look insanely, unacceptably dangerous to the other.

I think that’s the critical schism in this dialogue: the wildly mismatched understandings of the nature of risk and freedom. Seen in that light, the recent surge of attention being paid to federated systems feels like a weirdly reactionary appeal to how things were better in the old days.

I’ve mentioned before that I think it’s a mistake to think of federation as a feature of distributed systems, rather than as consequence of computational scarcity. But more importantly, I believe that federated infrastructure – that is, a focus on distributed and resilient services – is a poor substitute for an accountable infrastructure that prioritizes a distributed and healthy community.  The reason Twitter is a sewer isn’t that Twitter is centralized, it’s that Jack Dorsey doesn’t give a damn about policing his platform and Twitter’s board of directors doesn’t give a damn about changing his mind. Likewise, a big reason Mastodon is popular with the worst dregs of the otaku crowd is that if they’re on the right instance they’re free to recirculate shit that’s so reprehensible even Twitter’s boneless, soporific safety team can’t bring themselves to let it slide.

That’s the other part of federated systems we don’t talk about much – how much the burden of safety shifts to the individual. The cost of evolving federated systems that require consensus to interoperate is so high that structural flaws are likely to be there for a long time, maybe forever, and the burden of working around them falls on every endpoint to manage for themselves. IRC’s (Remember IRC?) ongoing borderline-unusability is a direct product of a notion of openness that leaves admins few better tools than endless spammer whack-a-mole. Email is (sort of…) decentralized, but can you imagine using it with your junkmail filters off?

I suppose I should tip my hand at this point, and say that as much as I value the source part of open source, I also believe that people participating in open source communities deserve to be free not only to change the code and build the future, but to be free from the brand of arbitrary, mechanized harassment that thrives on unaccountable infrastructure, federated or not. We’d be deluding ourselves if we called systems that are just too dangerous for some people to participate in at all “open” just because you can clone the source and stand up your own copy. And I am absolutely certain that if this free software revolution of ours ends up in a place where asking somebody to participate in open development is indistinguishable from asking them to walk home at night alone, then we’re done. People cannot be equal participants in environments where they are subject to wildly unequal risk. People cannot be equal participants in environments where they are unequally threatened. And I’d have a hard time asking a friend to participate in an exercise that had no way to ablate or even mitigate the worst actions of the internet’s worst people, and still think of myself as a friend.

I’ve written about this before:

I’d like you to consider the possibility that that’s not enough.

What if we agreed to expand what freedom could mean, and what it could be. Not just “freedom to” but a positive defense of opportunities to; not just “freedom from”, but freedom from the possibility of.

In the long term, I see that as the future of Mozilla’s responsibility to the Web; not here merely to protect the Web, not merely to defend your freedom to participate in the Web, but to mount a positive defense of people’s opportunities to participate. And on the other side of that coin, to build accountable tools, systems and communities that promise not only freedom from arbitrary harassment, but even freedom from the possibility of that harassment.

More generally, I still believe we should work in the open as much as we can – that “default to open”, as we say, is still the right thing – but I also think we and everyone else making software need to be really, really honest with ourselves about what open means, and what we’re asking of people when we use that word. We’re probably going to find that there’s not one right answer. We’re definitely going to have to build a bunch of new tools.  But we’re definitely not going to find any answers that matter to the present day, much less to the future, if the only place we’re looking is backwards.

[Feel free to email me, but I’m not doing comments anymore. Spammers, you know?]

