Congratulations, Sir Richard Branson, on your successful space flight! (Yes, yes, I *know* it's technically just upper atmosphere, I *know* there's no path to orbit (yet) but can we give the man some credit for an awesome achievement?) And I look forward to Jeff Bezos making a similar flight later this month. Now, I stand by my earlier statement: the way you guys are doing this, a race, is going to get someone killed, perhaps one of you guys. A rocketship is not a racecar, and moves into realms of physics where we do not have good human intuition. Please, all y'all, take it easy, and get it right. That being said, congratulations on being the first human being to put themselves into space as part of a rocket program that they themselves set in motion. That's an amazing achievement, no-one can ever take that away from you, and maybe that's why you look so damn happy. Enjoy it! -the Centaur P.S. And day 198, though I'll do an analysis of the drawing at a later time.
Posts published in “Startuppery”
You know, Jeff Bezos isn’t likely to die when he flies July 20th. And Richard Branson isn’t likely to die when he takes off at 9am July 11th (tomorrow morning, as I write this). But the irresponsible race these fools have placed themselves in will eventually get somebody killed, as surely as Elon Musk’s attempt to build self-driving cars with cameras rather than lidar was doomed to (a) kill someone and (b) fail. It’s just, this time, I want to be caught on record saying I think this is hugely dangerous, rather than grumbling about it to my machine learning brethren. Whether or not a spacecraft is ready to launch is not a matter of will; it’s a matter of natural fact. This is actually the same as many other business ventures: whether we’re deciding to create a multibillion-dollar battery factory or simply open a Starbucks, our determination to make it succeed has far less to do with its success than the realities of the market and of its physical situation. Either the market is there to support it, and the machinery will work, or it won’t. But with normal business ventures, we’ve got a lot of intuition, and a lot of cushion. Even if you aren’t Elon Musk, you kind of instinctively know that you can’t build a battery factory before your engineering team has decided what kind of battery you need to build, and even if your factory goes bust, you can re-sell the land or the building. Even if you aren't Howard Schultz, you instinctively know it's smarter to build a Starbucks on a busy corner rather than the middle of nowhere, and even if your Starbucks goes under, it won't explode and take you out with it. But if your rocket explodes, you can't re-sell the broken parts, and it might very well take you out with it.
Our intuitions do not serve us well when building rockets or airships, because they're not simple things operating in human-scaled regions of physics, and we don't have a lot of cushion with rockets or self-driving cars, because they're machinery that can kill you, even if you've convinced yourself otherwise. The reasons behind the likelihood of failure are manifold here, and worth digging into in greater depth; but briefly, they include:
- The Paradox of the Director's Foot, where a leader's authority over safety personnel - and their personal willingness to take on risk - ends up short-circuiting safety protocols and causing accidents. This actually happened to me personally when two directors in a row had a robot run over their foot at a demonstration, and my eagle-eyed manager recognized that both of them had stepped into the safety enclosure to question the demonstrating engineer, forcing the safety engineer to take over audience questions - and all three took their eyes off the robot. Shoe leather degradation then ensued, for both directors. (And for me too, as I recall).
- The Inexpensive Magnesium Coffin, where a leader's aesthetic desire to have a feature - like Steve Jobs's desire for a magnesium case on the NeXT machines - led them to ignore feedback from engineers that the case would be much more expensive. Steve overrode his engineers ... and made the NeXT more expensive, just like they said it would, because wanting the case didn't make it cheaper. That extra cost led to the product's demise - that's why I call it a coffin. Elon Musk's insistence on using cameras rather than lidar on his self-driving cars is another Magnesium Coffin - an instance of ego and aesthetics overcoming engineering and common sense, which has already led to real deaths. I work in this precise area - teaching robots to navigate with lidar and vision - and vision-only navigation is just not going to work in the near term. (Deploy lidar and vision, and you can drop lidar within the decade with the ground-truth data you gather; try going vision alone, and you're adding another decade).
- Egotistical Idiot's Relay Race (AKA Lord Thomson's Suicide by Airship). Finally, the biggest reason for failure is the egotistical idiot's relay race. I wanted to come up with some nice, catchy parable name to describe why the Challenger astronauts died, or why the USS Macon crashed, but the best example is a slightly older one, the R101 disaster, which is notable because the man who started the R101 airship program - Lord Thomson - also rushed the program so he could make a PR trip to India, with the consequence that the airship was certified for flight without completing its endurance and speed trials. As a result, on that trip to India - its first long distance flight - the R101 crashed, killing 48 of the 54 people on board - Lord Thomson included. Just to be crystal clear here, it's Richard Branson who moved up his schedule to beat Jeff Bezos' announced flight, so it's Sir Richard Branson who is most likely up for a Lord Thomson's Suicide Award.
So, 2019. What a mess. More on that later; as for me, I've had neither the time nor even the capability to blog for a while. But one thing I've noticed is, at least for me, the point at which I want to give up is usually just prior to the point where I could have my big breakthrough. For example: Scrivener. I had just about given up on Scrivener, an otherwise great program for writers that helps with organizing notes, writing screenplays, and even for comic book scripts. But I'd become used to Google Docs and its keyboard shortcuts for hierarchical bulleted lists, not entirely different from my prior life using hierarchical notebook programs like GoldenSection Notes. But Scrivener's keyboard shortcuts were all different, and the menus didn't seem to support what I needed, so I had started trying alternatives. Then I took one more shot at going through the manual, which had earlier got me nothing. At first this looked like a lost cause: Scrivener depended on Mac OS X's text widgets, which themselves implement a nonstandard text interface (fanboys, shut up, sit down: you're overruled. case in point: Home and End. I rest my case), and worse, depend on the OS even for the keyboard shortcuts, which require the exact menu item. But the menu item for list bullets actually was literally a bullet, which normally isn't a text character in most programs; you can't access it. But as it turns out, in Scrivener, you can. I was able to insert a bullet, find the bullet character, and even create a keyboard shortcut for it. And it did what it was supposed to!
Soon I found the other items I needed to fill out the interface that I'd come to know and love in Google Docs for increasing/decreasing the list bullet indentation on the fly while organizing a list. Eventually I was able to recreate the whole interface and was so happy I wrote a list describing it in the middle of the deep learning Scrivener notebook that I had been working on when I hit the snag that made me go down this rabbit hole (namely, wanting to create a bullet list). Writing this paragraph itself required figuring out how to insert symbols for control characters in Mac OS X, but whatever: a solution was possible, even ready to be found, just when I was ready to give up. I found the same thing with so many things recently: stuck photo uploads on Google Photos, configuration problems on various publishing programs, even solving an issue with the math for a paper submission at work. I suspect this is everywhere. It's a known thing in mathematics that when you feel close to a solution you may be far from it; I often find, myself, that the solution is to be found just after the point where you want to give up. I've written about a related phenomenon I call "working a little bit harder than you want to," but this is slightly different: it's the idea that your judgment that you've exhausted your options is just that, a judgment. It may be true. Try looking just a bit harder for that answer. -the Centaur Pictured: a photo of the Greenville airport over Christmas, which finally uploaded today when I went back through the archives of Google Photos on my phone and manually stopped a stuck upload from December 19th.
I often say "I teach robots to learn," but what does that mean, exactly? Well, now that one of the projects that I've worked on has been announced - and I mean, not just on arXiv, the public access scientific repository where all the hottest reinforcement learning papers are shared, but actually, accepted into the ICRA 2018 conference - I can tell you all about it! When I'm not roaming the corridors hammering infrastructure bugs, I'm trying to teach robots to roam those corridors - a problem we call robot navigation. Our team's latest idea combines "traditional planning," where the robot tries to navigate based on an explicit model of its surroundings, with "reinforcement learning," where the robot learns from feedback on its performance. For those not in the know, "traditional" robotic planners use structures like graphs to plan routes, much in the same way that a GPS uses a roadmap. One of the more popular methods for long-range planning is the probabilistic roadmap, which builds a long-range graph by picking random points and attempting to connect them with a simpler "local planner" that knows how to navigate shorter distances. It's a little like how you learn to drive in your neighborhood - starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what. But for that to work, you have to know how to drive, and that's where the local planner comes in. Building a local planner is simple in theory - you can write one for a toy world in a few dozen lines of code - but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called "navigation stacks" and can contain dozens of components - and in my experience they're hard to get working and even when you do, they're often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings.
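To make the roadmap idea concrete, here's a minimal sketch of a probabilistic roadmap in a toy world. Everything in it is an assumption for illustration, not our actual code: the 10-by-10 room, the circular obstacles, the sample count, the connection radius, and the function names are all made up, and the "local planner" is the simplest one possible, a straight-line collision check.

```python
import heapq
import math
import random

# Hypothetical toy world: a 10 x 10 room with circular obstacles (x, y, radius).
OBSTACLES = [(5.0, 5.0, 1.5), (2.5, 7.5, 1.0), (7.5, 2.5, 1.0)]
WORLD_SIZE = 10.0


def is_free(point):
    """Return True if the point lies outside every obstacle."""
    x, y = point
    return all(math.hypot(x - ox, y - oy) > r for ox, oy, r in OBSTACLES)


def straight_line_planner(a, b, step=0.1):
    """Toy local planner: succeed if the straight segment a -> b is collision-free."""
    n = max(1, int(math.dist(a, b) / step))
    return all(
        is_free((a[0] + (b[0] - a[0]) * i / n, a[1] + (b[1] - a[1]) * i / n))
        for i in range(n + 1)
    )


def build_roadmap(start, goal, num_samples=300, radius=2.0, seed=0):
    """Sample random free points and link nearby pairs the local planner can connect."""
    rng = random.Random(seed)
    nodes = [start, goal]
    while len(nodes) < num_samples + 2:
        p = (rng.uniform(0, WORLD_SIZE), rng.uniform(0, WORLD_SIZE))
        if is_free(p):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and straight_line_planner(nodes[i], nodes[j]):
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges


def shortest_path(nodes, edges, src=0, dst=1):
    """Dijkstra over the roadmap; returns a list of waypoints, or None if unreachable."""
    best = {src: 0.0}
    parent = {}
    queue = [(0.0, src)]
    while queue:
        cost, u = heapq.heappop(queue)
        if u == dst:
            path = [dst]
            while path[-1] != src:
                path.append(parent[path[-1]])
            return [nodes[k] for k in reversed(path)]
        if cost > best.get(u, math.inf):
            continue  # stale queue entry
        for v, w in edges[u]:
            if cost + w < best.get(v, math.inf):
                best[v] = cost + w
                parent[v] = u
                heapq.heappush(queue, (cost + w, v))
    return None


nodes, edges = build_roadmap(start=(0.5, 0.5), goal=(9.5, 9.5))
path = shortest_path(nodes, edges)
```

The long-range route comes from the graph search; the local planner only ever has to answer "can I get from this point to that nearby one?" That split is what lets you swap in a smarter local planner later without touching the roadmap code.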
People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the-art algorithm called DDPG (Deep Deterministic Policy Gradients) pioneered by DeepMind to learn a navigation system that could successfully travel several meters in office-like environments. But there's a further wrinkle: the so-called "reality gap". By necessity, the local planner used by a probabilistic roadmap is simulated - attempting to connect points on a map. That simulated local planner isn't identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map which it can't navigate safely in the real world. This can have disastrous consequences - causing robots to tumble down stairs, or, worse, when people follow their GPSes too closely without looking where they're going, causing cars to tumble off the end of a bridge. Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap's local planner and the robot's navigation system. Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of successfully executing its plans in the real world. In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than a "straight-line" local planner which was our default alternative.
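The "only add a link if the controller can traverse it" idea can be sketched in a few lines. This is a toy under loud assumptions, not the PRM-RL implementation: `noisy_policy` is a hypothetical stand-in for a trained agent (it just heads for the goal with Gaussian noise), and the trial count, success threshold, step size, and obstacle are all invented for illustration.

```python
import math
import random

# Hypothetical toy world: one circular obstacle (x, y, radius).
OBSTACLES = [(5.0, 5.0, 1.5)]


def is_free(point):
    """Return True if the point lies outside every obstacle."""
    x, y = point
    return all(math.hypot(x - ox, y - oy) > r for ox, oy, r in OBSTACLES)


def noisy_policy(position, goal, rng, step=0.2, noise=0.05):
    """Stand-in for a trained policy (assumption): head for the goal, imperfectly."""
    dx, dy = goal[0] - position[0], goal[1] - position[1]
    dist = math.hypot(dx, dy)
    scale = min(step, dist) / dist if dist > 0 else 0.0
    return (
        position[0] + dx * scale + rng.gauss(0, noise),
        position[1] + dy * scale + rng.gauss(0, noise),
    )


def rollout_succeeds(start, goal, rng, tolerance=0.3, max_steps=100):
    """Simulate one attempt to drive start -> goal; fail on collision or timeout."""
    position = start
    for _ in range(max_steps):
        if math.dist(position, goal) <= tolerance:
            return True
        position = noisy_policy(position, goal, rng)
        if not is_free(position):
            return False
    return False


def can_traverse(a, b, trials=20, required_rate=0.9, seed=0):
    """Accept a roadmap link only if the policy itself reliably gets from a to b."""
    rng = random.Random(seed)
    successes = sum(rollout_succeeds(a, b, rng) for _ in range(trials))
    return successes / trials >= required_rate


# Wide-open corridor: the policy reaches the goal every time, so the link is kept.
open_edge = can_traverse((1.0, 1.0), (3.0, 1.0))
# Straight through the obstacle: every attempt collides, so the link is dropped.
blocked_edge = can_traverse((3.0, 5.0), (7.0, 5.0))
# Skims the obstacle with 0.02 of clearance: a straight-line check would accept
# this link, but the noisy policy will most likely fail too often to keep it.
grazing_edge = can_traverse((3.0, 3.48), (7.0, 3.48))
```

The grazing case is the reality gap in miniature: the geometry says the link is fine, but the thing that actually has to drive it says otherwise. Checking links with the same controller that will run on the robot is meant to keep those links out of the map in the first place.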
While I didn't happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot's behavior wasn't as good as we hoped, we traced that to a networking issue that was adding a delay to commands sent to the robot, and not to our code itself; we'll fix this in a subsequent round). This work includes both our group working on office robot navigation - including Alexandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson - and Alexandra's collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper. Until the ICRA version comes out, you can find the preliminary version on arXiv:
https://arxiv.org/abs/1710.03937
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning
We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.
So, when I say "I teach robots to learn" ... that's what I do. -the Centaur
So one of the things I like to do each year, as part of my traditional visit to family over the holidays, is to drop in on a Panera Bread, pull out my notebook, review my plans for the previous year, and make plans for the new one. As of the 7th of January, I still haven't done this. Shit happened last year. Good shit, such as really getting serious about teaching robots to learn; bad shit, such as serious illnesses in the pets in our family; and ugly shit which I'm not going to talk about until the final contracts are signed and everyone agrees everything is hunky and dory. And much of this went down just before the holidays, and once the holidays started, I cared a lot more about spending time with family and friends than sitting by myself in a Panera. (In all fairness, the holidays were easier when I lived in Atlanta and came up to see family many times a year, as opposed to only occasionally). But I can recommend trying to do a yearly review. One year I decided to list what I wanted to do in the immediate future, in the coming year, in the coming 5 years, and in my life; and the next year, almost by chance, I sat down in the same Panera to review it. That served me well for more than a decade, and I find that even trying to do it helps me feel more focused and refreshed. And so that's precisely what I tried to do yesterday. I didn't accomplish it - I still haven't managed to "clear the thickets" of my TODO lists to get to the actual yearly plan, and I miss being able to take a whole afternoon at Panera doing this - but I did the next best thing, sitting myself down to a nice "reboot" dinner and treating myself to a showing of Star Wars: The Last Jedi. As someone said (a reference I read recently, but have been unable to find) the very act of doing something daily centers the mind. Here's to that. -Anthony
Not literally; we were far south of the literal fires, which just barely missed the homes of our friends. But so many other things have been going wrong that it felt like things were on fire ... so no posts for a while, sorry. But tonight, I got to the last chapter of Dakota Frost #6, SPIRITUAL GOLD. I will likely finish this chapter Saturday. That makes today a good day. Time for some cake. -the Centaur Pictured: a cat break with Loki. Not how things look right now, but how I feel.