Press "Enter" to skip to content

Posts published in “Computing”

The art and science of mechanized thought.

Learning to Drive … by Learning Where You Can Drive

centaur 0

I often say "I teach robots to learn," but what does that mean, exactly? Well, now that one of the projects that I've worked on has been announced - and I mean, not just on arXiv, the public access scientific repository where all the hottest reinforcement learning papers are shared, but actually, accepted into the ICRA 2018 conference - I  can tell you all about it!

When I'm not roaming the corridors hammering infrastructure bugs, I'm trying to teach robots to roam those corridors - a problem we call robot navigation. Our team's latest idea combines "traditional planning," where the robot tries to navigate based on an explicit model of its surroundings, with "reinforcement learning," where the robot learns from feedback on its performance.

For those not in the know, "traditional" robotic planners use structures like graphs to plan routes, much in the same way that a GPS uses a roadmap. One of the more popular methods for long-range planning are probabilistic roadmaps, which build a long-range graph by picking random points and attempting to connect them by a simpler "local planner" that knows how to navigate shorter distances. It's a little like how you learn to drive in your neighborhood - starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what.

But for that to work, you have to know how to drive, and that's where the local planner comes in. Building a local planner is simple in theory - you can write one for a toy world in a few dozen lines of code - but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called "navigation stacks" and can contain dozens of components - and in my experience they're hard to get working and even when you do, they're often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings.

People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the art algorithm called DDPG (Deep Deterministic Policy Gradients) pioneered by DeepMind to learn a navigation system that could successfully travel several meters in office-like environments.

But there's a further wrinkle: the so-called "reality gap". By necessity, the local planner used by a probablistic roadmap is simulated - attempting to connect points on a map. That simulated local planner isn't identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map which it can't navigate safely in the real world. This can have disastrous consequences - causing robots to tumble down stairs, or, worse, when people follow their GPSes too closely without looking where they're going, causing cars to tumble off the end of a bridge.

Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap's local planner and the robot's navigation system. Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of attempting to execute its plans in the real world.

In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than a "straight-line" local planner which was our default alternative. While I didn't happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot's behavior wasn't as good as we hoped, we debugged that to a networking issue that was adding a delay to commands sent to the robot, and not in our code itself; we'll fix this in a subsequent round).

This work includes both our group working on office robot navigation - including Alexandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson - and Alexandra's collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper.  Until the ICRA version comes out, you can find the preliminary version on arXiv:
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning

We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.


So, when I say "I teach robots to learn" ... that's what I do.

-the Centaur

My Daily Dragon Interview in Two Words: “Just Write!”

centaur 0

So at Dragon Con I had a reading this year. Yeah, looks like this is the last year I get to bring all my books - too many, to heavy! I read the two flash fiction pieces in Jagged Fragments, "If Looks Could Kill" and "The Secret of the T-Rex's Arms", as well as reading the first chapter of Jeremiah Willstone and the Clockwork Time Machine, a bit of my and Jim Davies' essay on the psychology of Star Trek's artificial intelligences, and even a bit of my very first published story, "Sibling Rivalry". I also gave the presentation I was supposed to give at the SAM Talks before I realized I was double booked; that was "Risk Getting Worse".

But that wasn't recorded, so, oh dang, you'll have to either go to my Amazon page to get my books, or wait until we get "Risk Getting Worse" recorded. But my interview with Nancy Northcott for the Daily Dragon, "Robots, Computers, and Magic", however, IS online, so I can share it with you all. Even more so, I want to share what I think is the most important part of my interview:

DD: Do you have any one bit of advice for aspiring writers?

AF: Write. Just write. Don’t worry about perfection, or getting published, or even about pleasing anyone else: just write. Write to the end of what you start, and only then worry about what to do with it. In fact, don’t even worry about finishing everything—don’t be afraid to try anything. Artists know they need to fill a sketchbook before sitting down to create a masterwork, but writers sometimes get trapped trying to polish their first inspiration into a final product.

Don’t get trapped on the first hill! Whip out your notebook and write. Write morning pages. Write diary at the end of the day. Write a thousand starts to stories, and if one takes flight, run with it with all the abandon you have in you. Accept all writing, especially your own. Just write. Write.

That's it. To read more, check out the interview here, or see all my Daily Dragon mentions at Dragon Con here, or check out my interviewer Nancy Northcott's site here. Onward!

-the Centaur



What is Artificial Intelligence?

centaur 0


Simply put, "artificial intelligence” is people trying to make things do things that we’d call smart if done by people.

So what’s the big deal about that?

Well, as it turns out, a lot of people get quite wound up with the definition of "artificial intelligence.” Sometimes this is because they’re invested in a prescientific notion that machines can’t be intelligent and want to define it in a way that writes the field off before it gets started, or it’s because they’re invested in an unscientific degree into their particular theory of intelligence and want to define it in a way that constrains the field to look at only the things they care about, or because they’re actually not scientific at all and want to proscribe the field to work on the practical problems of particular interest to them.

No, I’m not bitter about having to wade through a dozen bad definitions of artificial intelligence as part of a survey. Why do you ask?

The Eagle Has Landed

centaur 0


Welp, that was anticlimactic! Thanks, God, for a smooth update to WordPress 4.7.3! (And thanks to the WordPress team for maintaining backwards compatibility). And hey, look - the Library has close to 1,000 posts!

Screenshot 2017-03-21 12.35.50.png

Expect major site updates in the months to come, as WordPress’s Themes and Pages now enable me to do things I could only formerly do with static pages and hand-coded pages, and it will all be backed up easier thanks to WordPress’s Jetpack plugin.

The things you learn helping other people with their web sites ….

-the Centaur

We are go for launch …

centaur 0


Welp, it’s time: I’ve backed up the Library of Dresan three ways to Sunday, said a prayer … and now am planning to upgrade WordPress from 3.0.1-alpha-15359 to 4.7.3. I know that’s 1.7.2 full version numbers, but it’s been too long, and there are too many new features I need, so … time to press the button.

God, please help me! Everyone else, your prayers, please.

-the Centaur

GDC 2017 AI Summit in Progress 

centaur 0


Lots of great content …


... and this year I have pages and pages of notes!


Stay tuned …


... or check the talks out in a few weeks on the GDC Vault!


-the Centaur

Welcome to the Future

centaur 0


Welcome to the future, ladies and gentlemen. Here in the future, the obscure television shows of my childhood rate an entire section in the local bookstore, which combines books, games, music, movies, and even vinyl records with a coffeehouse and restaurant.


Here in the future, the heretofore unknown secrets of my discipline, artificial intelligence, are now conveniently compiled in compelling textbooks that you can peruse at your leisure over a cup of coffee.


Here in the future, genre television shows play on the monitors of my favorite bar / restaurant, and the servers and I have meaningful conversations about the impact of robotics on the future of labor.


And here in the future, Monty Python has taken over the world.

Perhaps that explains 2016.

-the Centaur

I’m so sorry, web …

centaur 0

… I had to install an ad-blocker. Why? Firefox before any ad block:

Screenshot 2016-12-21 21.08.19.png

Firefox after Adblock Plus:

Screenshot 2016-12-21 21.08.55.png

Yep, Firefox was TEN TIMES SLOWER when loading a page with ads, and it stayed that way because the ads kept updating. Just one page with ads brought FF to its knees, and I did the experiment several times to confirm, yes, it indeed was the ads. I don’t know what’s specifically going on here, but I strongly suspect VPAID ads and similar protocols are the culprit, as documented here:

… publisher and website owner Artem Russakovskii took to Google+ and The Hacker News to share some of his findings concerning VPAID ads. He shows how VPAID ads can degrade a user’s browser performance:

“… after several minutes of just leaving this one single ad open, I’m at 53MB downloaded and 5559 requests. By the time I finished typing this, I was at 6140 requests. A single ad did this. Without reloading the page, just leaving it open.

A single VPAID ad absolutely demolishes site performance on mobile and desktop, and we, the publishers, get the full blame from our readers. And when multiple VPAID ads end up getting served on the same page… you get the idea."

Similarly, John Gruber reports that a 500-word text article weighed in at 15MB - enough data to hold more than 10 copies of the Bible, according to the Guardian. Gruber links another post which shows that web pages can get more than 5 times faster without all the excess scripts that they load.

The sad thing is, I don’t mind ads. The very first version of my site had fake “ads” for other blogs I liked. Even the site I tested above, the estimable Questionable Content, had ads for other webcomics I liked, but experimentation showed that ads could bring Firefox to its knees. QC I always thought of as ad-lite, but guess it’s time to start contributing via Patreon.

The real problem is news sites. Sites were opening a simple story kept locking up Firefox and twice brought down my whole computer by draining the battery incredibly fast. I don’t care what you think your metrics are telling you, folks: if you pop up an overview so I can’t see your page, and start running a dozen ads that kill my computer, I will adblock you, or just stop going to your site, and many, many other people across the world are doing the same.

We need standards of excellence in content that say 2/3 of a page will be devoted to content and that ads can add no more than 50% to the bandwidth downloaded by a page. Hell, make it only 1/3 content and 100% extra bandwidth - that will be almost 100% more content than a page totally destroyed by popup ads and almost 3000% less data than one bloated by 10 copies of the Old Testament in the form of redundant ads for products I will either never buy or, worse, have already bought.

-the Centaur

Why yes, I’m running a deep learning system on a MacBook Air. Why?

centaur 1


Yep, that’s Python consuming almost 300% of my CPU - guess what, I guess that means this machine has four processing cores, since I saw it hit over 300% - running the TensorFlow tutorial. For those that don’t know, "deep learning” is a relatively recent type of learning which uses improvements in both processing power and learning algorithms to train learning networks that can have dozens or hundreds of layers - sometimes as many layers as neural networks in the 1980’s and 1990’s had nodes.

For those that don’t know even that, neural networks are graphs of simple nodes that mimic brain structures, and you can train them with data that contains both the question and the answer. With enough internal layers, neural networks can learn almost anything, but they require a lot of training data and a lot of computing power. Well, now we’ve got lots and lots of data, and with more computing power, you’d expect we’d be able to train larger networks - but the first real trick was discovering mathematical tricks that keep the learning signal strong deep, deep within the networks.

The second real trick was wrapping all this amazing code in a clean software architecture that enables anyone to run the software anywhere. TensorFlow is one of the most recent of these frameworks - it’s Google’s attempt to package up the deep learning technology it uses internally so that everyone in the world can use it - and it’s open source, so you can download and install it on most computers and try out the tutorial at home. The CPU-baking example you see running here, however, is not the simpler tutorial, but a test program that runs a full deep neural network. Let’s see how it did:

Screenshot 2016-02-08 21.08.40.png

Well. 99.2% correct, it seems. Not bad for a couple hundred lines of code, half of which is loading the test data - and yeah, that program depends on 200+ files worth of Python that the TensorFlow installation loaded onto my MacBook Air, not to mention all the libraries that the TensorFlow Python installation depends on in turn …

But I still loaded it onto a MacBook Air, and it ran perfectly.

Amazing what you can do with computers these days.

-the Centaur