Press "Enter" to skip to content

Posts published in “Computing”

The art and science of mechanized thought.

Jesus and Gödel


Yesterday I claimed that Christianity is about following Jesus - looking at him as a role model for thinking, judging, and doing; stepping away from rules and towards principles; choosing good outcomes over bad ones and treating others as we want to be treated; and ultimately emulating what Jesus would do.

But it's an entirely fair question to ask, why do we need a role model to follow? Why not have a set of rules that guide our behavior, or develop good principles to live by? Well, it turns out it's impossible - not hard, but literally mathematically impossible - to have perfect rules, and principles do not guide actions. So a role model is the best tool we have to help us build the cognitive skill of doing the right thing.

Let's back up a bit. I want to talk about what rules are, and how they differ from principles and models.

In the jargon of my field, artificial intelligence, rules are if-then statements: if this, then do that. They map a range of propositions to a domain of outcomes, which might be actions, new propositions, or edits to our thoughts. There's a lot of evidence that the lower levels of operation of our minds are rule-like.

Principles, in contrast, are descriptions of situations. They don't prescribe what to do; they evaluate what has been done. The venerable artificial intelligence technique of generate-and-test - throw stuff on the wall to see what sticks - depends on "principles" to evaluate whether the outcomes are good.

Models are neither if-then rules nor principles. Models predict the evolution of a situation. Every time you play a computer game, a model predicts how the world will react to your actions. Every time you think to yourself, "I know what my friend would say in response to this", you're using a model.
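If it helps to see the distinction concretely, here's a tiny sketch in Python, with situations and scores made up purely for illustration: a rule maps a condition directly to an action, a principle only scores actions that something else proposes (generate-and-test), and a model predicts where an action leads.

    # Toy illustration: rules generate actions directly, principles only evaluate
    # proposed actions, and models predict outcomes. The situations and scores
    # here are invented solely for the sake of the example.

    def rule(situation):
        """A rule: if this condition holds, do that action."""
        if "someone is hungry" in situation:
            return "share food"
        return None

    def principle(situation, action):
        """A principle: it proposes nothing; it only scores a candidate action."""
        return 1 if "share" in action or "help" in action else -1

    def model(situation, action):
        """A model: predict how the situation evolves if we take an action."""
        if action == "share food":
            return situation - {"someone is hungry"}
        return situation

    # Generate-and-test: some generator proposes candidates; the principle evaluates.
    situation = {"someone is hungry"}
    candidates = ["walk away", "share food", "help them find work"]
    best = max(candidates, key=lambda action: principle(situation, action))

    print("rule says:", rule(situation))              # share food
    print("principle prefers:", best)                 # share food (ties go to the first candidate)
    print("model predicts:", model(situation, best))  # set() - no one left hungry

Take away the rule and the principle can still tell good candidates from bad ones, but something else has to propose them - which is where models, and role models, come in.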

Rules, of a sort, may underlie our thinking, and some of our most important moral precepts are encoded in rules, like the Ten Commandments. But rules are fundamentally limited. No matter how attached you are to any given set of rules, eventually those rules can fail you, and you can't know when.

The iron laws behind these fatal flaws are Gödel's incompleteness theorems. Back in the 1930s, Kurt Gödel showed that any set of rules sophisticated enough to handle basic math would either fail to find things that were true or would make mistakes - and, worse, could never prove that it was consistent.

Like so many seemingly abstract mathematical concepts, this has practical real-world implications. If you're dealing with anything at all complicated, and try to solve your problems with a set of rules, either those rules will fail to find the right answers, or will give the wrong answers, and you can't tell which.

That's why principles are better than rules: they make no pretensions of being a complete set of if-then rules that can handle all of arithmetic and their own job besides. Because they evaluate propositions rather than generating them, they're not vulnerable to the incompleteness result in the same way.

How does this affect the moral teachings of religion? Well, think of it this way: God gave us the Ten Commandments (and much more) in the Old Testament, but these if-then rules needed to be elaborated and refined into a complete system. This was a cottage industry by the time Jesus came on the scene.

Breaking with the rule-based tradition, Jesus gave us principles, such as "love thy neighbor as thyself" and "forgive as you wish to be forgiven," which can be used to evaluate our actions. Sometimes, some thought is required to apply them, as in the case of "Is it lawful to do good or evil on the Sabbath?"

This is where principles fail: they don't generate actions, they merely evaluate them. Some other process needs to generate those actions. It could be a formal set of rules, but then we're back at square Gödel. It could be a random number generator, but an infinite set of monkeys will take forever to cross the street.

This is why Jesus's function as a role model - and the stories about Him in the Bible - are so important to Christianity. Humans generate mental models of other humans all the time. Once you've seen enough examples of someone's behavior, you can predict what they will do, and act and react accordingly.

The stories the Bible tells about Jesus facing moral questions, ethical challenges, physical suffering, and even temptation help us build a model of what Jesus would do. A good model of Jesus is more powerful than any rule and more useful than any principle: it is generative, easy to follow, and always applicable.

Even if you're not a Christian, this model of ethics can help you. No set of rules can be complete and consistent, or even fully checkable: rules lawyering is a dead end. Ethical growth requires moving beyond easy rules to broader principles which can be used to evaluate the outcomes of your choices.

But principles are not a guide to action. That's where role models come in: in a kind of imitation-based learning, they can help guide us by example until we've developed the cognitive skills to make good decisions automatically. Finding role models that you trust can help you grow, and not just morally.

Good role models can help you decide what to do in any situation - though not every modern question maps neatly onto the situations Jesus faced in ancient Galilee! For example, when faced with a conundrum, I sometimes ask three questions: "What would Jesus do? What would Richard Feynman do? What would Ayn Rand do?"

These role models seem far apart - Ayn Rand, in particular, tried to put herself on the opposite pole from Jesus. But each brings a unique thought process to the table: "Is this doing good or evil?", "You are the easiest person for yourself to fool," and "You cannot fake reality in any way whatsoever."

Jesus helps me focus on what choices are right. Feynman helps me challenge my assumptions and provides methods to test them. Rand is benevolent, but demands that we be honest about reality. If two or three of these role models agree on a course of action, it's probably a good choice.

Jesus was a real person in a distant part of history. We can only reach an understanding of who Jesus is and what He would do by reading the primary source materials about him - the Bible - and by analyses that help put these stories in context, like religious teachings, church tradition, and the use of reason.

But that can help us ask what Jesus would do. Learning the rules is important, and graduating beyond them to understand principles is even more important. But at the end of the day, we want to do the right thing, by following the lead of the man who tells us, "Love thy neighbor as thyself."

-the Centaur

Pictured: Kurt Gödel, of course.

Renovation in Process

So you may have noticed the blog theme and settings changing recently; that's because I'm trying to get some kind of slider or visual image above the fold. I love the look of the blog with the big banner image, but I'm concerned that people just won't scroll down to see what's in the blog if there's nothing on the first page which says what I do. So I'll be experimenting. Stay tuned!

-the Centaur

Pictured: Yeah, this isn't the only renovation going on.

Robots in Montreal

A cool hotel in old Montreal.

"Robots in Montreal," eh? Sounds like the title of a Steven Moffat Doctor Who episode. But it's really ICRA 2019 - the IEEE Conference on Robotics and Automation, and, yes, there are quite a few robots!

Boston Dynamics quadruped robot with arm and another quadruped.

My team presented our work on evolutionary learning of rewards for deep reinforcement learning, AutoRL, on Monday. In an hour or so, I'll be giving a keynote on "Systematizing Robot Navigation with AutoRL":

Keynote: Dr. Anthony Francis
Systematizing Robot Navigation with AutoRL: Evolving Better Policies with Better Evaluation

Abstract: Rigorous scientific evaluation of robot control methods helps the field progress towards better solutions, but deploying methods on robots requires its own kind of rigor. A systematic approach to deployment can do more than just make robots safer, more reliable, and more debuggable; with appropriate machine learning support, it can also improve robot control algorithms themselves. In this talk, we describe our evolutionary reward learning framework AutoRL and our evaluation framework for navigation tasks, and show how improving evaluation of navigation systems can measurably improve the performance of both our evolutionary learner and the navigation policies that it produces. We hope that this starts a conversation about how robotic deployment and scientific advancement can become better mutually reinforcing partners.

Bio: Dr. Anthony G. Francis, Jr. is a Senior Software Engineer at Google Brain Robotics specializing in reinforcement learning for robot navigation. Previously, he worked on emotional long-term memory for robot pets at Georgia Tech's PEPE robot pet project, on models of human memory for information retrieval at Enkia Corporation, and on large-scale metadata search and 3D object visualization at Google. He earned his B.S. (1991), M.S. (1996) and Ph.D. (2000) in Computer Science from Georgia Tech, along with a Certificate in Cognitive Science (1999). He and his colleagues won the ICRA 2018 Best Paper Award for Service Robotics for their paper "PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning". He's the author of over a dozen peer-reviewed publications and is an inventor on over a half-dozen patents. He's published over a dozen short stories and four novels, including the EPIC eBook Award-winning Frost Moon; his popular writing on robotics includes articles in the books Star Trek Psychology and Westworld Psychology, as well as a Google AI blog article titled Maybe your computer just needs a hug. He lives in San Jose with his wife and cats, but his heart will always belong in Atlanta. You can find out more about his writing at his website.

Looks like I'm on in 15 minutes! Wish me luck.

-the Centaur

 

I<tab-complete> welcome our new robot overlords.

Hoisted from a recent email exchange with my friend Gordon Shippey:
Re: Whassap?
Gordon: Sounds like a plan. (That was an actual Gmail suggested response. Grumble-grumble AI takeover.)
Anthony: I<tab-complete> welcome our new robot overlords.

I am constantly amazed by the new autocomplete. While, anecdotally, spell-checking autocorrect is getting worse and worse (I blame the nearly-universal phenomenon of U-shaped development, where a system trying to learn new generalizations gets worse before it gets better), I have written near-complete emails to friends and colleagues with Gmail's suggested responses, and when writing texts to my wife, it knows our shorthand!

One way of doing this back in the day was Markov chain text models, where we learn predictions of what patterns are likely to follow each other; so if I write "love you too boo boo" to my wife enough times, it can predict "boo boo" will follow "love you too" and provide it as a completion. More modern systems use recurrent neural networks to learn richer sets of features, with stateful information carried down the chain, enabling them to capture subtler relationships and get better results, as described in the great article "The Unreasonable Effectiveness of Recurrent Neural Networks".

-the<tab-complete> Centaur
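P.S. For the curious, here's a toy sketch of that old-school Markov chain completion: a word-bigram model trained on a few made-up messages, nothing like Gmail's actual system.

    # Toy Markov chain text completion: learn which word tends to follow which,
    # then greedily suggest the most likely continuation.
    from collections import defaultdict, Counter

    messages = [
        "love you too boo boo",
        "love you too sweetie",
        "love you too boo boo",
    ]

    # Count, for each word, which words follow it and how often.
    following = defaultdict(Counter)
    for message in messages:
        words = message.split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1

    def complete(prefix, length=3):
        """Extend a prefix with the most frequent next word, up to `length` words."""
        words = prefix.split()
        for _ in range(length):
            options = following.get(words[-1])
            if not options:
                break
            words.append(options.most_common(1)[0][0])
        return " ".join(words)

    print(complete("love you"))  # -> "love you too boo boo"

A recurrent network does the same job with a learned, stateful representation instead of raw word-pair counts, which is what lets it pick up longer-range patterns.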

PRM-RL Won a Best Paper Award at ICRA!

So, this happened! Our team's paper on "PRM-RL" - a way to teach robots to navigate their worlds which combines human-designed algorithms that use roadmaps with deep-learned algorithms to control the robot itself - won a Best Paper Award at the ICRA robotics conference!

I talked a little bit about how PRM-RL works in the post "Learning to Drive ... by Learning Where You Can Drive", so I won't go over the whole spiel here. The basic idea is that we've gotten good at teaching robots to control themselves using a technique called deep reinforcement learning (the RL in PRM-RL) that trains them in simulation, but it's hard to extend this approach to long-range navigation problems in the real world. We overcome this barrier by using a more traditional robotic approach, probabilistic roadmaps (the PRM in PRM-RL), which build maps of where the robot can drive using point-to-point connections. We combine these maps with the robot simulator and, boom, we have a map of where the robot thinks it can successfully drive. We were cited not just for this technique, but for testing it extensively in simulation and on two different kinds of robots.

I want to thank everyone on the team - especially Aleksandra Faust for her background in PRMs and for taking point on the idea (and doing all the quadrotor work with Lydia Tapia), Oscar Ramirez and Marek Fiser for their work on our reinforcement learning framework and simulator, Kenneth Oslund for his heroic last-minute push to collect the indoor robot navigation data, and our manager James for his guidance, contributions to the paper, and support of our navigation work.

Woohoo! Thanks again everyone!

-the Centaur

Dave, We’re On Your Side

The biggest "current" in my mind is the person I am currently worried about, my good friend and great Game AI developer Dave Mark. Dave is the founder of the GDC AI Summit ... but was struck by a car leaving the last sessions at GDC, and still is in the hospital, seriously injured. Dave is a really special person. I've been going to GDC longer than Dave, but it was he (along with my friend Neil Kirby) who drew me out of my shell and got me to participate in the Game AI community, which is a super important part of my life even though I don't do Game AI for my day job. Dave's friends and family have set up a Go Fund Me to help cover his medical expenses and the travel and other expenses of his family while he remains in the hospital in the Bay Area. I encourage you all to help out - especially if you've ever played a game and found the AI especially clever. Dave, you're in our prayers ... -the Centaur Pictured: Dave (on the right) and friends.

Just Checking in on the Currents

SO! Hey! GDC and Clockwork Alchemy are over and I'm not dead! (A joke which I actually don't find that funny given the circumstances, which I'll dig into in just a moment.) Strangely enough, hitting two back-to-back conferences, both of which you participate in super heavily, can take something out of your blog. Who knew? But I need to get better at blogging, so I thought I'd try something new: a "check-in" in which I try to hit all the same points each time - what am I currently writing, editing, programming, etc.? For example, I am currently:
  • Listening To: Tomb Raider soundtrack (the original).
  • Reading: Theoretical Neuroscience (book).
  • Writing: "Death is a Game for the Young", a novella in the Jeremiah Willstone multiverse.
  • Editing: SPECTRAL IRON, Dakota Frost #4.
  • Reviewing: SHATTERED SKY, Lunar Cycle #2 by David Colby.
  • Researching: Neural Approaches to Universal Subgoaling.
  • Programming: A toy DQN (Deep Q Network) to stretch my knowledge.
  • Drawing: Steampunk girls with goggles.
  • Planning: Camp Nanowrimo for April, ROOT USER, Cinnamon Frost #3.
  • Taking on: Giving up alcohol for Lent.
  • Dragging on: Doing my taxes.
  • Spring Cleaning: The side office.
  • Trying to Ignore: The huge pile of blogposts left over from GDC and CA.
  • Caring For: My cat Lenora, suffering from cancer.
  • Waiting For: My wife Sandi, returning from a business trip.
Whew, that's a lot, and I don't even think I got them all. Maybe I won't try to write all of the same "currents" every time, but it was a useful exercise in "find something to blog about without immediately turning it into a huge project."

But the biggest "current" in my mind is the person I am currently worried about, my good friend and great Game AI developer Dave Mark. Dave is the founder of the GDC AI Summit ... but was struck by a car leaving the last sessions at GDC, and still is in the hospital, seriously injured. More in a moment.

-the Centaur

Pictured: Butterysmooooth sashimi at Izakaya Ginji in San Mateo from a few days ago, along with my "Currently Reading" book Theoretical Neuroscience, open to the Linear Algebra appendix from when I was "Currently Researching" some technical details of the vector notation of quadratic forms by going through stacks and stacks of books - a question which would have been answered more easily if I had started by looking at the entry for quadratic forms in Wolfram's MathWorld, had I only known at the start of my search that that was the name for math terms like xᵀWx.

Enter Colaboratory (AKA “A Spoonful of the Tracking Soup”)

As an author, I'm interested in how well my books are doing: not only do I want people reading them, I also want to compare what my publisher and booksellers claim about my books with my actual sales. (Also, I want to know how close to retirement I am.) In the past, I used to read a bunch of web pages on Amazon (and Barnes and Noble too, before they changed their format) and entered the numbers into an Excel spreadsheet called "Writing Popularity" (which just as easily could have been called "Writing Obscurity", yuk yuk yuk). That was fine when I had one book, but now I have four novels and an anthology out. This could take half an hour or more, which I needed for valuable writing time. I needed a better system.

I knew about tools for parsing web pages, like the parsing library Beautiful Soup, but it had been half a decade since I touched that library and I just never had the time to sit down and do it. But recently I've realized the value of a great force multiplier for exploratory software development (and I don't mean Stack Exchange): interactive programming notebooks. Pioneered by Mathematica in 1988 and picked up by tools like IPython and its descendant Jupyter, an interactive programming notebook is like a mix of a command line - where you can dynamically enter commands and get answers - and literate programming, where code is embedded in the document that describes (and produces) it.

But Mathematica isn't the best tool for either web parsing or for producing code that will one day become a library - it's written in the Wolfram Language, which is optimized for mathematical computation - and Jupyter notebooks require setting up a Jupyter server or otherwise jumping through hoops. Enter Google's Colaboratory. Colab is a free service provided by Google that hosts Jupyter notebooks. It's got most of the standard libraries you might need, it provides its own backends to run the code, and it saves copies of the notebooks to Google Drive, so you don't have to worry about acquiring software or running a server or even saving your data (but do please hit save).

Because you can try code out and see the results right away, it's perfect for iterating on ideas: no need to restart a changed program, losing valuable seconds; if something doesn't work, you can tweak the code and try it right away. In this sense Colab has some of the force multiplier effects of a debugger, but it's far more powerful. Heck, in this version of the system you can ask a question on Stack Overflow right from the Help menu. How cool is that?

My prototyping session got a bit long, so rather than try to insert it inline here, I wrote this blog post in Colab! To read more, go take a look at the Colaboratory notebook itself, "A Sip of the Tracking Soup", available at: https://goo.gl/Mihf1n

-the Centaur
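P.S. If you're curious what that kind of scraping code looks like, here's a minimal sketch of the requests + Beautiful Soup pattern - not the actual notebook. The URL, the "#salesRank" selector, and the book list are hypothetical placeholders; any real bookseller's page will have different markup (and terms of service worth checking).

    # Sketch of scraping a sales-rank number with requests + Beautiful Soup.
    # The URL and the "#salesRank" selector are made-up placeholders, not the
    # real markup of any particular bookseller's product page.
    import re
    import requests
    from bs4 import BeautifulSoup

    BOOKS = {
        "Frost Moon": "https://www.example.com/books/frost-moon",
    }

    def fetch_sales_rank(url):
        """Download a product page and pull the first number out of its sales-rank element."""
        page = requests.get(url, timeout=30)
        page.raise_for_status()
        soup = BeautifulSoup(page.text, "html.parser")
        rank_node = soup.select_one("#salesRank")           # hypothetical element id
        if rank_node is None:
            return None
        match = re.search(r"[\d,]+", rank_node.get_text())  # e.g. "#123,456 in Books"
        return int(match.group().replace(",", "")) if match else None

    for title, url in BOOKS.items():
        print(title, fetch_sales_rank(url))

From there, dropping the results into a spreadsheet or a Pandas DataFrame is exactly the kind of glue work a notebook makes painless.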

Why I’m Solving Puzzles Right Now

When I was a kid (well, a teenager) I'd read puzzle books for pure enjoyment. I'd gotten started with Martin Gardner's mathematical recreation books, but the ones I really liked were Raymond Smullyan's books of logic puzzles. I'd go to Wendy's on my lunch break at Francis Produce, with a little notepad and a book, and chew my way through a few puzzles. I'll admit I often skipped ahead if they got too hard, but I did my best most of the time.

I read more of these as an adult, moving back to the Martin Gardner books. But sometime, about twenty-five years ago (when I was in the thick of grad school), my reading needs completely overwhelmed my reading ability. I'd always carried huge stacks of books home from the library, never finishing all of them, frequently paying late fees, but there was one book in particular - The Emotions by Nico Frijda - which I finished but never followed up on.

Over the intervening years, I did finish books, but read most of them scattershot, picking up what I needed for my creative writing or scientific research. Eventually I started using the tiny little notetabs you see in some books to mark the stuff that I'd written, a "levels of processing" trick to ensure that I was mindfully reading what I wrote.

A few years ago, I admitted that wasn't enough, and consciously began trying to read ahead of what I needed for work. I chewed through C++ manuals and planning books and was always rewarded a few months later when I'd already read what I needed to solve my problems. I began focusing on fewer books in depth, finishing more books than I had in years.

Even that wasn't enough, and I began - at last - the re-reading project I'd hoped to do with The Emotions. Recently I did that with Dedekind's Essays on the Theory of Numbers, but now I'm doing it with Deep Learning. But some of that math is frickin' beyond where I am now, man. Maybe one day I'll get it, but sometimes I've spent weeks tackling a problem I just couldn't get.

Enter puzzles. As it turns out, it's really useful for a scientist to also be a science fiction writer who writes stories about a teenaged mathematical genius! I've had to simulate Cinnamon Frost's staggering intellect for the purpose of writing the Dakota Frost stories, but the further I go, the more I want her to be doing real math. How did I get into math? Puzzles! So I gave her puzzles.

And I decided to return to my old puzzle books, some of the ones I got later but never fully finished, and to give them the deep reading treatment. It's going much slower than I like - I find myself falling victim to the "rule of threes" (you can do a third of what you want to do, often in three times as much time as you expect) - but then I noticed something interesting.

Some of Smullyan's books in particular are thinly disguised math books. In some parts, they're even the same math I have to tackle in my own work. But unlike the other books, these problems are designed to be solved, rather than being a reflection of some chunk of reality which may be stubborn; and unlike the other books, these have solutions along with each problem.

So, I've been solving puzzles ... with careful note of how I have been failing to solve puzzles. I've hinted at this before, but understanding how you, personally, usually fail is a powerful technique for debugging your own stuck points. I get sloppy, I drop terms from equations, I misunderstand conditions, I overcomplicate solutions, I grind against problems where I should ask for help, I rabbithole on analytical exploration, and I always underestimate the time it will take for me to make the most basic progress.

Know your weaknesses. Then you can work those weak mental muscles, or work around them to build complementary strengths - the way Richard Feynman would always check over an equation when he was done, looking for those places where he had flipped a sign.

Back to work!

-the Centaur

Pictured: my "stack" at a typical lunch. I'll usually get to one out of three of the things I bring for myself to do. Never can predict which one, though.

Nailed It (Sorta)

Here's what was in the rabbit hole from last time (I had been almost there): I had way too much data to exploit, so I started to think about culling it out, using the length of the "mumbers" to cut off all the items too big to care about. That led to the key missing insight: my method of mapping mumbers sent the first digit of each item to the same position - that is, 9, 90, 900, 9000 all had the same angle, just further out. This distance was already a logarithm of the number, but once I dropped my resistance to taking the logarithm twice ...

... then I could create a transition plot function which worked for almost any mumber in the sets of mumbers I was playing with. With that in hand, I could easily visualize the small set of transitions - "mumbers" with 3 digits - that yielded the graph above. The actual samples I wanted to play with were larger, up to 4 digits, and those still yield a visible graph.

And this, while it doesn't let me visualize the whole space that I wanted, does provide the insight I wanted. The "mumbers" up to 10000 do indeed "produce" most of the space of the smaller "mumbers" (not surprising, as the "mumber" rule 2XYZ produces XYZ, and 52XY produces XYXY ... meaning most numbers in the first 10,000 will be produced by one in that first set). But it also shows that sequences of 52-rule transitions on the left produce a few very, very large mumbers - probably because 552552 produces 552552552552, which produces 552552552552552552552552552552552552, which quickly zooms away to the "mumberOverflow" value at the top of my chart.

And now the next lesson: finishing up this insight, which more or less closes out what I wanted to explore here, took 45 minutes. I had 15 allotted to do various computer tasks before leaving Aqui, and I'm already 30 minutes over that ... which suggests again that you be careful going down rabbit holes; unlike leprechaun trails, there isn't likely to be a pot of gold down there, and who knows how far down it can go?

-the Centaur

P.S. I am not suggesting this time spent was not worthwhile; I'm just trying to understand the opportunity cost of various different problem-solving strategies so I can become more efficient.

Don’t Fall Into Rabbit Holes

SO! There I was, trying to solve the mysteries of the universe, learn about deep learning, and teach myself enough puzzle logic to create credible puzzles for the Cinnamon Frost books, and I find myself debugging the fine details of a visualization system I've developed in Mathematica to analyze the distribution of problems in an odd middle chapter of Raymond Smullyan's The Lady or the Tiger.

I meant well! Really I did. I was going to write a post about how finding a solution is just a little bit harder than you normally think, and how insight sometimes comes after letting things sit. But the tools I was creating didn't do what I wanted, so I went deeper and deeper down the rabbit hole trying to visualize them. The short answer seems to be that there's no "there" there and that further pursuit of this sub-problem will take me further and further away from the real problem: writing great puzzles!

I learned a lot - about numbers, about how things could combinatorially explode, about Ulam Spirals and how to code them algorithmically. I even learned something about how I, particularly, fail in these cases. But it didn't provide the insights I wanted.

Feynman warned about this: he called it "the computer disease", worrying about the formatting of the printout so much you forget about the answer you're trying to produce, and it can strike anyone in my line of work. Back to that work.

-the Centaur

Learning to Drive … by Learning Where You Can Drive

I often say "I teach robots to learn," but what does that mean, exactly? Well, now that one of the projects that I've worked on has been announced - and I mean, not just on arXiv, the public access scientific repository where all the hottest reinforcement learning papers are shared, but actually, accepted into the ICRA 2018 conference - I  can tell you all about it! When I'm not roaming the corridors hammering infrastructure bugs, I'm trying to teach robots to roam those corridors - a problem we call robot navigation. Our team's latest idea combines "traditional planning," where the robot tries to navigate based on an explicit model of its surroundings, with "reinforcement learning," where the robot learns from feedback on its performance. For those not in the know, "traditional" robotic planners use structures like graphs to plan routes, much in the same way that a GPS uses a roadmap. One of the more popular methods for long-range planning are probabilistic roadmaps, which build a long-range graph by picking random points and attempting to connect them by a simpler "local planner" that knows how to navigate shorter distances. It's a little like how you learn to drive in your neighborhood - starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what. But for that to work, you have to know how to drive, and that's where the local planner comes in. Building a local planner is simple in theory - you can write one for a toy world in a few dozen lines of code - but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called "navigation stacks" and can contain dozens of components - and in my experience they're hard to get working and even when you do, they're often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings. People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the art algorithm called DDPG (Deep Deterministic Policy Gradients) pioneered by DeepMind to learn a navigation system that could successfully travel several meters in office-like environments. But there's a further wrinkle: the so-called "reality gap". By necessity, the local planner used by a probablistic roadmap is simulated - attempting to connect points on a map. That simulated local planner isn't identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map which it can't navigate safely in the real world. This can have disastrous consequences - causing robots to tumble down stairs, or, worse, when people follow their GPSes too closely without looking where they're going, causing cars to tumble off the end of a bridge. Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap's local planner and the robot's navigation system. 
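To give a flavor of the roadmap half of that (the published system is considerably more involved), here's a minimal sketch of probabilistic roadmap construction. The local-planner check below is just a stand-in straight-line distance test; in PRM-RL, that check is the trained RL policy attempting the traversal in simulation.

    # Minimal probabilistic roadmap (PRM) sketch: sample random points, then add
    # an edge only when the "local planner" says it can get from one to the other.
    # local_planner_can_traverse is a stand-in distance test here; in PRM-RL this
    # check is replaced by the learned RL policy trying the hop in simulation.
    import math
    import random

    def local_planner_can_traverse(a, b, max_range=5.0):
        """Stand-in for the local planner / RL policy: succeed only on short hops."""
        return math.dist(a, b) <= max_range

    def build_roadmap(num_samples=200, world_size=50.0, k_neighbors=10):
        """Sample points in a square world and try to connect each to its nearest neighbors."""
        points = [(random.uniform(0, world_size), random.uniform(0, world_size))
                  for _ in range(num_samples)]
        edges = {i: set() for i in range(num_samples)}
        for i, p in enumerate(points):
            nearest = sorted(range(num_samples), key=lambda j: math.dist(p, points[j]))
            for j in nearest[1:k_neighbors + 1]:  # skip nearest[0], which is i itself
                if local_planner_can_traverse(p, points[j]):
                    edges[i].add(j)
                    edges[j].add(i)
        return points, edges

    points, edges = build_roadmap()
    print(sum(len(e) for e in edges.values()) // 2, "edges in the roadmap")

Once the roadmap exists, long-range navigation is just graph search over those edges, with the same local planner (or learned policy) executing each hop.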
Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of successfully executing its plans in the real world. In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than the "straight-line" local planner which was our default alternative.

While I didn't happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot's behavior wasn't as good as we hoped, we traced that to a networking issue that was adding a delay to commands sent to the robot, not to our code itself; we'll fix this in a subsequent round).

This work includes both our group working on office robot navigation - including Aleksandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson - and Aleksandra's collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper. Until the ICRA version comes out, you can find the preliminary version on arXiv:

https://arxiv.org/abs/1710.03937 PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning

We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.
So, when I say "I teach robots to learn" ... that's what I do.

-the Centaur

My Daily Dragon Interview in Two Words: “Just Write!”

So at Dragon Con I had a reading this year. Yeah, looks like this is the last year I get to bring all my books - too many, too heavy! I read the two flash fiction pieces in Jagged Fragments, "If Looks Could Kill" and "The Secret of the T-Rex's Arms", as well as the first chapter of Jeremiah Willstone and the Clockwork Time Machine, a bit of my and Jim Davies' essay on the psychology of Star Trek's artificial intelligences, and even a bit of my very first published story, "Sibling Rivalry".

I also gave the presentation I was supposed to give at the SAM Talks before I realized I was double booked; that was "Risk Getting Worse". But that wasn't recorded, so, oh dang, you'll have to either go to my Amazon page to get my books, or wait until we get "Risk Getting Worse" recorded.

But my interview with Nancy Northcott for the Daily Dragon, "Robots, Computers, and Magic", IS online, so I can share it with you all. Even more so, I want to share what I think is the most important part of my interview:
DD: Do you have any one bit of advice for aspiring writers?

AF: Write. Just write. Don’t worry about perfection, or getting published, or even about pleasing anyone else: just write. Write to the end of what you start, and only then worry about what to do with it. In fact, don’t even worry about finishing everything—don’t be afraid to try anything. Artists know they need to fill a sketchbook before sitting down to create a masterwork, but writers sometimes get trapped trying to polish their first inspiration into a final product. Don’t get trapped on the first hill! Whip out your notebook and write. Write morning pages. Write diary at the end of the day. Write a thousand starts to stories, and if one takes flight, run with it with all the abandon you have in you. Accept all writing, especially your own. Just write. Write.
That's it. To read more, check out the interview here, or see all my Daily Dragon mentions at Dragon Con here, or check out my interviewer Nancy Northcott's site here. Onward!

-the Centaur

What is Artificial Intelligence?




Simply put, "artificial intelligence” is people trying to make things do things that we’d call smart if done by people.

So what’s the big deal about that?

Well, as it turns out, a lot of people get quite wound up with the definition of “artificial intelligence.” Sometimes this is because they’re invested in a prescientific notion that machines can’t be intelligent, and want to define it in a way that writes the field off before it gets started; sometimes it’s because they’re invested to an unscientific degree in their particular theory of intelligence, and want to define it in a way that constrains the field to look at only the things they care about; and sometimes it’s because they’re not actually being scientific at all, and want to restrict the field to the practical problems of particular interest to them.

No, I’m not bitter about having to wade through a dozen bad definitions of artificial intelligence as part of a survey. Why do you ask?

The Eagle Has Landed



Welp, that was anticlimactic! Thanks, God, for a smooth update to WordPress 4.7.3! (And thanks to the WordPress team for maintaining backwards compatibility). And hey, look - the Library has close to 1,000 posts!


Expect major site updates in the months to come, as WordPress’s Themes and Pages now enable me to do things I could only formerly do with static pages and hand-coded pages, and it will all be backed up easier thanks to WordPress’s Jetpack plugin.

The things you learn helping other people with their web sites ….

-the Centaur

We are go for launch …




Welp, it’s time: I’ve backed up the Library of Dresan three ways to Sunday, said a prayer … and now am planning to upgrade WordPress from 3.0.1-alpha-15359 to 4.7.3. I know that’s 1.7.2 full version numbers, but it’s been too long, and there are too many new features I need, so … time to press the button.

God, please help me! Everyone else, your prayers, please.

-the Centaur

GDC 2017 AI Summit in Progress 



Lots of great content …


... and this year I have pages and pages of notes!


Stay tuned …


... or check the talks out in a few weeks on the GDC Vault!


-the Centaur

Welcome to the Future



Welcome to the future, ladies and gentlemen. Here in the future, the obscure television shows of my childhood rate an entire section in the local bookstore, which combines books, games, music, movies, and even vinyl records with a coffeehouse and restaurant.


Here in the future, the heretofore unknown secrets of my discipline, artificial intelligence, are now conveniently compiled in compelling textbooks that you can peruse at your leisure over a cup of coffee.


Here in the future, genre television shows play on the monitors of my favorite bar / restaurant, and the servers and I have meaningful conversations about the impact of robotics on the future of labor.


And here in the future, Monty Python has taken over the world.

Perhaps that explains 2016.

-the Centaur

I’m so sorry, web …


… I had to install an ad-blocker. Why? Firefox before any ad block:


Firefox after Adblock Plus:


Yep, Firefox was TEN TIMES SLOWER when loading a page with ads, and it stayed that way because the ads kept updating. Just one page with ads brought FF to its knees, and I did the experiment several times to confirm, yes, it indeed was the ads. I don’t know what’s specifically going on here, but I strongly suspect VPAID ads and similar protocols are the culprit, as documented here:

http://techaeris.com/2016/06/14/vpaid-ads-hurting-internet-experience/

… publisher and website owner Artem Russakovskii took to Google+ and The Hacker News to share some of his findings concerning VPAID ads. He shows how VPAID ads can degrade a user’s browser performance:

“… after several minutes of just leaving this one single ad open, I’m at 53MB downloaded and 5559 requests. By the time I finished typing this, I was at 6140 requests. A single ad did this. Without reloading the page, just leaving it open.

A single VPAID ad absolutely demolishes site performance on mobile and desktop, and we, the publishers, get the full blame from our readers. And when multiple VPAID ads end up getting served on the same page… you get the idea."

Similarly, John Gruber reports that a 500-word text article weighed in at 15MB - enough data to hold more than 10 copies of the Bible, according to the Guardian. Gruber links another post which shows that web pages can get more than 5 times faster without all the excess scripts that they load.

The sad thing is, I don’t mind ads. The very first version of my site had fake “ads” for other blogs I liked. Even the site I tested above, the estimable Questionable Content, had ads for other webcomics I liked, but experimentation showed that ads could bring Firefox to its knees. I always thought of QC as ad-lite, but I guess it’s time to start contributing via Patreon.

The real problem is news sites. Sites where opening a simple story kept locking up Firefox and twice brought down my whole computer by draining the battery incredibly fast. I don’t care what you think your metrics are telling you, folks: if you pop up an overlay so I can’t see your page, and start running a dozen ads that kill my computer, I will adblock you, or just stop going to your site, and many, many other people across the world are doing the same.

We need standards of excellence in content that say 2/3 of a page will be devoted to content and that ads can add no more than 50% to the bandwidth downloaded by a page. Hell, make it only 1/3 content and 100% extra bandwidth - that will be almost 100% more content than a page totally destroyed by popup ads and almost 3000% less data than one bloated by 10 copies of the Old Testament in the form of redundant ads for products I will either never buy or, worse, have already bought.

-the Centaur

Why yes, I’m running a deep learning system on a MacBook Air. Why?

Yep, that’s Python consuming almost 300% of my CPU - which, I guess, means this machine has four processing cores, since I saw it hit over 300% - running the TensorFlow tutorial.

For those that don’t know, “deep learning” is a relatively recent type of machine learning which uses improvements in both processing power and learning algorithms to train learning networks that can have dozens or hundreds of layers - sometimes as many layers as neural networks in the 1980’s and 1990’s had nodes. For those that don’t know even that, neural networks are graphs of simple nodes that mimic brain structures, and you can train them with data that contains both the question and the answer. With enough internal layers, neural networks can learn almost anything, but they require a lot of training data and a lot of computing power.

Well, now we’ve got lots and lots of data, and with more computing power you’d expect we’d be able to train larger networks - but the first real trick was discovering mathematical tricks that keep the learning signal strong deep, deep within the networks. The second real trick was wrapping all this amazing code in a clean software architecture that enables anyone to run the software anywhere.

TensorFlow is one of the most recent of these frameworks - it’s Google’s attempt to package up the deep learning technology it uses internally so that everyone in the world can use it - and it’s open source, so you can download and install it on most computers and try out the tutorial at home. The CPU-baking example you see running here, however, is not the simpler tutorial, but a test program that runs a full deep neural network. Let’s see how it did:

Well. 99.2% correct, it seems. Not bad for a couple hundred lines of code, half of which is loading the test data - and yeah, that program depends on 200+ files worth of Python that the TensorFlow installation loaded onto my MacBook Air, not to mention all the libraries that the TensorFlow Python installation depends on in turn …

But I still loaded it onto a MacBook Air, and it ran perfectly. Amazing what you can do with computers these days.

-the Centaur
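P.S. For the curious, the modern descendant of that test program fits in a few lines with TensorFlow's Keras API. This is a sketch of a small fully connected MNIST classifier along those lines, not the actual 2016-era tutorial code I was running.

    # Small fully connected MNIST classifier with tf.keras - a sketch of the kind
    # of network the tutorial trains, not the original TensorFlow tutorial code.
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    model.fit(x_train, y_train, epochs=5)
    loss, accuracy = model.evaluate(x_test, y_test)
    print(f"test accuracy: {accuracy:.3f}")  # typically around 98% for this setup

The 99.2% above came from the full test program; a plain fully connected model like this one usually lands a bit lower, around 98%.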