Dave, We’re On Your Side

The biggest “current” in my mind is the person I am currently worried about: my good friend and great Game AI developer Dave Mark. Dave is the founder of the GDC AI Summit … but he was struck by a car while leaving the last sessions at GDC, and he is still in the hospital, seriously injured.

Dave is a really special person. I’ve been going to GDC longer than Dave, but it was he (along with my friend Neil Kirby) who drew me out of my shell and got me to participate in the Game AI community, which is a super important part of my life even though I don’t do Game AI for my day job.

Dave’s friends and family have set up a GoFundMe to help cover his medical expenses, and his family’s travel and other costs, while he remains in the hospital in the Bay Area. I encourage you all to help out – especially if you’ve ever played a game and found the AI especially clever.

Dave, you’re in our prayers …

-the Centaur

Pictured: Dave (on the right) and friends.

Just Checking in on the Currents

SO! Hey! GDC and Clockwork Alchemy are over and I’m not dead! (A joke I actually don’t find that funny given the circumstances, which I’ll dig into in just a moment.) Strangely enough, hitting two back-to-back conferences, both of which you participate in super heavily, can take something out of your blog. Who knew?

But I need to get better at blogging, so I thought I’d try something new: a “check-in” in which I try to hit all the same points each time – what am I currently writing, editing, programming, etc? For example, I am currently:

  • Listening To: Tomb Raider soundtrack (the original).
  • Reading: Theoretical Neuroscience (book).
  • Writing: “Death is a Game for the Young”, a novella in the Jeremiah Willstone multiverse.
  • Editing: SPECTRAL IRON, Dakota Frost #4.
  • Reviewing: SHATTERED SKY, Lunar Cycle #2 by David Colby.
  • Researching: Neural Approaches to Universal Subgoaling.
  • Programming: A toy DQN (Deep Q Network) to stretch my knowledge.
  • Drawing: Steampunk girls with goggles.
  • Planning: Camp NaNoWriMo for April, ROOT USER, Cinnamon Frost #3.
  • Taking on: Giving up alcohol for Lent.
  • Dragging on: Doing my taxes.
  • Spring Cleaning: The side office.
  • Trying to Ignore: The huge pile of blogposts left over from GDC and CA.
  • Caring For: My cat Lenora, suffering from cancer.
  • Waiting For: My wife Sandi, returning from a business trip.

Whew, that’s a lot, and I don’t even think I got them all. Maybe I won’t try to write all of the same “currents” every time, but it was a useful exercise in “find something to blog about without immediately turning it into a huge project.”

But the biggest “current” in my mind is the person I am currently worried about: my good friend and great Game AI developer Dave Mark. Dave is the founder of the GDC AI Summit … but he was struck by a car while leaving the last sessions at GDC, and he is still in the hospital, seriously injured.

More in a moment.

-the Centaur

Pictured: Butterysmooooth sashimi at Izakaya Ginji in San Mateo from a few days ago, along with my “Currently Reading” book Theoretical Neuroscience, open to the Linear Algebra appendix. At the time I was “Currently Researching” some technical details of the vector notation of quadratic forms by going through stacks and stacks of books – a question I could have answered far more easily by looking at the entry for quadratic forms in Wolfram’s MathWorld, had I only known at the start of my search that that was the name for terms like xᵀWx.

Enter Colaboratory (AKA “A Spoonful of the Tracking Soup”)

As an author, I’m interested in how well my books are doing: not only do I want people reading them, I also want to compare what my publisher and booksellers claim about my books with my actual sales. (Also, I want to know how close to retirement I am.)

In the past, I would read a bunch of web pages on Amazon (and Barnes and Noble too, before they changed their format) and enter the numbers into an Excel spreadsheet called “Writing Popularity” (which just as easily could have been called “Writing Obscurity”, yuk yuk yuk). That was fine when I had one book, but now I have four novels and an anthology out. Collecting the data could take half an hour or more – time I needed for valuable writing. I needed a better system.

I knew about tools for parsing web pages, like the Beautiful Soup parsing library, but it had been half a decade since I’d touched that library, and I just never had the time to sit down and do it. Recently, though, I’ve realized the value of a great force multiplier for exploratory software development (and I don’t mean Stack Exchange): interactive programming notebooks. Pioneered by Mathematica in 1988 and picked up by tools like IPython and its descendant Jupyter, an interactive programming notebook is a mix of a command line – where you can dynamically enter commands and get answers – and literate programming, where code is embedded in the very document that explains (and produces) it.

But Mathematica isn’t the best tool for either web parsing or for producing code that will one day become a library – it’s built on the Wolfram Language, which is optimized for mathematical computation – and Jupyter notebooks normally require setting up a Jupyter server or otherwise jumping through hoops.

Enter Google’s Colaboratory.

Colab is a free service from Google that hosts Jupyter notebooks. It has most of the standard libraries you might need, it provides its own backends to run the code, and it saves copies of the notebooks to Google Drive, so you don’t have to worry about installing software, running a server, or even saving your data (but do please hit save). Because you can try code out and see the results right away, it’s perfect for iterating on ideas: there’s no need to restart a changed program, losing valuable seconds; if something doesn’t work, you can tweak the code and try it again immediately. In this sense Colab has some of the force-multiplier effect of a debugger, but it’s far more powerful. Heck, in the current version of the system you can ask a question on Stack Overflow right from the Help menu. How cool is that?
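
To give you a taste of why that’s such a force multiplier, here’s a minimal sketch of the kind of cell I iterate on – not my actual tracking code. The URL and the “salesRank” element id are hypothetical placeholders (booksellers’ markup changes constantly, and any real scraper should respect the site’s terms of service):

```python
# A minimal sketch of a sales-tracking cell, using the requests and
# Beautiful Soup libraries (both preinstalled on Colab). The URL and
# the "salesRank" id below are hypothetical placeholders, not any
# bookseller's real markup.
import requests
from bs4 import BeautifulSoup

BOOK_PAGES = {
    "Frost Moon": "https://example.com/books/frost-moon",  # placeholder
}

def fetch_sales_rank(url):
    """Fetch a product page and pull out its sales rank, if present."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    rank_element = soup.find(id="salesRank")  # hypothetical element id
    return rank_element.get_text(strip=True) if rank_element else None

for title, url in BOOK_PAGES.items():
    print(title, "->", fetch_sales_rank(url))
```

Because each cell re-runs independently, tweaking a selector and re-checking the result takes seconds, not a full edit-and-restart cycle.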

My prototyping session got a bit long, so rather than try to insert it inline here, I wrote this blog post in Colab! To read more, go take a look at the Colaboratory notebook itself, “A Sip of the Tracking Soup”, available at: https://goo.gl/Mihf1n

-the Centaur


Why I’m Solving Puzzles Right Now

When I was a kid (well, a teenager) I’d read puzzle books for pure enjoyment. I’d gotten started with Martin Gardner’s mathematical recreation books, but the ones I really liked were Raymond Smullyan’s books of logic puzzles. I’d go to Wendy’s on my lunch break at Francis Produce, with a little notepad and a book, and chew my way through a few puzzles. I’ll admit I often skipped ahead if they got too hard, but I did my best most of the time.

I read more of these as an adult, moving back to the Martin Gardner books. But sometime about twenty-five years ago (when I was in the thick of grad school), my reading needs completely overwhelmed my reading ability. I’d always carried huge stacks of books home from the library, never finishing all of them, frequently paying late fees, but there was one book in particular – The Emotions by Nico Frijda – which I finished but never followed up on.

Over the intervening years, I did finish books, but I read most of them scattershot, picking up what I needed for my creative writing or scientific research. Eventually I started using the tiny little notetabs you see in some books to mark the passages I’d written notes on – a “levels of processing” trick to ensure that I was mindfully reading what I marked.

A few years ago, I admitted that wasn’t enough, and consciously began trying to read ahead of what I needed for work. I chewed through C++ manuals and planning books, and was always rewarded a few months later when I found I’d already read what I needed to solve my problems. I began focusing on fewer books in depth, finishing more books than I had in years.

Even that wasn’t enough, and I began – at last – the re-reading project I’d hoped to do with The Emotions. Recently I did that with Dedekind’s Essays on the Theory of Numbers, and now I’m doing it with the Deep Learning book. But some of that math is frickin’ beyond where I am now, man. Maybe one day I’ll get it, but sometimes I’ve spent weeks tackling a problem I just couldn’t get.

Enter puzzles. As it turns out, it’s really useful for a scientist to also be a science fiction writer who writes stories about a teenaged mathematical genius! I’ve had to simulate Cinnamon Frost’s staggering intellect for the purpose of writing the Dakota Frost stories, but the further I go, the more I want her to be doing real math. How did I get into math? Puzzles!

So I gave her puzzles. And I decided to return to my old puzzle books, some of the ones I got later but never fully finished, and to give them the deep reading treatment. It’s going much slower than I like – I find myself falling victim to the “rule of threes” (you can do a third of what you want to do, often in three times as much time as you expect) – but then I noticed something interesting.

Some of Smullyan’s books in particular are thinly disguised math books. In some parts, they’re even the same math I have to tackle in my own work. But unlike the other books, these problems are designed to be solved, rather than a reflection of some chunk of reality which may be stubborn; and unlike the other books, these have solutions along with each problem.

So, I’ve been solving puzzles … with careful note of how I have been failing to solve puzzles. I’ve hinted at this before, but understanding how you, personally, usually fail is a powerful technique for debugging your own stuck points. I get sloppy, I drop terms from equations, I misunderstand conditions, I overcomplicate solutions, I grind against problems where I should ask for help, I rabbithole on analytical exploration, and I always underestimate the time it will take for me to make the most basic progress.

Know your weaknesses. Then you can work those weak mental muscles, or work around them to build complementary strengths – the way Richard Feynman would always check over an equation when he was done, looking for those places where he had flipped a sign.

Back to work!

-the Centaur

Pictured: my “stack” at a typical lunch. I’ll usually get to one out of three of the things I bring for myself to do. Never can predict which one though.

Nailed It (Sorta)

Here’s what was in the rabbit hole from last time (I had been almost there):

I had way too much data to exploit, so I started to think about culling it, using the length of the “mumbers” to cut off all the items too big to care about. That led to the key missing insight: my method of mapping mumbers sent the first digit of each item to the same angle – that is, 9, 90, 900, and 9000 all lay on the same ray, just further out. That distance was already a logarithm of the number, but once I dropped my resistance to taking the logarithm twice…
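
For the curious, the mapping idea looks something like this. My original code was in Mathematica; this Python sketch, with names of my own invention, is a reconstruction of the idea rather than the actual code: the leading digit picks the angle, and the double logarithm picks the radius.

```python
# A reconstruction of the mumber-mapping idea (the original was in
# Mathematica; these names are mine). Leading digit -> angle;
# logarithm taken twice -> radius.
import math

def mumber_point(n):
    """Map a mumber to (x, y): same first digit, same angle."""
    first_digit = int(str(n)[0])
    angle = 2 * math.pi * first_digit / 10   # ten spokes, one per digit
    radius = 1 + math.log(1 + math.log(n))   # log twice, kept positive
    return (radius * math.cos(angle), radius * math.sin(angle))

# 9, 90, 900, and 9000 share an angle and creep slowly outward:
for n in (9, 90, 900, 9000):
    print(n, mumber_point(n))
```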

… then I could create a transition plot function which worked for almost any mumber in the sets of mumbers I was playing with …

Then I could easily visualize the small set of transitions – “mumbers” with 3 digits – that yielded the graph above; for reference these are:

The actual samples I wanted to play with were larger, like this up to 4 digits:

This yields a still visible graph:

And this, while it doesn’t let me visualize the whole space I wanted, does provide the insight I was after. The “mumbers” up to 10000 do indeed “produce” most of the space of the smaller “mumbers” (not surprising, as the “mumber” rule 2XYZ produces XYZ, and 52XY produces XYXY … meaning most numbers in the first 10,000 will be produced by one in that first set). But it also shows that sequences of 52-rule transitions on the left produce a few very, very large mumbers – probably because 552552 produces 552552552552, which produces 552552552552552552552552552552552552, which quickly zooms away to the “mumberOverflow” value at the top of my chart.
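
If you want to see that blowup yourself, here’s a sketch of the two rules as I’ve quoted them – reconstructed from the examples above (2X produces X; 5X produces YY whenever X produces Y), so treat it as my reading of Smullyan’s machine, not the complete rule set:

```python
# The two mumber rules, reconstructed from the examples in the text:
# 2X produces X, and 5X produces YY whenever X produces Y. This is my
# reading of Smullyan's machine, not the complete rule set.
def produce(mumber):
    """Apply the production rules to a mumber, given as a string."""
    if mumber.startswith("2"):
        return mumber[1:]
    if mumber.startswith("5"):
        tail = produce(mumber[1:])
        return tail + tail if tail else None
    return None  # no rule applies

# Watch 552552 zoom off toward "mumberOverflow":
m = "552552"
for _ in range(3):
    m = produce(m)
    print(f"{len(m)} digits: {m[:24]}{'...' if len(m) > 24 else ''}")
```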

And now the next lesson: finishing up this insight, which more or less closes out what I wanted to explore here, took 45 minutes. I had allotted 15 to do various computer tasks before leaving Aqui, and I’m already 30 minutes over that … which suggests, again, that you should be careful going down rabbit holes; unlike leprechaun trails, there isn’t likely to be a pot of gold down there, and who knows how far down it can go?

-the Centaur

P.S. I am not suggesting this time was not well spent; I’m just trying to understand the opportunity cost of different problem-solving strategies so I can become more efficient.

Don’t Fall Into Rabbit Holes

SO! There I was, trying to solve the mysteries of the universe, learn about deep learning, and teach myself enough puzzle logic to create credible puzzles for the Cinnamon Frost books, and I found myself debugging the fine details of a visualization system I’d developed in Mathematica to analyze the distribution of problems in an odd middle chapter of Raymond Smullyan’s The Lady or the Tiger.

I meant well! Really I did. I was going to write a post about how finding a solution is just a little bit harder than you normally think, and how insight sometimes comes after letting things sit.

But the tools I was creating didn’t do what I wanted, so I went deeper and deeper down the rabbit hole trying to visualize them.

The short answer seems to be that there’s no “there” there and that further pursuit of this sub-problem will take me further and further away from the real problem: writing great puzzles!

I learned a lot – about numbers, about how things could combinatorially explode, about Ulam Spirals and how to code them algorithmically. I even learned something about how I, particularly, fail in these cases.

But it didn’t provide the insights I wanted. Feynman warned about this: he called it “the computer disease”, worrying about the formatting of the printout so much you forget about the answer you’re trying to produce, and it can strike anyone in my line of work.

Back to that work.

-the Centaur

Learning to Drive … by Learning Where You Can Drive

I often say “I teach robots to learn,” but what does that mean, exactly? Well, now that one of the projects I’ve worked on has been announced – and I mean not just on arXiv, the public-access scientific repository where all the hottest reinforcement learning papers are shared, but actually accepted into the ICRA 2018 conference – I can tell you all about it!

When I’m not roaming the corridors hammering infrastructure bugs, I’m trying to teach robots to roam those corridors – a problem we call robot navigation. Our team’s latest idea combines “traditional planning,” where the robot tries to navigate based on an explicit model of its surroundings, with “reinforcement learning,” where the robot learns from feedback on its performance.

For those not in the know, “traditional” robotic planners use structures like graphs to plan routes, much the same way a GPS uses a roadmap. One of the more popular methods for long-range planning is the probabilistic roadmap, which builds a long-range graph by picking random points and attempting to connect them with a simpler “local planner” that knows how to navigate shorter distances. It’s a little like how you learn to drive in your neighborhood – starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what.
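
As an illustration of the idea (and only the idea – this is not our production code), here’s a toy roadmap builder in Python whose “local planner” is just a distance check; a real local planner would also check collisions and robot dynamics:

```python
# A toy probabilistic roadmap (PRM) sketch, illustrative only. Sample
# random points, then let a simple "local planner" decide which pairs
# of points it can connect; long routes then come from graph search.
import math
import random

def local_planner_can_connect(a, b, max_range=2.0):
    """Stand-in local planner: succeed if the points are close enough.
    A real one would also check collisions and robot dynamics."""
    return math.dist(a, b) <= max_range

def build_prm(num_samples=50, world_size=10.0):
    nodes = [(random.uniform(0, world_size), random.uniform(0, world_size))
             for _ in range(num_samples)]
    edges = [(i, j)
             for i in range(len(nodes))
             for j in range(i + 1, len(nodes))
             if local_planner_can_connect(nodes[i], nodes[j])]
    return nodes, edges

nodes, edges = build_prm()
print(f"{len(nodes)} nodes, {len(edges)} edges")
```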

But for that to work, you have to know how to drive, and that’s where the local planner comes in. Building a local planner is simple in theory – you can write one for a toy world in a few dozen lines of code – but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called “navigation stacks” and can contain dozens of components – and in my experience they’re hard to get working, and even when you do, they’re often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings.

People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the-art algorithm called DDPG (Deep Deterministic Policy Gradients), pioneered by DeepMind, to learn a navigation system that could successfully travel several meters in office-like environments.
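
DDPG itself needs neural networks and replay buffers, but the core feedback loop – try actions, favor what worked, suppress what didn’t – is small enough to sketch. Here’s a minimal tabular Q-learning example of that loop (a much simpler algorithm than DDPG, shown only to make the idea concrete):

```python
# Minimal tabular Q-learning on a one-dimensional corridor: the agent
# learns to walk right toward the goal purely from reward feedback.
# DDPG extends this same try/favor/suppress loop to continuous actions
# using neural networks.
import random
from collections import defaultdict

N_STATES, GOAL, ACTIONS = 6, 5, (-1, +1)  # corridor cells; move left/right
q = defaultdict(float)                    # Q[(state, action)] -> value
alpha, gamma, epsilon = 0.5, 0.9, 0.1     # learning rate, discount, exploration

for episode in range(500):
    state = 0
    while state != GOAL:
        if random.random() < epsilon:     # explore sometimes...
            action = random.choice(ACTIONS)
        else:                             # ...otherwise act on what worked
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else -0.01
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next
                                       - q[(state, action)])
        state = next_state

# After training, the learned policy should point right in every cell:
print([max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)])
```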

But there’s a further wrinkle: the so-called “reality gap”. By necessity, the local planner used by a probabilistic roadmap is simulated – it attempts to connect points on a map. That simulated local planner isn’t identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map that it can’t navigate safely in the real world. This can have disastrous consequences – causing robots to tumble down stairs, or, worse, when people follow their GPSes too closely without looking where they’re going, causing cars to tumble off the end of a bridge.

Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap’s local planner and the robot’s navigation system. Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of attempting to execute its plans in the real world.
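
Schematically, the difference from an ordinary PRM is one test: an edge goes into the roadmap only if the same RL policy that will drive the robot can complete the traversal in simulation. In this sketch the simulator.rollout interface is a hypothetical stand-in, not our actual API:

```python
# Schematic of the PRM-RL edge test (illustrative; simulator.rollout is
# a hypothetical interface). Instead of a geometric check, roll out the
# actual RL navigation policy in simulation, and add the edge only if
# the policy reliably reaches the goal.
def rl_can_traverse(policy, simulator, start, goal, attempts=5, required=4):
    successes = sum(
        1 for _ in range(attempts)
        if simulator.rollout(policy, start, goal).reached_goal
    )
    return successes >= required

# In the toy build_prm above, this test would replace
# local_planner_can_connect(a, b) as the edge criterion.
```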

In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than the “straight-line” local planner that was our default alternative. While I didn’t happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot’s behavior wasn’t as good as we hoped, we traced the problem to a networking issue that was adding a delay to commands sent to the robot, not to our code itself; we’ll fix this in a subsequent round).

This work includes both our group working on office robot navigation – Aleksandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson – and Aleksandra’s collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper. Until the ICRA version comes out, you can find the preliminary version on arXiv:

https://arxiv.org/abs/1710.03937
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning

We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.


So, when I say “I teach robots to learn” … that’s what I do.

-the Centaur

My Daily Dragon Interview in Two Words: “Just Write!”

So at Dragon Con I had a reading this year. Yeah, looks like this is the last year I get to bring all my books – too many, too heavy! I read the two flash fiction pieces in Jagged Fragments, “If Looks Could Kill” and “The Secret of the T-Rex’s Arms”, as well as the first chapter of Jeremiah Willstone and the Clockwork Time Machine, a bit of my and Jim Davies’ essay on the psychology of Star Trek’s artificial intelligences, and even a bit of my very first published story, “Sibling Rivalry”. I also gave the presentation I was supposed to give at the SAM Talks before I realized I was double-booked: “Risk Getting Worse”.

But that wasn’t recorded, so, oh dang, you’ll have to either go to my Amazon page to get my books or wait until we get “Risk Getting Worse” recorded. My interview with Nancy Northcott for the Daily Dragon, “Robots, Computers, and Magic”, however, IS online, so I can share it with you all. Even more, I want to share what I think is the most important part of my interview:

DD: Do you have any one bit of advice for aspiring writers?

AF: Write. Just write. Don’t worry about perfection, or getting published, or even about pleasing anyone else: just write. Write to the end of what you start, and only then worry about what to do with it. In fact, don’t even worry about finishing everything—don’t be afraid to try anything. Artists know they need to fill a sketchbook before sitting down to create a masterwork, but writers sometimes get trapped trying to polish their first inspiration into a final product.

Don’t get trapped on the first hill! Whip out your notebook and write. Write morning pages. Write diary at the end of the day. Write a thousand starts to stories, and if one takes flight, run with it with all the abandon you have in you. Accept all writing, especially your own. Just write. Write.

That’s it. To read more, check out the interview here, or see all my Daily Dragon mentions at Dragon Con here, or check out my interviewer Nancy Northcott’s site here. Onward!

-the Centaur



What is Artificial Intelligence?


Simply put, “artificial intelligence” is people trying to make things do things that we’d call smart if done by people.

So what’s the big deal about that?

Well, as it turns out, a lot of people get quite wound up over the definition of “artificial intelligence.” Sometimes that’s because they’re invested in the prescientific notion that machines can’t be intelligent, and want to define the term in a way that writes the field off before it gets started; sometimes it’s because they’re invested, to an unscientific degree, in their particular theory of intelligence, and want to define it in a way that constrains the field to look at only the things they care about; and sometimes it’s because they’re not actually interested in science at all, and want to restrict the field to the practical problems of particular interest to them.

No, I’m not bitter about having to wade through a dozen bad definitions of artificial intelligence as part of a survey. Why do you ask?

The Eagle Has Landed


Welp, that was anticlimactic! Thanks, God, for a smooth update to WordPress 4.7.3! (And thanks to the WordPress team for maintaining backwards compatibility.) And hey, look – the Library has close to 1,000 posts!


Expect major site updates in the months to come, as WordPress’s Themes and Pages now let me do things I formerly could do only with static, hand-coded pages – and it will all be backed up more easily thanks to WordPress’s Jetpack plugin.

The things you learn helping other people with their web sites ….

-the Centaur