Posts tagged as “Engineering the Robot Apocalypse”

Announcing the Embodied AI Workshop #4 at CVPR 2023

centaur 0

Hey folks, I am proud to announce the 4th annual Embodied AI Workshop, held once again at CVPR 2023! EAI is a multidisciplinary workshop bringing together computer vision researchers, machine learning researchers and roboticists to study the problem of creating intelligent systems that interact with their worlds.

For a highlight of previous workshops, see our Retrospectives paper. This year, EAI #4 will feature dozens of researchers, over twenty participating institutions, and ten distinct embodied AI challenges. Our three main themes for this year's workshop are:

  • Foundation Models: large, pretrained models that can solve many tasks few-shot or zero-shot
  • Generalist Agents: agents capable of solving a wide variety of problems
  • Sim to Real Transfer: learning in simulation but deploying in reality.

We will have presentations from all the challenges discussing their tasks, progress in the community, and winning approaches. We will also have six speakers on a variety of topics, and at the end of the workshop I'll be moderating a panel discussion among them.

I hope you can join us, in person or virtually, at EAI #4 at CVPR 2023 in Vancouver!

-the Centaur

[twenty-eight] minus twenty: re-ju-ven-ate!

centaur 0

Oh, look, it's a Dalek acting as a security guard! Nothing can go wrong with this trend. :-/

Though, as a roboticist seeing this gap between terminals, I can't help but wonder whether it just undocked from its charger, whether it is about to dock with its charger, whether it needs help from a human to dock with its charger, or whether it has failed to dock with its charger and is about to run out of power in the dark and the cold where all the wolves are.

-the Centaur

Announcing Logical Robotics

centaur 0

So, I'm proud to announce my next venture: Logical Robotics, a robot intelligence firm focused on making learning robots work better for people. My research agenda is to combine the latest advances of deep learning with the rich history of classical artificial intelligence, using human-robot interaction research and my years of experience working on products and benchmarking to help robots make a positive impact.

Recent advances in large language model planning, combined with deep learning of robotic skills, have enabled almost magical developments in explainable artificial intelligence, where it is now possible to ask robots to do things in plain language and for the robots to write their own programs to accomplish those goals, building on deep learned skills but reporting results back in plain language. But applying these technologies to real problems will require a deep understanding of both robot performance benchmarks to refine those skills and human psychological studies to evaluate how these systems benefit human users, particularly in the areas of social robotics where robots work in crowds of people.

Logical Robotics will begin accepting new clients in May, after my obligations to my previous employer have come to a close (and I have taken a break after 17 years of work at the Search Engine That Starts With a G). In the meantime, I am available to answer general questions about what we'll be doing; if you're interested, please feel free to drop me a line via centaur at logicalrobotics.com or take a look at our website.

-the Centaur

Robots in Montreal

centaur 1
A cool hotel in old Montreal.

"Robots in Montreal," eh? Sounds like the title of a Steven Moffat Doctor Who episode. But it's really ICRA 2019 - the IEEE International Conference on Robotics and Automation, and, yes, there are quite a few robots!

Boston Dynamics quadruped robot with arm and another quadruped.

My team presented our work on evolutionary learning of rewards for deep reinforcement learning, AutoRL, on Monday. In an hour or so, I'll be giving a keynote on "Systematizing Robot Navigation with AutoRL":

Keynote: Dr. Anthony Francis
Systematizing Robot Navigation with AutoRL: Evolving Better Policies with Better Evaluation

Abstract: Rigorous scientific evaluation of robot control methods helps the field progress towards better solutions, but deploying methods on robots requires its own kind of rigor. A systematic approach to deployment can do more than just make robots safer, more reliable, and more debuggable; with appropriate machine learning support, it can also improve robot control algorithms themselves. In this talk, we describe our evolutionary reward learning framework AutoRL and our evaluation framework for navigation tasks, and show how improving evaluation of navigation systems can measurably improve the performance of both our evolutionary learner and the navigation policies that it produces. We hope that this starts a conversation about how robotic deployment and scientific advancement can become better mutually reinforcing partners.

Bio: Dr. Anthony G. Francis, Jr. is a Senior Software Engineer at Google Brain Robotics specializing in reinforcement learning for robot navigation. Previously, he worked on emotional long-term memory for robot pets at Georgia Tech's PEPE robot pet project, on models of human memory for information retrieval at Enkia Corporation, and on large-scale metadata search and 3D object visualization at Google. He earned his B.S. (1991), M.S. (1996) and Ph.D. (2000) in Computer Science from Georgia Tech, along with a Certificate in Cognitive Science (1999). He and his colleagues won the ICRA 2018 Best Paper Award for Service Robotics for their paper "PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning". He's the author of over a dozen peer-reviewed publications and is an inventor on over a half-dozen patents. He's published over a dozen short stories and four novels, including the EPIC eBook Award-winning Frost Moon; his popular writing on robotics includes articles in the books Star Trek Psychology and Westworld Psychology, as well as a Google AI blog article titled Maybe your computer just needs a hug. He lives in San Jose with his wife and cats, but his heart will always belong in Atlanta. You can find out more about his writing at his website.

Looks like I'm on in 15 minutes! Wish me luck.

-the Centaur

 

PRM-RL Won a Best Paper Award at ICRA!

centaur 2
So, this happened! Our team's paper on "PRM-RL" - a way to teach robots to navigate their worlds which combines human-designed algorithms that use roadmaps with deep-learned algorithms to control the robot itself - won a best paper award at the ICRA robotics conference!

I talked a little bit about how PRM-RL works in the post "Learning to Drive ... by Learning Where You Can Drive", so I won't go over the whole spiel here. The basic idea is that we've gotten good at teaching robots to control themselves using a technique called deep reinforcement learning (the RL in PRM-RL) that trains them in simulation, but it's hard to extend this approach to long-range navigation problems in the real world. We overcome this barrier by using a more traditional robotic approach, probabilistic roadmaps (the PRM in PRM-RL), which build maps of where the robot can drive using point-to-point connections. We combine these maps with the robot simulator and, boom, we have a map of where the robot thinks it can successfully drive.

We were cited not just for this technique, but for testing it extensively in simulation and on two different kinds of robots. I want to thank everyone on the team - especially Sandra Faust for her background in PRMs and for taking point on the idea (and doing all the quadrotor work with Lydia Tapia), Oscar Ramirez and Marek Fiser for their work on our reinforcement learning framework and simulator, Kenneth Oslund for his heroic last-minute push to collect the indoor robot navigation data, and our manager James for his guidance, contributions to the paper and support of our navigation work.

Woohoo! Thanks again everyone!

-the Centaur

Learning to Drive … by Learning Where You Can Drive

centaur 1
I often say "I teach robots to learn," but what does that mean, exactly? Well, now that one of the projects that I've worked on has been announced - and I mean, not just on arXiv, the public access scientific repository where all the hottest reinforcement learning papers are shared, but actually accepted into the ICRA 2018 conference - I can tell you all about it!

When I'm not roaming the corridors hammering infrastructure bugs, I'm trying to teach robots to roam those corridors - a problem we call robot navigation. Our team's latest idea combines "traditional planning," where the robot tries to navigate based on an explicit model of its surroundings, with "reinforcement learning," where the robot learns from feedback on its performance.

For those not in the know, "traditional" robotic planners use structures like graphs to plan routes, much in the same way that a GPS uses a roadmap. One of the more popular methods for long-range planning is the probabilistic roadmap, which builds a long-range graph by picking random points and attempting to connect them with a simpler "local planner" that knows how to navigate shorter distances. It's a little like how you learn to drive in your neighborhood - starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what.

But for that to work, you have to know how to drive, and that's where the local planner comes in. Building a local planner is simple in theory - you can write one for a toy world in a few dozen lines of code - but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called "navigation stacks" and can contain dozens of components - and in my experience they're hard to get working and even when you do, they're often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings.
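
As a rough illustration of the roadmap idea described above, here is a toy sketch in Python. It is not the code from the paper: the world (a unit square with one circular obstacle), the straight-line local planner, and every name and parameter below are assumptions made up for the example.

```python
import heapq
import math
import random

# Hypothetical toy world: the unit square with circular obstacles (x, y, radius).
OBSTACLES = [(0.5, 0.5, 0.2)]


def is_free(p):
    """True if point p is inside the unit square and outside every obstacle."""
    x, y = p
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        return False
    return all(math.hypot(x - ox, y - oy) > r for ox, oy, r in OBSTACLES)


def straight_line_planner(a, b, step=0.01):
    """Simplest possible local planner: check sample points along segment a-b."""
    n = max(1, int(math.dist(a, b) / step))
    return all(
        is_free((a[0] + (b[0] - a[0]) * i / n, a[1] + (b[1] - a[1]) * i / n))
        for i in range(n + 1)
    )


def build_roadmap(num_samples, radius, local_planner, extra_nodes=(), seed=0):
    """Sample random free points and link nearby pairs the local planner accepts."""
    rng = random.Random(seed)
    nodes = list(extra_nodes)
    while len(nodes) < num_samples + len(extra_nodes):
        p = (rng.random(), rng.random())
        if is_free(p):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and local_planner(nodes[i], nodes[j]):
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges


def shortest_path(edges, start, goal):
    """Dijkstra's algorithm over the roadmap; returns node indices or None."""
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        cost, u = heapq.heappop(queue)
        if u == goal:
            path = [u]
            while path[-1] != start:
                path.append(parent[path[-1]])
            return path[::-1]
        if cost > best.get(u, math.inf):
            continue  # stale queue entry
        for v, w in edges[u]:
            if cost + w < best.get(v, math.inf):
                best[v] = cost + w
                parent[v] = u
                heapq.heappush(queue, (cost + w, v))
    return None
```

To try it, something like `nodes, edges = build_roadmap(200, 0.2, straight_line_planner, extra_nodes=[(0.1, 0.1), (0.9, 0.9)])` followed by `shortest_path(edges, 0, 1)` asks for a route between two opposite corners that has to go around the obstacle. Whether a route is found depends on the random samples, which is the "probabilistic" part.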
People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the-art algorithm called DDPG (Deep Deterministic Policy Gradients) pioneered by DeepMind to learn a navigation system that could successfully travel several meters in office-like environments.

But there's a further wrinkle: the so-called "reality gap". By necessity, the local planner used by a probabilistic roadmap is simulated - attempting to connect points on a map. That simulated local planner isn't identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map which it can't navigate safely in the real world. This can have disastrous consequences - causing robots to tumble down stairs, or, worse, when people follow their GPSes too closely without looking where they're going, causing cars to tumble off the end of a bridge.

Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap's local planner and the robot's navigation system. Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of successfully executing its plans in the real world. In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than a "straight-line" local planner which was our default alternative.
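
The key step, adding a link only if the controller itself can drive it, can be sketched in a few lines of Python. This is a toy under loud assumptions, not the paper's implementation: the "policy" below is a hand-written stand-in that heads toward the goal with noisy steering rather than a trained DDPG agent, and the trial count, success threshold, and all names are hypothetical.

```python
import math
import random

# Hypothetical toy world: unit square with one circular obstacle (x, y, radius).
OBSTACLES = [(0.5, 0.5, 0.2)]


def is_free(p):
    """True if point p is inside the unit square and outside every obstacle."""
    x, y = p
    inside = 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0
    return inside and all(math.hypot(x - ox, y - oy) > r for ox, oy, r in OBSTACLES)


def noisy_policy_rollout(start, goal, rng, noise=0.2, step=0.02, max_steps=200):
    """Stand-in for a learned policy: steer at the goal with Gaussian heading noise.

    Returns True if the simulated robot gets within one step of the goal
    without leaving free space or exhausting its step budget.
    """
    x, y = start
    for _ in range(max_steps):
        if math.dist((x, y), goal) <= step:
            return True
        heading = math.atan2(goal[1] - y, goal[0] - x) + rng.gauss(0.0, noise)
        x, y = x + step * math.cos(heading), y + step * math.sin(heading)
        if not is_free((x, y)):
            return False  # collision or left the map
    return False


def edge_is_traversable(a, b, rng, noise=0.2, trials=20, min_success=0.9):
    """Accept a link only if enough simulated rollouts from a reach b."""
    successes = sum(noisy_policy_rollout(a, b, rng, noise) for _ in range(trials))
    return successes / trials >= min_success


def build_policy_roadmap(nodes, radius, noise=0.2, seed=0):
    """Link nearby nodes, keeping only links the stand-in policy can drive."""
    rng = random.Random(seed)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            close = math.dist(nodes[i], nodes[j]) <= radius
            if close and edge_is_traversable(nodes[i], nodes[j], rng, noise):
                edges[i].append(j)
                edges[j].append(i)
    return edges
```

The design point is that the roadmap is checked by the same thing that will later drive the robot, so a link the controller tends to fumble never makes it into the map. With nonzero `noise` the accepted links depend on the random seed; setting `noise=0.0` makes the rollouts deterministic, which is handy for checking the logic.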
While I didn't happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot's behavior wasn't as good as we hoped, we debugged that to a networking issue that was adding a delay to commands sent to the robot, and not in our code itself; we'll fix this in a subsequent round).

This work includes both our group working on office robot navigation - including Alexandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson - and Alexandra's collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper. Until the ICRA version comes out, you can find the preliminary version on arXiv:

https://arxiv.org/abs/1710.03937 PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning

We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.
So, when I say "I teach robots to learn" ... that's what I do.

-the Centaur

Viiictory the Fifteenth

centaur 0

Once again, I’ve completed the challenge of writing 50,000 words in a month as part of the National Novel Writing Month challenges - this time, the July 2016 Camp Nanowrimo, and the next 50,000 words of Dakota Frost #5, PHANTOM SILVER!


Phantom Silver v2 Small.png

This is the reason that I’ve been so far behind on posting on my blog - I simultaneously was working on four projects: edits on THE CLOCKWORK TIME MACHINE, writing PHANTOM SILVER, doing publishing work for Thinking Ink Press, and doing my part at work-work to help bring about the robot apocalypse (it’s busy work, let me tell you). So busy that I didn’t even blog successfully getting TCTM back to the editor. Add to that a much needed old-friends recharge trip to Tahoe kicking off the month, and I ended up more behind than I’ve ever been … at least, as far as I’ve been behind, and still won:

Camp Nano 2016 July 31b.png

What did I learn this time? Well, I can write over 9,000 words a day, though the text often contains more outline than story; I will frequently stop and do GMC (Goal Motivation Conflict) breakdowns of all the characters in the scene and just leave it in the document as paragraphs of italicized notes, because Nano - I can take it out later, it’s word count now now now! That’s how you get five times a normal word count in a day, or 500+ times the least productive day in which I actually wrote something.

Camp Nano 2016 July 31c.png

Also, I get really really really sloppy - normally I wordsmith what I write as I write, even in Nano - but that’s when I have the luxury of writing 1000-2000 words a day. When I have to write 9000, I write things like “I want someoent bo elive this whnen ai Mideone” and just keep going, knowing that I can correct the text later to “I want someone to believe this when I am done,” and, more importantly, can use the idea behind that text to craft a better scene on the next draft (in this case, Dakota’s cameraman Ron is filming a bizarre event in which someone’s life is at stake, and when challenged by a bystander he challenges back, saying that he doesn’t have any useful role to fill, but he can at least document what’s happening so they’ll all be believed later).

Camp Nano 2016 July 31d.png

The other thing is, what I am starting to call The Process actually seems to work. I put characters in situations. I think through how they would react, using Goal Motivation Conflict to pull out what they want, why they want it, and why they can’t get it (a method recommended by my editor Debra Dixon in her GMC book). But the critical part of my Process is, when I have to go write something that I don’t know, I look it up - in a lot of detail. Yes, Virginia, even when I was writing 9,000+ words a day, I still went on Wikipedia - and I don’t regret it. Why? Because when I’m spewing around trying to make characters react like they’re in a play, the characters are just emoting, and the beats, no matter how well motivated, could get replaced by something else.

But when it strikes me that the place my characters are about to visit looks like a basilica, I can do more than just write “basilica.” I can ask myself why I chose that word. I can look up the word “basilica” on Apple’s Dictionary app. I can drill through to famous basilicas like the Basilica of Saint Peter. I can think about how this place will be different from that, and start pulling out telling details. I can start to craft a space, to create staging, to create an environment that my characters can react to. Because emotions aren’t just inside us, or between us; they’re for something, for navigating this complex world with other humans at our side. If a group of people argues, no matter how charged, it’s just a soap opera. Put them in their own Germanic/Appalachian heritage family kitchen in the Dark Corner of South Carolina, or on the meditation path near an onsen run continuously by the same family for 42 generations, and the same argument can have a completely different ambiance - and completely different reactions.

The text I wrote using my characters reacting to the past plot, or even with GMC, may likely need a lot of tweaking: the point was to get them to a particular emotional, conceptual or plot space. The text I wrote with the characters reacting to things that were real, even if it needs tweaking, often crackles off the page, even in very rough form. It’s material I won’t want to lose - more importantly, material I wouldn’t have produced, if I hadn’t pushed myself to do National Novel Writing Month.

Up next, finishing a few notes and ideas - the book is very close to done - and then diving into contracts for Thinking Ink Press, and reinforcement learning policy gradients for the robot apocalypse, all while waiting for the shoe to drop on TCTM. Keep your fingers crossed that the book is indeed on its way out!

-the Centaur

Conversations Ongoing

centaur 2

So recently I posted an article about the ongoing debate on AI - something of very great interest to me - and my very good friend Jim Davies posted the following comment (getting it down to the gist):

So we have an interesting problem of customers wanting the ethical decisions [made by AI] to be a more public, open discussion, perhaps done by ethics experts, and the reality is that the programmers are doing the deciding behind closed doors. Is it satisfying for the rest of us to say merely that we’re confident that the engineers are thinking and talking about it all the time, deep in Google’s labs where nobody can hear them?

There are some interesting things to unpack - for example, whether there really are such things as ethics experts, and whether ethical decisions should be made by the public or by individuals.

Personally, as an ex-Catholic who once thought of going into the priesthood, and as an AI researcher who thinks about ethics quite carefully, I believe most so-called ethical experts are actually not (and for sake of argument, I’ll put myself in that same bin). For example, philosopher Peter Singer is often cited as an ethical expert, but several of his more prominent positions - e.g., opposing the killing of animals while condoning the killing of infants - undermine the sanctity of human life, a position he admits; so the suggestion that ethics experts should be making these decisions seems extraordinarily hazardous to me. Which experts?

Similarly, I don’t think ethical decisions in engineered systems should be made by the public, but I do think safety standards should be set consistent with our democratic, constitutional process - by which I mean, ethical standards should reflect the will of the people being governed, consistent with constitutional safeguards for the rights of the minority. Car safety and airplane safety are good examples of this policy; as I understand the law, the government is not (in general) making actual decisions about how car makers and airplane makers need to meet safety standards - that is, not making decisions about which metals or strut designs keep a vehicle safe - but is instead creating a safety framework within which a variety of approaches could be implemented.

There’s a lot to discuss there.

But one thing that still bugs me about this is the idea that engineers are talking about this deep in corporate labs where no-one can hear them. I mean, they are having those conversations. But some of those same engineers are saying things publicly - Peter Norvig, a Director of Research at Google, has an article in the recent What to Think About Machines that Think, and some other Googler is writing this very blog post.

But my experience is that software engineers and artificial intelligence researchers are talking about this all the time - to each other, in hallways at GDC, over dinner, with friends - as far back as I can remember.

So I guess what’s really bothering me is, if we’re talking about it all the time, why does nobody seem to be listening? And why do people keep on saying that we’re not talking about it, or that we’re not thinking about it, or that we’re clearly not talking about it or thinking about it to the degree that the talking and thinking we’re not doing should be taken away from us?

-the Centaur