A predator has landed on me. Send … heeellllp …
So, this happened! Our team’s paper on “PRM-RL” – a way to teach robots to navigate their worlds which combines human-designed algorithms that use roadmaps with deep-learned algorithms to control the robot itself – won a best paper award at the ICRA robotics conference!
I talked a little bit about how PRM-RL works in the post “Learning to Drive … by Learning Where You Can Drive”, so I won’t go over the whole spiel here. The basic idea is that we’ve gotten good at teaching robots to control themselves using a technique called deep reinforcement learning (the RL in PRM-RL) that trains them in simulation, but it’s hard to extend this approach to long-range navigation problems in the real world. We overcome this barrier by using a more traditional robotic approach, probabilistic roadmaps (the PRM in PRM-RL), which build maps of where the robot can drive using point-to-point connections. We combine these maps with the robot simulator and, boom, we have a map of where the robot thinks it can successfully drive.
We were cited not just for this technique, but for testing it extensively in simulation and on two different kinds of robots. I want to thank everyone on the team – especially Sandra Faust for her background in PRMs and for taking point on the idea (and doing all the quadrotor work with Lydia Tapia), Oscar Ramirez and Marek Fiser for their work on our reinforcement learning framework and simulator, Kenneth Oslund for his heroic last-minute push to collect the indoor robot navigation data, and our manager James for his guidance, contributions to the paper and support of our navigation work.
Woohoo! Thanks again everyone!
Hail, fellow adventurers! And now you know why you haven’t heard from me for a while: I was heads down finishing my wordcount for Camp Nanowrimo! And this is a very special one, because it marks the twentieth time I have won a National Novel Writing Month style challenge to write 50,000 words of a novel in a month! Woohoo! When I started, I never thought I’d finish this many!
This was a difficult month for it. Sure, I just finished early, but that final push involved locking myself in a downstairs room with my laptop until I finished so I could enjoy the rest of my vacation with my wife. And the push up to this point has been hard: my wife returning from vacation, with me scrambling to finish a spring cleaning gone awry before she got home. A cat being treated for cancer. An organization I’m volunteering with had an emergency that involved multiple meetings over the month. Major shifts and dustups at work. Robots, on the loose, being chased down the corridors. Ok, that last one isn’t real. Well, actually, it was, but it was much, much, much more prosaic than it sounds.
The upshot, seen above, is blood on the water (behind on my wordcount) for most of the month. And with the very last weekend of the month being my long-planned vacation in Monterey with my wife before she flies out on her next business trip, there was a very real danger that I wouldn’t make it. But my wife is awesome, and tolerated me taking out this first evening to do a massive push to get all my words done!
And now, sleep. But first, an excerpt:
“The Ere Mother is … not the most dangerous enemy I’ve ever faced,” I says. “Actually, she doesn’t rate really highly compared to the thing we found in the Vault of Nightmares, which was the real source of the magic that tried to burn down this city, Lady Scara—not me. But the Ere Mother is terribly dangerous, that I admit, Magus Meredith, Elder Jackson-Monarch. She’s terribly dangerous. But I did not ‘unleash’ her on the city. I went where my leadership told me to go and did what they told me to do, and the bottom dropped out under me. Yes, she came to life when I fell into the chambers of her court, but I strongly doubt that she was brought to life by a magic tiger butt. As unstable as that structure was—and it was still subsiding from time to time—the Ere Mother could have been unleashed at anytime, and we’d know even less about her than we do because I was down there investigatin’—as you all asked me to.”
I stands there, quietly.
“OH!” I says. “Um, yeah. That’s … that’s my report.”
“Well,” Mom says. “Thank you, First Mage, for your testimony—”
“Chair Frost?” Meredith says, raising his hand politely. “Are questions allowed?”
Mom blinks. “Always, as long as we maintain order. You have the floor.”
“Shoot,” I says. “Not literally—”
“How do you know the structure was still subsiding?” asked Meredith.
I stares at him. The hair rises on the back of my head. I thinks very, very fast.
“I heard it from the remaining member of the Dire Court,” I says. “A fox changeling, er, proto-fox changeling, at least I assume it was a changeling—er, anyway, we spoke, briefly, before the Ere Mother attacked. He mentioned a subsidence that, um.”
“Yes?” Meredith says, eyes gleaming.
“That, ah, uncovered his eye, so he wasn’t stuck in the dark anymore,” I says quietly. Meredith’s face falls, with true horror. “There was light down there, from runes. But after the Ere Mother’s attack … I don’t think there’s anything left of the fox fae anymore.”
“That’s … horrible,” Meredith says. “Do you remember what else you spoke about?”
“I will try to reconstruct a transcript. Mostly, he said shit like, ‘Oh, God’, and ‘Don’t hurt me.’” Somebody laughs, and I idly turns towards them and says, “Hey, I was pretty scared. You wanna be pretty scared too, I can always Change into what I looked like down there.”
“Cinnamon Stray Foundling Frost,” Mom says sternly, “if you eat anyone at this Council, you’re grounded!”
“Yes, Mom,” I says.
Ah, Cinnamon. You and your wacky hijinks with ancient faerie changelings!
Now … zzzzz…
In theory, mast cell tumors of the skin don’t kill cats, at least not directly. They can lead to lesions that can’t heal and further infections, but it’s MCT of the spleen or gastrointestinal tract that is really dangerous.
For Lenora, our precious little wimp cat, this cancer is aggressive enough that we may need to take proactive steps. She’s gone from one lump to 10 to 30 to 40 to 50 to 70, with a brief dip back to 40 after her surgery to remove her spleen … but now the MCT has exploded, going from 80 to 100 to probably hundreds at this point, many of them showing lesions and scabs.
The first two combinations of cancer treatments failed; this one does not seem to be having an effect. Lenora is still active, but she no longer wants to spend time indoors, instead choosing to find high spots on the exterior podium or the fence. I think she thinks fleas are eating her alive.
I fear she’s on her way out. I’d love to say “I know” but everything I’ve learned over the years tells me (a) you don’t really know and (b) foreclosing an opportunity in your mind is a precursor to getting it foreclosed in real life. We sometimes like to think that we’re tough minded people making hard decisions in the face of difficult circumstances, but if you’re that guy or gal, I have bad news for you: you’re selling yourself a line of bullshit.
Far too often we get tired of dealing with something and choose to perceive it as hopeless, then take all the bad decisions we need to in order to make the bad outcome we’ve decided upon happen, then tell ourselves “there’s nothing else we could have done.” This is particularly common with cars: cars rarely die until we decide to kill them by not maintaining them. It’s even more common with politics: the other guy’s plan rarely fails on its own until we take steps to sabotage it, just so we can then say “we told you so.”
With your health, or the health of a loved one, what does this translate into? Never give up. Stephen Hawking lasted something like five decades after his doctors told him he’d likely be dead, and he didn’t last that long by crawling into a bed and not fighting every step of the way. Sometimes heroic measures are not called for, but just giving up hope will make things far worse far faster.
So we’re here for you, Lenora, even if you’re on your way out.
Have a scritchy behind the ear. Yes. There you go.
I prefer pictures of food to pictures of myself, but, since my phone stopped charging and started shocking people (along with emitting a lovely BURNING smell) you get old stock footage or Photo Booth for the time being.
And now, the currents:
Why do these things matter? Why should you care? I know some people couldn’t care less about the incessant Facebook updates by people saying where they are and what they are doing. Some people I know even call sharing updates humblebragging as a way of shitshaming people into shutting up. (Hey guys! You know who you are. Message from me to you: Fuck off, kthanksbai.)
Not me. I like seeing people say what they’re up to; I like the birthday wishes on Facebook or the posts by famous writers saying, “ugh, I can has no brain today, here is a picture of a cat”. I still remember after my Aunt Kitty died sharing on Facebook my last picture of her, and all the people I knew who showed up at the funeral only because I had posted it.
It’s human and natural to share with each other what we are doing. It lets each of us know that we aren’t alone dealing with the good or bad. If status updates aren’t the thing you’re into, get off Facebook or Twitter. There’s nothing wrong with that: I know many people have done it and have felt better for doing so.
For me, there are so many people I only stay connected to because we have that instant means of connection. And (ssh: between you and me) there’s always my ulterior motive: the more I write, the better I get at writing, and the more I discover and perfect my own voice. And just about everyone I know who does that just gets more interesting the longer that they do it.
That’s why I’m currently … blogging.
Hit save, then publish.
SO! Hey! GDC and Clockwork Alchemy are over and I’m not dead! (A joke which actually I don’t find that funny given the circumstances, which I’ll dig into in just a moment). Strangely enough, hitting two back-to-back conferences, both of which you participate super heavily in, can take something out of your blog. Who knew?
But I need to get better at blogging, so I thought I’d try something new: a “check-in” in which I try to hit all the same points each time – what am I currently writing, editing, programming, etc? For example, I am currently:
Whew, that’s a lot, and I don’t even think I got them all. Maybe I won’t try to write all of the same “currents” every time, but it was a useful exercise in “find something to blog about without immediately turning it into a huge project.”
But the biggest “current” in my mind is the person I am currently worried about, my good friend and great Game AI developer Dave Mark. Dave is the founder of the GDC AI Summit … but was struck by a car leaving the last sessions at GDC, and still is in the hospital, seriously injured.
More in a moment.
Pictured: Butterysmooooth sashimi at Izakaya Ginji in San Mateo from a few days ago, along with my “Currently Reading” book Theoretical Neuroscience open to the Linear Algebra appendix, when I was “Currently Researching” some technical details of the vector notation of quadratic forms by going through stacks and stacks of books, a question which would have been answered more easily if I had started by looking at the entry for quadratic forms in Wolfram’s MathWorld, had I only known at the start of my search that that was the name for math terms like xᵀWx.
SO RECENTLY I had a very vivid dream in which my veterinarian said to me “There are some cats you save … and some you make comfortable.” I think the context behind that dream is worth a little unpacking, don’t you?
Loki the Loquacious is a cat that we saved. I came home one day to find him yowling and lethargic, sensitive to the touch yet unwilling to move, and feeling bloated; after a brief search online we rushed him to the nearby animal hospital, which quickly diagnosed him with a urinary tract blockage, put him on a catheter, and nursed him back to health.
Now, he hates the urinary tract pet food we feed him and the occasional water droppers when he’s not drinking, but unless this outdoor cat gets too adventurous, he’s probably got a long life ahead of him.
Caesar the Conqueror is a cat that we made comfortable. He’d been made frail by a long battle with a thyroid condition when he decided to start peeing inappropriately indoors, so we had to make him an outdoors cat; but we were able to set up a relatively nice outdoor area for him. But then some nasal obstruction began interfering with his breathing, and he ultimately wheezed himself to death.
We kept him comfortable, of course, until he took a rapid turn downhill, and then we had him peacefully put to sleep in my arms.
As for Lenora the Cat … the jury is still out.
She’s a healthy-looking, happy-looking, active cat, and even though she from time to time got pencil-eraser sized moles, and once even a larger lump on a back leg, they were always benign … until a month ago. Then a new mole appeared, and another, and another, until she had dozens of the tiny, not-itchy, not-bleeding, not-discolored bumps all over her body. We took her to the doctor, who found two more walnut-sized lumps in her abdomen; biopsies revealed these to be mast cell tumors (MCT or mastocytoma).
Our doctor’s recommended regimen – a cortisone shot, followed by prednisone and possibly other medications – tracks with what I’ve been able to research. Cortisone and similar drugs are recommended, and sometimes even can cure it, especially if it’s on the skin; but the prognosis for internal lumps is more guarded – and she’s gotten another lump since the biopsy.
So now we’re researching, weighing the options of continuing treatment vs seeing an oncologist now (our vet is of the opinion that we’d have to wait a few weeks for an oncologist to get good readings on bloodwork because of the cortisone shot, but if I was an oncologist I’d want to see that third lump right now). Cats with this condition can last three years with surgery, a year with palliative care … or can die within weeks if it’s serious.
We don’t yet know if Lenora’s a cat we must make comfortable … or that we can save.
The following was written just before I left on Christmas vacation. The fact that I’m posting it three weeks later I think says something about the very point I was making in the article … so I’m going to let it stand as I wrote it the day that it happened. Here goes …
So, my cat died in my lap today, and while I didn’t kill it, I made it happen.
I’d love to say I have a lot of feelings about that.
The truth is, for me, departures leave a void. I don’t know what to feel, or don’t feel anything. Our precious little fraidy cat Caesar is gone, just gone, and the event passed without the reactions that movies and literature tell me happen when people go through life-changing events.
And this is a change, make no mistake. Almost twelve years ago, my wife and I agreed to adopt two rescue cats, Nero (the big black butch one) and Caesar (the skinny Holstein-cow one afraid of crinkling paper). They’d been turned out onto the street by a couple who got on drugs, and were being fostered by one of our bridesmaids, who already had three tiny, frail, elderly cats, and was forced to keep both cats in a bathroom. We had Nero and Caesar shipped from the East Coast to the West, and made them a part of our lives.
Nero’s long gone, victim of coyotes, but Caesar, with a different behavioral inheritance, survived and thrived, until a few years ago thyroid problems caused him to start to lose weight. He wasted away from twelve pounds to seven over the years, but we were mostly able to control it with medication, even when we ultimately had to put Caesar outside when, in his old age, he decided it was just fine to pee, like, wherever, because he’d reached the age where he didn’t have to give a damn anymore.
Bay Area winters are, of course, as brutal as cream puffs, but we nonetheless set up a huge gazebo enclosure in the back yard, where a tarp, pillows, heating pads and collection of chairs, tables and cat condos gave him a comfy throne for over a year.
But then he started wheezing. At first it was a cute little cooing-dove purr, and we thought he was just becoming more vocal. But it developed into a whistling, ticking sound as he labored for breath. Never comfortable on trips to the vet—always scared and panting, frequently pooping in the carrier even when in the best of health—on his last trip he was so freaked out they had to put him on oxygen. Tweaks to his medication and a cortisone shot helped for a while, but soon he was back where he started, with the recommendation of the vet that we make him comfortable.
And we did—or, mostly, my wife did.
She constantly reworked the outer area to make it a luxurious throne. A night owl herself, she fed him at all hours as, despite his decreasing weight of six and a half pounds, he became our most ravenous cat. And she stayed with him to brush him or sit with him or make him happy.
And me? I’m the one who dragged us out to the Bay Area to work for a search engine company, and I’m the one who has to work long hours keeping the lights on now that I’ve transitioned from search to robotics. I’m the one who chose to take on a huge writing project at which I’m barely started, and I’m the one who chose to take on helping found a small press. I seemingly can’t say no to projects, not because I want to do so many projects, but because that’s the only way I have found to make the projects that I do work on into successes—constantly seeking other avenues, other points of connections that make the work that I do more valuable. So now I find myself with an enormous stack of responsibilities that I can’t easily unwind.
For a variety of reasons, this has become even worse in the last six months, right when Caesar began his decline. Weekend after weekend I planned to spend time sitting in the back yard with the cats, and weekend after weekend I found myself working late at work or putting out fires at the small press. And week after week, I saw Caesar continue to decline.
I even knew this was likely to happen, and took a picture intending to blog about caring for elderly cats. But life intervened, and Caesar has now passed without me ever posting that post about his decline. I can’t look at those pictures without thinking about dereliction of duty.
Finally, I had enough, and started to arrange time to spend more time with Caesar. But it was too late. He’d grown too frail to clean himself, but no longer enjoyed brushing, pulling away from me when I tried to clean out his fur. He’d grown too scattershot to properly drink from poured water, but no longer enjoyed suckling my knuckle, making a few halfhearted attempts at the gesture that had calmed him so much as a young cat before wobbling away. I’d sit in the Adirondack chair in the back yard, hoping he’d come up and sit in my lap, and for a while he did, scrabbling his way up on me, getting a scratch, then shakily hopping down and walking away. I eventually tried picking him up to put him in my lap, but he just wanted down. By the end, he barely tolerated a scratch behind the ears, and would quickly give up or walk away.
As Christmas approached, I worried that he wouldn’t be here when I got back from visiting my folks—but last night, we noticed vomit on his pillow. Today he wasn’t sitting in his throne, and I found him lying against the fence in the back yard, muzzle covered in vomit, drooling on his paws, unable to muster the energy to eat and unwilling to tolerate my touch.
I called in at work, woke up my wife, and we started calling for home pet euthanasia services. After half a dozen calls, we had an appointment arranged, and in the mid afternoon, a kindly veterinarian came by. Caesar had slid even further, with a soft, plaintive mew, and the vet gave him a sedative to help him sleep, and soon he was breathing easy for the first time in weeks.
Five minutes later, I was sitting on the porch, with Caesar in my lap. The vet shaved a small patch of fur on his leg to get to his vein, and injected the final shot. I put my hand on his chest as he breathed his last, and the vet listened until his tiny heart stopped. The vet left us an impressed paw print in clay and a tiny bundle of fur, and took our cat, wrapped up in a basket, looking more comfortable than he had in six months. Then Caesar was gone.
I wasn’t there when my dad died. I knew he was going, I even quit work so that I could be there for him while he was dying in Greenville, South Carolina, but for some reason at the time I felt like I had to periodically go back to my home town, Atlanta, Georgia, for what, I don’t remember now, to keep up the condo, or for my karate classes, or whatever, and on one of my returns to Greenville Dad passed while I was finding a parking space in the Greenville Memorial parking lot. Mom stood straight, but was in tears, and I knew what had happened; Dad’s body lay there, his eyes open, half lidded, his head turned partially aside, not rightable, the human body’s unconscious processes of self-stabilization and homeostasis finally ceased. So Dad was gone.
I wasn’t there when my grandmother died. She’d been in the nursing home for a while, and the doctors warned us that she’d had a sharp slide. We came out to see her. Mom, strangely, didn’t want to go into the room, seeming somehow semi-estranged from her, despite being about as good to her as she could have been. I went to see Grandma; she was holding her hands tight, her eyes half-lidded, barely registering my presence. We waited a long time, then returned the next day, and waited again. Finally we went for a late lunch, and when we returned, it was over. And Grandma was gone.
I wasn’t there when my Aunt Kitty died. She’d been in decent health, despite a painful hip problem, and was jogging at the gym one day when she had a heart attack and fell off the treadmill. I was already on my way to Greenville for other reasons, but when I arrived, she again was barely holding on, each of her organs struggling to keep up, offloading their problems onto another. I parsed the jargon the doctors were saying and re-uttered the words to the family in words they understood, and they seemed comforted. She lay there, writhing a little; once her eyes, half-lidded, seemed to recognize me. But the family told me to leave, and after a few days, I flew back to the Bay Area. She passed the next day, and I flew back for the funeral. But Aunt Kitty was gone.
I wasn’t there when Gray Cat died. He was a feral who stayed in the yard, and we slowly started the process of trying to tame him. I was the only one who could feed him. I was the only one who could pet him, and I did it with gloves. But we had started to play together, and he started to warm—then got in through the cat door and attacked my wife. She had to fight him off with a broom, and we ultimately decided that he was dangerous enough that we had to put him to sleep. But it was my wife who took him to the pound. And Gray Cat was gone.
I wasn’t there when Caesar’s brother Nero died; as I said, he was taken by coyotes. He was an active outdoor cat, and we could even take him on walks without a leash. But that expanded his range, and he loved hunting on the watershed hill near our home. One night he went out late, shortly before we heard the coyotes howl. He never came back. We posted flyers and walked the neighborhood, and checked shelters, but none of that mattered; we knew what happened the very next morning. And Nero was gone.
Nero’s death came without warning. I knew Caesar’s end was coming. I was determined to not let him die alone and afraid the way Nero did. So I kept close watch on him. I thought through the scenarios he might encounter and decided what I was and was not willing to put him through. The ultimate criteria, I decided, were if he could not breathe, if he could not eat, or if he could not get up; today, two of those three happened. So we acted.
I was there when Caesar died. We let him lie where he had chosen until the drugs put him into a peaceful sleep, and then I held him in my lap until he passed. And after he was gone, I asked my wife to go for a walk, and I unloaded to her about how I wanted to have been there more.
“No,” she said. “We are a team, and I was there for him, several times, every day, while you worked. While you spent your love on the cats that still wanted affection, I focused instead on Caesar and gave him all the attention he needed. We gave him everything we could.”
I still don’t know what I feel about this. I must feel something: I’ve been prompted to write two thousand words on it. But the feeling is that of a void. An uncertainty of how I should react or how I should feel. The only thing I know is that I made sure I was there when Caesar died.
Epilogue: Caesar is gone. Now one of our other cats, Lenora, has erupted in tiny bumps and larger lesions, along with two big lumps in her abdomen. Is it cancer, and she’s soon to be gone? Is it simply cowpox, and she’ll be fine in a month or two? I don’t know. But I do know I am making a special effort to be with her, and with my wife, and my friends and family, while they are alive.
I often say “I teach robots to learn,” but what does that mean, exactly? Well, now that one of the projects that I’ve worked on has been announced – and I mean, not just on arXiv, the public access scientific repository where all the hottest reinforcement learning papers are shared, but actually, accepted into the ICRA 2018 conference – I can tell you all about it!
When I’m not roaming the corridors hammering infrastructure bugs, I’m trying to teach robots to roam those corridors – a problem we call robot navigation. Our team’s latest idea combines “traditional planning,” where the robot tries to navigate based on an explicit model of its surroundings, with “reinforcement learning,” where the robot learns from feedback on its performance.
For those not in the know, “traditional” robotic planners use structures like graphs to plan routes, much in the same way that a GPS uses a roadmap. One of the more popular methods for long-range planning is the probabilistic roadmap, which builds a long-range graph by picking random points and attempting to connect them with a simpler “local planner” that knows how to navigate shorter distances. It’s a little like how you learn to drive in your neighborhood – starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what.
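To make the roadmap idea concrete, here’s a minimal sketch in Python. Everything in it is a toy assumption of mine, not our actual code: a made-up 10×10 world with a single circular obstacle, a hypothetical straight-line local planner, and illustrative names like `build_roadmap` and `connect_radius`.

```python
import math
import random

# Hypothetical toy world: a 10x10 square with one circular obstacle.
WORLD_SIZE = 10.0
OBSTACLE_CENTER = (5.0, 5.0)
OBSTACLE_RADIUS = 2.0


def is_free(point):
    """True if the point lies outside the obstacle."""
    return math.dist(point, OBSTACLE_CENTER) > OBSTACLE_RADIUS


def straight_line_local_planner(a, b, step=0.1):
    """Toy local planner: succeeds if every sample along segment a-b is free."""
    steps = max(1, int(math.dist(a, b) / step))
    for i in range(steps + 1):
        t = i / steps
        p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        if not is_free(p):
            return False
    return True


def build_roadmap(num_nodes=60, connect_radius=3.0, seed=0):
    """Sample random free points, then link nearby pairs the local planner can connect."""
    rng = random.Random(seed)
    nodes = []
    while len(nodes) < num_nodes:
        p = (rng.uniform(0, WORLD_SIZE), rng.uniform(0, WORLD_SIZE))
        if is_free(p):
            nodes.append(p)
    edges = {i: [] for i in range(num_nodes)}
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            close_enough = math.dist(nodes[i], nodes[j]) <= connect_radius
            if close_enough and straight_line_local_planner(nodes[i], nodes[j]):
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges


nodes, edges = build_roadmap()
```

The resulting `edges` dictionary is the “map in your head”: any standard graph search can then plan a long route by hopping between connected points.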
But for that to work, you have to know how to drive, and that’s where the local planner comes in. Building a local planner is simple in theory – you can write one for a toy world in a few dozen lines of code – but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called “navigation stacks” and can contain dozens of components – and in my experience they’re hard to get working and even when you do, they’re often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings.
People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the-art algorithm called DDPG (Deep Deterministic Policy Gradients) pioneered by DeepMind to learn a navigation system that could successfully travel several meters in office-like environments.
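If “learning a policy by trying actions” sounds abstract, here’s a tiny sketch of the idea. To be clear, this is not DDPG and not our system: it’s plain tabular Q-learning, a much simpler technique I’ve swapped in for illustration, on a made-up six-cell corridor where the goal is the rightmost cell. The reward, learning rate, and other parameters are all assumptions.

```python
import random

# Hypothetical toy task: a 1-D corridor of 6 cells; the goal is the rightmost cell.
NUM_CELLS = 6
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Move within the corridor; reward 1.0 on reaching the goal, else 0.0."""
    next_state = min(max(state + action, 0), NUM_CELLS - 1)
    done = next_state == NUM_CELLS - 1
    return next_state, (1.0 if done else 0.0), done


def best_action(q_row, rng):
    """Index of the highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([i for i, v in enumerate(q_row) if v == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: try actions, reinforce the ones that lead to reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(NUM_CELLS)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: random action
            else:
                a = best_action(q[state], rng)  # exploit: best known action
            next_state, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """The learned action for each non-goal cell."""
    return [+1 if q[s][1] > q[s][0] else -1 for s in range(NUM_CELLS - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

After training, `policy` should say “step right” in every cell: the agent was never told where the goal is, it just kept the actions that paid off. Deep methods like DDPG replace the table with neural networks so the same idea can handle continuous sensor readings and motor commands.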
But there’s a further wrinkle: the so-called “reality gap”. By necessity, the local planner used by a probabilistic roadmap is simulated – attempting to connect points on a map. That simulated local planner isn’t identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map which it can’t navigate safely in the real world. This can have disastrous consequences – causing robots to tumble down stairs, or, worse, when people follow their GPSes too closely without looking where they’re going, causing cars to tumble off the end of a bridge.
Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap’s local planner and the robot’s navigation system. Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of successfully executing its plans in the real world.
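Here’s a rough sketch of that “only add the link if the controller can actually drive it” step. It’s a toy under loud assumptions: `noisy_policy_rollout` is a hypothetical stand-in for a trained agent (it just steers at the goal with random drift), the world is a made-up single-obstacle map, and the trial count and success threshold are illustrative numbers, not values from our paper.

```python
import math
import random

# Hypothetical toy world: one circular obstacle, standing in for a real simulator.
OBSTACLE_CENTER = (5.0, 5.0)
OBSTACLE_RADIUS = 2.0


def collides(p):
    """True if the point is inside the obstacle."""
    return math.dist(p, OBSTACLE_CENTER) <= OBSTACLE_RADIUS


def noisy_policy_rollout(start, goal, rng, noise=0.15, step=0.25, max_steps=200):
    """Stand-in for a learned policy: head toward the goal with random drift.

    Returns True if the goal is reached without hitting the obstacle.
    """
    x, y = start
    for _ in range(max_steps):
        if math.dist((x, y), goal) <= step:
            return True
        heading = math.atan2(goal[1] - y, goal[0] - x) + rng.gauss(0.0, noise)
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        if collides((x, y)):
            return False
    return False


def can_traverse(start, goal, rng, trials=20, min_success_rate=0.9):
    """Accept a link only if the controller succeeds in enough simulated attempts."""
    successes = sum(noisy_policy_rollout(start, goal, rng) for _ in range(trials))
    return successes / trials >= min_success_rate


def add_validated_edges(nodes, rng, connect_radius=8.5):
    """Link nearby node pairs, keeping only links the controller can drive."""
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            close_enough = math.dist(nodes[i], nodes[j]) <= connect_radius
            if close_enough and can_traverse(nodes[i], nodes[j], rng):
                edges.append((i, j))
    return edges


rng = random.Random(0)
nodes = [(1.0, 1.0), (4.0, 1.0), (1.0, 5.0), (9.0, 5.0)]
edges = add_validated_edges(nodes, rng)
```

In this toy map the link between the last two nodes runs straight through the obstacle, so it never makes it into the roadmap, while the short open-floor link between the first two does. The point of the design is that the roadmap only promises what the same controller, quirks and all, has already shown it can do.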
In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than a “straight-line” local planner which was our default alternative. While I didn’t happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot’s behavior wasn’t as good as we hoped, we debugged that to a networking issue that was adding a delay to commands sent to the robot, and not in our code itself; we’ll fix this in a subsequent round).
This work includes both our group working on office robot navigation – including Alexandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson – and Alexandra’s collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper. Until the ICRA version comes out, you can find the preliminary version on arXiv:
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning
We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.
So, when I say “I teach robots to learn” … that’s what I do.