Press "Enter" to skip to content

Posts tagged as “Hard Science”

The Embodied AI Workshop is Tomorrow, Sunday, June 20th!

embodied AI workshop

What happens when deep learning hits the real world? Find out at the Embodied AI Workshop this Sunday, June 20th! We’ll have 8 speakers, 3 live Q&A sessions with questions on Slack, and 10 embodied AI challenges. Our speakers will include:

  • Motivation for Embodied AI Research
    • Hyowon Gweon, Stanford
  • Embodied Navigation
    • Peter Anderson, Google
    • Aleksandra Faust, Google
  • Robotics
    • Anca Dragan, UC Berkeley
    • Chelsea Finn, Stanford / Google
    • Akshara Rai, Facebook AI Research
  • Sim-2-Real Transfer
    • Sanja Fidler, University of Toronto / NVIDIA
    • Konstantinos Bousmalis, Google

You can find us on the conference Slack at #cvpr2021 if you're signed up, through our webpage embodied-ai.org, or at the livestream on YouTube.

Come check it out!

-the Centaur

He thinks he invented Java because he was in the room when someone made coffee

... came up as my wife and I were discussing the "creative hangers-on form" of Stigler's Law. The original Stigler's Law, discovered by Robert Merton and popularized by Stephen Stigler, is the idea that in science, no discovery is named after its original discoverer.

In creative circles, it comes up when someone who had little or nothing to do with a creative process takes credit for it. A few of my wife's friends were like this, dropping by to visit her while she was in the middle of a creative project, describing out loud what she was doing, then claiming, "I told her to do that."

In the words of Finn from The Rise of Skywalker: "You did not!"

In computing circles, the old joke referred to the Java programming language. I've heard several variants, but the distilled version is "He thinks he invented Java because he was in the room when someone made coffee." Apparently this is a good description of how Java itself was named: at least one person claimed to have come up with the name Java, others disputed that - some even said he opposed it - and credited someone else in the room instead, while that person in turn rejected the idea, noting only that there was some coffee in the room from Peet's.

Regardless, I dispute Howard Aiken's saying "Don't worry about people stealing your ideas. If your ideas are any good, you'll have to ram them down people's throats." Nah. Once you've forced an idea down someone's throat, they won't just swallow it, they'll claim it was in their stomach all along.

-the Centaur

The Embodied AI Workshop at CVPR 2021

embodied AI workshop

Hail, fellow adventurers: to prove I do something more than just draw and write, I'd like to send out a reminder of the Second Embodied AI Workshop at the CVPR 2021 computer vision conference. In the last ten years, artificial intelligence has made great advances in recognizing objects, understanding the basics of speech and language, and recommending things to people. But interacting with the real world presents harder problems: noisy sensors, unreliable actuators, incomplete models of our robots, building good simulators, learning over sequences of decisions, transferring what we've learned in simulation to real robots, and learning on the robots themselves.

interactive vs social navigation

The Embodied AI Workshop brings together many researchers and organizations interested in these problems, and also hosts nine challenges which test point, object, interactive and social navigation, as well as object manipulation, vision, language, auditory perception, mapping, and more. These challenges enable researchers to test their approaches on standardized benchmarks, so the community can more easily compare what we're doing. I'm most involved as an advisor to the Stanford / Google iGibson Interactive / Social Navigation Challenge, which forces robots to maneuver around people and clutter to solve navigation problems. You can read more about the iGibson Challenge at their website or on the Google AI Blog.

the iGibson social navigation environment

Most importantly, the Embodied AI Workshop has a call for papers, with a deadline of TODAY.

Call for Papers

We invite high-quality 2-page extended abstracts in relevant areas, such as:

  •  Simulation Environments
  •  Visual Navigation
  •  Rearrangement
  •  Embodied Question Answering
  •  Simulation-to-Real Transfer
  •  Embodied Vision & Language

Accepted papers will be presented as posters. These papers will be made publicly available in a non-archival format, allowing future submission to archival journals or conferences.

The submission deadline is May 14th (Anywhere on Earth). Papers should be no longer than 2 pages (excluding references) and styled in the CVPR format. Paper submissions are now open.

I assume anyone submitting to this already has their paper well underway, but this is your reminder to git'r done.

-the Centaur

It’s been a long time since I’ve thrown a book …

chuck that junk

Yeah, so that happened on my attempt to get some rest on my Sabbath day.

I'm not going to cite the book - I'll do the author the courtesy of re-reading the relevant passages to make sure I'm not misconstruing them before I name it, though I'm not going to wait to blog my reaction - but what caused me to throw this book, an analysis of the flaws of the scientific method, was this bit:

Imagine an experiment with two possible outcomes: one supporting the new theory (cough EINSTEIN) and one supporting the old (cough NEWTON). Three instruments are set up. Two report numbers consistent with the new theory; the third - missing parts, possibly configured improperly, and producing noisy data - matches the old.

Wow! News flash: any responsible working scientist would say these results favored the new theory. In fact, if they were really experienced, they might have thrown out the third instrument entirely - I've learned, after chasing red herrings from bad readings, that it's better not to look too closely at bad data.

What did the author say, however? Words to the effect: "The scientists ignored the results from the third instrument which disproved their theory and supported the original, and instead, pushing their agenda, wrote a paper claiming that the results of the experiment supported their idea."

Pushing an agenda? Wait, let me get this straight, Chester Chucklewhaite: we should throw out two results from well-functioning instruments that support theory A in favor of one result from an obviously messed-up instrument that supports theory B - oh, hell, you're a relativity doubter, aren't you?

Chuck-toss.

I'll go back to this later, after I've read a few more sections of E. T. Jaynes's Probability Theory: The Logic of Science as an antidote.

-the Centaur

P. S. I am not saying relativity is right or wrong, friend. I'm saying the responsible interpretation of those experimental results as described would be precisely the interpretation those scientists put forward - though, in all fairness to the author of this book, the scientist involved appears to have been a super jerk.
 

Black Holes and Divine Revelation

einstein headshot

Growing up with Superman comics, Hollywood movies and Greek mythology can give you a distorted idea of the spiritual world. Colorful heroes with flashy powers hurl villains into the Phantom Zone, and a plucky bard with a fancy lyre can sing his way into hell to rescue his bride, if only he doesn't look back.

This models the afterlife as a distant but reachable part of the natural world. The word "supernatural" gets tossed around without force, because there are rules for breaking the rules: like warp drive breaking the laws of motion or the cheat codes to the Matrix, you can hack your way into and out of the afterlife.

But spirituality is not magic, and prayers aren't spells. While I've argued "spirit" isn't strictly necessary for the practice of Christianity, most theologians would agree that the supernatural realm is a reflection of the grander reality of God and operates on His will - not a set of rules that could be manipulated by Man.

Even the idea of the "afterlife" isn't necessary. We're waiting in hope for bodily resurrection. We die, and stay dead, yet our essences live on in the mind of God, to be resurrected in a future world which outstrips even our boldest imaginations (though C. S. Lewis sure tried in The Great Divorce and The Last Battle).

Death, in this view, is a one-way trajectory. It isn't likely that people are going to and returning from the afterlife, no matter how many tunnels of light are reported by hypoxia patients, because the afterlife is not a quasi-physical realm to be hacked into, but a future physical state accompanied by spiritual perfection.

So if no-one's come back from Heaven to tell us about the afterlife, how do we know to seek it?

This is not trivial for someone who teaches robots to learn. In reinforcement learning, we model decision making as Markov decision processes, a mathematical formalism in which we choose actions in states to receive rewards, and use the rewards to estimate the values of those actions to make better choices.
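
For the technically curious, here's what that formalism looks like in miniature - a toy tabular Q-learning loop, my own sketch rather than anything from a real robot stack. The environment function step is passed in, since the whole point is that the agent only learns about states it actually visits:

```python
# A toy sketch of value estimation in a Markov decision process via
# tabular Q-learning. Everything here is illustrative; `step` is any
# function mapping (state, action) -> (reward, next_state, done).
import random
from collections import defaultdict

def q_learning(step, states, actions, episodes=1000,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    q = defaultdict(float)  # estimated value of each (state, action)
    for _ in range(episodes):
        state, done = random.choice(states), False
        while not done:
            # Explore occasionally; otherwise take the best-known action.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            reward, next_state, done = step(state, action)
            # Nudge the estimate toward reward plus discounted future value.
            best_next = max(q[(next_state, a)] for a in actions)
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q
```

Note the limitation that matters for the argument here: q only ever gets updated for states the agent has visited and come back from.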

But if no-one has returned from a visit to the state of the afterlife, how can we estimate the reward? One typical way around this dilemma is imitation learning: the trajectories of one agent can be used to inform another agent, granting it knowledge of the rewards in states that it cannot visit.

That agent might be human, or another, more skilled robot. You can imagine it as an army of robots with walkie-talkies trying to cross a minefield: as long as they keep radioing back what they've observed, the other robots can use that information to guide their own paths, continuing to improve.
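
In code, the simplest form of this idea is behavioral cloning: fit a policy directly to the teacher's radioed-back (state, action) pairs. This is a minimal sketch under my own assumptions - a nearest-neighbor classifier standing in for whatever learner you'd actually use:

```python
# A minimal sketch of imitation learning via behavioral cloning: the
# learner never visits the dangerous states itself; it just fits a
# policy to (state, action) pairs demonstrated by a teacher.
from sklearn.neighbors import KNeighborsClassifier

def clone_policy(demonstrations):
    """demonstrations: a list of (state_features, action) pairs."""
    states = [list(s) for s, _ in demonstrations]
    actions = [a for _, a in demonstrations]
    policy = KNeighborsClassifier(n_neighbors=3)
    policy.fit(states, actions)  # imitate the demonstrator
    return policy  # policy.predict([state]) picks an action anywhere
```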

But we're back to the same problem again: there's no radio in the afterlife, no cell service in Heaven.

One-way trajectories like this exist in physics: black holes. Forget the traversable black holes you see in movies from The Black Hole to Star Trek to Interstellar: a real black hole in general relativity is defined as a region of space where trajectories go in, but do not come back out; its boundary is the event horizon.

It's called the event horizon because no events beyond the horizon affect events outside the horizon. Other than the inexorable pull to suck more world-lines in, no information comes back from the black hole: no reward is recorded for the unvisited states of the Markov decision process.

Death appears to be a black hole, literally and figuratively. We die, remain dead, and are often put in a cold dark place in the ground, communicating nothing back to the world of the living, now on a trajectory beyond the event horizon, heading to that undiscovered country of Shakespeare and Star Trek.

In our robot minefield example, that might be a mine with a radio scrambler, cutting off signals before any other robots could be told not to follow that path. But what if someone with a radio was watching that minefield from above - say, a rescue helicopter - signaling down the safe path?

In a world where spirituality is a reflection of the grander reality of God, there's no magical hack which can give us the ability to communicate with the afterlife. But in a world where every observed particle event has irreducible randomness, God has plenty of room to turn around and contact us.

Like a faster-than-light radio which only works for the Old Ones, we can receive information from God if and only if He chooses to. The Old Testament records many stories of people hearing the voice of God - in dreams, in waking, in writing on the wall, in voices thundering from the heavens, in whispers.

You don't need to treat the Bible like a fax from God to imagine that the information it contains could be inconceivably precious, a deposit of revelation which could never be received from any amount of human experience. No wonder the Church preserved these books and guarded them so jealously.

But even this sells short the value that we get from God incarnating as Jesus.

Jesus Christ, a human being, provides a direct model of the behavior we should follow, informed by the knowledge of Jesus God, the portion of the Trinity most directly comprehensible by us. This is the best example we could have for imitation learning: a trace of the behavior of a divinely inspired teacher.

No amount of flying around the Earth will bring someone back from the dead; there may very well be "a secret chord that pleases the Lord," but you can't sing yourself into the afterlife. Fortunately, the afterlife has already sent an emissary, showing us the behavior we need to model to follow Him there.

-the Centaur

Pictured: Guess who.

Surfacing

An interpretation of the rocket equation.

Wow. It's been a long time. Or perhaps not as long as I thought, but I've definitely not been able to post as much as I wanted over the last six months or so. But it's been for good reasons: I've been working on a lot of writing projects. There's the Dakota Frost / Cinnamon Frost "Hexology," a six-book series; the moment I finished those rough drafts, it seemed, I rolled into National Novel Writing Month and worked on JEREMIAH WILLSTONE AND THE MACHINERY OF THE APOCALYPSE. Meanwhile, at work, I've been snowed under following up on our PRM-RL paper.

Thor's Hammer space station.

But I've been having fun! The MACHINERY OF THE APOCALYPSE is (at least possibly) spaaaace steampunk, which has led me to learn all sorts of things about space travel and rockets and angular momentum which I somehow didn't learn when I was writing pure hard science fiction. I've learned so much about creating artificial languages as part of the HEXOLOGY.

The Modanaqa Abugida.

So, hopefully I will have some time to start sharing this information again, assuming that no disasters befall me in the middle of the night.

Gabby in the emergency room.

Oh dag nabbit! (He's going to be fine).

-the Centaur

PRM-RL Won a Best Paper Award at ICRA!

So, this happened! Our team's paper on "PRM-RL" - a way to teach robots to navigate their worlds by combining human-designed roadmap algorithms with deep-learned algorithms that control the robot itself - won a best paper award at the ICRA robotics conference!

I talked a little bit about how PRM-RL works in the post "Learning to Drive ... by Learning Where You Can Drive", so I won't go over the whole spiel here. The basic idea: we've gotten good at teaching robots to control themselves using a technique called deep reinforcement learning (the RL in PRM-RL), which trains them in simulation, but it's hard to extend this approach to long-range navigation problems in the real world. We overcome this barrier by using a more traditional robotic approach, probabilistic roadmaps (the PRM in PRM-RL), which build maps of where the robot can drive using point-to-point connections. Combine these maps with the robot simulator and, boom, we have a map of where the robot thinks it can successfully drive.

We were cited not just for this technique, but for testing it extensively in simulation and on two different kinds of robots. I want to thank everyone on the team - especially Aleksandra Faust, for her background in PRMs, for taking point on the idea, and for doing all the quadrotor work with Lydia Tapia; Oscar Ramirez and Marek Fiser, for their work on our reinforcement learning framework and simulator; Kenneth Oslund, for his heroic last-minute push to collect the indoor robot navigation data; and our manager James, for his guidance, contributions to the paper, and support of our navigation work.

Woohoo! Thanks again everyone!

-the Centaur

Why I’m Solving Puzzles Right Now

When I was a kid (well, a teenager) I'd read puzzle books for pure enjoyment. I'd gotten started with Martin Gardner's mathematical recreation books, but the ones I really liked were Raymond Smullyan's books of logic puzzles. I'd go to Wendy's on my lunch break at Francis Produce, with a little notepad and a book, and chew my way through a few puzzles. I'll admit I often skipped ahead if they got too hard, but I did my best most of the time.

I read more of these as an adult, moving back to the Martin Gardner books. But sometime about twenty-five years ago (when I was in the thick of grad school), my reading needs completely overwhelmed my reading ability. I'd always carried huge stacks of books home from the library, never finishing all of them, frequently paying late fees, but there was one book in particular - The Emotions by Nico Frijda - which I finished but never followed up on.

Over the intervening years, I did finish books, but read most of them scattershot, picking up what I needed for my creative writing or scientific research. Eventually I started using the tiny little note tabs you see in some books to mark the passages I'd taken notes on, a "levels of processing" trick to ensure that I was mindfully engaging with what I read.

A few years ago, I admitted that wasn't enough, and consciously began trying to read ahead of what I needed for work. I chewed through C++ manuals and planning books and was always rewarded a few months later when I'd already read what I needed to solve my problems. I began focusing on fewer books in depth, finishing more books than I had in years.

Even that wasn't enough, and I began - at last - the re-reading project I'd hoped to do with The Emotions. Recently I did that with Dedekind's Essays on the Theory of Numbers, and now I'm doing it with the Deep Learning book. But some of that math is frickin' beyond where I am now, man. Maybe one day I'll get it, but sometimes I've spent weeks tackling a problem I just couldn't get.

Enter puzzles. As it turns out, it's really useful for a scientist to also be a science fiction writer who writes stories about a teenaged mathematical genius! I've had to simulate Cinnamon Frost's staggering intellect for the purpose of writing the Dakota Frost stories, but the further I go, the more I want her to be doing real math. How did I get into math? Puzzles!

So I gave her puzzles. And I decided to return to my old puzzle books, some of the ones I got later but never fully finished, and to give them the deep reading treatment. It's going much slower than I like - I find myself falling victim to the "rule of threes" (you can do a third of what you want to do, often in three times as much time as you expect) - but then I noticed something interesting.

Some of Smullyan's books in particular are thinly disguised math books. In some parts, they're even the same math I have to tackle in my own work. But unlike the other books, these problems are designed to be solved, rather than a reflection of some chunk of reality which may be stubborn; and unlike the other books, these have solutions along with each problem.

So, I've been solving puzzles ... with careful note of how I have been failing to solve puzzles. I've hinted at this before, but understanding how you, personally, usually fail is a powerful technique for debugging your own stuck points. I get sloppy, I drop terms from equations, I misunderstand conditions, I overcomplicate solutions, I grind against problems where I should ask for help, I rabbithole on analytical exploration, and I always underestimate the time it will take for me to make the most basic progress.

Know your weaknesses. Then you can work those weak mental muscles, or work around them to build complementary strengths - the way Richard Feynman would always check over an equation when he was done, looking for those places where he had flipped a sign.

Back to work!

-the Centaur

Pictured: my "stack" at a typical lunch. I'll usually get to one out of three of the things I bring for myself to do. Never can predict which one though.

Don’t Fall Into Rabbit Holes

SO! There I was, trying to solve the mysteries of the universe, learn about deep learning, and teach myself enough puzzle logic to create credible puzzles for the Cinnamon Frost books, and I find myself debugging the fine details of a visualization system I've developed in Mathematica to analyze the distribution of problems in an odd middle chapter of Raymond Smullyan's The Lady or the Tiger.

I meant well! Really I did. I was going to write a post about how finding a solution is just a little bit harder than you normally think, and how insight sometimes comes after letting things sit.

But the tools I was creating didn't do what I wanted, so I went deeper and deeper down the rabbit hole trying to visualize them.

The short answer seems to be that there's no "there" there and that further pursuit of this sub-problem will take me further and further away from the real problem: writing great puzzles!

I learned a lot - about numbers, about how things could combinatorially explode, about Ulam Spirals and how to code them algorithmically. I even learned something about how I, particularly, fail in these cases.

But it didn't provide the insights I wanted. Feynman warned about this: he called it "the computer disease", worrying about the formatting of the printout so much you forget about the answer you're trying to produce, and it can strike anyone in my line of work.

Back to that work.

-the Centaur

Learning to Drive … by Learning Where You Can Drive

I often say "I teach robots to learn," but what does that mean, exactly? Well, now that one of the projects I've worked on has been announced - and I mean, not just on arXiv, the public-access scientific repository where all the hottest reinforcement learning papers are shared, but actually accepted into the ICRA 2018 conference - I can tell you all about it!

When I'm not roaming the corridors hammering infrastructure bugs, I'm trying to teach robots to roam those corridors - a problem we call robot navigation. Our team's latest idea combines "traditional planning," where the robot tries to navigate based on an explicit model of its surroundings, with "reinforcement learning," where the robot learns from feedback on its performance.

For those not in the know, "traditional" robotic planners use structures like graphs to plan routes, much in the same way that a GPS uses a roadmap. One of the more popular methods for long-range planning is probabilistic roadmaps, which build a long-range graph by picking random points and attempting to connect them with a simpler "local planner" that knows how to navigate shorter distances. It's a little like how you learn to drive in your neighborhood - starting from landmarks you know, you navigate to nearby points, gradually building up a map in your head of what connects to what.
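
In sketch form, roadmap building is just sampling and connecting. This toy version is mine, not any production planner's; the sampler and the local planner are passed in as functions, because in a real system they're the hard parts:

```python
# A toy sketch of probabilistic roadmap (PRM) construction.
# `sample_free_point()` returns a random collision-free point, and
# `can_connect(p, q)` is the local planner - both hypothetical.
import math

def build_prm(sample_free_point, can_connect,
              num_samples=500, radius=5.0):
    points = [sample_free_point() for _ in range(num_samples)]
    edges = []
    for i, p in enumerate(points):
        for j in range(i + 1, len(points)):
            q = points[j]
            # Only try nearby pairs, and keep the edge only if the
            # local planner believes it can actually drive it.
            if math.dist(p, q) <= radius and can_connect(p, q):
                edges.append((i, j))
    return points, edges
```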

But for that to work, you have to know how to drive, and that's where the local planner comes in. Building a local planner is simple in theory - you can write one for a toy world in a few dozen lines of code - but difficult in practice, and making one that works on a real robot is quite the challenge. These software systems are called "navigation stacks" and can contain dozens of components - and in my experience they're hard to get working and even when you do, they're often brittle, requiring many engineer-months to transfer to new domains or even just to new buildings.

People are much more flexible, learning from their mistakes, and the science of making robots learn from their mistakes is reinforcement learning, in which an agent learns a policy for choosing actions by simply trying them, favoring actions that lead to success and suppressing ones that lead to failure. Our team built a deep reinforcement learning approach to local planning, using a state-of-the-art algorithm called DDPG (Deep Deterministic Policy Gradients) pioneered by DeepMind to learn a navigation system that could successfully travel several meters in office-like environments.
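
For flavor, here's the core of DDPG's two coupled updates in PyTorch - my own illustration of the published algorithm, not our team's training code. The actor, critic, their target copies, the optimizers, and the replay batch are all assumed to exist already:

```python
# A sketch of one DDPG update step. All networks and the sampled
# replay `batch` are assumed to be set up elsewhere; the critic is
# assumed to take (state, action) and the actor to take state.
import torch

def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99, tau=0.005):
    state, action, reward, next_state, done = batch

    # Critic: regress Q(s, a) toward reward + discounted target value.
    with torch.no_grad():
        next_q = target_critic(next_state, target_actor(next_state))
        target = reward + gamma * (1.0 - done) * next_q
    critic_loss = torch.nn.functional.mse_loss(critic(state, action), target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: deterministic policy gradient - nudge the policy toward
    # actions the critic currently scores highly.
    actor_loss = -critic(state, actor(state)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Target networks slowly track the learned ones (Polyak averaging).
    with torch.no_grad():
        for net, tgt in ((actor, target_actor), (critic, target_critic)):
            for p, tp in zip(net.parameters(), tgt.parameters()):
                tp.mul_(1.0 - tau).add_(tau * p)
```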

But there's a further wrinkle: the so-called "reality gap". By necessity, the local planner used by a probabilistic roadmap is simulated - attempting to connect points on a map. That simulated local planner isn't identical to the real-world navigation stack running on the robot, so sometimes the robot thinks it can go somewhere on a map which it can't navigate safely in the real world. This can have disastrous consequences, causing robots to tumble down stairs - or, as when people follow their GPSes too closely without looking where they're going, causing cars to drive off the end of a bridge.

Our approach, PRM-RL, directly combats the reality gap by combining probabilistic roadmaps with deep reinforcement learning. By necessity, reinforcement learning navigation systems are trained in simulation and tested in the real world. PRM-RL uses a deep reinforcement learning system as both the probabilistic roadmap's local planner and the robot's navigation system. Because links are added to the roadmap only if the reinforcement learning local controller can traverse them, the agent has a better chance of attempting to execute its plans in the real world.
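
In sketch form, the roadmap loop above changes in exactly one place: the learned policy becomes the local planner, and an edge only makes it into the map if simulated rollouts of that policy can reliably drive it. Again, the sampler and the rollout function here are hypothetical stand-ins:

```python
# A toy sketch of the PRM-RL idea: roadmap edges are validated by the
# RL policy itself. `rollout_succeeds(start, goal)` runs one simulated
# episode of the policy and reports success - hypothetical, like the
# sampler `sample_free_point`.
def build_prm_rl(sample_free_point, rollout_succeeds,
                 num_samples=500, attempts=20, min_success_rate=0.9):
    points = [sample_free_point() for _ in range(num_samples)]
    edges = []
    for i, start in enumerate(points):
        for j in range(i + 1, len(points)):
            goal = points[j]
            # Monte Carlo check: can the policy reliably drive this link?
            successes = sum(rollout_succeeds(start, goal)
                            for _ in range(attempts))
            if successes / attempts >= min_success_rate:
                edges.append((i, j))
    return points, edges
```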

In simulation, our agent could traverse hundreds of meters using the PRM-RL approach, doing much better than a "straight-line" local planner which was our default alternative. While I didn't happen to have in my back pocket a hundred-meter-wide building instrumented with a mocap rig for our experiments, we were able to test a real robot on a smaller rig and showed that it worked well (no pictures, but you can see the map and the actual trajectories below; while the robot's behavior wasn't as good as we hoped, we traced that to a networking issue that was adding a delay to commands sent to the robot, not to our code itself; we'll fix this in a subsequent round).

This work includes both our group working on office robot navigation - including Aleksandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, me, and James Davidson - and Aleksandra's collaborator Lydia Tapia, with whom she worked on the aerial navigation also reported in the paper. Until the ICRA version comes out, you can find the preliminary version on arXiv:

https://arxiv.org/abs/1710.03937
PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning

We present PRM-RL, a hierarchical method for long-range navigation task completion that combines sampling-based path planning with reinforcement learning (RL) agents. The RL agents learn short-range, point-to-point navigation policies that capture robot dynamics and task constraints without knowledge of the large-scale topology, while the sampling-based planners provide an approximate map of the space of possible configurations of the robot from which collision-free trajectories feasible for the RL agents can be identified. The same RL agents are used to control the robot under the direction of the planning, enabling long-range navigation. We use the Probabilistic Roadmaps (PRMs) for the sampling-based planner. The RL agents are constructed using feature-based and deep neural net policies in continuous state and action spaces. We evaluate PRM-RL on two navigation tasks with non-trivial robot dynamics: end-to-end differential drive indoor navigation in office environments, and aerial cargo delivery in urban environments with load displacement constraints. These evaluations included both simulated environments and on-robot tests. Our results show improvement in navigation task completion over both RL agents on their own and traditional sampling-based planners. In the indoor navigation task, PRM-RL successfully completes up to 215 meters long trajectories under noisy sensor conditions, and the aerial cargo delivery completes flights over 1000 meters without violating the task constraints in an environment 63 million times larger than used in training.

 

So, when I say "I teach robots to learn" ... that's what I do.

-the Centaur