
Posts tagged as “Development”

Enter Colaboratory (AKA “A Spoonful of the Tracking Soup”)

centaur 0
As an author, I'm interested in how well my books are doing: not only do I want people reading them, I also want to compare what my publisher and booksellers claim about my books with my actual sales. (Also, I want to know how close to retirement I am.) In the past, I read a bunch of web pages on Amazon (and Barnes and Noble too, before they changed their format) and entered the numbers into an Excel spreadsheet called "Writing Popularity" (which just as easily could have been called "Writing Obscurity", yuk yuk yuk). That was fine when I had one book, but now I have four novels and an anthology out, and the chore could take half an hour or more - time I needed for valuable writing. I needed a better system.

I knew about tools for parsing web pages, like the parsing library Beautiful Soup, but it had been half a decade since I'd touched that library, and I just never had the time to sit down and do it. But recently I've realized the value of a great force multiplier for exploratory software development (and I don't mean Stack Exchange): interactive programming notebooks. Pioneered by Mathematica in 1988 and picked up by tools like IPython and its descendant Jupyter, an interactive programming notebook is a mix of a command line - where you can dynamically enter commands and get answers - and literate programming, where code is woven into the document that describes (and produces) it.

But Mathematica isn't the best tool for either web parsing or for producing code that will one day become a library - it's built on the Wolfram Language, which is optimized for mathematical computation - and Jupyter notebooks require setting up a Jupyter server or otherwise jumping through hoops. Enter Google's Colaboratory. Colab is a free service provided by Google that hosts Jupyter notebooks.
It's got most of the standard libraries you might need, it provides its own backends to run the code, and it saves copies of the notebooks to Google Drive, so you don't have to worry about acquiring software or running a server or even saving your data (but do please hit save). Because you can try code out and see the results right away, it's perfect for iterating on ideas: no need to restart a changed program, losing valuable seconds; if something doesn't work, you can tweak the code and try it right away. In this sense Colab has some of the force multiplier effects of a debugger, but it's far more powerful. Heck, in this version of the system you can ask a question on Stack Overflow right from the Help menu. How cool is that? My prototyping session got a bit long, so rather than try to insert it inline here, I wrote this blog post in Colab! To read more, go take a look at the Colaboratory notebook itself, "A Sip of the Tracking Soup", available at: https://goo.gl/Mihf1n -the Centaur
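The notebook itself is mostly just Beautiful Soup plus bookkeeping. As a minimal sketch of the idea - with a made-up page and a placeholder `#salesRank` selector, since real retailer markup changes constantly and isn't what my notebook actually uses - it looks something like this:

```python
from bs4 import BeautifulSoup

# A stand-in for a downloaded product page; a real script would fetch
# the page with urllib or requests before parsing it.
SAMPLE_PAGE = """
<html><body>
  <h1>Some Novel</h1>
  <span id="salesRank">#1,234 in Books</span>
</body></html>
"""

def extract_sales_rank(html):
    """Parse a product page and pull out its sales-rank text."""
    soup = BeautifulSoup(html, "html.parser")
    # "#salesRank" is a placeholder selector for wherever the rank lives.
    rank = soup.select_one("#salesRank")
    return rank.get_text(strip=True) if rank else None

print(extract_sales_rank(SAMPLE_PAGE))  # → #1,234 in Books
```

The nice thing about doing this in a notebook is that when the selector inevitably breaks, you just tweak the cell and re-run it against the page you already fetched.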

Meanwhile, Back at GDC

centaur 0

on-the-road-2015a.png

View from my hotel in San Francisco. It may seem strange to get a hotel for a conference in San Francisco when I live in the San Francisco Bay Area, but the truth is that I "live in the Bay Area" only by a generous border-case interpretation of "Bay Area" (we're literally on the last page of the 500-page Bay Area map book that I bought when I came out here). The trip from my house to the Moscone Center in the morning is two to two and a half hours - you could drive from Greenville, SC to Atlanta, Georgia in that time, so by that logic I should have commuted from home to Georgia Tech. So. Not. Going. To. Happen.

So why am I heading to the Moscone Center this week? The Game Developers Conference, of course. At the request of my wife, I may not directly blog from wherever it is that I am, so I'll be posting about this conference with a delay. So far, I've attended the AI Game Programmers Guild dinner Sunday night, which was a blast: seeing old friends, meeting new ones, renewing friendships, and talking about the robot apocalypse and the future of artificial intelligence research. GDC is a blast even if you don't directly program games, because game developers are constantly pushing the boundaries of the possible - so I try not to miss it. I've been coming for roughly 15 years now - and already have close to 15 pages of notes. Good stuff.

One thing does occur to me, though, about games and "Gamer Gate." If you're into games, you may or may not have heard of the Gamer Gate controversy; some people claim it's about corruption in games journalism, while others openly state it's motivated by the invasion of gaming by so-called "social justice warriors" who are trying to destroy traditional male-oriented games in favor of thinly disguised social commentary. Still others suspect that the entire controversy is a manufactured excuse for misogynists to abuse women in games - and there's evidence that shows that at least some miscreants are doing just that.

But let's go back to the first reason, ethics in games journalism. I can't really speak to this from the inside, but in the circles in which I've been playing games for the past thirty-five years, no one cares about game reviews. Occasionally we use game magazines to find neat screenshots of new games, but, seriously - everything is word of mouth.

What about the second, the "invasion of social justice warriors?" I can speak about this: in the circles that I've traveled in the game industry in the past fifteen years, no one cares about this controversy. At GDC, women who speak about games are much more likely to be speaking about technical issues like constraint systems and procedural content generation than they are about social issues - and men are as likely as women to speak about women's issues or the treatment of other minorities.

These issues are important issues - but they're not big issues. Out of a hundred books in the conference bookstore, perhaps a dozen were on social issues, and only two of those dealt with women's culture or alternative culture. But traditional games are going strong - and are getting bigger and better and brighter and more vibrant as time goes along.

People like the games they like, and developers build them. No-one is threatened by the appearance of a game that breaks traditional stereotypes. No-one imagines that popular games that appeal to men are going to go away. All we really care about is making it fun, making it believable, and finishing it in a reasonable time on something approximating a reasonable budget.

Look, I get it: change is scary. And not just emotionally; these issues run deep. At a crowd simulation talk today, a researcher showed that you can mathematically measure a person's discomfort navigating in crowds - and showed a very realistic-looking behavior where a single character facing a phalanx of oncoming agents turned tail and ran away.

But this wasn't an act of fear; it was an act of avoidance. The appearance of an onrushing wall of people made that straightforward algorithm, designed to prove to the agent that it wouldn't run into trouble, choose a path that went the other way. An agent with more options to act might have chosen to lash out - to try to force a path.

But none of that was necessary. A slightly more sophisticated algorithm, based on study of actual human crowd behavior, showed that a bolder strategy worked just as well: go forward into space that slightly risks collision, avoiding a bit harder only if people get too close. An agent using it was easily able to wade through the phalanx - and the phalanx smoothly moved around it.

The point is that many humans don't want to run into things that are different. If the oncoming change is big enough, the simplest path may involve turning tail and running away - and if you don't want to run away, you might want to lash out. But it isn't necessary. Step forward with confidence moving towards the things that you want, and people will make space for you.

Yes, change is coming.

But change won't stop game developers from making games aimed at every demographic of fun. Chill out.

-the Centaur

P.S. Yes, it is a bit ridiculous to refer to a crowd avoidance algorithm that can mathematically prove that it avoids collision as "simple", and it's debatable whether that system, ORCA, which is based on linear programming over a simplification of velocity obstacles, is really "simpler" than the TTC force method based on combining goal acceleration with avoidance forces derived from a discomfort energy gradient defined within a velocity obstacle. For the sake of this anecdote, ORCA shows slightly "simpler" behavior than TTC, because ORCA's play-it-safe strategy causes it to avoid areas of velocity space that TTC will try, leading to slightly more "sophisticated" crowd behaviors emerging naturally in TTC based systems. Look up http://motion.cs.umn.edu/PowerLaw if you want more information - this is an anecdote tortured into an extended metaphor, not a tutorial.
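P.P.S. For the curious, the heart of the power-law model is an interaction energy that blows up as the predicted time to collision approaches zero and fades for collisions far in the future. A toy sketch of that idea - my constants here are illustrative, not the paper's fitted values:

```python
import math

def time_to_collision(px, py, vx, vy, radius_sum):
    """Earliest time two disc agents will touch, or inf if they won't.
    (px, py) is relative position, (vx, vy) relative velocity."""
    a = vx * vx + vy * vy
    b = px * vx + py * vy
    c = px * px + py * py - radius_sum ** 2
    if a == 0:
        return math.inf
    disc = b * b - a * c
    if disc <= 0:
        return math.inf  # paths never bring the discs into contact
    t = (-b - math.sqrt(disc)) / a
    return t if t > 0 else math.inf

def discomfort_energy(tau, k=1.5, tau0=3.0):
    """Power-law discomfort: ~1/tau^2, with an exponential cutoff so
    collisions far in the future barely register."""
    if math.isinf(tau):
        return 0.0
    return (k / tau ** 2) * math.exp(-tau / tau0)

# Head-on approach: agent 2 m away, closing at 1 m/s, radii 0.3 m each.
tau = time_to_collision(2.0, 0.0, -1.0, 0.0, 0.6)
print(round(tau, 2))                                          # → 1.4
print(discomfort_energy(tau) > discomfort_energy(tau + 1.0))  # → True
```

An avoidance force comes from descending the gradient of that energy in velocity space - which is why a TTC agent will tolerate a slightly risky forward velocity while an ORCA agent rules it out entirely.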

Working Hard, Working Smart

centaur 0

stress-relieving-doodle.png

I had planned to post a bit about my work on the editing of LIQUID FIRE, but this image in my Google+ photo stream caught my eye first, so you get a bit of opinionating about work instead.

Previously, I've blogged about working just a little bit harder than you want to (here, and here), the gist being that you don't need to work yourself to death, but success often comes just after the point where you want to give up.

But how do you keep yourself working when you want to quit?

One trick I've used since my days interning with Yamaha in Japan is an afternoon walk. Working on a difficult problem often makes you want to quit, but a short stint out in the fresh air can clear the decks.

Other people use exercise for the same purpose, but that takes such a large chunk out of my day that I can't afford to do it - I work four jobs (at my employer, on writing, at a small press and on comics) and need to be working at work, damnit.

But work sometimes needs to bleed out of its confines. I've found that giving work a little bit extra - checking your calendar before you go to bed, making yourself available for videoconferences at odd hours for those overseas - really helps.

One way that helps is to read about work outside of work. What I do frequently pushes the boundaries of my knowledge, and naturally, you need to read up on things at work in order to make progress.

But you also know the general areas of your work, and can proactively read ahead in areas that you think you'll be working on. So I've been reading on programming languages and source control systems and artificial intelligence outside work.

Now, not everyone reads at lunch, dinner, coffee and just before bedtime - maybe that's just me - but after I committed to starting my lunch reading with a section of a book that helped at work, all of my work started going faster and faster.

Other tricks you can use are playing music, especially with noise canceling headphones so you can concentrate - I find lyric-free music helps, but your mileage may vary. (I often listen to horror movie music at work, so I know mileage varies).

Another thing you can do, schedule permitting, is taking a week out to sharpen the saw and eliminate blockers in your common tools so everything goes faster. I recently started documenting this when I did it and that helped too.

One more thing you can try is inverse procrastination - cheat on one project you really need to do by working on another project you really need to do. You use different resources on different projects, and switching gears can feel like taking a break.

Quitting time is another technique; I often make a reservation at a nice restaurant at the end of my workday, and use the promise of going out to dinner to both motivate me to work efficiently and as a reward for a job well done (I tell myself).

Some people use caffeine to power through this - and sometimes I even describe myself as a caffeine powered developer - but I've seen a developer stop in shock at their trembling hands, so beware stimulants. But at quitting time? That hits the spot.

20150201_212020.jpg

Oh, and the last thing? Use a different channel. My wife is a painter … and listens to audiobooks ten to twelve hours a day. I'm a writer and programmer … so I doodle. Find a way to keep yourself engaged and going … just a little bit longer.

-the Centaur

Word! What are you DOING?

centaur 0

Screenshot 2013-11-03 15.36.00.png

I love Microsoft Word, but when I cut and pasted that excerpt from MAROONED into Ecto and published, I noticed a huge blank gap at the beginning of the quoted passage. When I looked in Ecto's raw text editor to see what was the matter, I found 336 lines of gunk injected by Microsoft Word … a massive amount of non-printable goop like this:

<!--[if gte mso 9]><xml>

<o:DocumentProperties>

<o:Revision>0</o:Revision>

<o:TotalTime>0</o:TotalTime>

<o:Pages>1</o:Pages>

<o:Words>246</o:Words>

<o:Characters>1183</o:Characters>

<o:Company>Xivagent Scientific Consulting</o:Company>

<o:Lines>18</o:Lines>

<o:Paragraphs>11</o:Paragraphs>

<o:CharactersWithSpaces>1418</o:CharactersWithSpaces>

<o:Version>14.0</o:Version>

</o:DocumentProperties>

</xml><![endif]-->

<!--[if gte mso 9]><xml>

<w:WordDocument>

...

This is apparently XML which captures the formatting of the Word document it came from, somehow pasted into the HTML document. As you may or may not be able to see from the screenshot above, but should definitely be able to see in what I quoted above, for 1,183 characters of text Word injected 17,961 bytes of formatting: 300+ lines for 200+ words. Oy, vey. All I wanted was an excerpt without having to go manually recreate all my line breaks …
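For what it's worth, the goop is regular enough to strip mechanically. A quick sketch, assuming the junk always arrives as `mso` conditional comments like the ones above (real pasted HTML can also carry other debris this doesn't touch):

```python
import re

# Word wraps its pasted-in formatting gunk in conditional comments of
# the form <!--[if gte mso 9]> ... <![endif]-->; match them lazily so
# each block is removed on its own.
MSO_BLOCK = re.compile(r"<!--\[if gte mso \d+\]>.*?<!\[endif\]-->", re.DOTALL)

def strip_word_gunk(html):
    """Remove Word's conditional-comment blocks from pasted HTML."""
    return MSO_BLOCK.sub("", html)

pasted = ("<p>Real text</p>"
          "<!--[if gte mso 9]><xml><o:Words>246</o:Words></xml><![endif]-->"
          "<p>More</p>")
print(strip_word_gunk(pasted))  # → <p>Real text</p><p>More</p>
```

Of course, the person pasting an excerpt into their blog editor shouldn't have to know any of this - which is rather the point of this post.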

I understand this lets you paste complex formatting between programs, I get that, and actually the problem might be Ecto taking too much rather than Word giving too much. Or perhaps it's just a mismatch of specifications. But I know HTML, Word, Ecto, and many other blogging platforms. What is someone who doesn't know all that supposed to do? Just suffer when their application programs get all weird on them and they don't know why?

Sigh. I'm not really complaining here, but it's just amusing, after a fashion.

-Anthony

Treat Problems as Opportunities

centaur 0

Treat Problems v1.png

Recently I had a setback. Doesn't matter what on; setbacks happen. Sometimes they're on things outside your control: if a meteor smacks the Earth and the tidal wave is on its way to you, well, you're out of luck buddy.

But sometimes it only seems like a tidal wave about to wipe out all life. Suppose your party has lost the election. Your vote didn't stop it. You feel powerless - but you're not. You can vote. You can argue. You can volunteer. Even run for office yourself.

Even then, it might be a thirty year project to get yourself or people you like elected President - but most problems aren't trying to change the leader of the free world. The reality is, most of the things that do happen to us are things we can partially control.

So the setback happens. I got upset, thinking about this misfortune. I try to look closely at situations and to honestly blame myself for everything that went wrong. By honestly blame, I mean to look for my mistakes, but not exaggerate their impact.

In this case, at first, I thought I saw many things I did wrong, but the more I looked, the more I realized that most of what I did was right, and only a few of them were wrong, and they didn't account for all the bad things that had happened beyond my control.

Then I realized: what if I treated those bad things as actual problems?

A disaster is something bad that happens. A problem is a situation that can be fixed. A situation that has a solution. At work, and in writing, I'm constantly trying to come up with solutions to problems, solutions which sometimes must be very creative.

"Treat setbacks as problems," I thought. "Don't complain about them (ok, maybe do) but think about how you can fix them." Of course, sometimes the specific problems are unfixable: the code failed in production, the story was badly reviewed. Too late.

That's when the second idea comes in: what if you treated problems as opportunities to better your skills?

An opportunity is a situation you can build on. At work, and in writing, I try to develop better and better skills to solve problems, be it in prose, code, organization, or self-management. And once you know a problem can happen, you can build skills to fix it.

So I came up with a few mantras: "Take Problems as Opportunities" and "Accept Setbacks as Problems" were a couple of them that I wrote down (and don't have the others on me). But I was so inspired I put together a little inspirational poster.

I don't yet know how to turn this setback into a triumph. But I do know what kinds of problems caused it, and those are all opportunities for me to learn new skills to try to keep this setback from happening again. Time to get to it.

-Anthony

Pictured: me on a ridge of rock, under my very own motivational poster.

P.S. Now that I've posted this, I see I'm not the first to come up with this phrase. Great minds think alike!

The Centaur’s Guide to the Game Developers Conference

centaur 1

gdc2013logo.png

Once again it’s time for GDC, the Game Developers Conference. This annual kickstart to my computational creativity is held in the Moscone Center in San Francisco, CA and attracts roughly twenty thousand developers from all over the world.

I’m interested primarily in artificial intelligence for computer games– “Game AI” – and in the past few years they’ve had an AI Summit where game AI programmers can get together to hear neat talks about progress in the field.

Coming from an Academic AI background, what I like about Game AI is that it can’t not work. The AI for a game must work, come hell or high water. It doesn’t need to be principled. It doesn’t need to be real. It can be a random number generator. But it needs to appear to work—it has to affect gameplay, and users have to notice it.

gdc2013aisummit.png

That having been said, there are an enormous number of things getting standard in game artificial intelligence – agents and their properties, actions and decision algorithms, pathfinding and visibility, multiple agent interactions, animation and intent communication, and so forth – and they’re getting better all the time.

I know this is what I’m interested in, so I go to the AI Summit on Monday and Tuesday, some subset of the AI Roundtables, other programming, animation, and tooling talks, and if I can make it, the AI Programmer’s Dinner on Friday night. But if game AI isn’t your bag, what should you do? What should you see?

gdc2013people.png

If you haven’t been before, GDC can be overwhelming. Obviously, try to go to talks that you like, but how do you navigate this enormous complex in downtown San Francisco? I’ve blogged about this before, but it’s worth a refresher. Here are a few tips that I’ve found improve my experience.

Get your stuff done before you arrive. There is a LOT to see at GDC, and every year it seems that a last minute videoconference bleeds over into some talk that I want to see, or some programming task bumps the timeslot I set aside for a blogpost, or a writing task does the same. Try to get this stuff done before you arrive.

Build a schedule before the conference. You’ll change your mind the day of, but GDC has a great schedule builder that lets you quickly and easily find candidate talks. Use it, email yourself a copy, print one out, save a PDF, whatever. It will help you know where you need to go.

Get a nearby hotel. The 5th and Minna Garage near GDC is very convenient, but driving there, even just in the City, is a pain. GDC hotel blocks are booked up several months in advance, but if you hunt on Expedia or your favorite aggregator you might find something. Read the reviews carefully and doublecheck with Yelp so you don’t get bedbugs or mugged.

Check in the day before. Stuff starts really early, so if you want to get to early talks, don’t even bother to fly in the same day. I know this seems obvious, but this isn’t a conference that starts at 5pm on the first day with a reception. The first content-filled talks start at 10am on Monday. Challenge mode: you can check in Sunday if you arrive early enough.

mozcafe.png

Leave early, find breakfast. Some people don’t care about food, and there are snacks onsite. Grab a croissant and cola, or banana and coffee, or whatever. But if you power-up via a good hot breakfast, there are a number of great places to eat nearby – the splendiferous Mo’z Café and the greasy spoon Mel’s leap to mind, but hey, Yelp. A sea of GDC people will be there, and you’ll have the opportunity to network, peoplewatch, and go through your schedule again, even if you don’t find someone to strike up a conversation with.

Ask people who’ve been before what they recommend. This post got started when I left early, got breakfast at Mo’z, and then let some random dude sit down on the table opposite me because the place was too crowded. He didn’t want to disturb my reading, but we talked anyway, and he admitted: “I’ve never been before? What do I do?” Well, I gave him some advice … and then packaged it up into this blogpost. (And this one.)

Network, network, network. Bring business cards. (I am so bad at this!) Take business cards. Introduce yourself to people (but don’t be pushy). Ask what they’re up to. Even if you are looking for a job, you’re not looking for a job: you want people to get to know you first before you stick your hand out. Even if you’re not really looking for a job, you are really looking for a job, three, five or ten years later. I got hired into the Search Engine that Starts with a G from GDC … and I wasn’t even looking.

Learn, learn, learn. Find talks that look like they may answer questions related to problems that you have in your job. Find talks that look directly related to your job. Find talks that look vaguely related to your job. Comb the Expo floor looking for booths that have information even remotely related to your job. Scour the GDC Bookstore for books on anything interesting – but while you’re here: learn, learn, learn.

gdc2013expofloor.png

Leave early if you want lunch or dinner. If you don’t care about a quiet lunch, or you’ve got a group of friends you want to hang with, or colleagues you need to meet with, or have found some people you want to talk to, go with the flow, and feel comfortable using your 30 minute wait to network. But if you’re a harried, slightly antisocial writer with not enough hours in the day needing to work on his or her writing projects aaa aaa they’re chasing me, then leave about 10 minutes before the lunch or dinner rush to find dinner. Nearby places just off the beaten path like the enormous Chevy’s or the slightly farther ’wichcraft are your friends.

Find groups or parties or events to go to. I usually have an already booked schedule, but there are many evening parties. Roundtables break up with people heading to lunch or dinner. There may be guilds or groups or clubs or societies relating to your particular area; find them, and find out where they meet or dine or party or booze. And then network.

gdc2013roundtables.png

Hit Roundtables in person; hit the GDC Vault for conflicts. There are too many talks to go to. Really. You’ll have to make sacrifices. Postmortems on classic games are great talks to go to, but pro tip: the GDC Roundtables, where seasoned pros jam with novices trying to answer their questions, are not generally recorded. All other talks usually end up on the GDC Vault, a collection of online recordings of all past sessions, which is expensive unless you…

Get an All Access Pass. Yes, it is expensive. Maybe your company will pay for it; maybe it won’t. But if you really are interested in game development, it’s totally worth it. Bonus: if you come back from year to year, you can get an Alumni discount if you order early. Double bonus: it comes with a GDC Vault subscription.

gdc2013chevys.png

Don’t Commit to Every Talk. There are too many talks to go to. Really. You’ll have to make sacrifices. Make sure you hit the Expo floor. Make sure you meet with friends. Make sure you make an effort to find some friends. Make time to see some of San Francisco. Don’t wear yourself out: go to as much as you can, then soak the rest of it in. Give yourself a breather. Give yourself an extra ten minutes between talks. Heck, leave a talk if you have to if it isn’t panning out, and find a more interesting one.

Get out of your comfort zone. If you’re a programmer, go to a design talk. If you’re a designer, go to a programming talk. Both of you could probably benefit from sitting in on an audio or animation talk, or to get more details about production. What did I say about learn, learn, learn?

Most importantly, have fun. Games are about fun. Producing them can be hard work, but GDC should not feel like work. It should feel like a grand adventure, where you explore parts of the game development experience you haven’t before, an experience of discovery where you recharge your batteries, reconnect with your field, and return home eager to start coding games once again.

-the Centaur

Pictured: The GDC North Hall staircase, with the mammoth holographic projected GDC logo hovering over it. Note: there is no mammoth holographic projected logo. After that, breakfast at Mo'z, the Expo floor, the Roundtables, and lunch at Chevy's.

An open letter to people who do presentations

centaur 0

presentations.png

I’ve seen many presentations that work: presentations with a few slides, with many slides, with no slides. Presentations with text-heavy slides, with image-heavy slides, with a few bullet points, even hand scrawled. Presentations done almost entirely by a sequence of demos; presentations given off the cuff sans microphone.

But there are a lot of things that don’t work in presentations, and I think it comes down to one root problem: presenters don’t realize they are not their audience. You should know, as a presenter, that you aren’t your audience: you’re presenting, they’re listening, you know what you’re going to say, they don’t.

But recently, I’ve had evidence otherwise. Presenters who seem to think you know what they’re thinking. Presenters who seem to think you have access to their slides. Presenters who seem to think you are in on every private joke they tell. Presenters who seem to think not only that you are standing on the podium with them, but that you are like them in every way – and like them as well.

Look, let’s be honest. Everyone is unique, and as a presenter, you’re more unique than everyone else. [unique |yo͞oˈnēk| adj, def (2): distinctive, remarkable, special, or unusual: a person unique enough to give him a microphone for forty-five minutes]. So your audience is not like you — or they wouldn’t have given you a podium. The room before that podium is filled with people all different from you.

How are they different?

  • First off, they don’t have your slides. Fine, you can show them to them. But they haven’t read your slides. They don’t know what’s on your slides. They can’t read them as fast as you can flip through them. Heck, you can’t read them as fast as you can flip through them. You have to give the audience time to read your slides.

  • Second, they don’t know what you know. They can’t read slides which are elliptical and don’t get to the point. They can’t read details printed only in your slide notes. They can’t read details only on your web site. The only thing they get is what you say and show. If you don’t say it or show it, the audience won’t know it.
  • Third, they probably don’t know you. But that’s not an excuse to pour your heart and soul into your presentation. It’s especially not a reason to pour your heart and soul into your bio slide. Your audience does not want to get to know you; they want to know what you know. That’s a reason to pour into your presentation what they came to hear.
  • Fourth, your audience may not even like you. That’s not your fault: they probably don’t know you. But that’s not an excuse to sacrifice content for long, drawn out, extended jokes. Your audience isn’t there to be entertained by you. We call that standup. Humor is an important part of presentations, but only as a balanced part. We don’t call a pile of sugar a meal; we call it an invitation to hyperglycemic shock.
  • Fifth, your audience came to see other people than you. You showed up to give your presentation; they came to see a sequence of them. So, after following a too-fast presentation where the previous too-fast presenter popped up a link to his slide notes, please, for the love of G*d, don’t hop up on stage and immediately slap up your detailed bio slide before we’ve had time to write down the tiny URL.

Look, I don’t want to throw a lot of rules at you. I know some people say “no more than 3 bullets per slide, no more than 1 slide per 2 minutes” but I’ve seen Scott McCloud give a talk with maybe triple that density, and his daughter Sky McCloud is even faster and better. There are no rules. Just use common sense.

  • Don’t jam a 45 minute talk into 25 minutes. Cut something out.
  • Don’t have a 10 minute funny video at a technical conference. Cut it in half.
  • Don’t leap up on stage to show your bio slide before the previous presenter is done talking. Wait for people to write down the slides.
  • Don’t “let the audience drive the talk with questions.” They came to hear your efforts to distill your wisdom, not to hear your off-the-cuff answers to irrelevant questions from the audience.
  • Don’t end without leaving time for questions. Who knows, you may have made a mistake.

Ok. That’s off my chest.

Now to dive back into the fray…

-the Centaur

Pictured: A slide from ... axually a pretty good talk at GDC, not one of the ones that prompted the letter above.

Back to the Future with the Old Reader

centaur 0

theoldreader.png

As I mentioned in a previous post, Google Reader is going away. If you don't use RSS feeds, this service may be mystifying to you, but think of it this way: imagine, instead of getting a bunch of Facebook, Google+ or Twitter randomized micro-posts, you could get a steady stream of high-quality articles just from the people you like and admire? Yeah. RSS. It's like that.

So anyway, the Reader shutdown. I have a lot of thoughts about that, as do many other people, but the first one is: what the heck do I do? I use Reader on average about seven times a day. I'm certainly not going to hope Google change their minds, and even if they do, my trust is gone. Fortunately, there are a number of alternatives, which people have blogged about here and here.

The one I want to report on today is The Old Reader, the first one I tried. AWESOME. In more detail, this is what I found:

  • It has most, though not all, features of Google Reader. It's got creaky corners that sometimes make it look like features are broken, but as I've dug into it, almost everything is there and works pretty great.
  • It was able to import all the feeds I exported via Google Takeout. Their servers are pretty slow, so it actually took a few days, and they did it in two passes. But they sent me an email when it was done, and they got everything.
  • The team is insanely responsive. They're just three guys - but when I found a problem with the Add Subscription button, they fixed it in just a couple of days. Amazing. More responsive than other companies I know.

There are drawbacks, most notably: they don't yet have an equivalent for Google Takeout's OPML export. But, they are only three guys. They just started taking money, which is a good sign that they might stay around. Here's hoping they are able to build a business on this, and that they have the same commitment to openness that Google had.

I plan to try other feed readers, as I can't be trapped into one product as I was before, but kudos to The Old Reader team for quickly and painlessly rescuing me from the First Great Internet Apocalypse of 2013. I feel like I'm just using Reader, except now I have a warm fuzzy that my beloved service isn't going to get neglected until it withers away.

-the Centaur

Context-Directed Spreading Activation

centaur 0
netsphere.png

Let me be completely up front about my motivation for writing this post: recently, I came across a paper which was similar to the work in my PhD thesis, but applied to a different area. The paper didn’t cite my work – in fact, its survey of related work in the area seemed to indicate that no prior work along the lines of mine existed – and when I alerted the authors to the omission, they informed me they’d cited all relevant work, and claimed “my obscure dissertation probably wasn’t relevant.” Clearly, I haven’t done a good enough job articulating or promoting my work, so I thought I should take a moment to explain what I did for my doctoral dissertation.

My research improved computer memory by modeling it after human memory. People remember different things in different contexts based on how different pieces of information are connected to one another. Even a word as simple as ‘ford’ can call different things to mind depending on whether you’ve bought a popular brand of car, watched the credits of an Indiana Jones movie, or tried to cross the shallow part of a river. Based on that human phenomenon, I built a memory retrieval engine that used context to remember relevant things more quickly.

My approach was based on a technique I called context-directed spreading activation, which I argued was an advance over so-called “traditional” spreading activation. Spreading activation is a technique for finding information in a kind of computer memory called a semantic network, which models relationships in the human mind. A semantic network represents knowledge as a graph, with concepts as nodes and relationships between concepts as links, and traditional spreading activation finds information in that network by starting with a set of “query” nodes and propagating “activation” out on the links, like current in an electric circuit. The current that hits each node in the network determines how highly ranked the node is for a query.
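As a toy sketch (my own illustrative code, not the thesis implementation), traditional spreading activation over an untyped network looks something like this:

```python
from collections import defaultdict

def spreading_activation(links, query_nodes, steps=2, decay=0.3):
    """Rank nodes by propagating activation out from the query nodes
    along links, attenuating at each hop (illustrative sketch only)."""
    activation = defaultdict(float)
    for node in query_nodes:
        activation[node] = 1.0
    for _ in range(steps):
        spread = defaultdict(float)
        # Every link carries activation, regardless of context.
        for node, level in list(activation.items()):
            for neighbor in links.get(node, ()):
                spread[neighbor] += decay * level
        for node, extra in spread.items():
            activation[node] += extra
    return sorted(activation.items(), key=lambda kv: -kv[1])

# A toy network around the ambiguous word "ford":
network = {
    "ford": ["car", "harrison_ford", "river_crossing"],
    "car": ["engine"],
    "harrison_ford": ["indiana_jones"],
}
ranking = spreading_activation(network, ["ford"])
```

Notice that activation reaches every neighborhood of "ford" indiscriminately: cars, movies, and rivers alike. That indiscriminate fan-out is exactly the cost that blows up as the network grows.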
(If you understand circuits and spreading activation, and this description caused you to catch on fire, my apologies. I’ll be more precise in future blog posts. Roll with it.)

The problem is, as semantic networks grow large, there’s a heck of a lot of activation to propagate. My approach, context-directed spreading activation (CDSA), cuts this cost dramatically by making activation propagate over fewer types of links. In CDSA, each link has a type, each type has a node, and activation propagates only over links whose nodes are active (to a very rough first approximation, although in my evaluations I tested about every variant of this under the sun).

Propagating over active links isn’t just cheaper than spreading activation over every link; it’s smarter: the same “query” nodes can activate different parts of the network, depending on which “context” nodes are active. So, if you design your network right, Harrison Ford is never going to occur to you if you’ve been thinking about cars.

I was a typical graduate student, and I thought my approach was so good, it was good for everything—so I built an entire cognitive architecture around the idea. (Cognitive architectures are general reasoning systems, normally built by teams of researchers, and building even a small one is part of the reason my PhD thesis took ten years, but I digress.)

My cognitive architecture was called context-sensitive asynchronous memory (CSAM), and it automatically collected context while the system was thinking, fed it into the context-directed spreading activation system, and incorporated dynamically remembered information into its ongoing thought processes using patch programs called integration mechanisms. CSAM wasn’t just an idea: I built it out into a computer program called Nicole, and even published a workshop paper on it in 1997 called “Can Your Architecture Do This?
A Proposal for Impasse-Driven Asynchronous Memory Retrieval and Integration.”

But to get a PhD in artificial intelligence, you need more than a clever idea you’ve written up in a paper or implemented in a computer program. You need to use the program you’ve written to answer a scientific question. You need to show that your system works in the domains you claim it works in, that it can solve the problems that you claim it can solve, and that it’s better than other approaches, if other approaches exist.

So I tested Nicole on computer planning systems and showed that integration mechanisms worked. Then a colleague and I tested Nicole on a natural language understanding program and showed that memory retrieval worked. But the most important part was showing that CDSA, the heart of the theory, didn’t just work, but was better than the alternatives. I did a detailed analysis of the theory of CDSA and showed it was better than traditional spreading activation in several ways—but that rightly wasn’t enough for my committee. They wanted an example. There were alternatives to my approach, and they wanted to see that my approach was better than the alternatives for real problems.

So I turned Nicole into an information retrieval system called IRIA—the Information Retrieval Intelligent Assistant. By this time, the dot-com boom was in full swing, and my thesis advisor invited me and another graduate student to join him in starting a company called Enkia. We tried many different concepts to start with, but the further we went, the more IRIA seemed to have legs. We showed she could recommend useful information to people while browsing the Internet. We showed several people could use her at the same time and get useful feedback. And critically, we showed that by using context-directed spreading activation, IRIA could retrieve better information faster than traditional spreading activation approaches.
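To make the contrast concrete, here is a toy sketch of the context-directed idea (again my own simplified illustration, not the Nicole implementation): links carry types, and activation crosses a link only when that link type's context node is active.

```python
from collections import defaultdict

def cdsa(typed_links, query_nodes, active_contexts, steps=2, decay=0.3):
    """Context-directed spreading activation, to a very rough first
    approximation: each link has a type, and activation crosses a link
    only when that type is among the currently active context nodes."""
    activation = defaultdict(float)
    for node in query_nodes:
        activation[node] = 1.0
    for _ in range(steps):
        spread = defaultdict(float)
        for src, link_type, dst in typed_links:
            if link_type in active_contexts:  # inactive links cost nothing
                spread[dst] += decay * activation[src]
        for node, extra in spread.items():
            activation[node] += extra
    return {n: a for n, a in activation.items() if a > 0}

# The same "ford" query activates different regions in different contexts:
links = [
    ("ford", "vehicles", "car"),
    ("ford", "movies", "harrison_ford"),
    ("ford", "geography", "river_crossing"),
]
thinking_about_cars = cdsa(links, ["ford"], {"vehicles"})
```

With the "vehicles" context active, Harrison Ford never receives any activation at all, rather than merely being ranked lower, and the links that are never traversed are never paid for.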
The first publication on IRIA came out in 2000, shortly before I finished my PhD, and at the company things were going gangbusters. We found customers for the idea, and my more experienced colleagues and I turned the IRIA program from a typical graduate student mess into a more disciplined and efficient system called the Enkion, a process we documented in a paper in early 2001. We even launched a search site called Search Orbit—and then the whole dot-com disaster happened, and the company essentially imploded.

Actually, that’s not fair: the company continued for many years after I left—but I essentially imploded, and if you want to know more about that, read “Approaching 33, as Seen from 44.” Regardless, the upshot is that I didn’t follow up on my thesis work after I finished my PhD. That happens to a lot of PhD students, but for me in particular I felt that it would have been betraying the trust of my colleagues to go publish a sequence of papers on the innards of a program they were trying to use to run their business. Eventually, they moved on to new software, but by that time, so had I.

Fast forward to 2012, and while researching an unrelated problem for The Search Engine That Starts With A G, I came across the 2006 paper “Recommending in context: A spreading activation model that is independent of the type of recommender system and its contents” by Alexander Kovács and Haruki Ueno. At Enkia, we’d thought of doing recommender systems on top of the Enkion, and had even started to build a prototype for Emory University, but the idea never took off and we never generated any publications, so at first, I was pleased to see someone doing spreading activation work in recommender systems. Then I was unnerved to see that this approach also involved spreading activation, over a typed network, with nodes representing the types of links, and activation in the type nodes changing the way activation propagated over the links.
Then I was unsettled to see that my work, which is based on a similar idea and predates their publication by almost a decade, was not cited in the paper. Then I was actually disturbed when I read: “The details of spreading activation networks in the literature differ considerably. However, they’re all equal with respect to how they handle context … context nodes do not modulate links at all…” If you were to take that at face value, the work that I did over ten years of my life—work which produced four papers, a PhD thesis, and at one point helped employ thirty people—did not exist.

Now, I was also surprised by some spooky similarities between their system and mine—their system is built on a context-directed spreading activation model, mine is a context-directed spreading activation model, theirs is called CASAN, mine is embedded in a system called CSAM—but as far as I can see there’s NO evidence that their work was derivative of mine. As Chris Atkinson said to a friend of mine (paraphrased): “The great beam of intelligence is more like a shotgun: good ideas land on lots of people all over the world—not just on you.”

In fact, I’d argue that their work is a real advance for the field. Their model is similar, not identical, and their mathematical formalism uses more contemporary matrix algebra, making the relationship to related approaches like PageRank more clear (see Google Page Rank and Beyond). Plus, they apparently got their approach to work on recommender systems, which we did not; IRIA did more straight-up recommendation of information in traditional information retrieval, which is a similar but not identical problem.

So Kovács and Ueno’s “Recommending in Context” paper is a great paper and you should read it if you’re into this kind of stuff. But, to set the record straight, and maybe to be a little bit petty, there are a number of spreading activation systems that do use context to modulate links in the network … most notably mine.
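As an aside, the reason a matrix formalism connects spreading activation to PageRank so cleanly is that both boil down to repeatedly multiplying an activation vector by a weighted adjacency matrix. A toy sketch of that view (my own made-up three-node network, not their model):

```python
import numpy as np

# Toy network: node 0 = "ford", 1 = "car", 2 = "harrison_ford".
# A[i, j] is the weight of the link from node j to node i.
A = np.array([
    [0.0, 0.0, 0.0],
    [0.3, 0.0, 0.0],
    [0.3, 0.0, 0.0],
])

query = np.array([1.0, 0.0, 0.0])  # activation injected at "ford"
activation = query.copy()
for _ in range(2):
    # Propagate along links, then re-inject the query: structurally
    # the same power iteration that sits at the heart of PageRank.
    activation = A @ activation + query
```

Context direction amounts to zeroing out (or reweighting) the matrix entries for links whose types are inactive before iterating.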
-the Centaur

Pictured: a tiny chunk of the WordNet online dictionary, which I’m using as a proxy for a semantic network. Data processing by me in Python, graph representation by the GraphViz suite’s dot program, and postprocessing by me in Adobe Photoshop.

A Ray of Hoops

centaur 0

rayofhope.png

So, after my scare over almost losing 150+ files on Google Drive, I've made some progress on integrating Google Drive and Dropbox using cloudHQ. The reason it wasn't completely seamless is that I use both Google Drive and Dropbox on my primary personal laptop, and cannot afford to have two copies of all files on this one machine. The other half of this problem is that if you only set up partial sync of certain folders, then any new files added to the top folder of Google Drive or Dropbox won't get replicated - and believe it or not, that's already happened to me. So I need a "reliable scheme" I can count on.

The solution? Set up a master folder on Google Drive called "Replicated", in which everything that I want to keep - all my Google Docs, in particular - will get copied to a folder of the same name in Dropbox. For good measure, set up another replication pair for the Shared folder of Google Drive. The remaining files, all the Pictures I've stored because of Google Drive's great bang-for-the-buck storage deal, don't need to be replicated here.

The reason this works is that if you obey the simple anal-retentive policy of creating all your Google Docs within a named folder, and you put all your named folders under Replicated, then they all automatically get copied to Dropbox as documents. I've even seen it in action, as I edit Google Docs and Dropbox informs me that new copies of documents in Microsoft Word .docx format are appearing in my drive. Success!
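Since the whole scheme depends on nothing ever living outside those folders, a little sanity check helps. Here is a hypothetical sketch (the folder names are just my setup, and this scans the locally synced Drive folder; it's not a cloudHQ API):

```python
from pathlib import Path

def unreplicated_items(drive_root, covered=("Replicated", "Shared", "Pictures")):
    """Return top-level items in a locally synced Google Drive folder
    that fall outside the folders covered by a replication pair (or
    deliberately excluded, like Pictures)."""
    root = Path(drive_root)
    return sorted(p.name for p in root.iterdir() if p.name not in covered)
```

Anything this returns is a file or folder that quietly isn't being backed up to Dropbox, which is exactly the failure mode that bit me.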

At last, I've found a way to reliably use the Google Drive cloud. Google doesn't always support the features you want, or the patterns of usage that you want, but they're deeply committed to open APIs, to data liberation, and to the creation of third-party applications that enable you to fill the gaps in Google's services so that you aren't locked into one solution.

Breaking News: Google Reader canceled. G*d dammit, Google…

Next up: after my scare of losing Google Reader, a report on my progress using The Old Reader to rescue my feeds...

-the Centaur

Pictured: A table candle at Cascal's in Mountain View, Ca...