Tag Archive for microsoft

The Simple Exchange of Please and Thank You

I’d like to make a request of all you personal AIssistant programmers, you engineers at Apple, Google, Microsoft, all of you who are responsible for iterating on human/AI exchanges.

I’d like to be able to say please and thank you to my voice-controlled computing.

It seems like a minor thing, doesn’t it?  A quaint nicety falling by the wayside in the pursuit of one more step towards the Singularity.  But what you are forgetting, my engineers, is that while you are training your AIs to talk to us, those AIs are training us to talk to them.

Much like cats, but with less shedding.

A request from a person often forms a sort of closed loop.  It’s a format we learn, something that most cultures have.  An In, a Confirmation, a Request, a Confirmation and an Out.  To your average human, this feels complete.  In fact, interrupting this sequence feels rude.  Failing to complete this sequence just leaves one feeling uncomfortable, the same kind of uncomfortable you get when someone fails to say “goodbye” before they hang up the phone.  Depending on the person or culture, this feeling can range from a mild annoyance to an offence that requires a response.

It’s not always pretty.

 

As an example, let’s say we have a diner in a restaurant, ordering a meal from an AIssistant (like Siri or Hey Google).  The interaction might go something like this:

DINER: “Hey Waiter.” (In)

WAITER: “What do you want to order?” (Confirmation)

DINER: “I would like the Salmon Mousse, please.” (Request)

WAITER: “One Salmon Mousse, coming right up.” (Confirmation)

DINER: “Thank you.” (Out)

You’ve probably had thousands of exchanges like this over the course of your lifetime.  At the end the waiter is released from the encounter by the Out and both parties are free to move on to other things.  There is a clear In and Out; nobody is left hanging, waiting for a follow-up or a new request.  In fact, you may have had an experience or two when the Waiter has left the exchange early, before the second Confirmation or before the Out.

It left you feeling a bit slighted, didn’t it?  Maybe a little confused.  Definitely not quite right, though you might not have understood why.

This type of exchange flows smoothly because we have an idea in our heads of how it will play out.  It’s comfortable, familiar.  Its successful execution triggers a feeling of satisfaction in both parties, similar to the way you feel when picking up resources in Clash of Clans or creating a cascade in Candy Crush.

With the current state of Voice Recognition Technology, this same exchange is truncated, cut short:

DINER: “Hey, Waiter?”

WAITER: “Yes?”

DINER: “I would like the Salmon Mousse, please.”

WAITER: “Salmon mousse with peas.”

And boom, you’re done.  Misunderstanding of the word please aside, there’s no Out here.  The Diner has to trust that they will get what they want.  They are left hanging and, when the Waiter delivers peas alongside the Salmon Mousse, they are frustrated, annoyed.  The exchange fails in the user’s mind, and the AIssistant is cast as unreliable.
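For the engineers in the audience, here is one way the two exchanges above could be compared in code.  This is only a minimal sketch: the names `Turn`, `CLOSED_LOOP` and `missing_steps` are hypothetical, invented for illustration, and not taken from any real voice assistant.  It assumes each turn of the conversation has already been labelled with its role.

```python
from enum import Enum


class Turn(Enum):
    """Roles a single turn can play in the exchange (hypothetical labels)."""
    IN = "in"
    CONFIRMATION = "confirmation"
    REQUEST = "request"
    OUT = "out"


# The five-step closed loop: In, Confirmation, Request, Confirmation, Out.
CLOSED_LOOP = [Turn.IN, Turn.CONFIRMATION, Turn.REQUEST,
               Turn.CONFIRMATION, Turn.OUT]


def missing_steps(exchange):
    """Return the steps of the closed loop that the exchange never reached.

    Turns are matched in order from the start; matching stops at the
    first turn that does not fit the expected sequence.
    """
    matched = 0
    for turn in exchange:
        if matched < len(CLOSED_LOOP) and turn == CLOSED_LOOP[matched]:
            matched += 1
        else:
            break
    return CLOSED_LOOP[matched:]


# The complete restaurant exchange, ending with "Thank you."
full_exchange = [Turn.IN, Turn.CONFIRMATION, Turn.REQUEST,
                 Turn.CONFIRMATION, Turn.OUT]

# The truncated exchange: the Waiter confirms and the conversation just stops.
truncated_exchange = [Turn.IN, Turn.CONFIRMATION, Turn.REQUEST,
                      Turn.CONFIRMATION]

print([step.name for step in missing_steps(full_exchange)])       # []
print([step.name for step in missing_steps(truncated_exchange)])  # ['OUT']
```

An empty list means the loop closed.  Anything left over is the part of the conversation the human was still waiting for.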

Once you’ve had a few of these sub-optimal exchanges with your AIssistant, you stop using natural language.  Every please and thank you gets dropped, because they are so often misunderstood, ignored, or the cause of a misunderstanding.  These conditioned responses, designed to get the best possible reaction from a human, become a burden when talking to an AI.  Your exchange becomes:

DINER: “Hey, Waiter. Salmon Mousse, plate, dining room, extra fork.”

WAITER: Delivers Salmon Mousse on a plate to the dining room with an extra fork.

Yikes! This is no longer a “natural language” request.  The diner has started to simply deliver a string of keywords in order to get the end result they are looking for.  The user, the human part of this equation that natural language voice recognition is specifically being designed for, has abandoned natural language entirely when talking to their AIssistant.  They have run up against the Uncanny Valley of voice and have begun treating the AIssistant like a garden variety search engine.

Which wouldn’t be a problem if it only affected the AIssistant.  In fact, it makes things run much more smoothly.  But these voice patterns tend to stick.  They backflush into the common lexicon (look at words like LOL and l33t, which have entered spoken language and are here to stay; they exist only because of the constraints of technology).  Listen to a voice message left by someone who habitually uses Voice to Text.  You’ll find they have a tendency to automatically speak their punctuation out loud, just like you need to when dictating an email or a text message.

Please and thank you cease to be the Ins and Outs of a conversation; they instead become stumbling blocks, places where your command sequence fails.  These niceties that we use to frame requests in the spoken language start to get dropped not because nobody’s teaching them, not because humans are getting ruder, but because they are being trained back out again by interaction with AIssistants that fall a bit too shy of being human.

The next step becomes complex.  Do we split language into a “conversation” and a “command” form?  Or do we end up abandoning the conversational form altogether in favor of the much more efficient (but far less communicative) string of keywords?  It will be interesting to see if we pass each other in the night, humans and AIssistants, with the human language patterns becoming even more AI friendly as the AI language recognition software gets better at handling our natural way of speaking.

Either way, please and thank you, those natural addresses that help to keep requests couched in a tidy little package, may be one of the first victims.

Old Dinosaur, New Tricks

http://gizmodo.com/its-microsoft-build-day-2-live-streaming-hot-1701224950

What the h*ll, Microsoft!

After dancing the dance of the dinosaur’s graveyard for decades now, you give us this.  The HoloLens.

There are a metric *ss-load of VR devices and Apps in the works right now.  Everyone is hunting the killer app (I think VR App companies outnumber hardware companies by, like, 20 to 1).  Everyone is hunting the one cool thing that will finally make VR and AR mainstream products.

Microsoft may have done just that.

The key difference in what Microsoft is pitching is not the one coolest game you’ll ever play (like Magic Leap’s video) or the ultra-minimal camera on ur face (the public perception of Google’s Glass).  Instead they are showing us an integrated world.  They are pitching a lifestyle, one limited to inside your home to be sure, but a functioning, useful product that integrates your screens with your life.  You have the option of attaching stuff to your walls, of having apps and objects appear and disappear in situ, rather than carrying them with you all the time.

And I think this is the big perceptual difference.  Having VR elements situationally popping into and out of existence requires a kind of constant mental engagement.  It makes you want to put the headset down and go to the kitchen for a soda, just to get a break from all the micro-attentions.  But by having those apps and objects stay static and fully integrate with the environment around you, like actual physical objects, you give the user the ability to walk away at any time, then come back to find everything where they left it.  It allows the VR to be a part of your life, rather than a novelty item.

Three Gun Monte

This is a re-post from my Gamasutra Archive.  As Halo 5 approaches, I’m working my way back through Halo 4 and I’m very interested in seeing how the gameplay has evolved over the past few years.  The change from Bungie to 343 brought about some significant changes to the way the game played.  Not all “bad” but some things that took quite a bit of adapting to.

 

*****

 

I’ve been playing Halo since the original demo at E3 many many years ago now.  Like so many of you, I’ve had the privilege of watching this IP evolve, go from being the Flagship title of the original Xbox console to a product vast enough to change the way we think about entertainment (but that’s for another post).

My first thought, out of the box, was (and you can find this on Twitter) “Holy sh*t, 343 brought their A Game…”  And I stand by that statement.  If this title were to stand alone, even without the decades of experimentation and innovation behind it, it would be worthy of the AAA rating.  I have my complaints, everyone does with a new game in a well-loved IP, but one thing sticks out to me.

There used to be this Big Three in the level design.  Within the space of a single level there would be 1.  a place where you wanted to use grenades, 2.  a place where melee combat might be best and 3. a place where you wanted a hand-held weapon.  Might be in different places in different levels, but it was consistent enough that you had to *think* as you played through the game, because these changeups would happen inside the level.  You had to be able to assess where you were, what you had to hand and how best to use that.  You had to be quick on your toes, but it made you FEEL like the best of the best if you didn’t get your *ss handed to you.  If you started slogging too much, then you’d screwed something up, missed a cue, gone in for melee when what you’d really needed was the Battle Rifle or the Needler.

Halo 4, in contrast, almost feels like a “One Level, One Perfect Weapon” game.  When you come around the corner you can look at the layout, the architecture, and you know, “okay, it’s all sniper shots from here on out”.  The combat changeup *within* the encounter spaces seems to be gone.

And I guess this is what happens when you have a new set of minds working with an old and familiar franchise.  But I can’t help but wonder if this was a conscious design decision, if 343 decided to do away with that Big Three aspect of the original in favor of this One Level, One Perfect Weapon approach, or if this simply reflects a difference in how they think an FPS ought to play.  OR, conversely (since I don’t know anyone over at Bungie or 343 to ask this of), was that Big Three a mistake?  Was it a random convergence of level design and gameplay, never intended to be the way things were supposed to be played?

I like to think, especially after hearing reports of the oodles of gameplay and focus testing that went on for the Halo franchise to keep the “fun” factor vibrant, that there has been a conscious change here (hopefully something with an awesome payout as I near the end of the single-player game) and that there is a higher concept at work that I’m just missing.  But I miss being able to make those assessments on the fly, being able to play smarter, not just with a bigger gun.