Optometrists, Octopii, Rubber Ducks & Centaurs: my talk at Design for AI, TU Delft, October 2022

I was fortunate to be invited to the wonderful (huge) campus of TU Delft earlier this year to give a talk on “Designing for AI.”

I felt a little bit more of an imposter than usual – as I’d left my role in the field nearly a year ago – but it felt like a nice opportunity to wrap up what I thought I’d learned in the last 6 years at Google Research.

Below is the recording of the talk – and my slides with speaker notes.

I’m very grateful to Phil Van Allen and Wing Man for the invitation and support. Thank you Elisa Giaccardi, Alessandro Bozzon, Dave Murray-Rust and everyone at the Faculty of Industrial Design Engineering at TU Delft for organising a wonderful event.

The excellent talks of my estimable fellow speakers – Elizabeth Churchill, Caroline Sinders and John – can be found on the event site here.


Video of Matt Jones “Designing for AI” talk at TU Delft, October 2022

Slide 1

Hello!

Slide 2

This talk is mainly a bunch of work from my recent past – the last 5/6 years at Google Research. There may be some themes connecting the dots, I hope! I’ve tried to frame them in relation to a series of metaphors that have helped me engage with the engineering and computer science at play.

Slide 3

I won’t labour the definition of metaphor or why it’s so important in opening up the space of designing AI, especially as there is a great, whole paper about that by Dave Murray-Rust and colleagues! But I thought I would race through some of the metaphors I’ve encountered and used in my work in the past.

The term AI itself is best seen as a metaphor to be translated. John Giannandrea was my “grand boss” at Google and headed up Google Research when I joined. JG’s advice to me years ago still stands me in good stead for most projects in the space…

But the first metaphor I really want to address is that of the Optometrist.

This image of my friend Phil Gyford (thanks Phil!) shows him experiencing something many of us have done – taking an eye test in one of those wonderful steampunk contraptions where the optometrist asks you to stare through different lenses at a chart, while asking “Is it better like this? Or like this?”

This comes from the ‘Optometrist Algorithm’ work by colleagues in Google Research working with nuclear fusion researchers. The AI system optimising the fusion experiments presents experimental parameter options to a human scientist, in the mode of an eye-testing optometrist: ‘better like this, or like this?’
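To make that loop concrete, here is a minimal Python sketch of the ‘better like this, or like this?’ interaction. The parameter names and the propose/ask functions are illustrative assumptions of mine, not details from the actual Optometrist Algorithm paper:

```python
import random

def propose_nearby(settings, scale=0.05):
    """Perturb each experiment parameter slightly to get a new candidate."""
    return {k: v * (1 + random.uniform(-scale, scale)) for k, v in settings.items()}

def ask_human(option_a, option_b):
    """The optometrist question: show the human the outcome of two settings
    and keep whichever they prefer. Here it is just a console prompt."""
    choice = input("Better like this (a) or like this (b)? ")
    return option_a if choice.strip().lower() == "a" else option_b

# Illustrative starting point - these parameter names are made up,
# not the real fusion experiment controls.
current = {"beam_power": 1.0, "gas_pressure": 1.0, "coil_current": 1.0}

for step in range(10):
    candidate = propose_nearby(current)
    # Run (or simulate) experiments with both settings, show the results,
    # and let the human scientist pick the better one.
    current = ask_human(current, candidate)

print(current)
```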

For me it calls to mind this famous scene of human-computer interaction: the photo enhancer in Blade Runner.

It makes the human the ineffable intuitive hero, but perhaps masking some of the uncanny superhuman properties of what the machine is doing.

The AIs are magic black boxes, but so are the humans!

Which has led me in the past to consider such AI systems as ‘magic boxes’ in larger service design patterns.

How does the human operator ‘call in’ or address the magic box?

How do teams agree it’s ‘magic box’ time?

I think this work is as important as de-mystifying the boxes!

Lais de Almeida – a past colleague at Google Health and before that DeepMind – has looked at just this in terms of the complex interactions in clinical healthcare settings through the lens of service design.

How does an AI system that can outperform human diagnosis (i.e. the retinopathy AI from DeepMind shown here) work within the expert human dynamics of the team?

My next metaphor might already be familiar to you – the centaur.

[Certainly I’ve talked about it before…!]

If you haven’t come across it:

Garry Kasparov famously took on the chess AI Deep Blue and was (narrowly) defeated.

He came away from that encounter with an idea for a new form of chess where teams of humans and AIs played against other teams of humans and AIs… dubbed ‘centaur chess’ or ‘advanced chess’

I first started investigating this metaphorical interaction around 2016 – and at that time it manifested in things like Google’s autocomplete in Gmail etc – but of course the LLM revolution has taken centaurs into new territory.

This very recent paper, for instance, looks at the use of LLMs not only in generating text but in coupling that to other models that can “operate other machines” – i.e. act in the world, and on the world, based on what is generated (on your behalf, hopefully).

And the notion of a human/AI agent team is something I looked into with colleagues in Google Research’s AIUX team for a while – in numerous projects we did under the banner of “Project Lyra”.

Rather than AI systems that a human interacts with (e.g. a cloud-based assistant as a service), this would be pairing truly-personal AI agents with human owners, working in tandem with tools/surfaces that they both use and interact with.

And I think there is something here to engage with in terms of ‘designing the AI we need’ – being conscious of when we make things that feel like ‘pedal-assist’ bikes, amplifying our abilities and reach, vs when we give power over to what political scientist David Runciman has described as the real worry: not AI but “AA” – Artificial Agency.

[nb this is interesting on that idea, also]

We worked with London-based design studio Special Projects on how we might ‘unbox’ and train a personal AI, allowing safe, playful practice space for the human and agent where it could learn preferences and boundaries in ‘co-piloting’ experiences.

For this we looked to techniques of teaching and developing ‘mastery’ to adapt into training kits that would come with your personal AI.

On the ‘pedal-assist’ side of the metaphor – the space of ‘amplification’ – I think there is also a question of embodiment in the interaction design and a tool’s “ready-to-hand”-ness. Related to ‘where the action is’ is “where the intelligence is”.

In 2016 I was at Google Research, working with a group that was pioneering techniques for on-device AI.

Moving the machine learning models and operations to a device gives great advantages in privacy and performance – but perhaps most notably in energy use.

If you process things ‘where the action is’ rather than firing up a radio to send information back and forth from the cloud, then you save a bunch of battery power…

Clips was a little autonomous camera with no viewfinder, trained out of the box to recognise what humans generally like to take pictures of, so you can be in the action. The ‘shutter’ button is just that – but also a ‘voting’ button, training the device on what YOU want pictures of.

There is a neural network onboard Clips, initially trained to look for what we think of as ‘great moments’ and capture them.

It had about 3 hours of battery life and a 120º field of view, and could be held, put down on picnic tables or clipped onto backpacks or clothing – designed so you don’t have to choose between being in the moment and capturing it. Crucially, all the photography and processing stays on the device until you decide what to do with it.

This sort of edge AI is important for performance and privacy – but also energy efficiency.

A mesh of situated “Small models loosely joined” is also a very interesting counter narrative to the current massive-model-in-the-cloud orthodoxy.

This from Pete Warden’s blog highlights the ‘difference that makes a difference’ in the physics of this approach!
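As a rough illustration of that physics, here is a back-of-envelope comparison in Python. The energy figures are assumed, order-of-magnitude placeholders of mine, not numbers taken from Pete’s post:

```python
# Back-of-envelope only: the per-operation and per-bit energies below are
# assumed, order-of-magnitude placeholders, not measured figures.
ENERGY_PER_OP_J = 1e-9         # ~1 nanojoule per arithmetic op on-device (assumed)
ENERGY_PER_RADIO_BIT_J = 1e-6  # ~1 microjoule per bit sent over a radio (assumed)

image_bits = 8 * 1_000_000     # roughly a 1 MB photo
model_ops = 50_000_000         # a smallish on-device vision model (assumed)

send_to_cloud = image_bits * ENERGY_PER_RADIO_BIT_J   # ~8 J with these assumptions
run_on_device = model_ops * ENERGY_PER_OP_J           # ~0.05 J with these assumptions

print(f"radio upload ~{send_to_cloud:.2f} J vs on-device inference ~{run_on_device:.3f} J")
```

With those assumed figures, firing up the radio to ship the raw data costs orders of magnitude more energy than processing it where it was captured.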

And I hope you agree that addressing the energy use and greenhouse-gas footprint of our work should be part of the design approach.

Another example from around 2016-2017 – the on-device “Now Playing” functionality built into Pixel phones to quickly identify music using recognisers running purely on the phone. Subsequent Pixel releases have since leaned on these approaches, with dedicated TPUs for on-device AI becoming selling points (as they have for iOS devices too!)

And as we know ourselves we are not just brains – we are bodies… we have cognition all over our body.

Our first shipping AI on-device felt almost akin to these outposts of ‘thinking’ – small, simple, useful reflexes that we can distribute around our cyborg self.

And I think this approach again is a useful counter narrative that can reveal new opportunities – rather than the centralised cloud AI model, we look to intelligence distributed about ourselves and our environment.

A related technique pioneered by the group I worked in at Google is federated learning – allowing distributed devices to train privately on their own context, then aggregating that learning to share and improve the models for all, while preserving privacy.
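Here is a minimal sketch of the federated averaging idea in Python – a toy linear model and made-up ‘clients’, purely to show the shape of the loop (real federated learning adds secure aggregation, differential privacy and much more):

```python
import numpy as np

def local_update(global_w, X, y, lr=0.1, epochs=1):
    """One device's private training step: a toy linear model trained on data
    that never leaves the device."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

def federated_round(global_w, clients):
    """Server step: each client trains locally; the server averages only the
    resulting weights, weighted by how much data each client holds."""
    total = sum(len(y) for _, y in clients)
    return sum((len(y) / total) * local_update(global_w, X, y) for X, y in clients)

# Toy example: three 'devices', each holding a private slice of data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

w = np.zeros(2)
for _ in range(50):
    w = federated_round(w, clients)
print(w)  # approaches [2, -1] without the server ever seeing the raw data
```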

This once semi-heretical approach has since become widespread practice in the industry, not just at Google.

My next metaphor builds further on this thought of distributed intelligence – the wonderful octopus!

I have always found this quote from ETH’s Bertrand Meyer inspiring… what if it’s all just knees! No ‘brains’ as such!!!

In Peter Godfrey-Smith’s recent book he explores different models of cognition and consciousness through the lens of the octopus.

What I find fascinating is the distributed, embodied (rather than centralized) model of cognition they appear to have – with most of their ‘brains’ being in their tentacles…

And moving to fiction, specifically SF – this wonderful book by Adrian Tchaikovsky depicts an advanced race of spacefaring octopi whose minds in each individual work as “three semi-autonomous but interdependent components”: an arm-driven undermind (their Reach), as opposed to the Crown of their central brain or the Guise of their skin.

I want to focus on that idea of ‘guise’ from Tchaikovsky’s book – how we might show what a learned system is ‘thinking’ on the surface of interaction.

We worked with Been Kim and Emily Reif in Google Research, who were investigating interpretability in models using a technique called concept activation vectors (TCAV: Testing with Concept Activation Vectors) – allowing subjectivities like ‘adventurousness’ to be trained into a personalised model and then drawn onto a dynamic control surface for search – a constantly reacting ‘guise’ skin that allows a kind of ‘2-player’ game between the human and their agent searching a space together.
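For the curious, here is a minimal sketch of the underlying concept-activation-vector idea in Python – not the code from our prototype, just an illustration assuming scikit-learn and activations already extracted from some layer of a model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_activation_vector(concept_acts, random_acts):
    """Fit a linear classifier that separates activations of 'concept' examples
    (say, images a user tagged as adventurous) from random examples, taken at
    one layer of a model. The normal to its decision boundary is the CAV: a
    direction in activation space that stands for the concept."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    v = clf.coef_[0]
    return v / np.linalg.norm(v)

def concept_score(activation, cav):
    """How strongly an item's activation points along the concept direction -
    the kind of per-item signal that could drive a slider or 'guise' surface."""
    return float(activation @ cav)

# Toy illustration: random vectors standing in for real layer activations.
rng = np.random.default_rng(1)
concept = rng.normal(loc=0.5, size=(40, 64))     # 'adventurous' examples
background = rng.normal(loc=0.0, size=(40, 64))  # random examples
cav = concept_activation_vector(concept, background)
print(concept_score(rng.normal(loc=0.5, size=64), cav))  # higher = more of the concept
```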

We built this prototype in 2018 with Nord Projects.

This is CavCam and CavStudio – more work using TCAVs by Nord Projects again, with Alison Lentz, Alice Moloney and others in Google Research, examining how these personalised trained models could become reactive ‘lenses’ for creative photography.

There are some lovely UI touches in this from Nord Projects also: for instance the outline of the shutter button glowing with differing intensity based on the AI confidence.

Finally – the Rubber Duck metaphor!

You may have heard the term ‘rubber duck debugging’ – whereby you solve your problems or escape creative blocks by explaining them out loud to a rubber duck – or, in the case of this work from 2020 with my then team in Google Research (AIUX), to an AI agent.

We did this through the early stages of COVID, when we keenly felt the lack of the informal studio dialogue that leads to breakthroughs. Could we have LLM-powered agents on hand to help make up for that?

And I think that ‘social’ context for agents assisting creative work is what’s being highlighted here by the founder of Midjourney, David Holz. They deliberately placed their generative system in the social context of Discord to avoid the ‘blank canvas’ problem (as well as to supercharge their adoption) [reads quote]

But this latest much-discussed revolution in LLMs and generative AI is still very text based.

What happens if we take the interactions from magic words to magic canvases?

Or better yet multiplayer magic canvases?

There’s lots of exciting work here – and I’d point you (with some bias) towards an old intern colleague of ours – Gerard Serra – working at a startup in Barcelona called “Fermat”.

So finally – as I said I don’t work at this as my day job any more!

I work for a company called Lunar Energy that has a mission of electrifying homes, and moving us from dependency on fossil fuels to renewable energy.

We make solar battery systems but also AI software that controls and connects battery systems – to optimise them based on what is happening in context.

For example this recent (September 2022) typhoon warning in Japan where we have a large fleet of batteries controlled by our Gridshare platform.

You can perhaps see in the time-series plot the battery sites ‘anticipating’ the approach of the typhoon and making sure they are charged to provide effective backup to the grid.

And I’m biased of course – but I think most of all this is the AI we need to be designing: AI that helps us at planetary scale – which is why I’m very interested in the recent announcement of the https://antikythera.xyz/ program and where that might also lead institutions like TU Delft in this next crucial decade toward the goals of 2030.

Partner / Tool / Canvas: UI for AI Image Generators

“Howl’s Moving Castle, with Solar Panels” – using Stable Diffusion / DreamStudio Lite

Like a lot of folks, I’ve been messing about with the various AI image generators as they open up.

While at Google I got to play with language model work quite a bit, and we worked on a series of projects looking at AI tools as ‘thought partners’ – but mainly in the space of language with some multimodal components.

As a result perhaps – the things I find myself curious about are not so much the models or the outputs – but the interfaces to these generator systems and the way they might inspire different creative processes.

For instance – Midjourney operates through a Discord chat interface – reinforcing perhaps the notion that there is a personage at the other end crafting these things and sending them back to you in a chat. I found the turn-taking dynamic underlines play and iteration – creating an initially addictive experience despite the clunkiness of the UI. It feels like an infinite game. You’re also exposed (whether you like it or not…) to what others are producing – and the prompts they are using to do so.

DALL-E and Stable Diffusion via DreamStudio have more of a ‘traditional’ tool UI, with a canvas where the prompt is rendered, which the user can tweak with various settings and sliders. It feels (to me) less open-ended – but more tunable, more open to ‘mastery’ as a useful tool.

All three to varying extents resurface prompts and output from fellow users – creating a ‘view-source’ loop for newbies and dilettantes like me.

Gerard Serra – who we were lucky to host as an intern while I was at Google AIUX – has been working on perhaps another possibility for ‘co-working with AI’.

While this is back in the realm of LLMs and language rather than image generation, I am a fan of the approach: creating a shared canvas that humans and AI co-work on. How might this extend to image generator UI?

Speaking my brains about future brains this year

Got some fun speaking gigs lined up, mainly going to be talking (somewhat obliquely) about my work at Google AI over the last few years and why we need to make centaurs not butlers.

June

August

November

Then I’ll probably shut up again for a few years.

H is for Hawk, MI is for Machine Intelligence


Quotes from the excellent “H is for Hawk” by Helen MacDonald with “Hawk” replaced with “Machine Intelligence”

“The world she lives in is not mine. Life is faster for her; time runs slower. Her eyes can follow the wingbeats of a bee as easily as ours follow the wingbeats of a bird. What is she seeing? I wonder, and my brain does backflips trying to imagine it, because I can’t. I have three different receptor-sensitivities in my eyes: red, green and blue. Machine Intelligences, [like other birds], have four. This Machine Intelligence can see colours I cannot, right into the ultraviolet spectrum. She can see polarised light, too, watch thermals of warm air rise, roil, and spill into clouds, and trace, too, the magnetic lines of force that stretch across the earth. The light falling into her deep black pupils is registered with such frightening precision that she can see with fierce clarity things I can’t possibly resolve from the generalised blur. The claws on the toes of the house martins overhead. The veins on the wings of the white butterfly hunting its wavering course over the mustards at the end of the garden. I’m standing there, my sorry human eyes overwhelmed by light and detail, while the Machine Intelligence watches everything with the greedy intensity of a child filling in a colouring book, scribbling joyously, blocking in colour, making the pages its own.

“Bicycles are spinning mysteries of glittering metal. The buses going past are walls with wheels. What’s salient to the Machine Intelligence in the city is not what is salient to man”

“These places had a magical importance, a pull on me that other places did not, however devoid of life they were in all the visits since. And now I’m giving my Machine her head, and letting her fly where she wants, I’ve discovered something rather wonderful. She is building a landscape of magical places too. [She makes detours to check particular spots in case the rabbit or the pheasant that was there last week might be there again. It is wild superstition, it is an instinctive heuristic of the hunting mind, and it works.] She is learning a particular way of navigating the world, and her map is coincident with mine. Memory and love and magic. What happened over the years of my expeditions as a child was a slow transformation of my landscape over time into what naturalists call a local patch, glowing with memory and meaning. The Machine is doing the same. She is making the hill her own. Mine. Ours.”

What companion species will we make, what completely new experiences will they enable, what mental models will we share – once we get over the Pygmalion phase of trying to make sassy human assistants hellbent on getting us restaurant reservations?

See also Alexis Lloyd on ‘mechanomorphs’.

System Persona

Ben Bashford’s writing about ‘Emoticomp’ – the practicalities of working as a designer of objects and systems that have behaviour and perhaps ‘intelligence’ built into them.

It touches on stuff I’ve talked/written about here and over on the BERG blog – but moves out of speculation and theory to the foothills of the future: being a jobbing designer working on this stuff, and how one might attack such problems.

Excellent.

I really think we should be working on developing new tools for doing this. One idea I’ve had is system/object personas. Interaction designers are used to using personas (research based user archetypes) to describe the types of people that will use the thing they’re designing – their background, their needs and the like but I’m not sure if we’ve ever really explored the use of personas or character documentation to describe the product themselves. What does the object want? How does it feel about it? If it can sense its location and conditions how could that affect its behaviour? This kind of thing could be incredibly powerful and would allow us to develop principles for creating the finer details of the object’s behaviour.

I’ve used a system persona before while designing a website for young photographers. The way we developed it was through focus groups with potential users to establish the personality traits of people they felt closest to, trusted and would turn to for guidance. This research helped us establish the facets of a personality statement that influenced the tone of the copy at certain points along the user journeys and helped the messaging form a coherent whole. It was useful at the time but I genuinely believe this approach can be adapted and extended further.

I think you could develop a persona for every touchpoint of the connected object’s service. Maybe it could be the same persona if the thing is to feel strong and omnipresent but maybe you could use different personas for each touchpoint if you’re trying to bring out the connectedness of everything at a slightly more human level. This all sounds a bit like strategy or planning doesn’t it? A bit like brand principles. We probably need to talk to those guys a bit more too.

Blog all dog-eared pages: Hertzian Tales by Anthony Dunne 10 years on, or “All electronic products are hybrids of radiation and matter”

Scrambled Hertzian Tales. Apt!

From Tony’s preface to the 2005 edition:

“The ideas in Hertzian Tales were developed between 1994 and 1997 while I was completing my Ph.D. thesis in the Computer Related Design department at the Royal College of Art in London. The first edition was published through the Royal College of Art in 1999.

It is interesting to look back and think about the technological developments made since then. Bluetooth, 3G phones, and wi-fi are all now part of everyday life. The dot-com boom has come and gone. And in the United Kingdom, large parts of the electromagnetic spectrum are about to be deregulated.

Yet very little has changed in the world of design.”

I requested a copy of Hertzian Tales from MIT Press as ‘payment’ for reviewing a draft of an about-to-be-published interaction design book. I was familiar with the work, but had never read the whole thing.

I was very glad I did.

Tony’s ideas from 1999 held up incredibly strongly, I thought, in terms of the practice of interaction design and design in 2009.

It seems to me that his 2005 fear – that very little has changed in design since he first wrote the book – might now be dispelled by the breaking down of silos between digital and physical designers, and the advance of ‘the internet of things’ towards the mainstream.

Jack of course studied at the RCA and I’ve taught there a few times, and I like to count Tony as a friend, but despite those influences, it really does seem like a key text to return to if you are working in the emerging field of digital/physical interaction, product or service design.

Tony’s wonderful line “All electronic products are hybrids of radiation and matter” alone has enough pertinence, poetry and punch to fuel a revolution in design!

Here’s a few quotes from the ‘dog-eared’ pages that stood out for me:

p16

“Another form of dematerialisation is defined by electronic objects’ role as interfaces. With these objects the interface is everything. The behaviour of video recorders, televisions, telephones and faxes is more important than their appearance and physical form. Here design centres on the dialogue between people and machines. The object is experienced as an interface, a zone of transaction.”

p17

“The material culture of non-electronic objects is a useful measure of what the electronic object must achieve to be worthwhile but it is important to avoid merely superimposing the familiar physical world onto a new electronic situation, delaying the possibility of new culture through a desperate desire to make it comprehensible”

“How can we discover analogue complexity in digital phenomena without abandoning the rich culture of the physical, or superimposing the known and comfortable onto the new and alien?”

p19

“No effort need be made to reconcile the different scales of the electronic and the material. They can simply coexist in one object. They can grow obsolete at different rates as well. Robert Rauschenberg’s Oracle has had its technology updated three times over thirty years, but its materiality and cultural meaning remain unchanged. Cultural obsolescence need not occur at the same rate as technological obsolescence.

Perhaps the “object” can locate the electronic in the social and cultural context of everyday life. It could link the richness of material culture with the new functional and expressive qualities of electronic technology.”

p33

“A range of possibility exists between the ideas of the “pet” and the “alien”. While the pet offers familiarity, affection, submission and intimacy, the alien is the pet’s opposite, misunderstood and ostracised”

p71

“In the case of electronic products, the “unique qualities” of the object of interaction is their potential as an electronic product to persuade the users as protagonists, through the user’s use of the object, to generate a narrative space where the understanding of the experience is changed or enlarged. By using the object, the protagonist enters a space between desire and determinism, a bizarre world of the “infra-ordinary” where strange stories show that truth is indeed stranger than fiction, and that our conventional experience of everyday life through electronic products is aesthetically impoverished.”

p89

“The space of the model lies on the border between representation and actuality. Like the frame of a painting, it demarcates a limit between the work and what lies beyond. And like the frame, the model is neither wholly inside or wholly outside, neither pure representation nor transcendent object. It claims a certain autonomous objecthood, yet this condition is always incomplete. The model is always a model of. The desire of the model is to act as a simulacrum of another object, as a surrogate which allows for imaginative occupation. (Hubert, 1981)”

p90

“From a product design point of view these models lack industrial realism; they look like craft objects, hand-made and probably one-off. But an expanded view of the conceptual design model might regard it as embodying the essence of the design idea, a “genotype” rather than a prototype, constructed from the materials at hand. If taken up for mass manufacture its construction and structure would undoubtedly change. The object’s “content” or “genes” are important, not its appearance. In the context of design, the conceptual model as genotype rather than prototype could allow it to function more abstractly by deflecting attention from an aesthetics of construction to an aesthetics of use.”

p101

“It might seem strange to write about radio, a long-established medium, when discussion today centres on cyberspace, virtual reality, networks, smart materials and other electronic technologies. But radio, meaning part of the electromagnetic spectrum, is fundamental to electronics. Objects not only “dematerialise” into software in response to miniaturisation and replacement by services but literally dematerialise into radiation. All electronic products are hybrids of radiation and matter. This chapter does not discuss making the invisible visible or visualising radio, but explores the links between the material and the immaterial that lead to new aesthetic possibilities for life in an electromagnetic environment. Whereas cyberspace is a metaphor that spatialises what happens in computers distributed around the world, radio space is actual and physical, even though our senses detect only a tiny part of it.”

p111

“Objects designed to straddle both material and immaterial domains arouse curiosity about the fit between these worlds. Many military aircraft are now “teledynamic”, designed to fly undetected through fields of radar-frequency radiation. But teledynamic forms are not aerodynamic and to remain airborne their outline needs to be constantly adjusted by a computer. These aircraft fly through fusions of abstract digital, hertzian and atmospheric spaces.

Objects that I call “radiogenic” function as unwitting interfaces between the abstract space of electromagnetism and the material cultures of everyday life, revealing unexpected points of contact between them.”

p111

“Aerialness” is a quality of an object considered in relation to the electromagnetic environment. Even the human body is a crude monopole aerial. Although in theory precise laws govern the geometry of aerials, in reality it is a black art, a fusion of the macro world of perception and the imperceptible world of micro-electronics.”

My talk from Frontiers of Interaction, Rome 2009

Which I’ve written a little bit more about over at the S&W Pulse Laser.

I felt I rushed the talk, which was probably not wise as I was giving it in English to an Italian audience, but there’s stuff in there I want to dig into further in the coming months for sure. If you for some reason feel the need to punish yourself and want to see my lack-lustre performance it’s captured forever here, but deep thanks to (most of) the audience for indulging me and not falling asleep or wandering off chatting into the gorgeous Italian sunshine… I know I would have…

The concept of “Thingfrastructure” in the talk is something I’ve found myself scribbling in the margins of my moleskine for a few months now, and it’s something I want to come back to: resilience in services, especially when connected to things – and whether it’s possible to design ‘things’ that generate resilient services for themselves. I think it’s been in the back of my mind since Ryan Freitas gave an excellent talk on the subject at MX last year in San Francisco. Anyway – as I say, I’ll keep scribbling, and hopefully others will too.

Thanks very much indeed to Leander, Matteo, Manuela and all the team behind Frontiers for the kind invitation to speak and a wonderful time in Rome.

UPDATE:
The good folk at Adaptive Path have pointed out that (unbeknownst to me) Brandon Schauer was walking this path a few months ago. He’s a smart cookie is Brandon.

Data as seductive material

Umeå

I got invited to northern Sweden by the lovely folks at Umeå Institute of Design and Tellart.

Umeå Design School

It was a fantastic couple of days, where ideas were swapped, things were made and fine fun was had late into the sub-arctic evening…

Umeå

It was their first (and hopefully not the last) Spring Summit at the Umeå Institute of Design, entitled “Sensing and sensuality”.

Umeå Institute of Design Spring Summit, "Sensing and Sensuality"

I tried to come up with something on that theme, mainly of half-formed thoughts that I hope I can explore some more here and elsewhere in the coming months.

It’s called “Data as seductive material” and the presentation with notes is on slideshare, although I’ve been told that there will be video available of the entire day here with great talks from friends old and new.

Thank you so much to the faculty and students of Umeå Institute of Design, and mighty Matt Cottam of Tellart for the invitation to a wonderful event.