Magazine: Interfaces on the go
Enabling mobile micro-interactions with physiological computing.
We have continually evolved computing to not only be more efficient, but also more accessible, more of the time (and place), and to more people. We have progressed from batch computing with punch cards, to interactive command line systems, to mouse-based graphical user interfaces, and more recently to mobile computing. Each of these paradigm shifts has drastically changed the way we use technology for work and life, often in unpredictable and profound ways.
With the latest move to mobile computing, we now carry devices with significant computational power and capabilities on our bodies. However, their small size typically leads to limited interaction space (diminutive screens, buttons, and jog wheels) and consequently diminishes their usability and functionality. This presents a challenge and an opportunity for developing interaction modalities that will open the door for novel uses of computing.
Researchers have been exploring small-device interaction techniques that leverage every available part of the device. For example, NanoTouch, developed by Patrick Baudisch and Gerry Chu at Microsoft Research, utilizes the backside of devices so that the fingers don't interfere with the display on the front (see also in this issue "My New PC is a Mobile Phone," page 36). In more conceptual work, Ni and Baudisch explore the advent of "disappearing mobile devices."
Other researchers have proposed that devices should opportunistically and temporarily "steal" capabilities from the environment, making creative use of existing surfaces already around us. One example of this type of interaction is Scratch Input, developed by Chris Harrison and Scott Hudson of Carnegie Mellon's HCI Institute. This technique allows users to place devices on ordinary surfaces, like tables, and then use them as ad hoc gestural finger input canvases. This is achieved with a microphone on the underside that allows the device to sense audio signals transmitted through the material, like taps and scratches. These types of solutions work well when the user is situated (in an office, airport, or hotel room), but are impractical when the user is on the go.
This mobile scenario is particularly challenging because of the stringent physical and cognitive constraints of interacting on-the-go. In fact, Antti Oulasvirta and colleagues showed that users could attend to mobile interaction bursts in chunks of about 4 to 6 seconds before having to refocus attentional resources on their real-world activity. At this point, the dual task becomes cognitively taxing as users are constantly interrupted by having to move focus back and forth. In a separate line of work, Daniel Ashbrook of Georgia Institute of Technology measured the overhead associated with mobile interactions and found that just getting a phone out of the pocket or hip holster takes about 4 seconds and initiating interaction with the device takes another second or so. This work proposes the concept of micro-interactions: interactions that take less than 4 seconds to initiate and complete, so that the user can quickly return to the task at hand. An example of this type of interaction is Whack Gestures, created by Carnegie Mellon and Intel Labs researchers, where, quite simply, you do things like whack the phone in your pocket to silence an incoming phone call.
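To make the timing argument concrete, the short Python sketch below simply adds up the figures quoted above. The function names and the 0.5-second example interaction are illustrative assumptions, not measurements from the cited studies.

```python
def time_left_for_task(attention_burst_s, access_s, initiate_s):
    """Seconds left in one attention burst after retrieving and waking the device."""
    return attention_burst_s - (access_s + initiate_s)


def is_micro_interaction(total_s, limit_s=4.0):
    """True if the whole interaction fits under the 4-second limit quoted in the text."""
    return total_s < limit_s


# Figures quoted in the text: attention bursts of about 4 to 6 seconds,
# about 4 s to get the phone out, about 1 s to initiate interaction.
ACCESS_S = 4.0
INITIATE_S = 1.0

for burst_s in (4.0, 6.0):
    left_s = time_left_for_task(burst_s, ACCESS_S, INITIATE_S)
    print(f"burst of {burst_s:.0f} s leaves {left_s:+.0f} s for the task itself")

# Hypothetical always-available gesture that takes half a second in total.
print(is_micro_interaction(0.5))
```

Under these numbers, the overhead of reaching for a phone consumes nearly the whole attention burst, which is the gap that always-available input aims to close.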
"Micro-interactions could significantly expand the set of tasks we could perform on-the-go and fundamentally alter the way we view mobile computing."
We believe that such micro-interactions could significantly expand the set of tasks we could perform on-the-go and fundamentally alter the way we view mobile computing. We assert that, while seemingly subtle, augmenting users with always-available micro-interactions could have an impact of the same magnitude as mobile computing had in enabling a set of tasks that were never before possible. After all, who would have imagined mobile phones would make the previously onerous task of arranging to meet a group of friends for a movie a breeze? Who would have imagined when mobile data access became prevalent that we'd be able to price shop on-the-fly? Or resolve a bar debate on sports statistics with a quick Wikipedia search? Imagine what we could enable with seamless and even greater access to information and computing power.
To realize this vision, we've been looking at ways to enable micro-interactions. Often, this involves developing novel input modalities that take advantage of the unique properties of the human body. In this article, we describe two such technologies: one that senses electrical muscle activity to infer finger gestures, and the other that monitors bio-acoustic transmissions through the body, allowing the skin to be turned into a finger-tap-sensitive interaction surface. We conclude with some of the challenges and lessons learned in our work using physiological sensing for interaction.
Removing manipulation of physical transducers does not necessarily preclude leveraging the full bandwidth available with finger and hand gestures. To date, most efforts at enabling implement-free interaction have focused on speech and computer vision, both of which have made significant strides in recent years, but remain prone to interference from environmental noise and require that the user make motions or sounds that can be sensed externally and cannot be easily concealed from people around them.
Advances in muscular sensing and processing technologies provide us with the unprecedented opportunity to interface directly with human muscle activity in order to infer body gestures. To contract a muscle, the brain sends an electrical signal through the nervous system to motor neurons, which then transmit electrical impulses to adjoining muscle fibers, causing them to contract and the body to move. Electromyography (EMG) senses this muscle activity by measuring the electrical potential between ground and a sensor electrode.
In our work, we focus on a band of sensors placed on the upper forearm that senses finger gestures on surfaces and in free space (see Figures 1 and 2). We have recently built a small, low-powered wireless prototype EMG unit that uses dry electrodes and that can be placed in an armband form factor, making it continuously wearable as an always-available input device. The signals from this device are streamed to a nearby computer, where features are extracted and machine learning is used to model and classify gestures. However, this could also be done entirely on a mobile device.
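As a rough illustration of the "extract features, then classify" pipeline described above, here is a minimal Python sketch. The article does not specify the features or the learning algorithm, so everything here is an assumption: the per-channel RMS amplitude feature, the nearest-centroid classifier, the three-channel layout, the gesture names, and the synthetic Gaussian data standing in for real EMG windows.

```python
import math
import random


def rms(window):
    """Root-mean-square amplitude of one channel's samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def extract_features(channels):
    """One RMS value per electrode channel (an assumed, very simple feature set)."""
    return [rms(ch) for ch in channels]


def train_centroids(examples):
    """examples: list of (label, feature_vector). Returns label -> mean feature vector."""
    sums, counts = {}, {}
    for label, feats in examples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, feats):
    """Pick the gesture whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], feats))


def synthetic_window(channel_gains, rng, n_samples=64):
    """Fake EMG window: zero-mean noise whose spread differs per channel."""
    return [[rng.gauss(0.0, gain) for _ in range(n_samples)] for gain in channel_gains]


rng = random.Random(0)
# Hypothetical gestures, each activating the three channels to a different degree.
GESTURES = {"index_tap": [1.0, 0.2, 0.2], "pinky_tap": [0.2, 0.2, 1.0]}

training = [
    (name, extract_features(synthetic_window(gains, rng)))
    for name, gains in GESTURES.items()
    for _ in range(20)
]
centroids = train_centroids(training)

sample = extract_features(synthetic_window(GESTURES["pinky_tap"], rng))
print(classify(centroids, sample))
```

A per-user training step like `train_centroids` is also where the calibration overhead of such systems shows up: the model is only as good as the examples collected for that wearer and session.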
Reasonably high accuracies can be achieved for gestures performed on flat surfaces. In one experiment with 13 novice users, we attained an average of 78 percent accuracy for sensing whether each of two fingers is curled, 84 percent for which of several pressure levels is being exerted on the surface, 78 percent for which of the five fingers has tapped the surface, and 95 percent for which of the five has lifted off the surface.
Similarly, in a separate test with 12 different novice users, we attained 79 percent classification accuracy for pinching the thumb to fingers in free space, 85 percent when squeezing different fingers on a coffee mug, and 88 percent when carrying a bag. These results demonstrate the feasibility of detecting finger gestures in multiple scenarios, even when the hands are occupied with other objects.
To further expand the range of sensing modalities for always-available input systems, we developed Skinput (see Figure 3), a novel input technique that allows the skin to be used as a finger input surface. When a finger taps the skin, several distinct forms of acoustic energy are produced and transmitted through the body. We chose to focus on the arm, although the technique could be applied elsewhere. This is an attractive area to "steal" for input as it provides considerable surface area for interaction, including a contiguous and flat area for projection.
Using our prototype, we've conducted several experiments that demonstrate high classification accuracies even with a large number of tap locations. This remains true even when the sensing armband is placed above the elbow (where taps are separated from the sensor both by distance and by numerous joints). For example, for a setup in which we cared to distinguish between taps on each of the five fingers, we attain an average accuracy of 88 percent across our 13 novice participants. If we spread the five locations out across the whole arm, the average accuracy goes up to 95 percent. The technique remains fairly accurate even when users are walking or jogging. Although classification is not perfect (nor will it likely ever be), we believe the accuracy of our proof-of-concept system clearly demonstrates that real-life interfaces could be developed on top of the technique.
While our bio-acoustic input approach is not strictly tethered to a particular output modality, we believe the sensor form factors we explored could be readily coupled with a small digital projector. There are two nice properties of wearing such a projection device on the arm: 1) the arm is a relatively rigid structure, so the projector, when attached appropriately, will naturally track with the arm; 2) since we have fine-grained control of the arm, making minute adjustments to align the projected image with the arm is trivial (e.g., projected horizontal stripes for alignment with the wrist and elbow).
Using the human body as the interaction platform has several obvious advantages. Foremost, we can assume a consistent, reliable, and always-available surface. We take our bodies everywhere we go (or rather, they take us). Furthermore, we are intimately familiar with our bodies, and proprioceptive senses allow us to interact even in harsh circumstances (like a moving bus). We can quickly and easily make finger gestures or tap on a part of our body, even when we cannot see it and are on the move.
That said, using the signals generated by or transmitted through the body as a means of intentional control comes with various new challenges and opportunities for innovation. From a technical perspective, building models of these signals that work across multiple users and multiple sessions with minimal calibration is often challenging. Most of our current work is calibrated and trained each time the user dons the device, and while these individual models work surprisingly well across different body types, we recognize that this overhead of training is not acceptable for real-world use. Furthermore, regardless of the universality of the models, processing the often-noisy signals coming from these sensors is not trivial and will likely never yield perfect results. This is because of the complexity of the noise patterns as users move through different environments and perform different tasks, and as the physiological signals change throughout the course of their normal activities. Hence, interaction techniques must be carefully designed to tolerate or even take advantage of imperfect input.
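One simple way an interface can tolerate imperfect input is to smooth the classifier's per-frame output before acting on it. The sketch below uses a sliding-window majority vote. It is an illustrative assumption rather than the method used in our systems, and the class name, window size, and gesture labels are hypothetical.

```python
from collections import Counter, deque


class MajorityVoteSmoother:
    """Report a gesture only when it wins a strict majority of the last `window` frames."""

    def __init__(self, window=5):
        self.recent = deque(maxlen=window)

    def update(self, label):
        """Add one per-frame prediction; return the smoothed gesture or None."""
        self.recent.append(label)
        winner, votes = Counter(self.recent).most_common(1)[0]
        # Compare against the full window size so that nothing is reported
        # until enough evidence has accumulated.
        return winner if votes > self.recent.maxlen // 2 else None


smoother = MajorityVoteSmoother(window=5)
# Hypothetical stream of per-frame predictions with one misclassified frame ("rest").
frames = ["pinch", "pinch", "rest", "pinch", "pinch"]
print([smoother.update(frame) for frame in frames])
```

The cost of this kind of smoothing is latency: a larger window rejects more noise but delays the response, which matters when the whole interaction is meant to take only a few seconds.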
On the interaction design front, there are many problems that must be addressed. For example, the system must provide enough affordances that the user can learn the new system. This is not specific to physiological sensing, though the level of indirect interpretation of signals can sometimes make end-user debugging difficult, especially when the system does not act as expected. The interface must also be designed to handle the "Midas touch" problem, in which interaction is unintentionally triggered when the user performs everyday tasks like turning a doorknob. We have purposely designed our gesture sets to minimize this, but we imagine there are more graceful solutions.
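One commonly discussed mitigation for the "Midas touch" problem is to require an explicit activation gesture before any command is accepted. The Python sketch below illustrates that idea only. It is not the approach taken in our gesture sets, and the class name, the "double_pinch" wake gesture, and the 3-second timeout are all hypothetical.

```python
class ActivationGate:
    """Pass a gesture through as a command only shortly after a wake gesture."""

    def __init__(self, wake_gesture="double_pinch", timeout_s=3.0):
        self.wake_gesture = wake_gesture
        self.timeout_s = timeout_s
        self.armed_until = None  # time until which the next gesture counts as a command

    def handle(self, gesture, now_s):
        """Return the gesture if it should act as a command, otherwise None."""
        if gesture == self.wake_gesture:
            self.armed_until = now_s + self.timeout_s
            return None
        if self.armed_until is not None and now_s <= self.armed_until:
            self.armed_until = None  # accept one command per activation
            return gesture
        return None


gate = ActivationGate()
print(gate.handle("grip", 0.0))          # e.g. turning a doorknob: ignored
print(gate.handle("double_pinch", 1.0))  # wake gesture arms the gate
print(gate.handle("grip", 2.0))          # now treated as an intentional command
```

The trade-off is an extra step in every interaction, so the wake gesture itself has to be fast and unlikely to occur during everyday activity.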
In fact, with many new interaction modalities, our first instinct is often to emulate existing modalities (e.g., mouse and keyboard) and use them to control existing interfaces. However, the special affordances found in the mobile scenario bring with them enough deviations from our traditional assumptions that we must be diligent in designing for it. We should also emphasize the importance of designing these systems so that they operate seamlessly with other modalities and devices that the user carries with them.
6. Hudson, S. E., Harrison, C., Harrison, B. L., LaMarca, A. 2010. Whack gestures: Inexact and inattentive interaction with mobile devices. In Proceedings of the 4th International Conference on Tangible, Embedded and Embodied Interaction [Cambridge, MA, January 25-27, 2010]. TEI '10. ACM, New York, NY.
Desney Tan is a senior researcher at Microsoft Research, where he manages the Computational User Experiences group in Redmond, Washington and the Human-Computer Interaction group in Beijing, China. He has won awards for his work on physiological computing and healthcare, including a 2007 MIT TR35 Young Innovators award and SciFi Channel's Young Visionaries at TED 2009, and was named to Forbes' Revolutionaries list in 2009. He will chair the CHI 2011 Conference, which will be held in Vancouver, BC.
Dan Morris is a researcher in the Computational User Experiences group in Microsoft Research. His research interests include computer support for musical composition, using physiological signals for input, and improving within-visit information accessibility for hospital patients. Dan received his PhD in Computer Science from Stanford University in 2006.
T. Scott Saponas is a PhD candidate in the Computer Science and Engineering department at the University of Washington. His research interests include Human-Computer Interaction (HCI), Ubiquitous Computing (UbiComp), and Physiological Computing. Scott received his B.S. in Computer Science from the Georgia Institute of Technology in 2004.
Figure 1. To contract a muscle, the brain sends an electrical signal through the nervous system to motor neurons, which then transmit electrical impulses to adjoining muscle fibers, causing them to contract. Electromyography (EMG) senses this muscle activity by measuring the electrical potential between a ground electrode and a sensor electrode.
Figure 2. Our prototype features two arrays of sensing elements incorporated into an armband form factor. Each element is a cantilevered piezo film tuned to respond to a different, narrow, low-frequency band of the acoustic spectrum.
©2010 ACM 1528-4972/10/0600 $10.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2010 ACM, Inc.