Ross Peterson's Computer Human Interaction Blog

Sunday, May 9, 2010

HCI Remixed

My Vision isn't my vision: Making a Career out of Getting Back Where I Started

The story was written by William Buxton and follows his time as a music undergraduate testing the NRC's digital music machine, which was used to study human-computer interaction.

The music machine sported an animation monitor and bi-manual input.

With the left hand, Buxton could enter note duration on a keyboard with 5 keys.
With the right hand, he could enter the pitch of the note using either a primitive version of the mouse or the 2 wheel knobs. Buxton opted for the wheel knobs.

Buxton felt that the machine was very advanced for its time and highlights that we should consider the user first when developing systems.

_______________

Drawing on Sketchpad: Reflections on Computer Science and HCI

The author of this short story, Joseph A. Konstan, talked about Sutherland's work on Sketchpad, which had many features that were ahead of their time, such as:
Pointing with a light pen
Rendering lines, circles, and text
Constraints of the system and their displays
Data structures, algorithms, and object oriented programming structures

Sutherland essentially laid all the foundational work of graphical displays and drawing that rivals CAD systems.

Konstan said that we should innovate, communicate and not compute, and that we need to give more focus on systems for experts and not focus too much on knowledge workers.
____________________
The mouse, the demo, and the big idea

Stanford's Wendy Ju wrote an article about Engelbart's oN-Line System (NLS) and the big demo that introduced the mouse and shocked the world.

Engelbart's system wasn't automatically accepted because research at the time was focused on office automation and artificial intelligence.

The goal of the demo was to change the way people thought, which it did, but not in the way Engelbart intended. Too many people were focused on the mouse.

Stanford had a "demo or die" culture.
Demonstrations create converts and make a sale rather than just inform.

Wendy Ju argues that the computer is a tool to enhance and empower humanity rather than replace human input.
________________
My spill:

The articles presented here provided an interesting view into the past of computing and CHI, showing how much the field has changed (a lot) and how much has stayed the same (nearly everything).

It's interesting to note that nothing we research is entirely new; usually it's research that has been brought into the spotlight again, this time in a different light, with a different approach, and by a different researcher.

Wednesday, April 14, 2010

CHI '08: From meiwaku to tokushita!: lessons for digital money design from japan

Authors:
Scott Mainwaring Intel Research, Portland, OR, USA
Wendy March Intel Research, Portland, OR, USA
Bill Maurer UC Irvine, Irvine, CA, USA

Paper Link:
http://portal.acm.org/citation.cfm?id=1357054.1357058&coll=ACM&dl=ACM&type=series&idx=SERIES260&part=series&WantType=Proceedings&title=CHI&CFID=://tamuchi2010a.blogspot.com/p/assignments.html&CFTOKEN=tamuchi2010a.blogspot.com/p/assignments.html

Mainwaring et al. discuss the findings of an ethnographic study on the effects of e-money in Japan, particularly Tokyo and Okinawa (the Hawaii of Japan).
The main focus of the ethnographic study was on Near-Field Communication (NFC) built into cards, passes, and mobile devices. The team chose Japan as the place for the ethnography because it already has a high adoption rate of various forms of e-money. Mainwaring et al. studied 3 different brands of e-money.

The main finding of the study was that the high adoption rate of e-money stems from the deeply ingrained Japanese wish to reduce "meiwaku" 迷惑, which means "nuisance" or "bother." This plays heavily into Japanese society, where the needs of the community often trump individual concerns and people do not wish to stand out or bother others. This sense of meiwaku also sheds light on why only 1/10th of transactions in Japan are done via credit card, whereas in the U.S. it's 1/4th of all transactions.

With Suica, people could simply move past a turnstile and have their card automatically charged without holding up the flow of traffic.

While Edy also offers auto-charging with NFC technology, it could also increase meiwaku by holding up lines when the e-money ran out. Furthermore, putting more money into the account required finding charging stations of the same brand and on top of that, the card can only be used in stores supporting the brand. Finally, by law, money converted into e-money cannot be converted back into regular cash.

The other main theme found in the use of e-money is "tokushita" 得した, or "well done/advantage gained." This refers to the rewards gained by using the e-money through rewards programs and getting "something for nothing" out of using the card. Suica, for example, allowed customers to earn travel miles for a certain amount spent. The study found that people would go out of their way to use their cards out of a sense of tokushita and to gain rewards.

The design considerations from the ethnography suggest that e-cash systems should:
1) Result in a net decrease in commotion before, during, and after the point of sale.
2) Be designed for public use and take into account the environment of the transaction.
3) Support management of users' money without either introducing new burdens or decreasing friction to the point of invisible spending.
4) Subtly engage multiple senses, both for practical and aesthetic issues.
5) Leave room for dreams, irrationality, and for tokushita! Money is not just about exactness and frugality; it's also about fun. If e-money brightens your day then it might also fit into your life.

__________
My spill:

I was interested in this study primarily because I'm currently studying Japanese and wanted to hear about some of the cultural implications for spending. I can't say that I learned a whole lot, but it was interesting. I think we can all appreciate not wanting to burden or be a nuisance to people, and keeping that in mind when designing any technology is important.

I would like to see future work address the design considerations listed at the end, particularly how NFC can be employed so that people aren't charged accidentally and can reverse a transaction should that happen. Also, I'd like to know whether it's possible to convert e-cash into, say, a credit card in the U.S., or if that's just a Japanese law.

Incorporating a rewards system for these kinds of transaction is a smart business move, I think. It keeps people motivated for using your card.

The authors also mentioned that the Japanese really focus on delivering aesthetic satisfaction in using their products. I think we should do that more in the states.

Tuesday, April 13, 2010

IUI '08: Designing and assessing an intelligent e-tool for deaf children

Authors:
Rosella Gennari Free University of Bozen-Bolzano, Bolzano, Italy
Ornella Mich Free University of Bozen-Bolzano, Bolzano, Italy

Paper Link:
http://portal.acm.org/citation.cfm?id=1378773.1378821&coll=ACM&dl=ACM&type=series&idx=SERIES823&part=series&WantType=Proceedings&title=IUI&CFID=81639924&CFTOKEN=12013848

In this paper, Gennari and Mich describe an intelligent web-based program called LODE (LOgic based e-tool for DEaf children) that aims to cultivate the reading and reasoning skills of deaf children.

The aim of their system is best understood in light of the difficulties that deaf people experience with language. Deaf children have difficulty developing their reading and reasoning skills because they are largely deprived of constant exposure to language. Deaf people encode information differently from hearing people, and they organize and access knowledge in different ways. They tend to focus on details and images as opposed to relations among concepts.

Specifically, their system focused on "stimulating global deductive reasoning" on entire narratives. Their LODE system does this by extracting temporally sensitive words using a logic system and automated temporal reasoning. The system can logically arrange the given input language (Italian in this case) and generate global deductive reasoning questions based on the story.
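To make the idea of a "global deductive reasoning" question a bit more concrete, here is a minimal Python sketch. It is not LODE's implementation: it swaps in a plain graph search over "before" relations instead of a constraint solver, and the function name `happens_before` and the example events are hypothetical, made up purely for illustration.

```python
def happens_before(constraints, first, later):
    """Return True if `first` must precede `later`, given (earlier, later) pairs.

    `constraints` is a list of (a, b) tuples meaning "a happens before b".
    The answer follows chains of constraints, so it can combine facts that
    are stated in different sentences of a story.
    """
    # Build a map from each event to the events known to come right after it.
    successors = {}
    for earlier, after in constraints:
        successors.setdefault(earlier, set()).add(after)

    # Depth-first search from `first`, looking for `later`.
    stack, seen = [first], set()
    while stack:
        event = stack.pop()
        for nxt in successors.get(event, ()):
            if nxt == later:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


# Hypothetical temporal facts pulled from a short story.
story = [
    ("wake up", "eat breakfast"),
    ("eat breakfast", "go to school"),
]

# A "global" question: the answer is not stated in any single sentence.
print(happens_before(story, "wake up", "go to school"))
print(happens_before(story, "go to school", "wake up"))
```

The first question can only be answered by chaining the two facts together, which is the kind of whole-story reasoning the tool is meant to exercise.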

The architecture of the system is based on a web client-server model composed of several modules:
1) e-stories database
2) Automated reasoner made up of:
a) ECLiPSe - constraint-based programming system
b) a knowledge base for ECLiPSe
c) domain knowledge of constraint problems formalizing the temporal information of the e-stories

The GUI consists of a simple page framed in yellow (for concentration) with a picture and a sentence from the story on a blue background (for calmness) along with buttons to go to the next and previous pages and a dictionary to look-up difficult words. Temporal words are highlighted in orange to draw attention to temporal concepts that the user should remember.

They tested their system by bringing together LIS (Italian Sign Language) interpreters, a logopaedist, a linguist specializing in deaf studies, a cognitive psychologist specializing in deaf studies, and two deaf children.

One child, aged 13, completed the stories easily, while the other, aged 8, had trouble navigating the interface. Feedback from the experts was positive.

______________
My Spill:

It's great that they're constructing advanced educational tools for deaf children. It seems like a great system for exceptionally young children, but I would think the questions for older children using the system would need to have stories and questions crafted by a human.

Their testing of the system was terribly insufficient. They needed to test more children using their system. Of course the experts on deaf studies will approve of the system. After all, they are interested in promoting work in their own fields.

IUI '08 (assignment): Temporal semantic compression for video browsing

Authors:
Brett Adams Curtin University of Technology, Perth, W. Australia
Stewart Greenhill Curtin University of Technology, Perth, W. Australia
Svetha Venkatesh Curtin University of Technology, Perth, W. Australia

Paper Link:
http://portal.acm.org/citation.cfm?id=1378773.1378813&coll=ACM&dl=ACM&type=series&idx=SERIES823&part=series&WantType=Proceedings&title=IUI&CFID=81639924&CFTOKEN=12013848

Adams et al. set out a video browsing approach known as Temporal Semantic Compression (TSC) that allows for unique ways of browsing and playing video data based on tempo and interest algorithms.

With interest algorithms, which can be installed to the browser using customizable plug-ins, a video can be filtered in terms of what the user is looking for in the video. An interesting application highlighted in the paper is that of applying different interest algorithms based on the genre.

For example, we could use:
excitement algorithms for sports
anxiety for home surveillance and news story change
attention for home videos
etc...

The controls for the temporal-compression-based video browser use a 2D spatial control on the display screen, where the horizontal axis controls the point in the video and the vertical axis controls the compression. (Compression is the amount of the video remaining from the original; i.e., 20% compression leaves 20% of the shots from the original video, while 100% compression leaves only the "most interesting" frame of the video.)

The main measure of interest used to decide which frames are selected in compression is tempo. Tempo is set by the director of the video, who uses action, music, and dialog to affect the audience's sense of time in the film. This video browser measures tempo from pan, tilt, and volume.
The calculation is as follows:

3 timescales:
1) Frame level features are in the timescale in the original movie. Adjusts playback point.
2) Shot level features are in the timescale that weights the timescale durations as being equal.
3) Compression level is where the compression functions can be changed.

Example compression functions:

Default (linear) - playback is at a linear pace, much like the regular playback and fast-forward functions.

Midshot - takes a constant amount from each shot (section) chosen by the pacing algorithm

Pace Proportional - uses the pacing tempo to continuously vary the playback speed. When the tempo is low, the playback speed increases, leading to more playback from higher-tempo sections. (i.e. the more important sections are favored for playback)

Interesting shots - applies speed-up and compression, and entire shots with lower tempos are left out.
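As a rough illustration of the Pace Proportional idea, here is a minimal Python sketch. It is only a guess at the general shape, not the formula from the paper: it assumes one tempo value per shot and assumes that playback speed is inversely proportional to tempo, scaled so the whole video fits a target viewing time. The function name `pace_proportional_speeds` and the sample numbers are hypothetical.

```python
def pace_proportional_speeds(durations, tempos, target_seconds):
    """Pick a playback speed for each shot so low-tempo shots play faster.

    Assumption (not from the paper): speed_i = k / tempo_i, where k is
    chosen so the total time spent watching equals `target_seconds`.
    """
    if len(durations) != len(tempos):
        raise ValueError("need one tempo per shot")
    if target_seconds <= 0 or any(t <= 0 for t in tempos):
        raise ValueError("target time and tempos must be positive")

    # Time spent on shot i is duration_i / speed_i = duration_i * tempo_i / k,
    # so k = sum(duration_i * tempo_i) / target_seconds.
    weighted = sum(d * t for d, t in zip(durations, tempos))
    k = weighted / target_seconds
    return [k / t for t in tempos]


# Two 10-second shots: a slow one (tempo 1.0) and a fast-paced one (tempo 3.0),
# squeezed into 10 seconds of viewing.
durations = [10.0, 10.0]
speeds = pace_proportional_speeds(durations, [1.0, 3.0], 10.0)
watch_times = [d / s for d, s in zip(durations, speeds)]
print(speeds)
print(watch_times)
```

With these made-up numbers the slow shot is sped up more than the fast-paced one, so most of the viewing time goes to the higher-tempo shot, which matches the description above.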

Adams et al. tested their system on several movies, news shows, commercials, cartoons, and talk shows and found that their compression algorithm could successfully pull out meaningful and interesting chunks of shots from the clips.


__________
My Spill:

The Temporal Semantic Compression scheme is a great idea from my perspective. Most media players only support regular playback, fast-forward, and scene selection, but I've never seen a browsing tool for choosing interesting parts of the video.
That's really cool.

The pluggable functions could let the user search for different points of interest. (Maybe I just want to find the action scenes in a movie.)

The real improvement in their interface would be to reduce the number of metrics shown so that screen space can be maximized.

IUI '08: Multimodal Chinese text entry with speech and keypad on mobile devices

Authors:
Yingying Jiang Chinese Academy of Sciences, Beijing, China
Xugang Wang Chinese Academy of Sciences, Beijing, China and Ministry of Information Industry Software and Integrated Circuit Promotion Center
Feng Tian Chinese Academy of Sciences, Beijing, China
Xiang Ao Ministry of Information Industry Software and Integrated Circuit Promotion Center
Guozhong Dai Chinese Academy of Sciences, Beijing, China
Hongan Wang Chinese Academy of Sciences, Beijing, China

Paper Link:
http://portal.acm.org/citation.cfm?id=1378773.1378825&coll=ACM&dl=ACM&type=series&idx=SERIES823&part=series&WantType=Proceedings&title=IUI&CFID=81639924&CFTOKEN=12013848

In this paper, Jiang et al. created a multimodal text entry system that uses both keypad and speech entry to reduce the number of key presses, the time to enter the characters, and the number of candidate characters to choose from when using a mobile device.

Jiang et al. identify Chinese text entry on mobile keypads as slow and arduous and set out to improve the input method for these characters. The current method is called T9, in which roman phonetic characters (pinyin) corresponding to the sound of the Chinese characters are input and the desired characters are then selected from a list of homophones. To speed this up, Jiang et al. proposed a method called "Jianpin," where the initial sound of each Chinese character the user wants is input via the keypad while the user simultaneously says the word they wish to enter.

For example, if the user wants to enter "wang luo" 网络 (network) into a mobile phone using Jianpin, the user presses "95" which corresponds to "w.l" while saying "wang luo" then the user selects 网络　from several other homophones.
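The "95" in that example comes straight from the standard phone keypad letter layout, and a few lines of Python can show how the key presses and the spoken word each narrow down the candidates. This is only a sketch under some assumptions: the tiny `LEXICON`, the function names, and the use of a plain pinyin string to stand in for speech recognition are all hypothetical, and taking just the first letter of each syllable is a simplification.

```python
# Letter groups on a standard phone keypad (digits 2-9).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def jianpin_keys(pinyin):
    """Map the first letter of each pinyin syllable to its keypad digit."""
    return "".join(LETTER_TO_DIGIT[syllable[0]] for syllable in pinyin.lower().split())


# Hypothetical mini-lexicon of (pinyin, word) pairs, for illustration only.
LEXICON = [
    ("wang luo", "网络"),
    ("wang luo", "网罗"),
    ("wei lai", "未来"),
    ("wan le", "玩乐"),
    ("ni hao", "你好"),
]


def candidates(keys, spoken=None):
    """Return words whose initials match `keys` and, if given, the spoken pinyin."""
    matches = []
    for pinyin, word in LEXICON:
        if jianpin_keys(pinyin) != keys:
            continue
        if spoken is not None and pinyin != spoken:
            continue
        matches.append(word)
    return matches


print(jianpin_keys("wang luo"))
print(candidates("95"))
print(candidates("95", spoken="wang luo"))
```

In this toy lexicon the keys alone leave four candidates, and adding the spoken word cuts that down to the two homophones, which the user would still pick between by hand.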

Here is an overview of the input method:

A user study was run with 4 college students where 50 words were inputted in both the T9 method and the "Jianpin" method. They measured the number of key presses it took to complete the 50 words with each method. The results are as follows:

_________________
My spill:
The Jianpin input system sounds like a great way to reduce ambiguity in the selection set as well as speed up input.
My only bone to pick is that the input scheme requires voice input. I can imagine being on a crowded street in China with hundreds of people speaking voice input into their cell phones just so they can text.
It's just more noise pollution that way.
If they can make a faster system without voice input, I'll be impressed.

Knowing Japanese, I was really interested in how the Chinese entered text since they don't have a phonetic system like the Japanese. In the end, it really isn't all that different.

Monday, April 12, 2010

CHI '08 (assignment): Reality-based interaction: a framework for post-WIMP interfaces

(Comment left on Brandon Jarratt's blog)

Authors:
Robert J.K. Jacob Tufts University, Medford, MA, USA
Audrey Girouard Tufts University, Medford, MA, USA
Leanne M. Hirshfield Tufts University, Medford, MA, USA
Michael S. Horn Tufts University, Medford, MA, USA
Orit Shaer Tufts University, Medford, MA, USA
Erin Treacy Solovey Tufts University, Medford, MA, USA
Jamie Zigelbaum MIT Media Lab, Cambridge, MA, USA

Paper Link:
http://portal.acm.org/citation.cfm?id=1357054.1357089&coll=ACM&dl=ACM&type=series&idx=SERIES260&part=series&WantType=Proceedings&title=CHI&CFID=://tamuchi2010a.blogspot.com/p/assignments.html&CFTOKEN=tamuchi2010a.blogspot.com/p/assignments.html

In this paper, Jacob et al. discuss the emerging methods of human-computer interaction broadly referred to as reality-based interfaces (RBI) and identify the unifying themes and concepts of these methods.

The research team first notes that human computer interaction was initially done via command line instructions that were typed in through a keyboard. This method of interaction was cumbersome and relied on knowledge of the command the computer would accept. It was difficult to use in part because users could not use preconceived notions of interaction.

Next they identify the current generation of HCI as direct manipulation of 2D widgets, commonly known as window, icon, menu, pointing device (WIMP) interfaces.

Finally the emerging methods of interaction are reality based interaction (RBI) that they define as drawing from four overarching themes:
1) Naive Physics
2) Body Awareness & Skills
3) Environment Awareness & Skills
4) Social Awareness & Skills

The team notes that using RBI themes may enhance or inhibit:
Expressive Power
Efficiency
Versatility
Ergonomics
Accessibility
Practicality

The team uses Superman as an analogy saying that a strictly reality based representation of Superman would only allow Superman to walk and see like a regular man, but instead reality is traded off for the extra functionality of flight and X-ray vision.

The team demonstrates the four themes of RBI and the resulting tradeoffs in several case studies:
1) URP (a tangible user interface for urban planning)
2) Apple iPhone
3) Electronic Tourist Guide
4) Visual-Cliff Virtual Environment

The research team hopes this paper provides a scheme that unites the divergent user interfaces into a common framework that will be adopted by interface designers to create better systems in the future and that their research also provides a method to analyze future interfaces.

_____________
My Spill:

While their work is an interesting summary of reality-based interfaces, I feel like this research didn't generate anything we didn't already know.
Reality is an ever-emerging theme in CHI, and using reality-based interfaces introduces several considerations and tradeoffs.

That's essentially all this paper was.
I'd like to see them present a set of ideal interfaces for a system or something.

The Superman analogy was nice.

Sunday, April 11, 2010

Rich interfaces for reading news on the web

Authors:
Earl J. Wagner Northwestern University, Evanston, IL, USA
Jiahui Liu Northwestern University, Evanston, IL, USA
Larry Birnbaum Northwestern University, Evanston, IL, USA
Kenneth D. Forbus Northwestern University, Evanston, IL, USA

paper link:
http://portal.acm.org/citation.cfm?id=1502650.1502658&coll=ACM&dl=ACM&type=series&idx=SERIES823&part=series&WantType=Proceedings&title=IUI&CFID=81639924&CFTOKEN=12013848

In this paper, Wagner et al. created the "Brussell" system which is an interface that compiles summary information on a news article.
Brussell also gathers background information on the article from related articles and links and can construct a kind of summary of the information leading up to the main article. It does this by searching for related links and cross-referencing the information against other articles to remove extraneous and possibly erroneous information.

By giving a summary of an article, at a quick glance users can quickly assimilate news, background information, and current information on certain events.
Even more important is the fact that the system can work off a knowledge base to construct a net of references to older material when looking at a current article.

To test the system the team created templates for several kinds of articles and defined a set of information that the system looks to fill in for the template.
The team also used a database of older articles as a knowledge base for the Brussell system. The system was then run over 100 different news stories to measure the number of references it found. It found an average of 4.1 references per article.

_________
My Spill:

The Brussell system creates an interesting addition to the data mining community by allowing casual users to weave a web of references and background information for news articles. I think the idea for the system is great. Allowing people to have a summarized view of current events could make the general populace more informed on current issues if the system is strong enough.

But that makes me think that the average user might not be motivated enough to use the system to become more educated on current issues, even though the given implementation may be easy enough to use. If the system could provide a means of rewarding the user for taking advantage of the system and reviewing material, then I think this kind of thing could be revolutionary.

I really wonder how "smart" the system really is...