Communication & Society Assignment

Summary from https://www.researchgate.net/publication/281006272_Human-Computer_Interaction_Overview_on_State_of_the_Art

Human-Computer Interaction: Overview on State of the Art

5 Applications
A classic example of a multimodal system is the “Put That There” demonstration system [57].
This system allowed one to move an object into a new location on a map on the screen by
saying “put that there” while pointing to the object itself then pointing to the desired
destination. Multimodal interfaces have been used in a number of applications including map-
based simulations, such as the aforementioned system; information kiosks, such as AT&T’s
MATCHKiosk [58] and biometric authentication systems [56].
Multimodal interfaces can offer a number of advantages over traditional interfaces. For one
thing, they can offer a more natural and user-friendly experience. For instance, in a real-estate
system called Real Hunter [24], one can point with a finger to a house of interest and speak to
make queries about that particular house. Using a pointing gesture to select an object and
using speech to make queries about it illustrates the type of natural experience multimodal
interfaces offer to their users. Another key strength of multimodal interfaces is their ability to
provide redundancy to accommodate different people and different circumstances. For
instance, MATCHKiosk [58] allows one to use speech or handwriting to specify the type of
business to search for on a map. Thus, in a noisy setting, one may provide input through
handwriting rather than speech. A few other examples of multimodal system applications are
listed below:
Smart Video Conferencing [59]
Intelligent Homes/Offices [60]
Driver Monitoring [61]
Intelligent Games [62]
E-Commerce [63]
Helping People with Disabilities [64]
In the following sections, some important applications of multimodal systems are presented
in greater detail.
5.1 Multimodal Systems for People with Disabilities
One good application of multimodal systems is to assist people with disabilities (such as
persons with hand disabilities), who need different kinds of interfaces than other users. In
such systems, users can work on a PC by interacting with the machine using voice and head
movements [65]. Figure 4 shows an actual example of such a system.
Figure 4: Gaze detection pointing system for people with disabilities (taken from
www.adamfulton.co.uk)
Two modalities are used: speech and head movements. Both modalities are active
continuously. The head position indicates the current coordinates of the cursor on the
screen. Speech, on the other hand, provides the needed information about the meaning of
the action to be performed on the object selected by the cursor.
FAKHREDDINE KARRAY ET., AL., HUMAN-COMPUTER INTERACTION: OVERVIEW ON STATE OF THE ART
Synchronization between the two modalities is performed by calculating the cursor position at
the beginning of speech detection. This is because, while a complete sentence is being
pronounced, head movement can shift the cursor so that it points to a different graphical
object; moreover, the command to be fulfilled forms in the user's mind shortly before the
phrase input begins. Figure 5 shows the diagram of this system.
Figure 5: Diagram of a bimodal system [65]
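The synchronization rule described above can be sketched in a few lines: the cursor position is latched at the moment speech onset is detected, so head movement during the rest of the utterance cannot change the command's target. The class and method names below are illustrative assumptions, not the API of the system in [65]:

```python
# Sketch of the speech/head-movement synchronization rule.
# All names here are illustrative assumptions.

class BimodalController:
    def __init__(self):
        self.cursor = (0, 0)
        self.latched_position = None

    def on_head_move(self, x, y):
        # Head position continuously drives the cursor.
        self.cursor = (x, y)

    def on_speech_onset(self):
        # Latch the cursor position as soon as speech is detected.
        self.latched_position = self.cursor

    def on_speech_recognized(self, command):
        # Apply the spoken command to the object that was under the
        # cursor when the utterance BEGAN, not where the cursor is now.
        target = self.object_at(self.latched_position)
        return (command, target)

    def object_at(self, pos):
        # Placeholder hit test; a real system would query the GUI.
        return f"object@{pos}"
```

The key design point is that `on_speech_recognized` resolves the target from the latched position, so the head is free to move while the sentence is being spoken.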
Despite some decrease in operation speed, this multimodal assistive system allows working
with a computer without a standard mouse and keyboard. Hence, such a system can be
successfully used for hands-free PC control by users with hand disabilities.
5.2 Emotion Recognition Multimodal Systems
As we move towards a world in which computers are more and more ubiquitous, it will
become more essential that machines perceive and interpret all clues, implicit and explicit,
that we may provide them regarding our intentions. A natural human-computer interaction
cannot be based solely on explicitly stated commands. Computers will have to detect the
various behavioural signals based on which to infer one’s emotional state. This is a significant
piece of the puzzle that one has to put together to predict accurately one’s intentions and
future behaviour.
People are able to make predictions about one's emotional state based on observations of
one's face, body, and voice. Studies show that if one had access to only one of these
modalities, the face modality would produce the best predictions. However, this accuracy can
INTERNATIONAL JOURNAL ON SMART SENSING AND INTELLIGENT SYSTEMS, VOL. 1, NO. 1, MARCH 2008
be improved by 35% when human judges are given access to both face and body modalities
together [66]. This suggests that affect recognition, which has for the most part focused on
facial expressions, can greatly benefit from multimodal fusion techniques.
One of the few works that has attempted to integrate more than one modality for affect
recognition is [67] in which facial features and body posture features are combined to produce
an indicator of one’s frustration. Another work that integrated face and body modalities is
[68] in which the authors showed that, similar to humans, machine classification of emotion is
better when based upon face and body data, rather than either modality alone. In [69], the
authors attempted to fuse facial and voice data for affect recognition. Once again, consistent
with human judges, machine classification of emotion as neutral, sad, angry, or happy was
most accurate when the facial and vocal data were combined.
They recorded four emotions: sadness, anger, happiness, and a neutral state. Detailed facial
motions were captured in conjunction with simultaneous speech recordings. The conducted
experiments showed that the facial-expression-based recognition system outperformed the
one based on acoustic information alone. Results also show that an appropriate fusion of both
modalities gave measurable improvements.
Results show that the emotion recognition system based on acoustic information alone gives
an overall performance of 70.9 percent, compared to an overall performance of 85 percent for
a recognition system based on facial expressions. This is largely because the cheek areas
provide important information for emotion classification. For the bimodal system that fuses
facial and acoustic information, on the other hand, the overall classifier performance was
89.1 percent.
5.3 Map-Based Multimodal Applications
Different input modalities are suitable for expressing different messages. For instance, speech
provides an easy and natural mechanism for expressing a query about a selected object or
requesting that the object initiate a given operation. However, speech may not be ideal for
tasks such as selecting a particular region on the screen or tracing out a particular path.
These types of tasks are better accommodated by hand or pen gestures. Yet making queries
about a given region and selecting that region are both typical tasks that should be
accommodated by a map-based interface. Thus, the natural conclusion is that map-based
interfaces can greatly improve the user experience by supporting multiple modes of input,
especially speech and gestures.
Quickset [70] is one of the more widely known and older map-based applications that make
use of speech and pen gesture input. Quickset is a military-training application that allows
users to use one of the two modalities or both simultaneously to express a full command. For
instance, users may simply draw out with a pen a predefined symbol for platoons at a given
location on the map to create a new platoon in that location. Alternatively, users could use
speech to specify their intent to create a new platoon and could vocally specify the
coordinates in which to place the platoon. Lastly, users could vocally express their intent to
create a new platoon while making a pointing gesture with a pen to specify the location of
the new platoon.
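A common way to realize this kind of combination is frame merging: speech and pen gesture each contribute a partial command frame, and fusion fills in the missing slots. The slot names and the merge rule below are illustrative assumptions about how such systems work in general, not Quickset's actual internals:

```python
# Sketch of frame merging in a Quickset-style multimodal interface.
# Slot names and the conflict rule are illustrative assumptions.

def merge_frames(speech_frame, gesture_frame):
    """Merge two partial command frames; filled speech slots win."""
    merged = dict(gesture_frame)
    merged.update({k: v for k, v in speech_frame.items()
                   if v is not None})
    return merged

# "Create a platoon" is spoken; the location is pointed to with a pen.
speech = {"action": "create", "unit": "platoon", "location": None}
gesture = {"location": (41.2, -73.9)}
command = merge_frames(speech, gesture)
```

The same merge works for the other two input styles: a pen-only symbol fills every slot from the gesture frame, and a speech-only command fills every slot vocally.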
A more recent multimodal map-based application is Real Hunter [24]. It is a real-estate
interface that expects users to select objects or regions with touch input while making queries
using speech. For instance, the user can ask “How much is this?” while pointing to a house on
the map.
Tour guides are another type of map-based application that has shown great potential to
benefit from multimodal interfaces. One such example is MATCHKiosk [58], an interactive
city guide. In a similar fashion to Quickset, MATCHKiosk allows one to express certain
queries using speech only, such as “Find me Indian restaurants in Washington”; using pen
input only, by circling a region and writing out “restaurants”; or using bimodal input, by
saying “Indian restaurants in this area” while drawing a circle around Alexandria. These
examples illustrate MATCHKiosk's incorporation of handwriting recognition, which can
frequently substitute for speech input. Although speech may be the more natural option for a
user, given the imperfection of speech recognition, especially in noisy environments, having
handwriting as a backup can reduce user frustration.
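The fallback idea can be sketched as confidence-based modality selection: prefer the speech hypothesis, but switch to handwriting when the recognizer's confidence is low, as in a noisy kiosk setting. The threshold value and the `(text, confidence)` result shape are assumptions for illustration:

```python
# Sketch of confidence-based fallback between modalities.
# The 0.7 threshold and result format are illustrative assumptions.

def pick_input(speech_result, handwriting_result, threshold=0.7):
    """Each result is a (text, confidence) pair; pick the usable one."""
    text, conf = speech_result
    if conf >= threshold:
        return text
    # Speech was too uncertain (e.g. noisy environment): fall back.
    return handwriting_result[0]

query = pick_input(("indian restrooms", 0.42),
                   ("indian restaurants", 0.9))
```

A real kiosk would also weigh the handwriting recognizer's own confidence, but even this simple rule captures why redundancy across modalities reduces frustration.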
5.4 Multimodal Human-Robot Interface Applications
Similar to some map-based interfaces, human-robot interfaces usually have to provide
mechanisms for pointing to particular locations and for expressing operation-initiating
requests. As discussed earlier, the former type of interaction is well accommodated by
gestures, whereas the latter is better accommodated by speech. Thus, the human-robot
interface built by the Naval Research Laboratory (NRL) should come as no surprise [71].
NRL’s interface allows users to point to a location while saying “Go over there”.
Additionally, it allows users to use a PDA screen as a third possible avenue of interaction,
which can be resorted to when speech or hand-gesture recognition fails. Another
multimodal human-robot interface is the one built by Interactive System Laboratories (ISL)
[72], which allows use of speech to request the robot to do something while gestures could be
used to point to objects that are referred to by the speech. One such example is to ask the
robot, “switch on the light” while pointing to the light. Additionally, in ISL’s interface, the
system may ask for clarification from the user when unsure about the input. For instance, in
case that no hand gesture is recognized that is pointing to a light, the system may ask the user:
“Which light?”
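The clarification behaviour can be sketched as reference resolution with a fallback question: a gesture resolves the referent directly, an unambiguous context resolves it implicitly, and anything else triggers a follow-up question. Function and field names here are illustrative assumptions, not ISL's actual interface:

```python
# Sketch of clarification-on-ambiguity in a human-robot interface.
# All names and structures are illustrative assumptions.

def resolve_command(utterance, pointed_object, known_objects):
    """Resolve the referent of a spoken command, or ask to clarify."""
    referent = utterance["referent"]
    if pointed_object is not None:
        # A pointing gesture disambiguates the referent directly.
        return ("execute", utterance["action"], pointed_object)
    candidates = [o for o in known_objects if o.startswith(referent)]
    if len(candidates) == 1:
        return ("execute", utterance["action"], candidates[0])
    # Ambiguous referent and no gesture: ask for clarification.
    return ("clarify", f"Which {referent}?", None)

lights = ["light_kitchen", "light_desk"]
result = resolve_command({"action": "switch_on", "referent": "light"},
                         None, lights)
```

With a gesture supplied (`pointed_object="light_desk"`), the same call executes immediately instead of asking.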
5.5 Multi-Modal HCI in Medicine
By the early 1980s, surgeons were beginning to reach the limits of traditional methods alone.
The human hand was unsuited to many tasks, and greater magnification and smaller tools
were needed. Higher precision was required to localize and manipulate within small and
sensitive parts of the human body. Digital robotic neuro-surgery emerged rapidly as a leading
solution to these limitations, driven by vast improvements in engineering, computer
technology, and neuro-imaging techniques. Robotic surgery was thus introduced into the
operating room [73].
The State University of Aerospace Instrumentation, the University of Karlsruhe (Germany),
and Harvard Medical School (USA) have been working on developing man-machine
interfaces, adaptive robots, and multi-agent technologies intended for neuro-surgery [54].
The neuro-surgical robot consists of the following main components: an arm, feedback vision
sensors, controllers, a localization system, and a data processing centre. The sensors provide
the surgeon with real-time imaging feedback from the surgical site, and the surgeon in turn
updates the controller with new instructions for the robot using the computer interface and
joysticks.
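The loop described above (sense, review, correct, actuate) is a feedback control cycle. As a highly simplified sketch, a single proportional correction step can stand in for the controller's job of nudging the arm toward the surgeon's commanded target; this is an illustration of the feedback principle, not the actual surgical control software:

```python
# Highly simplified sketch of the sensor -> surgeon -> controller ->
# arm feedback cycle, reduced to one proportional correction step.
# Values and the gain are illustrative assumptions.

def control_step(arm_position, target, gain=0.5):
    """Move the arm a fraction of the way toward the target."""
    error = target - arm_position
    return arm_position + gain * error

# Over repeated feedback cycles the arm converges on the target.
pos = 0.0
for _ in range(10):
    pos = control_step(pos, target=1.0)
```

Each cycle shrinks the remaining error by the gain factor, which is why repeated imaging feedback lets the system reach positions far more precisely than a single open-loop motion could.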
Neuro-surgical robotics provides the ability to perform surgeries on a much smaller scale
with much higher accuracy and precision, giving access to small corridors, which is critically
important when brain surgery is involved [73].













Link from HUMAN–COMPUTER INTERACTION AND MANAGEMENT INFORMATION SYSTEMS: FOUNDATIONS

























Summary of Design-oriented Human-Computer Interaction

https://www.researchgate.net/publication/2908300_Design-oriented_Human---Computer_Interaction












DESIGN AS UNFOLDING

If one accepts the importance of sketching in design work, it is also easier to understand and appreciate the argument that design is a kind of dialogue; a reflective conversation. But if design is then reconsidered in terms of Schön's problem setting and problem solving, it is important that these are not interpreted as two distinct or successive activities. They are rather intertwined in the activity of design, an inseparable pair only unfolded through the design dialogue. Design in this sense becomes more of a search for a symmetrical, coherent, and well-balanced whole [3], a complete gestalt [36], rather than a process of first setting up and then solving problems.

Using sketching to work out a coherent whole means putting ideas to use (externalization), but it also means that these ideas are put to a test (interpretation) [14, 31, 36]: How about this? Would this damage the whole? The interpretation that unavoidably occurs when something is put to use is rarely explicit, as it is so embedded in use that we do not think of it as also a test [14].

If the use/test pair fails, the designer tries another approach, a new angle on the problem or on the problem setting. Failure only becomes explicit when the designer is unable to approach the problem or the problem setting from a new angle; the designer has got stuck. Likewise, success is not measured in explicit terms; it stems from a lack of failure rather than an explicit achievement, from actions of one or many use/test pairs that do not suggest problems, allowing the designer to move on [14]. The design dialogue thus unfolds, exploring the tension between details and the search for a coherent, well-balanced whole.
