This paper explores the unique challenges faced by tribal communities in the context of emergency management, encompassing natural disasters and the preservation of their rich cultural heritage. The study aims to investigate both the potential advantages and hurdles associated with the adoption of large language models (LLMs) in tribal emergency management. Our primary goal is to qualitatively assess Indigenous perspectives on the suitability and acceptability of deploying an LLM-powered chatbot in this specific domain. To achieve this objective, we employ a think-aloud interview methodology involving 18 tribal members. This qualitative research approach captures participants’ cognitive processes and decision-making as they engage with the language model’s responses in real time. Through thematic analysis of these verbalized thoughts and the prompts submitted, the study sheds light on various aspects, including usability, information-seeking behavior, and the incorporation of tribal cultural considerations when integrating large language models into tribal emergency management practices. The paper concludes with a discussion of potential design implications and contributions to the fields of AI and HCI.
Messaging apps, such as Telegram and WhatsApp, are routinely used to communicate, chat, and make decisions. Group Recommender Systems (GRSs) have been introduced as standalone tools to support group interactions and decision-making. We present here a Telegram bot, named CHARM, that supports groups in making a decision on an arbitrary topic by leveraging GRS techniques. CHARM helps elicit the group members’ preferences, ranks the items that the members have suggested for consideration, provides a summary of the current status of the discussion, and finally recommends a fair choice. A focus group study revealed that the designed functionality includes features that users expect to find in a bot aimed at supporting group decision-making.
Choreography creation is a multimodal endeavor, demanding cognitive abilities to develop creative ideas and technical expertise to convert choreographic ideas into physical dance movements. Previous endeavors have sought to reduce the complexities in the choreography creation process in both dimensions. Among them, non-AI-based systems have focused on reinforcing cognitive activities by helping analyze and understand dance movements and augmenting physical capabilities by enhancing body expressivity. On the other hand, AI-based methods have helped the creation of novel choreographic materials with generative AI algorithms. The choreography creation process is constrained by time and requires a rich set of resources to stimulate novel ideas, but the need for iterative prototyping and reduced physical dependence have not been adequately addressed by prior research. Recognizing these challenges and the research gap, we present an innovative AI-based choreography-support system. Our goal is to facilitate rapid ideation by utilizing a generative AI model that can produce diverse and novel dance sequences. The system is designed to support iterative digital dance prototyping through an interactive web-based user interface that enables the editing and modification of generated motion. We evaluated our system by inviting six choreographers to analyze its limitations and benefits and present the evaluation results along with potential directions for future work.
Intent prediction finds widespread applications in user interface (UI/UX) design to predict target icons, in the automotive industry to anticipate a driver’s intent, and in understanding human motion during human-robot interaction (HRI). Predicting human intent involves analyzing factors such as hand motion, eye gaze movement, and gestures. This paper introduces a multimodal intent prediction algorithm that combines hand and eye gaze cues using Bayesian fusion. Inverse reinforcement learning was leveraged to learn human preferences for the human-robot handover task. Results demonstrate that the proposed approach achieves the highest prediction accuracy of 99.9% at 60% task completion compared to state-of-the-art (SOTA) methods.
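As an illustration of the kind of multimodal fusion described above, the sketch below combines per-target likelihoods from hand motion and eye gaze under a conditional-independence assumption. The candidate targets, prior, and likelihood values are hypothetical illustrations, not the paper’s learned model.

```python
def bayes_fuse(prior, hand_lik, gaze_lik):
    """Fuse hand-motion and gaze likelihoods over candidate goals.

    Assumes the two modalities are conditionally independent given
    the goal, so the posterior is proportional to the elementwise
    product of the prior and the per-modality likelihoods.
    """
    post = [p * h * g for p, h, g in zip(prior, hand_lik, gaze_lik)]
    z = sum(post)  # normalizing constant
    return [p / z for p in post]

# Hypothetical values for four candidate handover targets.
prior = [0.25, 0.25, 0.25, 0.25]
hand_lik = [0.7, 0.1, 0.1, 0.1]   # likelihood of observed hand motion per target
gaze_lik = [0.6, 0.2, 0.1, 0.1]   # likelihood of observed gaze per target

posterior = bayes_fuse(prior, hand_lik, gaze_lik)
predicted = max(range(len(posterior)), key=posterior.__getitem__)
```

In a real pipeline the per-target likelihoods would come from models of hand trajectories and gaze fixations rather than fixed numbers, but the fusion step itself stays this simple.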
Robotics is a trailblazing technology that has found extensive applications in assistive aids for individuals with severe speech and motor impairment (SSMI). This article describes the design and development of an eye gaze-controlled user interface for manipulating a robotic arm. User studies were conducted in which participants used eye gaze input to select stamps from two designs and to select the stamping location on cards using three designated boxes in the user interface. The entire process, from stamp selection to stamping location selection, is controlled by eye movements. The user interface contains a print button that initiates the robotic arm, enabling the user to independently create personalized stamped cards. Extensive user interface trials revealed that individuals with severe speech and motor impairment showed improvements, with a 33.2% reduction in the average time taken and a 42.8% reduction in the standard deviation of task completion time. This suggests the approach’s effectiveness and its potential to enhance the autonomy and creativity of individuals with SSMI, contributing to the development of inclusive assistive technologies.
With the rapid improvement in large language model (LLM) capabilities, it is becoming more difficult to measure the quality of outputs generated by natural language generation (NLG) systems. Conventional metrics such as BLEU and ROUGE are bound to reference data and are generally unsuitable for tasks that require creative or diverse outputs. Human evaluation is an option, but manually evaluating generated text is difficult to do well, and expensive to scale and repeat as requirements and quality criteria change. Recent work has focused on the use of LLMs as customizable NLG evaluators, and initial results are promising. In this demonstration we present EvaluLLM, an application designed to help practitioners set up, run, and review evaluations over sets of NLG outputs, using an LLM as a custom evaluator. Evaluation is formulated as a series of choices between pairs of generated outputs conditioned on user-provided evaluation criteria. This approach simplifies the evaluation task and obviates the need for complex scoring algorithms. The system can be applied to general evaluation, human-assisted evaluation, and model selection problems.
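The pairwise formulation can be illustrated with a minimal sketch: outputs are ranked by how often they win head-to-head comparisons. The `judge` callable below stands in for an LLM call that returns the preferred output of a pair under some criterion; it and the toy length-based preference rule are hypothetical, not EvaluLLM’s actual API.

```python
from collections import defaultdict
from itertools import combinations

def rank_by_pairwise_wins(outputs, judge):
    """Rank candidate NLG outputs from pairwise preference judgments.

    `judge(a, b)` returns whichever of the two outputs it prefers;
    outputs are then sorted by their total number of pairwise wins.
    """
    wins = defaultdict(int)
    for a, b in combinations(outputs, 2):
        wins[judge(a, b)] += 1
    return sorted(outputs, key=lambda o: wins[o], reverse=True)

# Toy judge preferring the longer output, in place of an LLM evaluator.
outputs = ["ok", "a detailed summary", "short answer"]
ranking = rank_by_pairwise_wins(outputs, judge=lambda a, b: max(a, b, key=len))
```

With n outputs this costs n(n-1)/2 judge calls, which is the trade-off pairwise schemes make for avoiding an absolute scoring scale.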
Ensuring that a machine learning model performs as intended is a critical step before it can be used in practice. This is commonly done by measuring a model’s predictive performance (e.g., accuracy). However, in high-stakes settings it is often necessary to verify which data aspects the model actually relies on. This demo presents XAIVIER, the eXplainable AI VIsual Explorer and Recommender, a web application for interactive XAI on time series data. XAIVIER supports dataset exploration and model inspection, allowing users to explain model predictions using various explainer methods. An explainer recommender is provided to advise users which explainer delivers the most faithful explanations for their dataset and model. Finally, explanation-based grouping is provided to reveal the model’s underlying decision-making strategies. The proposed set of features aims to cover the full model verification use case for time series classifiers. A demo of XAIVIER is available at https://xai-explorer-demo.know-center.at
In recent years, the number of cyclists has increased and the use of bicycles has been promoted worldwide as a means of transportation in Mobility as a Service (MaaS). However, route recommendation that incorporates both efficiency and comfort is yet to be explored further. In this study, we construct a navigation system that recommends the shortest route for a cyclist. While the cyclist traverses the recommended route, the cyclist’s facial images are captured and the unevenness of the route is recorded periodically through the smartphone’s camera and accelerometer. The system estimates the level of comfort the cyclist experienced from facial expressions and the degree of unevenness throughout the recommended route. Finally, the information on the experienced level of comfort for the recommended route is annotated in Google Maps. We conducted real-life experiments over two consecutive days with four subjects, each of whom rode a bicycle for about 1.5 hours per day following the route recommended by the navigation system.
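A minimal sketch of how the two signals described above might be blended into a per-segment comfort estimate. The linear weighting, the normalization constant for accelerometer variance, and the value ranges are illustrative assumptions, not the paper’s actual estimator.

```python
def comfort_score(expression_valence, accel_variance, alpha=0.6):
    """Per-segment comfort estimate in [0, 1].

    expression_valence: estimated facial-expression positivity in [0, 1].
    accel_variance: vertical acceleration variance for the segment;
        divided by an assumed constant (2.0) and clipped to get a
        roughness value in [0, 1].
    alpha: assumed weight of the facial signal vs. road roughness.
    """
    roughness = min(accel_variance / 2.0, 1.0)
    return alpha * expression_valence + (1 - alpha) * (1 - roughness)

# A positive expression on a moderately rough segment.
score = comfort_score(expression_valence=0.8, accel_variance=1.0)
```

Each route segment’s score could then be attached as a map annotation, matching the Google Maps step described in the abstract.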
Carousel-based interfaces are integral in enhancing user experience in online recommender systems like streaming services or e-commerce platforms, yet their usability evaluation often lacks standardization. Existing work on evaluating recommender systems, from toolkits to infrastructure, mainly assesses recommendation algorithms rather than user experience. This focus leads to a limited understanding of recommender systems’ effectiveness, as it overlooks the role of user interface design, especially carousel-based interfaces, in user experience. In response, this paper introduces a web-based infrastructure for the usability assessment of carousel-based interfaces. Our infrastructure is adaptable for various domains and setups, and its modular design allows for potential expansion.
We demonstrate a modular display device that interweaves artificial intelligence, tangible interaction, and an ACM IUI corpus. Our design incorporates a low-cost, shape-changing tangible interface with an array of illuminated, actuated physical interactors and hexagonal tokens (hextoks), within a structure of varying scale, sidedness, materiality, and computational hardware composition. The demonstration illustrates generative AI syntheses of varying-scale physical and virtual platforms; of rich content, drawing upon all published ACM IUI articles and Wikipedia; visuals, automatically synthesized for varying-scale & medium displays; and kinetics, as well as illustrating paths for generatively upcycling second-life computational technologies.
Informational videos serve as a crucial source of conceptual and procedural knowledge. When producing informational videos, editors overlay text or images and trim footage to enhance video quality and make it more engaging. However, video editing can be difficult and time-consuming, especially for novice video editors who often struggle to express and implement their editing ideas. We present ExpressEdit, a system that enables editing videos via natural language (NL) text and sketching on the video frame. Powered by an LLM and vision models, the system interprets (1) temporal, (2) spatial, and (3) operational references in an NL command, as well as spatial references expressed through sketching. This work offers insights into building multimodal interfaces for video editing.
As interest in programming as a major grows, instructors must accommodate more students in their programming courses. One particularly challenging aspect of this growth is providing quality assistance to students during in-class and out-of-class programming exercises. Prior work proposes using instructor dashboards to help instructors combat these challenges. Further, the introduction of ChatGPT represents an exciting avenue for assisting instructors with programming exercises, but it needs a delivery method for this assistance. We propose a revision of a current instructor dashboard, Assistant Dashboard Plus, that extends an existing dashboard with two new features: (a) identifying students in difficulty so that instructors can effectively assist them, and (b) providing instructors with pedagogically relevant groupings of students’ exercise solutions with similar implementations so that instructors can provide overlapping code-style feedback to students within the same group. For difficulty detection, it uses a state-of-the-art algorithm for which a visualization has not previously been created. For code clustering, it uses GPT. We present a first-pass implementation of this dashboard.
This demo presents a human-AI collaborative system to assist pathologists in examining the pathological pattern of mitosis, a critical factor in tumor diagnosis. Traditionally, pathologists face challenges in assessing mitoses due to the task’s inherent complexity. The demonstrated system aims to address the problem by designing an enhanced human-artificial intelligence workflow. Firstly, it can guide a pathologist user to regions of interest that have flexible morphologies. Then, inside each region of interest, the system highlights each AI-detected mitosis event along with enriched forms of explainable AI evidence. The system can potentially improve the efficiency and correctness of pathologists’ mitosis assessment by enabling them to leverage the power of AI while retaining their clinical expertise and judgment.
The success of GPT with coding tasks has made it important to consider the impact of GPT and similar models on teaching programming. Students’ use of GPT to solve programming problems can hinder their learning. However, they might also get significant benefits such as quality feedback on programming style, explanations of how a given piece of code works, help with debugging code, and the ability to see valuable alternatives to their code solutions. We propose a new design for interacting with GPT called Mediated GPT with the goals of (a) providing students with access to GPT but allowing instructors to programmatically modify responses to prevent hindrances to student learning and combat common GPT response concerns, (b) helping students generate and learn to create effective prompts to GPT, and (c) tracking how students use GPT to get help on programming exercises. We demonstrate a first-pass implementation of this design called NotebookGPT.
We demonstrate (X)AI-SPOT - the (X)AI-Supported Process Optimization Tool - which aims to encourage, facilitate, and enhance AI usage for process engineering and optimization in the production industry. Rather than a one-size-fits-all approach, (X)AI-SPOT is a framework in which each user archetype (Shop Floor Worker, Field Expert, and AI Expert) can receive tailored XAI functionality suited to their unique requirements. Currently, (X)AI-SPOT handles the Shop Floor Worker archetype, with initial support for the Field Expert. We also describe our tool’s architecture with respect to extensibility and support for different user archetypes; we share our findings from an expert user interview and conclude with a discussion of design decisions and future work. Our application is available at http://exait.know-center.at/mv-ui.
Undesirable consequences of digital technologies are often difficult to foresee at the time of their design and development, but having access to examples can help. This demonstration paper presents Blip, a research prototype that provides a catalog of undesirable consequences by extracting, summarizing, and categorizing undesirable consequences from online articles and academic papers using large language models (LLMs). Blip provides a web-based interactive user interface that allows users to explore undesirable consequences pertaining to different life aspects (e.g., “economy” or “politics”) and bookmark those found to be relevant.
The recent public releases of AI tools such as ChatGPT have forced computer science educators to reconsider how they teach. These tools have demonstrated considerable ability to generate code and answer conceptual questions, rendering them incredibly useful for completing CS coursework. While overreliance on AI tools could hinder students’ learning, we believe they have the potential to be a helpful resource for both students and instructors alike. We propose a novel system for instructor-mediated GPT interaction in a class discussion board. By automatically generating draft responses to student forum posts, GPT can help Teaching Assistants (TAs) respond to student questions in a more timely manner, giving students an avenue to receive fast, quality feedback on their solutions without turning to ChatGPT directly. Additionally, since they are involved in the process, instructors can ensure that the information students receive is accurate, and can provide students with incremental hints that encourage them to engage critically with the material, rather than just copying an AI-generated snippet of code. We utilize Piazza—a popular educational forum where TAs help students via text exchanges—as a venue for GPT-assisted TA responses to student questions. These student questions are sent to GPT-4 alongside assignment instructions and a customizable prompt, both of which are stored in editable instructor-only Piazza posts. We demonstrate an initial implementation of this system, and provide examples of student questions that highlight its benefits.
In text-based communication, enhancing user experiences during the earliest stages of contact, particularly while establishing rapport, is of utmost importance. This research investigates differences between male and female users in the utilisation of an AI conversation assistant at the beginning of a conversation. The system has text “Recommendation” and “Polishing” capabilities and can customise the linguistic style, selecting either a humorous or a respectful tone. Users could also choose among three different levels of AI extraversion. In user evaluation studies, the system received a favourable usability rating, confirming its efficacy. Generally, male users reported a greater level of user experience than female users. However, both male and female users indicated an elevated sense of comfort and a greater inclination to sustain connections while utilising the AI system.
Although there has been a significant increase in research on the user experience of recommender systems, existing studies do not delineate varied experiences based on cognitive abilities. In this paper, we evaluate the impact of recommender systems on users with the neurodevelopmental classification of Attention Deficit Hyperactivity Disorder (ADHD). Through constructivist grounded theory analysis of six contextual interviews, we formulate an initial theory explaining how personalized recommendations exacerbate ADHD users’ self-regulatory challenges, leading to overarching detrimental consequences in ADHD users’ interpersonal lives. Furthermore, although participants found community and social support through personalized recommendations, the challenges of personalized recommendations outweighed the benefits.
Many document search systems display search results with snippets, i.e., brief excerpts from the matched documents. Snippets help users quickly locate relevant documents in the result list, but when users search for documents in non-native languages, they cannot read the snippets quickly. In this paper, we compare two methods of presenting search results for such non-native-language documents. One is to show a snippet translated into the user’s native language; the other is to show the original snippet together with several keyphrases extracted from the document and translated into the user’s native language. The results of our experiment show that translated snippets are not effective in non-native-language document search, whereas translated keyphrases are.
As organizations recognize the potential of Large Language Models (LLMs), bespoke domain-specific solutions are emerging, which inherently face challenges of knowledge gaps and contextual accuracy. Prompt engineering techniques such as chain-of-thought and few-shot prompting have been proposed to enhance LLMs’ capabilities by dynamically presenting relevant exemplars. Are LLMs able to infer domain knowledge from code exemplars involving similar domain concepts and analyze the data correctly? To investigate this, we curated a synthetic dataset containing 45 tabular databases, each with domain concepts and definitions, natural language data analysis queries, and responses in the form of Python code, visualizations, and insights. Using this dataset, we conducted a within-subjects experiment to evaluate the effectiveness of domain-specific exemplars versus randomly selected, generic exemplars. Our study underscores the significance of tailored exemplars in enhancing LLMs’ accuracy and contextual understanding in domain-specific tasks, paving the way for more intuitive and effective data analysis solutions.
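To make the exemplar-selection idea concrete, the sketch below scores candidate few-shot exemplars by the overlap between their domain concepts and those of the incoming query, in contrast to random selection. The Jaccard measure, the concept sets, and the example prompts are illustrative assumptions, not the study’s actual curation procedure.

```python
def select_exemplars(query_concepts, pool, k=2):
    """Pick the k exemplars whose domain concepts best match the query.

    pool: maps each exemplar prompt to its set of domain concepts.
    Exemplars are ranked by Jaccard overlap with the query's concepts,
    a simple stand-in for domain-aware exemplar retrieval.
    """
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    ranked = sorted(pool.items(),
                    key=lambda kv: jaccard(query_concepts, kv[1]),
                    reverse=True)
    return [prompt for prompt, _ in ranked[:k]]

# Hypothetical exemplar pool of (prompt, concept-set) pairs.
pool = {
    "plot monthly churn": {"churn", "retention", "time_series"},
    "sum sales by region": {"sales", "aggregation"},
    "forecast churn rate": {"churn", "forecast", "time_series"},
}
chosen = select_exemplars({"churn", "time_series"}, pool)
```

The chosen exemplars would then be prepended to the prompt before the user’s query, which is the few-shot mechanism the abstract evaluates.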
In the realm of generative AI applications, explicitly focusing on LLM-driven conversational agents, there exists a notable gap in addressing the needs of individuals with disabilities, including those with dementia. This oversight raises concerns regarding equity and accessibility in deploying potentially beneficial assistive technologies. This paper seeks to form best practices for instructing individuals with mild to moderate dementia on integrating assistive technology, in this case ChatGPT 3.5 (a conversational AI), into their daily activities. The research, conducted through remote focus groups and optional individual training sessions, identified three key best practices: 1) implementing conversational learning in groups, 2) facilitating proxy interactions, and 3) providing 1-on-1 guided walkthroughs. These findings not only contribute best practices for instructing individuals with cognitive differences in using emerging text-based LLM-driven conversational agents but also emphasize the potential for inclusive design of AI systems tailored for people with mild to moderate dementia. The study underscores the interest and capability of individuals with dementia to learn and interact with LLM-driven conversational agents, offering insights into incorporating such technology into their lives.
Recent work highlights digital privacy education as a crucial component in overcoming the digital literacy gap among seniors, but also shows that seniors distrust AI systems and prefer a more personable education experience. To this end, we conducted a within-subjects experiment to explore the pivotal role of trust in digital privacy education for older adults, with a specific focus on how the physical characteristics of instructors—human and AI—influence trust levels. In our study, 36 younger and 27 older participants evaluated 9 introductions to a video tutorial on digital privacy, featuring 3 AI and 6 human instructors (the latter varying by age and gender). Analysis revealed that trust towards the AI instructors was lower than towards the human instructors. Among the AI instructors, a robot with human-like features was the most trusted, while among the human instructors, the older and middle-aged female instructors were the most trusted. Furthermore, participant demographics such as gender and rurality were found to moderate trust levels. This research has implications for instructional design and technology acceptance, particularly in addressing privacy concerns and fostering inclusive digital literacy among the senior population.
The seventh HUMANIZE workshop on Transparency and Explainability in Adaptive Systems through User Modeling Grounded in Psychological Theory was held in conjunction with the 29th annual meeting of the Intelligent User Interfaces (IUI) community, which took place March 18-21, 2024 in Greenville, South Carolina, USA. The 2024 edition of the workshop was held together with SOCIALIZE (social and cultural integration with personalized interfaces). The workshop provided a venue for researchers from different fields to interact by accepting contributions at the intersection of practical data mining methods and theoretical knowledge for personalization. A total of two papers were accepted for this edition of the workshop.
Recent research in eXtended Reality (XR) technologies, computer vision, and AI is redefining interaction with electronic media in ways not previously imagined. Since its inception at the 1956 Dartmouth Workshop, Artificial Intelligence (AI) has revitalized itself many times by finding new theories and applications. The newly popularized concept of the Metaverse can be an apt platform for merging AI and XR technologies. This workshop is planned to investigate user interface and interaction issues with the Metaverse and to scope the use of AI tools and technology for improving UI/UX for the Metaverse. As part of the workshop, the very concept of the Metaverse will be explored in detail with members from academia, industry, and standardization bodies. Paper and poster submissions will be solicited, and interactive demonstration sessions will be organized to explore various use cases and related UI/UX challenges in the Metaverse.
As the integration of Artificial Intelligence into daily decision-making processes intensifies, the need for clear communication between humans and AI systems becomes crucial. The Adaptive XAI (AXAI) workshop focuses on the design and development of intelligent interfaces that can adaptively explain AI’s decision-making processes and our engagement with those processes. In line with the human-centric principles of the Future Artificial Intelligence Research (FAIR) project, this workshop seeks to explore, understand, and develop interfaces that dynamically adapt, thereby creating explanations of AI-based systems that both relate to and resonate with a range of users with different explanation-based requirements. As AI’s role in our lives becomes ever more embedded, the ways in which such systems explain themselves need to be malleable and responsive to the individual’s ever-evolving cognitive state, contextual needs and focus, and social setting. For instance, easy-to-use and effective interaction modalities such as Visual Languages can provide users with intuitive mechanisms to interact with, adjust, and reshape AI narratives. This ensures that a richer, more tailored understanding can be provided, allowing explanations to emerge in line with users’ demands and the ever-shifting contexts they find themselves in, both as individuals and as part of a group. The Adaptive XAI workshop extends an invitation to scholars, designers, and technologists to collaboratively shape the future of human-XAI interplay.
Generative AI has rapidly entered the public consciousness with the development of applications such as ChatGPT, Midjourney, and GitHub Copilot. Nielsen recently argued that we have entered a new era of human-computer interaction in which users need only specify what they want and not how it should be produced [1]. This paradigm of intent-based outcome specification shifts control from people to AI, enabling new forms of co-creativity and co-creation. Although these systems are capable of holding fluent conversations and producing high-fidelity images, difficulties remain regarding their ability to produce outputs that satisfy their users’ needs. Our workshop will bring together researchers and practitioners from both the HCI and AI disciplines to explore the implications of this shift in control, deepen our understanding of the human-AI co-creative process, and examine how we can design, build, use, and evaluate human-AI co-creative systems that are both effective and safe.
This is the fourth edition of the SOcial and Cultural IntegrAtion with PersonaLIZEd Interfaces (SOCIALIZE) workshop. Like the three preceding editions, this year’s primary aim is to provide an occasion for everyone keen on exploring and adopting new technologies to foster inclusivity by overcoming potential cultural, social, and language barriers. Particular emphasis is given to those facing challenges in establishing interpersonal connections, and in pursuing this ambitious goal, social robots represent potential key contributors. This year, the invited talk will focus on a crucial goal for AI systems focused on human needs: facilitating and analyzing diverse perspectives, encompassing cultural, emotional, and contemplative dimensions. More precisely, the findings and insights derived from implementing diversity-focused recommendations will be presented and discussed as part of a European project that aims to expand users’ perspectives, promoting more inclusive, multi-faceted, and empathy-driven understandings of cultural content.
Digital History and Cultural Heritage encapsulate invaluable societal narratives, yet scholars and practitioners face challenges in data quality, accessibility, and engagement. Human-AI Interaction (HAI) holds promise to address these challenges, fostering enhanced analysis, discoverability, and storytelling at scale. However, its potential remains largely untapped by the HAI community. This workshop aimed to bridge this gap, inviting scholars and practitioners from fields such as human-computer interaction (HCI), artificial intelligence (AI), history, cultural heritage, and GLAMs (galleries, libraries, archives, and museums) to explore innovative HAI methodologies and frameworks tailored to these domains. Through interdisciplinary dialogue, we aimed to propose tractable solutions, enriching both the Digital History and Cultural Heritage sectors, as well as the HAI field, while nurturing a fertile ground for historical storytelling and meaningful engagement with our shared past.
The aim of this workshop is two-fold. First, it aims to establish a research community focused on the design and evaluation of synthetic speech (TTS) interfaces tailored not only to goal-oriented tasks (e.g., food ordering, online shopping) but also to applications promoting personal growth and resilience (e.g., coaching, mindful reflection, and tutoring). Second, through discussion and collaborative efforts, it aims to establish a set of practices and standards that will help improve the ecological validity of TTS evaluation. In particular, the workshop will explore topics such as the interaction design of voice-based conversational interfaces and the interplay between prosodic aspects of TTS (e.g., pitch variance, loudness, jitter) and their impact on voice perception. This workshop will serve as a platform on which to build a community that is better equipped to tackle the dynamic field of interactive TTS interfaces, which remains understudied yet increasingly pertinent to the everyday lives of users.
AI explanations have been increasingly used to help people better utilize AI recommendations in AI-assisted decision making. While numerous technical transparency approaches have been established, a human-centered perspective is needed to understand how human decision makers use and process AI explanations. In my thesis, I start with an empirical exploration of how AI explanations shape the way people understand and utilize AI decision aids. Next, I move to the time-evolving nature of AI explanations, exploring how explanation changes due to AI model updates affect human decision makers’ perception and usage of AI models. Lastly, I construct computational models of human behavior to gain a more quantitative understanding of human decision makers’ cognitive interactions with AI explanations. I conclude with future work on carefully identifying user needs for explainable AI in an era when AI models are becoming more complex and human-AI collaboration scenarios are increasingly diversified.
Technological progress has persistently shaped the dynamics of human-machine interactions in task execution. In response to the advancements in Generative AI, this paper outlines a detailed study plan that investigates various human-AI interaction modalities across a range of tasks, characterized by differing levels of creativity and complexity. This exploration aims to inform and contribute to the development of Graphical User Interfaces (GUIs) that effectively integrate with and enhance the capabilities of Generative AI systems. The study comprises three parts: exploring fixed-scope tasks through news headline generation, delving into atomic creative tasks with analogy generation, and investigating complex tasks via data visualization. Future work aims to extend this exploration to linearize complex data analysis results into narratives understandable to a broader audience, thereby enhancing the interpretability of AI-generated content.
Trust is an essential aspect of human-robot interaction (HRI) and plays an important role in decision-making. Currently, measuring trust in real time is challenging, especially in repeated interactions. Consequently, we see limited work on calibrating humans' trust in robots in HRI. In this work, we describe a mathematical model that emulates the three-layered (initial, situational, learned) framework of trust and can potentially estimate humans' trust in robots in real time. We evaluated the trust model in two different HRI user studies. Linear regression analysis of the results supported the validity of the model. We aim to design a robotic system that adapts to optimize humans' trust in robots, using our validated model's metrics.
Retirement is a goal for many individuals as they complete their careers and provides the opportunity to define a new lifestyle. Many retirees find that, with the freedom of retirement, they have time to pursue new interests and to apply their life skills in ways that can keep them active and socially engaged. In every community there are numerous non-profit organizations that rely on volunteers to carry out their mission. Our research investigates how retirees find volunteer opportunities that can provide them with the satisfaction of “giving back” and also enhance their well-being and resilience as they grow older.
This project aims to explore how data-driven methods can be used to enhance humans' problem-solving skills in spatial tasks and how they can be integrated into immersive tutoring systems. The goal is to provide an enhanced, state-dependent approach that is also applicable outside the laboratory, in real-life scenarios. Digital tutoring and learning systems are typically expert systems and follow a predefined training routine. Ongoing research concentrates on adaptive methods within affective- and cognition-aware systems, yet there is comparatively less investigation into recognizing and adapting problem-solving strategies. Spatial skills have been shown to be especially important for STEM (Science, Technology, Engineering, Mathematics) disciplines in general. Due to their design, immersive systems are particularly suitable for training spatial skills and are thus frequently utilized for this purpose.
With the advance of logging technologies, various data-driven systems that utilize vast amounts of data are being developed and used in our lives. By analyzing data and users' usage patterns, data-driven systems are designed so that a typical user can better use the system. However, some users perceive and use data-driven systems differently from the majority of users. This leads to digital inequalities that exclude non-typical user groups from fully utilizing data-driven systems. My doctoral research suggests that there are three parts of data-driven systems—data collection, model, and interface—in which diverse user groups need to be considered. Specifically, my prior work examines how (1) users with high privacy concerns during data collection [11], (2) users for whom the model performs poorly, and (3) users with different interface usage patterns use data-driven systems, and what kinds of difficulties they face [10], in contexts such as automatic personality assessment in workplaces, automatic speech recognition, and video platforms for learning. For future work, I will investigate how a data-driven system should be designed to enable a better user experience for various user groups by proposing a user-adaptive automatic speech recognition system.
Visual media without alternative text descriptions are inaccessible to blind and visually impaired people. While mainstream solutions often focus on achieving acceptable results based solely on the visual media themselves, they neglect the potential involvement of internet users in contributing to accessibility. With an explosion of data, diverse media formats, and constant updates, the responsibility of ensuring accessibility and equity cannot rest solely on the shoulders of a select few. The sheer volume of digital content demands a community effort in which each individual becomes an advocate for accessibility. With the central goal of effecting this community-level change, I design and develop intelligent user interfaces that engage internet users in making visual media accessible at three progressive levels: i) extracting visual information from human-authored texts in metadata to generate in-context descriptions; ii) nudging people to use visually descriptive language in online activities; iii) creating symbiotic information-seeking experiences that benefit both sighted and blind peers of the same community.
With the ongoing digitalization of complex systems, for example in manufacturing, domain experts' detailed understanding of datasets is pivotal to effectively training machine learning (ML) models. This understanding, obtained through their deep domain knowledge, enables domain experts to collaborate with method experts to identify deficiencies in datasets, such as biases or anomalies, and to curate them. Such curated datasets build the foundation for training effective ML models, which can inform subsequent decision-making processes. However, understanding increasingly large and complex datasets and the systems they represent is challenging. Therefore, this doctoral thesis investigates methods to support domain experts in building a solid data understanding for complex datasets. Specifically, the thesis focuses on three key areas: conceptualizing data understanding, augmenting domain knowledge through VIS4ML systems to curate datasets, and providing contextual information for AI-assisted decision-making. Initial findings indicate that VIS4ML systems effectively support domain experts in understanding and contextualizing datasets, enabling them to curate datasets collaboratively. This understanding, particularly when enriched with contextual information, shows promise in enhancing AI-assisted decision-making.
This tutorial engages researchers in a series of collaborative activities towards Enhanced Privacy and Integrity Considerations (EPIC) for human subjects research in the artificial intelligence (AI) field. The tutorial aims to identify common challenges to study integrity, convey best practices for protecting participants at the point of study design, and discuss how to best design tools to support robust, privacy-enhancing human subjects research in AI. In particular, the tutorial provides hands-on training on how to determine sample size and collect participant demographics in a way that prioritizes data integrity, participant privacy, and sample representativeness. Tutorial participants discuss and troubleshoot the unique challenges to and opportunities for designing robust and ethical human-centered AI research.