Progressive Learning of Topic Modeling Parameters

Contributors:

Mennatallah El-Assady, Rita Sevastjanova, Fabian Sperrle, Daniel Keim, and Christopher Collins

Topic modelling algorithms are widely used to analyze the thematic composition of text corpora but remain difficult to interpret and adjust. Addressing these limitations, we present a modular visual analytics framework, tackling the understandability and adaptability of topic models through a user-driven reinforcement learning process that does not require a deep understanding of the underlying topic modelling algorithms. Given a document corpus, our approach initializes two algorithm configurations based on a parameter space analysis that enhances document separability. We abstract the model complexity in an interactive visual workspace for exploring the automatic matching results of two models, investigating topic summaries, analyzing parameter distributions, and reviewing documents. The main contribution of our work is an iterative decision-making technique in which users provide document-based relevance feedback that allows the framework to converge to a user-endorsed topic distribution. We also report feedback from a two-stage study which shows that our technique results in topic model quality improvements on two independent measures.
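The abstract mentions exploring the automatic matching results of two models but does not say how topics are matched. As a minimal sketch only, assuming each topic is summarized by a handful of top keywords and that keyword overlap (Jaccard similarity) is an acceptable stand-in for the framework's actual matching method, the pairing could look like this. The function names, the threshold, and the sample keywords are all hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity between two keyword collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_topics(model_a, model_b, threshold=0.2):
    """Greedily pair topics from two models by descending keyword overlap.

    model_a, model_b: dicts mapping a topic id to its top keywords.
    Returns a list of (topic_a, topic_b, similarity) tuples; topics whose
    best remaining overlap is below `threshold` stay unmatched.
    """
    scored = sorted(
        ((jaccard(model_a[i], model_b[j]), i, j)
         for i, j in product(model_a, model_b)),
        key=lambda t: t[0],
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # all remaining pairs overlap even less
        if i in used_a or j in used_b:
            continue  # each topic is matched at most once
        used_a.add(i)
        used_b.add(j)
        matches.append((i, j, score))
    return matches


# Hypothetical top keywords from two differently configured models.
model_a = {
    "a0": ["tax", "budget", "economy", "jobs"],
    "a1": ["health", "insurance", "care", "doctor"],
}
model_b = {
    "b0": ["care", "health", "hospital", "insurance"],
    "b1": ["economy", "tax", "growth", "jobs"],
    "b2": ["climate", "energy", "carbon", "emissions"],
}

matches = match_topics(model_a, model_b)
for topic_a, topic_b, score in matches:
    print(topic_a, topic_b, round(score, 2))
```

In this toy input, topic `b2` shares no keywords with either topic of the first model, so it stays unmatched. In an interactive workspace, unmatched topics like this are natural candidates for user review.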

This research was given a Best VAST Paper Honorable Mention Award at VAST 2017.

To apply our technique to your own data or try out a demo, please visit http://visargue.dbvis.de/ (individual accounts are created upon request).

Demo Video

Talk from IEEE VAST 2017

Detecting Negative Emotion for Mixed Initiative Visual Analytics

Contributors:

Prateek Panwar and Christopher Collins

This work describes an efficient model for detecting negative mind states caused by visual analytics tasks. We developed a method for collecting data from multiple sensors, including galvanic skin response (GSR) and eye-tracking, and for quickly generating labelled training data for the machine learning model. Using this method, we created a dataset from 28 participants carrying out intentionally difficult visualization tasks. We conclude by discussing the best-performing model, Random Forest, and its future applications for providing just-in-time assistance in visual analytics.

Design and Evaluation of Visualization Techniques to Facilitate Argument Exploration

Contributors:

D. Khartabil, C. Collins, S. Wells, B. Bach, and J. Kennedy

Abstract

This paper reports the design and comparison of three visualizations to represent the structure and content within arguments. Arguments are artifacts of reasoning widely used across domains such as education, policy making, and science. An argument is made up of sequences of statements (premises) which can support or contradict each other, individually or in groups through Boolean operators. Understanding the resulting hierarchical structure of arguments while being able to read the arguments’ text poses problems related to overview, detail, and navigation. Based on interviews with argument analysts we iteratively designed three techniques, each using combinations of tree visualizations (sunburst, icicle), content display (in-situ, tooltip) and interactive navigation. Structured discussions with the analysts show benefits of each of these techniques; for example, the sunburst is good at presenting an overview, but showing arguments in-situ is better than pop-ups. A controlled user study with 21 participants and three tasks shows complementary evidence suggesting that a sunburst with pop-up for the content is the best trade-off solution. Our results can inform visualizations within existing argument

EuroVis 2022 Talk

DocuBurst Website Now Live!

Try DocuBurst, an online document visualization tool for uploading your own text documents, generating interactive visual summaries, and exploring keywords to uncover document themes or topics…

PivotSlice

Many datasets, such as scientific literature collections, contain multiple heterogeneous facets which derive implicit relations, as well as explicit relational references between data items. The exploration of this data is challenging not only because of large data scales but also the complexity of resource structures and semantics. In this paper, we present PivotSlice, an interactive visualization technique that provides efficient faceted browsing as well as flexible capabilities to discover data relationships. With the metaphor of direct manipulation, PivotSlice allows the user to visually and logically construct a series of dynamic queries over the data, based on a multi-focus and multi-scale tabular view that subdivides the entire dataset into several meaningful parts with customized semantics. PivotSlice further facilitates the visual exploration and sensemaking process through features including live search and integration of online data, graphical interaction histories and smoothly animated visual state transitions. We evaluated PivotSlice through a qualitative lab study with university researchers and report the findings from our observations and interviews. We also demonstrate the effectiveness of PivotSlice using a scenario of exploring a repository of information visualization literature.
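The tabular view described above subdivides a dataset by combining queries along two axes. The following is a minimal sketch of that idea, not PivotSlice's implementation: it assumes each row and column carries a filter predicate and that an item appears in every cell whose two filters it satisfies. The function `slice_into_grid`, the record fields, and the sample papers are hypothetical.

```python
def slice_into_grid(items, row_filters, col_filters):
    """Place each item into every (row, column) cell whose two filters it passes.

    row_filters, col_filters: dicts mapping a label to a predicate function.
    Returns a dict keyed by (row_label, col_label) with lists of items.
    """
    grid = {(r, c): [] for r in row_filters for c in col_filters}
    for item in items:
        for r, row_pred in row_filters.items():
            if not row_pred(item):
                continue
            for c, col_pred in col_filters.items():
                if col_pred(item):
                    grid[(r, c)].append(item)
    return grid


# Hypothetical publication records with heterogeneous facets.
papers = [
    {"title": "P1", "year": 2008, "venue": "InfoVis", "keywords": {"graph"}},
    {"title": "P2", "year": 2012, "venue": "InfoVis", "keywords": {"text"}},
    {"title": "P3", "year": 2012, "venue": "CHI", "keywords": {"text", "graph"}},
]

# Each axis is a set of user-defined queries over one or more facets.
row_filters = {
    "before 2010": lambda p: p["year"] < 2010,
    "2010 onward": lambda p: p["year"] >= 2010,
}
col_filters = {
    "graph": lambda p: "graph" in p["keywords"],
    "text": lambda p: "text" in p["keywords"],
}

grid = slice_into_grid(papers, row_filters, col_filters)
for cell, members in grid.items():
    print(cell, [p["title"] for p in members])
```

Because the filters are plain predicates, a cell can hold overlapping subsets: here `P3` appears in both columns of the "2010 onward" row. Adding or editing a filter only requires recomputing the grid.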

Check out our GitHub repository for source code related to this project.

Media

Presentation Slides

EduApps: Helping Non-Native English Speakers with Language Structure

First language (L1) influence errors are very frequent in English learners (L2), even more so when the learner’s proficiency level is higher (upper-intermediate/advanced). Our project aims to analyze errors made by learners from specific L1s using learner corpora. Based on the analysis, we want to focus on a specific type of error and research a way to identify it automatically in learners’ essays depending on their L1. This would allow us to implement an application that helps English as a Second Language (ESL) students to identify and analyze their errors and to better understand the reasoning behind them, consequently improving the students’ English level.

About the EduApps initiative

EduApps is a suite of apps housed in an online environment that focuses on the health, well-being and development of one’s mind, body and community. Our research project, titled “There’s an App for That,” is investigating the design process, development, implementation and evaluation of this suite of educational apps. Specifically, we are interested in helping students build confidence and competence in the cognitive, socio-emotional and physical domains. We are also interested in the impact a learning portal can have on students’ learning, teachers and the surrounding community. We hope that our research can build capacity for investigating and affecting innovation in formal and informal education settings in the use of digital technology. We have partnered with school boards and community organizations to develop and research the apps. More about each of the domains, including their purpose, apps and related research, can be found at http://eduapps.ca/.

ThreadReconstructor: Modeling Reply-Chains to Untangle Conversational Text

Contributors:

Mennatallah El-Assady, Rita Sevastjanova, Daniel Keim, and Christopher Collins

We present ThreadReconstructor, a visual analytics approach for detecting and analyzing the implicit conversational structure of discussions, e.g., in political debates and forums. Our work is motivated by the need to reveal and understand single threads in massive online conversations and verbatim text transcripts. We combine supervised and unsupervised machine learning models to generate a basic structure that is enriched by user-defined queries and rule-based heuristics. Depending on the data and tasks, users can modify and create various reconstruction models that are presented and compared in the visualization interface. Our tool enables the exploration of the generated threaded structures and the analysis of the untangled reply-chains, comparing different models and their agreement. To understand the inner workings of the models, we visualize their decision spaces, including all considered candidate relations. In addition to a quantitative evaluation, we report qualitative feedback from an expert user study with four forum moderators and one machine learning expert, showing the effectiveness of our approach.
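The abstract does not spell out which rule-based heuristics are used. As an illustrative sketch only, one plausible rule is to treat a mention of an earlier speaker's name as a reply to that speaker's latest utterance and to fall back to the preceding utterance otherwise. The functions `reconstruct_replies` and `reply_chain` and the sample transcript are hypothetical and are not taken from ThreadReconstructor.

```python
def reconstruct_replies(utterances):
    """Assign each utterance a parent index using a simple name-mention rule.

    utterances: list of (speaker, text) tuples in chronological order.
    Rule: reply to the most recent earlier utterance by a different speaker
    whose name is mentioned in the text; otherwise reply to the previous
    utterance. The first utterance has no parent (None).
    """
    parents = []
    last_by_speaker = {}  # speaker name -> index of their latest utterance
    for idx, (speaker, text) in enumerate(utterances):
        lowered = text.lower()
        mentioned = [
            pos for name, pos in last_by_speaker.items()
            if name != speaker and name.lower() in lowered
        ]
        if mentioned:
            parents.append(max(mentioned))
        elif idx > 0:
            parents.append(idx - 1)
        else:
            parents.append(None)
        last_by_speaker[speaker] = idx
    return parents


def reply_chain(parents, idx):
    """Follow parent links from utterance `idx` back to the thread root."""
    chain = [idx]
    while parents[chain[-1]] is not None:
        chain.append(parents[chain[-1]])
    return chain[::-1]


# Hypothetical debate transcript.
debate = [
    ("Alice", "We should lower the tax rate."),
    ("Bob", "What about the energy bill?"),
    ("Carol", "I disagree with Alice, the rate is already low."),
    ("Bob", "Carol has a point."),
]

parents = reconstruct_replies(debate)
print(parents)
print(reply_chain(parents, 3))
```

A rule like this is cheap and easy to inspect, but it relies on naive substring matching and will miss replies that never name the addressee. That limitation is one reason to combine such rules with learned models, as the abstract describes.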

Textension: Digitally Augmenting Document Spaces in Analog Texts

Contributors:

Adam James Bradley, Christopher Collins, Victor Sawal, and Sheelagh Carpendale

In this paper, we present a framework that allows people who work with analog texts to leverage the affordances of digital technology, such as data visualization, computational linguistics, and search, using any web-based mobile device with a camera. After taking a picture of a particular page or set of pages from a text or uploading an existing image, our prototype system builds an interactive digital object that automatically inserts visualizations and interactive elements into the document. Leveraging the findings of previous studies, our framework augments the reading of analog texts with digital tools, making it possible to work with texts in both a digital and analog environment.

Check out our online demo.

Acknowledgements

This work was supported by NSERC Canada Research Chairs, The Canada Foundation for Innovation – Cyberinfrastructure Fund, and the Province of Ontario – Ontario Research Fund.