ThreadReconstructor: Modeling Reply-Chains to Untangle Conversational Text

Contributors:

Mennatallah El-Assady, Rita Sevastjanova, Daniel Keim, and Christopher Collins

We present ThreadReconstructor, a visual analytics approach for detecting and analyzing the implicit conversational structure of discussions, e.g., in political debates and forums. Our work is motivated by the need to reveal and understand single threads in massive online conversations and verbatim text transcripts. We combine supervised and unsupervised machine learning models to generate a basic structure that is enriched by user-defined queries and rule-based heuristics. Depending on the data and tasks, users can modify and create various reconstruction models that are presented and compared in the visualization interface. Our tool enables the exploration of the generated threaded structures and the analysis of the untangled reply-chains, comparing different models and their agreement. To understand the inner workings of the models, we visualize their decision spaces, including all considered candidate relations. In addition to a quantitative evaluation, we report qualitative feedback from an expert user study with four forum moderators and one machine learning expert, showing the effectiveness of our approach.
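
As a rough illustration of what comparing reconstruction models and their agreement could look like, the sketch below represents each model's output as a mapping from an utterance to its predicted parent. This is not the ThreadReconstructor implementation: the data layout, the function names `agreement` and `reply_chain`, and the example ids are all assumptions made for illustration.

```python
# Minimal sketch, NOT the authors' method. Each hypothetical "model" is a
# dict mapping an utterance id to the id it replies to (None = thread start).

def agreement(model_a, model_b):
    """Fraction of shared utterances for which both models pick the same parent."""
    shared = model_a.keys() & model_b.keys()
    if not shared:
        return 0.0
    same = sum(model_a[u] == model_b[u] for u in shared)
    return same / len(shared)


def reply_chain(model, utterance):
    """Follow parent links from an utterance back to its thread start."""
    chain = [utterance]
    seen = {utterance}
    while model.get(chain[-1]) is not None:
        parent = model[chain[-1]]
        if parent in seen:  # guard against accidental cycles
            break
        chain.append(parent)
        seen.add(parent)
    return chain[::-1]  # oldest utterance first


# Two hypothetical reconstructions of the same four-utterance discussion.
model_a = {1: None, 2: 1, 3: 1, 4: 3}
model_b = {1: None, 2: 1, 3: 2, 4: 3}

print(agreement(model_a, model_b))   # 0.75 (the models disagree on utterance 3)
print(reply_chain(model_a, 4))       # [1, 3, 4]
print(reply_chain(model_b, 4))       # [1, 2, 3, 4]
```

A per-utterance parent mapping keeps both operations simple: agreement is a pointwise comparison, and an untangled reply-chain is a walk along parent links.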

NEREx: Named-Entity Relationship Exploration in Conversations

Contributors:

Mennatallah El-Assady, Rita Sevastjanova, Bela Gipp, Daniel Keim, and Christopher Collins

We present NEREx, an interactive visual analytics approach for the exploratory analysis of verbatim conversational transcripts. By revealing different perspectives on multi-party conversations, NEREx gives an entry point for the analysis through high-level overviews and provides mechanisms to form and verify hypotheses through linked detail-views. Using a tailored named-entity extraction, we abstract important entities into ten categories and extract their relations with a distance-restricted entity-relationship model. This model complies with the often ungrammatical structure of verbatim transcripts, relating two entities if they are present in the same sentence within a small distance window. Our tool enables the exploratory analysis of multi-party conversations using several linked views that reveal thematic and temporal structures in the text. In addition to distant-reading, we integrated close-reading views for a text-level investigation process. Beyond the exploratory and temporal analysis of conversations, NEREx helps users generate and validate hypotheses and perform comparative analyses of multiple conversations. We demonstrate the applicability of our approach on real-world data from the 2016 U.S. Presidential Debates through a qualitative study with three domain experts from political science.
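
The distance-restricted relation described above (two entities are related if they occur in the same sentence within a small window) can be sketched as follows. This is a minimal sketch under stated assumptions, not the NEREx code: the function name `related_pairs`, the token-index representation, the window size of 5, and the category labels in the example are all hypothetical.

```python
from itertools import combinations

# Minimal sketch, NOT the NEREx implementation. Entities are assumed to be
# given per sentence as (token_index, text, category) tuples; the category
# names and the default window size are hypothetical.

def related_pairs(sentence_entities, max_distance=5):
    """Relate two entities of ONE sentence if they lie within max_distance tokens."""
    pairs = []
    ordered = sorted(sentence_entities)  # order by token position
    for (i, a, cat_a), (j, b, cat_b) in combinations(ordered, 2):
        distance = j - i
        if distance <= max_distance:
            pairs.append(((a, cat_a), (b, cat_b), distance))
    return pairs


# Hypothetical entities extracted from a single transcript sentence.
sentence = [
    (0, "Clinton", "Person"),
    (3, "taxes", "Economy"),
    (12, "Russia", "Location"),
]

# Only "Clinton" and "taxes" are close enough to be related with a window of 5.
print(related_pairs(sentence, max_distance=5))
```

Because the rule depends only on co-occurrence and token distance, it needs no syntactic parse, which is why such a model tolerates ungrammatical verbatim text.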

For a demo, please visit: http://visargue.inf.uni-konstanz.de/

ConToVi: Multi-Party Conversation Exploration using Topic-Space Views

Contributors:

Mennatallah El-Assady, Valentin Gold, Carmela Acevedo, Christopher Collins, and Daniel Keim

We introduce a novel visual analytics approach to analyze speaker behaviour patterns in multi-party conversations. We propose Topic-Space Views to track the movement of speakers across the thematic landscape of a conversation. Our tool is designed to assist political science scholars in exploring the dynamics of a conversation over time to generate and test hypotheses about speaker interactions and behaviour patterns. Moreover, we introduce a glyph-based representation for each speaker turn, based on linguistic and statistical cues, to abstract relevant text features. We present animated views for exploring the general behaviour and interactions of speakers over time and interactive steady visualizations for the detailed analysis of a selection of speakers. Using a visual sedimentation metaphor, we enable analysts to track subtle changes in the flow of a conversation over time while keeping an overview of all past speaker turns. We evaluate our approach on real-world datasets, and our domain experts found the results insightful.

For access to the tool, please take a look at the presentation slides or contact us via e-mail.

Presentation Slides (PDF)

DAViewer: Facilitating Discourse Analysis with Interactive Visualization

Contributors:

Jian Zhao, Fanny Chevalier, Christopher Collins, and Ravin Balakrishnan

A discourse parser is a natural language processing system that represents the organization of a document as a rhetorical structure tree, one of the key data structures enabling applications such as text summarization, question answering and dialogue generation. Computational linguistics researchers currently rely on manually exploring and comparing discourse structures to get intuitions for improving parsing algorithms. In this paper, we present DAViewer, an interactive visualization system that helps computational linguistics researchers explore, compare, evaluate and annotate the results of discourse parsers. DAViewer was developed through an iterative user-centred design process with domain experts. We report the results of an informal formative study of the system to better understand how the proposed visualization and interaction techniques are used in a real research environment.

Lattice Uncertainty Visualization: Understanding Machine Translation

Contributors:

Christopher Collins, Gerald Penn, and Sheelagh Carpendale

Lattice graphs are used as underlying data structures in many statistical processing systems, including natural language processing. Lattices compactly represent multiple possible outputs and are usually hidden from users. We present a novel visualization intended to reveal the uncertainty and variability inherent in statistically-derived outputs of language technologies. Applications such as machine translation and automated speech recognition typically present users with a best guess about the appropriate output, with apparent complete confidence.

Through case studies in cross-lingual instant messaging chat and speech recognition, we show how our visualization uses a hybrid layout along with varying transparency, colour, and size to reveal the various hypotheses considered by the algorithms and help people make better-informed decisions about statistically derived outputs.
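
To make concrete how a lattice compactly holds several hypotheses behind a single best guess, here is a minimal sketch. It is not the data structure or algorithm from the paper: the adjacency-list layout, the function `enumerate_paths`, and the words and probabilities in the example are invented for illustration.

```python
# Minimal sketch, NOT the paper's implementation. The lattice is assumed to be
# a DAG stored as {node: [(word, probability, next_node), ...]}; the example
# words and probabilities are made up.

def enumerate_paths(lattice, node, end):
    """Yield (words, probability) for every path from node to end."""
    if node == end:
        yield [], 1.0
        return
    for word, prob, next_node in lattice.get(node, []):
        for words, path_prob in enumerate_paths(lattice, next_node, end):
            yield [word] + words, prob * path_prob


# Two edges out of each node encode four hypotheses with only four edges.
lattice = {
    0: [("recognize", 0.6, 1), ("wreck a nice", 0.4, 1)],
    1: [("speech", 0.7, 2), ("beach", 0.3, 2)],
}

# Rank all hypotheses; a typical system would show only the first one.
hypotheses = sorted(enumerate_paths(lattice, 0, 2), key=lambda h: h[1], reverse=True)
for words, prob in hypotheses:
    print(f"{prob:.2f}  {' '.join(words)}")
```

The top-ranked path is what a user normally sees; the remaining paths are the alternatives that stay hidden unless the lattice itself is visualized.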

WordNet Visualization

Contributors:

Christopher Collins

Interface designs for lexical databases in NLP have suffered because they do not follow design principles developed in the information visualization research community. We present a design paradigm and show that it can be used to generate visualizations that maximize the usability and utility of WordNet. The techniques can be applied more generally to other lexical databases used in NLP research.
