Hierarchical Matrix for Visual Analysis of Cross-Linguistic Features

This paper presents H-Matrix, a visualization technique for cross-linguistic error analysis in large learner corpora. H-Matrix combines a matrix, which linguists commonly use to investigate cross-linguistic patterns, with a tree diagram that aggregates matrix rows and lets users interactively re-weight their importance to create custom investigative views. The technique helps experts perform data operations such as feature aggregation, filtering, ordering, and language comparison interactively, without having to reprocess the data. H-Matrix dynamically links the high-level multi-language overview to the extracted textual examples and to a reading view in which linguists can see the detected features in context to confirm and generate hypotheses.
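To make the idea of aggregating and re-weighting matrix rows through a tree more concrete, here is a minimal Python sketch. It is not taken from the H-Matrix source code: the `Node` class, the `aggregate` function, the weighted-sum aggregation rule, the feature names, and the language labels are all assumptions chosen for illustration, and the actual tool may combine rows differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical feature-tree node; leaves carry one matrix row."""
    name: str
    weight: float = 1.0                      # user-adjustable importance
    values: Optional[List[float]] = None     # leaf only: one value per language
    children: List["Node"] = field(default_factory=list)


def aggregate(node: Node) -> List[float]:
    """Return the row for a node: its own values if it is a leaf,
    otherwise the weighted sum of its children's rows (assumed rule)."""
    if not node.children:
        return list(node.values)
    rows = [(child.weight, aggregate(child)) for child in node.children]
    width = len(rows[0][1])
    return [sum(w * row[i] for w, row in rows) for i in range(width)]


# Made-up example data: columns are learner first languages.
languages = ["L1-A", "L1-B", "L1-C"]
tree = Node("errors", children=[
    Node("articles", children=[
        Node("missing article", values=[4.0, 1.0, 0.0]),
        Node("wrong article", values=[2.0, 3.0, 1.0]),
    ]),
    Node("prepositions", values=[1.0, 5.0, 2.0]),
])

# Aggregated overview row across all features.
overview = aggregate(tree)

# Re-weighting: a weight of 0 filters the "prepositions" branch out
# without touching the underlying leaf data.
tree.children[1].weight = 0.0
filtered = aggregate(tree)

# Ordering: rank languages by the re-weighted aggregate.
ranked = sorted(zip(languages, filtered), key=lambda p: p[1], reverse=True)
```

In this sketch, changing a weight only changes how rows are combined at query time, which is one way the aggregation, filtering, and ordering operations described above could run interactively without reprocessing the corpus.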

The source code for H-Matrix can be found on our GitHub.

Publications

  • M. Shimabukuro, J. Zipf, M. El-Assady, and C. Collins, “H-Matrix: Hierarchical Matrix for Visual Analysis of Cross-Linguistic Features in Large Learner Corpora,” in Proceedings of the IEEE Conference on Information Visualization (short papers), 2019.

    PDF

    @InProceedings{shi2019a,
    author = {Mariana Shimabukuro and Jessica Zipf and Mennatallah El-Assady and Christopher Collins},
    title = {H-Matrix: Hierarchical Matrix for Visual Analysis of Cross-Linguistic Features in Large Learner Corpora},
    booktitle = {Proceedings of the IEEE Conference on Information Visualization (short papers)},
    year = 2019
    }

Acknowledgements

The authors wish to thank the reviewers, our colleagues, and domain experts. This work was supported in part by NSERC Canada Research Chairs and a grant from SFB-TRR 161. This research has also been made possible by the Ontario Research Fund, funding research excellence.