Ranking Companion: A Visual Analytics Approach to Item-Based Ranking with Hybrid Item Selection

Aman Kumar; Ibrahim Al-Hazwani; Jürgen Bernard; Maximilian Tornow; Michaela Benk

arxiv: 2606.23263 · v1 · pith:BXOCJ6G5new · submitted 2026-06-22 · 💻 cs.HC · cs.IR


Pith reviewed 2026-06-26 07:21 UTC · model grok-4.3

classification 💻 cs.HC cs.IR

keywords item-based ranking, visual analytics, active learning, preference elicitation, hybrid item selection, user study, ranking explanations, personalized ranking


The pith

A visual analytics system combines six item-selection methods with active learning so users can build personalized rankings by judging known items.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Personalized ranking is hard when users do not know data attributes or how to score them. Ranking Companion lets people externalize preferences by selecting candidate items through a mix of model-driven suggestions and human-driven techniques. An iterative machine-learning step then computes a full ranking and shows it with explanations. In a study, ten participants tried each method over three rounds and reported different strengths in accuracy, diversity, novelty, transparency, control, and satisfaction. The work therefore argues that no single selection method suffices and that a hybrid space gives users more flexible control over the ranking process.

Core claim

By integrating model-driven active learning with six complementary human-driven item-selection methods inside one interactive space, users can externalize listwise preferences through judgments on selected candidates; an iterative ranking model then produces results accompanied by explanations, and a formative study with ten participants across three iterations reveals tradeoffs among perceived ranking quality dimensions.
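
The loop described above (judge a few selected items, refit a ranking model, let the model suggest what to judge next) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the linear scorer, the pairwise perceptron-style update, the "closest neighbour score" suggestion rule, and all names (`fit_listwise`, `suggest_ambiguous`, the toy item features) are hypothetical stand-ins chosen to keep the example dependency-free.

```python
def score(weights, features):
    """Linear score for one item (a stand-in for the ranking model)."""
    return sum(w * x for w, x in zip(weights, features))


def fit_listwise(items, ordered_ids, weights, lr=0.1, epochs=50):
    """Fit weights to one listwise statement, given best-first.

    The list is decomposed into (better, worse) pairs and fitted with
    perceptron-style margin updates; an illustrative simplification.
    """
    weights = list(weights)
    for _ in range(epochs):
        for i, better in enumerate(ordered_ids):
            for worse in ordered_ids[i + 1:]:
                margin = score(weights, items[better]) - score(weights, items[worse])
                if margin < 1.0:  # pair is violated or too close
                    for k in range(len(weights)):
                        weights[k] += lr * (items[better][k] - items[worse][k])
    return weights


def rank(items, weights):
    """Return all item ids sorted by model score, best first."""
    return sorted(items, key=lambda i: score(weights, items[i]), reverse=True)


def suggest_ambiguous(items, weights, judged, k=1):
    """Model-driven selection (assumed heuristic): unjudged items whose
    nearest neighbour in the current ranking has the closest score."""
    ranking = rank(items, weights)
    scores = [score(weights, items[i]) for i in ranking]
    gaps = {}
    for pos, item_id in enumerate(ranking):
        if item_id in judged:
            continue
        neighbours = [abs(scores[pos] - scores[q])
                      for q in (pos - 1, pos + 1) if 0 <= q < len(ranking)]
        gaps[item_id] = min(neighbours, default=0.0)
    return sorted(gaps, key=gaps.get)[:k]


# Toy items with two made-up features; the user states a > b > c.
items = {"a": (1.0, 0.0), "b": (0.5, 0.5), "c": (0.0, 1.0),
         "d": (0.9, 0.1), "e": (0.2, 0.8)}
weights = fit_listwise(items, ["a", "b", "c"], [0.0, 0.0])
print(rank(items, weights))
print(suggest_ambiguous(items, weights, judged={"a", "b", "c"}))
```

In this toy run the unjudged item `d` lands between `a` and `b` and is suggested next because it sits closest to an already placed neighbour. A human-driven method would simply replace `suggest_ambiguous` with the user's own pick, which is the sense in which the selection space is hybrid.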

What carries the argument

The hybrid item-selection space that unifies six complementary methods with active-learning updates to a ranking model.

If this is right

Users can express ranking preferences without first learning which attributes exist or how to weight them.
Different selection methods shift the balance among accuracy, diversity, novelty, transparency, control, and satisfaction in the produced rankings.
Explanations attached to each ranking let users inspect and refine the model output over successive iterations.
A single unified interface supports switching among methods so that users can adapt their strategy as the ranking evolves.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same hybrid selection pattern could be tested in other preference-elicitation tasks such as playlist creation or product recommendation.
Larger-scale studies could check whether the observed tradeoffs persist when users work with bigger item sets or over longer sessions.
Designers might add automatic method-switching rules that respond to early user feedback on the quality dimensions.

Load-bearing premise

That the hybrid use of six different item-selection methods will produce observable and meaningful differences in how users rate the resulting rankings on accuracy, diversity, novelty, transparency, control, and satisfaction.

What would settle it

A controlled study in which participants report no measurable differences across the six methods on any of the six quality dimensions after repeated iterations would falsify the claim of useful tradeoffs.
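
One conventional way to check for "no measurable differences" in repeated ratings is a Friedman rank test per quality dimension. The sketch below is an assumption about how such a check could look, not an analysis the paper reports: it computes the uncorrected Friedman statistic (no tie correction) in plain Python, and the two tiny rating tables are synthetic placeholders, not study data.

```python
def average_ranks(row):
    """Rank one participant's ratings (1 = lowest), averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank for sorted positions i..j
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(ratings):
    """Uncorrected Friedman statistic.

    ratings[p][m] is participant p's rating of method m on one dimension.
    A value near 0 means the methods are ranked alike by the participants.
    """
    n, k = len(ratings), len(ratings[0])
    rank_sums = [0.0] * k
    for row in ratings:
        for m, r in enumerate(average_ranks(row)):
            rank_sums[m] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Synthetic placeholders: two participants, three methods.
no_difference = [[4, 4, 4], [2, 2, 2]]      # every method rated the same
consistent_order = [[1, 2, 3], [1, 2, 3]]   # same ordering for everyone
print(friedman_statistic(no_difference))
print(friedman_statistic(consistent_order))
```

The first table yields a statistic of zero, the outcome that would count against useful tradeoffs if it held across all six dimensions; the second reaches the maximum for this table size, n(k - 1). A real analysis would add a tie correction and compare the statistic against a chi-square distribution with k - 1 degrees of freedom.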

Figures

Figures reproduced from arXiv: 2606.23263 by Aman Kumar, Ibrahim Al-Hazwani, Jürgen Bernard, Maximilian Tornow, Michaela Benk.

**Figure 1.** The Ranking Companion interface implements an iterative three-step ranking creation workflow: select (artist) items with different methods (left), …

**Figure 2.** Preference-input mechanics. (A) number of items per statement, (B) …

**Figure 3.** Designs for item-selection methods in Ranking Companion: row-based suggestion lists for ISM1–ISM3, projection-based selection for ISM4, and …

**Figure 4.** Box plots of user ratings across six perceived ranking-quality …

Original abstract

Personalizing item ranking creation is a challenging task, especially when users lack knowledge of data attributes or the ability to express and formalize their attribute preferences. Item-based ranking creation is an approach allowing users to directly externalize preferences through known-item judgments rather than attribute-based scoring. However, a core challenge of item-based ranking is identifying and selecting representative candidate items for externalizing preferences. Existing approaches rely on singular item-selection methods, limiting flexibility and user control. To address this challenge, we present Ranking Companion, a visual analytics approach for item-based ranking that combines model-driven active learning with human-driven item-selection methods. By drawing from six complementary item-selection methods, users can externalize listwise preferences based on selected candidate items, while an iterative machine learning process with a ranking model calculates ranking results, presented to users alongside explanations for interpretation. We evaluated Ranking Companion in a formative user study with 10 participants, in which participants used each item-selection method across three iterations, revealing tradeoffs in perceived ranking quality across accuracy, diversity, novelty, transparency, control, and satisfaction. Ranking Companion contributes a unified interactive item selection space and provides preliminary empirical guidance toward the hybrid use of multiple complementary item-selection methods in personalized item-based ranking creation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Hybrid item selection across six methods is a clean design idea, but the N=10 formative study gives almost no quantitative backing for the claimed tradeoffs.


The paper's main contribution is a single visual analytics interface that lets users switch among six item-selection methods (mixing active learning with human-driven ones) while building an item-based ranking. That unified space is new enough to stand on its own.

The system description is clear on how the methods feed an iterative ranking model and how explanations are shown to the user. For an HCI audience working on preference elicitation, this is a practical way to increase flexibility when users cannot articulate attribute weights.

The evaluation is the clear weak point. The abstract and stress-test note describe a study with 10 participants across three iterations that is said to surface tradeoffs in accuracy, diversity, novelty, transparency, control, and satisfaction. No metrics, statistical tests, effect sizes, or error analysis are mentioned. A small exploratory study can generate design insights, but it cannot reliably demonstrate differences between methods; order effects or individual variance could account for whatever was observed. The paper would be stronger if it reported even basic quantitative results or a clearer protocol.

This is aimed at visual analytics and HCI researchers who build ranking tools. Readers looking for a concrete hybrid-selection prototype will get value from the design section. It is coherent on its own terms and shows honest engagement with the problem, so it deserves referee time rather than a desk reject. The authors should be asked to expand the evaluation with numbers and to clarify how the six methods were implemented.

I would send it to review with the expectation of major revisions on the empirical side.

Referee Report

2 major / 1 minor

Summary. The paper presents Ranking Companion, a visual analytics system for item-based ranking creation. It integrates model-driven active learning with six complementary human-driven item-selection methods, allowing users to externalize listwise preferences via candidate items. An iterative ML ranking model then computes results, displayed with explanations. A formative user study with 10 participants (each using the methods across three iterations) is reported to reveal tradeoffs in perceived ranking quality across accuracy, diversity, novelty, transparency, control, and satisfaction. The work contributes a unified interactive selection space and preliminary empirical guidance on hybrid method use.

Significance. If the claimed tradeoffs hold under more rigorous evaluation, the hybrid approach would meaningfully advance visual analytics for personalized ranking by overcoming limitations of single-method item selection and supporting users who cannot easily articulate attribute preferences. The integration of active learning with human-driven strategies, combined with explanatory feedback, addresses a practical HCI challenge in recommendation interfaces. The preliminary guidance on method tradeoffs could inform future system designs, though the small-scale formative study currently constrains the strength of this contribution.

major comments (2)

[Evaluation / formative user study description] The central empirical claim—that the hybrid combination of six methods produces observable and meaningful tradeoffs across six quality dimensions—rests on a single formative user study with N=10. The provided description supplies no quantitative metrics, statistical tests, effect sizes, or controls for order effects/individual variance, which is load-bearing for the claim that the hybrid mechanism itself drives the differences (as opposed to demand characteristics or participant variability).
[System overview and abstract] Implementation details of the six item-selection methods and their integration into the visual analytics interface (including how model-driven active learning interacts with the human-driven approaches) are absent from the abstract and high-level evaluation summary. This prevents assessment of whether the hybrid design is reproducible or genuinely novel relative to prior singular-method baselines.
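
On the order-effects point in the first major comment, a common control is to counterbalance the sequence in which each participant meets the methods. The sketch below shows one simple scheme, a cyclic Latin square, as an illustration only: the paper's protocol is not described here, the labels ISM5 and ISM6 are assumed by extension of the ISM1–ISM4 labels in the Figure 3 caption, and `rotated_orders` is a hypothetical helper.

```python
def rotated_orders(methods, n_participants):
    """Cyclic Latin-square rows: participant p starts at method p mod k
    and proceeds through the remaining methods in fixed rotation."""
    k = len(methods)
    return [[methods[(p + j) % k] for j in range(k)]
            for p in range(n_participants)]


methods = [f"ISM{i}" for i in range(1, 7)]  # six method labels (assumed naming)
orders = rotated_orders(methods, 10)        # ten participants, as in the study

for p, order in enumerate(orders, start=1):
    print(f"P{p}: {' '.join(order)}")
```

Within each block of six participants every method appears once in every position, which balances position effects. Rotation does not balance carryover between adjacent methods, and with ten participants the last four rows repeat the first four, so some positions are covered unevenly.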

minor comments (1)

[Abstract] The abstract would be clearer if it briefly named or categorized the six complementary item-selection methods rather than referring to them generically.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and indicate where revisions will be made to improve clarity and contextualization.


Referee: [Evaluation / formative user study description] The central empirical claim—that the hybrid combination of six methods produces observable and meaningful tradeoffs across six quality dimensions—rests on a single formative user study with N=10. The provided description supplies no quantitative metrics, statistical tests, effect sizes, or controls for order effects/individual variance, which is load-bearing for the claim that the hybrid mechanism itself drives the differences (as opposed to demand characteristics or participant variability).

Authors: We agree the study is formative and exploratory. Its purpose was to surface qualitative patterns in perceived tradeoffs (accuracy, diversity, etc.) rather than to support statistical inference. No quantitative metrics or tests were collected because the protocol was not designed for hypothesis testing or effect-size estimation. We will revise the evaluation section to explicitly label the work as formative, state the absence of order-effect controls and statistical analysis, and frame the tradeoffs as preliminary observations that motivate future controlled studies. revision: yes

Referee: [System overview and abstract] Implementation details of the six item-selection methods and their integration into the visual analytics interface (including how model-driven active learning interacts with the human-driven approaches) are absent from the abstract and high-level evaluation summary. This prevents assessment of whether the hybrid design is reproducible or genuinely novel relative to prior singular-method baselines.

Authors: The abstract follows standard length constraints and therefore remains high-level; the six methods, their algorithmic realizations, and the precise interaction protocol with the active-learning ranking model are described in the System and Implementation sections of the manuscript. To address the concern about immediate assessability, we will expand the abstract with one sentence naming the six complementary strategies and will add a short cross-reference in the evaluation summary that points readers to the integration details. These changes improve accessibility while preserving the manuscript’s existing technical depth. revision: yes

Circularity Check

0 steps flagged

No circularity: design paper with external user study evaluation


The paper presents a visual analytics system for item-based ranking using hybrid selection methods and evaluates it via a formative user study (N=10). No equations, derivations, fitted parameters, or predictions appear in the provided text. The central claim rests on the described system design and study observations rather than any self-referential reduction, self-citation chain, or renaming of known results. No load-bearing steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a systems and HCI design paper describing a prototype and formative study; it introduces no free parameters, mathematical axioms, or invented entities.


[1] [1]

Bradley Knox, and Todd Kulesza

S. Amershi, M. Cakmak, W. B. Knox, and T. Kulesza, “Power to the People: The Role of Humans in Interactive Machine Learning,”AI Magazine, vol. 35, no. 4, pp. 105–120, Dec. 2014. [Online]. Available: https://doi.org/10.1609/aimag.v35i4.2513

work page doi:10.1609/aimag.v35i4.2513 2014

[2] [2]

A Review of User Interface Design for Interactive Machine Learning,

J. J. Dudley and P. O. Kristensson, “A Review of User Interface Design for Interactive Machine Learning,”ACM Transactions on Interactive Intelligent Systems, vol. 8, no. 2, pp. 1–37, Jun. 2018. [Online]. Available: https://doi.org/10.1145/3185517

work page doi:10.1145/3185517 2018

[3] [3]

The human is the loop: new directions for visual analytics,

A. Endert, M. S. Hossain, N. Ramakrishnan, C. North, P. Fiaux, and C. Andrews, “The human is the loop: new directions for visual analytics,”Intelligent Information Systems, vol. 43, no. 3, pp. 411–435, Dec. 2014. [Online]. Available: https://doi.org/10.1007/ s10844-014-0304-9

2014

[4] [5]

WeightLifter: Visual weight space exploration for multi-criteria decision making,

S. Pajer, M. Streit, T. Torsney-Weir, F. Spechtenhauser, T. M ¨oller, and H. Piringer, “WeightLifter: Visual weight space exploration for multi-criteria decision making,”IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 611–620, 2017. [Online]. Available: https://doi.org/10.1109/TVCG.2016.2598589

work page doi:10.1109/tvcg.2016.2598589 2017

[5] [6]

LineUp: Visual Analysis of Multi-Attribute Rankings,

S. Gratzl, A. Lex, N. Gehlenborg, H. Pfister, and M. Streit, “LineUp: Visual Analysis of Multi-Attribute Rankings,”IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2277–2286, Dec. 2013. [Online]. Available: https://doi.org/10.1109/TVCG.2013.173

work page doi:10.1109/tvcg.2013.173 2013

[6] [8]

PA VED: pareto front visualization for engineering design,

L. Cibulski, H. Mitterhofer, T. May, and J. Kohlhammer, “PA VED: pareto front visualization for engineering design,”Computer Graphics Forum (CGF), vol. 39, no. 3, pp. 405–416, 2020. [Online]. Available: https://doi.org/10.1111/cgf.13990

work page doi:10.1111/cgf.13990 2020

[7] [9]

How applicable are attribute-based approaches for human- centered ranking creation?

C.-M. Barth, J. Schmid, I. Al-Hazwani, M. Sachdeva, L. Cibulski, and J. Bernard, “How applicable are attribute-based approaches for human- centered ranking creation?”Computers & Graphics, vol. 114, pp. 45–58,

[8] [10]

Available: https://doi.org/10.1016/j.cag.2023.05.004

[Online]. Available: https://doi.org/10.1016/j.cag.2023.05.004

work page doi:10.1016/j.cag.2023.05.004 2023

[9] [11]

Here or There: Preference Judgments for Relevance,

B. Carterette, P. Bennett, M. Chickering, and S. Dumais, “Here or There: Preference Judgments for Relevance,” inProceedings of ECIR. Springer, Jan. 2008. [On- line]. Available: https://www.microsoft.com/en-us/research/publication/ here-or-there-preference-judgments-for-relevance/

2008

[10] [12]

Learning to Rank for Information Retrieval,

T.-Y . Liu, “Learning to Rank for Information Retrieval,”Foundations and Trends® in Information Retrieval, vol. 3, no. 3, pp. 225–331,

[11] [13]

doi:10.1561/1500000016 , url =

[Online]. Available: https://doi.org/10.1561/1500000016

work page doi:10.1561/1500000016

[12] [14]

ISBN 978-1-60558-205-4

F. Xia, T.-Y . Liu, J. Wang, W. Zhang, and H. Li, “Listwise approach to learning to rank: theory and algorithm,” inInternational conference on Machine learning - ICML. ACM Press, 2008, pp. 1192–1199. [Online]. Available: https://doi.org/10.1145/1390156.1390306

work page doi:10.1145/1390156.1390306 2008

[13] [15]

Podium: Ranking Data Using Mixed-Initiative Visual Analytics,

E. Wall, S. Das, R. Chawla, B. Kalidindi, E. T. Brown, and A. Endert, “Podium: Ranking Data Using Mixed-Initiative Visual Analytics,”IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 1, pp. 288–297, Jan. 2018. [Online]. Available: https://doi.org/10.1109/TVCG.2017.2745078

work page doi:10.1109/tvcg.2017.2745078 2018

[14] [16]

Preference- driven interactive ranking system for personalized decision support,

C. Kuhlman, S. Yang, X. Sun, and E. A. Rundensteiner, “Preference- driven interactive ranking system for personalized decision support,” inACM International Conference on Information and Knowledge Management. ACM, 2018, pp. 1931–1934. [Online]. Available: https://doi.org/10.1145/3269206.3269227

work page doi:10.1145/3269206.3269227 2018

[15] [17]

Active Learning Literature Survey,

B. Settles, “Active Learning Literature Survey,” University of Wiscon- sin–Madison, Computer Sciences Technical Report 1648, 2009

2009

[16] [18]

Y. Fu, X. Zhu, and B. Li, “A survey on instance selection for active learning,” Knowledge and Information Systems, vol. 35, no. 2, pp. 249–283, May 2013. [Online]. Available: http://link.springer.com/10.1007/s10115-012-0507-8

[17] [19]

Last.fm Ltd., “About Last.fm,” Last.fm. [Online]. Available: https://www.last.fm/about

[18] [20]

D. Bogdanov and P. Herrera, “How much metadata do we need in music recommendation? A subjective evaluation using preference sets,” in International Society for Music Information Retrieval Conference (ISMIR). University of Miami, 2011, pp. 97–102. [Online]. Available: http://ismir2011.ismir.net/papers/PS1-10.pdf

[19] [21]

C. Seifert and M. Granitzer, “User-based active learning,” in IEEE Conference on Data Mining Workshops (ICDMW). IEEE, 2010, pp. 418–425. [Online]. Available: https://doi.org/10.1109/ICDMW.2010.181

[20] [22]

B. Höferlin, R. Netzel, M. Höferlin, D. Weiskopf, and G. Heidemann, “Inter-active learning of ad-hoc classifiers for video visual analytics,” in IEEE Visual Analytics Science and Technology (VAST). IEEE, 2012, pp. 23–32. [Online]. Available: https://doi.org/10.1109/VAST.2012.6400492

[21] [23]

J. Bernard, M. Zeppelzauer, M. Sedlmair, and W. Aigner, “VIAL: a unified process for visual interactive labeling,” The Visual Computer, vol. 34, no. 9, pp. 1189–1207, Sep. 2018. [Online]. Available: https://doi.org/10.1007/s00371-018-1500-3

[22] [24]

J. Bernard, M. Hutter, M. Sedlmair, M. Zeppelzauer, and T. Munzner, “A Taxonomy of Property Measures to Unify Active Learning and Human-centered Approaches to Data Labeling,” ACM Transactions on Interactive Intelligent Systems, vol. 11, no. 3-4, pp. 1–42, Dec. 2021. [Online]. Available: https://doi.org/10.1145/3439333

[23] [25]

J. Bernard, M. Zeppelzauer, M. Lehmann, M. Müller, and M. Sedlmair, “Towards User-Centered Active Learning Algorithms,” Computer Graphics Forum, vol. 37, no. 3, pp. 121–132, Jun. 2018. [Online]. Available: https://doi.org/10.1111/cgf.13406

[24] [26]

J. Bernard, M. Hutter, M. Zeppelzauer, D. Fellner, and M. Sedlmair, “Comparing visual-interactive labeling with active learning: An experimental study,” IEEE Transactions on Visualization and Computer Graphics (TVCG), vol. 24, no. 1, pp. 298–308, 2018. [Online]. Available: https://doi.org/10.1109/TVCG.2017.2744818

[25] [27]

M. Chegini, J. Bernard, J. Cui, F. Chegini, A. Sourin, K. Andrews, and T. Schreck, “Interactive visual labelling versus active learning: An experimental comparison,” Frontiers of Information Technology & Electronic Engineering (FITEE), vol. 21, no. 4, pp. 524–535, 2020. [Online]. Available: https://doi.org/10.1631/FITEE.1900549

[26] [28]

C. Ritter, C. Altenhofen, M. Zeppelzauer, A. Kuijper, T. Schreck, and J. Bernard, “Personalized Visual-Interactive Music Classification,” EuroVis Workshop on Visual Analytics (EuroVA), p. 5 pages, 2018. [Online]. Available: https://doi.org/10.2312/EUROVA.20181109

[27] [29]

R. Sevastjanova, W. Jentner, F. Sperrle, R. Kehlbeck, J. Bernard, and M. El-Assady, “Questioncomb: A gamification approach for the visual explanation of linguistic phenomena through interactive labeling,” ACM Transactions on Interactive Intelligent Systems (TiiS), vol. 11, no. 3–4, pp. 1–38, 2021. [Online]. Available: https://doi.org/10.1145/3429448

[28] [30]

M. A. Hearst, Search User Interfaces, 1st ed. Cambridge University Press, Sep. 2009. [Online]. Available: https://doi.org/10.1017/CBO9781139644082

[29] [31]

O. Celma, Music Recommendation and Discovery: The Long Tail, Long Fail, and Long Play in the Digital Music Space. Springer Berlin Heidelberg, 2010. [Online]. Available: https://doi.org/10.1007/978-3-642-13287-2

[30] [32]

L. McInnes, J. Healy, and J. Melville, “UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction,” arXiv preprint arXiv:1802.03426, 2018. [Online]. Available: https://doi.org/10.48550/ARXIV.1802.03426

[31] [33]

G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “LightGBM: A Highly Efficient Gradient Boosting Decision Tree,” in Advances in Neural Information Processing Systems, vol. 30. Curran Associates, Inc., 2017. [Online]. Available: https://proceedings.neurips.cc/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf

[32] [34]

Microsoft Corporation, “LightGBM - Features - Revision 24af9fa5.” [Online]. Available: https://lightgbm.readthedocs.io/en/latest/Features.html

[34] [36]

S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” Advances in Neural Information Processing Systems, vol. 30, 2017. [Online]. Available: https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf

[35] [37]

I. A. Hazwani, J. Schmid, M. Sachdeva, and J. Bernard, “A Design Space for Explainable Ranking and Ranking Models,” arXiv preprint arXiv:2205.15305, 2022. [Online]. Available: https://doi.org/10.48550/ARXIV.2205.15305

[36] [38]

G. Alicioglu and B. Sun, “A survey of visual analytics for Explainable Artificial Intelligence methods,” Computers & Graphics, vol. 102, pp. 502–520, Feb. 2022. [Online]. Available: https://doi.org/10.1016/j.cag.2021.09.002

[37] [39]

P. Pu, L. Chen, and R. Hu, “A user-centric evaluation framework for recommender systems,” in ACM conference on Recommender systems. ACM, Oct. 2011, pp. 157–164. [Online]. Available: https://doi.org/10.1145/2043932.2043962

[38] [40]

S. G. Hart and L. E. Staveland, “Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research,” in Advances in psychology. Elsevier, 1988, vol. 52, pp. 139–183. [Online]. Available: https://doi.org/10.1016/S0166-4115(08)62386-9