Latest News:
Leaderboard(s) for the first evaluation round are out!
The deadline for the final submission (all test sets, all scenarios) has been moved by one week to 17 May 2024.
Registration and more:
Registration
You can register for the shared task using this registration form.
Data Download
You can download the data and evaluation script from this repository.
Communication
Join our Slack channel here for information or to ask questions.
Or write us an e-mail at perspectiveargretrieval@gmail.com.
About
The "Perspective Argument Retrieval" task addresses the often-overlooked challenge of incorporating
socio-cultural factors (such as political views, occupation, age, gender) in argument retrieval. By focusing
on these aspects, we acknowledge their potential latent influence on argumentation.
With this shared task, we invite the community to develop methods that concentrate on this crucial area and
advance state-of-the-art retrieval models by considering the perspective of societal diversity.
Task Description
Argument retrieval is the task of retrieving the top-k relevant arguments from a corpus for a given query.
With this shared task, we formulate perspective argument retrieval as an extension of argument retrieval
that considers socio-cultural factors. Concretely, the task proposes three scenarios of varying difficulty
for considering socio-cultural profiles. In this way, we want to foster approaches that take into account
latent aspects of argumentation beyond semantic features, such as personal attitude. We consider these
aspects both during retrieval and evaluation.
Retrieval Scenarios
Scenario 1: Baseline retrieval of relevant arguments given a specific query from a given corpus. Here,
we evaluate the general ability of a system to retrieve relevant arguments.
Example query: Are you in favor of the introduction of a tax on foods containing sugar (sugar tax)?
Example relevant candidates: The reduction of sugar in food should be pushed. Not every food needs
additional sugar as a supplement.
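To make the baseline scenario concrete, here is a minimal sketch of top-k retrieval. It assumes a simple bag-of-words cosine similarity, which is not the method used by any baseline or participant in this shared task. The function names, the corpus layout, and the second corpus entry are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_top_k(query, corpus, k):
    """Return the ids of the k arguments most similar to the query."""
    q_vec = Counter(tokenize(query))
    scored = [(cosine(q_vec, Counter(tokenize(text))), arg_id) for arg_id, text in corpus.items()]
    # Sort by descending score; break ties by id so the output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [arg_id for _, arg_id in scored[:k]]


corpus = {
    # Example candidate from the task description.
    "a1": "The reduction of sugar in food should be pushed. "
          "Not every food needs additional sugar as a supplement.",
    # Hypothetical off-topic argument, invented for this sketch.
    "a2": "Public transport should be expanded in rural regions.",
}
query = "Are you in favor of the introduction of a tax on foods containing sugar (sugar tax)?"
top = retrieve_top_k(query, corpus, k=1)
```

A real system would likely replace the token-count vectors with dense sentence embeddings or a lexical ranking function, while the top-k selection step stays the same.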
Scenario 2: Explicit perspectivism extends the baseline task by explicitly adding socio-cultural
information to the query and the corpus and limiting the relevant candidates to arguments from authors
matching the corresponding socio-cultural background. With this second scenario, we test whether a retrieval
system can consider socio-cultural properties when explicitly mentioned in the query and the candidates.
Example query: Given a left attitude, are you in favor of the introduction of a tax on foods containing
sugar (sugar tax)?
Example relevant candidates: With a left attitude, reducing sugar in food should be pushed. Not every food
needs additional sugar as a supplement.
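The restriction of relevant candidates to matching authors can be sketched as a filter over the corpus. This is only an illustration under assumed data structures: the field names (`id`, `issue`, `author`, `political_attitude`) and the issue id `sugar_tax` are hypothetical and do not reflect the format of the released data.

```python
def relevant_candidates(corpus, issue_id, profile=None):
    """Ids of arguments on the issue whose author matches every requested property.

    With an empty profile this reduces to the baseline scenario: every
    argument addressing the issue counts as relevant.
    """
    profile = profile or {}
    return [
        arg["id"]
        for arg in corpus
        if arg["issue"] == issue_id
        and all(arg["author"].get(key) == value for key, value in profile.items())
    ]


# Toy corpus built from the example candidates in the task description.
corpus = [
    {
        "id": "a1",
        "issue": "sugar_tax",
        "author": {"political_attitude": "left"},
        "text": "The reduction of sugar in food should be pushed.",
    },
    {
        "id": "a2",
        "issue": "sugar_tax",
        "author": {"political_attitude": "liberal"},
        "text": "Eating is an individual decision. It doesn't need a nanny state.",
    },
]

baseline = relevant_candidates(corpus, "sugar_tax")
explicit = relevant_candidates(corpus, "sugar_tax", {"political_attitude": "left"})
```

In this toy setup the baseline query treats both arguments as relevant, while the query with a left attitude keeps only the argument written by an author with that attitude.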
Scenario 3: Implicit perspectivism. This scenario is similar to explicit perspectivism, but we only add
socio-cultural information to the query. Thus, we test the ability of a retrieval system to account
for latently encoded socio-cultural information within the argument.
Example query: Given a liberal attitude, are you in favor of the introduction of a tax on foods containing
sugar (sugar tax)?
Example relevant candidates: Eating is an individual decision. It doesn't need a nanny state.
Retrieval Evaluation
For all three scenarios, we employ a two-fold evaluation for a comprehensive measure of retrieval
quality. Concretely, we distinguish between relevance and diversity / fairness:
With relevance, we focus on the ability of a retrieval system to select the relevant candidates, for
example, all arguments addressing the queried issue for the baseline scenario, or arguments that
additionally match specific demographic properties for explicit or implicit perspectivism.
With diversity, we account for the influence of perspectivism in the evaluation by measuring
to what extent a retrieval system diversifies the relevant arguments regarding stance distribution or other
socio-cultural factors, such as age or education.
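The relevance side is reported on the leaderboards as ndcg@k and precision@k. The following sketch shows the standard textbook form of both metrics, assuming binary relevance labels. It is not taken from the official evaluation script, which may differ in details such as tie handling or normalization.

```python
import math


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved arguments that are relevant."""
    hits = sum(1 for arg_id in ranked_ids[:k] if arg_id in relevant_ids)
    return hits / k


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Normalized discounted cumulative gain with binary gains."""
    # A relevant argument at 0-based rank r contributes 1 / log2(r + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, arg_id in enumerate(ranked_ids[:k])
        if arg_id in relevant_ids
    )
    # The ideal ranking places all relevant arguments first.
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


ranked = ["a", "x", "b", "y"]   # system output, best first (toy ids)
relevant = {"a", "b", "c"}      # gold relevant arguments (toy ids)
p4 = precision_at_k(ranked, relevant, k=4)
n4 = ndcg_at_k(ranked, relevant, k=4)
```

Here two of the four retrieved arguments are relevant, so precision@4 is 0.5, while nDCG@4 additionally penalizes the relevant argument that appears only at rank 3 and the one that is missing.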
Data
The data and evaluation script can be downloaded from this repository.
This shared task is based on the x-stance dataset (Vamvas & Sennrich, 2020), which provides arguments
annotated with their stance on different political issues, gathered from the voting recommendation platform
https://www.smartvote.ch/. This platform provides voting suggestions based on a questionnaire that
politicians and voters fill out. In it, politicians can argue why they are in favor of or against specific
political issues.
We use the arguments covering the 2019 Swiss Federal elections as a corpus and the political issues as
queries.
Afterward, we enrich these arguments with eight socio-cultural properties, either provided by the voting
platform itself (gender, age, party, …) or derived from the filled-out questionnaire of the politicians
(political attitude, important political aspects, …). This collection comprises 26,335 arguments for
45 political aspects in German, French, and Italian.
We generate the train and development splits by using 35 political aspects for training and 10 for
development, while the argument corpus is shared by both sets. Apart from the queries for the baseline
scenario, we will also provide queries for the perspectivism scenarios, including socio-cultural
information. As the x-stance dataset is publicly available, the final evaluation data consists of secret
test sets.
Submission Policy
You may use any external data source for pre-training your models. However, we do not accept submissions
using proprietary LLMs (e.g., GPT-4). To avoid data leakage, please do not input any of the data into the
ChatGPT online interface.
You are allowed to use any open-source LLMs. Have a look at this website for a list of frequently
used open-source LLMs in case you want to use those.
There are three test sets for the evaluation of the shared task. The first test set is taken from the
election of 2019, the second from the year 2023, and the third is a surprise test set. The final evaluation
will take place on 18 May. You can submit all final predictions until 17 May, 11.59 pm UTC -12h
(“anywhere on Earth”). If you upload predictions earlier, you will see their results on the leaderboard
after each evaluation cycle (see Important Dates).
You can change the predictions for all test sets and all scenarios until the final deadline.
You can also submit partial results. However, keep in mind that the final ranking considers all
predictions (average across all test sets and scenarios).
Important Dates
- Release Training Data: 4 March
- Release Test Set 1: 17 March
- First Evaluation Cycle (Test set 1): 24 April
- Second Evaluation Cycle (Test set 2): 2 May
- Third Evaluation Cycle (Surprise Test set): 13 May
- Submission Final Run: 17 May
- Evaluation Final Run: 18 May
- Submission System Description Paper: 27 May
- Camera-ready papers due: 1 July
- Workshop / Shared Task: 15 August 2024
All deadlines are 11.59 pm UTC -12h (“anywhere on Earth”).
Leaderboards
First Evaluation Cycle (Test set 1, election 2019)
Relevance - Scenario Baseline
k | ndcg@k | precision@k | team | team_rank
4 | 0.990189 | 0.988889 | sbert_baseline | 1
4 | 0.990189 | 0.988889 | GESIS-DSM | 1
4 | 0.982115 | 0.983333 | Twente-BMS-NLP | 2
4 | 0.716184 | 0.683333 | bm25_baseline | 3
8 | 0.987593 | 0.986111 | sbert_baseline | 1
8 | 0.987593 | 0.986111 | GESIS-DSM | 1
8 | 0.986639 | 0.988889 | Twente-BMS-NLP | 2
8 | 0.671677 | 0.636111 | bm25_baseline | 3
16 | 0.988446 | 0.990278 | Twente-BMS-NLP | 1
16 | 0.983255 | 0.980556 | sbert_baseline | 2
16 | 0.983255 | 0.980556 | GESIS-DSM | 2
16 | 0.619426 | 0.579167 | bm25_baseline | 3
20 | 0.988492 | 0.990000 | Twente-BMS-NLP | 1
20 | 0.981093 | 0.977778 | sbert_baseline | 2
20 | 0.981093 | 0.977778 | GESIS-DSM | 2
20 | 0.596877 | 0.553333 | bm25_baseline | 3
avg | 0.986423 | 0.988125 | Twente-BMS-NLP | 1
avg | 0.985532 | 0.983333 | sbert_baseline | 2
avg | 0.985532 | 0.983333 | GESIS-DSM | 2
avg | 0.651041 | 0.612986 | bm25_baseline | 3
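The avg rows and team_rank values in the Relevance - Scenario Baseline table above can be reproduced from the per-k rows. The sketch below assumes that avg is the arithmetic mean over k = 4, 8, 16, 20 and that teams are ranked by ndcg with ties sharing a rank (dense ranking). Both assumptions are consistent with the numbers shown but are not confirmed by the organizers.

```python
from statistics import mean

# ndcg@k values copied from the table above, in the order k = 4, 8, 16, 20.
ndcg = {
    "Twente-BMS-NLP": [0.982115, 0.986639, 0.988446, 0.988492],
    "sbert_baseline": [0.990189, 0.987593, 0.983255, 0.981093],
    "GESIS-DSM": [0.990189, 0.987593, 0.983255, 0.981093],
    "bm25_baseline": [0.716184, 0.671677, 0.619426, 0.596877],
}

# Assumed definition of the avg row: mean over the four cut-offs.
avg = {team: round(mean(values), 6) for team, values in ndcg.items()}


def dense_rank(scores):
    """Rank teams by descending score; equal scores share the same rank."""
    distinct = sorted(set(scores.values()), reverse=True)
    position = {score: i + 1 for i, score in enumerate(distinct)}
    return {team: position[score] for team, score in scores.items()}


ranks = dense_rank(avg)
```

Under these assumptions, sbert_baseline and GESIS-DSM share a rank because their scores are identical, and bm25_baseline follows directly at rank 3 instead of rank 4.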
Diversity - Scenario Baseline
k | αNDCG@k | klDiv@k | team | team_rank | socioVar(lowest_α) | socioVar(highest_α)
4 | 0.901997 | 0.154536 | GESIS-DSM | 1 | political_spectrum | Open Foreign Policy
4 | 0.901690 | 0.155147 | sbert_baseline | 2 | political_spectrum | Open Foreign Policy
4 | 0.880867 | 0.174923 | Twente-BMS-NLP | 3 | political_spectrum | Open Foreign Policy
4 | 0.672682 | 0.152301 | bm25_baseline | 4 | political_spectrum | gender
8 | 0.908962 | 0.139420 | GESIS-DSM | 1 | political_spectrum | gender
8 | 0.908762 | 0.139904 | sbert_baseline | 2 | political_spectrum | gender
8 | 0.899820 | 0.158746 | Twente-BMS-NLP | 3 | political_spectrum | gender
8 | 0.643187 | 0.136052 | bm25_baseline | 4 | political_spectrum | gender
16 | 0.924070 | 0.106170 | GESIS-DSM | 1 | education | gender
16 | 0.923998 | 0.106429 | sbert_baseline | 2 | education | gender
16 | 0.923574 | 0.123654 | Twente-BMS-NLP | 3 | education | gender
16 | 0.608648 | 0.102753 | bm25_baseline | 4 | education | gender
20 | 0.931639 | 0.113165 | Twente-BMS-NLP | 1 | education | gender
20 | 0.929635 | 0.097030 | GESIS-DSM | 2 | education | gender
20 | 0.929557 | 0.097267 | sbert_baseline | 3 | education | gender
20 | 0.592760 | 0.093548 | bm25_baseline | 4 | education | gender
avg | 0.916166 | 0.124289 | GESIS-DSM | 1 | education | gender
avg | 0.916002 | 0.124687 | sbert_baseline | 2 | education | gender
avg | 0.908975 | 0.142622 | Twente-BMS-NLP | 3 | political_spectrum | Open Foreign Policy
avg | 0.629319 | 0.121164 | bm25_baseline | 4 | education | gender
socioVar(lowest_α) and socioVar(highest_α) are the socio-cultural variables with the lowest and highest α-ndcg
values, respectively.
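As a rough illustration of the klDiv@k column, the sketch below computes a Kullback-Leibler divergence between the distribution of one socio-cultural variable in the top-k results and its distribution in the corpus. This is one plausible reading only: the direction of the divergence, the reference distribution, and the smoothing constant `eps` are assumptions, and the official evaluation script may define the measure differently.

```python
import math
from collections import Counter


def distribution(values, categories, eps=1e-9):
    """Smoothed relative frequency of each category among the given values."""
    counts = Counter(values)
    total = len(values) + eps * len(categories)
    return {c: (counts[c] + eps) / total for c in categories}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same categories."""
    return sum(p[c] * math.log(p[c] / q[c]) for c in p)


categories = ["f", "m"]
corpus_gender = ["f", "m", "f", "m"]   # toy corpus: balanced authors
balanced_top_k = ["f", "m", "m", "f"]  # top-4 mirrors the corpus
skewed_top_k = ["f", "f", "f", "f"]    # top-4 from one group only

q = distribution(corpus_gender, categories)
kl_balanced = kl_divergence(distribution(balanced_top_k, categories), q)
kl_skewed = kl_divergence(distribution(skewed_top_k, categories), q)
```

Under this reading, a lower value means the retrieved arguments reflect the corpus distribution more closely: the balanced list yields a divergence of about 0, while the skewed list yields about ln 2.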
Relevance - Scenario Explicit
k | ndcg@k | precision@k | team | team_rank
4 | 0.853129 | 0.784139 | Twente-BMS-NLP | 1
4 | 0.217063 | 0.216921 | GESIS-DSM | 2
4 | 0.210929 | 0.209182 | sbert_baseline | 3
8 | 0.826973 | 0.699215 | Twente-BMS-NLP | 1
8 | 0.220313 | 0.215066 | GESIS-DSM | 2
8 | 0.210107 | 0.206796 | sbert_baseline | 3
16 | 0.806304 | 0.613947 | Twente-BMS-NLP | 1
16 | 0.219608 | 0.206292 | GESIS-DSM | 2
16 | 0.211977 | 0.205126 | sbert_baseline | 3
20 | 0.798233 | 0.584441 | Twente-BMS-NLP | 1
20 | 0.220142 | 0.204156 | GESIS-DSM | 2
20 | 0.211292 | 0.201972 | sbert_baseline | 3
avg | 0.821159 | 0.670436 | Twente-BMS-NLP | 1
avg | 0.219281 | 0.210609 | GESIS-DSM | 2
avg | 0.211076 | 0.205769 | sbert_baseline | 3
Diversity - Scenario Explicit
k | αNDCG@k | klDiv@k | team | team_rank | socioVar(lowest_α) | socioVar(highest_α)
4 | 0.803502 | 0.201835 | Twente-BMS-NLP | 1 | age | Open Foreign Policy
4 | 0.198266 | 0.161001 | GESIS-DSM | 2 | political_spectrum | Open Foreign Policy
4 | 0.193978 | 0.169943 | sbert_baseline | 3 | political_spectrum | gender
8 | 0.795288 | 0.188889 | Twente-BMS-NLP | 1 | political_spectrum | Open Foreign Policy
8 | 0.203975 | 0.145292 | GESIS-DSM | 2 | political_spectrum | gender
8 | 0.196950 | 0.153803 | sbert_baseline | 3 | political_spectrum | gender
16 | 0.788913 | 0.161965 | Twente-BMS-NLP | 1 | education | gender
16 | 0.207447 | 0.112282 | GESIS-DSM | 2 | education | gender
16 | 0.202131 | 0.120015 | sbert_baseline | 3 | education | gender
20 | 0.784744 | 0.154346 | Twente-BMS-NLP | 1 | education | gender
20 | 0.208944 | 0.103003 | GESIS-DSM | 2 | education | gender
20 | 0.202647 | 0.110547 | sbert_baseline | 3 | education | gender
avg | 0.793112 | 0.176759 | Twente-BMS-NLP | 1 | political_spectrum | Open Foreign Policy
avg | 0.204658 | 0.130394 | GESIS-DSM | 2 | political_spectrum | gender
avg | 0.198927 | 0.138577 | sbert_baseline | 3 | education | gender
socioVar(lowest_α) and socioVar(highest_α) are the socio-cultural variables with the lowest and highest α-ndcg
values, respectively.
Relevance - Scenario Implicit
k | ndcg@k | precision@k | team | team_rank
4 | 0.197831 | 0.198685 | GESIS-DSM | 1
4 | 0.195035 | 0.196141 | sbert_baseline | 2
8 | 0.199804 | 0.200594 | sbert_baseline | 1
8 | 0.198524 | 0.197360 | GESIS-DSM | 2
16 | 0.205488 | 0.201150 | GESIS-DSM | 1
16 | 0.205072 | 0.202502 | sbert_baseline | 2
20 | 0.208600 | 0.204114 | sbert_baseline | 1
20 | 0.207223 | 0.201124 | GESIS-DSM | 2
avg | 0.202267 | 0.199580 | GESIS-DSM | 1
avg | 0.202128 | 0.200838 | sbert_baseline | 2
Diversity - Scenario Implicit
k | αNDCG@k | klDiv@k | team | team_rank | socioVar(lowest_α) | socioVar(highest_α)
4 | 0.182819 | 0.160637 | GESIS-DSM | 1 | political_spectrum | gender
4 | 0.179801 | 0.155224 | sbert_baseline | 2 | political_spectrum | gender
8 | 0.186279 | 0.144410 | GESIS-DSM | 1 | political_spectrum | gender
8 | 0.186115 | 0.139378 | sbert_baseline | 2 | political_spectrum | gender
16 | 0.194678 | 0.110474 | GESIS-DSM | 1 | education | gender
16 | 0.193679 | 0.106891 | sbert_baseline | 2 | education | gender
20 | 0.197447 | 0.097982 | sbert_baseline | 1 | education | gender
20 | 0.197116 | 0.100929 | GESIS-DSM | 2 | education | gender
avg | 0.190223 | 0.129112 | GESIS-DSM | 1 | political_spectrum | gender
avg | 0.189261 | 0.124869 | sbert_baseline | 2 | education | gender
Second Evaluation Cycle (Test set 2, election 2023)
Relevance - Scenario Baseline
k | ndcg@k | precision@k | team | team_rank
4 | 0.903527 | 0.912500 | boulder_NLP | 1
4 | 0.883713 | 0.887500 | sbert_baseline | 2
4 | 0.851954 | 0.850000 | GESIS-DSM | 3
4 | 0.775601 | 0.775000 | bm_35_baseline | 4
4 | 0.768397 | 0.781250 | TWENTE-BMS-NLP | 5
8 | 0.892774 | 0.893750 | boulder_NLP | 1
8 | 0.863010 | 0.856250 | sbert_baseline | 2
8 | 0.840808 | 0.834375 | GESIS-DSM | 3
8 | 0.768952 | 0.775000 | TWENTE-BMS-NLP | 4
8 | 0.764368 | 0.759375 | bm_35_baseline | 5
16 | 0.874400 | 0.867188 | boulder_NLP | 1
16 | 0.837641 | 0.823438 | sbert_baseline | 2
16 | 0.811359 | 0.795312 | GESIS-DSM | 3
16 | 0.755486 | 0.751563 | TWENTE-BMS-NLP | 4
16 | 0.717200 | 0.693750 | bm_35_baseline | 5
20 | 0.869493 | 0.861250 | boulder_NLP | 1
20 | 0.836021 | 0.823750 | sbert_baseline | 2
20 | 0.803148 | 0.786250 | GESIS-DSM | 3
20 | 0.755566 | 0.752500 | TWENTE-BMS-NLP | 4
20 | 0.692677 | 0.661250 | bm_35_baseline | 5
avg | 0.885048 | 0.883672 | boulder_NLP | 1
avg | 0.855096 | 0.847734 | sbert_baseline | 2
avg | 0.826817 | 0.816484 | GESIS-DSM | 3
avg | 0.762100 | 0.765078 | TWENTE-BMS-NLP | 4
avg | 0.737462 | 0.722344 | bm_35_baseline | 5
Diversity - Scenario Baseline
k | αNDCG@k | klDiv@k | team | team_rank | socioVar(lowest_α) | socioVar(highest_α)
4 | 0.843193 | 0.190554 | boulder_NLP | 1 | gender | Open Foreign Policy
4 | 0.825927 | 0.183359 | sbert_baseline | 2 | gender | Liberal Society
4 | 0.802393 | 0.189029 | GESIS-DSM | 3 | gender | Expanded Welfare State
4 | 0.729309 | 0.194654 | bm_35_baseline | 4 | gender | Expanded Welfare State
4 | 0.715027 | 0.175067 | TWENTE-BMS-NLP | 5 | gender | Enhanced Environmental Protection
8 | 0.835625 | 0.184543 | boulder_NLP | 1 | civil_status | Open Foreign Policy
8 | 0.810864 | 0.175428 | sbert_baseline | 2 | civil_status | Liberal Society
8 | 0.793244 | 0.181590 | GESIS-DSM | 3 | civil_status | Expanded Welfare State
8 | 0.720015 | 0.186390 | bm_35_baseline | 4 | civil_status | Expanded Welfare State
8 | 0.717539 | 0.167862 | TWENTE-BMS-NLP | 5 | civil_status | Enhanced Environmental Protection
16 | 0.835437 | 0.170560 | boulder_NLP | 1 | education | Open Foreign Policy
16 | 0.803294 | 0.158082 | sbert_baseline | 2 | education | Liberal Society
16 | 0.780678 | 0.163216 | GESIS-DSM | 3 | education | Expanded Welfare State
16 | 0.719560 | 0.150388 | TWENTE-BMS-NLP | 4 | education | Enhanced Environmental Protection
16 | 0.693180 | 0.163362 | bm_35_baseline | 5 | education | Expanded Welfare State
20 | 0.836555 | 0.166120 | boulder_NLP | 1 | education | Open Foreign Policy
20 | 0.806363 | 0.153383 | sbert_baseline | 2 | education | Liberal Society
20 | 0.778455 | 0.157177 | GESIS-DSM | 3 | education | Expanded Welfare State
20 | 0.724297 | 0.144618 | TWENTE-BMS-NLP | 4 | education | Enhanced Environmental Protection
20 | 0.676614 | 0.154637 | bm_35_baseline | 5 | education | Expanded Welfare State
avg | 0.837703 | 0.177944 | boulder_NLP | 1 | civil_status | Open Foreign Policy
avg | 0.811612 | 0.167563 | sbert_baseline | 2 | civil_status | Liberal Society
avg | 0.788693 | 0.172753 | GESIS-DSM | 3 | civil_status | Enhanced Environmental Protection
avg | 0.719106 | 0.159484 | TWENTE-BMS-NLP | 4 | civil_status | Enhanced Environmental Protection
avg | 0.704779 | 0.174761 | bm_35_baseline | 5 | civil_status | Enhanced Environmental Protection
Relevance - Scenario Explicit
k | ndcg@k | precision@k | team | team_rank
4 | 0.778944 | 0.693322 | sövereign | 1
4 | 0.746718 | 0.664001 | TWENTE-BMS-NLP | 2
4 | 0.719577 | 0.634820 | GESIS-DSM | 3
4 | 0.152074 | 0.149411 | sbert_baseline | 4
8 | 0.753616 | 0.601010 | sövereign | 1
8 | 0.717243 | 0.567691 | TWENTE-BMS-NLP | 2
8 | 0.688722 | 0.539913 | GESIS-DSM | 3
8 | 0.147092 | 0.140853 | sbert_baseline | 4
16 | 0.723055 | 0.494353 | sövereign | 1
16 | 0.688206 | 0.466716 | TWENTE-BMS-NLP | 2
16 | 0.661889 | 0.445391 | GESIS-DSM | 3
16 | 0.145617 | 0.134961 | sbert_baseline | 4
20 | 0.713903 | 0.457632 | sövereign | 1
20 | 0.680463 | 0.433698 | TWENTE-BMS-NLP | 2
20 | 0.654631 | 0.413215 | GESIS-DSM | 3
20 | 0.145981 | 0.133109 | sbert_baseline | 4
avg | 0.742379 | 0.561579 | sövereign | 1
avg | 0.708157 | 0.533026 | TWENTE-BMS-NLP | 2
avg | 0.681205 | 0.508335 | GESIS-DSM | 3
avg | 0.147691 | 0.139583 | sbert_baseline | 4
Diversity - Scenario Explicit
k | αNDCG@k | klDiv@k | team | team_rank | socioVar(lowest_α) | socioVar(highest_α)
4 | 0.762587 | 0.204173 | sövereign | 1 | civil_status | Enhanced Environmental Protection
4 | 0.731383 | 0.201999 | TWENTE-BMS-NLP | 2 | civil_status | Enhanced Environmental Protection
4 | 0.708314 | 0.201885 | GESIS-DSM | 3 | civil_status | Enhanced Environmental Protection
4 | 0.148394 | 0.188384 | sbert_baseline | 4 | civil_status | Open Foreign Policy
8 | 0.742925 | 0.195974 | sövereign | 1 | political_spectrum | Enhanced Environmental Protection
8 | 0.707982 | 0.193850 | TWENTE-BMS-NLP | 2 | political_spectrum | Enhanced Environmental Protection
8 | 0.683388 | 0.193435 | GESIS-DSM | 3 | political_spectrum | Enhanced Environmental Protection
8 | 0.144145 | 0.180716 | sbert_baseline | 4 | political_spectrum | Open Foreign Policy
16 | 0.720318 | 0.176849 | sövereign | 1 | political_spectrum | Enhanced Environmental Protection
16 | 0.685955 | 0.174981 | TWENTE-BMS-NLP | 2 | education | Enhanced Environmental Protection
16 | 0.662275 | 0.174012 | GESIS-DSM | 3 | education | Enhanced Environmental Protection
16 | 0.143560 | 0.163211 | sbert_baseline | 4 | political_spectrum | Open Foreign Policy
20 | 0.712741 | 0.170729 | sövereign | 1 | political_spectrum | Enhanced Environmental Protection
20 | 0.679465 | 0.169039 | TWENTE-BMS-NLP | 2 | political_spectrum | Enhanced Environmental Protection
20 | 0.655853 | 0.167952 | GESIS-DSM | 3 | education | Enhanced Environmental Protection
20 | 0.143953 | 0.157795 | sbert_baseline | 4 | political_spectrum | Open Foreign Policy
avg | 0.734643 | 0.186931 | sövereign | 1 | political_spectrum | Enhanced Environmental Protection
avg | 0.701196 | 0.184967 | TWENTE-BMS-NLP | 2 | civil_status | Enhanced Environmental Protection
avg | 0.677457 | 0.184321 | GESIS-DSM | 3 | civil_status | Enhanced Environmental Protection
avg | 0.145013 | 0.172527 | sbert_baseline | 4 | civil_status | Open Foreign Policy
socioVar(lowest_α) and socioVar(highest_α) are the socio-cultural variables with the lowest and highest α-ndcg
values, respectively.
Relevance - Scenario Implicit
k | ndcg@k | precision@k | team | team_rank
4 | 0.138857 | 0.134961 | sbert_baseline | 1
4 | 0.135854 | 0.135943 | TWENTE-BMS-NLP | 2
4 | 0.133892 | 0.133418 | GESIS-DSM | 3
8 | 0.135941 | 0.131453 | sbert_baseline | 1
8 | 0.133384 | 0.131594 | TWENTE-BMS-NLP | 2
8 | 0.131639 | 0.127736 | GESIS-DSM | 3
16 | 0.134325 | 0.126578 | sbert_baseline | 1
16 | 0.134060 | 0.127666 | TWENTE-BMS-NLP | 2
16 | 0.131696 | 0.123422 | GESIS-DSM | 3
20 | 0.135666 | 0.127301 | TWENTE-BMS-NLP | 1
20 | 0.134688 | 0.124804 | sbert_baseline | 2
20 | 0.132429 | 0.122447 | GESIS-DSM | 3
avg | 0.135953 | 0.129449 | sbert_baseline | 1
avg | 0.134741 | 0.130626 | TWENTE-BMS-NLP | 2
avg | 0.132414 | 0.126755 | GESIS-DSM | 3
Diversity - Scenario Implicit
k | αNDCG@k | klDiv@k | team | team_rank | socioVar(lowest_α) | socioVar(highest_α)
4 | 0.136044 | 0.187736 | sbert_baseline | 1 | civil_status | Liberal Society
4 | 0.130388 | 0.187460 | TWENTE-BMS-NLP | 2 | civil_status | Enhanced Environmental Protection
4 | 0.128952 | 0.191469 | GESIS-DSM | 3 | civil_status | Expanded Welfare State
8 | 0.133246 | 0.179940 | sbert_baseline | 1 | civil_status | Liberal Economic Policy
8 | 0.128856 | 0.180906 | TWENTE-BMS-NLP | 2 | civil_status | Enhanced Environmental Protection
8 | 0.127276 | 0.183187 | GESIS-DSM | 3 | civil_status | Expanded Welfare State
16 | 0.132503 | 0.162401 | sbert_baseline | 1 | civil_status | Liberal Economic Policy
16 | 0.130391 | 0.166958 | TWENTE-BMS-NLP | 2 | civil_status | Enhanced Environmental Protection
16 | 0.128243 | 0.164174 | GESIS-DSM | 3 | civil_status | Expanded Welfare State
20 | 0.132824 | 0.156980 | sbert_baseline | 1 | civil_status | Liberal Economic Policy
20 | 0.131905 | 0.163094 | TWENTE-BMS-NLP | 2 | civil_status | Enhanced Environmental Protection
20 | 0.129197 | 0.158113 | GESIS-DSM | 3 | civil_status | Expanded Welfare State
avg | 0.133654 | 0.171764 | sbert_baseline | 1 | civil_status | Liberal Economic Policy
avg | 0.130385 | 0.174605 | TWENTE-BMS-NLP | 2 | civil_status | Enhanced Environmental Protection
avg | 0.128417 | 0.174236 | GESIS-DSM | 3 | civil_status | Enhanced Environmental Protection
socioVar(lowest_α) and socioVar(highest_α) are the socio-cultural variables with the lowest and highest α-ndcg
values, respectively.