Proposed by Helmut Simonis
This problem was used for the programming competition of the ACP Summer School 2016 in Cork. The problem was defined in multiple stages, adding constraints and problem data as the competition progressed.
This is a totally made-up problem. None of the companies mentioned have been contacted, there are no interviews that will be arranged. Sorry! But there are conferences which run such a scheme, as well as some universities. The companies mentioned are interested in Constraint Programming, many are sponsoring the CP 2016 conference, or have participants in the ACP Summer School.
You have been asked to provide a tool that helps match students to companies in an interview session at the next major Constraint Programming conference. A number of companies have expressed interest in participating, there are currently 15 companies in the scheme.
Each student from set $S$ can express a preference to interview with each
company, with values ranging from 1 (very interested) to 5 (not at all interested).
We provide a comma separated text file that shows the answers that have been
collected through a Google Forms document (test.csv
).
Each student should have three interviews during the conference, with different companies. The companies have expressed limits on how many interviews
they want to perform during the conference.
We want to find an assignment of students to companies that stays within
the capacity limits for each company, while providing the best match to the
student preferences. Overall quality will be measured by two quality
indicators:
If the preference of student $i$ for company $j$ is $p_{ij}$ , and the three companies selected for student $i$ are $s_{ik}$ with $k \in [1, 3]$, the first overall quality indicator is the sum of the preferences for the selected assignments, i.e. $\sum_{i \in S} \sum_{k=1}^{3} p_{is_{ik}}$ . We want to minimize this sum.
We do not want to have a solution that satisfies all preferences for some students, but not for others. As a secondary criterion, we want to minimize the maximum of the total preference costs per student, adding the preference values for their assignments, i.e. $\min \max_{i \in S} \sum_{k=1}^{3} p_{is_{ik}}$ .
The capacities for the companies are given in Table 1.
Nr | Company | Capacity |
---|---|---|
1 | AIMMS | 5 |
2 | SAS | 5 |
3 | Keelvar | 3 |
4 | Microsoft | 10 |
5 | 10 | |
6 | IBM | 10 |
7 | Cadence | 5 |
8 | Quintiq | 10 |
9 | Siemens | 10 |
10 | Cosling | 3 |
11 | COSYTEC | 3 |
12 | LocalSolver | 3 |
13 | N-side | 3 |
14 | UTRC-I | 5 |
15 | Zoomer | 5 |
More students are signing up to the process, there is now an updated
file test2.csv
with a larger number of participating students.
In the data yesterday there were some students that had only negative preferences (5,…,5). This made it impossible to optimize the second criterion on balancing the assignment between students. As a first change we reduce the number of interviews to the minimum of three and the number of preferences given with a value less than 4. If a student only assigns preference value five, then no interviews should be scheduled. If a student only gives two preferences of value three or better, then only two interviews should be scheduled for the student.
Today, we also change the objective:
We first want to minimize the maximum regret over all students. This is the difference between the sum of the assigned preferences, and the three best indicated preferences. If a student had stated two preferences with cost one, and two with cost two, the best three choices have value four (1+1+2). If he is assigned to one interview with preference one, and two with preference two, then the assigned cost is five (1+2+2), a resulting regret of value 1. Minimizing the maximal regret is the highest priority, and then the total cost (sum of all assigned preferences) should be minimized as secondary criterion.
The companies have replied to the increased demand by also increasing the number of slots available. But they are worried that not enough student are assigned to their interviews, and they wont participate if less than half of their slots are taken. This leads to two scenarios:
For companies $j$, they should either be assigned no student, at a disappointment cost $d_{j}$, or be assigned at least a given minimum number of students $l_{j}$, and at most a maximum number $u_{j}$ so that they are satisfied. The data for the companies is given in Table 2.
Nr | Company | Disappointment Cost | Min Assignment | Max Assignment |
---|---|---|---|---|
1 | AIMMS | 10 | 5 | 10 |
2 | SAS | 10 | 5 | 10 |
3 | Keelvar | 10 | 3 | 6 |
4 | Microsoft | 10 | 10 | 20 |
5 | 20 | 10 | 20 | |
6 | IBM | 10 | 10 | 20 |
7 | Cadence | 5 | 5 | 10 |
8 | Quintiq | 10 | 10 | 20 |
9 | Siemens | 10 | 10 | 20 |
10 | Cosling | 5 | 3 | 6 |
11 | COSYTEC | 5 | 3 | 6 |
12 | LocalSolver | 5 | 3 | 6 |
13 | N-side | 5 | 3 | 6 |
14 | UTRC-I | 5 | 5 | 10 |
15 | Zoomer | 5 | 5 | 10 |
The interview assignment process seems to be working fine, but the conference organizers see a problem with the scheduling of the interviews. The conference is run over five days (Mon-Fri), with four time periods on each day. The sessions are numbered from 1 (Monday, early morning), to 20 (Friday, late afternoon). The students can indicate five of these twenty slots, when they are available for interviews. They are not available during any other period. Otherwise the conference organizers are worried that nobody will attend the talks of the conference.
Mon | Tue | Wed | Thu | Fri | |
---|---|---|---|---|---|
AM Early | 1 | 5 | 9 | 13 | 17 |
AM Late | 2 | 6 | 10 | 14 | 18 |
PM Early | 3 | 7 | 11 | 15 | 19 |
PM Late | 4 | 8 | 12 | 16 | 20 |
Students can only have one interview during one session, so their three interviews will occupy three of their five time slots. Companies can perform two interviews in one session. All interviews are scheduled in a set of suites, with each suite costing 200 units per week. Using a small number of suites is a good idea, as if even just one interview is scheduled in a suite, then the complete weekly rate has to be paid. Of course, each suite can hold only one interview at a time. There are at most 12 suites available.
If an interview of a company is scheduled in a period, then the company representative has to attend the conference for that day. Indeed, the company has to pay for all days between their earliest and the latest interview. If for example the first interview is in session 7 (Tue), and the latest interview in session 18 (Fri), then the company has to attend for four days (Tue, Wed, Thu, Fri), even if no interviews for them are scheduled on the Thursday. It therefore pays to group all interviews for a company together. The daily rate varies with each company, and is given in the company data below.
As the best solution yesterday allowed for a maximum of one regret only, this is now imposed as a hard constraint, so the maximal preference regret for each student is one. As all companies could be satisfied in yesterday’s optimal solution, we now have to plan for all companies getting between their lower and their upper bound of interviews, we can no longer disappoint them.
The objective now consists of the sum of the preference cost for the students, the attendance cost for the companies, and the rental cost for the interview suites. Using more suites might decrease the number of days that the companies have to attend, as more interviews can be performed in parallel, but increases the rental fee.
The time slots for the students are given in a new file
slots.csv
, which on each line, lists the possible five time slots
for each student.
StudentNr,slot1,slot2,slot3,slot4,slot5
1,3,5,9,13,17
...
The cost of attendance for each company is given is Table 4, which is otherwise unchanged.
Nr | Company | DisappointmentCost | Min Assignment | Max Assignment | Attendance Cost |
---|---|---|---|---|---|
1 | AIMMS | 10 | 5 | 10 | 20 |
2 | SAS | 10 | 5 | 10 | 20 |
3 | Keelvar | 10 | 3 | 6 | 10 |
4 | Microsoft | 10 | 10 | 20 | 30 |
5 | 20 | 10 | 20 | 30 | |
6 | IBM | 10 | 10 | 20 | 30 |
7 | Cadence | 5 | 5 | 10 | 10 |
8 | Quintiq | 10 | 10 | 20 | 10 |
9 | Siemens | 10 | 10 | 20 | 20 |
10 | Cosling | 5 | 3 | 6 | 5 |
11 | COSYTEC | 5 | 3 | 6 | 5 |
12 | LocalSolver | 5 | 3 | 6 | 5 |
13 | N-side | 5 | 3 | 6 | 5 |
14 | UTRC-I | 5 | 5 | 10 | 10 |
15 | Zoomer | 5 | 5 | 10 | 10 |
Success! People have heard about your assignment tool, and want to use
it for their next conference. You can be proud of your modelling
skills! There is only a small problem: This is for a major conference,
with 15 companies participating, and up to 400 students that should be
assigned. The corresponding preference file is testNNN.csv
, the slot
data are given in slotsNNN.csv
,
and the company data are in a file companyNNN.csv
.
The file companyNNN.csv
contains the company specific data in
a .csv file. We’ve added a field to tell how many interviews the
company can perform in parallel in each session. The data from
yesterday are in the file company.csv
shown in Table 5. The disappointment value
is not used.
Company | Disappointment | Lower | Upper | DailyRate | Parallel |
---|---|---|---|---|---|
AIMMS | 10 | 5 | 10 | 20 | 2 |
SAS | 10 | 5 | 10 | 20 | 2 |
Keelvar | 10 | 3 | 6 | 10 | 2 |
Microsoft | 10 | 10 | 20 | 30 | 2 |
20 | 10 | 20 | 30 | 2 | |
IBM | 10 | 10 | 20 | 30 | 2 |
Cadence | 5 | 5 | 10 | 10 | 2 |
Quintiq | 10 | 10 | 20 | 10 | 2 |
Siemens | 10 | 10 | 20 | 20 | 2 |
Cosling | 5 | 3 | 6 | 5 | 2 |
COSYTEC | 5 | 3 | 6 | 5 | 2 |
LocalSolver | 5 | 3 | 6 | 5 | 2 |
N-side | 5 | 3 | 6 | 5 | 2 |
UTRC-I | 5 | 5 | 10 | 10 | 2 |
Zoomer | 5 | 5 | 10 | 10 | 2 |
The new datasets are given in Table 6.
Preferences | Slots | Company |
---|---|---|
test2.csv | slots.csv | company2.csv |
test100.csv | slots100.csv | company100.csv |
test200.csv | slots200.csv | company200.csv |
test400.csv | slots400.csv | company400.csv |
We also may need more than 12 suites to schedule all the interviews.
Can you still provide a good assignment?