Drop
----
This problem was intended to be the hardest problem in the set, and indeed, nobody
solved it. When choosing to drop tests, there is a trade-off between tests with very
low percentage scores and tests with mediocre percentage scores but with a large number
of questions. Which one of these is better can depend on the other tests you are
keeping. Figuring out how to balance these requirements is the heart of this problem.
The first thing you have to realize is that a simple greedy algorithm is not going
to be correct. For example, it seems reasonable to drop tests one at a time, always
choosing to drop the test that will improve your cumulative average by as much as
possible. However, it turns out this is wrong. For example, consider dropping 2 tests
from a group of 100/100, 14/50, 14/50, 100/200, and 100/200. The greedy approach would
first drop a 14/50 test and then a 100/200 test to get a score of 214/350 = 61%, but
the optimal strategy is to drop both 100/200 tests for a score of 128/200 = 64%.
If you have done a lot of other programming competitions, you might also think of the
dynamic programming approach (see the stones write-up). By tracking the number of
questions and the number of tests, you can use this to solve the problem in
(number of questions)*(number of tests) time. Unfortunately, the maximum number of
questions over all the tests is a trillion, so that is no good either.
What do you do then? It turns out that the problem can be solved with a binary search.
We ask the following question: "Can you drop k tests to make your cumulative average
at least x?". It turns out that fixing x makes the problem substantially easier
because this is enough to determine which tests are better than others.
If we fix x, we need to choose tests so that
(sum a_i) / (sum b_i) >= x
<=> sum a_i >= sum (x*b_i)
<=> sum (a_i - x*b_i) >= 0.
Thus, we compute c_i = a_i - x*b_i for each i. We now need to drop k of those values so
that their sum is at least 0. This is a much easier problem! Just sort the c_i's and
drop the k smallest values.
This reduces everything to a standard binary search problem. For each x, we can test
whether we can get an average of at least x, and we need to find the maximum average
that can be made. C++ code is shown bleow.
double lb = 0, ub = 1;
for (i = 0; i < 100; i++) {
double x = (lb+ub)/2;
for (j = 0; j < tests; j++)
scores[j] = num[j] - x*den[j];
sort(scores, scores+tests);
double total = 0;
for (j = k; j < tests; j++)
total += scores[j];
if (total >= 0)
lb = x;
else
ub = x;
}
cout << int(100*lb + 0.5) << endl;