WEBVTT
1
00:00:00.000 --> 00:00:03.300
Hi! In this video I am going to present our work on
2
00:00:03.300 --> 00:00:08.366
efficiency-aware multiple importance sampling for
bidirectional rendering algorithms.
3
00:00:08.366 --> 00:00:13.033
When you're rendering an image of a virtual scene, one
question you have to answer is:
4
00:00:13.033 --> 00:00:15.000
"Which algorithm should you use?"
5
00:00:15.000 --> 00:00:19.033
For example, vast exteriors are best handled by
unidirectional path tracing
6
00:00:19.033 --> 00:00:24.333
But interior scenes with caustics benefit from bidirectional
methods like the VCM algorithm.
7
00:00:25.733 --> 00:00:29.599
A more annoying decision you have to make is how to allocate
your sample budget.
8
00:00:29.599 --> 00:00:33.566
You could trace no bidirectional samples at all, that is, render purely unidirectionally.
9
00:00:33.566 --> 00:00:37.866
You could trace a small number of light paths, or a large number of light paths.
10
00:00:37.866 --> 00:00:41.900
You could trace a small number of shadow rays, or a large
number of shadow rays.
11
00:00:41.900 --> 00:00:45.599
You could enable photon mapping, or disable photon mapping.
12
00:00:45.599 --> 00:00:50.300
And finding the right combination of all of these decisions
can be very, very challenging
13
00:00:50.300 --> 00:00:53.300
and also very frustrating, if you have to do it manually.
14
00:00:53.866 --> 00:01:00.199
This is where our method comes in, because we propose a way
to automatically optimize these sample parameters
15
00:01:00.199 --> 00:01:04.199
for the scene that we are currently rendering. But first, we
need to recap some background.
16
00:01:04.199 --> 00:01:09.533
Our goal is to estimate some integral F of a function f(x)
17
00:01:09.533 --> 00:01:15.900
We do that via Monte Carlo integration, which means we take
n random samples x, with some PDF p(x)
18
00:01:15.900 --> 00:01:19.733
And then we simply average the sample weights of those
samples.
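In code, this recap boils down to just a few lines. Here is a sketch in Python (illustration only, not code from the paper):

```python
import random

def mc_estimate(f, sample, pdf, n):
    """Monte Carlo estimate of F = integral of f(x): average the
    sample weights f(x) / p(x) of n random samples x ~ p."""
    total = 0.0
    for _ in range(n):
        x = sample()                # draw x with density p
        total += f(x) / pdf(x)      # one sample weight f(x) / p(x)
    return total / n

# Example: integrate f(x) = 3x^2 over [0, 1] with a uniform p(x) = 1.
# The exact value is 1; the estimate converges to it as n grows.
random.seed(0)
est = mc_estimate(lambda x: 3.0 * x * x, random.random, lambda x: 1.0, 100_000)
```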
19
00:01:19.733 --> 00:01:27.966
And if the PDF p(x) - illustrated in red down here - closely
matches our integrand f(x) - the black line,
20
00:01:27.966 --> 00:01:33.833
then we achieve a good estimator with low variance that
needs only a small number n of samples
21
00:01:33.833 --> 00:01:40.099
But, in practice, it's often very hard to find one such
sample density that can achieve this.
22
00:01:40.099 --> 00:01:46.333
That's where multiple importance sampling comes in, because
MIS allows us to combine multiple densities,
23
00:01:46.333 --> 00:01:49.699
for example the blue and yellow ones shown down here,
24
00:01:49.699 --> 00:01:54.300
that each resemble the function we want to integrate better
in different regions
25
00:01:54.300 --> 00:01:58.166
or for different cases, like different types of illumination
in rendering.
26
00:01:58.166 --> 00:02:04.900
And then we simply sum over all those sampling techniques,
take some number n_t of samples from them
27
00:02:04.900 --> 00:02:11.833
and again compute our Monte Carlo estimate, but now
additionally weighted by an MIS weighting function w.
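A sketch of this MIS estimator, using the balance heuristic as one common choice of weighting function w (again, illustration only):

```python
import math, random

def balance_weight(t, x, pdfs, counts):
    """Balance heuristic: w_t(x) = n_t p_t(x) / sum_s n_s p_s(x)."""
    denom = sum(n_s * p_s(x) for n_s, p_s in zip(counts, pdfs))
    return counts[t] * pdfs[t](x) / denom

def mis_estimate(f, samplers, pdfs, counts):
    """Sum over techniques t: (1/n_t) * sum_i w_t(x_i) f(x_i) / p_t(x_i)."""
    total = 0.0
    for t, (sample, pdf, n_t) in enumerate(zip(samplers, pdfs, counts)):
        for _ in range(n_t):
            x = sample()
            total += balance_weight(t, x, pdfs, counts) * f(x) / (n_t * pdf(x))
    return total

# Example: combine uniform sampling (p = 1) with a linear density p(x) = 2x
# to integrate f(x) = 3x^2 over [0, 1]; the exact value is 1.
random.seed(0)
est = mis_estimate(lambda x: 3.0 * x * x,
                   samplers=[random.random, lambda: math.sqrt(random.random())],
                   pdfs=[lambda x: 1.0, lambda x: 2.0 * x],
                   counts=[5000, 5000])
```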
28
00:02:12.599 --> 00:02:16.833
This basic MIS estimator can be improved further, for
example by
29
00:02:16.833 --> 00:02:20.166
finding better PDFs, by finding better weighting functions
30
00:02:20.166 --> 00:02:25.599
or, as we are doing, by finding better sample count
allocations across the different techniques.
31
00:02:26.800 --> 00:02:29.800
Previous work has already looked at those sample counts.
32
00:02:29.800 --> 00:02:33.900
For example, the oldest approach proposed to use some
domain-specific heuristics,
33
00:02:33.900 --> 00:02:38.933
where, for example, in a direct illumination setting, you
can look at the shininess of the surface
34
00:02:38.933 --> 00:02:43.166
to decide if you want to use more BSDF samples or more
shadow rays.
35
00:02:43.166 --> 00:02:50.366
These are of course not general, and they are also not
optimal. But they are simple and can be very effective.
36
00:02:51.366 --> 00:02:57.366
An alternative is to base the sample counts on the second
moments and/or variances of the different techniques.
37
00:02:57.366 --> 00:03:04.966
There are multiple ways to do this. One is to set the sample
counts n_t of technique t proportional to the efficiency,
38
00:03:04.966 --> 00:03:08.966
which is the inverse of the product of cost and variance.
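A toy version of such an efficiency-proportional allocation (the function and numbers are invented for illustration, not from the paper):

```python
def allocate_by_efficiency(budget, costs, variances):
    """Set n_t proportional to the efficiency 1 / (cost_t * variance_t),
    normalized so the counts roughly sum to the total budget."""
    efficiencies = [1.0 / (c * v) for c, v in zip(costs, variances)]
    total = sum(efficiencies)
    return [round(budget * e / total) for e in efficiencies]

# Example: technique 2 is twice as expensive at equal variance,
# so it receives half as many samples.
counts = allocate_by_efficiency(99, costs=[1.0, 2.0], variances=[1.0, 1.0])
```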
39
00:03:08.966 --> 00:03:12.166
These decisions are also not optimal,
40
00:03:12.166 --> 00:03:19.133
but they are still comparatively easy to do, although you
now need to estimate second moments and variances.
41
00:03:20.099 --> 00:03:24.933
A way to achieve something that is closer to the actual
optimum is to use convex optimization.
42
00:03:25.733 --> 00:03:31.766
If we approximate the variance by the second moment, and if
we are using the classic balance heuristic,
43
00:03:31.766 --> 00:03:35.733
then we actually have a convex function in the sample count
that we can optimize
44
00:03:35.733 --> 00:03:40.133
with either first order gradient descent, or some second
order Newton optimizer.
45
00:03:41.199 --> 00:03:47.233
and that achieves something that is much closer to the
optimum. But we are very limited,
46
00:03:47.233 --> 00:03:50.866
because we can't use arbitrary MIS weighting functions,
47
00:03:50.866 --> 00:03:54.966
it's a continuous optimization so binary decisions aren't
handled that nicely,
48
00:03:54.966 --> 00:03:59.199
and all of these approaches ignore sample cost.
49
00:04:00.266 --> 00:04:04.599
If we contrast this to the specific challenges of a
bidirectional algorithm,
50
00:04:04.599 --> 00:04:07.900
we see that none of the previous approaches will handle this
very well.
51
00:04:07.900 --> 00:04:12.933
For example, in a bidirectional algorithm, we have a large
number of light paths.
52
00:04:12.933 --> 00:04:17.300
But all of these paths are shared across all the pixels in
our image.
53
00:04:17.300 --> 00:04:22.400
So the cost to trace these millions of paths is amortized
over the pixel integrals.
54
00:04:22.966 --> 00:04:25.966
If we ignore that cost, then we will get bad optimization
results.
55
00:04:27.400 --> 00:04:31.500
And, additionally, while this number of light paths is a
global constant,
56
00:04:31.500 --> 00:04:37.699
we have other sample counts, like the number of shadow rays,
which we can control, for example, at the pixel level
57
00:04:37.766 --> 00:04:42.566
And we have seen in previous work that the classic balance
heuristic
58
00:04:42.566 --> 00:04:48.133
can perform quite poorly in a bidirectional setting, for
some corner cases at least.
59
00:04:48.133 --> 00:04:51.400
So we don't want to restrict ourselves to just the classic
balance heuristic,
60
00:04:51.400 --> 00:04:54.400
which rules out any convex optimization.
61
00:04:55.333 --> 00:05:03.266
And, finally, we also don't want to have to explicitly work
with the full product path PDFs.
62
00:05:03.266 --> 00:05:09.099
Because, in a bidirectional algorithm, a single technique
generates a full path from the camera to the light.
63
00:05:09.699 --> 00:05:15.933
And those high-dimensional products of multiple PDFs might
not even be representable in floating point.
64
00:05:15.933 --> 00:05:23.033
and they will also definitely cause numerical problems if
we, for example, add two PDF values onto each other.
65
00:05:25.199 --> 00:05:27.699
So how do we overcome those challenges?
66
00:05:27.699 --> 00:05:31.066
Our method is very simple. It's a brute-force search,
67
00:05:31.066 --> 00:05:36.766
where we start with a set of fixed candidate sample counts
for all the possible techniques.
68
00:05:37.599 --> 00:05:43.000
And then we run a very quick pilot iteration -
unidirectional path tracing in our case -
69
00:05:43.000 --> 00:05:47.766
where we efficiently estimate the second moments of all of
these candidates,
70
00:05:47.766 --> 00:05:51.000
and then we combine that with a cost heuristic,
71
00:05:51.000 --> 00:05:55.033
that predicts the number of ray-tracing and shading
operations to approximate the render time.
72
00:05:55.033 --> 00:05:58.033
And then we simply pick the best one.
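As an illustration, the search itself is tiny; everything interesting is in the moment and cost estimates, which are mocked up here with made-up formulas (the candidate grid and numbers are invented, not from the paper):

```python
from itertools import product

def pick_best(candidates, moment_estimate, cost_estimate):
    """Brute-force search: score each candidate sample-count configuration
    by its inverse efficiency (cost * second moment), pick the minimum."""
    return min(candidates, key=lambda c: cost_estimate(c) * moment_estimate(c))

# Hypothetical candidate grid: (number of light paths, shadow rays per vertex).
candidates = list(product([0, 100_000, 500_000], [0, 1, 4]))

# Toy stand-ins for the estimates a pilot iteration would provide:
moment = lambda c: 1.0 / (1.0 + c[0] / 1e5 + c[1])
cost = lambda c: 1.0 + c[0] / 2e5 + 0.1 * c[1]
best = pick_best(candidates, moment, cost)
```

With these made-up numbers, the search favors zero light paths but several shadow rays, i.e. a path-tracing-like configuration.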
73
00:05:58.800 --> 00:06:04.833
This approach is inspired by the simplest, and possibly
stupidest, way to achieve this optimization.
74
00:06:04.833 --> 00:06:08.433
For example, if you wanted to find the best number of light
paths,
75
00:06:08.433 --> 00:06:12.533
what we could do is we could start with a bunch of sensible
options
76
00:06:12.533 --> 00:06:16.033
and then, with every single one of them, render an image
77
00:06:16.033 --> 00:06:19.033
estimate the pixel variance
78
00:06:19.033 --> 00:06:21.366
and measure the render time.
79
00:06:21.366 --> 00:06:27.266
This will give us good estimates of the efficiency and tell
us which one of those sample counts is the best.
80
00:06:27.833 --> 00:06:30.833
But it will also require us to render dozens or hundreds of
images
81
00:06:31.266 --> 00:06:33.533
with a full-blown bidirectional algorithm.
82
00:06:33.533 --> 00:06:35.699
So that's not at all practical.
83
00:06:36.033 --> 00:06:39.733
But we can make this practical, by rendering only a single
image,
84
00:06:39.733 --> 00:06:42.166
with an arbitrary configuration
85
00:06:42.166 --> 00:06:46.599
and use the samples from that to efficiently estimate the
second moments
86
00:06:46.599 --> 00:06:48.766
as an approximation of the variance
87
00:06:48.766 --> 00:06:51.766
of all of the candidates.
88
00:06:51.766 --> 00:06:54.766
And then we can approximate the render time by a cost
heuristic
89
00:06:54.766 --> 00:07:00.266
where we have measured the number of ray-tracing and shading
operations as a heuristic value,
90
00:07:00.266 --> 00:07:07.166
and achieve an optimization outcome very similar to that of
this simple and very slow approach,
91
00:07:07.166 --> 00:07:10.166
but with just very, very low overhead.
92
00:07:10.166 --> 00:07:15.500
The key to make our method work is to have an efficient
estimation scheme for the second moment.
93
00:07:15.500 --> 00:07:21.166
We start by rendering with a pilot strategy m that uses m_t
samples from technique t.
94
00:07:21.300 --> 00:07:25.000
The second moment of this pilot is trivial to estimate.
95
00:07:25.000 --> 00:07:29.233
We simply square the sample weights of this pilot iteration.
96
00:07:30.166 --> 00:07:33.033
But, of course, we don't want the pilot's second moment,
97
00:07:33.033 --> 00:07:38.099
we want the second moments of all of our candidates n that
use n_t samples from technique t.
98
00:07:38.099 --> 00:07:41.933
We can get this equally easily and efficiently
99
00:07:41.933 --> 00:07:45.233
by taking the squared sample weights of the pilot
100
00:07:45.233 --> 00:07:48.733
and multiplying them by a correction factor.
101
00:07:49.566 --> 00:07:56.566
And this correction factor we have designed in a way to make
it both efficient to compute and numerically stable.
102
00:07:56.566 --> 00:08:02.633
The numerical stability is achieved by basing it only on MIS
weighting function values,
103
00:08:02.633 --> 00:08:06.933
and never ever considering explicit PDF values.
104
00:08:06.933 --> 00:08:11.933
The efficiency is achieved by introducing this proxy
strategy a,
105
00:08:11.933 --> 00:08:16.399
where we use some arbitrary sample count a_t from technique
t
106
00:08:16.399 --> 00:08:19.699
which we never actually use for sampling
107
00:08:19.699 --> 00:08:26.933
but just as a mathematical trick to write this correction
factor in terms of a single set of MIS weights.
108
00:08:26.933 --> 00:08:30.399
That means that for every sample, we just need to compute
109
00:08:30.399 --> 00:08:33.166
a few additional MIS weight terms
110
00:08:33.166 --> 00:08:36.000
and then we can compute the correction factors
111
00:08:36.000 --> 00:08:40.500
for every single candidate based on those few additional MIS
weight terms.
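To make the trick concrete: assuming the classic balance heuristic, a candidate's MIS weight at a sample point can be rebuilt exactly from the pilot's weight values, with no PDF values involved. A simplified sketch (my own notation, not the paper's code):

```python
def candidate_weight(t, pilot_w, m, n):
    """Balance-heuristic weight of candidate counts n at one sample point,
    rebuilt purely from the pilot's MIS weight values w_s^m:
      w_t^n = (n_t / m_t) w_t^m / sum_s (n_s / m_s) w_s^m
    No PDF values appear, which keeps this numerically stable."""
    denom = sum(n_s / m_s * w for n_s, m_s, w in zip(n, m, pilot_w))
    return (n[t] / m[t]) * pilot_w[t] / denom

# Sanity check against direct evaluation with explicit densities p_t
# (for illustration only; the estimator itself never needs these):
p, m, n = [0.3, 0.7], [1, 2], [4, 1]
pilot_norm = sum(mi * pi for mi, pi in zip(m, p))
pilot_w = [mi * pi / pilot_norm for mi, pi in zip(m, p)]
direct = n[0] * p[0] / sum(ni * pi for ni, pi in zip(n, p))
rebuilt = candidate_weight(0, pilot_w, m, n)
```

Squaring such rebuilt weights times the pilot's sample weights, and averaging with the appropriate per-technique factors, then yields a candidate's second-moment estimate.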
112
00:08:41.366 --> 00:08:45.366
So how does this work if we apply it to the full VCM
algorithm?
113
00:08:46.100 --> 00:08:50.866
First, we render a single sample per pixel with
unidirectional path tracing.
114
00:08:50.866 --> 00:08:57.233
We apply our moment estimation scheme to compute the second
moments of all our candidate sample counts.
115
00:08:57.233 --> 00:09:04.166
Then we optimize, on the image level, the number of light
paths n and the number of connections c.
116
00:09:05.033 --> 00:09:08.033
If the outcome tells us that path tracing is the way to go,
117
00:09:08.933 --> 00:09:14.600
we stick to unidirectional rendering forever, and just
render until our budget is consumed.
118
00:09:14.600 --> 00:09:20.333
Otherwise, we throw away this image from the unidirectional
pilot
119
00:09:20.333 --> 00:09:25.766
because it usually has a high amount of noise if the scene
actually requires bidirectional samples.
120
00:09:25.766 --> 00:09:30.100
Then we start over with another single sample per pixel
rendering
121
00:09:30.100 --> 00:09:35.899
using VCM with our optimized number of light paths and
number of connections,
122
00:09:35.899 --> 00:09:38.899
but also doing photon mapping in every single pixel.
123
00:09:38.899 --> 00:09:43.266
We again estimate the second moments of all of our
candidates.
124
00:09:43.266 --> 00:09:52.033
From those, we now first optimize per pixel whether or not
photon mapping would be useful in that pixel.
125
00:09:52.033 --> 00:09:55.100
Of course, because we just have a single sample in each
pixel,
126
00:09:55.100 --> 00:09:59.100
we need to apply some aggressive filtering to make this work
robustly.
127
00:09:59.100 --> 00:10:02.866
And, with this per-pixel decision fixed,
128
00:10:02.866 --> 00:10:07.266
we again optimize, on the image level, the number of light
paths and the number of shadow rays.
129
00:10:07.266 --> 00:10:11.466
And those are the parameters we then use to render until we
are done.
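The control flow of these two stages could be sketched like this (all function names are placeholders standing in for the real renderer stages, not the authors' API):

```python
def optimize_and_render(budget, pilot_pt, pilot_vcm,
                        optimize_image_level, optimize_per_pixel, render):
    """Two-stage parameter optimization; every argument is a callable
    standing in for a real renderer stage."""
    # Stage 1: one unidirectional sample per pixel, estimate candidate moments.
    moments = pilot_pt()
    num_paths, num_connections = optimize_image_level(moments)
    if num_paths == 0:
        # Pure path tracing wins: render unidirectionally until the budget is spent.
        return render("pt", {}, budget)
    # Stage 2: discard the pilot image, render one VCM sample per pixel
    # (merging enabled everywhere), then refine the decisions.
    moments = pilot_vcm(num_paths, num_connections)
    merge_mask = optimize_per_pixel(moments)          # per-pixel merging on/off
    num_paths, num_connections = optimize_image_level(moments)
    return render("vcm", {"light_paths": num_paths,
                          "connections": num_connections,
                          "merge": merge_mask}, budget)
```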
130
00:10:12.133 --> 00:10:15.133
So what good is this if we apply it in practice?
131
00:10:15.933 --> 00:10:22.100
Here you can see the statistics of our speed-up after 30s of
equal-time rendering
132
00:10:22.100 --> 00:10:28.933
of our method compared to unidirectional path tracing and
standard VCM with fixed parameters.
133
00:10:29.533 --> 00:10:35.533
Compared to unidirectional path tracing, in the worst-case
scenario out of all of our 20 test scenes,
134
00:10:35.533 --> 00:10:39.500
we are never slower than 2% less then unidirectional.
135
00:10:39.500 --> 00:10:43.533
This is because of the additional overhead from our
optimization.
136
00:10:43.533 --> 00:10:49.566
If we take some more care in optimizing this, it can be
reduced even further.
137
00:10:49.566 --> 00:10:54.800
On the other hand, compared to standard VCM, even in the
worst case we are already 12% faster
138
00:10:54.800 --> 00:11:00.866
because the full-blown VCM beast actually wastes a lot of
samples
139
00:11:00.866 --> 00:11:05.733
on sampling techniques that, for the one scene we are
rendering right now, are not useful at all.
140
00:11:05.733 --> 00:11:10.266
And, of course, in the best-case we are hundreds of times
faster than path tracing
141
00:11:10.266 --> 00:11:12.899
and 5 times faster than standard VCM.
142
00:11:12.899 --> 00:11:17.033
So let's check out some results in a specific scene.
143
00:11:17.033 --> 00:11:22.000
Here we see a diffuse - or mostly diffuse - interior scene
with lots of indirect illumination
144
00:11:22.000 --> 00:11:28.166
but also some glass bowl here that causes caustics and
reflections of those caustics.
145
00:11:28.166 --> 00:11:31.699
Our method reduces the number of light paths
146
00:11:31.699 --> 00:11:36.833
because here most of the light paths actually find the
visible region, so we don't need that many.
147
00:11:36.833 --> 00:11:41.166
Instead, it increases the number of shadow rays used for
connections
148
00:11:41.166 --> 00:11:44.133
because of all the indirect, diffuse illumination
149
00:11:44.133 --> 00:11:47.733
Also, we limit merging to mostly just the glass bowl
150
00:11:47.733 --> 00:11:51.966
There are of course some regions, due to the blur and also
due to noise in the estimates,
151
00:11:51.966 --> 00:11:55.899
where merging is turned on despite the fact that it is not
actually beneficial there.
152
00:11:56.500 --> 00:11:59.500
With this optimization, we achieve rendering performance
that's
153
00:11:59.500 --> 00:12:02.266
10 times faster than unidirectional path tracing
154
00:12:02.266 --> 00:12:05.266
and almost 3 times faster than a standard VCM.
155
00:12:07.600 --> 00:12:11.166
On the other hand, in this super simple diffuse exterior
scene,
156
00:12:11.166 --> 00:12:14.600
we achieve the same performance as a unidirectional path
tracer,
157
00:12:14.600 --> 00:12:17.533
because we never trace a single bidirectional sample.
158
00:12:17.533 --> 00:12:21.766
Compare that to bidirectional path tracing, which is 40%
slower here,
159
00:12:21.766 --> 00:12:27.433
or VCM, which is even 70% slower, because all of those
bidirectional samples are completely wasted.
160
00:12:27.433 --> 00:12:29.899
Of course, our method is not perfect.
161
00:12:29.899 --> 00:12:32.600
There is the overhead that we can reduce further by
162
00:12:32.600 --> 00:12:34.600
either reducing the number of candidates
163
00:12:34.600 --> 00:12:36.466
or adapting it to the scene
164
00:12:36.466 --> 00:12:40.566
or maybe by applying some binary search instead of the
brute-force approach.
165
00:12:40.566 --> 00:12:43.366
We can also improve our filtering
166
00:12:43.366 --> 00:12:51.100
and, most importantly, what we could also do to improve the
results further is to include the covariance.
167
00:12:51.100 --> 00:12:54.100
Because currently we are only optimizing with the second
moments
168
00:12:54.100 --> 00:12:58.466
and the impact the covariance has is shown in this example
down here.
169
00:12:58.466 --> 00:13:02.300
Here we have rendered this simple diffuse kitchen scene
170
00:13:02.300 --> 00:13:06.733
and we have optimized based on the second moments and full
variances
171
00:13:06.733 --> 00:13:09.566
from some very long prepass renderings
172
00:13:09.566 --> 00:13:13.933
and we show the number of light paths, number of connections
173
00:13:13.933 --> 00:13:16.333
the per-pixel merging decisions
174
00:13:16.333 --> 00:13:20.233
and the corresponding speed-ups over unidirectional path
tracing.
175
00:13:21.333 --> 00:13:26.266
As we can see, the number of light paths and connections is
exactly the same
176
00:13:26.266 --> 00:13:30.333
whether we optimize with the variances or with the moments.
177
00:13:30.333 --> 00:13:35.533
That's because those techniques do not have a lot of sample
correlation.
178
00:13:35.533 --> 00:13:37.399
But photon mapping does.
179
00:13:37.399 --> 00:13:41.600
And that's why our per-pixel merge decision with just the
second moments
180
00:13:41.600 --> 00:13:45.566
in this example is severely off. We're doing way too much
photon mapping.
181
00:13:45.566 --> 00:13:49.800
And that's because in this scene, we have a lot of
correlation
182
00:13:49.800 --> 00:13:55.233
so the second moment severely underestimates the variance in
photon mapping
183
00:13:55.233 --> 00:14:00.600
and then our optimizer thinks that photon mapping is much
better than it actually is.
184
00:14:02.066 --> 00:14:06.699
If we had a way to efficiently estimate, or at least
approximate, this covariance,
185
00:14:06.699 --> 00:14:11.633
we could achieve another 10% increase in performance in this
example.
186
00:14:13.166 --> 00:14:16.833
But, in conclusion, the question we wanted to answer is:
187
00:14:16.833 --> 00:14:22.399
"Can we automatically find the best, or at least good,
parameters for a bidirectional algorithm?"
188
00:14:22.399 --> 00:14:24.100
And the answer is: "Yes, we can."
189
00:14:24.100 --> 00:14:27.100
With a very simple search-based optimization,
190
00:14:27.100 --> 00:14:30.600
combined with an efficient scheme to estimate the second
moments,
191
00:14:30.600 --> 00:14:35.033
we can achieve consistent performance improvements over
both unidirectional path tracing
192
00:14:35.033 --> 00:14:38.100
and a standard VCM approach with fixed parameters.
193
00:14:39.133 --> 00:14:45.766
Our method never traces a single bidirectional sample in
cases where bidirectional sampling is not beneficial.
194
00:14:47.033 --> 00:14:53.466
And with that, I conclude my presentation. Thank you all
for your attention, and I am looking forward to some
interesting discussions.