Performance

In this assignment, you will be improving the performance of a raytracer.

A raytracer is a program which simulates the physics of light. Through a scene of objects, light rays are reflected and refracted to create a rendered image. However, unlike the real world, rays of light are not cast from light sources. Instead, they are cast from the camera and simulated through the scene towards light sources. That way, all the light rays which don't hit the camera do not need to be simulated. A raytracer is like a camera simulated in reverse.

The image below is raytraced with the reference implementation, on the reference hardware (listed below) in about 30 minutes. Below we say our reference render runs in 10 seconds, but that's only because the reference image (which you can find in your repository) is a lot lower quality.

In the image, the only light source is the lamp on the ceiling. The raytracer will programmatically scatter and reflect rays around all the objects in the scene. Since fewer rays hit the wall behind the tall block and find their way into the camera, a shadow is formed there. Well, I say "find their way into the camera", but the way a raytracer works is the other way around of course. Fewer rays from the camera hit the ceiling light there.

The graininess is formed since for each pixel, a couple of rays are traced with slightly different properties, which all reflect in slightly different ways. The more samples we take, the less grainy the image becomes.

Raytraced image

The template you get, started out as an acceptable raytracer. On our machine with:

  • 32Gb Ram
  • AMD Ryzen 9 5900HX with 8 cores + hyper-threading

It could render an image (with the same configuration as we give you) in under 10 seconds. Of course, the actual performance may differ on your hardware. However, we deliberately changed the code to make it extremely slow and inefficient. Now it's your job to find the performance issues and fix or mitigate them.

Assignment

Find and describe at least 5 sufficiently distinct ways to improve the performance of the code. Apply at least 3 of the techniques discussed in class. You can make any modification to the codebase you like, as long as you don't violate any of the following requirements:

  • The changed program should still output a raytraced image in render.bmp, which (ignoring differences due to random number generation) should be the same as the reference raytraced image.
  • You cannot hardcode the output, or any intermediate steps. The program should still actually perform raytracing.
  • You cannot import existing libraries which already implement (any substantial part of) raytracing.
  • If you use unsafe code, it should be accompanied by a (correct) safety disclaimer proving its soundness.

Note that it should not be necessary to change or understand the raytracing algorithm. Even if you don't fully understand know how raytracing works, you can still do this assignment.

Overview

In the project root there are two important folders. configurations and src. In configurations there are files that are loaded to configure the raytracer. In the src folder you will find a number of subdirectories which contain parts of the raytracer:

  • config contains the code to parse the configuration files
  • raytracer contains the code that casts rays into the scene from the camera
  • scene contains the code that keeps track of the entire scene, and all the triangles, and vertices in it
  • renderer contains code to construct the raytracer. Usually this is executed based on the configuration file
  • shader contains shaders. A shader is a function that's executed for every cast ray which basically turns it and the information gathered from it into a color
  • util contains some utilities, like an abstraction for 3d Vectors and a struct representing a color.
  • generator contains a sort-of scheduler. There are two options, basic which just sequentially fires rays for each pixel in the final image, while threaded will split up the work and divide it over a number of threads. For example, when there are supposed to be 5 threads, the image is carved up into 5 parts and each thread will raytrace a part.
  • datastructure contains code that can hold the scene in an efficient format. This format (a sort of 3d Tree) makes it possible to figure out whether a ray hit a triangle in logarithmic time instead of linear time. Note that we kept in this reasonably efficient datastructure, so you do not need to worry about optimizing it. You can assume it's reasonably efficient, and in any case it's not a simple datastructure, so it may take a lot of time to make it better.

Running the raytracer

You can simply use cargo run to execute the raytracer. By default, it uses configurations/reference.yml as a configuration file. You should keep the settings the same (changing them won't count as a performance improvement!). However, the default config specifies a 500x500 image. After some performance improvements, this should be fine. However, initially, this size will take ages to render. You can experiment with rendering at smaller sizes to test changes in your code more easily.

You will see that there is both a lib.rs and a main.rs file. The main.rs is so you can simply run cargo run. However, when you want to import the raytracer in benchmarks, you will actually use raytracer as a library through the lib.rs file. In both the main.rs and lib.rs files you will find a path to a configuration file that is used, which may be useful to change if you have multiple configurations. Note that we already removed quite a lot of configuration options, the original raytracer had more. Again, you shouldn't have to touch the configuration much.

Hint: it may help to start with some profiling to see what parts of the program are slow.

Grading requirements

See this mostly as an exercise in documenting and finding performance optimizations instead of necessarily improving the performance the most.

For a passing grade of a 6:

  • Apply at least 3 of the performance improvement techniques discussed in class
  • Apply at least 5 (distinct) performance optimizations
  • You are allowed to combine these optimizations. When you find a new way to improve the codebase, you can report on how it improved the code compared to a previous version which already had some optimizations.
  • Write at least 1 benchmark using criterion or another established benchmarking tool to evaluate the performance of your changes.
  • Write a report containing the following in a coherent story:
    • For every optimization, write at least 150 words in which you:
      • Describe what it is and how it works
      • Describe why it makes the code faster
      • Describe how you discovered this optimization. Why did you go for this approach? Did it have the effect you expected?
      • Show some kind of benchmark (does not need to be with criterion) which proves it made the code faster compared to a previous version of your code
    • You do not get any points if you do not write about it in your report.
    • In what order you discovered each of the optimizations.
    • What hardware you used to benchmark, and in what way your hardware affects the benchmarks you ran
    • When writing this report, you can apply the knowledge you gained when writing reports for Advanced Computing Systems if you took that course.

To gain extra points:

  • Find more ways to increase the performance of the code (depending on the complexity, your description and the effect it has, roughly 1 point per optimization)
  • Separate benchmarks (using criterion or established benchmarking tool) for each of your optimizations and the effect they have compared to the baseline or to the previous optimizations (1 point)
    • Make sure to substantiate why you are making the benchmarks and what you are comparing them to either previous ones or baseline.
  • After you hand your assignment in, we will benchmark your code against our original. If your code is faster than the reference you get 0.5 bonus. You can ask TAs to benchmark your code before you hand in. NOTE THAT THIS IS NOT THE POINT OF THIS ASSIGNMENT. It's a cool bonus, but the goal of this assignment is not to make a fast raytracer, but to find optimization possibilities in a large codebase, and describe those. Making an actually fast program is NOT THE MAIN GOAL.

Important:

Read the grading requirements carefully. The point of this assignment is NOT to make the fastest possible raytracer but to find ways to improve it. We mainly grade your report, if you wrote a bad report and have the fastest raytracer imaginable in the world, it still means you do not pass this assignment.