A common problem in real-time rendering is ordering draw/compute tasks (henceforth, tasks) in an optimal manner. The order of execution can greatly affect performance, for example by influencing the number of API calls. A lot of recent development of the OpenGL API has focused on reducing draw calls and “Approaching Zero Driver Overhead”. Likewise, it is common practice to group tasks by GLSL programs, FBOs and textures.
This article isn’t about indirect rendering or OpenGL per se, but about a better way of grouping tasks in an attempt to reduce the total number of API calls.
Furthermore, with asynchronous compute and multiple parallel command queues on the horizon, we should not regard task dispatching as a serial queue but as a scheduling problem.
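To make the baseline concrete, here is a minimal sketch of the common grouping practice mentioned above: sort tasks by the state they need, then count how many binds a dispatcher would issue. The `Task` struct, its integer ids and both function names are hypothetical and stand in for real GLSL program, FBO and texture handles; this is the naive approach, not the method the article goes on to describe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical task description: the ids stand in for a GLSL program,
// an FBO and a texture. None of these names come from the article.
struct Task {
    std::uint32_t program;
    std::uint32_t fbo;
    std::uint32_t texture;
};

// Counts the binds issued by a dispatcher that rebinds a piece of state
// only when it differs from what the previous task used.
std::size_t count_binds(const std::vector<Task>& tasks) {
    std::size_t binds = 0;
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        const bool first = (i == 0);
        if (first || tasks[i].program != tasks[i - 1].program) ++binds;
        if (first || tasks[i].fbo     != tasks[i - 1].fbo)     ++binds;
        if (first || tasks[i].texture != tasks[i - 1].texture) ++binds;
    }
    return binds;
}

// Baseline grouping: lexicographic sort by (program, fbo, texture).
void sort_by_state(std::vector<Task>& tasks) {
    std::sort(tasks.begin(), tasks.end(), [](const Task& a, const Task& b) {
        return std::tie(a.program, a.fbo, a.texture) <
               std::tie(b.program, b.fbo, b.texture);
    });
}
```

A fixed sort key like this favours whichever state comes first in the tuple, which is one reason to look for a better ordering.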
Continue reading “Ordering Rendering Tasks, Ants and Graph Theory (Part 1)”
Sparsely allocated (partially resident) textures are exposed through an OpenGL extension (GL_ARB_sparse_texture), allowing us to allocate virtual textures as a portion of the GPU’s virtual address space, and to commit only specific pages to physical memory as needed.
This post aims to cover the voxelization process, using a sparse 3D image serving as a lattice for the voxels’ data structure.
Continue reading “Dense Voxelization into a Sparsely Allocated 3D Lattice”
Been messing around with non-analytical BRDFs for the last few days.
Thanks to Pab Ltd for the data.
A Bi-Directional Reflectance Distribution Function (BRDF) is a 4D function that defines surface reflection. To avoid the fourth dimension, I generated a 3D isotropic approximation from Pab’s data. Unlike anisotropic surfaces (e.g. brushed metal), the reflection of isotropic surfaces is invariant to the tangential direction, so only the difference between the two ϕ angles of the incident and exitant vectors in spherical coordinates matters, and the two angles can be merged into one. This results in a 3D function which can be easily encoded as a 3D texture.
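As a rough illustration of merging the two ϕ angles, the sketch below reduces a 4D direction pair to a 3D lookup coordinate. The `BrdfCoord` struct and `to_isotropic` function are hypothetical names, and folding the azimuth difference into [0, π] is an extra assumption (bilateral symmetry) that may not match how the actual texture in the post is laid out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 3D lookup coordinate for an isotropic BRDF table.
struct BrdfCoord {
    double theta_in;   // polar angle of the incident direction
    double theta_out;  // polar angle of the exitant direction
    double phi_diff;   // relative azimuth, folded into [0, pi]
};

constexpr double kPi = 3.14159265358979323846;

// Reduces (theta_in, phi_in, theta_out, phi_out) to three coordinates.
BrdfCoord to_isotropic(double theta_in, double phi_in,
                       double theta_out, double phi_out) {
    // For an isotropic surface only the azimuth difference matters.
    double d = std::fmod(phi_out - phi_in, 2.0 * kPi);
    if (d < 0.0) d += 2.0 * kPi;     // wrap into [0, 2*pi)
    if (d > kPi) d = 2.0 * kPi - d;  // fold into [0, pi]; assumes bilateral symmetry
    return {theta_in, theta_out, d};
}
```

Rotating both directions by the same azimuth leaves the result unchanged, which is the property that lets the data fit in a 3D texture.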
Continue reading “Playing with BRDFs”
Wait-free algorithms attract vast interest and are an area of intense research, the motivation being that true lock-free algorithms and data structures provide great benefits in terms of performance and scalability over lock-based variants. However, designing lock-free systems isn’t a simple matter.
Recently I wrote a lock-free hash-map and in this post I will describe the process in detail.
The reader should have basic familiarity with hash tables, lock-free concurrency theory and the C++11 atomic library. From now on, by “lock-free” I mean true “wait-free”.
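For readers who want a feel for the C++11 atomic library in this setting, below is a minimal sketch of a fixed-capacity, open-addressing table that claims slots with a single compare-and-swap. It is not the design described in the post: the class name `FixedAtomicMap`, the identity hash, the reserved key 0 and the lack of deletion and resizing are all simplifying assumptions made for illustration.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity table; key 0 is reserved to mean "empty slot".
template <std::size_t Capacity>
class FixedAtomicMap {
public:
    FixedAtomicMap() {
        for (auto& s : slots_) {
            s.key.store(0, std::memory_order_relaxed);
            s.value.store(0, std::memory_order_relaxed);
        }
    }

    // Inserts or overwrites. Returns false if key is 0 or the table is full.
    bool put(std::uint32_t key, std::uint32_t value) {
        if (key == 0) return false;
        for (std::size_t probe = 0; probe < Capacity; ++probe) {
            Slot& s = slots_[(key + probe) % Capacity];  // identity hash, linear probing
            std::uint32_t expected = 0;
            // Try to claim an empty slot; on failure `expected` holds the current key.
            if (s.key.compare_exchange_strong(expected, key,
                                              std::memory_order_acq_rel) ||
                expected == key) {
                s.value.store(value, std::memory_order_release);
                return true;
            }
        }
        return false;
    }

    // Returns true and writes `out` if the key is present.
    bool get(std::uint32_t key, std::uint32_t& out) const {
        if (key == 0) return false;
        for (std::size_t probe = 0; probe < Capacity; ++probe) {
            const Slot& s = slots_[(key + probe) % Capacity];
            const std::uint32_t k = s.key.load(std::memory_order_acquire);
            if (k == key) {
                out = s.value.load(std::memory_order_acquire);
                return true;
            }
            if (k == 0) return false;  // an empty slot ends the probe chain
        }
        return false;
    }

private:
    struct Slot {
        std::atomic<std::uint32_t> key;
        std::atomic<std::uint32_t> value;
    };
    std::array<Slot, Capacity> slots_;
};
```

Neither operation retries in a loop, so each call finishes after at most `Capacity` probes. One caveat of this simplified layout: the key is published before the value, so a concurrent reader may briefly see a freshly claimed key with the initial value 0.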
Continue reading “Designing a Lock-Free, Wait-Free Hash Map”