Monday, August 2, 2010

[Feature]Photon Mapping

I use photon mapping to compute the indirect illumination. It is indeed much faster than path tracing. The following images are rendered with a global photon map and a caustic photon map.

[Cornell Box]
Parameters:
1000 samples direct lighting
10000 photons
1000 photons for irradiance estimation
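
The "photons for irradiance estimation" parameter presumably refers to the usual nearest-neighbor density estimate used in photon mapping: gather the k photons closest to the shading point and divide their total power by the area of the disc that contains them. The following is only a minimal sketch under that assumption. The `Photon` struct and `estimateIrradiance` are hypothetical names, a single power channel is used instead of RGB, and a brute-force search stands in for the kd-tree a real photon map would use.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical photon record: a position on a surface and the power it carries.
struct Photon {
    float x, y, z;
    float power;  // one channel for brevity; a renderer would store RGB
};

// Estimate the irradiance at (px, py, pz) from the k nearest photons:
// E ~ (sum of their power) / (pi * r^2), with r the distance to the k-th one.
float estimateIrradiance(const std::vector<Photon>& photons,
                         float px, float py, float pz, std::size_t k) {
    assert(k > 0);
    if (photons.empty()) return 0.0f;
    k = std::min(k, photons.size());

    // Pair every photon's squared distance with its power.
    std::vector<std::pair<float, float> > distPower;
    distPower.reserve(photons.size());
    for (std::size_t i = 0; i < photons.size(); ++i) {
        float dx = photons[i].x - px;
        float dy = photons[i].y - py;
        float dz = photons[i].z - pz;
        distPower.push_back(std::make_pair(dx * dx + dy * dy + dz * dz,
                                           photons[i].power));
    }

    // Move the k closest photons to the front; element k-1 is the k-th nearest.
    std::nth_element(distPower.begin(), distPower.begin() + (k - 1),
                     distPower.end());
    float r2 = distPower[k - 1].first;
    if (r2 <= 0.0f) return 0.0f;  // all gathered photons sit exactly on the point

    float flux = 0.0f;
    for (std::size_t i = 0; i < k; ++i) flux += distPower[i].second;
    return flux / (3.14159265f * r2);
}
```

A larger k gives a smoother but blurrier estimate, which is the usual trade-off behind this kind of parameter.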


















Parameters:
1000 samples direct lighting
10000 photons
1000 photons for irradiance estimation


















Parameters:
2000 samples direct lighting
1500x1500 HDR image as background
10000 photons
1000 photons for irradiance estimation

















Wednesday, July 21, 2010

[Path Tracing]Environment Mapping

After adding environment mapping, everything looks good ^^.

Path tracing isn't a good method for caustics, so the caustics are still noisy even though I used comparatively more samples.

Parameters:
700 samples path tracing
1500x1500 HDR image as background



















Image based Lighting
Parameters:
700 samples path tracing
50 samples image based lighting
1500x1500 HDR image as background



















Image based Lighting
Parameters:
1500 samples path tracing
50 samples image based lighting
1500x1500 HDR image as background



Thursday, July 15, 2010

[Progress]Path Tracing

After reviewing the whole concept of path tracing, I understand it better now.

1. The original path tracing doesn't separate the direct illumination part from the indirect illumination part.

2. The direct illumination part is there to smooth the result. (With the sampled ray alone, a sample is totally black when the ray misses the luminaire and really bright when it hits.)
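
To illustrate point 2, here is a small sketch that is not taken from my renderer. It estimates the irradiance at the origin (normal +z) under a small square light in two ways: by shooting a cosine-weighted hemisphere ray and checking whether it hits the light, and by sampling a point on the light directly (what a shadow ray does). The scene, the constants and all function names are assumptions made up for this example, and occlusion is ignored.

```cpp
#include <cassert>
#include <cmath>
#include <random>

const double kPi = 3.14159265358979323846;
const double kHalfSide = 0.25;  // square light at height z = 1, facing down
const double kRadiance = 1.0;   // emitted radiance of the light

struct Stats { double mean; double variance; };

// Uniform number in [0, 1) built directly from the 32-bit generator output.
double uniform01(std::mt19937& rng) { return rng() / 4294967296.0; }

// Hemisphere sampling: one cosine-weighted ray from the origin.
// The sample is pi * L when the ray hits the light and zero otherwise.
double hemisphereSample(std::mt19937& rng) {
    double u1 = uniform01(rng);
    double u2 = uniform01(rng);
    double r = std::sqrt(u1);
    double phi = 2.0 * kPi * u2;
    double dz = std::sqrt(1.0 - u1);
    if (dz <= 0.0) return 0.0;
    double x = r * std::cos(phi) / dz;  // hit point on the plane z = 1
    double y = r * std::sin(phi) / dz;
    bool hit = std::fabs(x) <= kHalfSide && std::fabs(y) <= kHalfSide;
    return hit ? kPi * kRadiance : 0.0;
}

// Light sampling: pick a uniform point on the light and weight it by the
// geometry term. Here cos * cos / d^2 = (1/d) * (1/d) / d^2 = 1 / d2^2.
double lightSample(std::mt19937& rng) {
    double x = (2.0 * uniform01(rng) - 1.0) * kHalfSide;
    double y = (2.0 * uniform01(rng) - 1.0) * kHalfSide;
    double d2 = x * x + y * y + 1.0;
    double area = 4.0 * kHalfSide * kHalfSide;
    return kRadiance * area / (d2 * d2);
}

// Mean and variance of n samples drawn from one of the estimators above.
Stats run(double (*sample)(std::mt19937&), int n, unsigned seed) {
    assert(n > 1);
    std::mt19937 rng(seed);
    double sum = 0.0, sumSq = 0.0;
    for (int i = 0; i < n; ++i) {
        double v = sample(rng);
        sum += v;
        sumSq += v * v;
    }
    double mean = sum / n;
    return {mean, sumSq / n - mean * mean};
}
```

Calling `run(hemisphereSample, n, seed)` and `run(lightSample, n, seed)` with the same n should give similar means, but the hemisphere version only ever returns zero or pi * L, so its variance is much larger. That is the black-or-bright behavior described above.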

Here is the result. Path tracing with a glass ball.

Parameters:
1 shadow ray
1 hemisphere sample
15 bounces max (if none of the bounces hits the light)
200 samples per pixel


Thursday, July 1, 2010

[Progress]BVH with SSE

I finally finished my BVH with SSE implementation, but for a simple model (66454 triangles) it doesn't seem to help much.

Here is the outcome:

Without SSE
==BVH Info==
1-element leaf: 1838
2-element leaf: 18209
3-element leaf: 2876
4-element leaf: 2811
Total: 27047 bounding boxes
Building time: 2.453573 sec
==Rendering Info==
Rendering time: 1.565233 sec
Ray-Box intersection: 19658128
Ray-Triangle intersection: 1393575

With SSE
==BVH Info==
1-element leaf: 0
2-element leaf: 1
3-element leaf: 0
4-element leaf: 16613
Total: 16614 bounding boxes
Building time: 2.569810 sec
==Rendering Info==
Rendering time: 1.429016 sec
Ray-Box intersection: 19821888
Ray-Triangle intersection: 673405

Two weeks for 0.1 sec improvement. Cool!!

BVH (dark blue lines) for the Sponza model


















Rendered Image

Tuesday, June 29, 2010

[Note]VTK InputConnection

My final solution is to use vtkFixedPointVolumeRayCastMapper to map the 4-channel input data into a 4-channel renderable volume.

//==the following is a guess and is perhaps wrong==

According to my experiments, recent versions of VTK (>5.0) support a 3-channel color transfer function in the class "vtkVolumeProperty". The problem is that no matter how many components the input has (I fed in 4-component data and checked that it was still 4-component after the reader loaded it), the mapper (vtkVolumeMapper) seems to merge the 4 components into one. I checked the source code of vtkVolumeMapper.cxx, and I think it very likely happens at

vtkImageData *Input = vtkImageData::SafeDownCast(genericInput);//it might lose the four channels here when it down casts to vtkImageData.

and the following

this->setInputConnection(0, input);//then it passes the merged data to port 0 (the first port).

This might be why the 3-channel color transfer function doesn't work.

[Progress]Aligned Memory allocation

For now, I use _mm_load_ps to load the data into the __m128 datatype. So far I haven't hit the problem that requires _mm_malloc() for 16-byte aligned memory allocation, but the implementation isn't finished yet, so there may be problems I haven't found.
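
For reference, _mm_load_ps expects a 16-byte aligned address, while _mm_loadu_ps accepts any address. One possible reason the problem hasn't shown up is that the buffers just happen to be 16-byte aligned on this platform, but that is a guess. The sketch below shows both variants. The function names are made up for this example, and it assumes GCC or Clang on x86, where <xmmintrin.h> also provides _mm_malloc and _mm_free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // SSE: __m128, _mm_load_ps, _mm_malloc, _mm_free, ...

// True if the pointer sits on a 16-byte boundary.
bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Add two groups of 4 floats with aligned loads. Both inputs must be
// 16-byte aligned. The result lives in memory obtained from _mm_malloc,
// so the caller has to release it with _mm_free.
float* addAligned(const float* a, const float* b) {
    assert(isAligned16(a) && isAligned16(b));
    float* out = static_cast<float*>(_mm_malloc(4 * sizeof(float), 16));
    if (out == NULL) return NULL;  // allocation failed
    __m128 va = _mm_load_ps(a);    // aligned load
    __m128 vb = _mm_load_ps(b);
    _mm_store_ps(out, _mm_add_ps(va, vb));  // aligned store
    return out;
}

// Same addition with unaligned loads and stores: works for any address.
void addUnaligned(const float* a, const float* b, float* out) {
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

For stack or member arrays, `alignas(16) float data[4];` (C++11) gives the same alignment guarantee without a heap allocation.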

This time I overloaded the operators, which removes a lot of redundant code and makes the code easier to understand (even though in practice it's not faster).
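
The operator overloading can look roughly like the sketch below. `Float4` is a hypothetical wrapper name for this example, not the class from my ray tracer, and only a few operators are shown.

```cpp
#include <cassert>
#include <xmmintrin.h>

// Hypothetical thin wrapper around __m128 so that math reads like scalar code.
struct Float4 {
    __m128 v;

    Float4() : v(_mm_setzero_ps()) {}
    explicit Float4(__m128 m) : v(m) {}
    // _mm_set_ps takes the lanes from high to low, hence the reversed order.
    Float4(float a, float b, float c, float d) : v(_mm_set_ps(d, c, b, a)) {}

    // Read one lane back (slow path, meant for debugging and tests).
    float lane(int i) const {
        assert(i >= 0 && i < 4);
        alignas(16) float tmp[4];
        _mm_store_ps(tmp, v);
        return tmp[i];
    }
};

inline Float4 operator+(Float4 a, Float4 b) { return Float4(_mm_add_ps(a.v, b.v)); }
inline Float4 operator-(Float4 a, Float4 b) { return Float4(_mm_sub_ps(a.v, b.v)); }
inline Float4 operator*(Float4 a, Float4 b) { return Float4(_mm_mul_ps(a.v, b.v)); }

// Lane-wise minimum and maximum, handy for slab tests against bounding boxes.
inline Float4 min4(Float4 a, Float4 b) { return Float4(_mm_min_ps(a.v, b.v)); }
inline Float4 max4(Float4 a, Float4 b) { return Float4(_mm_max_ps(a.v, b.v)); }
```

With this, an expression such as `a * b + c` replaces nested _mm_add_ps(_mm_mul_ps(...)) calls. With optimization enabled the wrappers are normally inlined, which fits the observation that the code gets more readable but not faster.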

Hope I can finish the BVH with SSE tomorrow.

Monday, June 28, 2010

[Progress]BVH with AABB

I finished two versions of the BVH, but they're mostly the same.
Here is the basic program flow:

For each BVH node (including the root)
    Initialize the BVH node (find the max and min corners)
    Sort along the three axes and find the best gap to split into two child nodes
    (using the surface area to find the best event)

If I set the initial cost to infinity, every leaf node will contain only one element.
If I instead set it to a value computed from the ray-box intersection cost and the ray-triangle intersection cost, a leaf node may contain more than one element; with well-chosen parameters, this can perform better.
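
The split search described above can be sketched as follows. This is a simplified illustration, not my actual implementation: it sorts by box center, uses the common surface area heuristic cost formula, and all names (`Aabb`, `Split`, `findBestSplit`, `makeBox`, `boxCost`, `triCost`) are made up for this example. The initial cost is the cost of keeping everything in one leaf. Replacing it with infinity forces a split whenever one is possible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Aabb {
    float lo[3], hi[3];
    Aabb() {  // start empty so that grow() works on the first box
        for (int a = 0; a < 3; ++a) {
            lo[a] = std::numeric_limits<float>::max();
            hi[a] = -std::numeric_limits<float>::max();
        }
    }
    void grow(const Aabb& o) {
        for (int a = 0; a < 3; ++a) {
            lo[a] = std::min(lo[a], o.lo[a]);
            hi[a] = std::max(hi[a], o.hi[a]);
        }
    }
    float surfaceArea() const {
        float dx = hi[0] - lo[0], dy = hi[1] - lo[1], dz = hi[2] - lo[2];
        return 2.0f * (dx * dy + dy * dz + dz * dx);
    }
    float center(int a) const { return 0.5f * (lo[a] + hi[a]); }
};

Aabb makeBox(float x0, float y0, float z0, float x1, float y1, float z1) {
    Aabb b;
    b.lo[0] = x0; b.lo[1] = y0; b.lo[2] = z0;
    b.hi[0] = x1; b.hi[1] = y1; b.hi[2] = z1;
    return b;
}

// axis == -1 means "do not split, make a leaf".
struct Split { int axis; std::size_t leftCount; float cost; };

// boxes holds one bounding box per primitive of the current node.
Split findBestSplit(std::vector<Aabb> boxes, float boxCost, float triCost) {
    const std::size_t n = boxes.size();
    assert(n > 0);
    Aabb parent;
    for (std::size_t i = 0; i < n; ++i) parent.grow(boxes[i]);

    // Initial cost: test every primitive of the node, no split.
    Split best = {-1, n, triCost * n};
    if (n < 2 || parent.surfaceArea() <= 0.0f) return best;

    std::vector<float> rightArea(n);
    for (int axis = 0; axis < 3; ++axis) {
        std::sort(boxes.begin(), boxes.end(),
                  [axis](const Aabb& a, const Aabb& b) {
                      return a.center(axis) < b.center(axis);
                  });
        // Sweep from the right: rightArea[i] bounds boxes[i .. n-1].
        Aabb acc;
        for (std::size_t i = n; i-- > 0;) {
            acc.grow(boxes[i]);
            rightArea[i] = acc.surfaceArea();
        }
        // Sweep from the left and evaluate every gap between neighbors.
        Aabb left;
        for (std::size_t i = 1; i < n; ++i) {
            left.grow(boxes[i - 1]);
            float cost = 2.0f * boxCost +
                         triCost * (left.surfaceArea() * i +
                                    rightArea[i] * (n - i)) /
                             parent.surfaceArea();
            if (cost < best.cost) best = {axis, i, cost};
        }
    }
    return best;
}
```

The ratio between `boxCost` and `triCost` decides how early the recursion stops: a relatively expensive box test produces fatter leaves.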

For the SSE implementation, each BVH leaf "MUST" hold 4 elements or the performance will be compromised. So I also implemented a version in which every leaf (except the leftmost or rightmost one, when the element count is not a multiple of 4) contains exactly 4 elements. It's not hard to implement, but both situations need care (putting the remainder on the left or on the right).

Thursday, June 24, 2010

Trick for releasing memory for C++ vector

C++ vector has an annoying property: neither clear() nor erase() releases the memory it holds, and there is "NO" direct way to do so. I searched the internet and found a trick that does it with swap(). Here is sample code by Tiyano (a student from Taiwan) showing how it works.

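#include <cstdio>
#include <vector>
using std::vector;
int main(void) {
    vector<int> A, B;  // element type assumed to be int
    vector<int> *C;
    A.resize(1000);
    printf("%zu %zu\n", A.size(), A.capacity());
    A.clear();  // size drops to 0, the capacity is kept
    printf("%zu %zu\n", A.size(), A.capacity());
    A = B;  // assigning an empty vector typically keeps the capacity too
    printf("%zu %zu\n", A.size(), A.capacity());
    A.swap(B);  // A takes B's empty buffer; B holds the old one until it is destroyed
    printf("%zu %zu\n", A.size(), A.capacity());
    C = new vector<int>;
    C->resize(1000);
    printf("%zu %zu\n", C->size(), C->capacity());
    delete C;  // destroying the vector frees its buffer
    C = new vector<int>;
    printf("%zu %zu\n", C->size(), C->capacity());
    delete C;
    return 0;
}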

I'm not 100% sure the second method (new and delete) works in my case, because my elements are pointers, so I adopted the first method (swap), which works.
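
The swap trick is often written as a one-liner that swaps with an empty temporary, so no second named vector is needed. Below is a small sketch with made-up helper names. The second helper covers my case of a vector of raw pointers, under the assumption that the vector owns the objects and that they were created with new.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Release the memory held by v: the empty temporary takes over v's buffer
// and frees it when the temporary is destroyed at the end of the statement.
template <typename T>
void releaseMemory(std::vector<T>& v) {
    std::vector<T>().swap(v);
}

// For a vector that owns raw pointers: the swap only frees the array of
// pointers, so the pointed-to objects have to be deleted first.
template <typename T>
void deleteAndRelease(std::vector<T*>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) delete v[i];
    std::vector<T*>().swap(v);
    assert(v.empty());
}
```

C++11 also adds `shrink_to_fit()`, but the standard only defines it as a non-binding request, so the swap idiom remains the portable way to drop the capacity.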


Tuesday, June 22, 2010

[VTK]Volume Rendering

Intro:
VTK (Visualization Toolkit) is an open-source toolkit for visualization.

I'm now using it to do volume rendering.
It works fine with 1 scalar per point, but I can't find a way to do it with an RGBA dataset.
Below is what I found on the internet related to this topic.


Here is a link I found which might help with volume rendering data input.

This is a thread for converting SimpleRayCast example from tcl to C++


Yeah!! I found a thread that is really useful. One of the authors of VTK confirms that it supports 4-component data.
Here is the thread:
http://markmail.org/message/7abjuh22f7eywxty#query:vtk%20imagedata%204%20components+page:1+mid:7abjuh22f7eywxty+state:results

And here is my final solution.
Use vtkFixedPointVolumeRayCastMapper (vtkVolumeTextureMapper3D doesn't work on my computer) and also call IndependentComponentsOff() on the volumeProperty. The first three components are then read as RGB, and the fourth channel is mapped through the opacity function.

SSE Implementation

This is an introduction from Toshiya that explains how to implement a ray tracer with SSE.

My experiment:

Add two float[4] arrays together 10e8 times and compare that with adding two __m128 values with _mm_add_ps().

Here is what I found through my experiment:
1. For the SSE implementation to really pay off, you have to enable optimization (set an optimization level when compiling with gcc). The performance can be at most nearly 4 times that of the implementation without SSE.

2. In my experiment, there is no big difference between using __m128 directly and using __m128* with memory allocated by _mm_malloc(sizeof(__m128),16).
(From Toshiya: there's no difference between those two. The only thing that matters is how the non-__m128 data is loaded. Using _mm_load with _mm_malloc compiles to faster instructions.)
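
A reduced version of such an experiment could look like the sketch below. It is not my original test program: the function names are invented for this example, and the timing helper uses std::chrono from C++11.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <xmmintrin.h>

// Add b to acc n times, one float at a time.
void addScalar(float* acc, const float* b, std::size_t n) {
    assert(acc != NULL && b != NULL);
    for (std::size_t i = 0; i < n; ++i)
        for (int j = 0; j < 4; ++j) acc[j] += b[j];
}

// Same work with one SSE addition per iteration (4 lanes at once).
void addSse(float* acc, const float* b, std::size_t n) {
    assert(acc != NULL && b != NULL);
    __m128 va = _mm_loadu_ps(acc);  // unaligned loads: any address works
    __m128 vb = _mm_loadu_ps(b);
    for (std::size_t i = 0; i < n; ++i) va = _mm_add_ps(va, vb);
    _mm_storeu_ps(acc, va);
}

// Wall-clock seconds spent in f().
template <typename F>
double secondsFor(F f) {
    std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
    f();
    std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

A call such as `secondsFor([&] { addSse(acc, b, n); })` times one variant. Be aware that with optimization enabled the compiler may vectorize the scalar loop by itself, so the measured gap depends on the compiler and its flags.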