Handmade Hero»Episode Guide
Continuing Streamlining the Raycaster
0:01 Welcome to the stream
🗩
0:06 Determine to continue with optimisation
🏃
0:57 Recap yesterday's welding optimisation in GridRayCast()
📖
4:09 Consider optimisation potential of the SpecTexel load / stores in GridRayCast()
📖
7:22 Illustrate the possibility of loading in the SpecTexel values and InvBlend at the outset
9:23 Seek easier optimisation opportunities in GridRayCast()
📖
11:43 Simplify out OcclusionN from GridRayCast()
12:27 Seek optimisation with OcclusionD and RayD in GridRayCast()
📖
18:48 Streamline the SignRayD and NormalXYZ computations in GridRayCast()
25:35 Reacquaint ourselves with the hit testing and shuffling code in GridRayCast()
📖
30:30 Streamline the Normal selection in GridRayCast()
34:46 Check out the port usage of various instructions, noting that we may get an AND for free¹
📖
40:23 Continue to streamline the Normal selection in GridRayCast(), introducing a NormalTable, before toggling back to the old code
48:12 Run successfully
🏃
48:31 Streamline the ProbeSampleNSingle usage in GridRayCast()
55:01 Run successfully, and consider unit testing the grid ray cast
🏃
56:49 Treat ProbeSampleNSingle wide in GridRayCast()
1:01:34 Run successfully
🏃
1:01:50 Treat OcclusionD wide in GridRayCast()
1:03:28 Run successfully
🏃
1:04:02 Finish streamlining the Normal selection in GridRayCast()
1:07:46 Run successfully
🏃
1:08:13 Temporarily try hard-setting the NormalIndex to 0 in GridRayCast()
1:08:27 We can't tell it's wrong
🏃
1:08:56 Let GridRayCast() set the computed NormalIndex and make a note to test this
1:09:36 hhlightprof total seconds elapsed: 4.534789
🏃
1:10:20 Simplify out tUpdateBlend in GridRayCast()
1:12:49 Augment light_atlas with StrideXYZ_4x and VoxelDim_4x
1:17:45 Run successfully
🏃
1:17:54 Make MakeLightAtlas() set the StrideXYZ and VoxelDim for GridRayCast() to load out of that atlas, changing their format in light_atlas to be an array of 4
1:20:37 Run successfully
🏃
1:20:46 hhlightprof total seconds elapsed: 4.513986
🏃
1:22:09 Remove the old AABBRayCast()
1:24:42 Run successfully
🏃
1:24:51 Prepare lighting_box to pack down to 64 bits total, propagating this change
1:28:29 Run successfully
🏃
1:28:38 Clean out the sprawl from FullCast()
1:36:20 Run successfully
🏃
1:36:25 Look into welding the GridRayCast() calling loop from FullCast() into GridRayCast() itself
📖
1:39:21 hhlightprof total seconds elapsed: 4.511818
🏃
1:39:36 Extend GridRayCast() to operate on twice as many samples
1:40:44 Run successfully
🏃
1:40:46 hhlightprof total seconds elapsed: 4.394170
🏃
1:41:52 Toggle off the debug code in FullCast()
1:43:26 hhlightprof total seconds elapsed: 4.392245
🏃
1:43:41 Consider welding the GridRayCast() calling loop from FullCast() into GridRayCast() itself
📖
1:45:57 Q&A
🗩
1:47:07 mindmark42 Q: Yesterday you changed your SIMD extract functions to use shuffles instead. Could you explain again why that is better?
🗪
1:47:26 Extract vs Shuffle
🖌
1:56:14 "Semantic" Extraction
🖌
1:58:02 Unnecessary extract and cast, with thanks to mmozeiko
🖌
1:59:05 Shuffle
🖌
2:00:41 3ygun Q: Is there such a thing as smooching too much and causing the compiler to bail before doing optimizations?
🗪
2:01:11 billdstrong Q: Would we gain any speed by moving ahead 16 and doing 12 ops per pass?
🗪
2:01:40 Thank you, everyone
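
As a rough illustration of the table-driven Normal selection that the 40:23 and 1:04:02 markers refer to, here is a minimal sketch in C. It is not the code from the stream: the table layout, the `v3` type, and `ComputeNormalIndex()` are assumptions made for illustration. Only the names NormalTable and NormalIndex come from this guide.

```c
#include <stdint.h>

/* Hypothetical 3-component vector type, for illustration only. */
typedef struct { float x, y, z; } v3;

/* Assumed layout: two entries per axis, ordered -X, +X, -Y, +Y, -Z, +Z. */
static const v3 NormalTable[6] = {
    {-1.0f,  0.0f,  0.0f}, {1.0f, 0.0f, 0.0f},
    { 0.0f, -1.0f,  0.0f}, {0.0f, 1.0f, 0.0f},
    { 0.0f,  0.0f, -1.0f}, {0.0f, 0.0f, 1.0f},
};

/* Axis is 0, 1 or 2: the axis of the grid face the ray crossed.
   RayDComponent is the ray direction along that axis.
   The face normal opposes the ray, so a ray moving in +X hits a -X face. */
static uint32_t ComputeNormalIndex(uint32_t Axis, float RayDComponent)
{
    uint32_t RayIsNegative = (RayDComponent < 0.0f); /* 0 or 1 */
    return 2*Axis + RayIsNegative;
}
```

Under these assumptions, a ray travelling in +X that crosses an X face gets index 0, which selects the -X normal. The selection becomes one small integer computation plus an indexed load rather than a chain of per-axis conditionals.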