AI and VFX

LinkedIn was doing an article on AI in VFX, but the comment field was small, and I sometimes ramble. So rather than making a more concise comment, I just posted it here.

The history of the “AI Treadmill” is always important to keep in mind when we get into a hype cycle.

In the 1950’s, expression parsing was considered AI because humans could do it, but we hadn’t yet figured out how to make computers do it. Concepts like syntax trees and the notation of ‘*’ for pattern matching come from linguistics research and got adapted for Computer Science by early AI researchers. Today, they underlie parsing for basically every kind of script and file format. Every little Python script or expression knob is using “AI” according to an AI researcher who just woke up from a very long nap.

In the 1980’s, computer vision was a hot topic in AI research. In the 1990’s, traditional VFX was predictably upended by technology from AI research, again. Before tools like Boujou became available, manual frame-by-frame 3D tracking was a specialized niche. Then that niche was gone, reduced to a small corner of the VFX workload: a few frames here and there when the automated tracker glitched.

Today, Deep Learning technology from AI research is the thing predictably disrupting VFX, yet again, as has always been the case. If anything, we’ve been in an unusually stable period in the last decade or so, which makes us long overdue for a massive disruption that changes the way the work is done. Change is, as it ever was, coming for your job. Change was coming for your job last year. And change will be coming for your job next year. That part never changes. We’ve just grown so used to the past changes that they all seem like they must have always been obvious, so we forget how disruptive they were in their own ways over the years.

Maybe this time is different. Maybe this time we’ll build Skynet and Skynet will build Terminators, and this time it won’t just be happening on screen. But probably not.

Photogrammetry will make it easier to put real things in 3D spaces. And to put 3D things in images of real spaces. But it still takes humans to think up interesting things that we’ll actually care about seeing rendered in spectacular visual effects. That aspect is often missed in a lot of the research papers. New techniques are often so tied to using ground-truth data in training the models that they wind up focused on being able to render stunningly realistic versions of things that I could already see without needing a 3D render. I have plenty of spoons in a drawer in my kitchen. I can see plenty of spoons at the local coffee shop. A paper on a more realistic way to render a spoon faster is interesting as far as it goes. But two hours of realistic spoons won’t win any Oscars. If it could, we could have seen two hours of footage of actual spoons. Modern software makes it easier to make Bullet Time sequences than ever before. But it still takes a human to create a film like “The Matrix” that muses about the possibilities of the human condition that arise if you assume that there is no spoon.

Every Lie Incurs a Tech Debt to the Truth

I recently watched a video about “vranyo,” the corrupt culture of normalized lying and dishonesty that has been a huge problem for Russia in Ukraine. This topic doesn’t lead directly to a conversation about tech debt, but the video mentioned a quote from the Chernobyl miniseries that “every lie incurs a debt to the truth,” and the dismal failures by the Russian military suddenly seemed oddly familiar.

Leadership shocked when things fail. Small problems accumulating for no good reason. Manpower wasted in wave after wave of attacks on what is believed to be a simple problem. Seeing any patterns?

And meanwhile, Elon Musk is surrounding himself with yes-men and trying to take over Twitter with about the same level of success that Putin has been having in Ukraine, so it seems timely to try to make a ham-fisted metaphor about everything currently in the news.

Honestly, the video is pretty long. You don’t necessarily have to watch it all right now. But it is well made, and it solidly fleshes out some specific examples of how things are going wrong in Ukraine.
Continue reading

Of course your Jira backlog keeps growing. It’s math.

Programmers generally understand mathematical limits, at least in terms of “Big O” notation for complexity, even if they don’t love digging into the mathematical formalisms from Calculus class. Some algorithms scale better than others. Knowing how things scale is an important part of being able to shout buzzwords like “Web Scale!” in meetings. Scaling organizations is at least as hard as scaling technology.

Stolen from Twitter.
We all learned what exponential growth is during the pandemic.

What is the limit of the logarithmic function log(t) as t approaches infinity? (The limit does not exist / infinity.)

What is the limit of the exponential function e^t as t approaches infinity? (The limit does not exist / infinity. But it gets there much faster than the log function.)

What is the limit of the constant function C as t approaches infinity? (The limit is C.)

And finally… It costs log(t) to remove an item from a queue. It costs C to add an item to that queue. What is the upper bound on storage required for that queue to hold all items as t approaches infinity?

Even junior developers have no problem answering these questions. But empirically, it seems like architects and team leads have no idea. People are often surprised at how much Jira queues tend to grow over time, and they don’t expect to have to scale organizations over time in the way that they scale technology. The math above is more than enough to explain why, in the long term, you will never ever catch up on your Jira queue. Let’s dig in.
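Here’s the back-of-the-envelope version, using the costs from the questions above (my framing, not a formal queueing model): if filing a ticket costs a constant C, roughly t/C tickets arrive by time t. If closing a ticket costs log(t), the team closes at most roughly t/log(t) of them in the same window. The backlog is the gap between the two, and it diverges:

```latex
% Sketch of backlog growth, assuming:
%   filing a ticket costs a constant C, so arrivals(t) ~ t/C
%   closing a ticket costs log(t), so completions(t) ~ t/log(t)
\[
  \mathrm{backlog}(t) \;\approx\; \frac{t}{C} - \frac{t}{\log t}
  \;\longrightarrow\; \infty \quad \text{as } t \to \infty
\]
% It diverges because completions grow strictly slower than arrivals:
\[
  \lim_{t \to \infty} \frac{t/\log t}{\,t/C\,}
  \;=\; \lim_{t \to \infty} \frac{C}{\log t} \;=\; 0
\]
```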

Continue reading

Vulkan Unit Tests with RenderDoc

edit to add: The very helpful author of RenderDoc told me on Twitter that the command-line capture utility is meant as an internal thing, and not for general use. So don’t complain if this breaks in the future. Also, he said that my “--opt-ref-all-resources” is probably superfluous in most cases and will generally bloat the capture for no benefit.

One of the things I really like about Vulkan vs. OpenGL is that Vulkan is “offscreen by default” while OpenGL depends on the windowing system. With Vulkan, you can just allocate some space for an image, and draw/compute to that image. You can pick which specific device you are using, or do it once for every device. If you want to display the image, blit it to the swapchain. With OpenGL, you tend to need to open a Window with certain options, initialize OpenGL and create a Context using that Window handle, and use one of several extensions for offscreen rendering. (Probably FBO, maybe PBO. It’s an archaeological expedition through revisions of the spec and outdated documentation to be sure exactly what to do.)
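To make “offscreen by default” concrete, here’s a minimal sketch of creating a Vulkan image you can render into with no window or swapchain anywhere in sight. It assumes `device` is a VkDevice you already created, and it elides error handling and the memory allocation that a real program needs:

```cpp
#include <vulkan/vulkan.h>

// Minimal sketch: an offscreen color target, no windowing system involved.
// Assumes `device` is an already-created VkDevice; error handling elided.
VkImage createOffscreenTarget(VkDevice device, uint32_t width, uint32_t height)
{
    VkImageCreateInfo info{};
    info.sType         = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
    info.imageType     = VK_IMAGE_TYPE_2D;
    info.format        = VK_FORMAT_R8G8B8A8_UNORM;
    info.extent        = {width, height, 1};
    info.mipLevels     = 1;
    info.arrayLayers   = 1;
    info.samples       = VK_SAMPLE_COUNT_1_BIT;
    info.tiling        = VK_IMAGE_TILING_OPTIMAL;
    // Render into it, then copy the pixels out (or blit to a swapchain later).
    info.usage         = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
                         VK_IMAGE_USAGE_TRANSFER_SRC_BIT;
    info.sharingMode   = VK_SHARING_MODE_EXCLUSIVE;
    info.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;

    VkImage image = VK_NULL_HANDLE;
    vkCreateImage(device, &info, nullptr, &image);
    // A real program still needs vkAllocateMemory + vkBindImageMemory
    // before it can draw to this image.
    return image;
}
```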

Ceci n’est pas une teste.

Consequently, it is muuuuuch easier to make a “little” Vulkan application (that admittedly depends on a not-so-little engine/library with a bunch of boilerplate and convenience code) that does one thing off screen and exits without needing to pop up anything in the GUI as a part of the process.

This naturally raises the question… How are you sure that little utility actually does what you told it to?

Continue reading

Okay, but what if we just made static linking better?

Every once in a while, somebody gets so frustrated by the modern dynamic linking ecosystem that they suggest just throwing it out entirely and going back to static linking everything. Seriously, the topic comes up occasionally in some very nerdy circles. I was inspired to write this blog post by a recent conversation on Twitter.

Static and Dynamic Linking C Code – Flawless!
I got this diagram from another blog.

So, what’s dynamic linking? Why is it good? Why is it bad? Is it actually bad? What’s static linking, and how did things work before dynamic linking? Why did we mostly abandon it? And… What would static linking look like if we “invented” it today with some lessons learned from the era of dynamic linking? (And no, I am not talking about containers and Docker images when I say modern static linking. This isn’t one of those blog posts where we just treat a container image as morally equivalent to a static linked binary. Though I have some sympathy for that madness given that the modern software stack often feels un-fixable.)

Continue reading

Electron is fine, or it will be tomorrow, thanks to Moore’s Law, right?

Electron is somewhat controversial as an application development framework. Some people complain that the applications that use it aren’t very efficient. Others say RAM is there to be used, and there’s no sense letting it go to waste. So, which is it? Is the tradeoff of developer time for efficiency worthwhile?

We take it as an article of faith that newer, cheaper, better, faster machines come out every year.

Sure, Moore’s Law doesn’t give the gains it once did. And sure, I am looking at buying a laptop today that has literally the exact same amount of RAM as my 10-year-old desktop that seems to finally have died. And sure, modern laptops make RAM upgrades impossible because the RAM is soldered on, so I’ll have the same amount of RAM for the next several years over the lifespan of the laptop. So I’ll have the same amount of RAM in my main computer for somewhere between ten and fifteen years, depending on how long the new laptop lasts.

But faith says I’ll have more RAM over time no matter what!

Continue reading

The Corporate Event Horizon

We all like to think the whole Universe is centered on us. And thanks to relativity and whatnot, the whole observable universe really is centered on the observer! In science, the universe is constantly expanding, and every observer sees themselves as being at the center of that expansion.

Expansion - Meaning of Expansion
Books always use raisins in a rising loaf of bread to explain the expansion of the universe not having a center. Despite the fact that bread does have a center. So, here’s some raisin bread. Now you understand relativity 100%. Congrats.

You can’t see everything — you can only see a subset of the whole universe. Light doesn’t travel infinitely fast, and the universe isn’t infinitely old. That means you can only see a tiny bubble of about 15 billion light years around yourself. Aliens living in a distant galaxy halfway across the universe can likewise only see a 15 billion light year bubble around themselves. And if those aliens are more than 15 billion light years away from us, we can’t see them and they can’t see us.

To clarify that a bit, it’s not just that we can’t see them with our eyeballs or our current telescopes — we can’t have any sort of interaction with them. We can’t ever have had any sort of interaction with them at any point in the past. And the things we are doing today can’t possibly be affected by any sort of interaction with those distant aliens. That is to say, they are beyond a metaphorical horizon, beyond which events don’t matter to us and can’t be observed by us. Just like how you can’t see stuff below the Earth’s horizon. Hence, an Event Horizon.

Event Horizon (1997) - IMDb
This blog post has nothing to do with the 1990’s sci-fi horror film.

So, why am I trying to shoehorn the concept of an Event Horizon into corporate life? When a corporation is small, everybody shares an event horizon. Three engineers wedged into one work room in a startup will hear each other’s conversations. When something breaks, they’ll all know about it. They all observe the exact same bubble in the universe. Their shared universe is small. And they can readily reach consensus about what’s important, what’s broken, and what’s working. They may disagree about what to do next, which Linux distribution or programming language is best, etc. But they exist in a shared universe and a shared understanding of basic facts. That shared universe cannot last forever. And the inevitable collapse of the shared perspective leads to all sorts of trouble that is, by its nature, impossible to observe directly.

Continue reading

The Vulkan-HPP Dilemma

The Vulkan API is C. It’s uncommon to write full applications in only C these days — C++ is far more common. If you are writing C++, it makes sense to use the Vulkan-Hpp wrapper, which adds type safety, exceptions, and optional automatic RAII-style scoped memory management with “Unique” data types. Vulkan is hard enough to use without that syntax sugar sprinkled on top, so if you are using C++ and it is practical, your application should absolutely use the C++ binding types to take advantage of objectively good things like type safety and having cleanup work properly. (Or not?)
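For a taste of what that sugar looks like, here’s a minimal sketch of instance creation with the “Unique” RAII handles (assuming the stock <vulkan/vulkan.hpp> header with its default exception behavior):

```cpp
#include <vulkan/vulkan.hpp>

int main()
{
    vk::ApplicationInfo appInfo("demo", 1, nullptr, 0, VK_API_VERSION_1_1);
    vk::InstanceCreateInfo createInfo({}, &appInfo);

    // Throws vk::SystemError on failure instead of handing back a VkResult,
    // and the Unique handle destroys the instance when it goes out of scope.
    vk::UniqueInstance instance = vk::createInstanceUnique(createInfo);

    // ... use *instance anywhere a vk::Instance is expected ...
    return 0;
}
```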

Continue reading

The Wheel of Reinvention. Or, the Eight Steps to get to Step One.

The wheel of reinvention is a term that I most closely associate with the history of computer graphics, but the idea is pretty universal.  Technology is cyclical, no matter the specific field. All software needs to be a bit more flexible than you originally thought, and it eventually spawns more software because the solution to the problem of a computer program you don’t like is almost always another computer program that is even less carefully constructed.

Technology is Cyclical from 30 Rock
Maybe Liz Lemon’s boyfriend on 30 Rock was right, after all.

This is basically my take on the Configuration Complexity Clock. I’m not the first person to write about the topic. And since it’s practically a Natural law, like a gravitational pull, I certainly won’t be the last to deal with it, notice it, or write about it. But it is something that popped up on my radar again recently, so I wanted to put it in my own words. Let’s look at the whole cycle of terrible crippling success after terrible success…

Continue reading

First thoughts on Vulkan Video APIs in the context of a post-production pipeline

Vulkan has some new extensions for decoding video. Like everything with Vulkan, they seem to be kind of a pain / kind of awesome. I don’t have experience using them in practice yet, but I have poked through the extensions and some sample code.

So… Disclaimer: I’m not *good* at Vulkan. At best, I know just barely enough to have an opinion. A ton of people have spent way more time with the API than I have, and they have done way more interesting things with it. That said, a lot of the information out there is focused on gamedev. Vulkan has a ton of functionality that can be handy for offline image-processing tasks, but you have to figure out some of the details yourself. Since I have been playing with Vulkan, a few people have asked me questions about it, and I have started making notes that I figured I may as well share in case they prove useful to anybody. The target audience here is admittedly very narrow — people who know enough about Vulkan to want to do stuff with it, but not enough to just go read the extension specifications themselves.

Continue reading