The Vulkan API is C. It’s uncommon to write full applications in only C these days; C++ is far more common. If you are writing C++, it makes sense to use the Vulkan C++ wrapper, which adds type safety, exceptions, and optional automatic RAII-style scoped memory management with “Unique” data types. Vulkan is hard enough to use without turning down that syntax sugar, so if you are using C++ and it is practical, your application should absolutely use the C++ binding types to take advantage of objectively good things like type safety and having cleanup work properly. (Or not?)
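For a sense of what the “Unique” types buy you, here is a minimal sketch using Vulkan-Hpp, assuming the default configuration with exceptions enabled: create an instance that cleans itself up when it goes out of scope.

```cpp
#include <vulkan/vulkan.hpp>

int main() {
    // Typed wrappers over the C structs; passing the wrong thing won't compile.
    vk::ApplicationInfo appInfo("demo", 1, "no-engine", 1, VK_API_VERSION_1_1);
    vk::InstanceCreateInfo createInfo({}, &appInfo);

    // createInstanceUnique throws vk::SystemError on failure, and the returned
    // vk::UniqueInstance destroys the VkInstance automatically when it goes out
    // of scope, so there is no vkDestroyInstance bookkeeping to forget.
    vk::UniqueInstance instance = vk::createInstanceUnique(createInfo);

    // ... use *instance or instance.get() like a plain vk::Instance ...
    return 0;
}
```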
The wheel of reinvention is a term that I most closely associate with the history of computer graphics, but the idea is pretty universal. Technology is cyclical, no matter the specific field. All software needs to be a bit more flexible than you originally thought, and it eventually spawns more software because the solution to the problem of a computer program you don’t like is almost always another computer program that is even less carefully constructed.
This is basically my take on the Configuration Complexity Clock. I’m not the first person to write about the topic, and since it’s practically a natural law, like a gravitational pull, I certainly won’t be the last to deal with it, notice it, or write about it. But it is something that popped up on my radar again recently, so I wanted to put it in my own words. Let’s look at the whole cycle of terrible crippling success after terrible success…
Vulkan has some new extensions for decoding video. Like everything with Vulkan, they seem to be kind of a pain / kind of awesome. I don’t have experience using them in practice yet, but I have poked through the extensions and some sample code.
So… Disclaimer: I’m not *good* at Vulkan. At best, I know just barely enough to have an opinion. A ton of people have spent way more time with the API than I have, and they have done way more interesting things than me. That said, a lot of the information out there is focused on gamedev. Vulkan has a ton of functionality that can be handy for offline image-processing tasks, but you have to figure out some of the details yourself. Since I have been playing with Vulkan, a few people have asked me questions about it, and I have started making some notes that I figured I may as well share in case they prove useful to anybody going down this path. The target audience here is admittedly very narrow: people who know enough about Vulkan to want to do stuff with it, but not enough to just go read the extension specifications themselves.
(This started as a post many years ago on an old blog. In those days it was fashionable to install every plugin on your self-hosted WordPress instance, and then never update any of it, so naturally that blog needed to get taken down. A colleague reminded me of this in a conversation, so I decided to dig it out of an old MySQL backup, since I don’t think anybody else on the Internet was ever bored enough to document this many color wheel widgets in various applications. At some point perhaps I’ll more actively revisit the topic with some screenshots of software from the current decade. At this point the post is mainly interesting for having documented long-obsolete color wheels like Shake and “Apple Color” from the old Final Cut Studio.)
Have you ever noticed how many variations there are on color wheels? It’s basically a very simple idea. Different colors go around the edge of a circle. Different saturations go from the edge to the center. Most saturated is at the outermost edge of the circle. Easy, right? Couldn’t be simpler. So you’d think… While making a color adjuster widget for an application recently, I have been pondering them with more attention than I ever thought I would. Here are some well known examples of color wheels and color wheel type adjuster widgets… (Click images to get them full-sized, in all their utilitarian glory.)
This Shake wheel widget is ‘live.’ So, as you adjust the value slider, it will get darker or lighter. If value is set to 0, you will just see a big black square with no wheel in it, which can be counter-intuitive the first time you try to select a color.
For some reason, the wheel in the Nuke color picker uses Shake ordering for the colors, despite the fact that the color wheel node doesn’t. (It’s shown as a thumbnail in the DAG view to the right of the color picker window, in case you don’t believe me.) The button you click to bring up the color picker has an image on it with the same ordering as the node, despite the fact that this means the wheel on the button doesn’t match the wheel on the window it brings up. Also notable are the radius circle and vector line in the color wheel pointing out exactly where the currently selected color is.
That last one was inspired in part by the visual softness of the FCP wheels, with their nonlinear saturation falloff: I am experimenting with a biased cubic falloff for the saturation in my color wheels. This results in a slightly smoother appearance than a straight linear falloff, but the biasing prevents the center from being completely blown out, so you can always see what you are doing, even if you zoom in on the widget so far that you can’t see the fully saturated edges anymore because you want very fine adjustments. It’s intended as a color adjuster for color correction, rather than a true color picker, so it wasn’t important to me to have the color of the clicked spot map precisely to the resulting color. Also new in this version is that the widget will paint itself with the selected color as the background, giving you a local, live preview as you adjust.
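I’m still tweaking the exact curve, but the shape of the idea is something like the following sketch. The bias constant here is purely illustrative, not a magic number from my widget:

```cpp
#include <algorithm>

// Displayed saturation for a point at normalized radius r (0 at the center,
// 1 at the rim). A straight linear falloff would just return r. The cubic
// term softens the gradient toward the center, and the bias keeps the middle
// of the wheel from washing out to pure white, so you can still see what you
// are doing when zoomed in past the saturated rim.
float displaySaturation(float r, float bias = 0.15f) {
    r = std::clamp(r, 0.0f, 1.0f);                 // C++17
    return bias + (1.0f - bias) * r * r * r;
}
```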
Personally, I think red along +X makes the most sense from a correctness standpoint. In mathematics, we define a unit circle such that this is the direction of zero degrees. In HSV color space, we define zero degrees as red. It seems like a simple, learnable convention. I can’t see why having red in some other direction would specifically be more intuitive or easier for the user, but I’m willing to be proven wrong if somebody has a good argument. Without that, I’ll stick to +X = red for reasons of comfort with the mathematics.
As far as the order around the circle goes, the unit circle in mathematics runs counter-clockwise from +X. In optics, we learn the Roy G. Biv ordering of colors according to their frequency in the spectrum. Logically, we should match increasing frequency to increasing angle. Consequently, orange should be next after red as you go counter-clockwise. This is a “green up” or “Shake” orientation. (And contrary to the most recent screenshot I have posted of my own widget. It’s still a work in progress…)
I therefore declare that a Standard Color Wheel ought to be red at +X, and green at top-left. So, why are there so many variations when it seems like there is a correct answer? I dunno. I guess a lot of these color wheels were made pretty much independently by people who weren’t particularly concerned about matching up with some other standard. Some may have intentionally wanted to differentiate themselves from existing color wheels that they had seen just for the sake of novelty. Most of the details are presumably now lost to time.
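To put that declaration in concrete terms, here is a minimal sketch of the mapping for such a Standard Color Wheel, assuming math-style coordinates with the origin at the wheel’s center and +y pointing up (a real widget still has to convert HSV to RGB to paint anything):

```cpp
#include <algorithm>
#include <cmath>

// A color wheel sample: hue in degrees, saturation 0..1, value 0..1.
struct HSV { float h, s, v; };

// Map a point inside a wheel of the given radius to HSV: red (hue 0) along
// +X, hue increasing counter-clockwise so green lands around 120 degrees in
// the upper-left, and saturation growing linearly from the center to the rim.
HSV wheelToHSV(float x, float y, float radius, float value) {
    float angle = std::atan2(y, x) * 180.0f / 3.14159265f;   // CCW from +X
    float hue = std::fmod(angle + 360.0f, 360.0f);           // wrap to 0..360
    float sat = std::min(std::sqrt(x * x + y * y) / radius, 1.0f);
    return {hue, sat, value};
}
```

A real widget also has to account for screen coordinates putting +y downward, which silently flips the apparent direction of rotation if you forget about it.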
So, just from the apps that I had handy to check on, there are four different layouts for the colors, and four different saturation calculations for the middle. What do you prefer? Do you know of any other interesting variations on a simple color wheel? Is one or the other more intuitive or functional for you?
In any CS-101 textbook, you learn about the scaling of certain algorithms. I’ve had a few conversations with several colleagues recently related to this topic, and how badly the “cult of complexity” can lead one astray when trying to scale real systems rather than ones in textbooks. I wanted to write a bit about the topic, and share a simple program that demonstrates the realities.
In the textbook, one algorithm might scale linearly, O(n). Another might do better at O(log(n)), or have a constant number of operations regardless of working set size and be O(1), and so on. It’s a pretty straightforward way of keeping track of the terribleness of an algorithm. But the key is that Big O notation only estimates the number of operations as n grows large, rather than the amount of time taken to run them. There is a pretty natural and intuitive implicit assumption that makes Big O useful: that one read operation from memory is about as slow as any other. So you just need to count up the number of read operations to get an idea of how much time you’ll spend waiting on memory.
This is, of course, pants-on-fire nonsense.
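The slides and the actual program are linked below, but the gist is easy to reproduce. This is not the exact program from the talk, just a minimal sketch of the kind of thing it demonstrates: touch the same n elements the same number of times, once in sequential order and once in a shuffled order, and time both.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

int main() {
    const std::size_t n = 1 << 24;  // ~16 million 64-bit values
    std::vector<std::uint64_t> data(n, 1);

    // A shuffled visit order: same elements, same count, terrible locality.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::shuffle(order.begin(), order.end(), std::mt19937_64{42});

    auto timeSum = [&](auto indexOf) {
        auto start = std::chrono::steady_clock::now();
        std::uint64_t sum = 0;
        for (std::size_t i = 0; i < n; ++i) sum += data[indexOf(i)];
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        return std::make_pair(sum, elapsed.count());
    };

    // Both loops are O(n); only the memory access pattern differs.
    auto [s1, sequential] = timeSum([](std::size_t i) { return i; });
    auto [s2, shuffled]   = timeSum([&](std::size_t i) { return order[i]; });

    std::cout << "sequential: " << sequential << " s, shuffled: " << shuffled
              << " s (checksums " << s1 << " " << s2 << ")\n";
}
```

The two loops do exactly the same number of additions and exactly the same number of reads; the only thing that changes is whether the cache and prefetcher can help you.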
The slides for my talk at Scale17x are available here: In Google Slides
The synopsis is here: At the SCALE website
My repository with W-Utils, the “Worst Utils” versions of a few common userland utilities is here: On GitHub . The implementations are all incomplete, proof of concept sorts of things, but they are useful examples that are much simplified compared to the “real thing” in GNU Coreutils.
A video of the talk is up on the SCALE YouTube page. My talk in the “Ballroom C” video starts at 4:05:30. The camera doesn’t cover the screen, so you may need to click along with the slide deck if you want the full experience. (The embedded video starts near the right time, but it doesn’t seem to want to seek to the exact spot, so you may need to scrub a little.)
I was recently sitting, staring at a progress bar, which is how very nerdy adventures start.
The particular progress bar was telling me about the packages being installed as part of upgrading my workstation from Ubuntu 16.04 to the newer Ubuntu 18.04. As the package names whizzed by, one after the other, the thing that annoyed me was that it took So. Damned. Long. My day job often involves trying to understand why Linux systems don’t go as fast as I would like, so I naturally started firing up some basic utilities to see what was happening. The most obvious thing to check is always CPU usage. top showed me that my CPU cores were sitting almost entirely idle. CPU usage is a metric that I often describe as convenient to measure, relatively easy to understand, and generally useless, but it’s still a good place to start. I wasn’t really surprised that the installation process wasn’t CPU bound, so I fired up iotop, which is a much more useful utility for seeing which processes on a system are I/O bound, and saw… nothing interesting. And it was then that my curiosity got the better of me. If you count all the many servers I have caused package installations to happen on, I have probably installed many millions of debian packages over the years. Some with Salt, others with apt-get, and some with dpkg, but I never really studied in detail exactly how the ecosystem worked.
I started by trying to figure out exactly what a debian package is. It seems like a silly question with a simple answer. Of course, “a debian package is just a common standard ar archive,” as a friend of mine pointed out while I was talking to him. But that sort of understates things. First off, ar archives aren’t that common, or particularly standardised. Ar archives are ‘common’ only as the format for static libraries and debian packages. They just aren’t common as general-purpose archives, like tarballs or zip files. Which is sort of interesting in its own right.
Let’s consider just how standard the format actually is… Wikipedia has a good breakdown of the format. Is the diagram on Wikipedia all we’d need to know to read a debian package? Well, man 5 ar notes “There have been at least four ar formats” and “No archive format is currently specified by any standard. AT&T System V UNIX has historically distributed archives in a different format from all of the above.” Eep, that’s not terribly promising. Thankfully, debian packages are at least consistent among themselves in their Ar dialect, since they can generally be assumed to be made with the ar on a debian Linux distribution.
There’s a whole side story here about how there is a C system header for reading ar archives in an old-school “read a struct” way. But the format uses a slightly odd whitespace-padded text layout, so getting trimmed filenames as C++ std::strings and integer values out of it is more of a pain in the neck than you’d hope. There isn’t a good C++ library with a modern API for the format, so I wrote a YAML definition for Kaitai Struct in order to have a convenient C++ API for reading it, and used the SPI Pystring library for some of the string manipulation. In any event, I could read the format. Yay, I could read a debian package myself!
A debian package consists of just three things when you unpack it: a file called ‘debian-binary’ that tells you the version number of the format, and two tarballs, one with control metadata about the package and the other with the actual contents of the package.
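If you just want to see this for yourself, `ar t something.deb` will list the members. A hand-rolled reader isn’t much more code; here is a minimal sketch of walking the 60-byte member headers by hand (the Kaitai-generated reader is the more robust route, this is just to illustrate the layout):

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Walk the members of a Unix ar archive (e.g. a .deb) and print each
// member's name and size. Header layout: 16-byte name, 12-byte mtime,
// 6-byte uid, 6-byte gid, 8-byte mode, 10-byte size, 2-byte end magic,
// all space-padded ASCII. Member data is 2-byte aligned.
int main(int argc, char** argv) {
    if (argc < 2) { std::cerr << "usage: arls <archive>\n"; return 1; }
    std::ifstream in(argv[1], std::ios::binary);

    std::string magic(8, '\0');
    in.read(&magic[0], 8);
    if (magic != "!<arch>\n") { std::cerr << "not an ar archive\n"; return 1; }

    char header[60];
    while (in.read(header, 60)) {
        std::string name(header, 16);
        std::string sizeField(header + 48, 10);
        name.erase(name.find_last_not_of(" /") + 1);   // trim padding / GNU '/'
        long size = std::stol(sizeField);

        std::cout << name << "\t" << size << " bytes\n";
        in.seekg(size + (size % 2), std::ios::cur);    // skip to next member
    }
}
```

Point it at a .deb and you get the three members and their sizes.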
At this point, anybody trying to write their own code to unpack a debian package in order to better understand the process will want to punch a wall. We’ve just figured out how to write code to read this relatively uncommon Ar format, and the first thing we find inside of it is two tarballs, which is a completely different format! Surely, the package files could have been designed to either be an Ar with Ar archives in it, or a tar file with tar files in it! Well, okay, my friend’s assertion that I just needed to know about Ar archives was a lie, but I only need to know about two formats. That’s not too bad. Oh, wait, tarballs are actually two formats unto themselves: there’s a compression format, and then the actual tar archive. So, you need to handle three file formats to install a debian package. I have some code that will unpack the Ar layer, so let’s see which compression method is used on the tar files…
Wait, those two tar files have different compression formats. One is a .gz file, and the other is a .xz! And this isn’t just a matter of debian files from different eras using different compression. If, say, Ubuntu 12.04 packages used gz and Ubuntu 18.04 packages used xz, you would only need to support one or the other to install packages from any particular distribution. As it turns out, there are different compression formats inside a single package. So, to unpack and install a debian file, you actually need to support a few compression formats: let’s say xz, bz2, and gz at a minimum. That means you need to support five different formats in total.
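Rather than hand-rolling xz, bz2, and gz support, the sane thing is to lean on a library. Here is a minimal sketch using libarchive, which auto-detects whatever compression is wrapped around the tar. (Using libarchive is my assumption here as one convenient option, not necessarily what dpkg itself does.)

```cpp
#include <archive.h>
#include <archive_entry.h>
#include <iostream>

// List the contents of control.tar.* or data.tar.* without caring which
// compression format is in use; libarchive sniffs the filter for us.
int main(int argc, char** argv) {
    if (argc < 2) { std::cerr << "usage: tarls <tarball>\n"; return 1; }

    archive* a = archive_read_new();
    archive_read_support_filter_all(a);   // gzip, xz, bzip2, ...
    archive_read_support_format_tar(a);

    if (archive_read_open_filename(a, argv[1], 10240) != ARCHIVE_OK) {
        std::cerr << archive_error_string(a) << "\n";
        return 1;
    }

    archive_entry* entry = nullptr;
    while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {
        std::cout << archive_entry_pathname(entry) << "\n";
        archive_read_data_skip(a);        // only the listing, not the bytes
    }
    archive_read_free(a);
}
```

So, with decompression handled, what’s actually in that control archive?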
You get a few scripts: preinst, postinst, and prerm. Those scripts get run when you would expect: before install, after install, and before removing the package if you uninstall it. Languages like Python can be embedded in native applications, but shell scripts aren’t really intended to be used that way. (And actually, if I were embedding Python today, I’d probably use PyBind11 instead of Boost.Python like I did in my old blog post. But that’s neither here nor there.) So, if you are trying to implement something to install the packages, you can pass on being responsible for running the scripts in-process, and just shell out to do it. (Writing a shell is definitely at least a whole other blog post unto itself.) You also have files called md5sums, control, and conffiles. conffiles is just a newline-separated list of the files that the package uses for configuration, so the install program can warn you about merging local changes during install. It’s barely a file format, so we’ll count it as half. md5sums is a listing of checksums of all the files in the content archive (the one called “data”), in the format the md5sum utility produces.
b25977509ca6665bd7f390db59555b92  usr/bin/apturl
da0e92f4f035935dc8cacbba395818f2  usr/lib/python3/dist-packages/AptUrl/AptUrl.py
2c645156bfd8c963600cd7aed5d0fc0b  usr/lib/python3/dist-packages/AptUrl/Helpers.py
927320b1041af741eb41557f607046a7  usr/lib/python3/dist-packages/AptUrl/Parser.py
b697ac30c6e945c0d80426a8a4205ef8  usr/lib/python3/dist-packages/AptUrl/UI.py
d41d8cd98f00b204e9800998ecf8427e  usr/lib/python3/dist-packages/AptUrl/Version.py
d41d8cd98f00b204e9800998ecf8427e  usr/lib/python3/dist-packages/AptUrl/__init__.py
a8f4538391be3cd2ecac685fe98b8bca  usr/lib/python3/dist-packages/apturl-0.5.2.egg-info
4bd6e933c4d337fdb27eee28abbd289d  usr/share/applications/apturl.desktop
3824814ef04af582f716067990b7808f  usr/share/doc/apturl-common/changelog.gz
2ae15dd4b643380e1fbb9c44cf8e9c54  usr/share/doc/apturl-common/copyright
019ea97889973f086dfd4af9d82cf2fb  usr/share/kde4/services/apt+http.protocol
This is also a pretty simple format, but you need to split on the whitespace after the hash while correctly handling the possibility of things like spaces in filenames. (And I’m not entirely sure what you do if you have a newline in a filename, which is possible, with these simple line-based formats.) So we are up to six and a half file formats.
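Parsing it is only a few lines. A minimal sketch: treat the first 32 characters as the hash, skip the whitespace, and keep everything else on the line as the path, so paths containing spaces survive.

```cpp
#include <fstream>
#include <map>
#include <string>

// Parse an md5sums file into path -> hash. Each line is a 32-character hex
// digest, whitespace, then the path; the path may itself contain spaces,
// so only the first separator is significant.
std::map<std::string, std::string> readMd5sums(const std::string& filename) {
    std::map<std::string, std::string> sums;
    std::ifstream in(filename);
    std::string line;
    while (std::getline(in, line)) {
        if (line.size() < 34) continue;                 // too short to be valid
        std::string hash = line.substr(0, 32);
        std::size_t pathStart = line.find_first_not_of(" \t", 32);
        if (pathStart == std::string::npos) continue;
        sums[line.substr(pathStart)] = hash;
    }
    return sums;
}
```

And then there’s the “control” file itself: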
Package: apturl-common
Source: apturl
Version: 0.5.2ubuntu11.2
Architecture: amd64
Maintainer: Michael Vogt <firstname.lastname@example.org>
Installed-Size: 168
Depends: python3:any (>= 3.3.2-2~), python3-apt, python3-update-manager
Replaces: apturl (<< 0.3.6ubuntu2)
Section: admin
Priority: optional
Description: install packages using the apt protocol - common data
 AptUrl is a simple graphical application that takes an URL (which follows the
 apt-protocol) as a command line option, parses it and carries out the
 operations that the URL describes (that is, it asks the user if he wants the
 indicated packages to be installed and if the answer is positive does so for
 him).
 .
 This package contains the common data shared between the frontends.
The “control” file is yet another text file, but the format is different from conffiles or md5sums. We are now up to seven and a half file formats, which is surely a far cry from the original “you just need to know the Ar format!” that I got as received wisdom when I first fell down this rabbit hole.
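It is at least an easy format to read: “Field: value” lines, with continuation lines marked by a leading space, which is how the multi-line Description above works. A minimal sketch:

```cpp
#include <fstream>
#include <map>
#include <string>

// Parse a debian control file into field -> value. Lines that start with a
// space or tab are continuations of the previous field, which is how the
// multi-line Description field is encoded.
std::map<std::string, std::string> readControl(const std::string& filename) {
    std::map<std::string, std::string> fields;
    std::ifstream in(filename);
    std::string line, currentField;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        if (line[0] == ' ' || line[0] == '\t') {
            if (!currentField.empty()) fields[currentField] += "\n" + line;
            continue;
        }
        std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;       // not a field line
        currentField = line.substr(0, colon);
        std::size_t valueStart = line.find_first_not_of(" ", colon + 1);
        fields[currentField] =
            valueStart == std::string::npos ? "" : line.substr(valueStart);
    }
    return fields;
}
```

That is deliberately sloppy (real control files distinguish simple, folded, and multiline field types, among other things), but it is enough to get at the Depends line.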
On the bright side, this does give us enough information to unpack and install the data in the package. (And I’d like to complain about how vague a name “data” is for the archive with the actual contents. As if the rest of the package was somehow something other than data!) But we still haven’t covered any of the local database that keeps track of what packages are available, what is installed, how dependency resolution works, and so on. Some of that will have to wait for another blog post. This is certainly enough content that the original progress bar that inspired me finished what it was doing long before I made it this far with my own code.
Learning how to unpack packages wound up being just the first step of a project to try to do my own simple implementations of a whole raft of common UNIX command line utilities that I depend on every day. Trying to implement a useful subset of a complete userland is what inspired the blog post’s title, “Adventures in Userland.” The UNIX userland is full of fascinating history, layers of cruft, clever design, and features you never even realised were there. Even implementing my own cat turned out to be an interesting project, despite how simple that utility seems. I am hoping to make time to document some of the things I learned while poking around the things I have long taken for granted, and how shaky and wobbly some of the underpinnings of modern, state-of-the-art cloud and container systems are.
Convenient, modern C++ APIs for things like machine learning and image processing are easy to find, but not so much for things like .debs and .tars. The utilities in GNU coreutils sometimes have surprising limitations, and some files haven’t had any commits since Star Trek: The Next Generation was in first run. I think it’s fair to say some of that stuff is about due for a fresh look.
Don’t Be That Guy
If your application dereferences symlinks by default, you are a jerk. Your software is bad, and you should feel bad. Why do you hate your users?
Won’t Someone Think Of The Users?
On OS X, in the Finder, there is a neat pane on the left where you can bookmark your favorite places to get to them quickly and easily. Just drag a folder into it, and you can get to it from any Finder window. It’s super convenient. Unless, of course, you make a symbolic link, which is basically just another concept for an easy way to get to another place.
If you create a symlink and then try to add it as a Favorite, Finder will dereference the link and favorite what the link points to rather than the link itself. This is evil. It’s not what the user asked for! It’s an extreme violation of the Principle Of Least Surprise. The implicit contract between the user and the system is that if I favorite something, clicking on the thing and clicking on the favorite will always take me to the same place. The favorite represents the thing I was dragging into the Favorites bar, not whatever it happened to be pointing at. If I ever change where the symlink points, the favorite and the symlink will now do two different things. For no obvious reason!
File this one under Stupid Python Tricks. I have written a bit about working on an app with an embedded Python run time. It’s good fun. I recently added a new feature to the script editor that was relatively easy, but for some reason isn’t very common. There are a few small quirks to making a script editor do this, so in case anybody is curious how to do it in their own app, this is how I did it.
I couldn’t figure out how to fit this into a 140 character tweet, so now it’s a blog post. Recently on a mailing list that I am subscribed to, a friend and former coworker posted:
Went to download the Unity 5 updates, what I ended up with was:
I can pretty much guarantee that at no time in the history of electronic software distribution has anyone ever said “gee, I really wish this application had its own custom downloader, because those guys at Microsoft / Apple / Mozilla / Google clearly don’t know what they are doing with those web browser things, and don’t even get me started about those curl / wget people…”. I feel slightly better now.
And I had unfortunately spent the morning wrestling with Adobe CC, so I chimed in with my response…