The Best Reading App

Since the start of this year—for some reason, I can’t put my finger on what—I’ve been reading far more RSS feeds and articles that I’ve come across. I’ve sporadically used RSS in the past, but never really got into a groove with it. Currently I’m using NetNewsWire, which is good but doesn’t quite match the experience that I want, and so I’m writing this to manifest into existence the perfect app.

I’m an absolute fiend for a reverse-chronological list of items where my position is perfectly preserved. Tweetbot and Ivory are absolutely perfect for this; I can open the app, scroll a little bit, and then leave and come back later. It’s been part of my daily routine to scroll through the tech news and gossip each morning as I start my day.

Sadly no RSS reader seems to have quite the interface I want. So I’m going to describe it in enough detail that someone can find one for me, or some enterprising developer can implement it.

The main interface would of course be a reverse-chronological list of posts (oldest at the bottom), with the key feature that your scrolling position would remain where you left it. New posts would be loaded “above” your scroll position, so you would just continue to scroll up as you read through the feed. The feed should make use of article and feed images to present a visually engaging view, rather than a simple list of titles.

It’s not a use case that I particularly care for, but if I were making this app it’s something I’d be sure to handle: viewing a single feed. This doesn’t really mesh too well with the main list of posts, but wouldn’t be an insurmountable UI challenge. What I would probably do is allow viewing a single feed (or group of feeds) like you’d open a user profile on a social media app. Except that you’d be put into another reverse-chronological feed in the same position as the main feed—but just for posts from that publication. You could then scroll through the single feed, and once you were at the top there would be an option to clear those posts from the main feed. That way you could swap back to the main feed and continue reading without repeating posts. This would be useful if a single feed has dumped a big collection of posts and you just want to see if there’s anything interesting, and otherwise get them out of the way.

The second most important feature would be a built-in read-later service. I switched from Instapaper to GoodLinks and am very happy with it so far, but I would be a lot happier if it were built right into my feed reader. I’ll often come across an interesting post, but won’t have time or be in the mood for reading a longer or more technical post. Ideally in this case I could just mark it for reading later, without having to share the post to a different app (even if that other app is very good). This would unlock the ability to read half a post, realise you’ve run out of time, and then just close the article and have it automatically saved for later—with your position already saved.1

Automatically saving posts would definitely be a UX challenge. You don’t want to flag every single post that gets opened as “read later”, but you also don’t want to have the interaction be unreliable. I would probably lean towards just having a very convenient “close and keep for later” button that is just as easily accessible as swiping back to exit the article.

The next UI challenge would be presenting the main feed and the read-later feed in such a way that neither appears to be playing second fiddle to the other, while also making it easy to swap between them. Perhaps you’d automatically switch between the two depending on whether there were new posts? Or maybe that would just end up being annoying.

The app would need all the features of a good read-later app: saving links from other apps, presenting web pages in a friendlier reader view, saving reading progress, and saving pages for offline reading.

A problem that I would like solved—but I’m not sure if link-sharing APIs allow for this—is knowing where the link was shared from. I find myself getting to the bottom of a post that I’ve saved from somewhere and thinking “oh whoever shared this obviously has excellent taste, I should see what else they do” but have no good way to find where I got it from. Alternatively I think of someone that I should share it with, only to find out that they were the person that sent it to me in the first place.

I have considered creating a second Mastodon account and just subscribing to the feeds of websites I follow (or using RSS-to-ActivityPub translators), and adding this account to Ivory. What stops me from doing this is that it would only get me halfway there—no read-later integration—and Ivory would be doing double duty, meaning I’d have to switch accounts constantly.


If you’re looking for some fresh feeds, I quite like grumpy.website (examples of frustrating UI design), and Pixel Envy (links and commentary on technology with a focus on privacy and open design, which is my jam).

  1. It’s not too uncommon for me to save stuff straight to GoodLinks if I think I might not read it in full immediately, so I don’t have to find where I got up to later. 


The Five Stages of Swift on Linux

Recently I attempted to learn about Swift’s async support by doing my favourite thing—writing an RPC framework. In this case the “RPC framework” is just a request/response abstraction over websockets (which are message-based), which makes the actual RPC bit very simple, as all it’s really doing is wrapping some objects and matching responses to requests.
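The request/response matching itself is a small amount of bookkeeping: tag each outgoing request with an id, and look that id up when a response arrives. Here's a rough sketch in Python rather than Swift (the class and field names are illustrative, not taken from the actual framework):

```python
import itertools

class RPCClient:
    """Illustrative sketch: match responses to requests sent over a
    message-based transport by tagging each request with an id."""
    def __init__(self, send_message):
        self._send = send_message        # function that transmits one message
        self._ids = itertools.count(1)
        self._pending = {}               # request id -> response callback

    def call(self, method, params, on_response):
        request_id = next(self._ids)
        self._pending[request_id] = on_response
        self._send({"id": request_id, "method": method, "params": params})

    def handle_message(self, message):
        # a response carries the id of the request that produced it
        self._pending.pop(message["id"])(message["result"])
```

The transport (websockets, in this case) only has to deliver whole messages in each direction; everything RPC-shaped lives in that little id-to-callback table.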

In doing this, I think I went through all five stages of grief1, which often happens when I try and use Swift on Linux—despite my previous excitement about it.

Denial

So first of all I found the documentation for URLSession#webSocketTask(with:). At first glance the API seemed pretty reasonable. I had a quick read over some blog posts and ended up with some code to test out:

let task = URLSession.shared.webSocketTask(
  with: URL(string: "ws://brett:9080")!)
try! await task.send(.string("test message"))

Don’t use this code as an example. It doesn’t work. That’s the whole point—keep reading.

This seems pretty easy, I create a websocket task and then send a message using it. The message should be received by a simple Crystal HTTP::WebSocketHandler and logged, so I know when it’s working.

I run the program, and it just hangs. No error, no timeout (at least not one that I was patient enough to wait for). Now there isn’t anything that I can see from the documentation that I’m missing (mostly because there is no documentation for send(_:)).

Eventually I look back over the blog posts and see that you need to call resume() on the URLSessionWebSocketTask for it to do anything.

Anger

This is very frustrating. If I were writing the documentation for this class, I would make sure that the requirement to call resume() was the first thing anyone saw when looking at the docs. Currently you have to go to the URLSessionTask superclass and find the resume() method docs which state:

Newly-initialized tasks begin in a suspended state, so you need to call this method to start the task.

A friendly API would raise an exception if you tried to use it before it was ready—failing fast is going to reveal your problem more readily than silently doing the wrong thing. However, I don’t know enough about the wider URLSession API to know whether there’s a design tradeoff here that makes failing fast impractical.

Ok so I’ve wasted a bunch of time trying to work out what’s wrong all because my task was suspended. Never mind, at least I know what the problem is now. I add the resume() call and now I get:

Fatal error: 'try!' expression unexpectedly raised an error: Error Domain=NSURLErrorDomain Code=-1002 "(null)"
Current stack trace:
0    libswiftCore.so                    0x00007fecbfa6eb80 _swift_stdlib_reportFatalErrorInFile + 112
1    libswiftCore.so                    0x00007fecbf76043f <unavailable> + 1442879
2    libswiftCore.so                    0x00007fecbf760257 <unavailable> + 1442391

Hmm, an NSURLErrorDomain problem. A -1002 problem to be precise. This is my first rodeo in Swift-networking-land so I don’t know what a -1002 means off the top of my head. Eventually I find some info that points me to this list of all the error codes. Hilariously it doesn’t include the code in the list—just the name—so you have to open each case one-by-one until you find the one that matches your error code. The fourth from last one turned out to be my error: NSURLErrorUnsupportedURL.

Immediately I start thinking of all the possible ways that you could consider a URL unsupported: maybe the ws:// scheme should be wss://? Or maybe it won’t handle hostnames and needs an IP address? Perhaps I’ve messed something up in my container2 and it’s counting a closed port as an unsupported URL? (A bizarre thing to do, but at this stage all bets were off.)

Bargaining

So maybe URLSessionWebSocketTask is a lost cause, but SwiftNIO is always an option. I won’t go into this too much; basically I stumbled at the first hurdle when I followed this post to add SwiftNIO as a dependency. I don’t really understand all the moving pieces here, but it boils down to:

dependencies: [
  .package(url: "https://github.com/apple/swift-nio", from: "2.58.0"),
],
.executableTarget(
  name: "WebSocketRPC",
  // bad, doesn't work
  dependencies: ["SwiftNIO"],
  // also no good
  dependencies: ["NIOWebSocket"],
  // perfect and excellent
  dependencies: [.product(name: "NIOWebSocket", package: "swift-nio")]
)

Why do I need a .product instead of just a string? No idea, and I couldn’t find this mentioned anywhere in the SPM documentation. I happened to stumble across an NIO example project and looked at the Package.swift file to find this.3

However, after learning more about the SwiftNIO websocket implementation, it seems that I would need to handle much more of the underlying protocol and the HTTP-to-websocket upgrade than I had expected. The example websocket client has over 200 lines to do the same thing I was hoping to accomplish in two.

Depression

Maybe websockets aren’t that cool anyway, what if I just use plain old HTTP? Maybe this will help me understand whatever I’m doing wrong with the websocket API. While I’m at it, why don’t I translate the callback-based API into an async one—that was the original purpose of this exercise in the first place, right?

import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // on Linux, URLSession lives in this module
#endif

func download(url: URL) async throws -> String {
  return try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<String, Error>) in
    let task = URLSession.shared.dataTask(with: url) { (data, response, error) in
      if let error = error {
        continuation.resume(throwing: error)
      } else if let data = data {
        continuation.resume(returning: String(data: data, encoding: .utf8)!)
      } else {
        // URLSession should always give us either data or an error
        fatalError("impossible?")
      }
    }
    task.resume()
  }
}

print(try! await download(url: URL(string: "https://willhbr.net")!))

And that just works first time? That’s definitely weird. Was this an excuse to include some tidy callback-to-async code? Maybe.

Acceptance

At this point my curiosity got the better of me—would it work on macOS? Maybe I would get a better error and suddenly understand what was going wrong?

After a bit of an adventure with xcrun (it turns out you can’t use the Swift compiler that’s installed with the Xcode Command Line Tools), I installed Xcode and ran the exact code I had been trying on Linux for hours.

And it worked first time without any issues. The most frustrating result.

Eventually I found this GitHub issue linked from a project’s README:

fatalError when trying to send a message using URLSessionWebSocketTask

That code runs perfectly fine under macOS (using Swift 5.7), but as soon as it’s run on Linux I get the error from above.

A few people chime in saying they see the same issue, and then this comment points to this page of the libcurl documentation:

WebSocket is an EXPERIMENTAL feature present in libcurl 7.86.0 and later. Since it is experimental, you need to explicitly enable it in the build for it to be present and available.

So if your underlying library doesn’t support websockets, it makes sense that a websocket URL is unsupported.


I don’t have much of a conclusion here, apart from the fact that this was a very frustrating journey. I’m sad to see that almost eight years after being open-sourced and supporting Linux, Swift is still full of subtle traps that are hard to debug. Hopefully the Swift Server Working Group is aware of these issues and continues to make improvements—a simple @available annotation would have saved a lot of time.

  1. Yeah I know the titles don’t really match the content, I just did this for a funny title, alright? 

  2. Of course I’m running this in a container. 

  3. While writing this I did end up finding that towards the bottom of the readme for SwiftNIO there is a “Getting Started” section that has the correct incantations. Only after you’ve read past the conceptual overview, repository organisation, and versioning scheme, however. 


Warp Terminal

Yesterday I came across Warp Terminal via their advertisement on Daring Fireball.1 Immediately I was fascinated to know what their backwards-compatibility story was, and how their features were implemented. This is in a similar vein to the difficulties of modernising shells, which I wrote about in more detail last month.

If you’re not sure of the difference between a terminal and a shell, The TTY demystified is a really good read to understand the history and responsibilities of both. Basically the terminal emulator pretends to be a computer from 1978, and the shell runs inside of that.

I only spent about half an hour playing around with Warp, so my impressions are not particularly well-informed; it’s still in beta, so many of these issues could be on a roadmap to be fixed. I didn’t look at any of the AI or collaboration features; I’m only interested in the terminal emulation and shell integration.

What sets Warp apart from other terminal emulators is that it hooks into the shell and provides a graphical text editor for the prompt, rather than using the TTY. For normal humans that are used to the standard OS keyboard shortcuts, and to being able to select and copy text in a predictable way, this is an excellent feature. The output from each command you run lives in a block; blocks stack up and scroll off the screen. In the prompt editor, autocomplete and other suggestions are native UI, not part of the TTY. They can be clicked, they support non-monospaced fonts, and they benefit from many other UI innovations from the last 40 years.

In their blog post “How Warp Works” there is a brief explanation of how they integrate with the shell.2 Basically they use callbacks within popular shells (ZSH, Bash, and Fish) to know when a command is started. If my interpretation of this is correct, they do away with the shell prompt entirely and instead use their non-shell editor to let the user write their command. They then pass the whole finished command to the shell, and use hooks in the shell to know when to cut off the output and create a new block.

What this means is that Warp has some significant limitations on what it can “warpify”. Only the input to the shell prompt gets the magic editor experience; if you run another interactive program (like irb) then you’re back to inputting text like it’s the ’70s. You can tell Warp to inject some code into certain commands, but this will only work in the aforementioned shells. If the command doesn’t understand POSIX shell syntax with the functions that Warp expects, it won’t work.

So by default, if you start your login shell and then run bash to start a sub-shell, that sub-shell will miss out on the Warp features. I’m aware that this argument is entirely a “perfect solution” fallacy but hey, someone’s got to advocate for a perfect solution.

What is nice is that if you run a command that uses the “full screen” TTY, it will just work—the block takes up the whole screen while the command is running. You can still run vim and tmux, so if this takes over I’ll still be able to get things done.

The prompt editor is definitely good if you’re not used to working with a traditional shell, but since I’m used to having Vim mode in ZSH, going back to a normal editor feels broken. Also, since the editor is split out from the shell, autocompletions are in a separate system. I have a few custom autocompletes set up in ZSH, and not being able to access those in the editor was frustrating. I’d type gcd <TAB>, expecting to see a list of my projects, but instead just get a list of the files in the current directory. I assume there’s some way of piping this information into Warp, but it’s a shame they don’t (yet?) have integration to pull this straight from ZSH.

The autocompletes that I did get were mostly good—files or arguments from my shell history—but I did get a few weird suggestions. I tried ssh and was suggested a bunch of hosts with names that were some base64-encoded junk. None of these appeared in my shell history or SSH config files.

I said I wasn’t going to look at any of the AI features, but then I connected to my server to see how the dialog command worked. The answer was that it wasn’t installed. Warp then said “✨ Insert suggested command: dig 13:02:20”. I don’t know how it made the leap in logic from “command not found” to “do a DNS lookup”, or why it wanted to suggest passing the current time to the DNS lookup—it was 1:02PM UTC when that suggestion popped up.

Warp is another example of how hard it is to modernise things that directly interact with the underlying OS concepts. Perhaps Warp can partner with the nushell developers and reinvent the shell and terminal at the same time.

In the end I’m obviously not going to move away from using iTerm. Warp is solving a bunch of problems that I don’t have, and adding a whole suite of AI features that I have no interest in. If you are a fairly light terminal user, and get frustrated at editing commands in the traditional shell prompt, then maybe Warp is for you. Use my referral code so I can get a free t-shirt.

You get like 80% of the benefit of using Warp’s fancy editor by knowing that in the macOS Terminal, option-click will move the cursor around by sending the appropriate arrow keys to the shell.

  1. Who needs ad personalisation when you can just go directly to your target market? 

  2. If you skip over all the bits about how they render Rust on the GPU and stuff. 


The Curse of Knowledge

The curse of knowledge is the idea that as you become more of an expert in an area, it becomes harder to explain basic concepts in that area, because your assumed base level of knowledge is much greater than the typical level of understanding. Basically you might try and explain at an undergraduate level, but in reality you need to start from a high school level and build up from there. You forget the difficulty of grasping the key concepts of the topic.

A similar phenomenon happens when you try and make a “simple” version of something, which requires you to become an expert in the thing you’re attempting to simplify. Once you’ve become an expert, you understand the edge cases, tradeoffs, and other complexities in the system, and often you’re able to use the complex thing without needing it to be simplified, and appreciate why it is not simple in the first place. You’re then left to explain the subtleties of this complex system to people that have yet to make the leap in understanding—and experience the difficulty of explaining it in basic terms.

I went through this whole process with tmux. Before I was a certified tmux nerd, I wanted a simpler way of configuring and controlling my tmux panes. The binding and manipulation controls seemed too limited; I wanted to be able to send commands to different tabs and split the output of commands to different panes. I managed to do some of this by hacking small scripts together, but I wanted a solution that would unify it all into one system.

There are a few projects that do similar things (like tmuxinator), but they are mostly focussed on automatic pane/window creation, rather than adding scripting to your interaction with tmux.

So I spent months learning the ins and outs of tmux’s command-line interface, and the functionality available in control mode. Eventually I had a program that ran alongside tmux and provided an object-oriented scripting interface to basically the entirety of tmux. You could do something like:

server.on_new_session do |session|
  session.on_new_window do |window|
    window.panes.first.split :vertical
  end
end

Under many layers of abstraction, this would listen for events in tmux, run the associated Ruby code, and send any commands back to tmux if the model had changed. It was a wonderful hack, and I’m still very happy with how it all fit together.

However, in doing so I learnt a lot about the tmux CLI, and started to get a fairly in-depth understanding of how it had been designed.

Ok I need to share just how neat the tmux API is. It’s all really well documented on the man page. Control mode outputs tmux events to stdout, so if you read from that process you can receive what’s happening with every tmux session on a server—input, output, layout changes, new windows, etc. You can also write commands into stdin of the control mode process, and their output will be returned as a control mode message.
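Reading that stream is mostly string-splitting. A minimal sketch of parsing the notification lines coming out of control mode (the `%`-prefixed event format is tmux's; this parser is deliberately simplified and, for example, skips the octal-escape handling that `%output` lines need):

```python
def parse_notification(line):
    """Split a control-mode line like '%window-add @5' into an event
    name and its arguments. Lines not starting with '%' are not
    notifications."""
    if not line.startswith("%"):
        return None
    event, _, rest = line[1:].partition(" ")
    return event, rest.split(" ") if rest else []
```

Feed each line from the control-mode process's stdout through this and you have a stream of structured events to react to.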

Most tmux commands print some kind of output; by default it’s somewhat human-readable, intended to be displayed in a terminal. Take tmux list-sessions as an example:

$ tmux list-sessions
flight-tracker: 2 windows (created Fri Jul 28 10:41:53 2023)
pixelfed-piper: 1 windows (created Fri Jul 28 11:14:18 2023)
pod: 3 windows (created Sat Jul 29 03:17:47 2023)
willhbr-github-io: 2 windows (created Fri Jul 28 11:13:50 2023) (attached)

It would be really annoying to write a script to parse that into a useful data structure (especially for every single command!), and thankfully we don’t have to! Every tmux command that prints output also supports a format string to specify what to print and how to print it:

$ tmux list-sessions -F '#{session_id}||#{session_name}||#{session_created}'
$1||flight-tracker||1690540913
$3||pixelfed-piper||1690542858
$4||pod||1690600667
$2||willhbr-github-io||1690542830

The only logical thing for me to do was write an RPC-like abstraction over the top of this, with macros to map fields in the generated format string to attributes on the objects that should be returned. This allowed me to build a fairly robust abstraction on top of tmux.
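The core of that idea fits in a dozen lines. Here's a sketch in Python rather than the Ruby of the actual project (the function name and separator are my own; the injectable `run` is just there so the parsing can be exercised without a tmux server):

```python
import subprocess

SEP = "||"

def tmux_query(command, fields, run=None):
    """Run a tmux command with a -F format string and parse each line of
    output into a dict keyed by the requested field names."""
    fmt = SEP.join("#{%s}" % field for field in fields)
    if run is None:
        run = lambda args: subprocess.run(
            args, capture_output=True, text=True, check=True).stdout
    output = run(["tmux", command, "-F", fmt])
    return [dict(zip(fields, line.split(SEP))) for line in output.splitlines()]
```

So `tmux_query("list-sessions", ("session_id", "session_name", "session_created"))` gives you back a list of dicts instead of text to regex apart; the macro layer in the real project mapped those fields onto object attributes.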

After that I started learning about all the features that tmux supports. Almost every option can be applied to a single pane (most normal people would apply them globally, but if you want they can be applied to just one session, window, or pane)—so if you want one window with a background that’s unique, you can totally do that. You can also define hooks that run when certain events happen. You can remap keys (not just after the prefix, any key at all) and have arbitrary key “tables” that contain different key remappings. Windows can be linked for some reason—I still don’t know what this would be used for—and you can pipe the output of a pane into a command. Exactly how all these features should be used together is left as an exercise for the user, but they’re all there ready to be used.

With this much deeper understanding of how to use the tmux API, I no longer really needed a scripting abstraction; I was able to pull together the existing shell-based API and do the handful of things that I’d been aiming to accomplish (like my popup shell). I’d basically cursed myself with the knowledge of tmux, and now a simple interface wasn’t necessary. So I abandoned the project.

One of my software development Hot Takes™ is that git has an absolutely awful command-line interface.1 The commands are bizarrely named, it provides no guidance on the “right” or “recommended” way of using it,2 and because of this it is trivial to get yourself in a situation that you don’t know how to recover from. Most git “apologists” will just say that you should either use a GUI, or just alias a bunch of commands and never deviate from those. The end result being that developers don’t have access to the incredibly powerful version control system that they’re using, and constantly have to bend their workflow to suit the “safe” part of its API.

The easiest example of something that I would like to be able to do in git is a partial commit—take some chunks from my working copy and commit them, leaving the rest unstaged. The interface for staging and unstaging files is already fairly obtuse, and then if you want to commit only some of the changes to a file, you’re in for a whole different flavour of frustration.

  • git add stages a file (either tracked or untracked)
  • git restore --staged removes a file from being staged
  • git restore discards changes to an unstaged file

Why we haven’t settled on a foo/unfoo naming convention completely baffles me. stage/unstage and track/untrack tell you what they’re doing. restore --staged especially doesn’t match what it does—the manual for git-restore starts out saying it will “restore specified paths in the working tree with some contents from a restore source”, but it’s also used to remove files from the pre-commit staging area? That doesn’t involve restoring the contents of a file at all. Just read the excellent git koans by Steve Losh to understand how I feel trying to understand the git interface.3

What I really want is an opinionated wrapper around git that will make a clear “correct” path for me to follow, with terminology that matches the actions that I want to take. Of course the only correct opinionated wrapper would be my opinionated wrapper, which means I need to make it. And of course for me to make it, I need to have a really good understanding of how git works—so that I can make an appropriate abstraction on top of it.

So this is where I’ve ended up: I want to make an abstraction over git, which would require me to learn a lot about git. If I learn enough about git to do this, I will become the thing that I’ve sworn to destroy—someone who counters every complaint about git with “you just have to think of the graph operation you’re trying to achieve”.

  1. Is it a hot take when you’re right? I guess not. 

  2. This would probably be considered a feature to many people, which I suppose is fair enough. 

  3. To be honest, much of this is probably because I forged my git habits back around 2012, and since then a lot of commands have been renamed to make more sense. I’m still doing git checkout -- . to revert unstaged files and it makes absolutely no sense—isn’t checkout for changing branches? 


Helicopter Tracking for Safer Drone Flights

Avid readers will know that I like to fly my drone around the beaches in Sydney. The airspace is fairly heavily trafficked, and so I take the drone rules very seriously. This means no flying in restricted airspace (leading to other solutions for getting photos in these areas), no flying in airport departure or arrival paths, and no flying above the 120m ceiling (or 90m in certain areas). This is easily tracked with a drone safety app (I’m a big fan of ok2fly).

What is more difficult is flying a drone in an area that may have other aircraft nearby. The drone rules state:

If you’re near a helicopter landing site or smaller aerodrome without a control tower, you can fly your drone within 5.5 kilometres. If you become aware of manned aircraft nearby, you must manoeuvre away and land your drone as quickly and safely as possible.

This basically means that if a helicopter turns up, you should get the drone as low as possible and land as quickly as possible. In theory, crewed aircraft should be above 150m (500ft), with a 30m (100ft) vertical gap between them and the highest drones. However on the occasions where there have been helicopters passing by, to my eye they seem to be much closer than that, which makes me anxious—I want my drone to remain well clear of any helicopters.

Virtually all aircraft carry an ADS-B transmitter which broadcasts their GPS location to nearby planes and ground stations. They use this location to avoid running into each other, especially in low-visibility conditions. Flight-tracking services like flightradar24 aggregate this data globally and present it on a map.

My first idea was to write an app that would stream the ADS-B data from a service like flightradar24 for any aircraft in the nearby airspace, and sound an alert if an aircraft was on a trajectory that would intersect with my location. This would be great, but it would be a lot of work, require some kind of API key and agreement from the data provider, and ongoing use would require paying the annual $99USD/$150AUD Apple developer program fee.1

a drone photo of waves coming in to a beach

I realise that I’m a few paragraphs into a post about drone photography and haven’t included a drone photo yet. Here you go.

The next best idea was to set up a Stratux ADS-B receiver using a Raspberry Pi. This would either allow me to pull data from it to my phone (no need to deal with API keys and suchlike) or do all the processing on the Pi (no need to deal with developer restrictions). While this would have been cool, it would have also cost a bit to get all the components, and working out some kind of interface to an otherwise-headless RPi seemed like a frustrating challenge.

After considering these two options for a while I settled on a completely different third option. Instead of building something to alert me in real time, I could just work out which beaches would have nearby aircraft at what times of day, and avoid flying during those times. This is when I came across the OpenSky Network, a network of ADS-B receivers that provides free access to aircraft locations for research purposes. So all I had to do was get the data from OpenSky for aircraft in Sydney, and then visualise it to understand the flight patterns around the beaches.

OpenSky has a historical API with an SQL-like query interface, as well as a live API with a JSON REST interface. I requested access to the historical data, but was informed that they only provide access to research institutions due to the cost of querying it. So to make do I wrote a simple program that would periodically fetch the positions of aircraft within the Sydney area. This data was then saved to a local SQLite database so I could query it again later. Since the drone rules also forbid flights during the night, I only needed to fetch data during civil daylight hours.
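The fetching program was roughly this shape. This is a Python sketch against OpenSky's live REST endpoint; the bounding box coordinates, table schema, and function names are my own assumptions rather than the actual program, and anonymous API access is rate-limited:

```python
import json
import sqlite3
import urllib.request

# A bounding box roughly covering the Sydney area (approximate values)
BBOX = {"lamin": -34.2, "lamax": -33.5, "lomin": 150.9, "lomax": 151.5}

def open_db(path):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS positions "
               "(time INTEGER, icao24 TEXT, lat REAL, lon REAL, alt REAL)")
    return db

def fetch_states():
    """Fetch the current state vectors for aircraft inside the box."""
    query = "&".join(f"{key}={value}" for key, value in BBOX.items())
    url = f"https://opensky-network.org/api/states/all?{query}"
    with urllib.request.urlopen(url) as response:
        return json.load(response).get("states") or []

def save_states(db, states, now):
    # In each OpenSky state vector: [0] icao24, [5] longitude,
    # [6] latitude, [7] barometric altitude in metres
    rows = [(now, s[0], s[6], s[5], s[7]) for s in states]
    db.executemany("INSERT INTO positions VALUES (?, ?, ?, ?, ?)", rows)
    db.commit()
```

Run `fetch_states` on a timer during daylight hours and hand the result to `save_states`, and the SQLite file accumulates a queryable history of everything that flew through the box.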

To visualise the data, I used my hackathon-approved map rendering solution: get a screenshot of Open Street Map and naively transform latitude/longitudes to x/y coordinates. After messing up the calculation a bunch, I got a map with a line for every flight, which looked something like this:

map of Sydney Harbour showing many paths taken by aircraft over the harbour
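The naive transform behind that map is just linear interpolation between the corners of the screenshot. A sketch (it assumes an area small enough that the Earth's curvature and the map's actual projection can be ignored, which is fine at city scale):

```python
def to_pixels(lat, lon, bounds, width, height):
    """Map a latitude/longitude onto x/y pixel coordinates of a map
    screenshot. bounds = (min_lat, max_lat, min_lon, max_lon)."""
    min_lat, max_lat, min_lon, max_lon = bounds
    x = (lon - min_lon) / (max_lon - min_lon) * width
    y = (max_lat - lat) / (max_lat - min_lat) * height  # pixel y grows downward
    return x, y
```

Convert each point of a flight to pixels, draw lines between consecutive points, and you get the spaghetti of flight paths shown above.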

Eventually after staring at this map2 for a long time, I realised that most helicopter (or rotorcraft, as they are referred to in the API) routes went north from the airport, passed along the western side of the city, directly over the Harbour Bridge, did a few loops over the harbour (as seen in the map above), exited the harbour by Watson’s Bay, then turned south and hugged the coastline along the beaches, before finally turning west at Maroubra to get back to the airport.

I finally had the realisation that probably should have been fairly obvious a long time before this—all these helicopters are tourist flights, repeating the same route over and over again. Sure enough if I search for “helicopter sight seeing Sydney” I find the website for a helicopter tour company that does the exact route I saw plastered over my map. Optimistically I emailed them asking how many flights they usually flew in a day, and what time their earliest flight was—this would give me enough information to make a reasonably informed decision about when was best to fly my drone. Sadly they said they couldn’t share this information with me.

Ok so I would have to do some more data visualisation to work this out for myself. First of all I filtered out any data points that were above 200 metres, since they would be well clear of any drones.
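With the positions sitting in SQLite, that filter is a single query (assuming the illustrative table layout of time, position, and altitude in metres):

```python
import sqlite3

def low_altitude_points(db, ceiling=200.0):
    # Rows with a NULL altitude (no reading) fail the comparison and
    # drop out of the result.
    return db.execute(
        'SELECT lat, lon, alt FROM positions WHERE alt <= ?', (ceiling,)
    ).fetchall()
```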

map of Sydney and beaches from the southern head of the harbour down to Cronulla, including Botany Bay

There are some interesting things in this map:

  • The arrival and departure paths for commercial aircraft are flown very consistently, with the tracks overlapping almost exactly.
  • Helicopters arrive and depart from the eastern part of the airport.
  • Rose Bay is where a lot of seaplanes take off from, so you can see tracks starting and stopping there.
  • By far the densest route is between Bondi and Maroubra, hugging the coast.
  • Planes flying the Victor 1 VFR route are further from the coast.
  • There’s obviously a strict route for aircraft flying over the inner harbour (west of the bridge) creating an aerial highway.

I then compared that with the same view over the northern beaches:

map of Sydney's northern beaches, from the harbour entrance up to Barrenjoey head

It’s worth noting that all the maps contain data for just over one month of flights. There is definitely still a large number of flights going up the coast, but they thin out significantly as you get further north, especially past Long Reef—the headland south of Collaroy beach. I was surprised to see that no aircraft fly over the harbour side of Manly; instead they follow the water out the harbour entrance.

A friend suggested a nice way of visualising the data: plot the time of day on one axis, and the position down the coast on the other, and create a heatmap of the highly-trafficked times/areas. In theory you should be able to see a line for each flight flying down the coast. Sadly my matplotlib skills aren’t that good, so this is the best I could come up with:

histogram of latitude to time in the day

The left axis is the latitude (limited in range from Bondi to Maroubra) and the bottom axis is the fraction of the day (eg 0.5 is midday). Using this we can see that the bulk of flights start at 0.4, which is 9.6 hours into the day, or 9:36 AM. Which makes sense for tourist flights, since passengers presumably have to sign some waivers and do a safety briefing, and they’re not going to want to get out of bed too early. I added the ability on my map to filter out flights past a certain time of day, and sure enough if I only look at flights before 10:00am, the sky is much clearer.
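The axis conversion is easy to get wrong, so for reference, the fraction-of-day maths looks like this (assuming a fixed UTC+10 offset for Sydney and ignoring daylight saving):

```python
from datetime import datetime, timedelta, timezone

SYDNEY = timezone(timedelta(hours=10))  # AEST; good enough for a rough plot

def day_fraction(unix_time, tz=SYDNEY):
    # 0.0 is midnight, 0.5 is midday — matching the bottom axis of the plot.
    t = datetime.fromtimestamp(unix_time, tz)
    return (t.hour * 3600 + t.minute * 60 + t.second) / 86400

def fraction_to_clock(fraction):
    # e.g. 0.4 of a day is 9.6 hours, i.e. 9:36.
    minutes = round(fraction * 24 * 60)
    return f'{minutes // 60}:{minutes % 60:02d}'

def plot_heatmap(times, lats):
    # One bin count per (time of day, latitude) cell.
    import matplotlib.pyplot as plt  # imported here so the rest is stdlib-only
    plt.hist2d([day_fraction(t) for t in times], lats, bins=50)
    plt.xlabel('fraction of day')
    plt.ylabel('latitude')
    plt.show()
```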

Armed with this new knowledge, I can make some more informed decisions about when to fly my drone around the beaches in Sydney. I’m just not going to bother flying during the middle of the day anywhere between Bondi and Maroubra; if I want to fly there I’ll do it just after sunrise—which will give me better light3 anyway. Flying at the beaches further north is still an option, but I will still want to position myself somewhere with a good view up and down the coast to see other aircraft coming. Since the flight paths are much more predictable than I had expected, if I did make some kind of alerting system, I could simply trigger it whenever an aircraft exited the harbour, since their next move is likely to be up or down the coast.

Of course the most important thing—and the lesson I hope you take away from this—is to follow the rules, always check airspace restrictions before flying, be aware of your surroundings, and if in doubt just descend and land as promptly as possible. Don’t use a few map screenshots from someone’s blog as guidance on where to fly your drone.


Map data © OpenStreetMap contributors.

Flight data from OpenSky:

Bringing up OpenSky: A large-scale ADS-B sensor network for research Matthias Schäfer, Martin Strohmeier, Vincent Lenders, Ivan Martinovic, Matthias Wilhelm ACM/IEEE International Conference on Information Processing in Sensor Networks, April 2014

  1. I could install it on my phone with a free developer account, but that requires re-installing the app from Xcode every week. 

  2. Well not this map, the full-size map with way more lines on it. 

  3. Although it will give me worse sleep. 


Simple Home Server Monitoring with Prometheus in Podman

The next step in my containerising journey is setting up Prometheus monitoring. I’m not going to use this for alerts or anything fancy yet, just to collect data so I can see the load and health of my server and track trends over time. In doing this I had a few requirements:

  • I don’t want to edit a central YAML file when I start a new service
  • Key container metrics (CPU/memory/etc) should be monitored automatically
  • Prometheus itself should run in a container

There are plenty of existing posts on setting up Prometheus in a container, so I’ll keep this short. I used pod to configure the containers:

containers:
  prometheus:
    name: prometheus
    image: docker.io/prom/prometheus:latest
    network: prometheus
    volumes:
      prometheus_data: /prometheus
    bind_mounts:
      ./prometheus.yaml: /etc/prometheus/prometheus.yml
    ports:
      9090: 9090
    labels:
      prometheus.target: prometheus:9090

  podman-exporter:
    name: podman-exporter
    image: quay.io/navidys/prometheus-podman-exporter:latest
    bind_mounts:
      /run/user/1000/podman/podman.sock: /var/run/podman/podman.sock,ro
    environment:
      CONTAINER_HOST: unix:///var/run/podman/podman.sock
    run_flags:
      userns: keep-id
    network: prometheus
    labels:
      prometheus.target: podman-exporter:9882

  speedtest:
    name: prometheus_speedtest
    image: docker.io/jraviles/prometheus_speedtest:latest
    network: prometheus
    labels:
      prometheus.target: prometheus_speedtest:9516
      prometheus.labels:
        __scrape_interval__: 30m
        __scrape_timeout__: 2m
        __metrics_path__: /probe

prometheus contains the actual Prometheus application, which has its data stored in a volume. podman-exporter exports Podman container metrics, accessed by mounting in the Podman socket.1 speedtest isn’t essential, but I was curious to see whether I had any variations in my home internet speed, and running one more container wasn’t difficult. This also forced me to work out how to customise the scraping of jobs configured via Prometheus HTTP service discovery.

To meet my first requirement of having no global config, I needed to set up some kind of automatic service discovery system. Prometheus supports fetching targets via an HTTP API—all you have to do is return a list of jobs to scrape in a basic JSON format. Since I already run a container that shows a status page for my containers (more on that another time, perhaps) I have an easy place to add this endpoint. You just need to add the endpoint into your prometheus.yaml config file once:

scrape_configs:
  - job_name: endash
    http_sd_configs:
    - url: http://my_status_page:1234/http_sd_endpoint

That endpoint returns some JSON that looks like this:

[
  {
    "targets": ["prometheus:9090"],
    "labels": {
      "host": "Steve",
      "job": "prometheus",
      "container_id": "4a98073041d6b"
    }
  },
  {
    "targets": ["prometheus_speedtest:9516"],
    "labels": {
      "host": "Steve",
      "job": "prometheus_speedtest",
      "container_id": "db95c10b425cc",
      "__scrape_interval__": "30m",
      "__scrape_timeout__": "2m",
      "__metrics_path__": "/probe"
    }
  }
]

targets is a list of instances to scrape for a particular job (each container is one job, so only one target in the list). labels defines additional labels added to those jobs. You can use this to override the job name (otherwise it’ll unhelpfully be the name of the HTTP SD config, in my case endash) and set some of the scrape config values, if the target should be scraped on a different schedule.

My status dashboard has an endpoint that will look at all running containers and return an SD response based on the container labels. This allows me to define the monitoring config in the same place I define the container itself, rather than in some centralised Prometheus config. You can see in my pods.yaml file (above) that I use prometheus.target and prometheus.labels to make a container known to Prometheus as a job.
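That endpoint boils down to: list the running containers, and turn their labels into an SD response. A sketch of the idea in Python—the dotted prometheus.labels.* key flattening is an assumption on my part (container labels are flat strings, so the nested YAML map has to be flattened somehow), but the response shape is exactly what Prometheus expects:

```python
import json
import subprocess

LABEL_PREFIX = 'prometheus.labels.'  # assumed flattening of the nested map

def sd_response(containers, host='Steve'):
    # containers: list of dicts with 'name', 'id' and 'labels' keys.
    jobs = []
    for c in containers:
        labels = c.get('labels') or {}
        target = labels.get('prometheus.target')
        if not target:
            continue  # container hasn't opted in to monitoring
        job_labels = {
            'host': host,
            'job': c['name'],
            'container_id': c['id'][:13],
        }
        # Any prometheus.labels.* labels pass straight through, which is
        # how __scrape_interval__ and friends get set per-job.
        for key, value in labels.items():
            if key.startswith(LABEL_PREFIX):
                job_labels[key[len(LABEL_PREFIX):]] = value
        jobs.append({'targets': [target], 'labels': job_labels})
    return jobs

def running_containers():
    # `podman ps --format json` emits a JSON array of running containers.
    out = subprocess.run(['podman', 'ps', '--format', 'json'],
                         capture_output=True, check=True).stdout
    return [{'name': c['Names'][0], 'id': c['Id'],
             'labels': c.get('Labels') or {}}
            for c in json.loads(out)]
```

The HTTP server wrapped around this just returns json.dumps(sd_response(running_containers())) with a Content-Type of application/json.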

The thing that really makes this all work is Podman networks. The easiest way to get Prometheus running is to run it on the host network, so that it doesn’t run in its own containerised network namespace. So when it scrapes some port on localhost that’s the host localhost, not the container localhost. This works reasonably well if all your containers publish a port on the host. This is definitely an acceptable way of setting things up, but I wanted to be able to run containers without published ports and still monitor them.

You can do this by creating a Podman network and attaching any monitor-able containers to it, so that they are accessible via their container names:

> podman network create prometheus
> podman run -d --network prometheus --name network-test alpine:latest top
> podman run -it --network prometheus alpine:latest
$ ping network-test
PING network-test (10.89.0.16): 56 data bytes
64 bytes from 10.89.0.16: seq=0 ttl=42 time=0.135 ms
64 bytes from 10.89.0.16: seq=1 ttl=42 time=0.095 ms
...

I’m running top in the network-test container just to keep it running in the background for this example. If you ran a shell, it would exit immediately since there is no input connected.

The one wrinkle of using a Podman network is that it makes accessing non-container jobs more difficult. I wanted to set up node_exporter to keep track of system-level metrics, and it can’t run in a container as it needs full system access (or at least, it doesn’t make sense to run it in a container). Thankfully this ended up being super easy: I can just install node_exporter via apt:

$ sudo apt install prometheus-node-exporter

This will automatically start a service running in the background and serving metrics on localhost:9100/metrics. To access this from our Prometheus container, you can just use the magic hostname host.containers.internal, which resolves to the current host. For example:

> podman run -it alpine:latest
$ apk add curl
$ curl host.containers.internal:9100/metrics
... a whole bunch of metrics

So I have to add one static config into my prometheus.yaml file:

scrape_configs:
  - job_name: steve
    static_configs:
      - targets: ['host.containers.internal:9100']

So now I’ve got a fully containerised, automatic monitoring system for anything running on my home server. Any new containers will get picked up by podman-exporter, and get their resource usage recorded automatically. If I integrate a Prometheus client library and export metrics, then I can just add monitoring config to the pods.yaml file for that project, and have my service discovery system pick it up and have it scraped automatically.

I’ve added a lot of functionality to pod since I first wrote about it; I’m aiming to get it cleaned up and better documented soon.

  1. This obviously gives the exporter full access to do anything to any container, so you’ve just kinda got to trust it’s doing the right thing. 


Limited Languages Foster Obtuse APIs

Continuing the theme of the design decisions of a low-level system limiting the design space of the things built on top of it: the design of a programming language has a significant impact on the APIs and software built using it.

Go is heralded by the likes of Hacker News and r/programming as modern, exciting, and definitely not anything like Java, which is old and boring. Java developers spend their days writing abstract factory interfaces and generic strategy builders, whereas Go developers spend their time solving Real Problems™. Although if you squint a bit, you can see the similarities between Go and Java, and perhaps see where Go developers might end up.

Let’s think about factories. I’d include a quote from Effective Java here, but I don’t have a copy handy. The tl;dr is that you use a factory so that you’re free from the limitations of object construction in Java. When you call new MyThing(), you can only get one of two things: an exception, or a new instance of MyThing. If you call a static method like MyThing.create(), then you can get absolutely anything. Of course, good taste would prevent us from returning anything, but we can do things like cache expensive objects, or return a different MyThing subclass.

A concrete example (I know some people like that kind of thing) would be the main interface to an RPC framework1. Connection.create(String url) could return a different implementation based on the protocol of the URL passed in (TCP, HTTP, Unix socket, in-memory, etc). The normal constructor syntax can’t do this, so you end up with a recommendation for developers to prefer static constructor-methods in case of future flexibility.

Go has this exact same limitation. Struct creation is a different and special type of syntax. It can only do one thing—create a new instance of a struct (it can’t even throw an exception because Go doesn’t have those). This leads to the recommendation for packages to have a function that creates the instance for you:

func MakeMyThing(a string, b int) MyThing {
  return MyThing{a: a, b: b}
}

Does this look familiar?

Struct initialisation in Go also has a surprising feature: it will silently set any attributes to their default value if the attribute is omitted from the list. So this code compiles without any warnings:

type MyThing struct {
  foo string
  bar string
}

func main() {
  fmt.Println(MyThing{foo: "hello"})
}

And bar will silently be set to "" (the default value for a string). If you want to have any guarantee that attributes will all be set correctly, or be able to add an attribute to a struct and know that all usages have it set, you should wrap the creation of the struct in a factory function.

The other limiting factor for Java is the handful of types that have special syntax in the language. Only the builtin number types, strings, and arrays can use operators and the subscript syntax; there is no mechanism for these to be used on any user-defined type. So if you have a method that returns some data as a byte[] (for performance or convenience or whatever), and you want to change it to be MyByteSequence, you have to change all subscripts over to be a method call, since you can’t define that operator on MyByteSequence.

Go has the exact same limitation; only number types, strings, slices, and maps use the operator and subscript syntax. In both cases this means that if you want to build an abstraction over the underlying data, you need to wrap them in a struct/object and define functions/methods on that object.

Prior to generics being added to Go, the ability to build abstractions on top of the built-in types was even more limited.

The effect of this is that you end up with a bunch of code that is entirely composed of method or function calls. Which doesn’t seem like much of a problem on the surface, but you end up in a state where every operation looks the same, making it hard to see the “shape” of what the code is doing.

This is the exact problem that keeps me from enjoying Lisp (and oh boy have I tried to enjoy Lisp). When I look at any non-trivial piece of Lisp code, I have a lot of trouble working out what is actually happening because every action has equal precedence—literally. Clojure does a commendable job at improving this by adding TWO additional types of brackets that allow for some glanceable differentiation.

; the square brackets make it easier to find the
; function arguments
(defn my-method [foo bar]
  ; the curly brackets allow for defining different
  ; types of commonly-used literals, like a map
  {:foo foo
   :bar bar})

Java code ends up devolving towards a similar type of syntax, since the only part of the syntax that you get “access” to is method calls. I used an example earlier in the context of Crystal about using time APIs in Java:

// This Java code
Duration.ofHours(4).plus(Duration.ofMinutes(5))
// Is surprisingly similar to Clojure
(.plus (Duration/ofHours 4) (Duration/ofMinutes 5))

I think one of the reasons that I find Elixir easier to read than Clojure is that it has much more syntax, so different actions actually look different. In Clojure, a case statement looks just the same as a method call, whereas in Elixir the addition of an infix -> operator to separate the match from the code makes the code block much easier to read.

Now, if you’re a particular type of person who solves every problem with a profiler and a flame graph, you’re probably preparing an argument about how overriding operators and subscripts allows for hiding potentially expensive operations. If developers are discouraged from using the built-in types that support these operators, then your expensive operations are just hidden behind a method call. Every Java 101 class tells you never to use an array and to use List<> instead. Who knows how .get() is implemented in that list? It could be a linked list, and each call could be an O(n) operation. Would it really be much worse if that was behind a subscript instead of a method?

Unless you’re using some capabilities-based language where you can limit the type of operations a module can do, any function call could result in a network request or slow inter-process communication. It could even just do some blocking I/O, wasting valuable time that your thread could spend doing something more interesting.

Limiting language features for the sake of performance issues is ignoring what actually causes performance issues: slow code. Slow code can be called from anywhere, and limiting the expressiveness of the language seems like a high cost when you’re going to have to find your bottlenecks using a profiler anyway.

Of course no blog post about languages would be complete without me explaining how Crystal is perfect. There are virtually no special operators in Crystal. Operators are implemented as methods on the left-hand operand, subscripts are just a special method called []. The exception is that the array shorthand is linked to the built-in Array type, so [] of String is equivalent to Array(String).new.2

What this really boils down to is that programming language design should limit the amount of syntax that is bound to specific types. In Java this is operators and subscripts; in Go it also includes channels. The Java ecosystem’s obsession with design patterns and abstraction is fuelled by the lack of features in the language, requiring developers to invent another sub-language on top using the pieces of Java that they have access to—types and method calls. Go might have different built-in tools (like coroutines and channels) but since they are baked right into the language syntax, they can’t be replaced or altered as developer needs change.

  1. This is like, my favourite thing. 

  2. There are actually variations of this syntax that other types can override. 


Why Modernising Shells is a Sisyphean Effort

Anyone that knows me is probably aware that I spend a lot of time in the terminal. One of the many things that I have wasted time learning is the various oddities of shell scripting, and so I am cursed with the knowledge of the tradeoffs in their design—tradeoffs that most people don’t seem to appreciate. Your shell has to find a balance between getting out of your way for interactive use, and being the best way to link together multiple unrelated programs to do something useful. The Unix philosophy of having many small tools, each dedicated to one simple job, means that you can more easily replace one with an alternative, and a new tool doesn’t have to reinvent the wheel before it can be useful.

The problem is that to most people, the shell is completely inscrutable. Even experienced programmers who have no problem juggling many other programming languages will get into a muddle with even a simple shell script. To be honest, you can’t really blame them. Shell languages are full of bizarre syntax and subtle traps.

The root of the problem is POSIX; it defines the API for most Unix (and Unix-like, e.g: Linux) operating systems. Most important is the process model. A POSIX process receives arguments as an array of strings, input as a stream of bytes, and can produce two streams of output (standard output and error). Unless you’re going to redesign the whole operating system1, you’ve got to work within this system.

POSIX does also define the syntax for the shell language, which is why Bash, ZSH, and other shells all work in a similar way. Fish, xonsh, nushell, and Oil are not entirely POSIX compatible, and so are free to alter their syntax.

What sets a shell apart from other languages is that external programs are first-class citizens2, you don’t have to do anything special to launch them. If you type git status the shell will go off and find the git program, and then launch it with a single argument status. If you were to do this in Ruby, you’d have to do system('git', 'status')—more fiddly typing, and completely different from calling a function.

So if you want programs to fit in just the same as shell functions, your functions need to work like POSIX processes. This means they can’t return anything beyond an exit status—just consume input and produce output streams—and their arguments must be handled as strings. This makes implementing a scripting language that can be compared to Ruby or Python basically impossible. The constraint of having all your functions act like processes hampers your ability to make useful APIs.

This makes it really difficult for your shell language to support any kind of strong typing—since everything passed to a command or function needs to be a string, you’re constantly reinterpreting data and risking it being reinterpreted differently. Having everything handled as a string is consistent with how programs run (they have to work out how to interpret their arguments themselves), but it’s a constant source of bugs in shell scripts.

My favourite fun fact about shells is that some of the “syntax” is actually just a clever use of the command calling convention. For example, the square bracket in conditionals is actually a program called [.

xonsh is a new shell that merges Python and traditional shell syntax, except it does it by trying to parse the input as a Python expression, and if that doesn’t make sense it assumes it should be in shell mode. This gets scripting and interactive use tantalisingly close, except it seems to me (without having used xonsh) that it would end up being unpredictable, and you would always have to be aware that you’re straddling two different modes.

nushell attempts to solve the problem in a different direction. It requires you to either prefix your command with an escape character or write an external command definition to have it be callable from the shell. This moves away from the typical design of shells, and relegates external programs to be second-class citizens. nu is really a shell in search of a new operating system—to really make the most of their structured-data-driven approach, you’d want a new process model that allowed programs to receive and emit structured data, so that all the features for handling that in the shell could be used on arbitrary programs without writing an external command definition first.

So if we’re too snobby to resort to parser tricks or fancy wrappers, what are we left with? Well we’ve got some serious constraints. The input space for command arguments is every single letter, number, and symbol. Any use of a special character for syntax makes it potentially harder for people to pass that character to commands, for example if + and - were used as maths operators, you’d need to quote every flag you passed: git add "--all" instead of git add --all, since the dashes would be interpreted as different syntax.

You’ve probably already come across this using curl to download a URL with query parameters:

$ curl https://willhbr.net/archive/?foo=bar
zsh: no matches found: https://willhbr.net/archive/?foo=bar
$ curl 'https://willhbr.net/archive/?foo=bar'
# ...

Since ? is treated specially in most shells to do filename matches, you have to wrap any string that uses it in quotes. Since so many people are used to dumping arbitrary strings unquoted as command-line arguments, you don’t want to restrict this too much and force people to carefully quote every argument. It’s easy to start an escaping landslide where you keep doubling the number of escape characters needed to get through each level of interpolation.

Oil is the most promising next-generation shell, in my opinion. From a purist perspective it does treat functions and commands slightly differently, but as far as I can see this is done in a very well-thought-out way, where certain contexts take an expression instead of a command. This is best understood by reading this post on the Oil blog.

# the condition is an expression, not a command so it can have operators
# and variables without a `$` prefix.
if (x > 0) {
  echo "$x is positive"
}
# you can still run commands inside the condition
if /usr/bin/false {
  echo 'that is false'
}

Once you’ve split the capabilities of functions and commands, you might as well add a whole set of string-processing builtin functions that make grep, sed, cut, awk and friends unnecessary. Being able to trivially run a code block on any line that matches a regex would be excellent. Or being able to use code to specify a string substitution, rather than just a regex.3

There’s also a third dimension for any shell, and that’s how well it works as an actual interface to type things into. The syntax of the Oil ysh shell is better than ZSH, but in ZSH I can customise the prompt from hundreds of existing examples, I can use Vim keybindings to edit my command, I have syntax highlighting, I have integration with tools like fzf to find previous commands, and I have hundreds of lines of existing shell functions that help me get things done. And to top it all off, I can install ZSH on any machine from official package sources. Right now, it’s not worth it for me to switch over and lose these benefits.

  1. Which doesn’t seem to be something many people are interested in; we’re pretty invested in this Linux thing at this point. 

  2. Except for modifying variables and the environment of the shell process. 

  3. I know I can probably somehow do all this with awk. I know that anything is possible in awk. There are some lines I will not cross, and learning awk is one of them. 


Picking a Synology

One of the key characteristics you want from a backup system is reliability. You want to minimise the number of things that can fail, and reduce the impact of each failure for when they do happen. These are not characteristics that would be used to describe my original backup system:

a small computer sitting on a shoebox with an external HDD next to it, surrounded by a nest of cables

The first iteration of my backup system, running on my Scooter Computer via an external hard drive enclosure.

This setup pictured above evolved into a Raspberry Pi (featured unused in the bottom of that photo) with two external 4T hard drives connected to it. All my devices would back themselves up to one of the drives, and then rsnapshot would copy the contents of one drive across to the other, giving me the ability to look back at my data from a particular day. The cherry on top was a wee program1 that ran an HTTP server with a status page, showing the state of my backups:

screenshot of a webpage with a list of backup times in a table

My custom backup status page that told me whether I was still snapshotting my data or not.

Naturally, this system was incredibly reliable and never broke,2 but I decided to migrate it to a dedicated NAS device anyway. Synology is the obvious choice, they’ve got a wide range of devices, and a long track record of making decent reliable hardware.

With the amount of data that I’m working with (<4T) I could absolutely have gone with a 1-bay model. However this leaves no room for redundancy in case one disk fails, no room for expansion, and I already had two disks to donate to the cause. Two bays would have been a sensible choice, it would have allowed me to use both my existing disks and have redundancy if one failed. But it would have limited expansion, and once you’re going two bays you might as well go four… right? If I’m buying something to use for many years, having the ability to expand up to 64T of raw storage capacity is reassuring.

At the time that I was researching, Synology had three different four-bay models that I was interested in: the DS420+, DS418, and DS420j.

The DS420+ is the highest end model that doesn’t support additional drive expansion (there are some DS9xx+ models that have 4 internal bays and then allow you to expand more with eSATA). It runs an x86 chip, supports Btrfs, allows for NVMe flash cache, and can run Docker containers. It has hot-swappable drive bays and was released in 2020 (that’s the -20 suffix on the model name3).

The DS418 is the “value” model, it’s basically just the one they made in 2018 and kept around. It also runs an x86 chip, supports Btrfs, and can run Docker containers. It uses the same basic chassis as the DS420+, so also has hot-swappable drives.

The DS420j is the low-cost entry model, running a low-end ARM chip, no Btrfs support, no Docker, and a cheaper chassis with no hot-swappable drives.

Btrfs is a copy-on-write filesystem that never overwrites partial data. Each time part of a block is written, the whole block is re-written out to an unused part of the disk. This gives it the excellent feature of near-free snapshots. You can record some metadata of which blocks were used (or even just which blocks to use for the filesystem metadata) and with that you get a view into the exact state of the disk at that point in time, without having to store a second copy of the data. Using Btrfs would replace my existing use of rsnapshot, moving that feature from a userspace application to the filesystem.
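The trick is that a snapshot only needs to copy the block table, not the blocks themselves. A toy model of the idea (nothing like Btrfs’s actual on-disk structures, but the same principle):

```python
class CowDisk:
    # Toy copy-on-write store: writes always go to fresh space, so old
    # block versions survive until nothing references them.

    def __init__(self):
        self.blocks = []     # append-only "physical" storage
        self.table = {}      # logical block number -> physical index
        self.snapshots = {}  # snapshot name -> frozen copy of the table

    def write(self, logical, data):
        self.blocks.append(data)  # never overwrite in place
        self.table[logical] = len(self.blocks) - 1

    def snapshot(self, name):
        # O(size of table), not O(size of data): near-free snapshots.
        self.snapshots[name] = dict(self.table)

    def read(self, logical, snapshot=None):
        table = self.table if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[logical]]
```

This is why Btrfs snapshots could replace the rsnapshot step: a daily snapshot becomes a metadata operation rather than a pass over all the files.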

This had initially pointed me towards the DS420+ or DS418. My concern with the DS418 was that it was already over four years old. I didn’t want to buy a device that was bordering on halfway through its useful lifespan (before OS updates and other software support stopped). The cost of the DS418 was only a little less than the DS420+, so if I was going to spend DS418 money, I might as well get the DS420+.

The other feature of the DS418 and DS420+ was Docker support—you can run applications (or scripts) inside containers, instead of in the cursed Synology Linux environment. I wasn’t planning on running anything significant on the Synology itself, it was going to be used just for backup and archival storage. Anything that required compute power would run on my home server.

Eventually I decided that the advantages of Btrfs and Docker support were not enough to justify the ~$300 price premium when compared to the DS420j. I already knew and trusted rsnapshot to do the right thing, and I could put that money towards some additional storage. The DS420j is a more recent model, and gives me the most important feature, which is additional storage with minimal hassle.

I’ve had the DS420j for about three months now, it’s been running almost constantly the entire time, and my backup system has moved over to it entirely.

The first thing I realised when setting up the DS420j is that despite the OS being Linux-based, it does not embrace Linux conventions. Critically, it eschews the Linux permission model entirely and implements its own permissions, so every file has to be 777—world readable and writable—for the Synology bits to work. This has knock-on effects on the SSH, SFTP, and rsync features; any user that has access to these has access to the entire drive. Since I’m the only user on the Synology, I’m not that bothered by this. The only reason I’d want different users is to guarantee that different device backups couldn’t overwrite each other.

The best thing by far with the Synology is how much stuff is built in or available in the software centre. Setting up Tailscale connectivity, archives from cloud storage (eg Dropbox), and storage usage analysis was trivial.

The most difficult thing about moving to the Synology was working out how to actually move my data over. Archives of various bits were scattered across external hard drives, my laptop, and my RPi backup system. Since I was using the disks from the RPi in the Synology, I had to carefully sequence moving copies of my data between different disks as I added drives to the Synology (since it has to wipe a drive before it can be used).

During the migration, having USB 3 ports on the NAS was excellent; with the RPi I’d have been forced to copy things over the network using another computer, but now I can just plug drives directly in and transfer in much less time. An unexpected benefit was that I could use an SD card reader to dump video from GoPros directly onto the Synology (since I knew I wasn’t going to get around to editing it). This will probably come in handy if I want to actually pull anything off the Synology.

At the moment I’m using 4.1T of storage (most of that is snapshots of my backups). According to the SHR Calculator I can add two more 4T drives (replacing my 2T drive) to get 12T of usable space, or two 8T drives to get 16T. Since my photo library grows at about 400G per year, I think my expansion space in the DS420j will be sufficient for a long time.4

  1. The program was written in Crystal, and those in the know will be aware just how painful cross-compilation to ARM is! 

  2. It actually only broke once, when one of the disks failed to mount and all my data was spewed onto the mount point on the SD card, filling up the filesystem and grinding the whole thing to a halt. 

  3. Can you really trust your backups to a company that has a naming scheme that is going to break in a mere 77 years? 

  4. Until I get a Sony a7RV and the size of my raw photos almost triples. 


Why Crystal is the Best Language Ever

Crystal is a statically typed language with the syntax of a dynamically typed one. I first used Crystal in 2016—about version 0.20.0 or so. The type of projects I usually work on in my spare time are things like pod, or my server that posts photos to my photos website.

Type System

This is the main selling point of Crystal: you can write code that looks dynamically typed, but it’ll actually get fully type checked. The reality of this is that if I know the type and the method is part of a public interface (for me that’s usually just a method that I’m going to be calling from another file), I’ll put a type annotation there. That way I usually only have to chase down type errors in single files. If I’m extracting out a helper method, I won’t bother with types. You can see this in the code that I write:

private def calculate_update(config, container, remote) : ContainerUpdate
  ...

The three argument types are fairly obvious to anyone reading the code, and since the method is private the types are already constrained by the public method that uses this helper. If I wrote this in Java it would look something like:

private ContainerUpdate calculateUpdate(
  Config config, Container container, Optional<String> remote) {
  ...

There’s a spectrum between language type flexibility and language type safety. Dynamic languages are incredibly flexible; you can pass an object that just behaves like a different object and everything will probably work. The language gets out of your way—you don’t have to spend any time explaining to the compiler how things fit together—it’ll just run until something doesn’t work and then fail. Languages that boast incredible type safety (like Rust) require you to do a bunch of busywork so that they know the exact structure and capabilities of every piece of data before they’ll do anything with it. Crystal tries to bend this spectrum into a horseshoe and basically ends up with “static duck typing”—if it’s duck shaped at compile time, it will probably be able to quack at runtime.
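To make “static duck typing” concrete, here’s a minimal sketch (the type and method names are my own invention, not from any real codebase). The method takes anything with a `quack` method, and the compiler checks each call site at compile time:

```crystal
# No type restriction on the argument: the compiler checks at each
# call site that whatever is passed in responds to #quack.
def describe(duck)
  duck.quack
end

struct Mallard
  def quack : String
    "quack"
  end
end

struct RobotDuck
  def quack : String
    "beep"
  end
end

puts describe(Mallard.new)   # quack
puts describe(RobotDuck.new) # beep
```

Passing a type without a `quack` method would be rejected at compile time, not at runtime—that’s the horseshoe.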

It definitely takes some getting used to. The flow that I have settled on is writing code with the types that I know, and then seeing if the compiler can work everything out from there. Usually I’ll have made a few boring mistakes (something can be nil where I didn’t expect, for example), and I’ll either be able to work out where the source of the confusing type is, or I can just add some annotations through the call stack. Doing this puts a line in the sand of where the types can vary, making it easy to see where the type mismatch is introduced.
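As a small illustration of that flow (hypothetical method names, not from a real project): a lookup that can return `nil` gets a return type annotation, so any nil-related confusion is reported against that method rather than several calls up the stack.

```crystal
# The ? suffix makes the return type String | Nil explicit, so the
# compiler reports nil-related errors at this boundary.
def find_name(id : Int32) : String?
  {1 => "alpha", 2 => "beta"}[id]?
end

# Callers then have to handle the nil case before the compiler
# will let them treat the result as a String.
def display_name(id : Int32) : String
  find_name(id) || "unknownown".rchop("own")
end

puts display_name(1) # alpha
puts display_name(9) # unknown
```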

The Crystal compiler error trace can be really daunting, since it spits out a huge trace of the entire call stack, from where the argument is first passed to a function all the way to where it is used in a way it shouldn’t be. However, once you learn to scroll a bit, it’s not any harder than debugging a NoMethodError in Ruby. At the top of the stack you’ve got the method call that doesn’t work, and each layer below it is a place where the type was inferred.

This can get confusing as you get more layers of indirection—like the result of a method call from an argument being the wrong type to pass into a later function—but I don’t think this is any more confusing than the wrong-type failures that you can get in dynamic languages. Plus it’s happening before you even have to run the code.

A downside of Crystal’s type system is that the type inference is somewhat load-bearing. You can’t express the restrictions that the type system infers when you omit type annotations; the generics are not expressive enough. So very occasionally the answer to fixing a type error is to remove a type annotation and have the compiler work it out.

Standard Library

This is probably the thing that keeps me locked in to using Crystal. Since I’m reasonably familiar with the Ruby standard library, I was right at home using the Crystal standard library from day one. As well as being familiar, it’s also just really good.

Rust—by design I’m pretty sure—has a very limited standard library, so a lot of the common things that I’d want to do (HTTP client and server, data serialisation, for example) require third-party libraries. Since Crystal has a more “batteries included” standard library, it’s easier for my small projects to get off the ground without me having to find the right combinations of libraries to do everything I want.

API design is hard, and designing a language’s standard library is especially difficult, since you want to leave room for other applications or libraries to extend the existing functionality, or for the standard library types to work as an intermediary between multiple libraries that don’t have to be specifically integrated together. This is where I really appreciate the HTTP server and I/O APIs. The HTTP server in the standard library is really robust, but the HTTP::Handler abstraction means that you can fairly easily replace the server with another implementation, or libraries can provide their own handlers that plug into the existing HTTP::Server class.
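Here’s a minimal sketch of that handler abstraction, using the standard library’s `HTTP::Handler` module (the `VersionHandler` name and header are made up for illustration):

```crystal
require "http/server"

# A custom handler plugs into the chain by including HTTP::Handler,
# implementing #call, and passing the request on via call_next.
class VersionHandler
  include HTTP::Handler

  def call(context)
    context.response.headers["X-App-Version"] = "1.0"
    call_next(context)
  end
end

# Handlers compose into a chain in front of the final block.
server = HTTP::Server.new([
  HTTP::LogHandler.new,
  VersionHandler.new,
]) do |context|
  context.response.print "hello"
end

# server.bind_tcp 8080
# server.listen
```

Because the handler only depends on the `HTTP::Handler` interface, the same chain could sit in front of a different server implementation, or a library could ship its own handlers that slot straight in.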

The IO API is especially refreshing given how hard it is to read a file in Swift. It’s a great example of making the easy thing easy, while making the more correct thing neither wildly different nor much harder.

# Reading a file as a String is so easy:
contents = File.read(path)
# do something with contents
# And doing the more correct thing is just one change away:
File.open(path) do |file|
  # stream the file in and do something with it
end

And then since all input and output use the same IO interface, it’s just as easy to read from a File as it is to read from a TCPSocket.
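A quick sketch of what that uniformity buys you: a method written against `IO` works unchanged on an in-memory buffer, a `File`, or a `TCPSocket` (the `first_line` helper here is my own example, not a stdlib method):

```crystal
# Depends only on the IO interface, so any IO implementation works.
def first_line(io : IO) : String?
  io.gets
end

# An in-memory IO for testing...
io = IO::Memory.new("one\ntwo\n")
puts first_line(io) # one

# ...and the same method works on a File or a TCPSocket:
# File.open("data.txt") { |f| first_line(f) }
# TCPSocket.open("example.com", 80) { |s| first_line(s) }
```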

There is definitely a broader theme here; Crystal is designed with the understanding that getting developers to write 100% perfect code 100% of the time is not a good goal. You’re going to want to prototype and you’re going to want to hack, and if you’re forced to make your prototype fully production-ready from the get-go, you’ll just end up wasting time fighting with your tools.

Scaling

I wrote back in 20171 thinking about how well different languages scaled from being used for a small script to being used for a large application. At that point I was still hoping that Swift would become the perfect language, but over five years later that hasn’t quite happened.

The design of Crystal sadly almost guarantees that it cannot succeed in being used by large teams on a huge codebase. Monkey-patching, macros, a lack of isolated modules, and compile times make it a poor choice for more than a small team.

Although I remain hopeful that in 10 years developers will have realised that repeatedly writing out type annotations is a waste of time, and perhaps we’ll have some kind of hybrid approach. What about only requiring type annotations for public methods—private methods are fair game? Or enforce that with a pre-merge check, so that developers are free to hack around in the code as they’re making a feature, and then batten down their types when the code is production ready.

Flexibility

I’m of the opinion that no piece of syntax should be tied in to a specific type in the language. In Java, the only things that can be subscripted are arrays—despite everyone learning at university that you should always use List instead. This limits how much a new type can integrate into the language—everything in Java basically just ends up being a method call, even if an existing piece of syntax (like subscript, property access, operator, etc) would be neater.

Pretty much everything in Crystal is implemented as a special method:

struct MyType
  def [](key)
    ...
  end

  def property=(value)
    ...
  end
end

There are no special types that have access to dedicated syntax (except maybe nil, but that is somewhat special), so you can write a replacement for Array and have it look just like the builtin class. Being able to override operators and add methods to existing classes allows things like 4.hours + 5.minutes which will give you a Time::Span of 4:05. If you did this in Java2 you’d have something like this, which absolutely does not spark joy:

Duration.ofHours(4).plus(Duration.ofMinutes(5))
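Here’s a sketch of how that style is built in Crystal—an invented `Money` type that hooks into the `+` and `*` operators, plus a reopened `Int32` to get the `4.dollars` reading (all names here are hypothetical, not stdlib):

```crystal
# Operators are just methods, so a user type reads like a builtin.
struct Money
  getter cents : Int64

  def initialize(@cents : Int64)
  end

  def +(other : Money) : Money
    Money.new(cents + other.cents)
  end

  def *(factor : Int32) : Money
    Money.new(cents * factor)
  end
end

# Reopen the builtin Int32 to add a convenience constructor,
# the same trick the stdlib uses for 4.hours.
struct Int32
  def dollars : Money
    Money.new(self.to_i64 * 100)
  end
end

total = 4.dollars + 2.dollars * 3
puts total.cents # 1000
```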

Safety

While Crystal’s type system is game-changing, it doesn’t break the status quo in other ways. It has no (im)mutability guarantees, and has no data ownership semantics. I think this is down to the design goal of “Ruby, but fast and type checked”. Ruby has neither of those features, and so neither does Crystal.

An interesting thought is what a future language would look like if it tried to do for data ownership what Crystal has done for type checking. The state of the art in this area seems to be Rust and Pony, although it seems like these are not easy to use or understand (based on how many people ask on Stack Overflow why the borrow checker is complaining). A hypothetical new language could have reference capabilities like Pony does, but have them be inferred from how the data is used.

Macros

Every language needs macros. Even Swift (on a rampage to add every language feature under the sun) is adding them. Being able to generate boring boilerplate means developers can spend less time writing boring boilerplate, and reduces the chance that a developer makes a mistake writing boring boilerplate because they were bored. If my compiled language can’t auto-implement serialisation in different formats (JSON, YAML, MessagePack) then what’s even the point of having a compiler?
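The JSON case is a good example of what I mean—the stdlib’s `JSON::Serializable` module generates the serialisation boilerplate at compile time (the `Point` type here is just an illustration):

```crystal
require "json"

# Including JSON::Serializable macro-generates to_json and
# from_json from the declared instance variables.
struct Point
  include JSON::Serializable

  getter x : Int32
  getter y : Int32

  def initialize(@x, @y)
  end
end

point = Point.from_json(%({"x": 1, "y": 2}))
puts point.x       # 1
puts point.to_json # {"x":1,"y":2}
```

The same pattern works for YAML via `YAML::Serializable`, and third-party shards follow it for other formats.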

It’s a shame that Crystal’s macros are a bit… weird. The macro language is not quite the full Crystal language, and you’re basically just generating text that is fed back into the compiler (rather than generating a syntax tree). Crystal macros are absolutely weak-sauce compared to macros in Lisp or Elixir—but those languages have the advantage of a more limited syntax (especially in the case of Lisp) which does make their job easier.

Crystal macros require a fairly good understanding of how to hack the type system to get what you want. I have often found that the naive approach to a macro would be completely impossible—or at least impractical—but if you flip the approach (usually by using macro hooks) you can leverage the flexible type system to produce working code.

The current macros are good enough to fit the use cases that I usually have, and further improvements would definitely be in the realm of “quality of life” or “academically interesting”. You can always just fall back to running an external program in your macro, which gives you the freedom to do whatever you want.
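For a flavour of the “generating text fed back into the compiler” model, here’s a small sketch—a macro that stamps out predicate methods from a list of names (the `Status` class and `predicates` macro are invented for this example):

```crystal
class Status
  # For each name, emit a method definition as source text that
  # gets parsed back into the program.
  macro predicates(*names)
    {% for name in names %}
      def {{name.id}}? : Bool
        @state == {{name.stringify}}
      end
    {% end %}
  end

  predicates open, closed

  def initialize(@state : String)
  end
end

puts Status.new("open").open?   # true
puts Status.new("open").closed? # false
```

The `{% %}` and `{{ }}` macro language looks like Crystal but isn’t quite—which is exactly the weirdness described above.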

The Bottom Line

Back in my uni days there would be a new language each week that I was convinced was the future—notable entries include Clojure, Elixir, Haskell, Kotlin, and Go. There are aspects to all these languages that I still like, but each of them has some fairly significant drawback that keeps me from using it3. At the end of the day, when I create a new project it’s always in Crystal.

Other languages are interesting, but I’m yet to see something that will improve my experience working on my own small projects. Writing out interface definitions to appease a compiler sounds just as unappealing as having my program crash at runtime due to a simple mistake.

  1. I’d only dabbled in Crystal for less than a year at this point, and was yet to realise that it was the best language ever. 

  2. After researching for hours which library was the correct one to use. 

  3. Really slow edit/build/run cycle, process-oriented model gets in the way for simple projects, I just don’t think I’m a monad type of guy, experience developing outside of an IDE is bad, lacking basic language features.