neilotoole.io

Web presence of Neil O'Toole.

sq passes 500 stars

It’s been a few months since I posted about sq, but a lot has been going on. We’re now on sq v0.48.3, and the project has passed 500 stars ⭐️ on GitHub.

The changes since the last sq post are too numerous to list here (see the extensive CHANGELOG for details), but some highlights include:

  • Large performance improvements to sq diff.
  • Caching mechanism for ingested data sources (such as CSV, JSON, etc.).
  • Significantly improved Excel support.
  • A progress bar implementation, particularly important for long-running operations such as ingesting large document sources.

sq_inspect_remote_s3.png

streamcache

neilotoole/streamcache is a package I’ve had loitering in the backlog for some time. I finally polished it off for use with sq, to handle streaming from stdin or downloads. streamcache implements a Go in-memory byte cache mechanism that allows multiple callers to read some or all of the contents of a source io.Reader, while only reading from the source reader once. When only the final reader remains, the cache is discarded and the final reader reads directly from the source. This is particularly useful for scenarios where multiple readers may wish to sample the start of a stream, but only one reader will read the entire stream.

streamcache_multicase.png

sq package for Void Linux

Thanks to the efforts of Void Linux contributor @icp, there is now a sq package for Void Linux’s XBPS package manager.

# Update to get latest repo info
$ xbps-install -Syu;  xbps-install -yu xbps

# Install sq
$ xbps-install -yu sq

# Run sq
$ sq version
sq v0.42.1

Or, just use the install script, which detects Void Linux and calls xbps-install under the covers:

$ /bin/sh -c "$(curl -fsSL https://sq.io/install.sh)"

tparse appreciation

If you asked me to describe my (Go) developer toolchain essentials, I’d probably rattle off… macOS, zsh, git & GitHub, Docker, Kubernetes & kubectl, GoLand, golangci-lint, and ngrok?

But sometimes you don’t appreciate things until they’re gone.

If you practice a tight feedback loop, as I very much do, you could be running go test dozens or hundreds of times a day. Overall, I find the Go testing framework to be lovely to work with, and I consider it an integral part of the platform’s success. However, I feel that the default output format of go test leaves a little to be desired. In large test suites, e.g. as might be seen in a sq test run, it’s burdensome to find test failures in the wall of output.

The -json format is even weaker: it’s just barely structured, and feels like a low-effort addition to make the output technically JSON. But for both of these formats, there’s already a whole tool ecosystem that depends on the current output format details, so there’s little expectation of change.

Enter tparse: it’s a wrapper around go test, and I find its output much more suited to my local dev workflow. Give it a try.

tparse

The trigger for this post is that the most recent release of tparse panics on my sq test run. I’ve filed a bug report, but for now I’ve had to roll back to an older version of tparse. Hopefully that’ll be patched up soon, but in the meantime: I 😍 you, tparse. Come back to me.

UPDATE 2023-11-18: The tparse author has fixed the bug, and released v0.13.2, resolving the issue. Huzzah!

sq v0.41.0

sq v0.41.0 has been published. This release focuses heavily on improvements to Excel support, including:

  • An under-the-hood rewrite, and switch to a new backing Excel driver.
  • The ingest process can now handle duplicate column header names, per #99.
  • And it will now also auto-detect header rows, like CSV has done for ages.
  • And a bunch more minor improvements, including better ingest performance.

Download via sq.io.

sq v0.40.0

sq v0.40.0 is a major release, with a completely rewritten join mechanism. Previously, only a single join was supported, but now you can chain as many as desired.

$ sq '.actor | join(.film_actor, .actor_id) | join(.film, .film_id) | .first_name, .last_name, .title'

There’s also support for the full family of join types (inner, left, right, cross, etc.).

Download via sq.io.

shelleditor

shelleditor is a tiny package I’ve put together to call out to the system editor from within a Go CLI. Like most everybody in the Kubernetes ecosystem, I spend a lot of time with the kubectl CLI, and more time than I’d like doing kubectl edit X, to directly edit k8s config for debugging.

Recently, I’ve been working on a complete overhaul of sq’s config mechanism. I may have gone a little overboard.

sq is highly configurable, via flags, via its config file, and via sq config set.

However, when making changes to a sq source, who can remember all the flags? Yes, there’s shell completion for those flags, but there’s a lot of flags. Wouldn’t it be easier to be able to see all the available flags in front of you, and edit them as desired? Kinda like what kubectl edit does?

Enter sq config edit:

sq config edit

That idea’s inception led down a deep rabbit hole, worthy of its own post. But for the purposes of this post, I had an immediate problem to solve: how to call out to the system editor, e.g. vi. I found a standalone package, tj/go-editor, but it didn’t seem to be actively maintained, and anyhow, I was curious to see how kubectl did it. After eventually figuring out how to navigate the Kubernetes source code, I decided I may as well just lift the kubectl edit implementation from them. Alas, a simple import pulled in an ungodly amount of Kubernetes dependencies. So, I extracted the essentials, and the shelleditor package was born.

slogt

Go 1.21 has a new structured logger in stdlib: slog. I’ve been using the slog pre-releases in my projects for many months. While there’s still a bit of a debate about the merits of its API, slog checks enough boxes that it will become the standard. This eliminates a major hassle for Go developers: needing to add bridges to various logging packages in order to get a unified logging experience.

It also means I can formally retire my lg experiment.

One aspect that the Go team have strangely neglected is that there’s no bridge between slog and the stdlib testing package. I created the slogt package as that bridge. Hopefully something like this will one day be incorporated into stdlib, and I can then retire slogt.

UPDATE 2024-03-15: I was recently looking through the docs for riverqueue, which is a Postgres-backed job queue for Go that I was investigating, and I saw this:

riverqueue_slogt.png

It’s gratifying to see that I’m not the only one who cares about correlating logs with tests!

asciinema

I’ve recently been using asciinema on the sq.io docs site.

I found this Hugo module that makes it trivial to integrate asciinema.

{{< asciicast src="/telescope-repo-nvim/telescope.json" poster="npt:0:04" autoPlay=true loop=true theme="nord" >}}

One minor shortcoming was that it wasn’t possible to set default values, e.g. setting theme="nord" throughout the site. One quick PR later, the author took care of that.

Very much relatedly, I wanted the Asciinema player to use the Nord theme. I ported Nord to an asciinema theme. It is mighty pretty.

asciinema Nord Theme

However, it still requires a minor bit of config to use the theme. I’ve opened an asciinema player PR to bundle the Nord theme with the player. It’s only a tiny additional bit of CSS.

UPDATE 2023-04-15: That PR was merged to asciinema, so you should shortly be able to use the Nord theme directly from the asciinema distro.

jsoncolor

One of the benefits of using jq is that it colorizes & prettifies JSON output, making it much easier to parse visually. I wanted to do the same for sq, which is written in Go. Looking around the existing libraries landscape, I found several colorizing encoders, but each had shortcomings.

So, I created neilotoole/jsoncolor, which ticks all the boxes. It’s fast (faster than stdlib in many cases), configurable, and is a drop-in replacement for the stdlib json package. I’ve already been using it in CLIs on client projects, and it works well.

Colorization is configurable. This is the color scheme I chose for sq, mimicking jq’s color palette.

jsoncolor output

errgroup

I’m a big fan of Go’s sync/errgroup package (golang.org/x/sync/errgroup: not strictly stdlib, but close enough that I’ll call it that here). The errgroup.Group type makes it trivial to manage multiple worker goroutines that all need to succeed, and often obviates the need for sync.WaitGroup.

However, both in personal and client projects, I regularly ran into the issue where I wanted to rate-limit the number of simultaneous goroutines. An example would be where the goroutines are calling a backend API (or DB) that itself has rate limits.

I created the neilotoole/errgroup package as a drop-in replacement for stdlib errgroup. You use it like so:

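package main

import (
  "context"

  "github.com/neilotoole/errgroup"
)

func main() {
  numG, qSize := 8, 4
  g, _ := errgroup.WithContextN(context.Background(), numG, qSize)
  g.Go(func() error {
    // do something
    return nil
  })
  // Wait blocks until every goroutine returns, yielding the first non-nil error.
  if err := g.Wait(); err != nil {
    panic(err)
  }
}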

The numG and qSize params control the gating behavior of the Group. In benchmarks, this implementation often outperforms stdlib errgroup, but obviously you should run your own benchmarks for your specific workload.

UPDATE 2023-01-10: The Go team have updated sync/errgroup with a Group.SetLimit method, which eliminates the need for my implementation (and their implementation is superior too). You should use sync/errgroup instead; I’m happily retiring neilotoole/errgroup.

xcgo

xcgo is a maximalist Docker image for cross-compiling and releasing/distributing CGo-enabled Go/Golang applications. At this time, it can build and distribute macOS, Windows and Linux CGo projects for arch amd64.

xcgo has what gophers crave:

  • go 1.14
  • OSX SDK Catalina / macOS 10.15
  • docker
  • snapcraft
  • goreleaser
  • golangci-lint
  • mage
  • zsh and oh-my-zsh
  • and a bunch of other stuff.

Start at the neilotoole/xcgo repo, and then head over to the wiki. The neilotoole/xcgo images are published to Docker Hub.

There’s also a companion example project (neilotoole/sqlitr) that was created explicitly to exhibit xcgo: it demonstrates pretty much the entire array of xcgo’s capabilities, showing how to release to brew, scoop, snap, Docker Hub, GitHub, etc.

UPDATE 2023-09-22: xcgo was created as a toolchain element for sq, with its many target architectures. At the time, there really wasn’t anything else that covered all the bases. However, recent releases of GoReleaser plus GitHub workflows are now able to satisfy the needs of sq, and so off to the farm with xcgo. It’s still a totally viable project, but it requires a decent chunk of work to add Apple Silicon support, and ongoing work for each Go, OS, and misc tool release. That’s time I’d rather devote to sq.

lg v0.2

The first version of the lg package, released years ago, was one of those rites of passage for Go devs: everybody needs to do their own logger implementation at least once. This v0.2 release has a legitimate purpose: it is an exploration of a small, leveled, unstructured logging interface for enterprise applications. lg delegates the actual log entry creation to backing libs (such as uber/zap), explores some idioms (log.WarnIfFuncError), and plays nicely with testing.T.

gohdoc

gohdoc is a command line tool that opens a package’s godoc in the browser. Use like this:

$ gohdoc .               # open current dir godoc in the browser
$ gohdoc fmt             # open pkg fmt
$ gohdoc encoding/jso    # will open pkg encoding/json

This feels like one of those things that should be included in the Go toolchain. See the repo for more.

sq v0.5.0

Back in the sq saddle after some downtime: sq v0.5.0 is now available. Finally there’s conditional selects (basically, the SQL WHERE clause), and a number of other improvements. Download via sq.io.

sq v0.4.0

sq v0.4.0 is out. There’s a bunch of additional output formats in this release: Excel, CSV, TSV, and XML. Download via sq.io.

sq v0.3.0

sq v0.3.0 is done. The major new feature is the addition of the sq inspect command, which provides metadata about the datasource (schema, tables, cols, etc). Download via sq.io.

sq v0.2.1

sq v0.2.1 is now available. There’s a bunch of bug fixes, but the big new feature is cross-datasource joins (you can join an Excel spreadsheet with a MySQL DB, etc.) Download via sq.io.

sq v0.1.11

sq v0.1.11 is now available. This is a bug-fix release. Download via sq.io.

techo testing lib

I’ve just released techo, an Echo-based alternative to Go’s net/http/httptest package. The genesis was trying to write automated tests for generated (swagger-codegen) code, where I sometimes found stdlib httptest to be tedious and verbose. The value of this library is that tests written with techo are cleaner and more expressive.

Swagger Codegen Core Team

In my role at HPE, I’ve been doing a lot of work with Swagger, in particular using the swagger-codegen tool to generate Go clients from our REST API’s swagger document. When I first started working with codegen, the Go generator was pretty minimal (and in beta), and since then I’ve pushed several PRs to the project. I’ll now be doing that in an official capacity: I’ve been invited to be part of the Swagger Codegen Core Team, specifically to look after the Go client generator.

go-github auth API

I’ve been working a lot with the GitHub API of late, for the Go-based client I’m building for HPE. One of the key elements for us (and for everybody else I assume) is the GitHub auth infrastructure. So far, the go-github lib has served us well, except that, for whatever reason, the entire auth API was not implemented. From my perspective, that would have been the first thing I implemented, but that was the situation.

elastic_email module for Drupal

I recently put together a small Drupal module that interfaces with Elastic Email, a mail relay service. That is, instead of your website sending mail via its own SMTP server, outgoing email is directed through the Elastic Email service and out onto the internet. This module provides plug n’ play integration with the Elastic Email service.

Teradata Viewpoint Rewind

Believe me, this was not an easy thing to build, especially considering we’re supporting IE6. But I’m not shy about saying that this is one of the sweetest features I’ve ever seen built in-browser (and not using any Flash nonsense). It’s ahead of its time.

PHP static global caching

I’ve dug into a little PHP/Drupal work of late… not a huge fan of this ecosystem as it currently stands. It seems there’s a lot of simple things that should be done, that just aren’t done. This is my little contribution to that: a simple PHP global static caching library.

If you’ve read this far, you can read the rest over here.

RMI Interruptus! Or how to interrupt RMI method calls

Once upon a time, I was working on the assignment for the SCJD certification. The project was a classic client-server application, with an RMI server, and a Swing-based GUI, which you could use to book hotel rooms or some such. One of the project requirements was to provide record locking, such that only one instance of the client could edit the hotel room entry at once. Fairly standard stuff. So, when the user clicks Edit Room in the client GUI, the client makes an RMI call to acquireLock, and when that method returns, the client can then call update, and so on. But if another client already has the lock, then the acquireLock method will block until the lock becomes available. In that case, the app shows a pleasant dialog like this:

RMI demo app

Obviously you, as the user, would like to be able to click Cancel. And obviously you, as the developer, would like the RMI thread that’s stuck in the remote acquireLock method to return promptly. But you, as the developer, would be sorely disappointed, because the RMI thread will continue to block. Indefinitely! And from this outrage, thus was born the Interruptible RMI library.

JCP Expert Group for JSR-261: WS-Addressing

My years toiling away on web services seem to be paying off: I’ve been invited to be a member of the JCP Expert Group for JSR-261 (Web Services Addressing). A primer: JCP is the Java Community Process: it’s the mechanism for developing standard technical specifications for Java technology. Basically, whatever is decided upon in a JSR goes into the next version of Java. JSR is a Java Specification Request: the formal documents that describe proposed specifications and technologies for adding to the Java platform.

Committer at Apache Software Foundation

For the best part of the last year I’ve been an active contributor to some of the Apache Commons projects (specifically Collections and Lang). I’m now an official committer at the ASF, with write access to the VCS and whatnot. And a shiny @apache.org email address.

Front page of the WSJ

OK, so WSJ is Web Services Journal and not the Wall Street Journal, but still. This was a nice little surprise: even though I left Cape Clear a few months ago to move stateside, an article I submitted to Web Services Journal just got published in their print edition. Front page and all. The checkout guy at the Borders (an American bookstore) asked me why I was buying so many copies of the magazine, and then shook my hand and told me he’d “never met an author before”. My mother is very proud.

Return to Deutschland for JAX 2002 conference

A little bit of déjà vu. It feels like I was in Germany speaking at a conference only recently… This time I was in Frankfurt, at JAX2002 (JAX being Java Apache XML). I gave a talk on Web Services and the WSDL Design View. It seemed to go down fairly well, although one of the (German) attendees told me afterwards that he had a bit of trouble with the accent. The feeling was mutual, mein Freund!

Visual Studio .NET launch conference

Another trip to Germany, this time for Microsoft’s Visual Studio .NET launch event, in Karlsruhe. I gave a joint talk with my buddy Darius (from Microsoft) on Java/.NET web services interoperability. It was a bit nerve-racking: we were on stage in front of probably close to a thousand devs, and I could barely see the audience with the klieg lights pointed down at us. I’ll tell you this much, either the Microsoft events people have way too much money on their hands, or else they just plain love to have a good time. It was quite the show. And the .NET tools are pretty decent compared to what’s available in the Java world at the moment.

Munich: WS interoperability

This is definitely not the worst thing about working for Cape Clear… I just came back from a trip to Munich to speak at the OOP2002 conference, where I was a member of a panel discussing Web Services interoperability. Unfortunately they don’t seem to have any material from the conference online, so I’ll spare you the suspense re interoperability. The conclusion is: we’ll get there eventually!

Other conclusions: Munich is wonderful, the food is wonderful, the beer is wonderful. I could get used to that place.