Update 2020/10/08: Since writing this, the architecture of Attics has changed considerably. It now uses serverless Go functions deployed on AWS which I have found much easier to maintain. Look for a upcoming post explaining this new architecture!

I recently rewrote the API that serves Attics its data from a small Rails app to Go. With this transition, I added a simple logging middleware that runs on every request to the API.

func (app *App) logRequest(h http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		h(w, r)
		app.logger.Printf("%s %s\n", r.Method, r.URL.Path)
	}
}

In production, this produces logs that look like

$ docker-compose logs attics_api | tail

attics_api_1  | 2019/08/17 16:24:57 GET /v1.1/GratefulDead/top_shows
attics_api_1  | 2019/08/17 16:25:00 GET /v1.1/GratefulDead/1969/1969-12-31
attics_api_1  | 2019/08/17 16:25:02 GET /v1.1/sources/gd69-12-31.sbd.gardner.7373.sbeok.shnf
attics_api_1  | 2019/08/17 16:31:24 GET /v1.1/GratefulDead/top_shows
attics_api_1  | 2019/08/17 16:31:31 GET /v1.1/GratefulDead/1965/1965-11-01
attics_api_1  | 2019/08/17 16:31:39 GET /v1.1/GratefulDead/1969/1969-11-02
attics_api_1  | 2019/08/17 16:31:41 GET /v1.1/sources/gd69-11-02.sbd.goodbear.1125.sbefail.shnf
attics_api_1  | 2019/08/17 16:37:09 GET /v1.1/GratefulDead/top_shows
attics_api_1  | 2019/08/17 16:37:15 GET /v1.1/GratefulDead/1989/1989-10-09
attics_api_1  | 2019/08/17 16:37:16 GET /v1.1/sources/gd89-10-09.sbd.serafin.7721.sbeok.shnf

The latest update to Attics which moved to this API hasn’t even been out a week yet, and it’s already received thousands of requests!

$ docker-compose logs attics_api | wc -l
5478

I’m curious which shows people are listening to, so let’s use some shell scripting to count the number of times each show has been visited. The log for a visit to the endpoint for getting the songs for a source (Archive speak for a recording) looks like

attics_api_1  | 2019/08/17 21:08:44 GET /v1.1/sources/gd79-10-27.sbd.clugston.13980.sbeok.shnf

Let’s get all the lines like this using grep.

$ docker-compose logs attics_api | grep 'sources'

attics_api_1  | 2019/08/17 21:19:20 GET /v1.1/sources/gd1992-06-11.sbd.miller.90105.sbeok.flac16
attics_api_1  | 2019/08/17 21:20:35 GET /v1.1/sources/gd1975-06-17.aud.unknown.87560.flac16
attics_api_1  | 2019/08/17 21:20:45 GET /v1.1/sources/gd75-08-13.fm.vernon.23661.sbeok.shnf
attics_api_1  | 2019/08/17 21:20:56 GET /v1.1/sources/gd76-06-09.set2-sbd.gardner.5426.sbeok.shnf
attics_api_1  | 2019/08/17 21:21:11 GET /v1.1/sources/gd73-02-09.sbd.bertha-fink.14939.sbeok.shnf
...

Every source has the date in its identifier, either in the form XXXX-XX-XX or XX-XX-XX. The latter will match all the cases of the former, so let’s use grep again and search for the latter pattern.

$ docker-compose logs attics_api \
    | grep 'sources' \
    | grep -o -E '[0-9]{2}-[0-9]{2}-[0-9]{2}'

92-06-12
92-06-17
92-06-18
72-05-03
92-06-11
75-06-17
75-08-13
76-06-09
73-02-09
...

The -o switch tells grep to print only the text in the line that matches the pattern, and -E allows us to use the {2} syntax.

Now we need to get a count of how many time each date appears. Luckily, the uniq tool can do this with the -c flag. However, uniq expects all the unique lines to be adjacent, so that for example each occurrence of 92-06-12 needs to be grouped together. We can easily do this with sort.

$ docker-compose logs attics_api \
    | grep 'sources' \
    | grep -o -E '[0-9]{2}-[0-9]{2}-[0-9]{2}' \
    | sort \
    | uniq -c

1 95-06-04
5 95-06-18
1 95-06-19
3 95-06-22
2 95-06-24
1 95-06-25
1 95-06-27
1 95-06-28
12 95-06-30
...

Perfect! Now we can get the most visited shows by sorting this list numerically and reversing it with the -g and -r flags respectively.

$ docker-compose logs attics_api \
    | grep 'sources' \
    | grep -o -E '[0-9]{2}-[0-9]{2}-[0-9]{2}' \
    | sort \
    | uniq -c \
    | sort -g -r \
    | head

39 77-05-08
37 69-08-16
29 91-06-17
28 89-07-07
27 71-08-06
26 65-11-03
24 76-06-09
23 87-09-18
22 82-10-10
22 80-05-16

And we’re done. There are some classics here like 77-05-08 and 71-08-06, but also quite a few I personally haven’t listened to, so I have some catching up to do!

Unix tools are great for quickly analyzing text like this. Knowing your way around the basic tools like grep can get you far alone, and if you get stuck, the man pages are always there to help.