
Planestream - the ADS-B datasource

The background behind our demo dataset and initial inspiration for Eventador


by Kenny Gorman, Founder and CEO
   23 Oct 2017

When we started Eventador.io in 2016 we needed a simple data source to help us build the platform. We needed something that exemplified streaming data, something massively dynamic, and something with a lot of data. Tweets were played out; we wanted something better.

My Co-Founder, Erik Beebe, stepped up with the answer, quipping “You know, most all aircraft send out data…”. If you know anything about ADS-B, then you know where we are going with this. We had found our simple data source.

The brief explanation: aircraft emit radio signals with data about their flight. Automatic Dependent Surveillance – Broadcast (ADS-B) is a standardized data format that allows data to be transmitted in real time from aircraft to ground, and from aircraft to aircraft. Aircraft transmit and receive this data as a constant stream, broadcasting once a second.

You can capture this stream of data using inexpensive Software Defined Radio (SDR) components, emit it to a streaming application engine (like Eventador.io), and do all sorts of amazing stuff with it. We use this data in-house for examples, load testing, demos, and all sorts of things. As we write new products and components for our system, all of them get a workout with ‘planestream’ data.

Radio hacking for fun and datasource

Both Erik and I have receivers set up at home, as well as one at the office. This covers the Austin skyline and landings into Austin-Bergstrom International Airport pretty well. Erik’s antenna is something to behold; he’s nuts. Even some of our investors are now playing with this type of data. If you are a true data nerd, it’s hard to look away.

My RTL-SDR receiver setup

It’s worth noting that ADS-B is ubiquitous in the US, but the EU, Asia, and other regions aren’t yet cut over to ADS-B across the board, so don’t expect much traffic if you are in, say, Poland or Australia.

I thought I would share a bit about how we initially capture the data in this post. The goal is to get up and running and get the data into Kafka. In follow-on posts we will detail other fun things we do with the data once it’s in Kafka. It should be said that this is ADS-B Out data; we are only reading the data and visualizing it. This is read-only in terms of radio hacking.

Digging in

If you weren’t aware, and I wasn’t until Erik enlightened me, there are a number of tools for gathering this data. All of them are low cost or free, and easy to configure and get running. At a high level the components are:

  • The receiver: a software-defined radio (SDR) and antenna acting as an ADS-B receiver (RTL-SDR)
  • The data: a utility to translate the radio signal into something more usable (dump1090)
  • Producing: a utility to take that data and feed it into Eventador.io Kafka (or any Kafka endpoint, really)

The receiver: RTL-SDR

The first thing you will need is an SDR - both hardware and software. There are a number of inexpensive hardware components and open source software to interface with them. It can be a dedicated Raspberry Pi, or it can simply be running on your computer with a USB antenna. Any RTL-SDR device will do.

I am using an RTL2832U dongle (Amazon) and running the RTL-SDR software on OSX. Installing and testing it looks something like this:

brew install cmake
brew install libusb
brew install pkgconfig
brew install sox
git clone git://git.osmocom.org/rtl-sdr.git
cd rtl-sdr/
mkdir build
cd build/
cmake ../
make
sudo make install

Now let’s test it:

/usr/local/bin/rtl_test -t

Found 1 device(s):
  0:  Realtek, RTL2838UHIDIR, SN: 00000001

Using device 0: Generic RTL2832U OEM
Found Rafael Micro R820T tuner
Supported gain values (29): 0.0 0.9 1.4 2.7 3.7 7.7 8.7 12.5 14.4 15.7 16.6 19.7 20.7 22.9 25.4 28.0 29.7 32.8 33.8 36.4 37.2 38.6 40.2 42.1 43.4 43.9 44.5 48.0 49.6
[R82XX] PLL not locked!
Sampling at 2048000 S/s.
No E4000 tuner found, aborting.

Perfect.

It’s also worth noting that FlightAware has built a service on this data (among other sources). In fact, you can contribute to its dataset by purchasing an approved hardware device and participating in their network. You can build an ADS-B receiver with their parts or your own.

Usable data: dump1090

Now we need to make sense of the radio signals. We will use a Mode S decoder for RTL-SDR devices called ‘dump1090’. Its name derives from the fact that Mode S messages (ADS-B data) are transmitted on 1090 MHz (for commercial aircraft, at least). This utility decodes the radio messages into usable text that we can pipe to Kafka. Since we use JSON in most of our examples and datasets, we have modified dump1090 to output JSON to STDOUT. So use our fork of dump1090 (as shown below).

We take the output of dump1090 and pipe it through jq then pipe it into Kafka via kafkacat. If you aren’t familiar with these tools, here is a primer.
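As a preview of that jq step, here is a minimal Python sketch of the flattening it performs, assuming dump1090 emits an aggregate JSON object with an “aircraft” array (the shape the jq filter expects). The sample object below is hypothetical:

```python
import json

def flatten_aircraft(raw_line):
    # One dump1090 aggregate object (a dict with an "aircraft" array)
    # becomes one compact JSON string per aircraft, mirroring the
    # jq filter '"\(.aircraft[])"'.
    doc = json.loads(raw_line)
    return [json.dumps(plane, separators=(",", ":")) for plane in doc.get("aircraft", [])]

# Hypothetical sample in the aggregate shape the jq filter expects
sample = '{"now":1508774400,"aircraft":[{"hex":"a71283","altitude":22000},{"hex":"a24b56","messages":38}]}'
for line in flatten_aircraft(sample):
    print(line)
```

Each printed line is one per-aircraft message, ready to pipe onward.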

First let’s install the software:

git clone git@github.com:Eventador/dump1090.git
cd dump1090
make
brew install kafkacat jq

Now let’s look at some data:

./dump1090 --write-json-stdout | jq -r '"\(.aircraft[])"' 2>/dev/null
{"hex":"a71283","lat":30.282184,"lon":-97.642529,"nucp":5,"seen_pos":13.4,"altitude":22000,"vert_rate":0,"track":89,"speed":279}
{"hex":"a24b56","mlat":[],"tisb":[],"messages":38,"seen":134.2,"rssi":-24.9}
{"hex":"a1ca1d","mlat":[],"tisb":[],"messages":12,"seen":214.6,"rssi":-25.7}
{"hex":"a50c7c","altitude":3500,"mlat":[],"tisb":[],"messages":121,"seen":4.6,"rssi":-25.5}

Each JSON-formatted message is a little bit of data being sent from an aircraft. Aircraft send data once a second (1 Hz). Each message will likely contain a subset of all of the possible keys. The unique key for the aircraft is the ICAO number, represented by the “hex” key; it uniquely identifies the aircraft. The jq utility helps us parse out the keys we want and emits each result, one by one, to STDOUT.
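Because any single message carries only some of the keys, downstream code typically folds messages into a latest-known state per aircraft, keyed on “hex”. This is a minimal sketch, not part of the pipeline itself; the sample messages echo the output shown above:

```python
import json

def fold_messages(lines):
    # Merge partial JSON messages into one state dict per aircraft,
    # keyed by the ICAO "hex" identifier; newer values overwrite older ones.
    state = {}
    for line in lines:
        msg = json.loads(line)
        state.setdefault(msg["hex"], {}).update(msg)
    return state

messages = [
    '{"hex":"a71283","lat":30.282184,"lon":-97.642529,"altitude":22000}',
    '{"hex":"a71283","altitude":22100,"speed":280}',
    '{"hex":"a50c7c","altitude":3500}',
]
state = fold_messages(messages)
print(state["a71283"])  # lat/lon retained, altitude updated to 22100
```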

Producing JSON to Kafka

Lastly, let’s produce the data to Kafka. You could write a program to wrap this into a simple client, but it’s nice to see the discrete steps on the command line. In this example we use a topic named “adsb_test” and a hypothetical server named mykafkaserver on the default port:

./dump1090 --write-json-stdout | jq -r '"\(.aircraft[])"' 2>/dev/null | kafkacat -P -b mykafkaserver:9092 -t adsb_test

Now we have a Kafka topic that is populated with real-time flight information in JSON format. This is the ultimate in streaming data! If you want to take a peek at the data in Kafka, simply use kafkacat to consume the data:

kafkacat -C -b mykafkaserver:9092 -t adsb_test
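Once messages are flowing, a consumer will often want only position fixes - messages carrying both lat and lon. Here is a minimal sketch that works on raw message values as bytes, the form a Kafka client library would hand you; no broker is needed to follow along, and the sample values are hypothetical:

```python
import json

def position_fixes(values):
    # Yield decoded messages that contain a position (both "lat" and "lon").
    for value in values:
        msg = json.loads(value)
        if "lat" in msg and "lon" in msg:
            yield msg

# Hypothetical raw values, in the shape produced by the pipeline above
raw = [
    b'{"hex":"a71283","lat":30.282184,"lon":-97.642529,"altitude":22000}',
    b'{"hex":"a24b56","messages":38,"seen":134.2}',
]
fixes = list(position_fixes(raw))
print(len(fixes))  # 1
```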

In follow-on posts we will reference this data and perhaps show some interesting (not too interesting!) ways of manipulating and using it. We should add that we are using this data in a read-only fashion via ADS-B Out. Please hack responsibly!

Join the discussion

If you need help setting this up for yourself, want to discuss this in greater detail, or perhaps send a data stream to our corporate cluster - join our Slack channel and ask away! We would love to chat and hopefully be helpful. You can sign up for our Slack channel using slackpass.