Erlang: An Introduction to Records

by Jesse Farmer on Sunday, October 5, 2008

Erlang has only two compound data types: lists and tuples. Neither of these data types supports named access, so creating associative arrays a la PHP, Ruby, or Python is impossible without additional libraries.

That is, in Ruby, I could do:

server_opts = {:port => 8080, :ip => '127.0.0.1', :max_connections => 10}

while in Erlang there's no such support at the language (syntax) level.

To get around this limitation Erlang provides a pseudo data type called records. Records support named access, with some cruft. We'll see why I call these "pseudo" data types later on.

Defining Records

Records are more similar to structs in C than to associative arrays, in that they require you to define their contents up front and they can only hold data. Here's an example record that stores connection options for a server of some kind.

-module(my_server).

-record(server_opts,
	{port,
	ip="127.0.0.1",
	max_connections=10}).

% The rest of your code goes here.

Records are defined using the -record directive. The first parameter is the name of the record and the second parameter is a tuple that contains the fields in the record and their default values.

In our case we've defined a server_opts record that has three fields: a port, a binding IP, and the maximum number of connections allowed. No default is declared for port, so it falls back to the atom undefined, but the default value of ip is "127.0.0.1" and the default value of max_connections is 10.

Creating Records

Records are created by using the hash (#) symbol. Using the server_opts record from above, the following are all valid ways to create a record.

Opts1 = #server_opts{port=80}.

This creates a server_opts record with port set to 80. The other fields have their default values.

Opts2 = #server_opts{port=80, ip="192.168.0.1"}.

This creates a server_opts record like the one above, except now ip is set to "192.168.0.1".

In short, when creating a record you can include whatever fields you like. Omitted fields will take on their default value.
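
To make the defaults concrete, here is a minimal sketch of a module that builds two records and reports what ends up in each field. The module name record_create_demo and the functions defaults/0 and explicit/0 are made up for illustration; the field access syntax it uses is covered in the next section.

```erlang
-module(record_create_demo).
-export([defaults/0, explicit/0]).

-record(server_opts,
        {port,
         ip="127.0.0.1",
         max_connections=10}).

%% No fields given: port has no declared default, so it is the atom
%% `undefined`; ip and max_connections take their declared defaults.
defaults() ->
    Opts = #server_opts{},
    {Opts#server_opts.port,
     Opts#server_opts.ip,
     Opts#server_opts.max_connections}.

%% Only port and ip given: max_connections keeps its default of 10.
explicit() ->
    Opts = #server_opts{port=80, ip="192.168.0.1"},
    {Opts#server_opts.port,
     Opts#server_opts.ip,
     Opts#server_opts.max_connections}.
```

Calling record_create_demo:defaults() should return {undefined,"127.0.0.1",10}, and record_create_demo:explicit() should return {80,"192.168.0.1",10}.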

Accessing Records

Accessing records is clumsy, and it's where they start to reveal their cruft. If I want to access the port field I can do:

Opts = #server_opts{port=80, ip="192.168.0.1"},
Opts#server_opts.port

Yep, that's right, any time you want to access a record you have to include the record's name. Why? Because records aren't really internal data types, they're a compiler trick.

Internally records are tuples that look something like this:

{server_opts, 80, "192.168.0.1", 10}

The compiler maps the named fields to their position in the tuple.

The compiler keeps track of record definitions and translates all the record logic to tuple logic when you compile your Erlang program. That is, there is no record "type" at runtime, so you have to tell Erlang which record you're talking about every time you access one.
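
You can check the tuple representation for yourself. The following is a small sketch, with a made-up module name (record_tuple_demo) and made-up function names, that compares a record against a hand-written tuple and pokes at it with the ordinary element/2 BIF.

```erlang
-module(record_tuple_demo).
-export([as_tuple/0, by_position/0, index/0]).

-record(server_opts, {port, ip="127.0.0.1", max_connections=10}).

%% A record value compares equal to the plain tuple it compiles to.
as_tuple() ->
    Opts = #server_opts{port=80},
    Opts =:= {server_opts, 80, "127.0.0.1", 10}.

%% Element 1 is the record name; the fields follow in declaration order.
by_position() ->
    Opts = #server_opts{port=80},
    {element(1, Opts), element(2, Opts)}.

%% #Name.Field evaluates to the field's position in the tuple.
index() ->
    #server_opts.port.
```

Here as_tuple() returns true, by_position() returns {server_opts,80}, and index() returns 2, since position 1 is taken by the record name.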

Updating Records

Updating records works much like creating records. For example,

Opts = #server_opts{port=80, ip="192.168.0.1"},
NewOpts = Opts#server_opts{port=7000}.

would first create a server_opts record. The second line, NewOpts = Opts#server_opts{port=7000}, then creates a copy of Opts with a port number of 7000 rather than 80 and binds it to NewOpts. Opts itself is unchanged.
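
As a quick sanity check that an update produces a new record rather than modifying the old one, here is a minimal sketch. The module name record_update_demo and the function update/0 are hypothetical.

```erlang
-module(record_update_demo).
-export([update/0]).

-record(server_opts, {port, ip="127.0.0.1", max_connections=10}).

%% Returns the port of the original record, the port of the updated
%% copy, and the ip of the updated copy (carried over untouched).
update() ->
    Opts = #server_opts{port=80, ip="192.168.0.1"},
    NewOpts = Opts#server_opts{port=7000},
    {Opts#server_opts.port,
     NewOpts#server_opts.port,
     NewOpts#server_opts.ip}.
```

record_update_demo:update() returns {80,7000,"192.168.0.1"}: the original still has port 80, and the copy keeps every field that wasn't mentioned in the update.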

Matching Records and Guard Statements

This wouldn't be a tutorial about Erlang unless we talked about pattern matching. Let's say we want to do something particular with a server if it is running on port 8080 and something else otherwise.

handle(Opts=#server_opts{port=8080}) ->
	% do special port 8080 stuff
	{special, Opts};
handle(Opts=#server_opts{}) ->
	% default stuff
	{default, Opts}.

Guard statements work similarly. For example, binding to ports below 1024 often requires root access, so we might want to special-case that:

handle(Opts) when Opts#server_opts.port < 1024 ->
	% requires root access
	{privileged, Opts};
handle(Opts=#server_opts{}) ->
	% doesn't require root access
	{unprivileged, Opts}.
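
Putting the two ideas together, here is a sketch of a complete module that matches on a field value, uses a guard on a field, and falls through to catch-all clauses. The module name record_match_demo, the function names, and the returned atoms are all invented for this example. Note that Erlang spells "less than or equal" as =<, not <=.

```erlang
-module(record_match_demo).
-export([classify/1, demo/0]).

-record(server_opts, {port, ip="127.0.0.1", max_connections=10}).

%% Clauses are tried top to bottom; the first match wins.
classify(#server_opts{port=8080}) ->
    special;
classify(#server_opts{port=Port}) when is_integer(Port), Port < 1024 ->
    privileged;
classify(#server_opts{}) ->
    default;
classify(_Other) ->
    not_server_opts.

%% Runs classify/1 over a few sample values.
demo() ->
    [classify(#server_opts{port=8080}),
     classify(#server_opts{port=80}),
     classify(#server_opts{port=7000}),
     classify(not_a_record)].
```

record_match_demo:demo() returns [special,privileged,default,not_server_opts]. The is_integer/1 test in the guard is there so that a record with no port set (port is undefined) falls through to the default clause on purpose rather than by accident of term ordering.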

Using Records

In my limited time using Erlang I've seen records used primarily for two things. First, records are used to keep state, especially when using the generic server behaviour. Since Erlang data is immutable, state isn't kept in global variables. Instead it must be passed around from function to function.
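
Here is a rough sketch of what that looks like with gen_server. The module counter_server, its hit/1 function, and the state record with a single hits field are all invented for illustration; a real server would carry more in its state. The point is only that each callback receives the current state record and hands back a new one.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Public API (hypothetical, for illustration only).
-export([start_link/0, hit/1]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2]).

-record(state, {hits=0}).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Increments the counter and returns the new count.
hit(Pid) ->
    gen_server:call(Pid, hit).

%% The initial state is a record with all defaults.
init([]) ->
    {ok, #state{}}.

%% The old state comes in as an argument; the updated copy goes back
%% out as the last element of the reply tuple.
handle_call(hit, _From, State=#state{hits=Hits}) ->
    NewState = State#state{hits=Hits + 1},
    {reply, NewState#state.hits, NewState}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

After {ok, Pid} = counter_server:start_link(), the first call to counter_server:hit(Pid) returns 1 and the second returns 2.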

Second, and perhaps as a subset of the first, records are used to keep track of configurable options.

There are limitations to records, however. Most notably, you cannot add or remove fields on the fly. Like C structs, the structure of the record is defined beforehand.

If you want to add and remove fields on the fly, or if you don't know what fields you'll have until runtime, you should use dicts rather than records.
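
For comparison, here is a minimal sketch of the same server options kept in a dict from the standard library's dict module. The module name dict_opts_demo and the function demo/0 are made up; the dict functions themselves (from_list/1, store/3, erase/2, fetch/2, find/2) are part of the standard library.

```erlang
-module(dict_opts_demo).
-export([demo/0]).

demo() ->
    %% Start with two options.
    Opts0 = dict:from_list([{port, 8080}, {ip, "127.0.0.1"}]),
    %% Add a key at runtime.
    Opts1 = dict:store(max_connections, 10, Opts0),
    %% Remove a key at runtime.
    Opts2 = dict:erase(ip, Opts1),
    %% fetch/2 returns the value (and crashes on a missing key);
    %% find/2 returns {ok, Value} or the atom error.
    {dict:fetch(port, Opts2),
     dict:find(max_connections, Opts2),
     dict:find(ip, Opts2)}.
```

dict_opts_demo:demo() returns {8080,{ok,10},error}. The trade-off is that you lose the compile-time checking of field names and the pattern matching syntax that records give you.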