In my last few articles about Erlang we've covered the basics of network programming with gen_tcp and Erlang/OTP's gen_server, or generic server, module. Let's combine the two.

In most people's minds "server" means network server, but Erlang uses the terminology in the most abstract sense. gen_server is really a server that operates using Erlang's message passing as its base protocol. We can graft a TCP server onto that framework, but it requires some work.

The Structure of a Network Server

Most network servers have a similar architecture. First they create a listening socket that listens for incoming connections. They then enter an accept state in which they loop until termination, accepting each new connection as it arrives and starting the real client/server work.

To see this in action recall the simple echo server from my network programming article:

-module(echo).
-author('Jesse E.I. Farmer <jesse@20bits.com>').
-export([listen/1]).

-define(TCP_OPTIONS, [binary, {packet, 0}, {active, false}, {reuseaddr, true}]).

% Call echo:listen(Port) to start the service.
listen(Port) ->
    {ok, LSocket} = gen_tcp:listen(Port, ?TCP_OPTIONS),
    accept(LSocket).

% Wait for incoming connections and spawn the echo loop when we get one.
accept(LSocket) ->
    {ok, Socket} = gen_tcp:accept(LSocket),
    spawn(fun() -> loop(Socket) end),
    accept(LSocket).

% Echo back whatever data we receive on Socket.
loop(Socket) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Data} ->
            gen_tcp:send(Socket, Data),
            loop(Socket);
        {error, closed} ->
            ok
    end.

As you can see, listen creates a listening socket and immediately calls accept. This waits for an incoming connection, spawns a new worker (loop) that does the real work, and then waits for the next incoming connection.

In this code the parent process owns both the listen socket and the accept loop. As we'll see, this arrangement doesn't work so well when we try to integrate the listen/accept loop with gen_server.

Abstracting The Network Server

Network servers come in two parts: connection handling and business logic. As I described above the connection handling is basically the same for every network server. Ideally we'd be able to do something like

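-module(my_server).
-export([start/1, business_logic/1]).

% connection_handler is the hypothetical generic module we'd like to have.
start(Port) ->
    connection_handler:start(my_server, Port, business_logic).

business_logic(_Socket) ->
    % Read data from the network socket and do our thang!
    ok.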

Let's go ahead and do just this.

Implementing A Generic Network Server

The problem with implementing a network server using gen_server is that the call to gen_tcp:accept is blocking. If we were to call this in the server's initialization routine, for example, the whole gen_server mechanism would block until a client connected.
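
To see the blocking for yourself, here is a minimal sketch. The module name accept_blocks and the 200 ms timeout are my own choices for illustration and are not part of the server we're building. It listens on an OS-assigned port and calls gen_tcp:accept/2 with a timeout so the call eventually gives up; since no client connects, the caller just sits there until the timeout fires.

```erlang
-module(accept_blocks).
-export([demo/0]).

%% Hypothetical demo: shows that accept holds the calling process
%% until a client connects (or, here, until the timeout expires).
demo() ->
    %% Port 0 asks the OS for any free port.
    {ok, LSocket} = gen_tcp:listen(0, [binary, {active, false}]),
    T0 = erlang:monotonic_time(millisecond),
    %% Nobody connects, so this waits the full 200 ms.
    Result = gen_tcp:accept(LSocket, 200),
    Waited = erlang:monotonic_time(millisecond) - T0,
    ok = gen_tcp:close(LSocket),
    {Result, Waited}.
```

Without the timeout argument the same call would wait indefinitely, which is exactly what we can't afford inside a gen_server callback.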

There are two ways to get around this. One involves using a lower-level connection mechanism that supports non-blocking (or asynchronous) accepting. There is then a whole family of functions, most notably gen_tcp:controlling_process, that help you manage which process receives which messages when clients connect.
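
I won't go down the asynchronous route here, but for reference, handing a connection to another process with gen_tcp:controlling_process/2 looks roughly like the sketch below. The names handoff_demo, run/0 and echo_once/1 are made up for illustration. The process that calls accept owns the new socket; ownership decides which process receives active-mode messages and whose exit closes the socket, so the accepting process transfers it before letting the worker proceed.

```erlang
-module(handoff_demo).
-export([run/0]).

%% Hypothetical demo: accept in one process, serve in another.
run() ->
    Opts = [binary, {packet, 0}, {active, false}],
    {ok, LSocket} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(LSocket),
    %% Connect a client to ourselves so there is something to accept.
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port, Opts),
    {ok, Socket} = gen_tcp:accept(LSocket),
    %% The worker waits for 'go' so it never touches the socket
    %% before ownership has been transferred.
    Worker = spawn(fun() -> receive go -> echo_once(Socket) end end),
    ok = gen_tcp:controlling_process(Socket, Worker),
    Worker ! go,
    ok = gen_tcp:send(Client, <<"hello">>),
    {ok, Reply} = gen_tcp:recv(Client, 5, 1000),
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(LSocket),
    Reply.

%% Echo a single chunk of data back and close the connection.
echo_once(Socket) ->
    {ok, Data} = gen_tcp:recv(Socket, 0),
    ok = gen_tcp:send(Socket, Data),
    gen_tcp:close(Socket).
```

Note that gen_tcp:controlling_process/2 has to be called by the current owner of the socket, which is why the accepting process makes the call rather than the worker.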

A simpler and, in my opinion, more elegant solution is to have a single process that owns the listening socket. This process does two things: it spawns new acceptors and it listens for "connection received" messages. When it receives such a message it knows to spawn a new acceptor.

An acceptor is free to call the blocking gen_tcp:accept since it's running in its own process. When it receives a connection it fires an asynchronous message back to the parent process and immediately calls the business logic function.

Here's the code. I've commented where appropriate, so hopefully it's readable.

-module(socket_server).
-author('Jesse E.I. Farmer <jesse@20bits.com>').
-behavior(gen_server).

-export([init/1, code_change/3, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).
-export([accept_loop/1]).
-export([start/3]).

-define(TCP_OPTIONS, [binary, {packet, 0}, {active, false}, {reuseaddr, true}]).

-record(server_state, {
        port,
        loop,
        ip=any,
        lsocket=null}).

start(Name, Port, Loop) ->
    State = #server_state{port = Port, loop = Loop},
    gen_server:start_link({local, Name}, ?MODULE, State, []).

init(State = #server_state{port=Port}) ->
    case gen_tcp:listen(Port, ?TCP_OPTIONS) of
        {ok, LSocket} ->
            NewState = State#server_state{lsocket = LSocket},
            {ok, accept(NewState)};
        {error, Reason} ->
            {stop, Reason}
    end.

handle_cast({accepted, _Pid}, State=#server_state{}) ->
    {noreply, accept(State)}.

accept_loop({Server, LSocket, {M, F}}) ->
    {ok, Socket} = gen_tcp:accept(LSocket),
    % Tell the server to spawn a replacement acceptor, then turn this
    % process into the connection handler so the server never blocks.
    gen_server:cast(Server, {accepted, self()}),
    M:F(Socket).

% To be more robust we should be using spawn_link and trapping exits
accept(State = #server_state{lsocket=LSocket, loop = Loop}) ->
    proc_lib:spawn(?MODULE, accept_loop, [{self(), LSocket, Loop}]),
    State.

% We don't use these callbacks, but the gen_server behaviour expects them.
% handle_call replies so that a stray gen_server:call doesn't time out.
handle_call(_Msg, _Caller, State) -> {reply, {error, unknown_call}, State}.
handle_info(_Msg, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVersion, State, _Extra) -> {ok, State}.

We use gen_server:cast to pass asynchronous messages back to the listening process. When the listening process receives the {accepted, Pid} message it spawns a new acceptor.

Right now this server is not very robust: if the active acceptor dies before it reports a connection (for example, because gen_tcp:accept returns an error and the {ok, Socket} match fails), no replacement is ever spawned and the server stops accepting connections. To make it more OTP-like we should link to the acceptor, trap exits, and fire off a new acceptor whenever the current one dies.
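
Here is one way that hardening might look. Treat it as a sketch under my own assumptions rather than a finished design: the module name robust_socket_server, the state record, and the port call (added only so the bound port can be queried) are all invented for this example. The server remembers the pid of the current acceptor, links to it with proc_lib:spawn_link, and starts a replacement when it sees that pid exit.

```erlang
-module(robust_socket_server).
-behaviour(gen_server).

-export([start/3, accept_loop/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-define(TCP_OPTIONS, [binary, {packet, 0}, {active, false}, {reuseaddr, true}]).

-record(state, {lsocket, loop, acceptor}).

start(Name, Port, Loop) ->
    gen_server:start_link({local, Name}, ?MODULE, {Port, Loop}, []).

init({Port, Loop}) ->
    %% Turn exit signals from linked processes into 'EXIT' messages.
    process_flag(trap_exit, true),
    case gen_tcp:listen(Port, ?TCP_OPTIONS) of
        {ok, LSocket} ->
            {ok, start_acceptor(#state{lsocket = LSocket, loop = Loop})};
        {error, Reason} ->
            {stop, Reason}
    end.

%% Spawn a linked acceptor and remember which pid is the current one.
start_acceptor(State = #state{lsocket = LSocket, loop = Loop}) ->
    Pid = proc_lib:spawn_link(?MODULE, accept_loop, [self(), LSocket, Loop]),
    State#state{acceptor = Pid}.

accept_loop(Server, LSocket, {M, F}) ->
    {ok, Socket} = gen_tcp:accept(LSocket),
    gen_server:cast(Server, {accepted, self()}),
    M:F(Socket).

%% Hypothetical helper call: report the port we are listening on.
handle_call(port, _From, State = #state{lsocket = LSocket}) ->
    {reply, inet:port(LSocket), State};
handle_call(_Msg, _From, State) ->
    {reply, {error, unknown_call}, State}.

handle_cast({accepted, Pid}, State = #state{acceptor = Pid}) ->
    {noreply, start_acceptor(State)};
handle_cast(_Msg, State) ->
    {noreply, State}.

%% The current acceptor died before reporting a connection: replace it.
handle_info({'EXIT', Pid, _Reason}, State = #state{acceptor = Pid}) ->
    {noreply, start_acceptor(State)};
%% Any other linked process is a connection handler that has finished.
handle_info(_Msg, State) ->
    {noreply, State}.

terminate(_Reason, #state{lsocket = LSocket}) ->
    gen_tcp:close(LSocket).

code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Two caveats. Because acceptors turn into connection handlers, every handler stays linked to the server, so an abnormal server exit takes the open connections down with it. And if accept fails persistently, this version respawns acceptors in a tight loop; a supervisor with a restart limit would be the more conventional OTP answer.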

A "Generic" Echo Server

The echo server is the easiest server to write, so let's do it using our new abstract socket server.

-module(echo_server).
-author('Jesse E.I. Farmer <jesse@20bits.com>').

-export([start/0, loop/1]).

% echo_server specific code
start() ->
    socket_server:start(?MODULE, 7000, {?MODULE, loop}).
loop(Socket) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Data} ->
            gen_tcp:send(Socket, Data),
            loop(Socket);
        {error, closed} ->
            ok
    end.

As you can see, the "server" becomes nothing more than its business logic. The connection handling has been generalized and pushed off into its own module, socket_server. The loop in echo_server is identical to the loop in our original echo server, too.
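
To check the server from another Erlang shell, a small client along these lines should do. The echo_client module and its echo/2 function are my own illustration, not part of the code above, and it assumes the server is listening on the local machine.

```erlang
-module(echo_client).
-export([echo/2]).

%% Hypothetical test client: send Data to an echo server on localhost
%% and return whatever gen_tcp:recv gives back.
echo(Port, Data) when is_binary(Data), byte_size(Data) > 0 ->
    {ok, Socket} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {packet, 0}, {active, false}]),
    ok = gen_tcp:send(Socket, Data),
    %% Ask for exactly as many bytes as we sent, with a 2 second
    %% timeout so a silent server can't hang the shell.
    Reply = gen_tcp:recv(Socket, byte_size(Data), 2000),
    ok = gen_tcp:close(Socket),
    Reply.
```

With echo_server:start() running, echo_client:echo(7000, <<"hello">>) should come back as {ok, <<"hello">>}.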

Hopefully you all can learn from this as much as I did. I finally feel like I'm starting to understand Erlang.

Also, feel free to leave a comment, especially if you have any thoughts on how I can improve my code. Cheers!

8 Comments

  1. Kyle K June 16th, 2008 / 11:33 am

    What are you using for your code highlighting? It looks great.

  2. Jesse June 16th, 2008 / 11:44 am

    Kyle,

    I’m using geshi with an Erlang syntax plugin I wrote myself.

  3. John B July 9th, 2008 / 7:12 pm

    Jesse, thanks for this really useful overview! This is my first exposure to gen_server and gen_tcp and I would have been lost without this.

  4. mike July 13th, 2008 / 7:11 pm

    You should probably explain why with the echo server up and running,
    compiling echo_server.erl twice in a row kills the server.

    This is really frustrating to someone coming from lisp where
    hot code updates truly “just work” and things do not fail
    mysteriously. I have been trying to find out the cause of
    this behavior for the past hour with no success.

  5. mike July 14th, 2008 / 9:49 am

    Ok so apparently erlang keeps 2 versions of module code (old and current)
    and processes that run old code are purged (killed). This is badly
    explained almost everywhere i’ve looked and examples are scarce to come
    by for something so important. So my next question then is how to
    do a hot code update without killing the tcp server or dropping any existing
    connections. After all, this is erlang, where 99.999% availability can be achieved, so
    this trivial task (which you get for free in common lisp without having to do
    anything special) should be possible.

    After hours of scavenging in badly written documentation, trying to understand
    the monstrosity that seems to be OTP (also noticed that pragmatic programming: erlang
    conveniently pushes this issue under the rug) i still have found no solutions.
    It seems everywhere i look at these days, proponents of erlang keep rambling about
    “scalability, availability, hot code updates” as if these are things that can be
    magically gotten for free. Well if “hot code updates” is how things are done in erlang,
    i will have none of that.

  6. Ludovic Kuty August 6th, 2008 / 9:58 am

    Great article! I was just wondering how I could make a TCP server using OTP gen_server after having read chapters 16 and 18 in Armstrong’s book. Accept() was my problem and you solved it. Right on time
    :)
    I would also be interested in some information about asynchronous accept. In fact I use “receive” and messages between processes to handle recv() on the socket. I do this because my process must have the ability to recv() and send() data without knowing which one will come first. So it could be good for me to extend this style of managing things to accept() as well.

  7. Arni Hermann August 7th, 2008 / 11:07 am

    I think you can compile echo_server.erl twice in a row if you change loop(Socket); to ?MODULE:loop(Socket);

    Try it out.

  8. chrisfarms August 12th, 2008 / 10:09 am

    @Jesse, would I be right in thinking that a TCP service exposed like this could in fact end up using millions of [erlang] processes and potentially hammer the system resources?
    I’m *very* new to erlang, so may not be following correctly, but it looks like a process is spawned for every “accept”, so a malicious user could cause some trouble if the spawned process was actually doing something more intensive?

    Thanks for the erlang posts, do keep them up if you can.