A little addition



I have added tags to blog posts, making them a bit easier to navigate. To see more posts with a specific tag, click the tag and you will be taken to a page listing all posts that have it. Hope you like it!


Integrating Lua (Part 2)


C++, Lua

Calling C functions

In the last tutorial we saw that it was possible to call Lua functions from C (and C++). It is also possible to define functions in C that are callable from Lua.

All C functions that will be callable from Lua follow the same pattern and must have the prototype specified by lua_CFunction.

typedef int (*lua_CFunction) (lua_State *L);

All C functions get a pointer to the Lua state as a parameter and return an integer representing the number of values that the function pushed onto the stack. This means that the function does not need to clean up the stack after it is executed. Lua will automatically remove everything below the results.

Defining our C function

Our C function will be a simple add function. It is defined as

int lua_add(lua_State *L) {
    // Check the number of arguments
    int argc = lua_gettop(L);
    if (argc != 2) {
        luaL_error(L, "Wrong number of arguments for add, expected 2, got %d", argc);
        return 0;
    }

    // Obtain arguments
    int a = luaL_checkinteger(L, 1) + luaL_checkinteger(L, 2);
    lua_pushnumber(L, a);

    // We have pushed one value
    // onto the stack
    return 1;
}

Let’s walk through it. The first thing we do is check that we got the correct number of arguments by calling lua_gettop(L). This function returns the index of the top of the Lua stack, which at this point equals the number of arguments passed to our function. If this number is not 2, we raise an error with luaL_error(). Note that luaL_error() never actually returns to the caller, so the return 0 after it (meaning that no values were pushed) is never reached and is only there for readability.

If the argument count was correct we proceed to the line

int a = luaL_checkinteger(L, 1) + luaL_checkinteger(L, 2);

luaL_checkinteger() checks that the argument at the given stack position is actually a number. If it is not, it raises an error with a descriptive message. If all is well, the result of the addition ends up in a.

The only thing left now is to push our result value back to Lua. This is done with lua_pushnumber(L, a). We then return 1 to tell Lua that we pushed one value onto the stack.

Time to test

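To test the new add function, it first has to be made visible to Lua under the global name add, for example with lua_register(L, "add", lua_add). The hello function in the Lua file (see previous post) is then modified to include the line print("add(2,4) = " .. add(2, 4)). When you build and run the example, you should get this output: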

Hello Lua!
add(2,4) = 6

That’s it for now! Next time I will show a way to use Lua with C++ objects.

The two files used in this post can be found here:


A gentle introduction to integrating Lua


C++, Lua


Introducing a scripting language into your project can be a huge boost in productivity. Interpreted languages run slower than compiled C/C++ code, but the gain in productivity and rapid development usually outweighs that cost.


Lua is a scripting language implemented in pure ANSI C. It is extremely well implemented, lightweight and efficient, and it has one of the cleanest APIs that I have ever seen. There are of course other scripting languages that can be good alternatives depending on the application; Python and JavaScript are two examples. However, my idea is to let this post be the first in a small series of posts demonstrating some bits of integrating Lua into your C/C++ projects.

This series will NOT be an introduction to Lua as a language but rather its C API. For a reference of the language, refer to lua.org.

It is also worth noting that I am using the Lua 5.2 release candidate (rc5), so there can be some minor differences in the API compared to other versions.

A simple hello world

To start with, download the Lua source code and compile a static or dynamic library that you will then link to your application.

First, we create a simple Lua script. It does nothing useful at all, but it is enough to get going.

function hello()
    print("Hello Lua!")
end

This simple function just prints “Hello Lua!” in the console. Save the file as hello.lua (the code below assumes that it is saved in the same directory as the C/C++ file).

So, now for the C/C++ code.

We start by creating a C/C++ file and including the necessary Lua headers. A C++ header, lua.hpp, exists in which the necessary Lua C header files have been wrapped with extern "C".

We can then proceed to create the Lua state by calling luaL_newstate(), which returns a pointer to a lua_State. The Lua state represents the state of the Lua interpreter and needs to be passed as the first parameter to all API functions except the ones that create states (like luaL_newstate()).

Next, we initialize the standard Lua libraries with luaL_openlibs(L). Even this small tutorial needs it, since print() is part of the base library, and the other libraries will be needed in later tutorials.

int main() {
    // Create a new lua state
    lua_State *L = luaL_newstate();
    // Open the standard libraries (print() lives in the base library)
    luaL_openlibs(L);
    // ...
}

We should probably also check that luaL_newstate() succeeded by checking that L != NULL, but we do not do it here.

After the state has been created, we try to load and run the file we created earlier.

// Try to read the file
if (luaL_dofile(L, "hello.lua") != 0) {
    printf("Lua error: %s\n", lua_tostring(L, -1));
    return 1;
}

luaL_dofile is supplied the path of the file and will return 0 if it succeeds.

The error checking needs some explanation. All communication between Lua and C happens through a stack-like structure. This stack can be indexed with positive indices, where index 1 is the lowest element in the stack, or with negative indices, where index -1 is the top of the stack.

If luaL_dofile fails to read or run the file, it pushes an error message (a string) onto the stack. Therefore we can retrieve this message by converting the top of the stack into a string.
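The relation between the two kinds of indices can be written down as a small formula. The sketch below is plain C++ and does not use the Lua API; to_absolute_index is a hypothetical helper name, and it only covers ordinary stack indices (no pseudo-indices). Lua 5.2 offers lua_absindex for the same conversion on a real state.

```cpp
#include <cassert>

// Hypothetical helper (not part of the Lua API): converts a stack index to
// its absolute (positive) form, given the current stack height `top`.
// Only ordinary stack indices are handled; pseudo-indices are ignored.
int to_absolute_index(int top, int index)
{
    assert(index != 0 && index >= -top && index <= top);
    // Positive indices count from the bottom (1 = lowest element),
    // negative indices count from the top (-1 = top element).
    return index > 0 ? index : top + index + 1;
}
```

With three values on the stack, index -1 refers to the same slot as index 3, and index -3 to the same slot as index 1.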

If luaL_dofile succeeds, we can start executing the function.

// Execute hello function
lua_getglobal(L, "hello");

// hello() function is now on top of the stack (-1)
// Execute it with 0 arguments, 0 return values and
// the standard error function (0).
if (lua_pcall(L, 0, 0, 0) != 0) {
    printf("Lua error executing hello(): %s\n",
        lua_tostring(L, -1));
    return 1;
}


First we obtain the global function called “hello” that we created earlier. This puts the function on top of the stack. The function is then executed with lua_pcall. The parameters are, in turn, the Lua state L, the number of arguments we supply (0), the number of return values we expect (0), and the error handling to use, where the final 0 means that no custom message handler is installed and Lua uses its standard error handling.

For handling errors, we use the same scheme as with luaL_dofile.

Before exiting we should also remember to close Lua with lua_close(L).

This is actually it. If you execute this, you should get Hello Lua! printed in the console.

Next time, we will define C functions that are callable from Lua.


How to roll your own memory management system - Part 2

2011-10-12 18:19

C++, memory management


In this post, I am going to start talking about implementation of the actual allocators. This post will in particular deal with the simplest kind of allocator, the stack allocator.

The need for allocating aligned memory often arises. 128-bit SIMD vectors, for example, have to be 16-byte aligned, that is, the hexadecimal address has to end with the nibble 0x0. To account for that, I will also discuss what is needed to extend the allocators to handle aligned memory allocation.
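As a small illustration of what "aligned" means here, the check below tests whether an address is a multiple of a given alignment. It is only a sketch: is_aligned is a hypothetical helper name, and it assumes the alignment is a power of two, which is what makes the bit mask trick work.

```cpp
#include <cstdint>

// Returns true if `address` is a multiple of `align`.
// Assumes `align` is a power of two (1, 2, 4, 8, 16, ...), so that
// `align - 1` is a mask covering exactly the low bits that must be zero.
bool is_aligned(std::uintptr_t address, std::uintptr_t align)
{
    return (address & (align - 1)) == 0;
}
```

For a 16-byte alignment the mask is 0xF, so 0x1000 passes the check while 0x1004 does not.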

A simple Allocator interface

So now we are ready to start defining a simple interface for allocators. The question arises whether to make the allocators full-fledged classes with virtual functions and all that. In this case it is definitely OK, since you will probably not create a lot of memory allocators, and memory management is an expensive operation anyway, so the few extra virtual calls won’t matter.

The interface will look something like this

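class Allocator {
public:
    // Virtual destructor so allocators can be destroyed through the base class
    virtual ~Allocator() {}

    virtual void *allocate(uint32 size, uint32 align) = 0;
    virtual void deallocate(void *p) = 0;
    virtual uint32 allocated_size(void *p) = 0;

    virtual uint32 get_free() = 0;
    virtual uint32 get_used() = 0;
};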


With this interface ready we can start defining a concrete allocator. The allocation method allocate will return a pointer to allocated memory where the object can then be placed with placement new.

But wait, how do we actually obtain memory from the OS? The above interface suggests that we do not want to override global new. That is correct. It is much better if each allocator has a backing allocator that fetches raw memory directly from the OS in a system-dependent manner. This makes it possible to disallow the use of new and malloc globally by asserting (or something like it).
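To make the idea of a backing allocator a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the interface is a reduced copy of the one above, uint32 is taken to be a 32-bit unsigned integer, MallocBackingAllocator is a hypothetical name, and std::malloc stands in for the system-dependent call that a real backing allocator would make. The align parameter is ignored here.

```cpp
#include <cstdlib>
#include <cstdint>

typedef std::uint32_t uint32; // assumption: 32-bit unsigned integer

// Reduced version of the allocator interface, for illustration only.
class Allocator
{
public:
    virtual ~Allocator() {}
    virtual void *allocate(uint32 size, uint32 align) = 0;
    virtual void deallocate(void *p) = 0;
    virtual uint32 get_used() = 0;
};

// Hypothetical backing allocator: the single place that is allowed to
// talk to the system. std::malloc is used as a stand-in for an OS call.
class MallocBackingAllocator : public Allocator
{
public:
    MallocBackingAllocator() : _used(0) {}

    void *allocate(uint32 size, uint32 /*align*/)
    {
        // A header in front of the block remembers its size.
        void *raw = std::malloc(HEADER + size);
        if (!raw)
            return NULL;
        *static_cast<uint32 *>(raw) = size;
        _used += size;
        return static_cast<unsigned char *>(raw) + HEADER;
    }

    void deallocate(void *p)
    {
        if (!p)
            return;
        unsigned char *raw = static_cast<unsigned char *>(p) - HEADER;
        _used -= *reinterpret_cast<uint32 *>(raw);
        std::free(raw);
    }

    uint32 get_used() { return _used; }

private:
    enum { HEADER = 16 }; // header size in bytes, padded to a round number
    uint32 _used;
};
```

Other allocators would then receive a pointer to such an object and request their large blocks from it, which keeps all direct system calls in one class.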

The Stack Allocator

A stack allocator works by allocating in a stack pattern (duh :P). This is implemented by holding a pointer to the current top of the stack. Each new allocation is placed on this stack top and the top pointer is moved to the end of the newly allocated object. The pointer then points to the new top of the stack.

The pattern is demonstrated in the figure below.

[Figure: Stack Allocator]

The pointers to the top and bottom of the stack are illustrated below.

[Figure: Stack Allocator Pointers]

Note that an extra pointer is introduced in the figure above: the marker. A stack allocator can provide markers to give users a way to hold a reference to a specific position in the stack. This comes in handy when deallocating from the stack allocator (object pointers also work as markers, though).

Deallocation is simply a matter of rolling the top pointer back to a lower point. This also means that deallocations cannot be done in arbitrary order; they have to take place in the reverse order of the allocations.
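The rollback idea can be sketched in a few lines. This is not the allocator developed in this post, only a stripped-down illustration: TinyStack, get_marker and free_to_marker are hypothetical names, the memory is a caller-supplied buffer, and alignment and the backing allocator are left out.

```cpp
#include <cstddef>
#include <cstdint>

// Minimal sketch of marker-based rollback over a caller-supplied buffer.
// All names are hypothetical; alignment handling is left out.
class TinyStack
{
public:
    typedef std::size_t Marker; // offset from the bottom of the stack

    TinyStack(void *buffer, std::size_t size)
        : _bottom(static_cast<std::uint8_t *>(buffer)), _size(size), _top(0) {}

    // Returns NULL when the buffer is exhausted.
    void *allocate(std::size_t size)
    {
        if (size > _size - _top)
            return NULL;
        void *p = _bottom + _top;
        _top += size; // the top now points past the new allocation
        return p;
    }

    // Remembers the current top of the stack.
    Marker get_marker() const { return _top; }

    // Frees everything that was allocated after the marker was taken.
    void free_to_marker(Marker m)
    {
        if (m <= _top)
            _top = m;
    }

    std::size_t get_used() const { return _top; }

private:
    std::uint8_t *_bottom;
    std::size_t _size;
    Marker _top;
};
```

Taking a marker before a batch of allocations and rolling back to it afterwards releases the whole batch at once, which is exactly the reverse-order constraint described above.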

Now we can make an implementation of a stack allocator

class StackAllocator : public Allocator {
public:
    explicit StackAllocator(const uint32, Allocator *);

    void *allocate(uint32, uint32);

    void deallocate(void *p);

    uint32 allocated_size(void *p);

    uint32 get_free() { return _size - _allocated_size; }
    uint32 get_used() { return _allocated_size; }

    void clear();

private:
    // Marker is an integer type holding an address
    // (it is assigned to uint32 in allocate below)
    Marker _bottom;
    const uint32 _size;

    Marker _top, _latest;
    uint32 _allocated_size;

    Allocator *_backing_allocator;
};

We can see from this that the allocator holds a pointer to a backing allocator, an allocator that retrieves raw memory from the OS or another type of allocator.

To implement the allocate method (first without alignment)

void *StackAllocator::allocate(uint32 size, uint32 align) {
    if (_size >= (_top+size)-_bottom) {
        uint32 raw_address = _top;
        _top += size;
        _allocated_size += size;

        return (void*) raw_address;
    }

    // If we get here we are out of memory :(
    XASSERT(false, "Out of memory!");
    return NULL;
}

This is simple enough and it gets the job done nicely, but what about the alignment now?

To align the allocation we modify the method slightly

void *StackAllocator::allocate(uint32 size, uint32 align) {
    // Fix proper alignment
    if (align > 1) {
        uint32 expanded_size = size + align;

        if (_size >= (_top+expanded_size)-_bottom) {
            // Allocate memory
            _allocated_size += expanded_size;
            uint32 raw_address = _top;
            _top += expanded_size;
            // Adjustment
            uint32 mask = (align - 1);
            uint32 misalignment = (raw_address & mask);
            uint32 adjustment = align - misalignment;

            uint32 aligned_address =
                 raw_address + adjustment;

            // Store the adjustment
            // in the extra byte allocated
            // (assumed here: the byte just before the aligned address)
            uint8 *p_adjustment =
                 (uint8*) (aligned_address - 1);
            *p_adjustment = (uint8) adjustment;

            return (void*)aligned_address;
        }
    } else { // Allocate unaligned
        if (_size >= (_top+size)-_bottom) {
            uint32 raw_address = _top;
            _top += size;
            _allocated_size += size;

            return (void*) raw_address;
        }
    }

    // If we get here we are out of memory :(
    XASSERT(false, "Out of memory!");
    return NULL;
}

This is the complete allocation method. One problem is that deallocation has to be done in two different ways depending on whether the allocation was aligned or not. That can be solved in a number of ways, for example by having unaligned versions of both allocate and deallocate. The best way is probably to store the adjustment even when the alignment is 1, so that the deallocate method always knows how to deallocate. Either way, all allocators should have the ability to allocate aligned memory.

It is worth noting that in the above method the adjustment made for alignment is stored in the extra bytes allocated.

That is all for now and I hope this gave you a better understanding of the stack allocator. In the next part I will discuss the pool allocator and some alternative allocators.
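The adjustment arithmetic is easier to follow in isolation. The sketch below repeats it with hypothetical function names and std::uintptr_t instead of uint32, and assumes that the alignment is a power of two. For a raw address of 0x1004 and an alignment of 16, the misalignment is 4, the adjustment is 12 and the aligned address is 0x1010. For an already aligned raw address of 0x1000 the adjustment is a full 16, which guarantees that there is always at least one spare byte in front of the returned address.

```cpp
#include <cstdint>

// Number of bytes to skip so that raw_address + adjustment is a multiple
// of `align`. A full `align` is returned when the address is already
// aligned, so the result is always in the range 1..align.
// Assumes `align` is a power of two.
std::uintptr_t alignment_adjustment(std::uintptr_t raw_address, std::uintptr_t align)
{
    std::uintptr_t mask = align - 1;
    std::uintptr_t misalignment = raw_address & mask;
    return align - misalignment;
}

// The address handed out to the caller.
std::uintptr_t aligned_address(std::uintptr_t raw_address, std::uintptr_t align)
{
    return raw_address + alignment_adjustment(raw_address, align);
}

// What a deallocate method could do once it has read the stored
// adjustment back: step from the aligned address to the raw one.
std::uintptr_t raw_from_aligned(std::uintptr_t aligned, std::uintptr_t adjustment)
{
    return aligned - adjustment;
}
```

Since the adjustment can be as large as the alignment itself, storing it in a single byte only works for alignments up to 128.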



Why you should (or should you?) roll your own memory manager (Part 1)

2011-10-06 10:51

C++, memory management


Do you write performance-critical applications? Do you feel that the built in memory manager in C/C++ does not do the trick for you? Then this article might be of interest :)

All optimizations must however be based on real data so if a custom memory manager does not improve the performance of your application (or maybe some other aspect), do not use it!

In high performance applications (for example games), allocation of dynamic memory can be a real bottleneck and should be kept to a minimum. Of course, it is impossible to avoid altogether and something has to be done.

The built in memory manager in C/C++ has to handle a lot of different allocation patterns in a good way. It has to be able to allocate many small blocks of memory as well as large blocks and it has no knowledge of the data it has to allocate memory for. This is probably not the case in your own application. You will know what kind of data your application handles and therefore can make very important assumptions to make memory management more efficient and simple whereas the built in memory manager has to be everything to every application. This makes it inefficient and also too complicated in some cases.

It can also suffer in performance because it has to switch from user to kernel mode whenever it must request more memory from the OS. This can be avoided with custom memory management, where a large block of memory is allocated up front and memory requests are satisfied from this block. No mode switching then occurs on allocation, and we avoid that bottleneck.

Placement new

To be able to write your own memory manager and control how data is laid out in memory, there is an important operator in C++ that you have to know: placement new.

Placement new lets you specify a memory address where you want to place your objects.

// Requires #include <new>
void *buffer = new char[SOME_SIZE];
MyObject *obj = new (buffer) MyObject();

This will place the new instance of MyObject at the start of buffer. This operator allows you to control your memory allocation patterns.

It is important to note that an object created with placement new cannot be destroyed with delete. The destructor of the object has to be called explicitly: obj->~MyObject();. The underlying buffer is then released separately, in whatever way it was allocated.
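The whole life cycle can be put together in a short sketch. Tracked and placement_new_roundtrip are hypothetical names used only for this illustration, and std::malloc is used for the raw buffer because the memory it returns is suitably aligned for any ordinary object type.

```cpp
#include <new>     // placement new
#include <cstdlib> // std::malloc, std::free

// Hypothetical type that counts live instances so the life cycle is visible.
struct Tracked
{
    static int live;
    int value;
    explicit Tracked(int v) : value(v) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Constructs a Tracked in raw memory, destroys it explicitly and releases
// the buffer. Returns the value read from the object, or -1 if the
// allocation failed.
int placement_new_roundtrip(int v)
{
    void *buffer = std::malloc(sizeof(Tracked)); // raw, uninitialized memory
    if (!buffer)
        return -1;

    Tracked *obj = new (buffer) Tracked(v); // construct in place
    int result = obj->value;

    obj->~Tracked();   // explicit destructor call, no delete
    std::free(buffer); // the raw memory is released separately
    return result;
}
```

Construction and destruction are decoupled from acquiring and releasing memory, which is exactly what a custom allocator relies on.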

When to allocate the memory?

Your application will often have some kind of “memory budget”, the maximum amount of memory it can use. Then it can be a good idea to allocate a chunk of memory that big when the application starts up and then divide it and use it (as suggested by Christian Gyrling here).

You probably need some extra memory for debugging purposes so you can allocate that memory in a separate “debugging” chunk.

Of course it is also possible to allocate memory “as we go” and your custom memory manager will still make it a lot easier to keep track of your memory usage, detect memory leaks, make memory alignment easier etc.

In the next part, I will get down to business and show some allocator implementations. So until then, think about your data access patterns ;)


Adventures in the HTML5 history API

2011-09-08 19:20


When developing drinkmixen we decided that we needed to have a JavaScript slider for pagination (like the GitHub tree slider).

The difference is that our slider is used in pagination.


To accomplish this we had to look into the HTML5 history API. This is not an HTML API per se but a JavaScript interface. The main idea is that the click on a page link, next button or previous button is intercepted. The JavaScript method pushState() is then used to push a state onto the browser’s history stack.

The main idea looks like this (with jQuery):

    history.pushState({location: this.href}, "page", this.href);
    $("#center").replaceWithGet(this.href, current_url);

So, a new state is pushed onto the browser history stack, the new page is loaded with an Ajax GET call and the slide is done with a custom jQuery animation. Now the URL is updated when the user clicks on a page link and all is fine.

The next problem arises when the user clicks his/her back button in the browser. Then we need a popState handler to handle this event. What we basically do in the popState handler is just load and display the appropriate page with a nice animation.

It is not all that easy though… It works fine in some browsers but not at all in others, since different browsers have different opinions on when a popState event should be fired. For a nice cross-browser solution (except for Internet Explorer, of course, which does not support the history API at all) I would suggest looking at pjax.

To have real cross-browser support for the history API there is always History.js but since we did not want to include more external JavaScript libraries we decided that Internet Explorer users will have a less pleasing visual experience.

P.S. I often use IE myself :) D.S.


Updated portfolio



I have updated my portfolio to include some newer and also current projects. Be sure to check it out!

In the very near future this blog will also serve as a development blog for the game engine we are developing, so be sure to check back from time to time if you are interested.


Hello World (again)



Welcome to my new homepage where I will blog a lot more than I did on my old page, I promise.

My blog posts will be about computers, programming, software, computer graphics and to some extent life in general :)

My first tip is the software that has actually created this page: nanoc.

nanoc (yes, it must be written with the first letter lower case) is a static site compiler. It works much like a compiler for a traditional programming language - you give it a file of rules and it compiles your source code into a result.

But now, why would you want a static site in the 21st century?

The answer is efficiency. Since what you see here is only static pages, there is no code executed on the web server. This also gives better security - try to hack a static page ;). It also gives you more control since you have all the code for your web page; your articles are not lying around in some database at some third-party web host.

nanoc is also very flexible and can be extended in a number of ways. For dynamic content, JavaScript can be used. I use Disqus for comments and it is really nice! So, for a small to medium sized website it can be very suitable.

Other static site generators are, for example, Jekyll (written in Ruby, like nanoc) and Hyde (written in Python).