Introduction to Embedded Scripting Languages:
This blog is aimed at those with some understanding of memory management in programming languages and those curious about what a garbage collector is.
First contact: Squirrel
Recently I started working on a code base that was littered with files ending in a strange extension: .nut. Obviously, having the humour of a 12 year old, I proceeded to tell my friends all about this hilarious language that I’d love to pair with some coq 🐓.
Jokes aside, I really did wonder: what is the purpose of a garbage collected language in an embedded systems environment?
As a newbie to embedded systems programming, I had a few preconceptions about the field, especially when it comes to writing firmware for IoT devices.
I generally thought of embedded devices as falling into two camps: low power and low cost, and everything else.
When you are not low power and you can eat at places marked $$$ on Google Maps, feel free to throw processing power at the problem: put Linux or Android and lots of RAM on the device and run Java (please don’t).
If constrained by low cost and low power, you are usually limited to bare metal or an RTOS in C/C++.
For the rest, it is probably a specific combination of hardware and software that’s not really the focus right now.
All in all, the idea of having a garbage collected language run on IoT devices seemed pretty sacrilegious.
What’s embedded scripting?
Essentially, an embedded scripting language is a small interpreter that lives inside a host program (usually written in C or C++), making the system more programmable and flexible without recompiling the host.
“Squirrel is a high level imperative, object-oriented programming language, designed to be a light-weight scripting language that fits in the size, memory bandwidth, and real-time requirements of applications like video games.”
Just reading what squirrel-lang says about Squirrel seemed oxymoronic to me, so it prodded me to take a closer look at why it serves its purpose and serves it well.
Even without being ultra-lean like bare metal C, languages such as Squirrel and Lua manage an impressive blend: they are lightweight while still offering a rich set of features.
This unique combination allows for a more dynamic and customizable approach to embedded system development, enabling both developers and end-users to explore a broader range of functionalities without compromising on performance or resource utilization.
First thing to debunk: embedded scripting languages don’t add as much bloat as I had originally thought. Squirrel can be as little as 100-150KB and Lua is down to a shocking 63K!
We all know how easy it is to prototype something in Python, and embedded scripting languages support exactly that (and many other purposes). One of the most significant advantages of embedded scripting languages is their ability to make embedded systems more adaptable and interactive.
For instance, in the realm of video game development, these languages allow for rapid iteration of game logic and user interfaces, enabling developers to implement changes without the need for recompilation. This flexibility is not only a bonus for developers but also enhances the gaming experience by allowing for real-time game modifications and customizations.
Discovering Squirrel
As I delved deeper into the world of embedded scripting languages, Lua was a familiar companion, thanks to my adventures in Minecraft modding and tinkering with my Neovim setup. However, Squirrel was a novel encounter, a language I hadn’t crossed paths with before. This piqued my curiosity: Why does Squirrel exist?
Alberto Demichelis, the creator of Squirrel, found his inspiration while integrating Lua at CryTek, a game development company. Lua, for all its strengths, presented a challenge with its garbage collector’s unpredictability, which could potentially hamper real-time performance, a critical aspect in gaming. Lua’s approach to garbage collection, though somewhat manageable through the lua_gc() function, required fine-tuning of the step size argument to mitigate performance impacts. This setting varied widely across projects and needed adjustments with project evolution.
In an effort to circumvent these issues, Demichelis explored implementing reference counting in Lua. This method aimed to eliminate the need for periodic memory scans by the collector, instead tracking object ownership through reference counts and releasing objects when no longer referenced. However, the extensive modifications needed led him to consider a fresh start. Thus, Squirrel was born out of the desire to create a language that addressed Lua’s limitations, including its unconventional syntax, which often left co-workers puzzled over basic constructs like a for-loop, and its relatively limited feature set.
Squirrel was designed from the ground up to tackle these challenges, offering a solution that blends the ease of embedded scripting with improvements in garbage collection, syntax familiarity, and an enriched set of features. This innovative approach allows Squirrel to fill a unique niche in the embedded scripting landscape, particularly appealing to those in the game development sphere seeking a more predictable and developer-friendly language.
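To make the reference-counting idea a bit more concrete, here is a minimal sketch of how an object can be released the moment its last owner lets go, with no periodic scan involved. Every name in it (RefCounted, retain, release, ScriptString, refcount_demo) is made up for illustration; it is not how Squirrel or Lua actually implement anything.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not Squirrel's or Lua's actual code.
struct RefCounted {
    unsigned refs = 0;                 // number of owners currently holding this object
    virtual ~RefCounted() = default;
};

// Take ownership: bump the count.
inline void retain(RefCounted* obj) { ++obj->refs; }

// Drop ownership: the object is destroyed the moment the last owner lets go.
// Returns true if this call destroyed the object.
inline bool release(RefCounted* obj) {
    if (--obj->refs == 0) {
        delete obj;
        return true;
    }
    return false;
}

// A script-visible value that records its own destruction in a log.
struct ScriptString : RefCounted {
    std::string text;
    std::vector<std::string>* log;
    ScriptString(std::string t, std::vector<std::string>* l)
        : text(std::move(t)), log(l) {}
    ~ScriptString() override { log->push_back("freed " + text); }
};

// Two owners share one string; it is freed exactly when the second owner releases it.
inline std::vector<std::string> refcount_demo() {
    std::vector<std::string> log;
    ScriptString* s = new ScriptString("hello", &log);
    retain(s);                         // owner 1 (say, a local variable)
    retain(s);                         // owner 2 (say, a table slot)
    bool freed_early = release(s);     // owner 1 goes away: count 2 -> 1
    assert(!freed_early);
    log.push_back("one owner left");
    release(s);                        // owner 2 goes away: count 1 -> 0, freed right here
    return log;
}
```

Calling refcount_demo() returns a log in which "one owner left" comes before "freed hello". The cost of freeing is paid at a known point in the program rather than whenever a collector decides to run, which is the predictability the paragraph above is after.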
Some comparisons
I think it’s hard to frame the position of Squirrel without comparing it to the more familiar Lua.
Design and Performance: Squirrel vs. Lua
At its core, Squirrel shares a spiritual lineage with Lua. Both languages are built atop a register-based virtual machine (VM), showcasing their efficiency and adaptability in embedding within C and C++ programs. This foundational similarity extends to their compactness and the ability to manage multiple VMs independently, making them highly versatile in various programming contexts. However, Squirrel distinguishes itself through its implementation and language features.
While Lua is penned in C, Squirrel ventures a step further into modern programming paradigms with its implementation in C++, offering a C API that mirrors Lua’s stack-based approach. This strategic choice allows Squirrel to integrate seamlessly into projects, providing developers the flexibility to modify the language source directly if necessary. Both languages maintain a minimal footprint, with their compiled forms hovering around the 100-150 kilobyte mark, demonstrating their lightweight nature.
Performance discussions often highlight Lua’s edge in microbenchmarks, attributed to its streamlined bytecode processing and strategic memory management during benchmarking. Squirrel, despite being slightly slower in these microbenchmarks, introduces more data types and language features, enriching the developer’s toolkit. The trade-off lies in Squirrel’s approach to memory management. Through reference counting, Squirrel ensures a consistent, predictable overhead in memory management, crucial for applications like video games where timing is everything.
Real-World Performance Considerations
Demichelis emphasizes the importance of assessing language performance within real application scenarios, where factors like memory management, cache misses, and memory aliasing play significant roles. Squirrel’s design prioritizes predictable memory management costs and efficient data structures to mitigate these issues. Innovations in memory management within Squirrel’s ecosystem, such as optimizing for cache friendliness and leveraging OS-level memory alignment, have been shown to significantly boost performance in stress tests mimicking game loops.
In essence, Squirrel’s development philosophy centers around providing a robust, feature-rich scripting language that addresses the specific needs of real-time applications. Its approach to predictable memory management and performance optimization underscores its suitability for scenarios where timing and resource efficiency are paramount. As we delve deeper into Squirrel, it becomes clear that its existence is not merely to serve as an alternative to Lua but to offer unique solutions to the nuanced challenges of embedded and real-time programming.
Garbage Collection in Squirrel
Now I’m going to foray into the thing that has intrigued me the most & inspired me to go down this rabbit-hole in the first place - a garbage collector on embedded systems.
Why does garbage collecting affect performance anyways?
First thing to keep in mind is that, even without garbage collection, a program typically needs to allocate and free memory (unless you’re trying to make it to the MITRE CVE leaderboard!). Garbage collection allows the ‘free’-ing half of the process to happen in batches, ideally when the program is relatively idle or not encroaching on CPU bound tasks.
You’d want to batch these operations together to reduce the overhead of syscalls. If you happen to have a lot of memory spare, some collectors might even defer freeing until the end of the program!
The collector still has to spend resources analysing what is reachable.
Simply put, reachable values are those that are accessible or usable somehow. They are guaranteed to be stored in memory.
There’s a base set of inherently reachable values, that cannot be deleted for obvious reasons.
For instance:
- The currently executing function, its local variables and parameters.
- Other functions on the current chain of nested calls, their local variables and parameters.
- Global variables. (there are some other, internal ones as well)
One common way to track this is reference counting; tracing collectors instead walk the object graph starting from these roots. Either way, determining what is reachable is the real performance penalty. Additionally, this process does tend to briefly interrupt normal execution.
The interruptions can especially be a significant factor in applications that are sensitive to latency issues. This happens to include the common scenario of serving web content. Fortunately, there are also now garbage collector modes to reduce this impact.
The result is that, for most garbage collected platforms, the collector is one factor to consider when evaluating performance. The net effect of the garbage collector may be minimal (or even a net positive!) over the life of an application session, but there can be short-lived periods where there is significant negative impact, and it’s important to be aware of how to mitigate those issues.
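The reachability analysis described in this section can be sketched as a plain graph walk. This is a toy model under my own assumptions: objects are just indices, Heap, find_reachable and reachability_demo are hypothetical names, and a real collector works on actual objects rather than integer ids.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy heap: object i holds references to the objects listed in refs[i].
using Heap = std::vector<std::vector<std::size_t>>;

// Returns reachable[i] == true for every object that can be reached from a root.
inline std::vector<bool> find_reachable(const Heap& refs,
                                        const std::vector<std::size_t>& roots) {
    std::vector<bool> reachable(refs.size(), false);
    // Explicit stack of objects still to visit, used instead of recursion.
    std::vector<std::size_t> pending(roots.begin(), roots.end());
    while (!pending.empty()) {
        std::size_t id = pending.back();
        pending.pop_back();
        if (reachable[id]) continue;   // already visited; also stops cycles from looping forever
        reachable[id] = true;
        for (std::size_t child : refs[id]) pending.push_back(child);
    }
    return reachable;
}

inline std::vector<bool> reachability_demo() {
    // 0: a global, 1: a local variable, 2: referenced by 0,
    // 3 and 4: refer only to each other, so no root leads to them.
    Heap refs = { {2}, {}, {}, {4}, {3} };
    return find_reachable(refs, {0, 1});
}
```

In reachability_demo(), objects 0, 1 and 2 come back as reachable while 3 and 4 do not. Note that the walk touches every live object, which is why this step costs time proportional to how much data is alive and not how much garbage there is.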
Under the hood: A look at how the trash man operates
I’m not going to let it go without peeking under the hood, and since the codebase for Squirrel is small enough, I think it’s a good way to study how a garbage collector can be implemented.
Core Components:
1. Reference Counting:
- Implemented via the SQRefCounted structure.
- Every collectable object has a reference count (_uiRef).
- The Release() method is called once the reference count has dropped to zero and destroys the object.

```cpp
struct SQRefCounted {
    SQUnsignedInteger _uiRef;   // Reference count
    virtual ~SQRefCounted() {}
    virtual void Release() = 0; // Called when the count reaches zero; destroys the object
};
```
2. Mark-and-Sweep:
- Compiled in by default; defining NO_GARBAGE_COLLECTOR strips the GC-related code.
- SQCollectable extends SQRefCounted, adding _next and _prev for chaining collectable objects, and a mark flag (MARK_FLAG).
- AddToChain() and RemoveFromChain() manage collectables in a global chain.
- During the mark phase, Mark() is recursively called on objects to set the mark flag.
- UnMark() clears the mark flag on survivors so they start the next cycle unmarked.
- Sweep phase finalizes and deletes unmarked (unreachable) objects.

```cpp
struct SQCollectable : public SQRefCounted {
    SQCollectable *_next, *_prev;                 // For chaining collectable objects
    virtual void Mark(SQCollectable **chain) = 0; // Marks objects recursively
    void UnMark();                                // Clears the mark flag
    virtual void Finalize() = 0;                  // Finalizes the object
    static void AddToChain(SQCollectable **chain, SQCollectable *c);      // Adds objects to a global chain
    static void RemoveFromChain(SQCollectable **chain, SQCollectable *c); // Removes objects from the chain
};
```
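One reason a reference-counted runtime still carries a mark-and-sweep pass is reference cycles: two objects that point at each other never see their counts reach zero, even when nothing else can reach them. The sketch below shows the effect using std::shared_ptr as a stand-in for a generic reference count. Table, CycleResult and cycle_demo are hypothetical names, and none of this is Squirrel code.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type standing in for a script table that can reference another table.
struct Table {
    std::shared_ptr<Table> other;   // a strong (counted) reference
};

struct CycleResult {
    bool lone_freed;                // was the table outside any cycle freed?
    bool cycle_freed;               // were the two mutually referencing tables freed?
    bool cycle_freed_after_break;   // were they freed once the cycle was cut by hand?
};

inline CycleResult cycle_demo() {
    // weak_ptr observers let us ask "is it gone yet?" without keeping anything alive.
    std::weak_ptr<Table> watch_lone, watch_a, watch_b;
    {
        auto lone = std::make_shared<Table>();
        auto a = std::make_shared<Table>();
        auto b = std::make_shared<Table>();
        a->other = b;               // a -> b
        b->other = a;               // b -> a: a reference cycle
        assert(a.use_count() == 2); // one count from the local, one from b->other
        watch_lone = lone;
        watch_a = a;
        watch_b = b;
    }   // all three local owners go out of scope here

    CycleResult result{};
    result.lone_freed = watch_lone.expired();
    result.cycle_freed = watch_a.expired() && watch_b.expired();

    // Cut the cycle manually so this demo does not actually leak memory.
    if (auto a = watch_a.lock()) a->other.reset();
    result.cycle_freed_after_break = watch_a.expired() && watch_b.expired();
    return result;
}
```

Here lone_freed is true but cycle_freed is false: counting alone cannot see that the pair is unreachable. A tracing pass that starts from the roots never visits the pair, so it can reclaim them without anyone cutting the cycle by hand.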
Implementation Details:
Mark Phase:
- Initiated by SQSharedState::RunMark(), marking root objects and recursively marking reachable objects.
- Each collectable type (e.g., SQArray, SQTable, SQClass, etc.) implements its own Mark() method to mark its references.
- The MarkObject() function is used to dispatch the mark call based on object type.

```cpp
struct SQSharedState {
    void RunMark(SQVM *vm, SQCollectable **tchain);                // Initiates the marking phase
    SQInteger CollectGarbage(SQVM *vm);                            // Performs garbage collection, combining mark and sweep phases
    static void MarkObject(SQObjectPtr &o, SQCollectable **chain); // Dispatches mark calls based on object type
};
```
Sweep Phase:
- The CollectGarbage() method in SQSharedState triggers garbage collection, integrating both mark and sweep phases.
- Performed in SQSharedState::CollectGarbage(), iterating through the global chain of collectable objects.
- It finalizes and releases objects not marked (i.e., unreachable).

```cpp
SQInteger SQSharedState::CollectGarbage(SQVM *vm) {
    SQInteger n = 0;                 // Counter for collected objects
    SQCollectable *tchain = NULL;
    RunMark(vm, &tchain);            // Mark phase

    // Sweep phase: iterate over collectables and delete unmarked objects
    SQCollectable *t = _gc_chain;
    while (t) {
        if (!(t->_uiRef & MARK_FLAG)) {          // If object is unmarked
            SQCollectable *next = t->_next;
            if (--t->_uiRef == 0) t->Release();  // Delete if reference count is zero
            t = next;
            n++;                                 // Count only the objects that were swept
        } else {
            t->UnMark();                         // Clear mark for next GC cycle
            t = t->_next;
        }
    }
    _gc_chain = tchain;              // Update chain for next cycle
    return n;                        // Return number of collected objects
}
```
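To see the two phases working end to end outside of Squirrel’s codebase, here is a small self-contained toy collector. It is only a sketch under my own assumptions: Obj, ToyHeap, alloc, mark, collect and mark_sweep_demo are hypothetical names, the "chain" is a plain std::vector, and there is no reference counting involved at all.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy collector; the names are illustrative and not taken from Squirrel.
struct Obj {
    bool marked = false;
    std::vector<Obj*> refs;   // outgoing references to other heap objects
};

struct ToyHeap {
    std::vector<Obj*> all;    // every live allocation, playing the role of a global chain
    std::vector<Obj*> roots;  // inherently reachable objects (globals, stack slots)

    ToyHeap() = default;
    ToyHeap(const ToyHeap&) = delete;             // the heap owns raw pointers, so no copying
    ToyHeap& operator=(const ToyHeap&) = delete;
    ~ToyHeap() { for (Obj* o : all) delete o; }

    Obj* alloc() {
        all.push_back(new Obj());
        return all.back();
    }

    // Mark phase: flag everything reachable from the given object.
    static void mark(Obj* o) {
        if (o->marked) return;            // visited already; this also terminates on cycles
        o->marked = true;
        for (Obj* child : o->refs) mark(child);
    }

    // Runs one full collection and returns how many objects were freed.
    std::size_t collect() {
        for (Obj* r : roots) mark(r);
        std::vector<Obj*> survivors;
        std::size_t freed = 0;
        for (Obj* o : all) {
            if (o->marked) {
                o->marked = false;        // clear the flag for the next cycle
                survivors.push_back(o);
            } else {
                delete o;                 // unreachable: sweep it
                ++freed;
            }
        }
        all.swap(survivors);
        return freed;
    }
};

inline std::size_t mark_sweep_demo() {
    ToyHeap heap;
    Obj* global = heap.alloc();
    Obj* kept = heap.alloc();
    Obj* a = heap.alloc();
    Obj* b = heap.alloc();
    heap.roots.push_back(global);
    global->refs.push_back(kept);         // reachable through a root
    a->refs.push_back(b);                 // a and b only reference each other,
    b->refs.push_back(a);                 // so nothing reachable points at them
    std::size_t first = heap.collect();   // sweeps a and b
    std::size_t second = heap.collect();  // nothing left to sweep
    assert(second == 0);
    return first;
}
```

mark_sweep_demo() returns 2: the mutually referencing pair is swept while the root and the object it points to survive. The whole of collect() runs in one go, which is exactly the kind of pause a game loop has to budget for.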
Reference Table (RefTable):
- Manages references to objects, used for adding, releasing, and getting the reference count.
- Supports mark phase by marking all objects it holds references to.

```cpp
struct RefTable {
    void Mark(SQCollectable **chain); // Marks all objects it holds references to
    // Other methods omitted for brevity
};

void RefTable::Mark(SQCollectable **chain) {
    RefNode *nodes = _nodes;
    for (SQUnsignedInteger n = 0; n < _numofslots; ++n, ++nodes) {
        if (sq_type(nodes->obj) != OT_NULL) {
            SQSharedState::MarkObject(nodes->obj, chain); // Mark the object
        }
    }
}
```
Closing words: Hope you’ve learnt something! I found this rabbit hole pretty eye opening because I had previously black box’ed GCs and never really took a look under the hood. This was a gentle introduction to them and I hope I have the chance to circle back to see how more sophisticated ones operate. Till next time!