Storing Array Sizes Inside Pointers with Pointer Tagging

AuthorKevin Nygaard
Published

Pointers contain empty space usable for storing arbitrary data, and storing metadata about the pointed-to object in them – like the size of an array – creates clean and efficient interfaces. A pointer "tagged" with an array's size contains all the necessary information for its access, describing the entire array, or part of the array, via a single pointer.

Contents

1 Arrays and Array Metadata

A sequence of elements, the array is a staple data structure in programming languages. The C programming language lays out arrays sequentially in memory, one element after another, and returns a pointer to the first element. However, the number of elements, and equivalently the overall size of the array, is omitted; encoding, storing, and using the size – should it be needed – is the programmer's responsibility.1

As a special case, C encodes string literals (arrays of characters) with an implicit count via a sentinal value; it adds an extra element to the end of the array, a null or zero byte, which marks the end of valid characters. Other languages, like Pascal, add an extra element to the beginning of the array, a count byte, which contains the total number of elements in the array. Either way, the count must be communicated somehow, either by storing it implicitly or explicitly, or hardcoding it into an algorithm. Programmers face the same issue.

Both C and Pascal store string lengths within the array itself, simplifying usage; a single piece of information, the address of the array, communicates both the location and the extents. However, slicing an array – creating new arrays containing the first 3 elements, the last 3 elements, elements 7 through 9, etc – requires modifying the original array or duplicating the data. This points to a fundamental flaw: array metadata – data about the array itself – is stored within the array.

Storing metadata outside the array enables copy-free slicing, but at the expense of a more complex interface: using an array requires both an address and a count, whereas before, an address was sufficient. Furthermore, storing the address and count together efficiently is challenging: due to alignment requirements, an 8 byte pointer with a 1 byte count makes a 16 byte structure, effectively doubling the space for array references. Separating them avoids the penalty, but juggling the loose variables is clumsy and error prone.

The metadata fits awkwardly both inside and outside the array. Placing this data in between the two would be ideal – but is that even possible? Where exactly is this middle area? The pointer holds the answer.

2 Alignments and Virtual Address Spaces

Programs operate in a 64-bit byte-addressable space, but some addresses are unused due to alignment requirements and virtual address space limitations. For example, an array of 64-bit integers aligned on an 8 byte boundary has each element addressed at an 8 byte multiple, leaving the lower three bits always zero. Arrays of larger objects aligned to coarser boundaries yield even more unused bits.2 Furthermore, the hardware and operating system leave the upper bits of virtual address space unused, reserving them for future use. Exabytes of memory are unnecessary on today's systems; on macOS, the upper 17 bits are unused.3

However, the system still stores and operates on all 64 bits of a word, so every pointer has at least 2 bytes of usable space for arbitrary data. Pointer tagging is the technique of stuffing data into pointers – removing the data, restoring the address to its original value, is required before dereferencing.

The figure below shows an array configuration and its usable tag bits, bits which are unused by the system and therefore usable for other purposes.

A diagram showing the top 17 bits and lower 3 bits of each 8-byte element in an array never change.
Figure: Virtual memory address space with unused bits bold and highlighted

These tag bits are the "in between" area referred to earlier, the ideal location for storing metadata about an array. The following stores the size of an array in the upper 16 tag bits.

3 Tagged Pointers

Below is one approach of inserting a size into an array pointer – "tagging" a pointer with the array's size. It stores a negative size in the upper 16 bits which enables familiar pointer arithmetic. The code is divided into three sections: encoding a pointer and size into a tagged pointer, decoding a tagged pointer into a pointer and size, and iterating over arrays using tagged pointers:

#include <stdint.h>
#include <stdio.h>
#define TagMask 0xffff000000000000ULL
#define One     0x0001000000000001ULL
void *Encode(void *Data, int Size) {
    uint64_t Ret  = (uint64_t)Data & ~TagMask
                  | (uint64_t)-Size << 48;
    return (void *)Ret;
}
typedef struct { void *Data;  int Size; } info;
info Decode(void *Pointer) {
    int64_t Raw = (int64_t)Pointer;
    info Ret = {0};
    Ret.Size = -(Raw >> 48);
    Ret.Data = (void *)(Raw & ~TagMask);
    return Ret;
}
#define Valid(Pointer) ((int64_t)(Pointer) < 0)
int main(int ArgCount, char *Args[]) {
    int     Foo[]      = {1, 1, 2, 3, 5, 7, 11};
    double  Bar[]      = {3.14, 1.41, 2.71};
    char   *Baz[]      = {"hello", "world"};
    void   *FooPointer = Encode(Foo, sizeof(Foo));
    void   *BarPointer = Encode(Bar, sizeof(Bar));
    void   *BazPointer = Encode(Baz, sizeof(Baz));
    for (int *At = FooPointer; Valid(At); At += One) {
        printf("%d ", *(int *)Decode(At).Data);
    }
    for (double *At = BarPointer; Valid(At); At += One) {
        printf("%.2f ", *(double *)Decode(At).Data);
    }
    for (char **At = BazPointer; Valid(At); At += One) {
        printf("%s ", *(char **)Decode(At).Data);
    }
    return 0;
}
// Output: 1 1 2 3 5 7 11 3.14 1.41 2.71 hello world

The tagged pointer holds the negative size in the upper 16 bits, limiting the size of an array to 32 kB. Encode() takes an untagged pointer Data and a Size, zeros out the top 16 bits of the pointer, shifts up the negated size to this location, and bitwise ORs it in. The returned pointer is now tagged. Decode() performs the inverse operation on a tagged Pointer, shifting down and unnegating the size into Ret.Size, and clearing the upper 16 size bits, restoring the original pointer into Ret.Data. The returned structure contains the untagged pointer and size.

To illustrate working with different array types and lengths, the example usage defines three arrays: one of 32-bit integers, one of 64-bit doubles, and one of 64-bit character pointers. The array pointers and their total sizes are encoded into tagged pointers, followed by three loops which step through the arrays using these pointers.

Valid tagged pointers, as checked by the Valid() macro, are negative; their top bit is set, implying a positive, non-zero Size (NB, the size is stored negated). This also recognizes untagged pointers, which have their upper bits set to zero, as invalid.4 The iteration step adds One to the pointer, which contains a 1 for each "field" of the tagged pointer: the tag field, and the address field. This is a form of Single Instruction, Multiple Data (SIMD) Within A Register (SWAR). C's pointer arithmetic scales this value according to the pointer's object size and adds it. This increments the address by one element, and decrements the size by one element, tracking the amount of valid bytes in the adjusted array. The pointer is "untagged" with Decode() before dereferencing.

The result is a "size-aware" pointer, accurately reflecting the number of valid bytes at all times (provided modifications are done using One). Saving it or passing it to functions mid-iteration is safe. It also correctly recognizes over-incremented pointers as invalid (but not under-incremented ones). Furthermore, manually reducing the size (via re-encoding or using a macro similar to One) allows for sub-array creation. Overall, this is a clean and efficient solution to the metadata problem.

3.1 Advanced

The version below explores an experimental solution. It shares the same functionality as above, but utilizes SWAR for encoding and manipulating sizes too. Additionally, the compiler-dependent typeof() operator removes noisy pointer casting. Finally, it showcases sub-arrays which, although possible in the above code, are more succinct in this implementation.

#include <stdint.h>
#include <stdio.h>
#define TagMask        0xffff000000000000ULL
#define One            0x0001000000000001ULL
#define TagOne         TagMask
#define TagCount(A)    ((int)(-((int64_t)(A) >> 48)))
#define Valid(Pointer) ((int64_t)(Pointer) < 0)
#define Decoded(A)     ((typeof(A))((int64_t)(A) & ~TagMask))
#define ArrayCount(A)  (sizeof(A) / sizeof(*(A)))
int main(int ArgCount, char *Args[]) {
    int  Foo[]      = {1, 1, 2, 3, 5, 7, 11};
    int *FooPointer = Foo + TagOne * ArrayCount(Foo);
    for (int *At = FooPointer; Valid(At); At += One) {
        printf("%d ", *Decoded(At));
    }
    int *First3 = Decoded(FooPointer) + TagOne * 3;
    for (int *At = First3; Valid(At); At += One) {
        printf("%d ", *Decoded(At));
    }
    int *NoEnds = FooPointer + One - TagOne;
    for (int *At = NoEnds; Valid(At); At += One) {
        printf("%d ", *Decoded(At));
    }
    return 0;
}
// Output: 1 1 2 3 5 7 11 1 1 2 1 2 3 5 7

TagOne is a negative 1 in the tag field, which, due to two's complement encoding, is equivalent to TagMask. Since the size is encoded as a negative number, making TagOne negative enables intuitive size modification through addition and subtraction: the former increases the size and the latter decreases it. Similar to One, it encodes and modifies the size via pointer arithmetic.

The first loop prints the entire array, as before. First3 is a sub-array which contains only the first three elements. Similarly, NoEnds is a sub-array which omits the first and last elements. All three tagged pointers draw from the same data source, but from different locations and sizes.

3.2 Native Compiler Support

Although elegant, the above avante-garde approach stretches C beyond its intended use cases; the compiler (rightfully) complains from its static code analysis5 – but the code does work. However, native compiler support gives a robust and type-safe solution:

int main(int ArgCount, char *Args[]) {
    int         Data[] = {1, 1, 2, 3, 5, 7, 11};
    tagged int *Foo    = Data;
    for (int Index = 0; Index < Foo.count; Index++) {
        printf("%d ", Foo[Index]);
    }
    tagged int *First3 = Data[0:3];
    for (int Index = 0; Index < First3.count; Index++) {
        printf("%d ", First3[Index]);
    }
    tagged int *NoEnds = Data[1:-1];
    for (int Index = 0; Index < NoEnds.count; Index++) {
        printf("%d ", NoEnds[Index]);
    }
    return 0;
}

Implementation is left as an exercise to the reader.

4 Other Uses and Limitations

Pointer tagging is an economical and cache-friendly technique for storing metadata, but it has downsides. Development tools struggle with tagged pointers, and debugging (both the tools and the program) is difficult. The upper tag bits are few, implementation-dependent, and may disappear in the future.6 The lower tag bits are even fewer and require data alignment. And as a limited and useful resource, other programs may be using them already.

Because they are cache-friendly, tagged pointers are often used for high-performance tasks. Memory allocators store retain counts for garbage collection and allocation sizes for bounds checking. Object-oriented and interpreted systems store type information, or, if small enough, the entire object or value itself in lieu of an address, removing one level of indirection. With a little ingenuity and positive outlook, a bleak reality (8-byte pointers!) becomes an opportunity (6-byte pointers with 2 free bytes!).

Thank you for reading; I hope this exploration of tagged pointers was insightful. Feedback and corrections are welcome via email. ✚

See Also

Footnotes

  1. A responsibility poorly kept; according to MITRE's Common Weakness Enumeration (CWE) report, ignoring buffer boundaries is a common issue.
  2. See the LLVM, GNU, and Microsoft resources for how to specify alignment. Also, make sure the compiler and operating system actually do the alignment. Trust neither. At the time of writing, the ARM macOS kernel aligns to a maximum of 16 kB, but the Mach-O header can specify more than that. Discovering this was interesting, as Clang, assuming the OS would do a 32 kB alignment, helpfully optimized out my alignment validation code (despite optimizations being disabled).
  3. You can easily find this value on your system using mmap(2) (or VirtualAlloc on Windows). Starting at the highest address, request a page of memory at every power-of-two; it will repeatedly fail until it enters the supported address range.
  4. This is operating system dependent; macOS sets the unused bits to zero.
  5. On the First3 line: warning: the pointer incremented by -844424930131968 refers past the last possible element for an array in 64-bit address space containing 32-bit (4-byte) elements (max possible 4611686018427387904 elements)[-Warray-bounds]. Translation: You are using tagged pointers and I don't like it.
  6. Code that lives to see this day is either trivial and uninteresting, or has more portability problems than just tagged pointers.

Further Resources