binary/elf/tls

tls_segment_register(Reg:input_reg)

Thread Local Storage (TLS)

ELF binaries use a number of optimization models for dynamically loading and statically linking TLS variables. This module defines predicates for inferring and resolving thread-local variable references in ELF binaries.

  • ‘tls_segment’ computes the boundaries and alignment of the TLS data block.

  • ‘tls_index’ locates GOT-allocated TLS structs used by dynamic TLS models.

  • ‘tls_get_addr’ identifies __tls_get_addr calls.

  • ‘tls_global_dynamic’ computes “General Dynamic” code sequences.

  • ‘tls_local_dynamic’ disambiguates “Local Dynamic” code sequences.

  • ‘tls_descriptor’ resolves tlsdesc structs used by the TLS descriptor model.

  • ‘tls_desc_call’ identifies indirect @TLSDESC calls.

  • ‘tls_relative_addr’ computes TLS block-relative offsets in instruction operands.

Below is an overview of the supported ELF TLS models with code detailing the “initial” relocations (object-file) and “outstanding” relocations (linked binary).

  1. “General Dynamic” (GD) - dynamic TLS

Dynamic TLS with offsets resolved dynamically by relocations in GOT.

    CODE                                    RELOCATIONS
    ------------------------------------------------------------( .o)--
    add  REG, _GLOBAL_OFFSET_TABLE_
    lea  EAX, X@TLSGD[REG]                  TLS_GD
    call ___tls_get_addr@PLT                PLT32
    mov  EAX, DWORD PTR [EAX]
    ------------------------------------------------------------(.so)--
    add  REG, _GLOBAL_OFFSET_TABLE_
    lea  EAX, N
    call ___tls_get_addr@PLT                PLT32
    mov  EAX, DWORD PTR [EAX]
    …
    GOT(N  ):   .zero   4                   DTPMOD32
    GOT(N+1):   .dword  OFFSET              DTPOFF32

General Dynamic resolves the address of the TLS variable directly through __tls_get_addr using an initialized OFFSET with relocation in the GOT.

Note that this example is for x86-32. While the implementation of these models is generally consistent between x86-32 and x86-64, relocation constants and names vary. In particular, x86-64 uses a fixed code template for the General Dynamic model, detailed in ‘tls_global_dynamic’.

  2. “Local Dynamic” (LD)

Dynamic TLS with static offsets.

    CODE                                    RELOCATIONS
    ------------------------------------------------------------( .o)--
    lea  REG, X@TLSLD[RIP]                  TLSLD
    call __tls_get_addr@PLT
    …
    mov  REG, X@DTPOFF[RAX]                 DTPOFF32
    ------------------------------------------------------------(.so)--
    lea  REG, N
    call __tls_get_addr@PLT
    …
    mov  REG, [RAX+OFFSET]
    …
    GOT(N):     .zero   8                   DTPMOD64
                .zero   8

Local Dynamic resolves the beginning of the TLS block through __tls_get_addr, but the offset field in GOT is uninitialized. Consequently, this optimization level requires one less relocation in the GOT, but we must infer variable references from a static OFFSET in instruction operands - resolved by the linker - for @DTPOFF relocations.

  3. “Initial Executable” (IE)

Static TLS with offset relocations allocated in the GOT.

    CODE                                    RELOCATIONS
    ------------------------------------------------------------( .o)--
    mov  RAX, X@GOTTPOFF                    GOTTPOFF
    mov  FS:[RAX], RAX
    -----------------------------------------------------------(.exe)--
    mov  RAX, N
    mov  FS:[RAX], RAX
    …
    GOT(N):     .zero   8                   TPOFF64

IE requires a single relocation, resolved at load time and stored in the GOT entry for X. @GOTTPOFF relocations appear in position-independent code; @INDNTPOFF relocations appear in x86-32 position-dependent code.

  4. “Local Executable” (LE)

Static TLS (link-time) with no relocations.

Resolves all TLS references to block-relative offsets statically, without dynamic relocations or indirect references through the GOT.

    CODE                                    RELOCATIONS
    ------------------------------------------------------------( .o)--
    mov  RAX, DWORD PTR FS:X@TPOFF          TPOFF32
    -----------------------------------------------------------(.exe)--
    mov  RAX, DWORD PTR FS:[-4]             NONE

The TLS block address is stored in the FS segment register on x86-64 and in the GS segment register on x86-32. The @TPOFF relocations are resolved by the linker to integral offsets, leaving no outstanding relocations.

  5. “TLS Descriptors” (TLSDESC)

Dynamic TLS optimization with indirect call to a lazy relocation function pointer in GOT.

    CODE                                    RELOCATIONS
    ------------------------------------------------------------( .o)--
    lea  RAX, X@TLSDESC[RIP]                TLSDESC
    call [QWORD PTR [RAX+X@TLSCALL]]
    mov  RDX, QWORD PTR FS:0
    add  RAX, RDX
    mov  EAX, DWORD PTR [RAX]
    ------------------------------------------------------------(.so)--
    lea  RAX, [RIP+N]
    call QWORD PTR [RAX]
    mov  RDX, QWORD PTR FS:0
    add  RAX, RDX
    mov  EAX, DWORD PTR [RAX]
    …
    GOT(N):     .zero   8                   TLSDESC

General Dynamic and Local Dynamic access models to thread-local variables are “known to be extremely inefficient because of the need to call a function to obtain the address of a thread-local variable” [3] (register clobbering).

The TLS descriptor model is an optimized variant (ca. 2018) that replaces tls_index structs with tlsdesc structs and uses the @TLSDESC and @TLSCALL symbol attributes.

Descriptor TLS uses a relocated function pointer - stored in the GOT tlsdesc struct - to lazily preserve call-clobbered registers and call __tls_get_addr().

See ‘tls_descriptor’.

See the following for more detailed documentation:

[1] https://www.uclibc.org/docs/tls.pdf
[2] https://docs.oracle.com/cd/E19120-01/open.solaris/819-0690/6n33n7feo/index.html
[3] https://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt

tls_segment(Start:address, End:address, Align:unsigned)

A TLS data segment, which may contain contiguous sections (i.e. ‘.tbss’ and ‘.tdata’), begins at address ‘Start’ and ends at address ‘End’ and is aligned to ‘Align’ bytes.
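As a rough illustration of how such boundaries could be derived from section headers, the sketch below merges TLS sections into a single (Start, End, Align) triple. The `Section` tuple and the `tls_segment` helper are hypothetical names used only for this example, and taking the largest section alignment as the segment alignment is an assumption, not something this module specifies.

```python
from typing import Iterable, NamedTuple, Tuple


class Section(NamedTuple):
    # Hypothetical, simplified view of an ELF section header.
    name: str
    address: int
    size: int
    align: int


def tls_segment(sections: Iterable[Section]) -> Tuple[int, int, int]:
    """Return (start, end, align) covering all given TLS sections."""
    secs = sorted(sections, key=lambda s: s.address)
    if not secs:
        raise ValueError("no TLS sections")
    start = secs[0].address
    end = max(s.address + s.size for s in secs)
    # Assumption: the segment inherits the strictest section alignment.
    align = max(s.align for s in secs)
    return start, end, align


segment = tls_segment([
    Section(".tdata", 0x3000, 0x10, 8),
    Section(".tbss", 0x3010, 0x20, 16),
])
```

In a real binary the same triple can usually be read from the PT_TLS program header instead of being recomputed from sections.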

tls_index(EA:address, Offset:unsigned)

A ‘tls_index’ struct is located in the GOT at address ‘EA’ for a TLS variable at some ‘Offset’ into the TLS block.

‘tls_index’ structs are allocated to two contiguous entries in the GOT:

    typedef struct {
        unsigned long int ti_module;
        unsigned long int ti_offset;
    } tls_index;

Note that ‘Offset’ is initialized for @TLSGD relocations and zero for @TLSLD.
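To make the two-entry layout concrete, here is a minimal sketch that decodes a tls_index struct from raw GOT bytes. The `read_tls_index` helper and the sample GOT contents are hypothetical. It assumes little-endian x86 and that `unsigned long int` is 8 bytes on x86-64 and 4 bytes on x86-32.

```python
import struct
from typing import NamedTuple


class TlsIndex(NamedTuple):
    ti_module: int
    ti_offset: int


def read_tls_index(got: bytes, entry: int, word_size: int = 8) -> TlsIndex:
    """Decode the tls_index occupying GOT entries `entry` and `entry + 1`."""
    # Assumption: little-endian, one machine word per struct member.
    fmt = "<QQ" if word_size == 8 else "<II"
    module, offset = struct.unpack_from(fmt, got, entry * word_size)
    return TlsIndex(module, offset)


# Hypothetical GOT: one unrelated entry, then a tls_index whose ti_module is
# still zero (awaiting a DTPMOD relocation) and whose ti_offset is 0x18.
got = struct.pack("<QQQ", 0xDEADBEEF, 0, 0x18)
index = read_tls_index(got, 1)
```

A zero `ti_offset` alone does not identify a @TLSLD entry, since a General Dynamic variable may legitimately sit at offset zero.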

call_tls_get_addr(Call:address, Reg:register)

Identifies a call at address ‘Call’ to the builtin function ‘__tls_get_addr’.

Reg: the register loaded with the resolved address (either the beginning of the TLS segment for @TLSLD relocations, or the address of the variable for @TLSGD relocations).

tls_get_addr(Load:address, Call:address, Dest:address)

A TLS address is resolved dynamically with a call to the builtin ‘__tls_get_addr(tls_index *ti)’ function at address ‘Call’.

The address of the variable’s tls_index struct is loaded in the instruction at address ‘Load’.

The resolved address ‘Dest’ is either the beginning of the TLS segment for @TLSLD relocations, or the address of the variable for @TLSGD relocations.

tls_desc_call(Load:address, Call:address, Dest:address)

A TLS variable is resolved dynamically by indirect call to a TLSCALL relocated builtin function at address ‘Call’.

The address of the variable’s tlsdesc struct is loaded in the instruction at address ‘Load’.

tls_global_dynamic(EA:address)

Find all “General Dynamic” code sequences.

First we find code sequences referencing tls_index structs in GOT with both DTPMOD and DTPOFF relocations.

However, the GD and LD models cannot be distinguished by the presence of a DTPOFF relocation alone, as static variables may have only a DTPMOD relocation and still use the General Dynamic (@TLSGD) model.

For x86-64, we distinguish @TLSGD from @TLSLD with the code template of the form:

    .byte 0x66
    lea RDI, X@TLSGD[RIP]
    .value 0x6666
    rex64
    call __tls_get_addr@PLT

GCC uses explicit directives to inline 0x66 bytes (which are actually ‘data16’ instruction prefixes) as padding. Likewise, the accompanying ‘rex64’ prefix on the call instruction inserts a 0x48 byte to extend the code sequence to the required 16-byte length.
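The padded template can be recognized directly at the byte level. The following sketch is only an illustration, and `match_gd_template` is a hypothetical helper rather than part of this module. It assumes the usual encodings: `48 8D 3D disp32` for `lea RDI, [RIP+disp32]` and `E8 rel32` for the call.

```python
import struct
from typing import Optional, Tuple

GD_TEMPLATE_LEN = 16


def match_gd_template(code: bytes, pos: int = 0) -> Optional[Tuple[int, int]]:
    """Return (lea_disp, call_rel) if the 16-byte GD template starts at pos."""
    seq = code[pos:pos + GD_TEMPLATE_LEN]
    if len(seq) < GD_TEMPLATE_LEN:
        return None
    # data16 prefix, then lea RDI, [RIP+disp32] (REX.W 8D, ModRM 0x3D).
    if seq[0:4] != b"\x66\x48\x8d\x3d":
        return None
    # Two data16 prefixes, the rex64 prefix, then call rel32 (E8).
    if seq[8:12] != b"\x66\x66\x48\xe8":
        return None
    (lea_disp,) = struct.unpack_from("<i", seq, 4)
    (call_rel,) = struct.unpack_from("<i", seq, 12)
    return lea_disp, call_rel


# Padded General Dynamic sequence versus an unpadded lea/call pair.
gd_code = bytes.fromhex("66488d3d10200000666648e8f0ffffff")
ld_code = bytes.fromhex("488d3d10200000e8f0ffffff") + b"\x90" * 4
gd_match = match_gd_template(gd_code)
ld_match = match_gd_template(ld_code)
```

Matching on raw bytes is a shortcut for the example; working from decoded instructions and their prefixes would be less brittle.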

Finally, for x86-32 we identify tls_index structs with non-zero offsets and propagate backward to adjacent structs, as the offset value may be initialized to zero. Note that this approach will replace a @TLSGD relocation with a @TLSLD relocation for binaries with a single static TLS variable, as the two models are ambiguous in this case.

https://docs.oracle.com/cd/E19120-01/open.solaris/819-0690/chapter8-60/index.html

tls_local_dynamic(EA:address)

Identify and disambiguate @TLSLD/@TLSLDM by exclusion of previously computed @TLSGD relocations.

General Dynamic TLS uses two relocations, DTPMOD and DTPOFF, allowing __tls_get_addr to return the address of the variable directly.

Local Dynamic TLS uses a single DTPMOD relocation, and __tls_get_addr returns the start of the TLS block. Variables are addressed with integral offsets in indirect operands.

We identify @TLSLD relocations as calls to __tls_get_addr that are not General Dynamic. For x86-64, we distinguish @TLSGD from @TLSLD with the code template described in ‘tls_global_dynamic’.
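The exclusion rule can be summarized as a small decision procedure. This is a simplified sketch, not the module’s actual logic: `classify_tls_index` is a hypothetical helper, relocations are modeled as a plain address-to-name mapping, and `uses_gd_template` stands in for the x86-64 code-template check.

```python
from typing import Dict, Optional


def classify_tls_index(relocs: Dict[int, str], ea: int,
                       uses_gd_template: bool = False,
                       word_size: int = 8) -> Optional[str]:
    """Classify the tls_index at GOT address `ea` as "GD" or "LD"."""
    if not relocs.get(ea, "").startswith("DTPMOD"):
        return None  # no tls_index at this address
    if relocs.get(ea + word_size, "").startswith("DTPOFF"):
        return "GD"  # both relocations present: General Dynamic
    # Only DTPMOD: ambiguous on its own, so fall back to the code template.
    return "GD" if uses_gd_template else "LD"


# Hypothetical outstanding relocations, keyed by GOT address.
relocs = {
    0x4000: "DTPMOD64", 0x4008: "DTPOFF64",  # global variable, GD
    0x4010: "DTPMOD64",                      # static variable or LD
}
```

For example, the entry at 0x4010 is reported as “LD” unless the referencing code uses the padded General Dynamic template.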

tls_descriptor(EA:address, Offset:unsigned)

A ‘tlsdesc’ struct is located in the GOT at address ‘EA’ and references a symbol at some ‘Offset’ in the ‘tls_segment’.

‘tlsdesc’ structs are allocated to two contiguous entries in the GOT, for a struct of the general form:

    struct tlsdesc {
        void *arg;
        uint64_t arg_slot;
    };

Descriptor structs have a single outstanding relocation for the first struct member, a pointer referencing one of the following resolution functions:

    _dl_tlsdesc_return(struct tlsdesc *on_rax);
    _dl_tlsdesc_undefweak(struct tlsdesc *on_rax);
    _dl_tlsdesc_dynamic(struct tlsdesc *on_rax);

Note that these functions take a single pointer argument, allowing the code to load the address of the struct and indirectly call the function, chosen by the dynamic loader, at the same location, e.g.:

    lea  RAX, X@TLSDESC[RIP]
    call [QWORD PTR [RAX+X@TLSCALL]]

Consequently, variations of this model are best explained in terms of the tlsdesc struct after dynamic loading (runtime).

  1. static            _dl_tlsdesc_return       Offset
  2. undefined weak    _dl_tlsdesc_undefweak    Addend
  3. unallocated       _dl_tlsdesc_dynamic      struct tlsdesc_dynamic_arg*

We can infer the target of tlsdesc relocations using the addend of the TLSDESC relocation. For x86-32, static TLS variables may store the offset directly in the arg_slot field.
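A minimal sketch of that inference follows. The `tlsdesc_offset` helper is hypothetical, and it assumes that a relocation either carries an explicit addend or leaves the offset in place in the second word of the struct, following the field order shown above.

```python
import struct
from typing import Optional


def tlsdesc_offset(got: bytes, got_base: int, ea: int,
                   addend: Optional[int] = None, word_size: int = 8) -> int:
    """Infer the TLS offset referenced by the tlsdesc struct at address `ea`."""
    if addend is not None:
        # Assumption: an explicit relocation addend holds the offset.
        return addend
    # Otherwise read the offset stored in place in the arg_slot field.
    fmt = "<Q" if word_size == 8 else "<I"
    (arg_slot,) = struct.unpack_from(fmt, got, ea - got_base + word_size)
    return arg_slot


# Hypothetical x86-32 GOT holding one tlsdesc with offset 0x24 in arg_slot.
got32 = struct.pack("<II", 0, 0x24)
offset32 = tlsdesc_offset(got32, 0x4000, 0x4000, word_size=4)
offset64 = tlsdesc_offset(bytes(16), 0x4000, 0x4000, addend=0x10)
```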

Also, static TLS variables may be further optimized by the compiler using a combination of @TLSDESC/@TLSCALL and @DTPOFF relocations, using a code sequence like:

    lea  RAX, _TLS_MODULE_BASE_@TLSDESC[RIP]
    call [QWORD PTR [RAX+_TLS_MODULE_BASE_@TLSCALL]]
    mov  RSI, QWORD PTR FS:0
    lea  R8, X@DTPOFF[RAX+RSI]
    …
    add  RAX, OFFSET Y@DTPOFF

All @DTPOFF relocations will be resolved by the linker to integral indirect operand offsets, and only a single outstanding TLSDESC relocation will exist for consecutive static variables.

Use TLSDESC with the ‘-mtls-dialect=gnu2’ option for GCC.

tls_relative_operand(EA:address, Index:operand_index, Dest:address, Type:symbol)

Instruction at address ‘EA’ references a TLS data address ‘Dest’ in the operand at ‘Index’ using a TLS relocation of some ‘Type’.

Relocation ‘Type’ is one of the following labels, corresponding to those used in the ‘symbolic_operand_attribute’ predicate:

    DTPOFF    Local Dynamic       32+64
    TPOFF     Local Executable    64
    NTPOFF    Local Executable    32

Note that TLSGD and TLSLD attributes are excluded as they reference GOT, not TLS offsets. Likewise for “Initial Executable” relocations (e.g. @GOTTPOFF).
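As an illustration of how ‘Dest’ might follow from an operand offset, the sketch below resolves each relocation type against a TLS segment. The `tls_dest` helper is hypothetical. It assumes that DTPOFF offsets are relative to the start of the block, and that TPOFF/NTPOFF offsets are negative and relative to the aligned end of the block, where the thread pointer sits on x86.

```python
def align_up(value: int, align: int) -> int:
    """Round `value` up to a multiple of `align` (a power of two)."""
    return (value + align - 1) & ~(align - 1)


def tls_dest(rel_type: str, offset: int,
             start: int, end: int, align: int) -> int:
    """Resolve a TLS block-relative operand offset to a data address."""
    if rel_type == "DTPOFF":
        # Assumption: offset from the beginning of the TLS block.
        return start + offset
    if rel_type in ("TPOFF", "NTPOFF"):
        # Assumption: negative offset from the aligned end of the block.
        return align_up(end, align) + offset
    raise ValueError(f"not a TLS block-relative type: {rel_type}")


# With a segment spanning 0x3000..0x3030, FS:[-4] refers to the last word.
le_dest = tls_dest("TPOFF", -4, 0x3000, 0x3030, 16)
ld_dest = tls_dest("DTPOFF", 0x18, 0x3000, 0x3030, 16)
```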

tls_operand_attribute(Type:symbol, Attribute:symbol)

Map TLS relocation types to one or more symbolic operand types.