main

This is the main module of the Datalog disassembler. The disassembly has the following main components:

1- code_inference.dl
  • code_inference_postprocess.dl
  • cfg.dl

2- symbolization.dl
  • use_def_analysis.dl
  • value_analysis.dl
  • data_access_analysis.dl
  • pointer_reattribution.dl

In addition, there are several modules that consider special cases, generic components and tables.

Special cases:
  • relative_jump_tables.dl

Generic components:
  • ordered_set.dl
  • empty_range.dl

Tables:
  • float_operations.dl
  • jump_operations.dl

This module:
  • defines the input generated by the decoder
  • defines a series of auxiliary predicates and basic facts that are used everywhere
  • defines some hard-coded parameters of the analysis, such as the code and data sections explored

entry_point(ea:address)

endianness(End:symbol)

WARNING: Predicate not present in compiled Datalog program (Dead Code)

base_address(ea:address)

symbol(ea:address, size:unsigned, type:symbol, scope:symbol, visibility:symbol, sectionIndex:unsigned, originTable:symbol, tableIndex:unsigned, name:symbol)

section(Name:symbol, Size:unsigned, EA:address, Align:unsigned, Index:unsigned)

section_property(Name:symbol, Property:symbol)

section_type(Name:symbol, Type:unsigned)

byte_interval(BegAddr:address, EndAddr:address)

relocation(EA:address, Type:symbol, Name:symbol, Addend:number, SymbolIndex:unsigned, Section:symbol, RelType:symbol)

relocation_size(Type:symbol, Size:unsigned)

relocation_adjustment(EA:address, Adjustment:number, Reason:symbol)

Defines adjustments to relocation values.

relocation_adjustment_total(EA:address, Adjustment:number)

The total relocation adjustment for a location
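
A minimal Python sketch of how a per-location total could be derived from the individual adjustments. It assumes the total is the sum of all relocation_adjustment values recorded for the same EA; that summation, the function name and the sample reasons are illustrative assumptions, not taken from the Datalog source.

```python
from collections import defaultdict

def relocation_adjustment_total(adjustments):
    """Sum the adjustments per EA (assumed semantics of the total)."""
    totals = defaultdict(int)
    for ea, adjustment, _reason in adjustments:
        totals[ea] += adjustment
    return dict(totals)

# Hypothetical facts in the shape (EA, Adjustment, Reason).
facts = [
    (0x1000, 4, "example_a"),
    (0x1000, -8, "example_b"),
    (0x2000, 2, "example_c"),
]
totals = relocation_adjustment_total(facts)
```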

binary_type(Type:symbol)

binary_format(Format:symbol)

arch_info(Key:symbol, Value:symbol)

ArchInfo auxdata derived from ELF metadata.

binary_isa(ArchName:symbol)

WARNING: Predicate not present in compiled Datalog program (Dead Code)

option(Option:symbol)

dynamic_entry(tag:symbol, value:unsigned)

instruction(ea:address, size:unsigned, prefix:symbol, opcode:symbol, op1:operand_code, op2:operand_code, op3:operand_code, op4:operand_code, immOffset:unsigned, displacementOffset:unsigned)

instruction_writeback(EA:address)

The instruction at EA has capstone’s cs_arm.writeback set.

instruction_cond_code(EA:address, CondCode:symbol)

The instruction at EA has capstone’s cs_arm.cc set.

register_access(EA:address, Register:input_reg, AccessMode:access_mode)

The register Register is accessed at EA with AccessMode.

AccessMode may be “R” or “W”

instruction_op_access(EA:address, Index:operand_index, AccessMode:access_mode)

The operand at index Index is accessed at EA with AccessMode

op_register_bitfield(Code:operand_code, Index:unsigned, RegisterName:input_reg)

Index is the position of the register within the bitfield, starting at 0.

invalid_op_code(EA:address)

op_regdirect(Code:operand_code, RegisterName:input_reg)

op_fp_immediate(Code:operand_code, Imm:float)

WARNING: Predicate not present in compiled Datalog program (Dead Code)

op_immediate(Code:operand_code, Offset:number, SizeBytes:unsigned)

op_special(Code:operand_code, Type:symbol, Value:symbol)

op_indirect(Code:operand_code, Reg1:input_reg, Reg2:input_reg, Reg3:input_reg, Multiplier:number, Offset:number, SizeBytes:unsigned)

op_shifted(EA:address, Index:operand_index, Shift:unsigned, Type:symbol)

The operand identified by Index should be shifted with an immediate.

Used on ARM/ARM64, but not x86 or MIPS. Type is architecture-dependent.

op_shifted_w_reg(EA:address, Index:operand_index, Reg:input_reg, Type:symbol)

The operand identified by Index should be shifted with a register.

Used on ARM/ARM64, but not x86 or MIPS. Type is architecture-dependent.

address_in_data(EA:address, Value:address)

There is a potential address at ‘EA’ pointing to ‘Value’.

data_region(Begin:address, Size:unsigned)

ascii_string(EA:address, End:address)

Possible null-terminated ASCII string that begins at address ‘EA’ and extends to address ‘End’.
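
A rough Python sketch of how such candidates could be collected from raw bytes. It assumes that a candidate is a run of printable ASCII bytes (0x20 to 0x7E) immediately followed by a NUL byte, and that End is the address just past the terminator. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def ascii_strings(data, base):
    """Return (EA, End) pairs for printable runs that end in a NUL byte."""
    found = []
    start = None  # index where the current printable run began
    for i, byte in enumerate(data):
        if 0x20 <= byte <= 0x7E:
            if start is None:
                start = i
        else:
            # Only a NUL terminator closes a run as a string candidate.
            if byte == 0 and start is not None:
                found.append((base + start, base + i + 1))
            start = None
    return found

candidates = ascii_strings(b"hi\x00\x01ok\x00", 0x2000)
```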

reg_map_nullable(RegIn:input_reg, Reg:reg_nullable)

Maps input_reg to registers referred to by a single name. This is used to allow different register names that refer to the same storage to be tracked together, e.g., on x86, both AX and EAX are members of the EAX register.
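
The idea can be illustrated with a small Python lookup. The table below is a hypothetical excerpt based only on the x86 example in the text, and the "NONE" placeholder for a missing register is an assumption about how the nullable case might be represented.

```python
# Hypothetical excerpt: decoder register name -> single tracked name.
REG_MAP = {
    "AX": "EAX",
    "EAX": "EAX",
}

def reg_map_nullable(reg_in):
    """Return the tracked register for reg_in, or "NONE" if there is none."""
    return REG_MAP.get(reg_in, "NONE")

same_storage = reg_map_nullable("AX") == reg_map_nullable("EAX")
```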

reg_nonnull(RegNullable:reg_nullable, Reg:register)

WARNING: Predicate not present in compiled Datalog program (Dead Code)

reg_map(RegIn:input_reg, Reg:register)

instruction_immediate_offset(EA:address, Index:operand_index, Offset:unsigned, Size:unsigned)

This predicate determines the Offset and Size of an immediate operand of index Index. This is used to place symbolic expressions at the right address. The Offset is non-zero only for the x86 ISA; for other ISAs, the symbolic expressions point to the beginning of the instruction.

instruction_displacement_offset(EA:address, Index:operand_index, Offset:unsigned, Size:unsigned)

This predicate determines the Offset and Size of a displacement in an indirect operand of index Index. This is used to place symbolic expressions at the right address. The Offset is non-zero only for the x86 ISA; for other ISAs, the symbolic expressions point to the beginning of the instruction.
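
To make the role of Offset and Size concrete, here is a small Python sketch using the x86 encoding of `mov eax, 0x12345678` (bytes B8 78 56 34 12), whose immediate starts 1 byte into the instruction and is 4 bytes long. The helper names and the instruction address are hypothetical.

```python
def symbolic_expr_address(ea, offset):
    """Address at which a symbolic expression for the operand would be placed."""
    return ea + offset

def read_field(code, offset, size):
    """Read the little-endian field that Offset and Size describe."""
    return int.from_bytes(code[offset:offset + size], "little")

code = bytes.fromhex("b878563412")  # mov eax, 0x12345678
ea = 0x401000                       # hypothetical instruction address
where = symbolic_expr_address(ea, 1)
value = read_field(code, 1, 4)
```

With an Offset of 0, as on the other ISAs, `symbolic_expr_address` simply returns the instruction address.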

instruction_get_operation(ea:address, operation:symbol)

WARNING: Predicate not present in compiled Datalog program (Dead Code)

instruction_get_op(ea:address, index:operand_index, operator:operand_code)

instruction_get_src_op(EA:address, Index:operand_index, Op:operand_code)

Source operands

instruction_get_dest_op(EA:address, Index:operand_index, Op:operand_code)

Destination operands

next(n:address, m:address)

pc_relative_operand(EA:address, Index:operand_index, Dest:address)

EA has a PC-relative operand at Index, which is computed and stored in Dest. NOTE: Currently, we define pc_relative_operand only for X64.
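
On x86-64, a RIP-relative operand is resolved against the address of the next instruction, so the destination is the instruction address plus its size plus the displacement. A minimal Python sketch follows; the function name and the sample address are hypothetical.

```python
def pc_relative_dest(ea, size, displacement):
    """Destination of a RIP-relative operand: next instruction + displacement."""
    return ea + size + displacement

# lea rax, [rip+0x200] is 7 bytes long (48 8D 05 00 02 00 00).
dest = pc_relative_dest(0x1000, 7, 0x200)
```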

split_load_operand(src:address, index:operand_index, dest:address)

instruction_has_loop_prefix(EA:address)

instruction_has_relocation(EA:address, Rel:address)

The instruction at address “EA” has a relocation at address “Rel”.

unconditional_jump(n:address)

conditional_jump(src:address)

direct_jump(src:address, dest:address)

This predicate represents a direct jump from address src to destination dest. It captures only direct jumps whose destination is known. For example, a direct jump that depends on a relocation will not produce a direct_jump term.
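
The filtering described above can be sketched in Python as follows. The input shapes (decoded jumps as (src, dest) pairs with dest set to None when unknown, and a set of instruction addresses that carry a relocation) are assumptions made for this example only.

```python
def direct_jump_facts(decoded_jumps, relocated_eas):
    """Keep only jumps with a known destination that do not depend on a relocation."""
    facts = set()
    for src, dest in decoded_jumps:
        if dest is None or src in relocated_eas:
            continue  # destination unknown or relocation-dependent
        facts.add((src, dest))
    return facts

jumps = [(0x1000, 0x1040), (0x1010, 0x0), (0x1020, None)]
facts = direct_jump_facts(jumps, relocated_eas={0x1010})
```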

impossible_jump_target(EA:address)

pc_relative_jump(Src:address, DataPointer:address)

This predicate represents an indirect jump from address ‘Src’ to a destination contained in the data pointer located at address ‘DataPointer’. The location of the pointer can be easily inferred because it only depends on the program counter.

reg_jump(Src:address, Reg:register)

The instruction at address ‘Src’ has a jump using register ‘Reg’. The destination of the jump will be the value of the register.

indirect_jump(Src:address)

The instruction at address ‘Src’ has an indirect jump, i.e., a jump that reads its destination from memory.
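
As a rough illustration of how the jump predicates relate, the following Python sketch labels a jump by the kind of its target operand. The operand dictionary layout is hypothetical, and treating every pc_relative_jump as also being an indirect_jump is an assumption based on the descriptions above, not on the Datalog rules.

```python
def classify_jump(operand):
    """Return the set of jump predicates a jump operand would fall under."""
    kind = operand["kind"]
    if kind == "immediate":
        return {"direct_jump"}
    if kind == "register":
        return {"reg_jump"}
    if kind == "memory":
        labels = {"indirect_jump"}
        # The pointer location depends only on the program counter.
        if operand.get("base") == "RIP" and operand.get("index") is None:
            labels.add("pc_relative_jump")
        return labels
    return set()

labels = classify_jump({"kind": "memory", "base": "RIP", "index": None})
```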

direct_call(EA:address, Dest:address)

This predicate represents a direct call from address ‘EA’ to destination Dest. It captures only direct calls whose destination is known. E.g. a direct call that depends on a relocation will not produce a direct_call term.

pc_relative_call(Src:address, DataPointer:address)

This predicate represents an indirect call from address ‘Src’ to a destination contained in the data pointer located at address ‘DataPointer’. The location of the pointer can be easily inferred because it only depends on the program counter.

reg_call(Src:address, Reg:register)

The instruction at address ‘Src’ has a call using register ‘Reg’. The destination of the call will be the value of the register.

indirect_call(Src:address)

The instruction at address ‘Src’ has an indirect call, i.e., a call that reads its destination from memory.

pc_load_call(Src:address, Dest:address)

Identifies edge-case direct calls that are used to load the program counter rather than to transfer control (e.g. call-to-pop sequences).
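
One way to recognize a call-to-pop sequence is sketched below in Python. It assumes the pattern is a direct call whose destination is the immediately following instruction and that this instruction is a pop. The actual rule may check more or different conditions, and the data layout here is hypothetical.

```python
def pc_load_calls(direct_calls, instructions):
    """direct_calls: (EA, Dest) pairs; instructions: EA -> (size, opcode)."""
    result = set()
    for ea, dest in direct_calls:
        size, _opcode = instructions[ea]
        falls_through_to_dest = dest == ea + size
        dest_is_pop = instructions.get(dest, (0, ""))[1] == "POP"
        if falls_through_to_dest and dest_is_pop:
            result.add((ea, dest))
    return result

instructions = {
    0x1000: (5, "CALL"),  # call 0x1005
    0x1005: (1, "POP"),   # pop into a register: loads the return address
    0x1006: (5, "CALL"),  # ordinary call
    0x2000: (1, "RET"),
}
found = pc_load_calls([(0x1000, 0x1005), (0x1006, 0x2000)], instructions)
```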

halt(EA:address)

alignment_from_address(EA:address, AlignInBits:unsigned)

Find alignment depending on EA

WARNING: Predicate not present in compiled Datalog program (Dead Code)

alignment_candidate(EA:address, AlignInBits:unsigned)

Auxiliary predicate that builds initial alignments from alignment_required: the max alignment is picked for an EA later.

alignment(EA:address, AlignInBits:unsigned)

Information about alignment in bits for a given address
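
The relationship between alignment_candidate and alignment, where the maximum candidate is picked per address, can be sketched in Python as follows. The function name and the sample values are hypothetical.

```python
def pick_alignment(candidates):
    """Choose the largest candidate alignment for each EA."""
    best = {}
    for ea, align in candidates:
        best[ea] = max(align, best.get(ea, 0))
    return best

# Hypothetical candidates in the shape (EA, AlignInBits).
chosen = pick_alignment([(0x4000, 4), (0x4000, 16), (0x4010, 8)])
```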

op_indirect_contains_reg(Op:operand_code, Reg:register)

op_indirect_mapped(Op:operand_code, Reg1:reg_nullable, Reg2:reg_nullable, Reg3:reg_nullable, Mult:number, Offset:number, Size:unsigned)

op_regdirect_contains_reg(Op:operand_code, Reg:register)

op_immediate_and_reg(EA:address, Operation:symbol, Reg:register, Imm_index:operand_index, Immediate:number)

cmp_immediate_to_reg(EA:address, Reg:register, Imm_index:operand_index, Immediate:number)

symbol_set(ea:address, size:unsigned, type:symbol, scope:symbol, visibility:symbol, sectionIndex:unsigned, name:symbol)

ambiguous_symbol(name:symbol)

function_symbol(ea:address, name:symbol)

defined_symbol(ea:address, size:unsigned, type:symbol, scope:symbol, visibility:symbol, sectionIndex:unsigned, originTable:symbol, tableIndex:unsigned, name:symbol)

relocation_active_symbol_table(Name:symbol)

The name of the symbol table to use when looking up symbols from relocations.

Although it’d be better to check what symbol table is referenced by the relocation section’s sh_link attribute, LIEF (as of 0.13.0) does not expose this metadata in the LIEF::ELF::Relocation object. LIEF actually uses a similar strategy to this, using dynsym if it exists, and otherwise symtab - rather than using the sh_link metadata.

This strategy will fail for binaries built with --emit-relocs, since they can have relocations referencing both symbol tables.
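
The selection strategy described above (dynsym if it exists, otherwise symtab) amounts to the following Python sketch. The function name and the use of the ELF section names `.dynsym` and `.symtab` as table identifiers are assumptions for illustration.

```python
def active_symbol_table(available_tables):
    """Prefer the dynamic symbol table; fall back to the static one."""
    if ".dynsym" in available_tables:
        return ".dynsym"
    if ".symtab" in available_tables:
        return ".symtab"
    return None  # no symbol table available

choice = active_symbol_table({".symtab", ".dynsym"})
```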

loaded_section(Beg:address, End:address, Name:symbol)

data_section(name:symbol)

exception_section(name:symbol)

special_data_section(name:symbol)

regular_data_section(name:symbol)

code_section(name:symbol)

tls_section(name:symbol)

bss_section(name:symbol)

non_zero_data_section(name:symbol)

bss_section_limits(Begin:address, End:address)

initialized_data_segment(Begin:address, End:address)

data_segment(Begin:address, End:address)

plt_block(block:address, function:symbol)

The basic block ‘Block’ implements a PLT thunk that refers to function ‘Function’.

got_reference(Got_entry:address, Symbol:symbol)

main_function(ea:address)

conditional_return(EA:address)

unconditional_return(EA:address)

no_return_call(EA:address)

Detects non-returning calls before the must/may fallthrough relations are computed.

It is calculated even before code inference.

function_pointer_section(Name:symbol)

no_return_function(Name:symbol)

is_padding(EA:address)

printable_char(N:unsigned)

The set of printable ASCII characters.

WARNING: Predicate not present in compiled Datalog program (Dead Code)

align_addr(AddrAligned:address, AddrOrig:address)

Aligns an address to a 4-byte boundary.

WARNING: Predicate not present in compiled Datalog program (Dead Code)
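
For reference, aligning to a 4-byte boundary is usually done by clearing the two lowest bits. The Python sketch below assumes the address is rounded down; the direction of rounding is an assumption, since the description does not state it.

```python
def align_addr(addr_orig):
    """Round addr_orig down to the nearest multiple of 4 (assumed direction)."""
    return addr_orig & ~0x3

aligned = align_addr(0x1003)
```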

abi_intrinsic(EA:address, Name:symbol)