pointer_reattribution
Compilers sometimes generate expressions of the form symbol+constant. It can happen that such an expression falls:
1. in the middle of a pointer (or the middle of a symbol if we know its size),
2. in the middle of an instruction (only for programs with overlapping instructions), or
3. outside the data sections, or on a data section that is different from the one of the symbol.
We want to detect those cases and generate the adequate symbol+constant.
We generate two predicates:
- moved_data_label for pointers in data sections
- moved_label for pointers in code sections
We only ‘move’ pointers in data sections if their destination falls in the middle of another pointer, symbol or instruction (cases 1 and 2).
In code sections we consider the three possibilities.
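The situations listed above can be sketched as a small classifier. This is a minimal illustration in Python, not the analysis's actual rules. It numbers the situations as (1) middle of a pointer or sized symbol, (2) middle of an instruction, and (3) outside the data sections or in a data section other than the symbol's. The records `Obj` and `Section` and the functions `section_of` and `classify_dest` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Obj:
    """Hypothetical record: a pointer, sized symbol, or instruction at [start, start+size)."""
    kind: str  # "pointer", "symbol" or "instruction"
    start: int
    size: int


@dataclass(frozen=True)
class Section:
    """Hypothetical record: a data section occupying [beg, end)."""
    name: str
    beg: int
    end: int


def section_of(addr: int, sections: List[Section]) -> Optional[Section]:
    """Return the data section containing addr, or None if there is none."""
    for sec in sections:
        if sec.beg <= addr < sec.end:
            return sec
    return None


def classify_dest(dest: int, sym_addr: int, objs: List[Obj],
                  sections: List[Section]) -> Optional[int]:
    """Return 1, 2 or 3 for the situation `dest` falls in, or None if it is fine."""
    for obj in objs:
        # Strictly inside the object: pointing at its first byte is not a problem.
        if obj.start < dest < obj.start + obj.size:
            return 2 if obj.kind == "instruction" else 1
    dest_sec = section_of(dest, sections)
    # Assumption: the symbol's section is looked up from its own address.
    if dest_sec is None or dest_sec != section_of(sym_addr, sections):
        return 3
    return None
```

A caller would then act on situations 1 and 2 for pointers in data sections, and on all of them for operands in code sections, as described above.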
In addition, we distinguish three types:
P) the pointer is a pc-relative operand (and does not access memory, e.g. LEA). Pc-relative operands should always be symbolic; we just need to find the best candidate.
D) the pointer appears as a displacement in an indirect operand. Indirect operands are known to be used to access memory, which makes them more likely to be symbolic: they cannot be a float, for example, but they could still be a constant. We make the displacement symbolic if we can “prove” that the registers used cannot contain a base address (so the displacement should contain a base address).
I) the pointer appears as an immediate operand. Immediate operands are likely to be symbolic if they are used to compute an address or are compared to an address. We specifically detect cases where immediates are used to initialize loop counters or as loop bounds.
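The decision for the three types can be summarised in a few lines. The sketch below is only an illustration of the reasoning above. `OperandKind`, `treat_as_symbolic`, and the two boolean inputs are hypothetical. In the real analysis those facts would come from register and data-access analyses, which are not modelled here.

```python
from enum import Enum


class OperandKind(Enum):
    PC_RELATIVE = "P"
    DISPLACEMENT = "D"
    IMMEDIATE = "I"


def treat_as_symbolic(kind: OperandKind, *,
                      regs_may_hold_base: bool = False,
                      used_as_address: bool = False) -> bool:
    """Decide whether an operand value should become symbol+constant.

    regs_may_hold_base: (assumed input) some register of the indirect operand
        could itself contain a base address.
    used_as_address: (assumed input) the immediate is used to compute an
        address, is compared to one, or initializes/bounds a loop over data.
    """
    if kind is OperandKind.PC_RELATIVE:
        # P: always symbolic; only the best target symbol has to be chosen.
        return True
    if kind is OperandKind.DISPLACEMENT:
        # D: symbolic only if no register can supply the base address.
        return not regs_may_hold_base
    # I: symbolic only with evidence that the value is used as an address.
    return used_as_address
```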
- moved_data_label(EA:address, Size:unsigned, Dest:address, NewDest:address)
A symbolic expression at address ‘EA’ pointing to ‘Dest’ should use a symbol pointing to ‘NewDest’ plus an offset. The offset is Dest-NewDest, so that the symbol plus the offset still evaluates to ‘Dest’.
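As a worked example of the reattribution itself, the following sketch builds the symbol+constant pair for a moved label. It takes the offset to be the signed value that makes the symbol's address plus the offset equal ‘Dest’. The functions `reattribute` and `render` and the `symbol_at` mapping are hypothetical and exist only for this illustration.

```python
from typing import Dict, Tuple


def reattribute(dest: int, new_dest: int,
                symbol_at: Dict[int, str]) -> Tuple[str, int]:
    """Return (symbol, offset) with address(symbol) + offset == dest."""
    symbol = symbol_at[new_dest]   # symbol located at NewDest
    offset = dest - new_dest       # signed distance from NewDest back to Dest
    return symbol, offset


def render(symbol: str, offset: int) -> str:
    """Format the pair as it would appear in assembly, e.g. 'table+4'."""
    if offset == 0:
        return symbol
    sign = "+" if offset > 0 else "-"
    return f"{symbol}{sign}{abs(offset)}"


# Hypothetical symbols: an 8-byte pointer at 0x1010 and a marker at 0x2010.
symbols = {0x1010: "table", 0x2010: "array_end"}
print(render(*reattribute(0x1014, 0x1010, symbols)))  # table+4
print(render(*reattribute(0x200C, 0x2010, symbols)))  # array_end-4
```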
Uses:
address_in_data_refined_range.in_ea
,base_address
,cinf_ldr_add_pc
,code_in_block
,function_symbol
,instruction_displacement_offset
,overlapping_instruction
,symbol
Used by:
data_object_boundary
Recursive:
data_object_conflict
,labeled_data_candidate
,boundary_sym_expr
,symbolic_expr_symbol_minus_symbol
,data_object
,resolved_transfer
,function_inference.function_entry
,string_candidate
,moved_data_label
,labeled_ea
,value_reg_address_before
,code_pointer_in_data
,moved_label_candidate
,symbolic_operand_point
,discarded_jump_table_entry
,data_object_total_points
,inferred_special_symbol
,data_limit
,next_data_limit
,code_in_refined_block
,data_access_limit
,symbol_minus_symbol_from_relocation
,+disconnected3
,moved_displacement_candidate
,best_func_symbol
,main_function
,data_object_point
,data_limit_after_access
,symbolic_data
,string_candidate_refined
,preferred_data_access
,address_array
,symbolic_expr
,split_block
,refined_block
,moved_label
,moved_pc_relative_candidate
,+disconnected6
,code_in_split_block
,+disconnected2
,after_address_in_data
,address_array_aux
,next_address_in_data
,best_symexpr_symbol
,data_object_candidate
,symbolic_expr_attribute
,+disconnected1
,symbolic_operand_attribute
,symbolic_operand
,got_reference
,block_needs_splitting_at
,jump_table
,label_conflict
,symbol_minus_symbol
,block_needs_merging
,relative_jump_table_entry
,base_relative_symbolic_operand
,symbol_minus_symbol_candidate
,discarded_data_object
,symbol_score
,inferred_main_function
- moved_label(EA:address, Index:operand_index, Dest:address, NewDest:address)
A symbolic operand at address ‘EA’ with index ‘Index’ and pointing to ‘Dest’ should use a symbol pointing to ‘NewDest’ plus an offset. The offset is Dest-NewDest, so that the symbol plus the offset still evaluates to ‘Dest’.
Used by:
bad_symbol_constant
,false_negative
,false_positive
Recursive:
data_object_conflict
,labeled_data_candidate
,boundary_sym_expr
,symbolic_expr_symbol_minus_symbol
,data_object
,resolved_transfer
,function_inference.function_entry
,string_candidate
,moved_data_label
,labeled_ea
,value_reg_address_before
,code_pointer_in_data
,moved_label_candidate
,symbolic_operand_point
,discarded_jump_table_entry
,data_object_total_points
,inferred_special_symbol
,data_limit
,next_data_limit
,code_in_refined_block
,data_access_limit
,symbol_minus_symbol_from_relocation
,+disconnected3
,moved_displacement_candidate
,best_func_symbol
,main_function
,data_object_point
,data_limit_after_access
,symbolic_data
,string_candidate_refined
,preferred_data_access
,address_array
,symbolic_expr
,split_block
,refined_block
,moved_label
,moved_pc_relative_candidate
,+disconnected6
,code_in_split_block
,+disconnected2
,after_address_in_data
,address_array_aux
,next_address_in_data
,best_symexpr_symbol
,data_object_candidate
,symbolic_expr_attribute
,+disconnected1
,symbolic_operand_attribute
,symbolic_operand
,got_reference
,block_needs_splitting_at
,jump_table
,label_conflict
,symbol_minus_symbol
,block_needs_merging
,relative_jump_table_entry
,base_relative_symbolic_operand
,symbol_minus_symbol_candidate
,discarded_data_object
,symbol_score
,inferred_main_function
- boundary_sym_expr(EA:address, Dest:address)
The symbolic expression at address ‘EA’ pointing to ‘Dest’ should point to an ‘at-end’ symbol.
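To illustrate why an ‘at-end’ symbol matters, consider an address that is both the exclusive end of one section and the start of the next. The sketch below only shows the address test. The conditions under which the real predicate fires are not spelled out in this description, and `Section` and `at_end_section` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Section(NamedTuple):
    name: str
    beg: int
    end: int  # exclusive


def at_end_section(dest: int, sections: List[Section]) -> Optional[Section]:
    """Return the section whose exclusive end equals dest, if any."""
    for sec in sections:
        if dest == sec.end:
            return sec
    return None


# Two adjacent hypothetical sections: 0x1100 ends A and begins B.
layout = [Section("A", 0x1000, 0x1100), Section("B", 0x1100, 0x1200)]
owner = at_end_section(0x1100, layout)
print(owner.name if owner else None)  # A
```

With an at-end symbol, a loop bound such as `A + sizeof(A)` stays tied to the end of A even if B is later moved or resized.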
Uses:
aligned_address_in_data
,arch.move_reg_imm
,binary_type
,block_boundaries
,cmp_immediate_to_reg
,cmp_reg_to_reg
,dest_enlarged_data_section
,instruction_displacement_offset
,instruction_immediate_offset
,loaded_section
,lsda_callsite
,lsda_symbol_minus_symbol
,pc_relative_operand
,reg_def_use.def_used
,special_data_section
,value_reg
Used by:
inferred_symbol
Recursive:
data_object_conflict
,labeled_data_candidate
,boundary_sym_expr
,symbolic_expr_symbol_minus_symbol
,data_object
,resolved_transfer
,function_inference.function_entry
,string_candidate
,moved_data_label
,labeled_ea
,value_reg_address_before
,code_pointer_in_data
,moved_label_candidate
,symbolic_operand_point
,discarded_jump_table_entry
,data_object_total_points
,inferred_special_symbol
,data_limit
,next_data_limit
,code_in_refined_block
,data_access_limit
,symbol_minus_symbol_from_relocation
,+disconnected3
,moved_displacement_candidate
,best_func_symbol
,main_function
,data_object_point
,data_limit_after_access
,symbolic_data
,string_candidate_refined
,preferred_data_access
,address_array
,symbolic_expr
,split_block
,refined_block
,moved_label
,moved_pc_relative_candidate
,+disconnected6
,code_in_split_block
,+disconnected2
,after_address_in_data
,address_array_aux
,next_address_in_data
,best_symexpr_symbol
,data_object_candidate
,symbolic_expr_attribute
,+disconnected1
,symbolic_operand_attribute
,symbolic_operand
,got_reference
,block_needs_splitting_at
,jump_table
,label_conflict
,symbol_minus_symbol
,block_needs_merging
,relative_jump_table_entry
,base_relative_symbolic_operand
,symbol_minus_symbol_candidate
,discarded_data_object
,symbol_score
,inferred_main_function
- moved_label_class(EA:address, Index:operand_index, Reason:symbol)
Uses:
addr_outside_section_used_for_memory_access
,address_in_data
,address_in_data_refined_range.in_ea
,arch.cmp_operation
,arch.frame_pointer
,arch.jump
,arch.move_reg_imm
,base_address
,base_relative_symbolic_operand
,best_value_reg
,binary_format
,binary_type
,bss_section_limits
,cmp_immediate_to_reg
,cmp_reg_to_reg
,code_in_block
,data_access
,data_access_pattern_candidate
,defined_symbol
,dest_enlarged_data_section
,first_synchronous_access
,function_symbol
,got_relative_operand
,indirect_jump
,instruction
,instruction_displacement_offset
,instruction_get_op
,loaded_section
,moved_label_candidate
,moved_pc_relative_candidate
,next
,op_indirect
,op_indirect_mapped
,overlapping_instruction
,pc_relative_operand
,reg_def_use.def_used
,regular_data_section
,split_load_operand
,stack_def_use.def_used
,symbol_minus_symbol
,symbolic_data
,symbolic_operand
,value_reg_at_operand
- cmp_reg_to_reg(EA:address, Reg1:register, Reg2:register)
Instruction at address ‘EA’ compares registers ‘Reg1’ and ‘Reg2’.
Uses:
arch.cmp_operation
,instruction
,instruction_get_op
,op_regdirect_contains_reg
Used by:
boundary_sym_expr
,moved_immediate_candidate
,moved_label_class
,moved_pc_relative_candidate
Recursive:
const_value_reg_used
,function_inference.function_entry_initial
,no_return_call_propagated
,next_end
,may_fallthrough
,jump_table_candidate
,block_candidate_boundaries
,unresolved_interval_order
,inferred_main_in_reg
,reg_def_use.return_val_used
,inter_procedural_edge
,compare_and_jump_immediate
,block_limit
,value_reg_edge
,block_next
,reg_has_base_image
,stack_def_use.block_last_def
,possible_target_from
,no_return_block
,indefinite_litpool_ref
,stack_def_use.used_in_block
,adjusts_stack_in_block
,split_load_total_points
,jump_table_candidate_refined
,overlap_with_litpool
,value_reg_limit
,code_in_block_candidate
,candidate_block_is_not_padding
,likely_fallthrough
,arch.reg_relative_load
,arm_jump_table_cmp_limit
,value_reg_unsupported
,nop_in_padding_candidate
,arm_jump_table_block_start
,__agg_subclause2
,transition_block_limit
,block_overlap
,no_return_call
,data_segment
,jump_table_element_access
,reg_def_use.flow_def
,arm_jump_table_skip_first_entry
,block_total_points
,stack_def_use.defined_in_block
,reg_def_use.ambiguous_last_def_in_block
,straight_line_def_used
,basic_target
,symbol_minus_symbol_litpool_access_pattern
,value_reg
,reg_def_use.live_var_used
,compare_and_jump_indirect_op_valid
,litpool_ref
,padding_block_candidate
,self_contained_segment
,data_in_code_propagate
,unresolved_interval
,unlikely_have_symbolic_immediate
,symbolic_expr_from_relocation
,hi_load_prop
,block_points
,wis_memo
,split_load_for_symbolization
,arm_jump_table_data_block_limit
,adrp_used
,known_block
,wis_has_prior
,candidate_block_is_padding
,no_return_call_refined
,data_in_code
,reg_def_use.used
,is_padding
,reg_def_use.ref_in_block
,base_relative_operand
,def_used_for_address
,reg_def_use.live_var_at_prior_used
,block_heuristic
,relative_address_start
,tls_get_addr
,__agg_subclause6
,instruction_memory_access_size
,start_function
,arm_jump_table_candidate
,first_block_in_byte_interval
,stack_base_reg_move
,__agg_subclause3
,jump_table_target
,base_relative_operation
,resolved_reaches
,next_start
,reg_def_use.live_var_def
,must_fallthrough
,next_type
,arm_jump_table_candidate_start
,next_block_in_byte_interval
,__agg_single3
,split_load
,indexed_pc_relative_load_relative
,block_instruction_next
,stack_def_use.live_var_used_in_block
,cmp_defines
,block
,reg_def_use.block_last_def
,overlapping_instruction
,code_in_block
,no_value_reg_limit
,jump_table_signed
,impossible_block
,block_boundaries
,wis_prior
,composite_data_access
,relocation_adjustment_total
,discarded_block
,stack_def_use.last_def_in_block
,indexed_pc_relative_load
,__agg_single6
,stack_def_use.live_var_at_block_end
,after_end
,split_load_candidate
,incomplete_block
,straight_line_last_def
,jump_table_max
,block_points_proportional
,wis_schedule
,arm_jump_table_data_block
,compare_and_jump_register
,block_candidate_dependency_edge
,reg_has_got
,data_block_candidate
,got_relative_operand
,contains_plausible_instr_seq
,reg_def_use.ambiguous_block_last_def
,invalid_jump_table_candidate
,possible_target
,relative_address
,cmp_reg_to_reg
,reg_def_use.def_used
,cinf_ldr_add_pc
,block_implies_block
,contains_implausible_instr_seq
,split_load_point
,call_tls_get_addr
,unresolved_block
,split_load_operand
,inferred_main_dispatch
,gp_relative_operand
,code_in_block_candidate_refined
,reg_reg_arithmetic_operation_defs
,litpool_symbolic_operand
,data_access
,init_symbol_minus_symbol_candidate_arm
,plt_block
,invalid
,litpool_boundaries
,segment_target_range
,correlated_live_reg
,arm_jump_table_block_instruction
,simple_data_access_pattern
,reg_def_use.live_var_at_block_end
,reg_def_use.defined_in_block
,block_last_instruction
,flags_and_jump_pair
,split_load_conflict
,data_block_limit
,last_value_reg_limit
,reg_used_for
,initialized_data_segment
,__agg_subclause7
,stack_def_use.live_var_at_prior_used
,jump_table_start
,unresolved_block_overlap
,arch.extend_load
,stack_def_use.def_used
,init_ldr_add_pc
,litpool_confidence
,tls_desc_call
,common_tail
,stack_def_use.ref_in_block
,arch.simple_data_load
,reg_def_use.used_in_block
,base_relative_jump
,branch_to_calculated_pc_rel_addr
,stack_def_use.live_var_def
,wis_schedule_iter
,jump_table_prelude
,negative_block_heuristic
,relocation_adjustment
,compare_and_jump_indirect
,reg_def_use.last_def_in_block
,relative_jump_table_entry_candidate
,discarded_split_load
,__agg_single2
,padding_block_limit
,reg_def_use.return_block_end
,plt_entry
,stack_def_use.live_var_used
- dest_enlarged_data_section(EA:address, Reg:register, NewDest:address, Beg:address, End:address, OldBeg:address, OldEnd:address)
Auxiliary predicate to compute moved_label. This predicate detects that the register ‘Reg’ at address ‘EA’ is a loop counter iterating over data in a section at address [OldBeg,OldEnd). Based on the multiplier of the loop, we compute an extended area [Beg,End). If we find pointers to that extended area related to the same loop, we will move them to ‘NewDest’.
- addr_outside_section_used_for_memory_access(EA:address, Reg:register, Addr:address, AddrAccessed:address)
Auxiliary predicate to compute moved_label. This predicate detects an address loaded into a register that falls outside a data section, but it is ultimately used to access the data section. This is typically the case for the initialization of loop counters when these are pre-incremented. The address ‘EA’ is where address ‘Addr’ is loaded into the register ‘Reg’. Then that register is used to access ‘AddrAccessed’ at a later point (‘EA_access’).
E.g.
    mov RAX, Addr   // EA_from
loop:
    sub RAX, 4
    mov RBX, [RAX]  // EA_access accesses AddrAccessed = Addr - 4
    …
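The pattern detected by addr_outside_section_used_for_memory_access can be reproduced with a tiny simulation of such a loop. The section bounds, the step of 4, and the iteration count are made-up values for illustration, and `accessed_addresses` is a hypothetical helper.

```python
from typing import List


def accessed_addresses(addr: int, step: int, iterations: int) -> List[int]:
    """Addresses read by a loop that subtracts `step` before each load."""
    reg = addr                 # mov RAX, Addr
    reads = []
    for _ in range(iterations):
        reg -= step            # sub RAX, 4
        reads.append(reg)      # mov RBX, [RAX]
    return reads


# Hypothetical data section [SEC_BEG, SEC_END) holding four 4-byte elements.
SEC_BEG, SEC_END = 0x2000, 0x2010
addr = SEC_END                 # one past the last byte: outside the section
reads = accessed_addresses(addr, 4, 4)

print(SEC_BEG <= addr < SEC_END)                    # False
print(all(SEC_BEG <= a < SEC_END for a in reads))   # True
print(hex(reads[0]))                                # 0x200c
```

The loaded value lies outside the section while every access lands inside it, which is why the operand should be expressed relative to a symbol in the accessed section (here the first access is at Addr - 4).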
- moved_pc_relative_candidate(EA:address, Index:operand_index, Val:address, NewVal:address, Distance:unsigned)
A moved_label candidate for an instruction that has a pc-relative memory computation.
Uses:
addr_outside_section_used_for_memory_access
,binary_format
,cie_entry
,cmp_reg_to_reg
,code_in_block
,dest_enlarged_data_section
,exception_section
,fde_entry
,instruction
,instruction_get_dest_op
,loaded_section
,op_regdirect_contains_reg
,pc_relative_operand
,reg_def_use.def_used
,regular_data_section
Used by:
moved_label_class
Recursive:
data_object_conflict
,labeled_data_candidate
,boundary_sym_expr
,symbolic_expr_symbol_minus_symbol
,data_object
,resolved_transfer
,function_inference.function_entry
,string_candidate
,moved_data_label
,labeled_ea
,value_reg_address_before
,code_pointer_in_data
,moved_label_candidate
,symbolic_operand_point
,discarded_jump_table_entry
,data_object_total_points
,inferred_special_symbol
,data_limit
,next_data_limit
,code_in_refined_block
,data_access_limit
,symbol_minus_symbol_from_relocation
,+disconnected3
,moved_displacement_candidate
,best_func_symbol
,main_function
,data_object_point
,data_limit_after_access
,symbolic_data
,string_candidate_refined
,preferred_data_access
,address_array
,symbolic_expr
,split_block
,refined_block
,moved_label
,moved_pc_relative_candidate
,+disconnected6
,code_in_split_block
,+disconnected2
,after_address_in_data
,address_array_aux
,next_address_in_data
,best_symexpr_symbol
,data_object_candidate
,symbolic_expr_attribute
,+disconnected1
,symbolic_operand_attribute
,symbolic_operand
,got_reference
,block_needs_splitting_at
,jump_table
,label_conflict
,symbol_minus_symbol
,block_needs_merging
,relative_jump_table_entry
,base_relative_symbolic_operand
,symbol_minus_symbol_candidate
,discarded_data_object
,symbol_score
,inferred_main_function
- moved_displacement_candidate(EA:address, Op_index:operand_index, Dest:address, NewDest:address, Distance:unsigned)
A moved_label candidate for an instruction that has an indirect access (non pc-relative) where the displacement should be symbolic.
Uses:
binary_format
,binary_type
,bss_section_limits
,data_access
,data_access_pattern_candidate
,instruction
,loaded_section
,pc_relative_operand
,regular_data_section
,split_load_operand
,value_reg_at_operand
Recursive:
data_object_conflict
,labeled_data_candidate
,boundary_sym_expr
,symbolic_expr_symbol_minus_symbol
,data_object
,resolved_transfer
,function_inference.function_entry
,string_candidate
,moved_data_label
,labeled_ea
,value_reg_address_before
,code_pointer_in_data
,moved_label_candidate
,symbolic_operand_point
,discarded_jump_table_entry
,data_object_total_points
,inferred_special_symbol
,data_limit
,next_data_limit
,code_in_refined_block
,data_access_limit
,symbol_minus_symbol_from_relocation
,+disconnected3
,moved_displacement_candidate
,best_func_symbol
,main_function
,data_object_point
,data_limit_after_access
,symbolic_data
,string_candidate_refined
,preferred_data_access
,address_array
,symbolic_expr
,split_block
,refined_block
,moved_label
,moved_pc_relative_candidate
,+disconnected6
,code_in_split_block
,+disconnected2
,after_address_in_data
,address_array_aux
,next_address_in_data
,best_symexpr_symbol
,data_object_candidate
,symbolic_expr_attribute
,+disconnected1
,symbolic_operand_attribute
,symbolic_operand
,got_reference
,block_needs_splitting_at
,jump_table
,label_conflict
,symbol_minus_symbol
,block_needs_merging
,relative_jump_table_entry
,base_relative_symbolic_operand
,symbol_minus_symbol_candidate
,discarded_data_object
,symbol_score
,inferred_main_function
- moved_immediate_candidate(EA:address, Op_index:operand_index, Immediate:address, New_immmediate:address, Distance:unsigned)
A moved_label candidate for an instruction that has an immediate.
- moved_label_candidate(EA:address, Op_index:operand_index, Dest:address, NewDest:address, Priority:unsigned)
Auxiliary predicate to decide which moved_label should be taken for a given address. This is decided based on the ‘Priority’. Lower numbers indicate higher priority.
Uses:
address_in_data
,address_in_data_refined_range.in_ea
,arch.cmp_operation
,arch.frame_pointer
,arch.jump
,base_address
,best_value_reg
,binary_format
,binary_type
,code_in_block
,defined_symbol
,first_synchronous_access
,function_symbol
,got_relative_operand
,indirect_jump
,instruction
,instruction_get_op
,moved_immediate_candidate
,next
,op_indirect
,op_indirect_mapped
,overlapping_instruction
,reg_def_use.def_used
,stack_def_use.def_used
Used by:
moved_label_class
Recursive:
data_object_conflict
,labeled_data_candidate
,boundary_sym_expr
,symbolic_expr_symbol_minus_symbol
,data_object
,resolved_transfer
,function_inference.function_entry
,string_candidate
,moved_data_label
,labeled_ea
,value_reg_address_before
,code_pointer_in_data
,moved_label_candidate
,symbolic_operand_point
,discarded_jump_table_entry
,data_object_total_points
,inferred_special_symbol
,data_limit
,next_data_limit
,code_in_refined_block
,data_access_limit
,symbol_minus_symbol_from_relocation
,+disconnected3
,moved_displacement_candidate
,best_func_symbol
,main_function
,data_object_point
,data_limit_after_access
,symbolic_data
,string_candidate_refined
,preferred_data_access
,address_array
,symbolic_expr
,split_block
,refined_block
,moved_label
,moved_pc_relative_candidate
,+disconnected6
,code_in_split_block
,+disconnected2
,after_address_in_data
,address_array_aux
,next_address_in_data
,best_symexpr_symbol
,data_object_candidate
,symbolic_expr_attribute
,+disconnected1
,symbolic_operand_attribute
,symbolic_operand
,got_reference
,block_needs_splitting_at
,jump_table
,label_conflict
,symbol_minus_symbol
,block_needs_merging
,relative_jump_table_entry
,base_relative_symbolic_operand
,symbol_minus_symbol_candidate
,discarded_data_object
,symbol_score
,inferred_main_function
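For moved_label_candidate, the rule that lower ‘Priority’ numbers win can be sketched as a selection over the candidates of each operand. The `Candidate` tuple mirrors the predicate's fields, and `pick_moved_labels` is a hypothetical helper. How ties are broken is not stated in the description, so this sketch simply keeps the first candidate seen, which is an assumption.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Candidate(NamedTuple):
    ea: int
    op_index: int
    dest: int
    new_dest: int
    priority: int


def pick_moved_labels(cands: Iterable[Candidate]) -> Dict[Tuple[int, int], Candidate]:
    """Keep, per (EA, Op_index), the candidate with the lowest priority number."""
    best: Dict[Tuple[int, int], Candidate] = {}
    for cand in cands:
        key = (cand.ea, cand.op_index)
        # Assumption: on a tie the earlier candidate is kept.
        if key not in best or cand.priority < best[key].priority:
            best[key] = cand
    return best


candidates = [
    Candidate(0x401000, 1, 0x2010, 0x2000, 3),
    Candidate(0x401000, 1, 0x2010, 0x200C, 1),  # lower number, higher priority
    Candidate(0x401008, 2, 0x3000, 0x2FF8, 2),
]
chosen = pick_moved_labels(candidates)
print(hex(chosen[(0x401000, 1)].new_dest))  # 0x200c
```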