3.3 Style Features

The Style Features component extracts a set of features from a software object into a feature vector. We use the Code Stylometry Feature Set (CSFS) described in “De-anonymizing Programmers via Code Stylometry” (https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-caliskan-islam.pdf).

Extracted feature vectors can be used as fitness vectors with the lexicase evolution strategy. One application is to drive evolution towards solutions which better match the features of the surrounding source code.
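As a rough illustration of how a fitness vector can drive lexicase selection, the Python sketch below picks one parent from a population of fitness vectors. It is not the SEL implementation: the function name lexicase_select is hypothetical, and the sketch assumes that each vector entry is a per-feature distance from the target style, so lower values are better.

```python
import random
from typing import List, Optional, Sequence


def lexicase_select(population: Sequence[Sequence[float]],
                    rng: Optional[random.Random] = None) -> int:
    """Return the index of one individual chosen by lexicase selection.

    Assumption: population[i][k] is a distance for feature k, so the
    smallest value is the best one for that feature.
    """
    if not population:
        raise ValueError("population must not be empty")
    rng = rng or random.Random()

    candidates: List[int] = list(range(len(population)))
    cases: List[int] = list(range(len(population[0])))
    rng.shuffle(cases)  # consider the features in a random order

    for case in cases:
        # Keep only the candidates that are best on this feature.
        best = min(population[i][case] for i in candidates)
        candidates = [i for i in candidates if population[i][case] == best]
        if len(candidates) == 1:
            break

    # Break any remaining tie at random.
    return rng.choice(candidates)


# Three individuals, two features each (hypothetical values).
example_population = [[0.0, 5.0], [1.0, 1.0], [0.0, 2.0]]
selected = lexicase_select(example_population, random.Random(0))
```

Because the features are filtered one at a time in random order, an individual that is best on some feature can survive even if it is poor on others. In the example, individual 0 is never selected: it ties on the first feature but loses to individual 2 on the second.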

3.3.1 API Support for Style Features

API support for style features is documented in the entries for classes sel/sw/styleable:style-feature, sel/sw/styleable:styleable, and sel/sw/styleable:style-project. We provide a brief overview here.

To extract the set of feature vectors from a software object, use extract-features, providing a software object and a list of feature extractor functions.

Each feature extractor function is expected to operate on a clang object and return a feature vector containing the values for that feature. The function extract-features returns one large feature vector, formed by concatenating these vectors in the order the extractors were given.
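The concatenation behavior can be sketched in a few lines of Python. This is a language-neutral illustration, not the SEL API: the two extractors here are hypothetical, and a plain string of source text stands in for a clang software object.

```python
from typing import Callable, List, Sequence

# Assumption: a "software object" is modeled as a string of source text.
FeatureExtractor = Callable[[str], List[float]]


def line_count_extractor(source: str) -> List[float]:
    """Hypothetical extractor: a one-element vector holding the line count."""
    return [float(len(source.splitlines()))]


def brace_count_extractor(source: str) -> List[float]:
    """Hypothetical extractor: counts of opening and closing braces."""
    return [float(source.count("{")), float(source.count("}"))]


def extract_features(source: str,
                     extractors: Sequence[FeatureExtractor]) -> List[float]:
    """Run each extractor and concatenate the resulting vectors in order."""
    vector: List[float] = []
    for extractor in extractors:
        vector.extend(extractor(source))
    return vector


source = "int main() {\n  return 0;\n}\n"
features = extract_features(source,
                            [line_count_extractor, brace_count_extractor])
# features == [3.0, 1.0, 1.0]
```

The position of each value in the result depends only on the order of the extractor list, so the same list must be used for every software object whose vectors are to be compared.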

The SEL API provides several AST-related feature extractors for clang software objects, i.e., features derived from properties of a clang AST. The available feature extractors are:

By convention, feature extractor functions have names ending in “-extractor”.

