Brainome Data Compiler

The Brainome Data Compiler is the world’s first data compiler for solving supervised machine learning problems.

In computer science, a code compiler (such as GCC) takes a program (or, more generally, a function) written in one language (C, C++, etc.) and re-implements the same function in another language (e.g., assembly).

[Diagram: a function implemented in C, C++, or another programming language passes through GCC (or another compiler), which emits the same function implemented in assembly language.]

In machine learning, instead of writing computer programs, we curate data sets that we believe contain a function (dog vs. cat, good credit risk vs. bad credit risk, etc.). The goal is to use machine learning techniques to identify the function that explains the data set and encapsulate this function within a model or predictor. Brainome does exactly this in a three-step process:

1. The Brainome Table Compiler (BTC) takes a labeled data set as input.

2. BTC measures the data set to identify and size the explanatory function.

3. BTC builds a Python predictor function based on the learnability measurements.

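To make the "measure, then build" idea concrete, here is a toy sketch in Python. This is not Brainome's actual algorithm; it only illustrates the principle of first measuring how large a model must be to explain a labeled data set (here, the number of entries a lookup table needs), then building a predictor of exactly that size.

```python
# Toy illustration of "measure, then build" (NOT Brainome's algorithm).

def table_size_needed(rows):
    """Upper bound on the entries a lookup-table model needs to
    reproduce the labels: one entry per distinct feature vector."""
    table = {}
    for *features, label in rows:
        table[tuple(features)] = label
    return len(table)

def build_predictor(rows, default_label=0):
    """Build a memorizing predictor sized by the measurement above."""
    table = {tuple(f): lbl for *f, lbl in rows}
    return lambda features: table.get(tuple(features), default_label)

# A tiny labeled data set: two feature columns, last column is the label.
data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # XOR

print(table_size_needed(data))   # 4 distinct feature vectors
predict = build_predictor(data)
print(predict([1, 0]))           # 1
```

The measurement (four distinct feature vectors) tells us up front that no model smaller than four table entries can memorize this particular data set exactly, which is the spirit of sizing the explanatory function before building it.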

Features and Requirements:

Input Data Set
- CSV format
- One column must contain the class labels (target column)
- Supported cell values: strings, floats, integers
- No pre-processing necessary
- No limit on the number of rows or columns
- Unlimited number of classes; at least 100 instances per class recommended for best results
- Support for unbalanced data sets
- Support for sparse data sets
- Support for data sets with missing values

Model Creation
- Support for decision trees, neural networks, and random forests
- Measurement-driven build process for optimal model size and speed
- Produces very small models, often kilobytes in size; two to three orders of magnitude smaller than models produced by other tools
- Stand-alone Python executable that requires only NumPy
- Written in clear-text Python code (easily committed to your Git repository)
- No GPU required to run the model

Measurements
- Data sufficiency (capacity progression)
- Attribute ranking
- Number of model parameters needed to learn
- Overfit risk
- Expected generalization

Integration
- BTC can be scripted from the command line
- BTC measurements are output as JSON or plain text
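The input requirements above (plain CSV, one target column, mixed string/float/integer cells, missing values allowed) can be satisfied with nothing beyond NumPy, mirroring the generated predictor's sole dependency. The sketch below shows such a CSV being loaded; the column names and data are invented for the example.

```python
import io
import numpy as np

# A minimal example of the kind of input BTC accepts: plain CSV with
# a header row, one target column ("risk"), mixed cell types, and
# missing values (the empty fields below).
csv_text = """age,income,region,risk
34,55000.0,north,good
51,,south,bad
28,41000.0,,good
"""

# names=True reads the header; dtype=None infers a type per column.
rows = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                     names=True, dtype=None, encoding="utf-8")

labels = rows["risk"]            # the target column
print(list(rows.dtype.names))    # ['age', 'income', 'region', 'risk']
print(labels.tolist())           # ['good', 'bad', 'good']
```

Note that the missing float in `income` becomes `nan` while the missing string in `region` becomes an empty string; a tool that "supports missing-value data sets" has to handle both cases.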
