ModelHandler provides ML APIs

ModelHandler exposes machine learning models as REST APIs that applications can call. (From v1.46.)

Classifier

Classifiers categorize input data into different classes. This is used for success/failure prediction in many scenarios. Here’s an example that classifies iris.csv:

import pandas as pd
from gramex.ml import Classifier

# Read the data into a Pandas DataFrame
data = pd.read_csv('iris.csv', encoding='utf-8')

# Construct the model. The model only accepts a path where it should be saved
filepath = 'iris.pkl'                     # Assumed path, matching the file saved below
model = Classifier(
    model_class='sklearn.svm.SVC',        # Any sklearn model works
    model_kwargs={'kernel': 'sigmoid'},   # Optional model parameters
    url=filepath,
    # Input column names in data
    input=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
    output='species'
)
# Train the model
model.train(data)                         # DataFrame with input & output columns
model.save('iris.pkl')

This saves the model as iris.pkl. You can use this in Python to make predictions:

import gramex.ml

model = gramex.ml.load('iris.pkl')
result = model.predict([{
  'sepal_length': 5.7,
  'sepal_width': 4.4,
  'petal_length': 1.5,
  'petal_width': 0.4,
}])
# result should be ['setosa']

Expose Endpoints

Such ML models can be exposed as a REST API.

url:
  modelhandler:
    pattern: /$YAMLURL/model/(.*?)/(.*?)
    handler: ModelHandler
    kwargs:
      path: $YAMLPATH # Folder with model files

This folder may contain multiple models. In our example, it would have iris.pkl. The endpoint for this model is model/iris/, which shows basic model information.

See the Iris model info

To classify using the model, visit model/iris/?sepal_width=1&sepal_length=2&petal_width=3&petal_length=4. This returns a JSON list with the inputs and the result:

[
  {
    "sepal_length": "2",
    "sepal_width": "1",
    "petal_length": "4",
    "petal_width": "3",
    "result": "versicolor"
  }
]
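As a rough sketch of how a client might build this request and read the reply, the snippet below uses only the Python standard library. The relative base path is taken from the example above and depends on where the handler is mounted; the response text is the sample JSON shown above, not a live call.

```python
import json
from urllib.parse import urlencode

# Relative endpoint from the example above; prefix it with your own server URL
base = 'model/iris/'
row = {'sepal_width': 1, 'sepal_length': 2, 'petal_width': 3, 'petal_length': 4}

# Each input column becomes one URL query parameter
url = base + '?' + urlencode(row)
print(url)

# Sample response copied from the example above (not fetched from a server)
response_text = '''[{"sepal_length": "2", "sepal_width": "1",
                     "petal_length": "4", "petal_width": "3",
                     "result": "versicolor"}]'''

# The response is a list with one object per input row
predictions = [item['result'] for item in json.loads(response_text)]
print(predictions)
```

Note that the inputs are echoed back as strings in the sample response, so a client that needs numbers has to convert them itself.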

You can classify as many inputs as required by repeating the parameters. For example:

model/iris/?sepal_width=1.2&sepal_length=2.4&petal_width=3.2&petal_length=4.2
           &sepal_width=4.4&sepal_length=5.7&petal_width=0.4&petal_length=1.5
           &sepal_width=7.2&sepal_length=3.6&petal_width=6.1&petal_length=2.5

returns:

[
  {
    "sepal_length": "2.4",
    "sepal_width": "1.2",
    "petal_length": "4.2",
    "petal_width": "3.2",
    "result": "versicolor"
  },
  {
    "sepal_length": "5.7",
    "sepal_width": "4.4",
    "petal_length": "1.5",
    "petal_width": "0.4",
    "result": "setosa"
  },
  {
    "sepal_length": "3.6",
    "sepal_width": "7.2",
    "petal_length": "2.5",
    "petal_width": "6.1",
    "result": "virginica"
  }
]

Try classifying multiple values
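One way to produce such a query from a list of rows is to flatten the rows into key-value pairs, so that each column name repeats once per row. This is a minimal sketch using the standard library; the `rows` variable is illustrative and holds the three inputs from the example above.

```python
from urllib.parse import urlencode, parse_qs

rows = [
    {'sepal_width': 1.2, 'sepal_length': 2.4, 'petal_width': 3.2, 'petal_length': 4.2},
    {'sepal_width': 4.4, 'sepal_length': 5.7, 'petal_width': 0.4, 'petal_length': 1.5},
    {'sepal_width': 7.2, 'sepal_length': 3.6, 'petal_width': 6.1, 'petal_length': 2.5},
]

# Flatten the rows into (name, value) pairs; each column name repeats per row
pairs = [(name, value) for row in rows for name, value in row.items()]
query = urlencode(pairs)

# Parsing the query groups repeated parameters into one list per column,
# in the order they were sent
columns = parse_qs(query)
print(columns['sepal_width'])
```

Keeping the parameters in the same order for every row makes it easy to match the n-th value of each column back to the n-th input.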

Notes:

{
    "col1":["val1","val2"],
    "col2":["val3","val4"],
    "model_class":"sklearn.ensemble.RandomForestClassifier"
}

URL query parameters can be sent as they usually are in FormHandler: /model/<name>/?col1=val1&col2=val2&col1=val3..

Example Usage

For example, the following request via httpie creates a model around the iris dataset, assuming that the server has an iris.csv inside the app folder:

http PUT https://learn.gramener.com/guide/modelhandler/model/iris/ \
model_class=sklearn.linear_model.SGDClassifier \
output=species Model-Retrain:true url=iris.csv
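For readers who do not use httpie, the same request can be sketched in Python. httpie sends `key=value` items as fields of a JSON body and `Name:value` items as HTTP headers, so the sketch below builds an equivalent PUT request with the standard library. It only constructs the request; the final commented line shows how it could be sent.

```python
import json
from urllib.request import Request

# key=value items in the httpie command become fields of a JSON body
body = {
    'model_class': 'sklearn.linear_model.SGDClassifier',
    'output': 'species',
    'url': 'iris.csv',
}

request = Request(
    'https://learn.gramener.com/guide/modelhandler/model/iris/',
    data=json.dumps(body).encode('utf-8'),
    # Name:value items in the httpie command become HTTP headers
    headers={'Content-Type': 'application/json', 'Model-Retrain': 'true'},
    method='PUT',
)

# To send the request: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```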

If no input is sent, it will assume all columns except the output column are the input columns.

If no output is sent, it will assume the right-most or last column of the table is the output column.
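The two defaulting rules above can be illustrated with a small helper. This is a hypothetical sketch of the described behavior, not ModelHandler's actual code; the name `infer_columns` and its parameters are invented for illustration.

```python
def infer_columns(columns, inputs=None, output=None):
    """Apply the defaulting rules described above (illustrative only)."""
    columns = list(columns)
    # No output given: assume the last (right-most) column is the output
    if output is None:
        output = columns[-1]
    # No input given: assume every column except the output is an input
    if inputs is None:
        inputs = [col for col in columns if col != output]
    return inputs, output


iris_columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
inputs, output = infer_columns(iris_columns)
print(inputs, output)
```

With the iris columns, omitting both values selects `species` as the output and the four measurements as inputs, which matches the explicit configuration used earlier.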

After this, visiting this link will return the model parameters, and visiting this link will return a prediction as a JSON object. (The answer should be setosa.)

This form applies the URL query parameters directly. Try it.

API Reference

GroupMeans

gramex.ml provides access to the groupmeans() function, which lets you see the most significant influencers of various metrics in your data. (From v1.42.)

groupmeans accepts the following parameters-

For more information, see autolysis.groupmeans

For example, here is groupmeans used in a FormHandler:

url:
  groupmeans-insights:
    pattern: /$YAMLURL/
    handler: FormHandler
    kwargs:
      url: $YAMLPATH/yourdatafile.csv
      modify: groupmeans_app.groupmeans_insights(data, handler)

  groupmeans-data:
    pattern: /$YAMLURL/data
    handler: FormHandler
    kwargs:
      url: $YAMLPATH/yourdatafile.csv
      default:
        _format: html

And in groupmeans_app.py

import gramex.ml

def groupmeans_insights(data, handler):
    args = handler.argparse(
        groups={'nargs': '*', 'default': []},
        numbers={'nargs': '*', 'default': []},
        cutoff={'type': float, 'default': .01},
        quantile={'type': float, 'default': .95},
        minsize={'type': int, 'default': None},
        weight={'type': float, 'default': None})
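    # Pass optional arguments by keyword so minsize and weight cannot be swapped
    return gramex.ml.groupmeans(data, args.groups, args.numbers,
                                cutoff=args.cutoff, quantile=args.quantile,
                                minsize=args.minsize, weight=args.weight)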
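Since the function reads its options from the request, a caller would presumably pass them as URL query parameters, repeating `groups` and `numbers` once per column. The sketch below builds such a query string; the column names `region`, `gender` and `score` are hypothetical and should be replaced with columns from your own data file.

```python
from urllib.parse import urlencode

# Hypothetical column names; replace with columns from your own data file
params = {
    'groups': ['region', 'gender'],   # Categorical columns to group by
    'numbers': ['score'],             # Numeric columns to compare
    'cutoff': 0.05,                   # Overrides the default of .01
}

# doseq=True repeats a parameter once for each item in a list value
query = urlencode(params, doseq=True)
print('?' + query)
```

Parameters left out of the query fall back to the defaults declared in the `handler.argparse` call.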

Links to Machine Learning and Analytics Usecases

Groupmeans Applied to the National Achievement Survey Dataset