
Data Science Model and Training - Understanding

  • Coming from a programming background where you write code, test, deploy, and run, I'm trying to wrap my head around the concept of "training a model" or a "trained model" in data science, and of deploying that trained model.

    I'm not really concerned about the deployment environment, automation, and so on. I'm trying to understand the deployment unit itself: a trained model. What does a trained model look like on a file system, and what does it contain?

    I understand the concept of training a model and of splitting a dataset into a training set and a testing set. But let's say I have a notebook (Python / Jupyter), I load in some data, split it into training and testing sets, and run an algorithm to "train" my model. What is my deliverable under the hood? While I'm training the model there is presumably a certain amount of data held in memory, so how does that become part of the trained model? It obviously can't contain all of the data used for training.

    For instance, if I'm training a retrieval-based chatbot agent, what is actually happening during training after I add examples of user questions or "intents", and what is my deployable as far as a trained model goes? Does the trained model contain some sort of summation of the training data, or an array of terms? How large (in deployable size) can it get?

    While the question may seem relatively simple ("what is a trained model?"), how would I explain it to a DevOps tech in simple terms? Think of it as "an IT guy interested in data science trying to understand the tangible unit of a trained model in a discussion with a data science guy".
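
    For concreteness, here is roughly the workflow I mean, sketched with scikit-learn and its bundled iris dataset purely as stand-ins for my own data and algorithm:

        # Hypothetical minimal version of the notebook workflow described above.
        import os
        import joblib
        from sklearn.datasets import load_iris
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split

        X, y = load_iris(return_X_y=True)                        # load some data
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.25, random_state=0)                # split training/testing

        model = LogisticRegression(max_iter=1000)                # pick an algorithm
        model.fit(X_train, y_train)                              # "train" the model (in memory)
        print(model.score(X_test, y_test))                       # evaluate on the held-out test set

        joblib.dump(model, "model.joblib")                       # serialize whatever the object holds
        print(os.path.getsize("model.joblib"), "bytes on disk")  # the tangible file I'd hand to DevOps

    What I'm unsure about is what that serialized file conceptually contains.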

    Thanks

      September 17, 2021 1:33 PM IST
  • Training a model simply means learning (determining) good values for all the weights and the bias from labeled examples. In supervised learning, a machine learning algorithm builds a model by examining many examples and attempting to find a model that minimizes loss; this process is called empirical risk minimization.
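
    A bare-bones sketch of that idea, with plain Python standing in for any real library: the only things that survive training are the learned weight and bias.

        # Toy empirical risk minimization: learn a weight w and bias b so that
        # w*x + b fits the labeled examples, by repeatedly reducing the mean squared loss.
        examples = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]  # labeled (x, y) pairs

        w, b = 0.0, 0.0
        learning_rate = 0.01
        for _ in range(5000):
            grad_w = grad_b = 0.0
            for x, y in examples:
                error = (w * x + b) - y      # prediction minus label
                grad_w += 2 * error * x      # gradient of the squared loss w.r.t. w
                grad_b += 2 * error          # gradient of the squared loss w.r.t. b
            w -= learning_rate * grad_w / len(examples)
            b -= learning_rate * grad_b / len(examples)

        print(w, b)  # these two numbers are the entire "trained model"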
      September 20, 2021 5:32 PM IST
  • A trained model (pickled, or whatever format you want to use) contains at least the features on which it was trained. Take a simple distance-based model as an example: you design the model around the fact that features (x1, x2, x3, x4) are important, and when any new point comes into contact with the model, it gives back a calculated distance, from which you draw insights or conclusions. Similarly for chatbots: you train on NER-CRF features, or whatever features you want, and as soon as a text comes into contact with the model, those features are extracted and insights/conclusions are drawn. Hope it was helpful!! I tried explaining it the Feynman way.
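
    As a rough sketch of the chatbot case, using a plain scikit-learn TF-IDF + logistic regression pipeline as a stand-in for a real NER-CRF setup: the pickled file ends up holding the learned vocabulary and coefficients rather than the original example sentences.

        # Toy intent classifier: the pickle stores a fitted vocabulary and
        # coefficients, not the raw training sentences themselves.
        import os
        import pickle
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        questions = ["where is my order", "track my package",
                     "i want a refund", "how do i return this item"]
        intents = ["track_order", "track_order", "refund", "refund"]

        model = make_pipeline(TfidfVectorizer(), LogisticRegression())
        model.fit(questions, intents)

        with open("intent_model.pkl", "wb") as f:
            pickle.dump(model, f)

        print(len(model.named_steps["tfidfvectorizer"].vocabulary_))  # number of learned terms
        print(os.path.getsize("intent_model.pkl"), "bytes on disk")   # the deployable size
        print(model.predict(["i would like a refund"]))               # most likely ["refund"]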
      October 20, 2021 1:07 PM IST
  • It depends on the model. For linear regression, for example, training gives you the slope coefficients and the intercept (generally). These are the "model parameters". When deployed, traditionally, these coefficients get fed into a different algorithm (literally y = mx + b), and when it is queried with "what should y be when I have this x?", it responds with the appropriate value.

    With k-means clustering, on the other hand, the "parameters" are vectors (the cluster centers), and the predict algorithm computes the distance from a given vector to each of them and returns the closest cluster. Note that these clusters are often post-processed, so the predict algorithm will say "shoes" rather than "[1,2,3,5]", which is again an example of how these things change in the wild.

    Deep learning returns a set of edge weights for a graph. Various parametric systems (as in maximum likelihood estimation) return the coefficients that describe a particular distribution: for a uniform distribution that is the number of buckets, for a Gaussian/normal distribution it is the mean and variance, and more complicated distributions have even more, for example skew or conditional probabilities.
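
    A quick sketch of what those parameter sets look like in code, using scikit-learn and NumPy purely for illustration:

        # What the learned "parameters" are for three kinds of models.
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 3))
        y = X @ np.array([2.0, -1.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

        # Linear regression: the model is just slope coefficients plus an intercept.
        reg = LinearRegression().fit(X, y)
        print(reg.coef_, reg.intercept_)    # literally the m's and the b in y = mx + b

        # K-means: the model is the set of cluster-center vectors.
        km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
        print(km.cluster_centers_)          # predict() returns the nearest of these

        # Maximum likelihood for a Gaussian: the model is a mean and a variance.
        sample = rng.normal(loc=10.0, scale=2.0, size=1000)
        print(sample.mean(), sample.var())  # the two numbers that describe the distribution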

      September 30, 2021 12:41 PM IST