Is there a way to save the preprocessing objects in scikit-learn?



I am building a neural net with the purpose of make predictions on new data in the future. I first preprocess the training data using sklearn.preprocessing, then train the model, then make some predictions, then close the program. In the future, when new data comes in I have to use the same preprocessing scales to transform the new data before putting it into the model. Currently, I have to load all of the old data, fit the preprocessor, then transform the new data with those preprocessors. Is there a way for me to save the preprocessing objects objects (like sklearn.preprocessing.StandardScaler) so that I can just load the old objects rather than have to remake them?


As mentioned by lejlot, you can use the library pickle to save the trained network as a file in your hard drive, then you just need to load it to start to make predictions.

Here is an example on how to use pickle to save and load python objects:

import pickle
import numpy as np

npTest_obj = np.asarray([[1,2,3],[6,5,4],[8,7,9]])

strTest_obj = "pickle example XXXX"

if __name__ == "__main__":
    # store object information
    pickle.dump(npTest_obj, open("npObject.p", "wb"))
    pickle.dump(strTest_obj, open("strObject.p", "wb"))

    # read information from file
    str_readObj = pickle.load(open("strObject.p","rb"))
    np_readObj = pickle.load(open("npObject.p","rb"))

Answered By – GurstTavo

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave A Reply

Your email address will not be published.

This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Accept Read More