Automated Machine Learning for Earth Science via AutoGluon

Authors

  • Author1 = {"name": "Xingjian Shi", "affiliation": "Amazon Web Services", "email": "xjshi@amazon.com", "orcid": ""}

  • Author2 = {"name": "Wen-ming Ye", "affiliation": "Amazon Web Services", "email": "wye@amazon.com", "orcid": ""}

  • Author3 = {"name": "Nick Erickson", "affiliation": "Amazon Web Services", "email": "neerick@amazon.com", "orcid": ""}

  • Author4 = {"name": "Jonas Mueller", "affiliation": "Amazon Web Services", "email": "jonasmue@amazon.com", "orcid": ""}

  • Author5 = {"name": "Alexander Shirkov", "affiliation": "Amazon Web Services", "email": "ashyrkou@amazon.com", "orcid": ""}

  • Author6 = {"name": "Zhi Zhang", "affiliation": "Amazon Web Services", "email": "zhiz@amazon.com", "orcid": ""}

  • Author7 = {"name": "Mu Li", "affiliation": "Amazon Web Services", "email": "mli@amazon.com", "orcid": ""}

  • Author8 = {"name": "Alexander Smola", "affiliation": "Amazon Web Services", "email": "alex@smola.org", "orcid": ""}

Purpose

In this notebook, we introduce AutoGluon to the Earth science community. AutoGluon is an automated machine learning toolkit that enables users to solve machine learning problems with a single line of code. Many Earth science problems involve tabular datasets. With AutoGluon, you can feed in the raw data table and specify the label column, and AutoGluon will deliver a model with reasonable performance in a short period of time. You can also analyze the importance of each feature column with a single line of code. In the following, we illustrate how to use AutoGluon to build machine learning models for two Earth science problems.

Setup

We have pre-installed AutoGluon via pip. Here, we will fix the random seed.

# Uncomment below to install autogluon
# !python3 -m pip install autogluon
import random
import numpy as np
random.seed(123)
np.random.seed(123)

Forest Cover Type Classification

In the first example, we predict the forest cover type (the predominant kind of tree cover) from strictly cartographic variables. The dataset is downloaded from the Kaggle Forest Cover Type Prediction competition. The study area includes four wilderness areas located in the Roosevelt National Forest of northern Colorado. The actual forest cover type for a given 30 x 30 meter cell was determined from US Forest Service (USFS) Region 2 Resource Information System data. Independent variables were then derived from data obtained from the US Geological Survey and USFS. The data is in raw form and contains binary columns for qualitative independent variables such as wilderness area and soil type. Let's first download the dataset.

!wget https://deep-earth.s3.amazonaws.com/datasets/earthcube2021_demo/forest-cover-type-prediction.zip -O forest-cover-type-prediction.zip
!unzip -o forest-cover-type-prediction.zip -d forest-cover-type-prediction
--2021-05-16 08:43:12--  https://deep-earth.s3.amazonaws.com/datasets/earthcube2021_demo/forest-cover-type-prediction.zip
Resolving deep-earth.s3.amazonaws.com (deep-earth.s3.amazonaws.com)... 52.217.71.28
Connecting to deep-earth.s3.amazonaws.com (deep-earth.s3.amazonaws.com)|52.217.71.28|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 26555059 (25M) [application/zip]
Saving to: ‘forest-cover-type-prediction.zip’

forest-cover-type-p 100%[===================>]  25.32M  63.0MB/s    in 0.4s    

2021-05-16 08:43:13 (63.0 MB/s) - ‘forest-cover-type-prediction.zip’ saved [26555059/26555059]

Archive:  forest-cover-type-prediction.zip
  inflating: forest-cover-type-prediction/sampleSubmission.csv  
  inflating: forest-cover-type-prediction/sampleSubmission.csv.zip  
  inflating: forest-cover-type-prediction/test.csv  
  inflating: forest-cover-type-prediction/test.csv.zip  
  inflating: forest-cover-type-prediction/test3.csv  
  inflating: forest-cover-type-prediction/train.csv  
  inflating: forest-cover-type-prediction/train.csv.zip  

Here, we load and visualize the dataset. We split the data into 75% training and 25% development so that we can report scores on the development set. For the purpose of demonstration, we also subsample the dataset to 5000 rows.

import pandas as pd
from sklearn.model_selection import train_test_split
df = pd.read_csv('forest-cover-type-prediction/train.csv.zip')
df = df.drop(columns='Id')
df = df.sample(5000, random_state=100)
train_df, dev_df = train_test_split(df, random_state=100)

By visualizing the dataset, we can see that there are 54 feature columns and 1 label column called "Cover_Type".

train_df.head(5)
Elevation Aspect Slope Horizontal_Distance_To_Hydrology Vertical_Distance_To_Hydrology Horizontal_Distance_To_Roadways Hillshade_9am Hillshade_Noon Hillshade_3pm Horizontal_Distance_To_Fire_Points ... Soil_Type32 Soil_Type33 Soil_Type34 Soil_Type35 Soil_Type36 Soil_Type37 Soil_Type38 Soil_Type39 Soil_Type40 Cover_Type
7449 2762 17 16 270 49 2639 206 206 134 268 ... 0 0 0 0 0 0 0 0 0 5
13086 2283 109 11 0 0 1138 240 227 116 1187 ... 0 0 0 0 0 0 0 0 0 4
14221 3220 82 14 247 66 3328 239 214 103 819 ... 1 0 0 0 0 0 0 0 0 1
768 3021 68 8 201 26 4134 228 225 130 2493 ... 0 0 0 0 0 0 0 0 0 1
6132 2446 76 21 469 105 726 241 196 75 1401 ... 0 0 0 0 0 0 0 0 0 6

5 rows × 55 columns

Train Model with One Line

Next, we train a model in AutoGluon with a single line of code. We just need to specify the label column before calling .fit(). Here, the label column is Cover_Type. AutoGluon infers the problem type automatically: in our example, it correctly figures out that this is a 'multiclass' classification problem and outputs the model with the best accuracy. Internally, it also infers the type of each feature automatically.

import autogluon
from autogluon.tabular import TabularPredictor
predictor = TabularPredictor(label='Cover_Type', path='ag_ec2021_demo').fit(train_df)
Warning: path already exists! This predictor may overwrite an existing predictor! path="ag_ec2021_demo"
Beginning AutoGluon training ...
AutoGluon will save models to "ag_ec2021_demo/"
AutoGluon Version:  0.2.1b20210511
Train Data Rows:    3750
Train Data Columns: 54
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
	7 unique label values:  [5, 4, 1, 6, 3, 2, 7]
	If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
NumExpr defaulting to 8 threads.
Train Data Class Count: 7
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    31462.81 MB
	Train Data (Original)  Memory Usage: 1.62 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Useless Original Features (Count: 4): ['Soil_Type7', 'Soil_Type8', 'Soil_Type15', 'Soil_Type25']
		These features carry no predictive signal and should be manually investigated.
		This is typically a feature which has the same value for all rows.
		These features do not need to be present at inference time.
	Types of features in original data (raw dtype, special dtypes):
		('int', []) : 50 | ['Elevation', 'Aspect', 'Slope', 'Horizontal_Distance_To_Hydrology', 'Vertical_Distance_To_Hydrology', ...]
	Types of features in processed data (raw dtype, special dtypes):
		('int', []) : 50 | ['Elevation', 'Aspect', 'Slope', 'Horizontal_Distance_To_Hydrology', 'Vertical_Distance_To_Hydrology', ...]
	0.1s = Fit runtime
	50 features in original data used to generate 50 features in processed data.
	Train Data (Processed) Memory Usage: 1.5 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.09s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
	To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.13333333333333333, Train Rows: 3250, Val Rows: 500
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif ...
	0.72	 = Validation accuracy score
	0.02s	 = Training runtime
	0.11s	 = Validation runtime
Fitting model: KNeighborsDist ...
	0.744	 = Validation accuracy score
	0.01s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: NeuralNetFastAI ...
	0.796	 = Validation accuracy score
	8.52s	 = Training runtime
	0.03s	 = Validation runtime
Fitting model: LightGBMXT ...
	0.83	 = Validation accuracy score
	1.93s	 = Training runtime
	0.03s	 = Validation runtime
Fitting model: LightGBM ...
	0.832	 = Validation accuracy score
	3.08s	 = Training runtime
	0.04s	 = Validation runtime
Fitting model: RandomForestGini ...
	0.822	 = Validation accuracy score
	0.85s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: RandomForestEntr ...
	0.824	 = Validation accuracy score
	1.02s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: CatBoost ...
	0.812	 = Validation accuracy score
	4.56s	 = Training runtime
	0.0s	 = Validation runtime
Fitting model: ExtraTreesGini ...
	0.802	 = Validation accuracy score
	0.71s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: ExtraTreesEntr ...
	0.808	 = Validation accuracy score
	0.81s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: XGBoost ...
	0.816	 = Validation accuracy score
	6.59s	 = Training runtime
	0.01s	 = Validation runtime
Fitting model: NeuralNetMXNet ...
	0.8	 = Validation accuracy score
	9.82s	 = Training runtime
	0.12s	 = Validation runtime
Fitting model: LightGBMLarge ...
	0.834	 = Validation accuracy score
	6.4s	 = Training runtime
	0.03s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	0.858	 = Validation accuracy score
	0.35s	 = Training runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 49.1s ...
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("ag_ec2021_demo/")
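
AutoGluon inferred the problem type ('multiclass') and the evaluation metric ('accuracy') on its own. If you prefer to state these explicitly, or to cap the total training time, you can pass them yourself. Below is a minimal sketch; the eval_metric, time_limit, and output path values are illustrative choices rather than required settings.

# Explicitly declare the problem type and metric instead of relying on inference,
# and bound the total training time in seconds (values here are illustrative).
predictor_explicit = TabularPredictor(
    label='Cover_Type',
    problem_type='multiclass',   # one of 'binary', 'multiclass', 'regression'
    eval_metric='accuracy',
    path='ag_ec2021_demo_explicit',
).fit(train_df, time_limit=600)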

We can visualize the performance of each model with predictor.leaderboard(). Internally, AutoGluon trains a diverse set of different tabular models and computes a weighted ensemble to combine these models.

predictor.leaderboard()
                  model  score_val  pred_time_val   fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0   WeightedEnsemble_L2      0.858       0.432053  28.982538                0.000448           0.346764            2       True         14
1         LightGBMLarge      0.834       0.032289   6.397996                0.032289           6.397996            1       True         13
2              LightGBM      0.832       0.040016   3.076502                0.040016           3.076502            1       True          5
3            LightGBMXT      0.830       0.028169   1.926692                0.028169           1.926692            1       True          4
4      RandomForestEntr      0.824       0.102347   1.017105                0.102347           1.017105            1       True          7
5      RandomForestGini      0.822       0.102403   0.848964                0.102403           0.848964            1       True          6
6               XGBoost      0.816       0.011192   6.591112                0.011192           6.591112            1       True         11
7              CatBoost      0.812       0.004262   4.561907                0.004262           4.561907            1       True          8
8        ExtraTreesEntr      0.808       0.102475   0.814237                0.102475           0.814237            1       True         10
9        ExtraTreesGini      0.802       0.102421   0.714676                0.102421           0.714676            1       True          9
10       NeuralNetMXNet      0.800       0.124815   9.818555                0.124815           9.818555            1       True         12
11      NeuralNetFastAI      0.796       0.029656   8.515627                0.029656           8.515627            1       True          3
12       KNeighborsDist      0.744       0.102354   0.012857                0.102354           0.012857            1       True          2
13       KNeighborsUnif      0.720       0.105474   0.017922                0.105474           0.017922            1       True          1
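
The leaderboard above reports scores on the internal validation split. leaderboard() also accepts a dataset, so you can score every trained model on the development set; a short sketch of that usage (silent=True suppresses the printed copy, and the returned DataFrame is displayed by the notebook):

predictor.leaderboard(dev_df, silent=True)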

Evaluation and Prediction

We can also evaluate the model performance on the held-out development dataset by calling .evaluate().

predictor.evaluate(dev_df)
Evaluation: accuracy on test data: 0.8168
Evaluations on test data:
{
    "accuracy": 0.8168,
    "balanced_accuracy": 0.8170602919410992,
    "mcc": 0.7866393746250545
}
{'accuracy': 0.8168,
 'balanced_accuracy': 0.8170602919410992,
 'mcc': 0.7866393746250545}

To get predictions, simply call predictor.predict().

predictions = predictor.predict(dev_df)
predictions
6084     3
927      5
10919    3
8867     2
14455    7
        ..
6618     5
9591     7
14307    1
1553     1
3        2
Name: Cover_Type, Length: 1250, dtype: int64

For classification problems, we can also use .predict_proba() to get the predicted probability of each class.

probs = predictor.predict_proba(dev_df)
probs.head(5)
1 2 3 4 5 6 7
6084 0.000229 0.000843 0.744518 0.208950 0.000476 0.044587 0.000397
927 0.043397 0.347411 0.000929 0.001604 0.597819 0.006463 0.002378
10919 0.006373 0.060102 0.767284 0.000076 0.126009 0.038330 0.001827
8867 0.170293 0.748936 0.002065 0.000083 0.072658 0.002915 0.003051
14455 0.004558 0.004203 0.000125 0.000081 0.000263 0.000071 0.990699
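
To enter the Kaggle competition, you can predict on the provided test set and write the results in the sample-submission format. A hedged sketch, assuming the submission file expects an Id column and a Cover_Type column (check forest-cover-type-prediction/sampleSubmission.csv to confirm):

# Predict on the competition test set and build a submission file.
test_df = pd.read_csv('forest-cover-type-prediction/test.csv')
submission = pd.DataFrame({
    'Id': test_df['Id'],
    'Cover_Type': predictor.predict(test_df.drop(columns='Id')),
})
submission.to_csv('submission.csv', index=False)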

Load the Predictor

Loading an AutoGluon predictor is straightforward: we can directly call TabularPredictor.load() with the saved path.

predictor_loaded = TabularPredictor.load('ag_ec2021_demo')
predictor_loaded.evaluate(dev_df)
Evaluation: accuracy on test data: 0.8168
Evaluations on test data:
{
    "accuracy": 0.8168,
    "balanced_accuracy": 0.8170602919410992,
    "mcc": 0.7866393746250545
}
{'accuracy': 0.8168,
 'balanced_accuracy': 0.8170602919410992,
 'mcc': 0.7866393746250545}

Feature Importance

AutoGluon offers a built-in method for calculating the relative importance of each feature based on permutation shuffling. In the following, we calculate the feature importance and print the 10 most important features. In the output, the importance column is the permutation importance score, and the remaining columns (stddev, p_value, n, p99_high, p99_low) describe the statistical significance of that estimate.

importance = predictor.feature_importance(dev_df, subsample_size=500)
importance.head(10)
Computing feature importance via permutation shuffling for 54 features using 500 rows with 3 shuffle sets...
	104.88s	= Expected runtime (34.96s per shuffle set)
	17.35s	= Actual runtime (Completed 3 of 3 shuffle sets)
importance stddev p_value n p99_high p99_low
Elevation 0.475333 0.029143 0.000625 3 0.642328 0.308339
Horizontal_Distance_To_Roadways 0.085333 0.008327 0.001579 3 0.133046 0.037621
Horizontal_Distance_To_Fire_Points 0.066000 0.002000 0.000153 3 0.077460 0.054540
Horizontal_Distance_To_Hydrology 0.053333 0.013317 0.010078 3 0.129639 -0.022973
Hillshade_9am 0.023333 0.009238 0.024239 3 0.076266 -0.029599
Wilderness_Area4 0.018000 0.011136 0.053704 3 0.081808 -0.045808
Hillshade_Noon 0.016667 0.023861 0.174968 3 0.153391 -0.120058
Aspect 0.016000 0.014000 0.093162 3 0.096222 -0.064222
Vertical_Distance_To_Hydrology 0.014667 0.003055 0.007078 3 0.032172 -0.002839
Wilderness_Area1 0.012667 0.004163 0.017088 3 0.036523 -0.011190

From the results, we can see that Elevation is the most important feature, followed by Horizontal_Distance_To_Roadways.
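
Because feature_importance() returns a pandas DataFrame sorted by importance, you can also visualize the scores directly. A minimal sketch using matplotlib (assumed to be available in the environment):

import matplotlib.pyplot as plt

# Horizontal bar chart of the ten most important features.
importance['importance'].head(10).plot(kind='barh')
plt.gca().invert_yaxis()  # put the most important feature on top
plt.xlabel('Permutation importance')
plt.tight_layout()
plt.show()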

Achieve Better Performance

The default behavior of AutoGluon is to compute a weighted ensemble of a diverse set of models. Usually, you can achieve better performance via stack ensembling. To enable automated stack ensembling, specify presets="best_quality" when calling .fit(). For more details, you can also check out our provided script. The detailed architecture is described in [1], and the following figure illustrates the general design.
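
A sketch of such a call is shown below; the time_limit is an illustrative budget, since stack ensembling with bagging takes considerably longer than the default preset.

# Multi-layer stack ensembling with bagging; trades training time for accuracy.
predictor_best = TabularPredictor(
    label='Cover_Type',
    path='ag_ec2021_demo_best',
).fit(train_df, presets='best_quality', time_limit=3600)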

[Figure: multi-layer stack ensembling architecture of AutoGluon-Tabular, as described in [1]]

With .fit(train_df, presets="best_quality"), we were able to reach rank 82 out of 1692 in the competition. To reproduce this result, you may try the command mentioned in the link above.

[Figure: Kaggle Forest Cover Type Prediction leaderboard position achieved with presets="best_quality"]

Solar Radiation Prediction

In the second example, we train a model to predict solar radiation. The original dataset is available from the Kaggle Solar Radiation Prediction competition. The dataset contains columns such as "wind direction", "wind speed", "humidity", and "temperature"; the response variable to be predicted is the solar radiation level, stored in the "Radiation" column. The data contains measurements spanning four months. Let's download and load the dataset.

!wget https://deep-earth.s3.amazonaws.com/datasets/earthcube2021_demo/SolarPrediction.csv.zip -O SolarPrediction.csv.zip
--2021-05-16 08:44:25--  https://deep-earth.s3.amazonaws.com/datasets/earthcube2021_demo/SolarPrediction.csv.zip
Resolving deep-earth.s3.amazonaws.com (deep-earth.s3.amazonaws.com)... 52.217.95.209
Connecting to deep-earth.s3.amazonaws.com (deep-earth.s3.amazonaws.com)|52.217.95.209|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 523425 (511K) [application/zip]
Saving to: ‘SolarPrediction.csv.zip’

SolarPrediction.csv 100%[===================>] 511.16K  --.-KB/s    in 0.007s  

2021-05-16 08:44:25 (76.8 MB/s) - ‘SolarPrediction.csv.zip’ saved [523425/523425]
import pandas as pd
df = pd.read_csv('SolarPrediction.csv.zip')
train_df, dev_df = train_test_split(df, random_state=100)
train_df.head(10)
UNIXTime Data Time Radiation Temperature Pressure Humidity WindDirection(Degrees) Speed TimeSunRise TimeSunSet
2664 1474412104 9/20/2016 12:00:00 AM 12:55:04 1039.15 65 30.40 57 2.26 5.62 06:11:00 18:21:00
12230 1476543319 10/15/2016 12:00:00 AM 04:55:19 1.21 51 30.46 23 181.58 6.75 06:17:00 17:59:00
11706 1476704422 10/17/2016 12:00:00 AM 01:40:22 1.22 50 30.47 39 142.56 10.12 06:18:00 17:58:00
12924 1476330025 10/12/2016 12:00:00 AM 17:40:25 28.35 59 30.45 42 167.42 4.50 06:16:00 18:02:00
27507 1482367563 12/21/2016 12:00:00 AM 14:46:03 637.93 57 30.39 74 40.94 4.50 06:53:00 17:49:00
2516 1474457405 9/21/2016 12:00:00 AM 01:30:05 1.21 45 30.39 73 159.07 3.37 06:11:00 18:20:00
32227 1480723808 12/2/2016 12:00:00 AM 14:10:08 177.19 45 30.34 93 134.78 11.25 06:42:00 17:42:00
12705 1476396922 10/13/2016 12:00:00 AM 12:15:22 1008.08 65 30.46 46 71.24 5.62 06:17:00 18:01:00
14992 1475697322 10/5/2016 12:00:00 AM 09:55:22 292.44 55 30.47 101 18.70 7.87 06:14:00 18:08:00
23615 1478267417 11/4/2016 12:00:00 AM 03:50:17 1.18 44 30.42 38 176.34 7.87 06:25:00 17:47:00

As in our previous example, we can directly train a predictor with a single .fit() call. The difference is that AutoGluon automatically determines that this is a regression problem.

predictor = TabularPredictor(label='Radiation', eval_metric='r2', path='ag_ec2021_demo2').fit(train_df)
Warning: path already exists! This predictor may overwrite an existing predictor! path="ag_ec2021_demo2"
Beginning AutoGluon training ...
AutoGluon will save models to "ag_ec2021_demo2/"
AutoGluon Version:  0.2.1b20210511
Train Data Rows:    24514
Train Data Columns: 10
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == float and many unique label-values observed).
	Label info (max, min, mean, stddev): (1601.26, 1.11, 206.52072, 315.54334)
	If 'regression' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    27858.63 MB
	Train Data (Original)  Memory Usage: 7.88 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting DatetimeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('float', [])                      : 3 | ['Pressure', 'WindDirection(Degrees)', 'Speed']
		('int', [])                        : 3 | ['UNIXTime', 'Temperature', 'Humidity']
		('object', ['datetime_as_object']) : 4 | ['Data', 'Time', 'TimeSunRise', 'TimeSunSet']
	Types of features in processed data (raw dtype, special dtypes):
		('float', [])                : 3 | ['Pressure', 'WindDirection(Degrees)', 'Speed']
		('int', [])                  : 3 | ['UNIXTime', 'Temperature', 'Humidity']
		('int', ['datetime_as_int']) : 4 | ['Data', 'Time', 'TimeSunRise', 'TimeSunSet']
	16.7s = Fit runtime
	10 features in original data used to generate 10 features in processed data.
	Train Data (Processed) Memory Usage: 1.96 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 16.74s ...
AutoGluon will gauge predictive performance using evaluation metric: 'r2'
	To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 22062, Val Rows: 2452
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif ...
	0.9501	 = Validation r2 score
	0.03s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: KNeighborsDist ...
	0.9531	 = Validation r2 score
	0.03s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: LightGBMXT ...
[1000]	train_set's l2: 5825	train_set's r2: 0.941343	valid_set's l2: 6881.24	valid_set's r2: 0.932405
[2000]	train_set's l2: 4818.35	train_set's r2: 0.951483	valid_set's l2: 6360.95	valid_set's r2: 0.937497
[3000]	train_set's l2: 4202.38	train_set's r2: 0.957684	valid_set's l2: 6212.24	valid_set's r2: 0.938993
[4000]	train_set's l2: 3751.34	train_set's r2: 0.962227	valid_set's l2: 6130.43	valid_set's r2: 0.939774
[5000]	train_set's l2: 3396.38	train_set's r2: 0.965805	valid_set's l2: 6110.98	valid_set's r2: 0.939962
[6000]	train_set's l2: 3117.1	train_set's r2: 0.968616	valid_set's l2: 6078.9	valid_set's r2: 0.940272
[7000]	train_set's l2: 2876.17	train_set's r2: 0.971039	valid_set's l2: 6073.82	valid_set's r2: 0.940339
[8000]	train_set's l2: 2666.91	train_set's r2: 0.973145	valid_set's l2: 6064.97	valid_set's r2: 0.940439
[9000]	train_set's l2: 2479.79	train_set's r2: 0.97503	valid_set's l2: 6082.82	valid_set's r2: 0.940253
	0.9405	 = Validation r2 score
	22.47s	 = Training runtime
	0.3s	 = Validation runtime
Fitting model: LightGBM ...
	0.9438	 = Validation r2 score
	2.37s	 = Training runtime
	0.02s	 = Validation runtime
Fitting model: RandomForestMSE ...
[1000]	train_set's l2: 2247.86	train_set's r2: 0.977368	valid_set's l2: 5751.29	valid_set's r2: 0.943489
	0.9436	 = Validation r2 score
	6.91s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: CatBoost ...
	0.942	 = Validation r2 score
	4.38s	 = Training runtime
	0.0s	 = Validation runtime
Fitting model: ExtraTreesMSE ...
	0.9445	 = Validation r2 score
	2.1s	 = Training runtime
	0.1s	 = Validation runtime
Fitting model: NeuralNetFastAI ...
No improvement since epoch 0: early stopping
	-0.3674	 = Validation r2 score
	12.72s	 = Training runtime
	0.04s	 = Validation runtime
Fitting model: XGBoost ...
	0.9447	 = Validation r2 score
	5.8s	 = Training runtime
	0.01s	 = Validation runtime
Fitting model: NeuralNetMXNet ...
	0.9348	 = Validation r2 score
	77.28s	 = Training runtime
	0.12s	 = Validation runtime
Fitting model: LightGBMLarge ...
	0.9445	 = Validation r2 score
	12.47s	 = Training runtime
	0.01s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	0.9547	 = Validation r2 score
	0.36s	 = Training runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 171.49s ...
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("ag_ec2021_demo2/")

We can evaluate on the development set by calling .evaluate(). Since we specified eval_metric='r2' when creating the predictor, the R2 score is reported, along with several auxiliary regression metrics.

predictor.evaluate(dev_df)
Evaluation: r2 on test data: 0.9543093262635433
Evaluations on test data:
{
    "r2": 0.9543093262635433,
    "root_mean_squared_error": -67.7654685984138,
    "mean_squared_error": -4592.158734362603,
    "mean_absolute_error": -24.437721801314463,
    "pearsonr": 0.9768914212616937,
    "median_absolute_error": -1.0411042070388794
}
{'r2': 0.9543093262635433,
 'root_mean_squared_error': -67.7654685984138,
 'mean_squared_error': -4592.158734362603,
 'mean_absolute_error': -24.437721801314463,
 'pearsonr': 0.9768914212616937,
 'median_absolute_error': -1.0411042070388794}
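
As in the classification example, predictor.predict() returns the point predictions, here the estimated radiation values. A short sketch comparing them against the held-out targets:

pred_radiation = predictor.predict(dev_df)
comparison = pd.DataFrame({'actual': dev_df['Radiation'], 'predicted': pred_radiation})
comparison.head(5)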

Similarly, we can also measure the feature importance.

importance = predictor.feature_importance(dev_df)
importance
Computing feature importance via permutation shuffling for 10 features using 1000 rows with 3 shuffle sets...
	10.78s	= Expected runtime (3.59s per shuffle set)
	4.0s	= Actual runtime (Completed 3 of 3 shuffle sets)
importance stddev p_value n p99_high p99_low
UNIXTime 1.063341 0.042203 0.000262 3 1.305167 0.821515
Time 0.080480 0.004762 0.000583 3 0.107768 0.053192
Temperature 0.029517 0.001956 0.000730 3 0.040723 0.018311
Data 0.005320 0.001238 0.008781 3 0.012411 -0.001771
Humidity 0.004958 0.000768 0.003948 3 0.009356 0.000560
TimeSunRise 0.003704 0.001029 0.012388 3 0.009599 -0.002192
TimeSunSet 0.003685 0.001873 0.038180 3 0.014417 -0.007047
Pressure 0.000460 0.000689 0.183299 3 0.004408 -0.003487
WindDirection(Degrees) 0.000051 0.001093 0.471685 3 0.006313 -0.006211
Speed 0.000006 0.000357 0.489087 3 0.002052 -0.002039
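
The scores suggest that Pressure, WindDirection(Degrees), and Speed contribute little for this predictor. As an illustration (not a recommendation), you could drop the near-zero-importance columns and refit to see whether the development score changes; a hedged sketch using an arbitrary 0.001 threshold:

# Keep features whose permutation importance exceeds a small (arbitrary) threshold,
# then train a new predictor on the reduced table and evaluate it.
selected = importance[importance['importance'] > 0.001].index.tolist()
predictor_reduced = TabularPredictor(
    label='Radiation', eval_metric='r2', path='ag_ec2021_demo2_reduced'
).fit(train_df[selected + ['Radiation']])
predictor_reduced.evaluate(dev_df[selected + ['Radiation']])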

More Information

You may check our website for more information and tutorials: https://auto.gluon.ai/. AutoGluon also supports automatically training models on text, image, and multimodal tabular data.

References

  1. Erickson, N., Mueller, J., Shirkov, A., Zhang, H., Larroy, P., Li, M., and Smola, A. (2020). AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data. https://arxiv.org/pdf/2003.06505.pdf