# LSTM Tutorial for SMAP Soil Moisture

This tutorial shows how to train and test an LSTM model for SMAP soil moisture prediction over CONUS.

## Training part

- **Load needed packages**

In [1]:
import sys
sys.path.append('../')
import os
import torch
import numpy as np
from hydroDL.master import default, wrapMaster, train
from hydroDL.data import dbCsv
from hydroDL.post import plot, stat
from hydroDL import master

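If this cell fails with `ModuleNotFoundError: No module named 'torch'`, PyTorch is not installed in the current environment; install it before continuing.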

- **Define training options**

In [None]:
cDir = os.getcwd()
# define training options
optData = default.update(
    default.optDataSMAP,
    rootDB=os.path.join(cDir, 'data'),
    subset='CONUSv4f1',
    tRange=[20150401, 20160401])
if torch.cuda.is_available():
    optModel = default.optLstm
else:
    optModel = default.update(
        default.optLstm,
        name='hydroDL.model.rnn.CpuLstmModel')
optLoss = default.optLossRMSE
optTrain = default.update(
    default.optTrainSMAP, 
    nEpoch=100,
    saveEpoch=50)
out = os.path.join(cDir, 'output', 'CONUSv4f1')
masterDict = wrapMaster(out, optData, optModel, optLoss, optTrain)
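
The `tRange` values above appear to be dates written as `YYYYMMDD` integers. As a quick sanity check of the training window, the following minimal sketch parses them with the standard library and counts the days between the two endpoints. The helper `parse_yyyymmdd` is hypothetical and is not part of hydroDL; how hydroDL itself interprets the endpoints is not shown here.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Convert an integer such as 20150401 into a datetime.date (hypothetical helper)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Same values as the training tRange defined above
train_range = [20150401, 20160401]
start, end = (parse_yyyymmdd(v) for v in train_range)

# Number of days between the two endpoints
n_days = (end - start).days
print(start, end, n_days)
```

The window spans 29 February 2016, so the difference is 366 days rather than 365.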

- **Train the LSTM model**

In [None]:
train(masterDict)

## Testing part

**This part tests the trained model and plots the results. By default, the model is saved in [output/CONUSv4f1/](output/CONUSv4f1/).**

- **Define test options**

In [None]:
out = os.path.join(cDir, 'output', 'CONUSv4f1')
rootDB = os.path.join(cDir, 'data')
nEpoch = 100
tRange = [20160401, 20170401]

- **Test the model on another year**

In [None]:
# reuse the test period defined in the previous cell
df, yp, yt = master.test(
    out, tRange=tRange, subset='CONUSv4f1', epoch=nEpoch, reTest=True)
yp = yp.squeeze()
yt = yt.squeeze()
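
The `squeeze()` calls above drop axes of length 1. The sketch below illustrates this on a NumPy array, assuming (this is an assumption, not confirmed by the tutorial) that the predictions come back with shape `(sites, time steps, 1)`. The array sizes are made up for illustration.

```python
import numpy as np

# Hypothetical prediction array: 5 sites, 366 time steps, 1 target variable
pred = np.zeros((5, 366, 1))

# squeeze() removes every axis of length 1
squeezed = pred.squeeze()
print(squeezed.shape)

# With a single site, squeeze() would also drop the site axis;
# squeeze(axis=-1) removes only the trailing variable axis
single_site = np.zeros((1, 366, 1))
print(single_site.squeeze().shape, single_site.squeeze(axis=-1).shape)
```

If the test subset could contain only one site, `squeeze(axis=-1)` is the safer choice because it keeps the array two-dimensional.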

- **Calculate statistical metrics**

In [None]:
# calculate statistics
statErr = stat.statError(yp, yt)
# Box plots to show the test results
statDictLst = [statErr]
keyLst = ['Bias', 'RMSE', 'ubRMSE', 'Corr']
# for each metric, collect the NaN-free values from every statistics dict
dataBox = [
    [statDict[key][~np.isnan(statDict[key])] for statDict in statDictLst]
    for key in keyLst
]
%matplotlib notebook
fig = plot.plotBoxFig(dataBox, label1=keyLst, sharey=False, figsize=(12, 5))
fig.patch.set_facecolor('white')
fig.show()
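
For readers unfamiliar with the four metrics in the box plots, the sketch below computes them per site with NumPy using their usual definitions. It is an illustration only: `error_metrics` is a hypothetical function, the data are synthetic, and `stat.statError` may differ in details such as NaN handling.

```python
import numpy as np


def error_metrics(pred, obs):
    """Per-site metrics for arrays shaped (sites, time steps); NaN steps are ignored."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Drop time steps where either series is missing
    mask = np.isnan(pred) | np.isnan(obs)
    pred = np.where(mask, np.nan, pred)
    obs = np.where(mask, np.nan, obs)

    diff = pred - obs
    bias = np.nanmean(diff, axis=1)
    rmse = np.sqrt(np.nanmean(diff ** 2, axis=1))

    # Unbiased RMSE: remove each series' mean before comparing
    pred_anom = pred - np.nanmean(pred, axis=1, keepdims=True)
    obs_anom = obs - np.nanmean(obs, axis=1, keepdims=True)
    ubrmse = np.sqrt(np.nanmean((pred_anom - obs_anom) ** 2, axis=1))

    # Pearson correlation from the anomalies
    corr = np.nansum(pred_anom * obs_anom, axis=1) / np.sqrt(
        np.nansum(pred_anom ** 2, axis=1) * np.nansum(obs_anom ** 2, axis=1))
    return {'Bias': bias, 'RMSE': rmse, 'ubRMSE': ubrmse, 'Corr': corr}


# Synthetic example: two sites, predictions offset from observations by 0.05
obs = np.array([[0.10, 0.20, 0.30, np.nan],
                [0.25, 0.20, 0.15, 0.10]])
pred = obs + 0.05
metrics = error_metrics(pred, obs)
print(metrics)
```

With a constant offset, the bias and RMSE both equal the offset, while ubRMSE is essentially zero and the correlation is 1. In general, under these definitions, RMSE² = Bias² + ubRMSE².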

- **Plot an interactive map; click a location on the map to show the time series of observations and model predictions.**

In [None]:
dataGrid = [statErr['RMSE'], statErr['Corr']]
dataTs = [yp, yt]
t = df.getT()
crd = df.getGeo()
mapNameLst = ['RMSE', 'Correlation']
tsNameLst = ['LSTM', 'SMAP']
colorMap = None
colorTs = None
# plot map and time series
%matplotlib notebook
plot.plotTsMap(
    dataGrid,
    dataTs,
    lat=crd[0],
    lon=crd[1],
    t=t,
    mapNameLst=mapNameLst,
    tsNameLst=tsNameLst,
    isGrid=True)