
Linear Regression with Theano

In [67]:
import numpy as np
import theano
import theano.tensor as T
import matplotlib.pyplot as plt

# train_x, train_y: training data (assumed loaded in an earlier cell)
plt.plot(train_x, train_y, 'b+')
plt.xlabel("x")
plt.ylabel("y")
Out[67]:
<matplotlib.text.Text at 0x10bfd7a10>

Inputs of the computational graph

In [68]:
# theano symbolic variables
x = T.vector("x")
target = T.vector("t") 

Linear Model

In [69]:
def linear_model(x, p0, p1):
    return p0 + x * p1

# parameters of the model are theano shared variables 
param0 = theano.shared(0., name="p0")
param1 = theano.shared(0., name="p1")

prediction = linear_model(x, param0, param1)
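
The model itself is plain arithmetic, so the same function can be sanity-checked with ordinary NumPy arrays (the input and parameter values below are made up for illustration):

```python
import numpy as np

def linear_model(x, p0, p1):
    # same formula as the Theano version, applied to a NumPy array
    return p0 + x * p1

x_demo = np.array([0.0, 1.0, 2.0])        # hypothetical inputs
y_demo = linear_model(x_demo, 1.0, 2.0)   # intercept 1, slope 2
print(y_demo)                             # → [1. 3. 5.]
```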
In [70]:
pics_file = "pics/linear_prediction.png"
theano.printing.pydotprint(prediction, outfile=pics_file, var_with_name_simple=True) 
#see http://deeplearning.net/software/theano/tutorial/printing_drawing.html
The output file is available at pics/linear_prediction.png
In [71]:
from IPython.display import Image
Image(filename=pics_file)
Out[71]:

Cost function

In [72]:
cost = T.mean(T.sqr(target - prediction))
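
This is the mean squared error. The same quantity in plain NumPy (target and prediction values below are made up):

```python
import numpy as np

target = np.array([1.0, 2.0, 3.0])        # hypothetical targets
prediction = np.array([1.5, 2.0, 2.0])    # hypothetical predictions

# same formula as T.mean(T.sqr(target - prediction))
cost = np.mean((target - prediction) ** 2)
print(cost)   # (0.25 + 0.0 + 1.0) / 3
```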

Gradient Descent

Update rule: $$ \vec \theta \leftarrow \vec \theta - \alpha \nabla J(\vec \theta) $$

  • Parameters as vector $\vec \theta$
  • Cost function $J(\vec \theta)$
  • Learning rate $\alpha$
  • Nabla-operator: $\nabla = (\partial/\partial \theta_1, \partial/\partial \theta_2, \dots)^T$

Parameter update during learning

$$ \theta_i \leftarrow \theta_i - \alpha \frac{\partial J(\vec \theta)}{\partial \theta_i} $$
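For the linear model and MSE cost these partial derivatives can also be written down by hand: $\partial J/\partial \theta_0 = -2\,\overline{(t - \hat y)}$ and $\partial J/\partial \theta_1 = -2\,\overline{(t - \hat y)\,x}$, where the bar denotes the mean over the training set. A NumPy sketch checking the hand-derived gradients against finite differences (the data values are made up):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])   # hypothetical inputs
t = np.array([1.0, 3.0, 5.0])   # hypothetical targets

def cost(p0, p1):
    return np.mean((t - (p0 + p1 * x)) ** 2)

def grads(p0, p1):
    # analytic gradients of the MSE cost -- what T.grad derives symbolically
    r = t - (p0 + p1 * x)        # residuals
    return -2 * r.mean(), -2 * (r * x).mean()

# finite-difference check at an arbitrary point
p0, p1, eps = 0.5, 0.5, 1e-6
g0, g1 = grads(p0, p1)
fd0 = (cost(p0 + eps, p1) - cost(p0 - eps, p1)) / (2 * eps)
fd1 = (cost(p0, p1 + eps) - cost(p0, p1 - eps)) / (2 * eps)
print(abs(g0 - fd0) < 1e-4, abs(g1 - fd1) < 1e-4)   # True True
```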

Calculations of Gradients in Theano

In [73]:
g_param0 = T.grad(cost=cost, wrt=param0)
g_param1 = T.grad(cost=cost, wrt=param1)
In [74]:
# learning rate 
alpha = 0.0005

# update rule - a step in gradient descent
updates = [(param0, param0 - alpha * g_param0),
           (param1, param1 - alpha * g_param1)]
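
Each pair in `updates` applies the update rule above to one shared variable. A single step in plain NumPy (the gradient values below are made up):

```python
import numpy as np

alpha = 0.0005
p0, p1 = 0.0, 0.0                 # initial parameters, as above
g_p0, g_p1 = -4.0, -6.0           # hypothetical gradient values

# theta <- theta - alpha * grad J(theta)
p0, p1 = p0 - alpha * g_p0, p1 - alpha * g_p1
print(p0, p1)
```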

Training function

In [75]:
train_func = theano.function(inputs=[x, target], 
                outputs=cost, updates=updates, 
                allow_input_downcast=True)

Training

In [76]:
nb_epochs = 500
cost_over_epochs = np.empty(nb_epochs)
for epoch in range(nb_epochs):
    # full batch 
    c = train_func(train_x, train_y)
    cost_over_epochs[epoch] = c
    
plt.plot(np.arange(nb_epochs), cost_over_epochs)
plt.xlabel("epochs")
plt.ylabel("cost")
Out[76]:
<matplotlib.text.Text at 0x10b429990>
In [77]:
plt.plot(train_x, train_y,'.')
p0 = param0.eval()
p1 = param1.eval()
plt.plot(np.array([-5., 5]), np.array([p0-5.*p1, p0+5.*p1]),'-')    
Out[77]:
[<matplotlib.lines.Line2D at 0x109cb5b10>]
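
Since `train_x` and `train_y` are not shown in these slides, here is a pure-NumPy sketch of the same training loop on synthetic stand-in data, checked against the closed-form least-squares fit; a larger learning rate than above is used so 500 epochs suffice on this data:

```python
import numpy as np

# synthetic stand-in for train_x / train_y (not shown in the slides)
rng = np.random.RandomState(0)
train_x = rng.uniform(-5.0, 5.0, 100)
train_y = 1.0 + 2.0 * train_x + rng.normal(0.0, 0.5, 100)

# same loop as the Theano version, with analytic MSE gradients
p0, p1, alpha = 0.0, 0.0, 0.01
for epoch in range(500):
    residual = train_y - (p0 + p1 * train_x)
    p0 -= alpha * (-2.0 * residual.mean())
    p1 -= alpha * (-2.0 * (residual * train_x).mean())

# compare with the closed-form least-squares fit
slope, intercept = np.polyfit(train_x, train_y, 1)
print(abs(p0 - intercept) < 0.05, abs(p1 - slope) < 0.05)   # True True
```

Gradient descent converges to the same line as the direct least-squares solution; the fitted intercept and slope should be close to the generating values 1 and 2.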