
Linear Regression with Theano

In [67]:
import numpy as np
import theano
import theano.tensor as T
import matplotlib.pyplot as plt

# train_x, train_y: training data (assumed loaded in an earlier cell)
plt.plot(train_x, train_y, 'b+')
plt.xlabel("x")
plt.ylabel("y")
Out[67]:
<matplotlib.text.Text at 0x10bfd7a10>

Inputs of the computational graph

In [68]:
# theano symbolic variables
x = T.vector("x")
target = T.vector("t") 

Linear Model

In [69]:
def linear_model(x, p0, p1):
    return p0 + x * p1

# parameters of the model are theano shared variables 
param0 = theano.shared(0., name="p0")
param1 = theano.shared(0., name="p1")

prediction = linear_model(x, param0, param1)
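
The model itself is plain arithmetic, so the same function can be sanity-checked with ordinary NumPy arrays (the input and parameter values below are made up for illustration):

```python
import numpy as np

def linear_model(x, p0, p1):
    # same formula as the Theano version, applied to a NumPy array
    return p0 + x * p1

x_demo = np.array([0.0, 1.0, 2.0])        # hypothetical inputs
y_demo = linear_model(x_demo, 1.0, 2.0)   # intercept 1, slope 2
print(y_demo)                             # → [1. 3. 5.]
```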
In [70]:
pics_file = "pics/linear_prediction.png"
theano.printing.pydotprint(prediction, outfile=pics_file, var_with_name_simple=True) 
#see http://deeplearning.net/software/theano/tutorial/printing_drawing.html
The output file is available at pics/linear_prediction.png
In [71]:
from IPython.display import Image
Image(filename=pics_file)
Out[71]:

Cost function

In [72]:
cost = T.mean(T.sqr(target - prediction))
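
This is the mean squared error. The same quantity in plain NumPy (target and prediction values below are made up):

```python
import numpy as np

target = np.array([1.0, 2.0, 3.0])        # hypothetical targets
prediction = np.array([1.5, 2.0, 2.0])    # hypothetical predictions

# same formula as T.mean(T.sqr(target - prediction))
cost = np.mean((target - prediction) ** 2)
print(cost)   # (0.25 + 0.0 + 1.0) / 3
```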

Gradient Descent

Update rule: $$ \vec \theta \leftarrow \vec \theta - \alpha \nabla J(\vec \theta) $$

  • Parameters as vector $\vec \theta$
  • Cost function $J(\vec \theta)$
  • Learning rate $\alpha$
  • Nabla-operator: $\nabla = (\partial/\partial \theta_1, \partial/\partial \theta_2, \dots)^T$

Parameter update during learning

$$ \theta_i \leftarrow \theta_i - \alpha \frac{\partial J(\vec \theta)}{\partial \theta_i} $$
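For the linear model and MSE cost these partial derivatives can also be written down by hand: $\partial J/\partial \theta_0 = -2\,\overline{(t - \hat y)}$ and $\partial J/\partial \theta_1 = -2\,\overline{(t - \hat y)\,x}$, where the bar denotes the mean over the training set. A NumPy sketch checking the hand-derived gradients against finite differences (the data values are made up):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])   # hypothetical inputs
t = np.array([1.0, 3.0, 5.0])   # hypothetical targets

def cost(p0, p1):
    return np.mean((t - (p0 + p1 * x)) ** 2)

def grads(p0, p1):
    # analytic gradients of the MSE cost -- what T.grad derives symbolically
    r = t - (p0 + p1 * x)        # residuals
    return -2 * r.mean(), -2 * (r * x).mean()

# finite-difference check at an arbitrary point
p0, p1, eps = 0.5, 0.5, 1e-6
g0, g1 = grads(p0, p1)
fd0 = (cost(p0 + eps, p1) - cost(p0 - eps, p1)) / (2 * eps)
fd1 = (cost(p0, p1 + eps) - cost(p0, p1 - eps)) / (2 * eps)
print(abs(g0 - fd0) < 1e-4, abs(g1 - fd1) < 1e-4)   # True True
```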

Calculations of Gradients in Theano

In [73]:
g_param0 = T.grad(cost=cost, wrt=param0)
g_param1 = T.grad(cost=cost, wrt=param1)
In [74]:
# learning rate 
alpha = 0.0005

# update rule - a step in gradient descent
updates = [(param0, param0 - alpha * g_param0),
           (param1, param1 - alpha * g_param1)]
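
Each pair in `updates` applies the update rule above to one shared variable. A single step in plain NumPy (the gradient values below are made up):

```python
import numpy as np

alpha = 0.0005
p0, p1 = 0.0, 0.0                 # initial parameters, as above
g_p0, g_p1 = -4.0, -6.0           # hypothetical gradient values

# theta <- theta - alpha * grad J(theta)
p0, p1 = p0 - alpha * g_p0, p1 - alpha * g_p1
print(p0, p1)
```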

Training function

In [75]:
train_func = theano.function(inputs=[x, target], 
                outputs=cost, updates=updates, 
                allow_input_downcast=True)

Training

In [76]:
nb_epochs = 500
cost_over_epochs = np.empty(nb_epochs)
for epoch in range(nb_epochs):
    # full batch 
    c = train_func(train_x, train_y)
    cost_over_epochs[epoch] = c
    
plt.plot(np.arange(nb_epochs), cost_over_epochs)
plt.xlabel("epochs")
plt.ylabel("cost")
Out[76]:
<matplotlib.text.Text at 0x10b429990>
In [77]:
plt.plot(train_x, train_y,'.')
p0 = param0.eval()
p1 = param1.eval()
plt.plot(np.array([-5., 5]), np.array([p0-5.*p1, p0+5.*p1]),'-')    
Out[77]:
[<matplotlib.lines.Line2D at 0x109cb5b10>]
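
Since `train_x` and `train_y` are not shown in these slides, here is a pure-NumPy sketch of the same training loop on synthetic stand-in data, checked against the closed-form least-squares fit; a larger learning rate than above is used so 500 epochs suffice on this data:

```python
import numpy as np

# synthetic stand-in for train_x / train_y (not shown in the slides)
rng = np.random.RandomState(0)
train_x = rng.uniform(-5.0, 5.0, 100)
train_y = 1.0 + 2.0 * train_x + rng.normal(0.0, 0.5, 100)

# same loop as the Theano version, with analytic MSE gradients
p0, p1, alpha = 0.0, 0.0, 0.01
for epoch in range(500):
    residual = train_y - (p0 + p1 * train_x)
    p0 -= alpha * (-2.0 * residual.mean())
    p1 -= alpha * (-2.0 * (residual * train_x).mean())

# compare with the closed-form least-squares fit
slope, intercept = np.polyfit(train_x, train_y, 1)
print(abs(p0 - intercept) < 0.05, abs(p1 - slope) < 0.05)   # True True
```

Gradient descent converges to the same line as the direct least-squares solution; the fitted intercept and slope should be close to the generating values 1 and 2.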