"# TP 2: Linear Algebra and Feedforward neural network\n",
"Master LiTL - 2023-2024\n",
"\n",
"## Requirements\n",
"In this section, we will go through some code to learn how to manipulate matrices and tensors, and we will take a look at some PyTorch code that allows to define, train and evaluate a simple neural network.\n",
"The modules used are the the same as in the previous session, *Numpy* and *Scikit*, with the addition of *PyTorch*. They are all already available within colab.\n",
"\n",
"## Part 1: Linear Algebra\n",
"\n",
"In this section, we will go through some python code to deal with matrices and also tensors, the data structures used in PyTorch.\n",
"\n",
"Sources: \n",
"* Linear Algebra explained in the context of deep learning: https://towardsdatascience.com/linear-algebra-explained-in-the-context-of-deep-learning-8fcb8fca1494\n",
"print(f\"Datatype of tensor: {tensor.dtype}\")\n",
"print(f\"Device tensor is stored on: {tensor.device}\")"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Shape of tensor: torch.Size([3, 4])\n",
"Datatype of tensor: torch.float32\n",
"Device tensor is stored on: cpu\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tu8RM6O7CaKO"
},
"source": [
"### 1.2.3 Move to GPU\n",
"\n",
"The code below is used to:\n",
"* check on which device the code is running, 'cuda' stands for GPU. If not GPU is found that we use CPU.\n",
"\n",
"\n",
"▶▶ **Check and move to GPU:**\n",
"* Run the code, it should say 'no cpu'\n",
"* Move to GPU: in Colab, allocate a GPU by going to Edit > Notebook Settings (Modifier > Paramètres du notebook)\n",
" * you'll see an indicator of connexion in the uppper right part of the screen\n",
"* Run the code from 1.2 again and the cell below (you can use the function Run / Run before or Exécution / Exécuter avant), you'll need to do all the imports again. You see the difference?"
"## Operations that have a _ suffix are in-place. For example: x.copy_(y), x.t_(), will change x.\n",
"print(tensor, \"\\n\")\n",
"tensor.add_(5)\n",
"print(tensor)"
],
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"tensor([[1., 0., 1., 1.],\n",
" [2., 0., 1., 1.],\n",
" [3., 0., 1., 1.],\n",
" [4., 0., 1., 1.]], device='cuda:0') \n",
"\n",
"tensor([[6., 5., 6., 6.],\n",
" [7., 5., 6., 6.],\n",
" [8., 5., 6., 6.],\n",
" [9., 5., 6., 6.]], device='cuda:0')\n"
]
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DGmy-dtuOtiw"
},
"source": [
"# Part 2: Feedforward Neural Network\n",
"\n",
"In this practical session, we will explore a simple neural network architecture for NLP applications ; specifically, we will train a feedforward neural network for sentiment analysis, using the same dataset of reviews as in the previous session. We will also keep the bag of words representation.\n",
"\n",
"\n",
"Sources:\n",
"* This TP is inspired by a TP by Tim van de Cruys\n",
"▶▶ **Create a dataset object within the PyTorch library:**\n",
"\n",
"The easiest way to load datasets with PyTorch is to use the DataLoader class. Here we're going to give our numpy array to this class, and first, we need to transform our data to tensors. Follow the following steps:\n",
"* 1- **torch.from_numpy( A_NUMPY_ARRAY )**: transform your array into a tensor\n",
" * Note: you need to transform tensor type to float (for x), with **MY_TENSOR.to(torch.float)** (or cryptic error saying it was expecting long...).\n",
" * Print the shape of the tensor for your training data.\n",
"For this TP, we're going to walk through the code of a **simple feedforward neural network, with one hidden layer**.\n",
"\n",
"This network takes as input bag of words vectors, exactly as our 'classic' models: each review is represented by a vector of the size the number of tokens in the vocabulary with '1' when a word is present and '0' for the other words."
]
},
{
"cell_type": "markdown",
"source": [
"### 2.2.1 Questions\n",
"\n",
"▶▶ **What is the input dimension?**\n",
"\n",
"▶▶ **What is the output dimension?**"
],
"metadata": {
"id": "5KOM7ofrKUte"
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "BSK0j8YASriA"
},
"source": [
"▶▶ **What is the input dimension?** --> MAX FEATURES = 5000\n",
"\n",
"▶▶ **What is the output dimension?** --> number of classes = 2"
]
},
{
"cell_type": "code",
"source": [
"# Useful imports\n",
"import torch\n",
"import torch.nn as nn"
],
"metadata": {
"id": "DiNm2XwlG2_0"
},
"execution_count": 11,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### 2.2.2 Write the skeleton of the class\n",
"\n",
"▶▶ We're going to **define our own neural network type**, by defining a new class:\n",
"* The class is called **FeedforwardNeuralNetModel**\n",
"* it inherits from the class **nn.Module**\n",
"* the constructor takes the following arguments:\n",
" * size of the input (i.e. **input_dim**)\n",
" * size of the hidden layer (i.e. **hidden_dim**)\n",
" * size of the output layer (i.e. **output_dim**)\n",
"* in the constructor, we will call the constructor of the parent class\n",
"\n"
],
"metadata": {
"id": "bE4RgHUkGnGl"
}
},
{
"cell_type": "code",
"source": [
"# Start to define the class corresponding to our type of neural network\n",
" # Linear function (readout) # LINEAR ==> y = h1.W2\n",
" out3 = self.fc2(out2)\n",
" return out3"
],
"execution_count": 12,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## 2.3 Training the network\n",
"\n",
"Now we can use our beautiful class to define and then train our own neural network."
],
"metadata": {
"id": "sBrDXfQbO5yq"
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "oWLDfLGxpBvn"
},
"source": [
"### 2.3.1 Hyper-parameters\n",
"\n",
"We need to set up the values for the hyper-parameters, and define the form of the loss and the optimization methods.\n",
"\n",
"▶▶ **Check that you understand what are each of the variables below**\n",
"* one that you prabably don't know is the learning rate, we'll explain it in the next course. Broadly speaking, it corresponds to the amount of update used during training."
]
},
{
"cell_type": "code",
"metadata": {
"id": "fcGyjXbUoxx9"
},
"source": [
"# Many choices here!\n",
"VOCAB_SIZE = MAX_FEATURES\n",
"input_dim = VOCAB_SIZE\n",
"hidden_dim = 4\n",
"output_dim = 2\n",
"num_epochs = 5\n",
"learning_rate = 0.1"
],
"execution_count": 13,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### 2.3.2 Loss function\n",
"\n",
"Another thing that has to be decided is the kind of loss function we want to use.\n",
"Here we use a common one, called CrossEntropy.\n",
"We will come back in more details on this loss.\n",
"One important note is that this function in PyTorch includes the SoftMax function that should be applied after the output layer to get labels."
],
"metadata": {
"id": "yyJINiVHPoWq"
}
},
{
"cell_type": "code",
"source": [
"criterion = nn.CrossEntropyLoss()"
],
"metadata": {
"id": "TVVy7hhrPl-K"
},
"execution_count": 14,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### 2.3.3 Initialization of the model\n",
"\n",
"Now you can instantiate your class: define a model that is of the type FeedforwardNeuralNetModel using the values defined before as hyper-parameters."
# TP 2: Linear Algebra and Feedforward neural network
Master LiTL - 2023-2024
## Requirements
In this section, we will go through some code to learn how to manipulate matrices and tensors, and we will take a look at some PyTorch code that allows us to define, train and evaluate a simple neural network.
The modules used are the same as in the previous session, *Numpy* and *Scikit*, with the addition of *PyTorch*. They are all already available within Colab.
## Part 1: Linear Algebra
In this section, we will go through some Python code to deal with matrices and tensors, the data structures used in PyTorch.
Sources:
* Linear Algebra explained in the context of deep learning: https://towardsdatascience.com/linear-algebra-explained-in-the-context-of-deep-learning-8fcb8fca1494
%% Cell type:markdown id: tags:
### 1.2.1 Creating tensors
%% Cell type:code id: tags:
```
# Creating tensors
import torch

## directly from data
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
print("x_data", x_data)
print("data type x_data=", x_data.dtype)

## with random or constant values
shape = (2, 3,) # shape is a tuple of tensor dimensions
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")
## from another tensor
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"\nFrom Ones Tensor: \n {x_ones} \n")
x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"From Random Tensor: \n {x_rand} \n")
```
%% Output
x_data tensor([[1, 2],
[3, 4]])
data type x_data= torch.int64
x_np tensor([[1, 2],
[3, 4]])
data type, np_array= int64 x_data= torch.int64
Random Tensor:
tensor([[0.4516, 0.4125, 0.0914],
[0.1381, 0.4802, 0.4308]])
Ones Tensor:
tensor([[1., 1., 1.],
[1., 1., 1.]])
Zeros Tensor:
tensor([[0., 0., 0.],
[0., 0., 0.]])
From Ones Tensor:
tensor([[1, 1],
[1, 1]])
From Random Tensor:
tensor([[0.8048, 0.0088],
[0.8002, 0.7587]])
%% Cell type:markdown id: tags:
### 1.2.2 Tensor attributes
▶▶ **A tensor has different attributes, print the values for:**
* shape of the tensor
* type of the data stored
* device on which data are stored
Look at the doc here: https://www.tensorflow.org/api_docs/python/tf/Tensor#shape
%% Cell type:code id: tags:
```
# Tensor attributes
tensor = torch.rand(3, 4)
print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
```
%% Output
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
%% Cell type:markdown id: tags:
### 1.2.3 Move to GPU
The code below is used to:
* check on which device the code is running: 'cuda' stands for GPU. If no GPU is found, we use the CPU.
▶▶ **Check and move to GPU:**
* Run the code: since no GPU is allocated yet, the tensor should stay on 'cpu'
* Move to GPU: in Colab, allocate a GPU by going to Edit > Notebook Settings (Modifier > Paramètres du notebook)
 * you'll see a connection indicator in the upper right part of the screen
* Run the code from 1.2 again as well as the cell below (you can use Run / Run before, or Exécution / Exécuter avant); you'll need to do all the imports again. Do you see the difference?
%% Cell type:code id: tags:
```
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
    print(f"Device tensor is stored on: {tensor.device}")

## Operations that have a _ suffix are in-place. For example: x.copy_(y), x.t_(), will change x.
print(tensor, "\n")
tensor.add_(5)
print(tensor)
```
%% Output
tensor([[1., 0., 1., 1.],
[2., 0., 1., 1.],
[3., 0., 1., 1.],
[4., 0., 1., 1.]], device='cuda:0')
tensor([[6., 5., 6., 6.],
[7., 5., 6., 6.],
[8., 5., 6., 6.],
[9., 5., 6., 6.]], device='cuda:0')
%% Cell type:markdown id: tags:
# Part 2: Feedforward Neural Network
In this practical session, we will explore a simple neural network architecture for NLP applications; specifically, we will train a feedforward neural network for sentiment analysis, using the same dataset of reviews as in the previous session. We will also keep the bag-of-words representation.
Sources:
* This TP is inspired by a TP by Tim van de Cruys
CountVectorizer returns sparse arrays (for computational reasons), but PyTorch will expect dense input:
%% Cell type:code id: tags:
```
# from sparse to dense
x_train = x_train.toarray()
x_dev = x_dev.toarray()
print("Train:", x_train.shape)
print("Dev:", x_dev.shape)
```
%% Output
Train: (5027, 5000)
Dev: (549, 5000)
%% Cell type:markdown id: tags:
#### 2.1.2 Transform to tensors
▶▶ **Create a dataset object within the PyTorch library:**
The easiest way to load datasets with PyTorch is to use the DataLoader class. Here we're going to feed our NumPy arrays to this class; first, we need to transform our data into tensors. Follow these steps:
%% Cell type:code id: tags:
```
# Useful imports
import torch
from torch.utils.data import TensorDataset, DataLoader
```
%% Cell type:markdown id: tags:
* 1- **torch.from_numpy( A_NUMPY_ARRAY )**: transform your array into a tensor
 * Note: you need to convert the tensor type to float (for x) with **MY_TENSOR.to(torch.float)** (otherwise you'll get a cryptic error saying it was expecting Long...).
 * Print the shape of the tensor for your training data (a minimal sketch is given below).
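A minimal sketch of this step, assuming **x_train** and **y_train** are NumPy arrays from the previous section (the batch size below is an arbitrary choice):
%% Cell type:code id: tags:
```
# From numpy arrays to tensors, then to a DataLoader
x_train_tensor = torch.from_numpy(x_train).to(torch.float)  # features must be float
y_train_tensor = torch.from_numpy(y_train)                   # labels stay as integers
print("Shape of the training tensor:", x_train_tensor.shape)

train_dataset = TensorDataset(x_train_tensor, y_train_tensor)
train_loader = DataLoader(train_dataset, batch_size=2, shuffle=True)
```
%% Cell type:markdown id: tags: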
For this TP, we're going to walk through the code of a **simple feedforward neural network, with one hidden layer**.
This network takes bag-of-words vectors as input, exactly like our 'classic' models: each review is represented by a vector whose size is the number of tokens in the vocabulary, with '1' when a word is present in the review and '0' for the other words.
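For instance, with a purely illustrative toy vocabulary of 5 tokens (the words below are hypothetical), a review containing only 'good' and 'great' would be encoded as:
%% Cell type:code id: tags:
```
# Toy illustration of a bag-of-words vector over a hypothetical 5-token vocabulary
import torch
vocabulary = ["good", "bad", "movie", "great", "boring"]
review_vector = torch.tensor([1., 0., 0., 1., 0.])  # 'good' and 'great' are present
print(review_vector.shape)  # one dimension per token of the vocabulary
```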
%% Cell type:markdown id: tags:
### 2.2.1 Questions
▶▶ **What is the input dimension?**
▶▶ **What is the output dimension?**
%% Cell type:markdown id: tags:
▶▶ **What is the input dimension?** --> MAX_FEATURES = 5000
▶▶ **What is the output dimension?** --> number of classes = 2
%% Cell type:code id: tags:
```
# Useful imports
import torch
import torch.nn as nn
```
%% Cell type:markdown id: tags:
### 2.2.2 Write the skeleton of the class
▶▶ We're going to **define our own neural network type**, by defining a new class:
* The class is called **FeedforwardNeuralNetModel**
* it inherits from the class **nn.Module**
* the constructor takes the following arguments:
* size of the input (i.e. **input_dim**)
* size of the hidden layer (i.e. **hidden_dim**)
* size of the output layer (i.e. **output_dim**)
* in the constructor, we will call the constructor of the parent class
%% Cell type:code id: tags:
```
# Start to define the class corresponding to our type of neural network
# Linear function (readout) # LINEAR ==> y = h1.W2
out3 = self.fc2(out2)
return out3
```
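%% Cell type:markdown id: tags:
For reference, here is a minimal sketch of what the complete class could look like (the ReLU non-linearity is an assumption; the intermediate variable names mirror the fragment above):
%% Cell type:code id: tags:
```
import torch.nn as nn

class FeedforwardNeuralNetModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        # Call the constructor of the parent class
        super(FeedforwardNeuralNetModel, self).__init__()
        # Linear function # LINEAR ==> h1 = x.W1
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        # Non-linearity (assumption: ReLU)
        self.relu = nn.ReLU()
        # Linear function (readout) # LINEAR ==> y = h1.W2
        self.fc2 = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out1 = self.fc1(x)      # linear
        out2 = self.relu(out1)  # non-linearity
        out3 = self.fc2(out2)   # linear function (readout)
        return out3
```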
%% Cell type:markdown id: tags:
## 2.3 Training the network
Now we can use our beautiful class to define and then train our own neural network.
%% Cell type:markdown id: tags:
### 2.3.1 Hyper-parameters
We need to set up the values for the hyper-parameters, and define the form of the loss and the optimization methods.
▶▶ **Check that you understand what each of the variables below stands for**
* one that you probably don't know yet is the learning rate; we'll explain it in the next course. Broadly speaking, it controls the size of the updates made during training.
%% Cell type:code id: tags:
```
# Many choices here!
VOCAB_SIZE = MAX_FEATURES
input_dim = VOCAB_SIZE
hidden_dim = 4
output_dim = 2
num_epochs = 5
learning_rate = 0.1
```
%% Cell type:markdown id: tags:
### 2.3.2 Loss function
Another thing that has to be decided is the kind of loss function we want to use.
Here we use a common one, called CrossEntropy.
We will come back to this loss in more detail later.
One important note: in PyTorch, this loss function already includes the SoftMax that is applied to the output of the network to obtain the label probabilities.
%% Cell type:code id: tags:
```
criterion = nn.CrossEntropyLoss()
```
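%% Cell type:markdown id: tags:
As a quick illustration (the numbers below are arbitrary), the loss takes the raw scores (logits) produced by the network and the gold labels given as integers:
%% Cell type:code id: tags:
```
# Toy example: CrossEntropyLoss expects raw scores and integer class labels
logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])  # scores for 2 examples, 2 classes
gold = torch.tensor([0, 1])                       # gold class indices
print(criterion(logits, gold))                    # average loss over the 2 examples
```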
%% Cell type:markdown id: tags:
### 2.3.3 Initialization of the model
Now you can instantiate your class: define a model of type FeedforwardNeuralNetModel, using the values defined before as hyper-parameters.
%% Cell type:code id: tags:
```
# Initialization of the model
# ...
```
%% Cell type:code id: tags:
```
# Initialization of the model
model = FeedforwardNeuralNetModel(input_dim, hidden_dim, output_dim)
```
%% Cell type:markdown id: tags:
### 2.3.4 Optimizer
Finally, we need to indicate the method we want to use to optimize our network.
Here, we use a common one called Stochastic Gradient Descent.
We will also come back to it later on.
Note that its arguments are:
* the parameters of our models (the Ws)
* the learning rate
Based on this information, it can make the necessary updates.
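A minimal sketch of what this could look like (assuming **model** and **learning_rate** from the previous cells):
%% Cell type:code id: tags:
```
# Stochastic Gradient Descent over the parameters (the Ws) of our model
import torch.optim as optim

optimizer = optim.SGD(model.parameters(), lr=learning_rate)
```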