diff --git a/notebooks/TP4_m2LiTL_EmbeddingsWithNN_CORRECT_2324.ipynb b/notebooks/TP4_m2LiTL_EmbeddingsWithNN_CORRECT_2324.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..34343bddec459d3c475f97abdc0e31ad57270ca7 --- /dev/null +++ b/notebooks/TP4_m2LiTL_EmbeddingsWithNN_CORRECT_2324.ipynb @@ -0,0 +1,3685 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "cell_type": "markdown", + "source": [ + "# TP 4 : machine learning using neural networks for text data\n", + "\n", + "In this practical session, we are going to build simple neural models able to classify reviews as positive or negative. The dataset used comes from AlloCine.\n", + "The goals are to understand how to use pretrained embeddings, and to correctly tune a neural model.\n", + "\n", + "You need to load:\n", + "- Allocine: Train, dev and test sets\n", + "- Embeddings: cc.fr.300.10000.vec (first 10,000 lines of the original file)\n", + "\n", + "## Part 1- Pre-trained word embeddings\n", + "Define a neural network that takes as input pre-trained word embeddings (here FastText embeddings). Words are represented by real-valued vectors from FastText. A review is represented by a vector that is the average or the sum of the word vectors.\n", + "\n", + "So instead of having an input vector of size 5000, we now have an input vector of size e.g. 300 that represents the ‘average’, combined meaning of all the words in the document taken together.\n", + "\n", + "## Part 2- Tuning report\n", + "Tune the model built on pre-trained word embeddings by testing several values for the different hyper-parameters, and by testing the addition of a hidden layer.\n", + "\n", + "Describe the performance obtained by reporting the scores for each setting on the development set, plotting the loss against the hyper-parameter values, and reporting the score of the best model on the test set.\n", + "\n", + "-------------------------------------" + ], + "metadata": { + "id": "jShhTl5Mftkw" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Useful imports\n", + "\n", + "Here we also:\n", + "* Look at the availability of a GPU.
Reminder: in Colab, you have to go to Edit/Notebook settings to set the use of a GPU\n", + "* Set a seed, for reproducibility: https://pytorch.org/docs/stable/notes/randomness.html\n" + ], + "metadata": { + "id": "mT2uF3G6HXko" + } + }, + { + "cell_type": "code", + "source": [ + "import time\n", + "import pandas as pd\n", + "import numpy as np\n", + "# torch and torch modules to deal with text data\n", + "import torch\n", + "import torch.nn as nn\n", + "from torchtext.data.utils import get_tokenizer\n", + "from torchtext.vocab import build_vocab_from_iterator\n", + "from torch.utils.data import DataLoader\n", + "# you can use scikit to print scores\n", + "from sklearn.metrics import classification_report\n", + "\n", + "# For reproducibility, set a seed\n", + "torch.manual_seed(0)\n", + "\n", + "# Check for GPU\n", + "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n", + "print(device)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "nB_k89m8xAOt", + "outputId": "0770ab66-263a-4895-b03c-e5196b865cfc" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "cuda\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "Paths to data:" + ], + "metadata": { + "id": "taGY9N-PJvWS" + } + }, + { + "cell_type": "code", + "source": [ + "# Data files\n", + "train_file = \"allocine_train.tsv\"\n", + "dev_file = \"allocine_dev.tsv\"\n", + "test_file = \"allocine_test.tsv\"\n", + "# embeddings\n", + "embed_file='cc.fr.300.10000.vec'" + ], + "metadata": { + "id": "kGty4hWCJurB" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Part 0: Read and load the data\n", + "\n", + "As a reminder from TP1, the simplest solution is to use the DataLoader from PyTorch:\n", + "* the doc here https://pytorch.org/docs/stable/data.html and here https://pytorch.org/tutorials/beginner/basics/data_tutorial.html\n", + "* an example of use, with numpy arrays: https://www.kaggle.com/arunmohan003/sentiment-analysis-using-lstm-pytorch\n", + "\n", + "Here, we are going to define our own Dataset class instead of using numpy arrays.
It allows for a finer definition of the behavior of our dataset, and it's easy to reuse.\n", + "* Dataset is an abstract class in PyTorch, meaning it can't be used as is; it has to be redefined using inheritance https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset\n", + " * you must at least overwrite the ```__getitem__()``` method, supporting fetching a data sample for a given key.\n", + " * in practice, you also overwrite the ```__init__()``` to explain how to initialize the dataset, and the ```__len__``` to return the right size for the dataset\n", + "\n", + "You can also find many datasets for text ready to load in PyTorch on: https://pytorch.org/text/stable/datasets.html" + ], + "metadata": { + "id": "Wv6H41YoFycw" + } + }, + { + "cell_type": "markdown", + "source": [ + "### 0.1 Load data (code given)\n", + "\n", + "Read the code below that loads the data. Note that:\n", + "- we tokenize the text (here a simple tokenization based on spaces)\n", + "- we build the vocabulary corresponding to the training data:\n", + " - the vocabulary corresponds to the set of unique tokens\n", + " - only tokens in the training data are known by the system\n", + " - the vocabulary here is a Torch-specific object, more details in section 0.4 below\n", + "\n", + "▶▶ **Question:** why do we use only tokens in the training set to build the vocabulary? What do we do with the dev and test sets?" + ], + "metadata": { + "id": "04vEei9QHPou" + } + }, + { + "cell_type": "code", + "source": [ + "# Here we create a custom Dataset class that inherits from the Dataset class in PyTorch\n", + "# A custom Dataset class must implement three functions: __init__, __len__, and __getitem__\n", + "\n", + "\n", + "class Dataset(torch.utils.data.Dataset):\n", + "\n", + " def __init__(self, tsv_file, vocab=None ):\n", + " \"\"\" (REQUIRED) Here we save the location of our input file,\n", + " load the data, i.e. retrieve the list of texts and associated labels,\n", + " build the vocabulary if none is given,\n", + " and define the pipelines used to prepare the data \"\"\"\n", + " self.tsv_file = tsv_file\n", + " self.data, self.label_list = self.load_data( )\n", + " # split the sentence string on spaces (couldn't make the French tokenizer work)\n", + " self.tokenizer = get_tokenizer( None )\n", + " self.vocab = vocab\n", + " if not vocab:\n", + " self.build_vocab()\n", + " # pipelines for text and label\n", + " self.text_pipeline = lambda x: self.vocab(self.tokenizer(x)) #return a list of indices from a text\n", + " self.label_pipeline = lambda x: int(x) #simple mapping to self\n", + "\n", + " def load_data( self ):\n", + " \"\"\" Read a tsv file and return the list of texts and associated labels\"\"\"\n", + " data = pd.read_csv( self.tsv_file, header=0, delimiter=\"\\t\", quoting=3)\n", + " instances = []\n", + " label_list = []\n", + " for i in data.index:\n", + " label_list.append( data[\"sentiment\"][i] )\n", + " instances.append( data[\"review\"][i] )\n", + " return instances, label_list\n", + "\n", + " def build_vocab(self):\n", + " \"\"\" Build the vocabulary, i.e. retrieve the list of unique tokens\n", + " appearing in the corpus (= training set). We also add a specific index\n", + " corresponding to unknown words. \"\"\"\n", + " self.vocab = build_vocab_from_iterator(self.yield_tokens(), specials=[\"<unk>\"])\n", + " self.vocab.set_default_index(self.vocab[\"<unk>\"])\n", + "\n", + " def yield_tokens(self):\n", + " \"\"\" Iterator on tokens \"\"\"\n", + " for text in self.data:\n", + " yield self.tokenizer(text)\n", + "\n", + " def __len__(self):\n", + " \"\"\" (REQUIRED) Return the length of the data,\n", + " i.e. the total number of instances \"\"\"\n", + " return len(self.data)\n", + "\n", + " def __getitem__(self, index):\n", + " \"\"\" (REQUIRED) Return a specific instance in a format that can be\n", + " processed by PyTorch, i.e. torch tensors \"\"\"\n", + " return (\n", + " tuple( [torch.tensor(self.text_pipeline( self.data[index] ), dtype=torch.int64),\n", + " torch.tensor( self.label_pipeline( self.label_list[index] ), dtype=torch.int64) ] )\n", + " )" + ], + "metadata": { + "id": "GdK1WAmcFYHS" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### 0.2 Generate data batches and iterator (code given)\n", + "\n", + "Then, we use *torch.utils.data.DataLoader* with a Dataset object as built by the code above. DataLoader has an argument to set the size of the batches, but since we have variable-size input sequences, we need to specify how to build the batches. This is done by redefining the function *collate_fn* used by *DataLoader*.\n", + "\n", + "```\n", + "dataloader = DataLoader(dataset, batch_size=8, shuffle=False, collate_fn=collate_fn)\n", + "```\n", + "\n", + "Below:\n", + "* the text entries in the original data batch input are packed into a list and concatenated as a single tensor.\n", + "* the offset is a tensor of delimiters to represent the beginning index of the individual sequence in the text tensor\n", + "* Label is a tensor saving the labels of individual text entries.\n", + "\n", + "The offsets are used to retrieve the individual sequences in each batch (the sequences are concatenated)." + ], + "metadata": { + "id": "bG3T9LQFTD73" + } + }, + { + "cell_type": "code", + "source": [ + "# This function explains how we process data to make batches of instances\n", + "# - The list of texts / reviews that is returned is similar to a list of lists:\n", + "# each element is a batch, i.e. a set of BATCH_SIZE texts.
But instead of\n", + "# creating sublists, PyTorch concatenates all the tensors corresponding to\n", + "# each text sequence into one tensor.\n", + "# - The list of labels gathers the labels of the instances in each batch\n", + "# - The offsets are used to save the position of each individual instance\n", + "# within the big tensor\n", + "def collate_fn(batch):\n", + " label_list, text_list, offsets = [], [], [0]\n", + " for ( _text, _label) in batch:\n", + " text_list.append( _text )\n", + " label_list.append( _label )\n", + " offsets.append(_text.size(0))\n", + " label = torch.tensor(label_list, dtype=torch.int64) #tensor of labels for a batch\n", + " offsets = torch.tensor(offsets[:-1]).cumsum(dim=0) #tensor of offset indices for a batch\n", + " text_list = torch.cat(text_list) # <--- here we concatenate the reviews in the batch\n", + " return text_list.to(device), label.to(device), offsets.to(device) #move the data to GPU" + ], + "metadata": { + "id": "oG0ZEYvYccBr" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### 0.3 Exercise: Load the data\n", + "\n", + "* Use the code above to load the training and dev data with a batch size of 2:\n", + " * First create an instance of the Dataset class\n", + " * Then use this instance to create an instance of the DataLoader class with a batch size of 2, with NO shuffling of the samples, and using the *collate_fn* function defined above. Recall that the DataLoader class has the following parameters:\n", + " ```\n", + " torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=None, collate_fn=None)\n", + " ```\n", + "* Print the first two elements in the Dataset object built on the train set, and the first element in the DataLoader object built on the train set. Also print the associated labels. Does it seem coherent?\n", + "\n", + "Once you have checked that it seems OK, reload the data, but this time shuffle the data during loading.\n",
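+ "\n", + "Worked example (the sizes here are just illustrative): if the first review in a batch has 107 tokens and the second has 50, *collate_fn* should return one concatenated tensor of 157 token indices, a label tensor of size 2, and an offsets tensor([0, 107]) marking where each review starts."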
+ ], + "metadata": { + "id": "U0ueXxdpZcqx" + } + }, + { + "cell_type": "code", + "source": [ + "# Load the training and development data\n" + ], + "metadata": { + "id": "hmNi9Zmla6CJ" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----------------------------------------------\n", + "SOLUTION" + ], + "metadata": { + "id": "GJFYwcRza7_M" + } + }, + { + "cell_type": "code", + "source": [ + "# Load the training and development data\n", + "train = Dataset( train_file )\n", + "dev = Dataset( dev_file, vocab=train.vocab )\n", + "\n", + "train_loader = DataLoader(train, batch_size=1, shuffle=False, collate_fn=collate_fn) #<-- use shuffle = True instead\n", + "dev_loader = DataLoader(dev, batch_size=1, shuffle=False, collate_fn=collate_fn)\n", + "\n", + "\n", + "print(train[0])\n", + "print(train[1])\n", + "for input, label, offset in train_loader:\n", + " print( input, label, input.size(), offset )\n", + " break" + ], + "metadata": { + "id": "sGAiiL2rY7hD", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "50d0ba12-2f2c-4448-90a2-13ca3de04c06" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "(tensor([ 2281, 2675, 374, 28, 13940, 15, 11282, 18, 3936, 203,\n", + " 1, 7998, 4, 307, 1114, 9134, 4495, 1, 92, 8752,\n", + " 24, 104, 28, 117, 53, 638, 8, 418, 23, 23816,\n", + " 904, 1378, 1, 126, 8, 1, 86, 108, 46, 4622,\n", + " 34, 2719, 91, 203, 49, 121, 2, 49, 1179, 113,\n", + " 111, 50, 136, 70, 3190, 19, 11708, 5, 12735, 91,\n", + " 7, 47, 431, 1498, 177, 4, 2738, 4, 550, 2,\n", + " 4, 46, 7858, 49, 1244, 5, 6791, 1220, 2, 6,\n", + " 376, 34, 345, 9, 593, 1158, 233, 2191, 31216, 33258,\n", + " 2822, 1486, 23, 219, 1, 3, 7, 2187, 112, 17,\n", + " 129, 37130, 1, 2845, 93, 95, 8111]), tensor(0))\n", + "(tensor([18487, 54, 7, 5, 8463, 159, 6042, 2, 12809, 12,\n", + " 30, 1385, 107, 14, 397, 8726, 1, 4654, 1, 6883,\n", + " 1, 12997, 43, 333, 22, 37, 149, 33, 532, 25,\n", + " 134, 4031, 31, 13, 283, 2584, 19, 4850, 12, 5501,\n", + " 270, 14, 6159, 5, 3, 121, 1, 3, 48, 2651]), tensor(1))\n", + "tensor([ 2281, 2675, 374, 28, 13940, 15, 11282, 18, 3936, 203,\n", + " 1, 7998, 4, 307, 1114, 9134, 4495, 1, 92, 8752,\n", + " 24, 104, 28, 117, 53, 638, 8, 418, 23, 23816,\n", + " 904, 1378, 1, 126, 8, 1, 86, 108, 46, 4622,\n", + " 34, 2719, 91, 203, 49, 121, 2, 49, 1179, 113,\n", + " 111, 50, 136, 70, 3190, 19, 11708, 5, 12735, 91,\n", + " 7, 47, 431, 1498, 177, 4, 2738, 4, 550, 2,\n", + " 4, 46, 7858, 49, 1244, 5, 6791, 1220, 2, 6,\n", + " 376, 34, 345, 9, 593, 1158, 233, 2191, 31216, 33258,\n", + " 2822, 1486, 23, 219, 1, 3, 7, 2187, 112, 17,\n", + " 129, 37130, 1, 2845, 93, 95, 8111], device='cuda:0') tensor([0], device='cuda:0') torch.Size([107]) tensor([0], device='cuda:0')\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "### 0.4 Exercise: understand the Vocab object\n", + "\n", + "Here the **vocabulary** is a specific object in Pytorch: https://pytorch.org/text/stable/vocab.html\n", + "\n", + "For example, the vocabulary directly converts a list of tokens into integers, see below.\n", + "\n", + "Now try to:\n", + "* Retrieve the indices of a specific word, e.g. 'mauvais'\n", + "* Retrieve a word from its index, e.g. 
368\n", + "* You can also directly convert a sentence to a list of indices, using the *text_pipeline* defined in the *Dataset* class, try with:\n", + " * 'Avant cette série, je ne connaissais que Urgence'\n", + " * 'Avant cette gibberish, je ne connaissais que Urgence'\n", + " * what happens when you use a word that is unknown?" + ], + "metadata": { + "id": "Tus9Kedas5dq" + } + }, + { + "cell_type": "markdown", + "source": [ + "Hints: look at these functions:\n", + "* lookup_indices(tokens: List[str]) → List[int]\n", + "* lookup_token(index: int) → str" + ], + "metadata": { + "id": "BR-hQMJlUfPR" + } + }, + { + "cell_type": "code", + "source": [ + "train.vocab(['Avant', 'cette', 'série', ','])" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "tb6TYA9Is5v6", + "outputId": "efc61903-0398-4ee4-b68f-aa64f61f7d89" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[2910, 18, 7, 144]" + ] + }, + "metadata": {}, + "execution_count": 8 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "------------------------------------\n", + "SOLUTION\n", + "\n", + "You can use it to retrieve the index of a specific word, e.g. 'mauvais'." + ], + "metadata": { + "id": "3aAwvzFavjIY" + } + }, + { + "cell_type": "code", + "source": [ + "print( train.vocab.lookup_indices( ['mauvais'] ))" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "k9cKqyj3vjT8", + "outputId": "8309230d-321c-48dc-9327-ee36957eaf17" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "[246]\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "print( train.vocab.lookup_token( 368 ) )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "ATxVspC0bBO1", + "outputId": "82218041-25e7-4238-88cf-4f49f6b202d7" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "pas,\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "train.text_pipeline('Avant cette série, je ne connaissais que Urgence')" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "6i4C4sdmbN7N", + "outputId": "9fa4b33d-91e9-4e47-8480-b6eaade7ef3c" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[2910, 18, 89, 16, 17, 6120, 8, 10529]" + ] + }, + "metadata": {}, + "execution_count": 10 + } + ] + }, + { + "cell_type": "code", + "source": [ + "train.text_pipeline('Avant cette gibberish, je ne connaissais que Urgence')" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "8x32L7mVbN8p", + "outputId": "eb6f5a32-3c88-4a07-b307-a886fce467db" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "[2910, 18, 0, 16, 17, 6120, 8, 10529]" + ] + }, + "metadata": {}, + "execution_count": 11 + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Part 1- Using pretrained embeddings\n", + "\n", + "The first option would be to use randomly initialized word embeddings.\n", + "This allows the use of dense, real-valued inputs that can be updated during training.\n", + "However, we probably don't have enough data to build good representations for our problem during training.\n", + "One solution is to use pre-trained word embeddings, built over very big corpora with the aim of building good generic representations of the meaning of words.\n",
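+ "\n", + "As a minimal sketch of the idea (here `review_vector` is a hypothetical helper, and `vectors` is assumed to be the word-to-vector dictionary loaded in section 1.1 below), averaging word vectors into a review vector could look like:\n", + "\n", + "```\n", + "import numpy as np\n", + "\n", + "def review_vector(tokens, vectors, dim=300):\n", + "    # average the vectors of the known words; zero vector if none is known\n", + "    known = [vectors[t] for t in tokens if t in vectors]\n", + "    return np.mean(known, axis=0) if known else np.zeros(dim)\n", + "```\n",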
+ "\n", + "Upload the file *cc.fr.300.10000.vec*: first 10,000 lines of the FastText embeddings for French, https://fasttext.cc/docs/en/crawl-vectors.html.\n", + "\n", + "* **Each word is associated with a real-valued and low-dimensional vector** (e.g. 300 dimensions). Crucially, the neural network will also learn / update the embeddings during training (if not frozen): the embeddings of the network are also parameters that are optimized according to the loss function, allowing the model to learn a better representation of the words.\n", + "\n", + "* And **each review is represented by a vector** that should represent all the words it contains. One way to do that is to use **the average of the word vectors** (another typical option is to sum them). Instead of a bag-of-words representation of thousands of dimensions (the size of the vocabulary), we will thus end up with an input vector of size e.g. 300 that represents the ‘average’, combined meaning of all the words in the document taken together." + ], + "metadata": { + "id": "UDlM7OZq56HO" + } + }, + { + "cell_type": "markdown", + "source": [ + "### 1.1 Load the vectors (code given)\n", + "\n", + "The function below loads the pre-trained embeddings, returning a dictionary mapping a word to its vector, as defined in the FastText file.\n", + "\n", + "Note that the first line of the file gives the number of unique tokens (2,000,000 in the original file; here we only have 9,999 tokens) and the size of the embeddings.\n", + "\n", + "At the end, we print the vocabulary and the vector for a specific token." + ], + "metadata": { + "id": "RX2DkAqws1gU" + } + }, + { + "cell_type": "code", + "source": [ + "import io\n", + "\n", + "def load_vectors(fname):\n", + " fin = io.open(fname, 'r', encoding='utf-8', newline='\\n', errors='ignore')\n", + " n, d = map(int, fin.readline().split())\n", + " print(\"Originally we have: \", n, 'tokens, and vectors of',d, 'dimensions') #here in fact only 9999 words\n", + " data = {}\n", + " for line in fin:\n", + " tokens = line.rstrip().split(' ')\n", + " data[tokens[0]] = [float(t) for t in tokens[1:]]\n", + " return data\n", + "\n", + "vectors = load_vectors( embed_file )\n", + "print( 'Version with', len( vectors), 'tokens')\n", + "print(vectors.keys() )\n", + "print( vectors['de'] )" + ], + "metadata": { + "id": "yd2EEjECv4vk", + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "adf05aec-8a74-48cc-dc43-f44c8e011f4a" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Originally we have: 2000000 tokens, and vectors of 300 dimensions\n", + "Version with 9999 tokens\n", + "dict_keys([',', 'de', '.', '</s>', 'la', 'et', ':', 'à', 'le', '\"', 'en', '’', 'les', 'des', ')', '(', 'du', 'est', 'un', \"l'\", \"d'\", 'une', 'pour', '/', '|', 'dans', 'sur', 'que', 'par', 'au', 'a', 'l', 'qui', '-', 'd', 'il', 'pas', '!', 'avec', '_', 'plus', \"'\", 'Le', 'ce', 'ou', 'La', 'ne', 'se', '»', '...', '?', 'vous', 'sont', 'son', '«', 'je', 'Les', 'Il', 'aux', '1', ';', 'mais', \"qu'\", 'on', \"n'\", 'comme', '2', 'sa', 'cette', 'y', 'nous', 'été', 'tout', 'fait', 'En', \"s'\", 'bien', 'ses', 'très', 'ont', 's', 'être', 'votre', 'ai', 'elle', 'n', '3', 'même', \"L'\", 'deux', 'faire', \"c'\", 'aussi', '>', 'leur', '%', 'si', 'entre', 'qu', '€', '&', '4', 'sans', 'Je', \"j'\", 'était', '10', 'autres', 'tous', 'peut', 'France', 'ces', '…', '5', 'lui', 'me', ']', '[', 'où', 'ans', '6', '#', 'après',
'+', 'ils', 'dont', 'Pour', '°', '–', 'temps', '*', 'sous', 'Un', 'avoir', 'L', 'A', '}', 'site', 'peu', 'mon', 'encore', '12', 'depuis', '0', 'ça', 'fois', '2017', 'ainsi', 'alors', 'donc', 'notre', 'Ce', '20', '11', 'autre', 'monde', 'non', 'Paris', 'avant', 'Une', 'Elle', '15', 'également', 'Re', 'contre', 'Vous', 'c', 'moins', 'tu', 'suis', '7', 'ville', 'avait', 'vos', 'vers', 'premier', 'vie', 'Et', '2016', '2014', 'jour', '00', '2013', 'leurs', 'Dans', 'soit', '2012', 'toutes', 'nom', '2015', '14', 'De', 'On', '8', 'prix', '18', \"C'\", 'Mais', 'partie', '•', 'nos', 'voir', 'article', '16', 'Plus', '13', 'of', 'chez', 'inscription', 'première', 'quelques', 'toujours', '17', 'Nous', 'plusieurs', 'mai', 'place', 'français', '2011', 'cas', 'puis', 'Cette', 'année', 'ma', 'toute', '2010', 'the', '30', 'suite', 'pays', 'The', 'années', 'lors', 'fin', 'bon', '19', 'À', '21', 'dit', 'trois', 'grand', 'quand', 'partir', 'car', 'sera', '22', 'cet', 'jours', 'C', '2009', 'petit', '=', \"J'\", 'Si', 'maison', 'fut', 'ligne', 'faut', '9', 'nouveau', 'moi', 'lieu', 'mois', '23', 'cours', 'personnes', 'va', 'déjà', 'cela', '2008', 'beaucoup', 'juin', 'groupe', 'mars', 'travail', 'nouvelle', 'compte', '24', 'page', 'messages', '25', 'and', 'janvier', 'hui', 'film', 'commune', 'j', 'grande', 'ici', 'Au', 'avril', \"m'\", 'histoire', '2007', 'détail', 'famille', 'savoir', 'doit', 'avis', 'chaque', 'trop', 'enfants', 'eau', 'm', 'part', \"jusqu'\", 'septembre', 'mes', 'homme', 'rien', 'avons', 'octobre', 'décembre', 'forum', 'jeu', 'produits', 'trouve', 'juillet', 'produit', 'équipe', 'CEST', 'politique', 'là', 'novembre', 'permet', 'in', 'titre', 'pendant', 'notamment', 'recherche', 'nombre', '·', 'dire', 'http', 'service', 'pouvez', 'février', 'point', 'dernier', '05', 'moment', 'selon', 'mort', 'droit', '2006', 'DE', 'afin', 'jamais', 'effet', 'mise', 'Des', '—', '26', 'région', 'projet', '\\\\', 'saison', 'août', 'niveau', '28', 'reste', 'bonne', 'ensemble', '27', 'peuvent', 'exemple', 'Voir', '01', 'série', 'souvent', 'centre', 'Après', 'écrit', 'pouvoir', '--', 'mettre', 'km', 'général', 'Page', 'forme', 'début', '09', 'ceux', 'personne', 'eu', 'française', 'vraiment', 'services', 'demande', '29', 'question', 'Par', 'près', 'Merci', 'celui', 'qualité', 'vue', 'tant', 'petite', 'système', '©', 'Ils', 'ailleurs', 'Europe', 'avez', 'mieux', 'société', '^', 'informations', 'données', 'prendre', 'elles', 'guerre', 'surtout', 'to', 'Jean', 'né', 'CET', '08', 'certains', '06', 'village', 'membres', 'rapport', 'an', 'face', 'étaient', 'mot', 'femme', 'possible', '50', 'seul', '@', 'Prix', '04', 'rue', '07', 'te', 'celle', 'mal', 'articles', 'aide', 'nombreux', 'base', 'ayant', '<', '03', '2005', 'entreprise', 'Catégorie', '..', 'ni', 'liste', '02', 'livre', 'passe', 'https', 'mis', 'seulement', 'côté', 'public', 'utilisation', 'ton', 'développement', '31', 'vu', '100', 'D', 'chose', 'dès', 'quatre', 'situé', 'Ces', 'devant', 'photos', 'hommes', 'trouver', 'Son', 'image', '\\xad', 'fr', 'plan', 'étant', 'type', 'tour', '$', 'grâce', 'cadre', 'juste', 'musique', 'président', 'version', 'aime', 'points', 'simple', 'Avec', 'formation', 'jeune', 'assez', 'quoi', 'offre', 'origine', 'sens', 'serait', 'gratuit', 'Pierre', 'heures', 'Nombre', 'corps', 'salle', 'tête', 'sujet', 'adresse', 'carte', 'minutes', 'date', 'font', 'fils', 'création', 'donne', 'e', 'choix', 'album', 'dernière', 'agit', 'loi', 'passé', 'propre', 'coup', 'propose', 'environ', 'chambre', 'accès', 'devient', '....', \"D'\", 'semaine', 
'sécurité', 'parce', 'vidéo', 'ensuite', 'porte', 'h', 'lien', 'haut', 'comment', 'femmes', 'façon', 'nationale', 'état', 'présente', 'long', 'nouvelles', 'tard', 'besoin', 'raison', 'club', 'gouvernement', 'retour', 'genre', 'problème', 'x', 'ancien', 'époque', 'séjour', 'Sur', 'Forum', 'passer', 'information', '40', 'auteur', 'belle', '�', 'autour', 'eux', 'rôle', 'bois', '2004', 'meilleur', 'jeux', 'marché', 'deuxième', 'population', 'État', 'manière', 'santé', 'photo', 'J', 'particulier', 'semble', 'pense', 'merci', 'proche', 'N', 'air', 'Tous', 'aurait', 'fonction', 'Tout', 'différents', 'Mar', 'entreprises', 'statistiques', 'plutôt', 'nuit', 'accueil', 'située', 'ordre', 'aller', '--Les', 'êtes', 'école', 'père', 'droits', 'as', 'petits', 'utiliser', 'édition', \"aujourd'\", 'occasion', 'maintenant', 'États-Unis', 'période', 'Grand', 'Saint', 'donner', 'fille', 'Lire', 'jeunes', 'millions', 'activités', 'sommes', 'aucun', 'enfant', 'seule', 'production', '000', 'autant', 'M.', 'II', 'anglais', 'hôtel', 'œuvre', 'habitants', 'espace', '“', 'art', 'nouveaux', 'Ajouter', 'réseau', 'gestion', 'modèle', 'but', 'prend', '2000', 'parfois', 'I', 'département', 'national', 'marque', 'New', 'veut', 'activité', 'quelque', 'église', 'avais', 'propos', '”', 'gauche', 'cause', 'texte', 'idée', 'pris', 'nombreuses', 'chef', 'existe', 'mots', 'main', 'scène', 'grands', 'route', 'gens', 'style', 'sites', 'durant', 'programme', 'pu', 'études', 'mesure', 'calme', 'Comment', 'conditions', 'ministre', 'seront', 'terme', 'laquelle', 'vient', 'mode', 'or', 'Comme', 'jardin', 'www.insee.fr', 'situation', 'travaux', 'vacances', 'journée', 'vrai', 'membre', 'plein', 'code', 'sein', 'web', 'rencontre', 'lire', 'mer', 'Du', 'numéro', 'pages', 'action', 'euros', 'Mai', 'loin', 'lorsque', 'sais', 'agréable', 'domaine', '2003', 'pourrait', 'nature', 'travers', 'Conseil', 'disponible', 'expérience', 'fond', 'François', 'roi', 'siècle', 'oui', 'sud', 'etc.', 'choses', 'heure', 'LA', 'Accueil', 'milieu', 'cuisine', 'pratique', 'terre', 'grandes', 'blog', 'américain', '~', 'questions', 'vente', 'construction', 'pourquoi', 'peux', 'différentes', 'toi', 'répondre', 'jusqu', 'Mon', 'emploi', 'abord', 'sortie', 'intérieur', 'droite', 'bas', 'cinq', 'Louis', 'aucune', 'plaisir', 'premiers', 'message', 'pièces', 'suivant', 'donné', 'enfin', 'proximité', 'logement', 'Alors', 'prise', 'voiture', 'objet', 'Nord', 'accord', 'section', 'âge', 'gros', 'nord', 'découvrir', 'technique', 'présent', 'République', 'soir', 'Depuis', 'créer', 'S', 'concernant', 'jouer', 'Paul', 'important', '2002', '1er', 'succès', 'appartement', 'Jeu', 'chambres', 'met', 'campagne', 'discuter', 'peut-être', 'territoire', 'Bonjour', 'certaines', 'argent', 'langue', 'rapide', 'parmi', 'geo', 'Internet', 'John', 'vais', 'Charles', 'résultats', 'Dieu', 'direction', 'moyen', '²', 'Français', 'Canada', 'couleur', 'Jeux', 'rendre', 'poste', 'fort', 'Sa', 'auprès', 'départ', 'armée', 'Michel', 'Centre', 'entrée', 'valeur', '2001', 'avaient', 'charge', 'zone', 'min', 'cœur', 'mère', 'match', 'taille', 'Allemagne', 'amour', 'noir', \"t'\", 'Sud', 'clients', 'aura', 'naissance', 'annonce', 'quartier', 'Québec', 'économique', 'frais', 'Afrique', 'mm', 'voyage', 'Pas', 'Selon', 'réponse', 'pied', 'Maison', 'international', 'culture', 'troisième', 'Mer', 'beau', 'connu', 'affaires', 'blanc', 'voix', 'doivent', 'directement', 'plupart', 'rouge', 'compris', 'amis', 'conseil', 'classe', 'Université', 'sujets', 't', 'Jacques', 'presse', 'protection', 'parti', 'arrivée', 
'35', 'rapidement', 'obtenir', 'application', 'parler', 'p.', 'association', 'doute', 'Sujets', 'mondiale', 'château', 'communauté', 'appel', 'images', 'panier', 'lequel', 'projets', 'étude', 'football', '60', 'générale', 'vite', 'libre', 'commentaire', 'arrive', 'Cet', 'ta', 'matière', 'aider', 'contrôle', 'risque', 'cm', 'commande', 'trouvé', '45', 'quel', 'unique', 'politiques', 'voit', 'Quand', 'intérêt', 'source', 'communes', 'contenu', 'internet', '1999', 'organisation', 'Date', 'utilisé', 'Robert', 'secteur', 'for', 'présence', 'B', 'mouvement', 'référence', 'is', 'villes', 'double', 'catégorie', 'techniques', 'force', 'lettres', 'ancienne', 'simplement', 'yeux', 'éléments', 'île', 'carrière', 'Coupe', 'vont', 'joueur', 'livres', 'passage', 'ET', 'historique', 'commence', 'petites', 'Italie', 'Cela', 'presque', 'Sam', 'sociale', 'parle', '32', 'moyenne', 'épisode', 'réalisé', 'particulièrement', 'problèmes', 'environnement', 'terrain', 'taux', 'films', 'tel', 'roman', 'David', 'chacun', '80', 'Bien', 'téléphone', 'pièce', 'Messages', 'actuellement', 'Tu', 'divers', 'super', 'dernières', 'Recherche', 'Histoire', 'similaires', 'second', 'couleurs', 'publié', 'parc', 'esprit', 'Votre', '33', 'derniers', 'énergie', 'publique', 'créé', 'cinéma', 'Union', 'lit', 'moteur', 'seconde', 'York', 'aujourd', 'disponibles', 'Philippe', 'sûr', 'US', 'Posté', 'TV', 'es', 'Non', 'facile', 'social', 'large', 'Google', 'siège', 'Lun', 'longtemps', 'communication', 'nécessaire', 'bord', 'Site', 'Ainsi', 'permis', 'liens', 'matin', 'directeur', 'mètres', 'Belgique', 'durée', 'vivre', 'Oui', 'Dim', 'table', 'Que', 'principal', 'solution', 'joue', 'devrait', 'idées', 'suivre', 'dimanche', 'personnel', 'ouverture', 'total', 'sait', 'envie', 'meilleure', 'six', 'fais', 'fil', 'collection', 'Liste', 'Marie', 'premières', 'semaines', 'groupes', 'désormais', 'parents', 'malgré', 'hôtels', '1998', 'Espagne', 'Guerre', 'Tour', '1990', 'ami', 'manque', 'lettre', 'position', 'hors', 'finale', 'via', 'cependant', 'nommé', 'conseils', 'haute', 'laisser', 'Notre', 'lieux', 'professionnels', 'difficile', 'militaire', 'venir', 'celles', 'bout', 'visite', 'Ven', 'évolution', 'coeur', 'internationale', 'veux', 'comprendre', 'université', 'voie', 'Rechercher', 'permettant', 'contrat', 'LE', 'Société', 'cher', 'Club', 'économie', 'soleil', 'partager', 'professionnel', 'chemin', 'devenir', 'permettre', 'Chine', 'bar', 'commentaires', 'établissement', 'traitement', 'réalité', 'utilise', 'retrouve', 'sélection', 'train', 'élèves', 'usage', 'port', 'tels', 'Bon', 'Etat', 'tes', 'européenne', 'Wikipédia', 'objectif', 'espèce', '{', 'faisant', 'concours', 'feu', 'lecture', 'location', 'suivi', 'certain', 'ca', '200', 'joueurs', 'vendredi', 'mariage', 'écran', 'propriété', '36', 'endroit', 'résultat', 'possède', 'samedi', 'disposition', 'décision', 'Facebook', 'analyse', 'mission', 'Très', 'etc', 'marche', '1997', 'from', '│', 'Lyon', 'Toutes', '34', 'soient', 'bâtiment', 'DU', 'moyens', 'province', 'Art', 'suivante', 'compagnie', 'longue', 'Fichier', 'américaine', 'puisque', 'inscrit', 'sorti', 'at', 'lundi', 'publics', 'pourtant', 'éviter', 'Suisse', 'finalement', 'Cependant', 'achat', 'personnage', 'parcours', 'Nouveau', 'enseignement', 'Commentaires', 'reçu', 'animaux', 'meilleurs', 'complet', 'parties', 'sources', '1996', '70', 'musée', 'chanson', 'Article', 'montre', 'Nos', 'Image', 'devez', 'importe', 'contact', 'officiel', 'outils', '1995', 'lui-même', 'DES', 'actions', 'peine', 'Juin', 'allemand', 'note', 'affaire', 
'Église', 'bureau', 'processus', 'sol', 'matériel', 'Qui', 'changer', 'ait', '38', 'Nicolas', 'pratiques', 'importante', 'ouvrage', 'Pays', 'document', 'San', 'comprend', 'parfait', 'bain', 'furent', 'attention', 'liberté', 'possibilité', 'uniquement', 'Jan', 'M', 'by', 'X', 'sort', 'théâtre', 'frère', 'équipes', 'Ses', 'championnat', 'relation', 'police', 'mémoire', 'est-ce', \"S'\", 'Enfin', 'salon', 'Musée', 'laisse', 'commerce', 'armes', '44', 'personnages', '48', 'Henri', 'soutien', 'client', 'quelle', 'vitesse', 'Articles', 'lumière', 'extérieur', 'utilisateur', 'victoire', 'hôtes', 'Lors', 'course', '42', 'réaliser', 'choisir', 'objets', 'III', 'administration', 'véritable', 'bons', 'éducation', 'ouest', 'derrière', 'Ligue', 'tandis', 'généralement', 'Deux', 'annonces', 'peuple', 'acheter', 'règles', 'titres', '39', 'besoins', 'gamme', 'combat', 'huile', 'Wish', 'sociaux', 'honneur', 'critique', 'sorte', 'gare', 'continue', 'crise', 'papier', 'hiver', 'bataille', 'piscine', 'réseaux', 'sport', 'Japon', 'commun', 'retrouver', '41', '1994', 'permettent', 'puissance', 'modèles', 'thème', 'sciences', '37', 'mêmes', 'appelle', 'moderne', 'Ne', 'with', 'responsable', 'exposition', 'neuf', 'anciens', 'ajouter', 'court', 'classique', 'Petit', 'principe', 'ouvert', 'ouvre', 'forte', 'crois', 'précédent', 'sauf', 'stock', 'Publié', 'principale', '43', 'professeur', 'dispose', 'navigation', 'Londres', 'Amérique', 'régime', 'forces', '55', '90', 'garde', 'rend', 'buts', 'Elles', 'vol', 'appelé', '500', 'couple', 'livraison', 'celui-ci', 'Association', 'demander', 'avance', 'Accessoires', 'électrique', 'mains', '1992', 'connaître', 'transport', 'telle', 'connaissance', 'ressources', '--Le', 'changement', 'peau', 'dix', 'filles', 'i', 'offres', 'Chambre', 'Informations', 'Institut', 'Noël', 'impression', 'Voici', 'Angleterre', 'étape', 'magnifique', 'physique', '1980', 'vois', 'textes', 'cookies', 'numérique', 'présentation', 'utilisateurs', 'œuvres', 'Signaler', 'documents', 'majorité', 'fer', 'télévision', 'étudiants', 'artistes', 'recevoir', '52', 'relations', 'étais', 'élections', 'professionnelle', 'Russie', 'Festival', 'bande', 'poids', 'privée', 'principalement', 'St', 'émission', 'bonnes', '1993', 'familles', 'Ma', 'britannique', 'lignes', 'caractère', 'assurer', 'Thomas', 'participe', 'découverte', 'E', 'proposer', 'cartes', 'souhaite', 'Or', 'Salle', 'recherches', 'partenaires', 'chance', 'maire', 'peur', 'rivière', 'vignette', 'Ville', 'acteur', 'explique', 'univers', 'direct', '49', 'surface', 'soirée', 'confiance', 'journal', 'facilement', 'Windows', 'bientôt', 'Est', 'dessus', 'Mes', 'arrière', 'lac', 'Présentation', 'nouvel', 'Montréal', 'logiciel', 'abus', 'Répondre', 'devenu', 'installation', 'courant', 'faible', 'travailler', 'Service', 'Rome', 'profiter', 'langues', 'capacité', 'André', 'systèmes', 'auteurs', 'maisons', 'née', 'attaque', 'accessoires', 'Martin', 'actuelle', 'soins', 'Hotel', 'capitale', 'tableau', '1991', 'pieds', 'artiste', 'fête', 'fonds', 'concerne', 'est-à-dire', '46', 'classement', 'hauteur', 'plage', 'original', 'sept', 'mesures', 'coin', 'centrale', 'diverses', 'faite', 'dossier', 'cité', 'Pourquoi', 'ci-dessous', 'marques', 'EN', 'justice', 'Bernard', 'gratuitement', 'types', 'station', 'Sans', 'formes', 'célèbre', 'Joseph', 'éditions', 'fonctions', 'adore', 'dis', 'jeudi', 'paix', 'limite', 'sortir', 'LES', 'vote', 'Web', 'pleine', 'Championnat', 'mercredi', 'acteurs', 'principaux', 'plans', 'exploitation', 'épouse', 'supérieur', 'James', 'Georges', 
'Parti', 'médias', 'confort', 'Claude', 'cherche', 'format', 'signe', 'propres', 'composé', 'R', 'mardi', 'V', '47', 'communale', 'naturel', 'FC', 'faites', 'di', 'lutte', '51', 'entièrement', 'peinture', 'actuel', 'écrire', 'structure', 'vieux', 'Premier', 'Vue', 'situe', 'pose', 'Monde', 'quotidien', 'espère', '´', 'Avr', 'espèces', '1970', 'risques', 'o', '59', 'populaire', 'prochain', 'infos', 'distance', 'étoiles', 'proches', 'latitude', 'moitié', 'détails', 'termes', 'Richard', 'immédiatement', 'solutions', 'contraire', 'sociales', 'importance', 'idéal', 'effets', 'représente', 'parfaitement', '¤', '56', 'Alain', 'locaux', 'entretien', '54', 'partout', 'penser', '1989', 'chaîne', 'top', 'pierre', 'patrimoine', 'voire', '57', 'choisi', 'réponses', 'g', 'rester', 'informatique', 'maladie', 'totalement', 'e-mail', 'Michael', 'discussion', 'complète', 'Guide', 'studio', 'pourra', 'Location', 'saint', 'Avis', 'défense', 'Carte', 'vidéos', 'réalisation', 'scientifique', 'Plan', 'Grande', 'rendez-vous', 'erreur', 'pourrez', 'Contact', 'décidé', 'gaz', 'opération', 'industrie', 'raisons', 'générales', 'H', 'Sujet', 'longitude', 'assurance', 'contacter', 'privé', 'améliorer', 'align', 'Hôtel', 'belles', 'valeurs', 'enquête', 'atteint', 'croissance', 'perdu', 'avenir', 'traduction', 'suffit', 'bébé', 'faits', 'participation', 'russe', 'régulièrement', 'zones', 'Président', 'appareil', 'goût', 'Groupe', 'El', 'terrasse', 'p', 'maximum', 'tellement', 'formé', 'Lycée', 'local', 'fonctionnement', 'Terre', 'dos', 'remporte', 'portes', '53', 'canton', 'ordinateur', 'gratuite', 'restaurant', 'machine', 'sexe', 'utilisant', 'fichier', 'construit', 'cour', 'division', 'mobile', 'approche', 'Bruxelles', 'recette', 'F', 'absence', 'écoles', 'vis', 'pouvait', \"lorsqu'\", 'unité', 'dû', 'sert', 'voyageurs', 'actualité', 'Rue', 'noms', 'volonté', 'existence', 'expression', 'ministère', 'méthode', 'italien', 'propriétaire', 'sociétés', 'Code', 'partage', 'cheveux', 'tient', 'décide', 'Marseille', '1988', 'International', 'Nov', 'bleu', 'consommation', 'entraide', 'élu', 'Autres', 'matchs', 'confortable', 'revient', 'européen', 'mondial', 'National', 'électronique', 'participer', 'régions', 'identité', 'Daniel', 'DH', 'Pendant', 'PC', 'minimum', 'Fév', 'faisait', 'ventes', 'quant', 'trouverez', 'apos', 'revue', 'probablement', 'Donc', 'apparaît', 'Oct', 'Chaque', 'fenêtre', 'voici', 'Aucun', 'demandes', 'recommande', 'ferme', 'outre', 'futur', 'morts', 'pression', 'maître', 'événements', 'réserve', 'attendre', '58', 'équipements', 'William', 'acte', 'viens', 'L.', 'regard', 'vert', 'publication', 'belge', 'différence', 'magasin', 'vent', 'kilomètres', 'étranger', 'reçoit', 'A.', 'manger', 'présenter', 'champ', '→', 'contexte', 'suivants', 'déjeuner', '300', 'École', 'chien', 'compétition', 'Services', 'Première', 'outil', 'commencé', 'porter', 'P', 'coupe', 'Bordeaux', 'statut', 'George', '́', 'montagne', 'chercher', 'responsabilité', 'Santé', 'description', 'Même', 'écriture', 'Empire', \"Aujourd'\", 'Partager', '1986', 'signifie', 'repas', 'immobilier', 'gagner', 'conception', 'travaille', 'bras', 'Bretagne', 'Avant', 'Dès', 'visage', 'fera', 'guide', 'efficace', 'rendu', 'puisse', 'id', 'épisodes', 'créée', '1987', 'hier', 'longueur', '†', 'Livraison', 'revenir', 'sinon', 'Titre', 'écrivain', 'correspond', 'élection', 'obtient', '1960', 'emplacement', 'design', 'celle-ci', 'tendance', 'Laurent', \"quelqu'\", 'voilà', 'Sciences', 'développer', 'réduction', 'domicile', 'applications', 'décès', '--La', 
'jeunesse', 'réel', 'fit', '1982', 'retraite', 'contient', 'places', 'devait', 'radio', 'peintre', 'littérature', 'Déc', 'mises', 'Moi', 'forêt', 'figure', 'toutefois', 'beauté', 'clair', 'Commission', 'prochaine', 'Contenu', 'largement', '1984', 'cul', 'change', 'constitue', 'Sep', 'économiques', 'entier', 'ouverte', 'respect', 'disque', 'payer', 'Lorsque', 'vin', 'Connexion', '1985', 'Toulouse', 'café', 'milliards', 'bonheur', 'rejoint', 'programmes', 'vaut', 'médecin', 'oeuvre', 'pont', 'représentant', 'del', 'intérêts', 'Sélectionner', 'visiter', 'Entre', 'ciel', 'Retour', 'pétrole', 'verre', 'plantes', 'véhicule', 'preuve', 'chargé', 'suivantes', 'Patrick', 'Marc', 'joué', 'Homme', 'anniversaire', 'modifier', 'conseiller', 'Aoû', 'fruits', 'discours', 'débat', 'atteindre', 'altitude', 'phase', 'instant', 'historiques', 'Bonne', 'prévu', 'b', 'agence', '‘', 'vit', 'passant', 'Juil', 'regarder', 'Culture', 'final', 'certaine', 'Loire', 'inscrire', 'architecture', 'Nom', 'atelier', 'J.', 'critères', 'Maroc', 'issue', 'Disponible', 'vision', 'fleurs', 'spectacle', 'évaluation', 'huit', 'basse', 'prêt', 'complètement', 'louer', 'centres', 'volume', 'utilisés', 'sympa', 'Air', 'essayer', 'température', 'opérations', 'collaboration', 'fiche', 'souhaitez', '75', 'offrir', 'Ouest', 'demandé', 'Puis', 'dollars', 'distribution', 'Cliquez', 'tres', 'DVD', 'lu', 'supérieure', 'liés', 'montant', 'intervention', 'boutique', 'influence', 'Monsieur', 'diffusion', 'Conditions', 'troupes', 'sang', 'nécessaires', 'utilisée', 'Éditions', 'rejoindre', 'tenu', 'lance', 'véhicules', 'compter', 'objectifs', 'arrêt', 'Découvrez', 'Assemblée', 'construire', 'apprendre', \"N'\", 'présenté', 'Super', 'élevé', 'Mme', 'Certains', 'scolaire', 'publiques', 'compétences', 'éditeur', 'connecté', 'cliquez', 'Anne', 'excellent', 'écoute', 'budget', 'françaises', 'opposition', 'concept', 'étage', '150', 'équipé', 'événement', 'Tags', '1983', 'test', 'niveaux', 'commencer', 'avion', 'échange', 'caractéristiques', 'servir', 'envoyer', 'T', 'voulez', 'Château', 'tenue', 'fichiers', 'City', 'Sport', 'côtés', 'totale', 'poser', 'stade', 'eaux', 'entendu', 'Théâtre', 'conscience', 'humain', 'vallée', 'militaires', 'Christian', 'no', 'réussi', 'humaine', 'coordonnées', 'mauvais', 'touche', 'riche', 'Musique', 'associations', 'Twitter', 'suit', 'protéger', 'Top', 'Quelques', 'ouvrages', 'mari', 'portant', '×', 'remise', 'soi', 'candidat', 'Guillaume', 'Age', 'comte', 'utile', 'dur', 'aéroport', 'meilleures', 'IV', 'stratégie', 'hésitez', 'Algérie', 'promotion', 'Afficher', 'Créer', 'vide', '1975', 'autorités', 'Vie', '1981', 'telles', 'you', 'préparation', 'élève', 'technologie', 'théorie', 'Total', 'arrêté', '1978', 'Peter', 'paiement', 'journaliste', 'prises', 'tente', 'indique', 'locale', 'ouvrir', 'principales', 'Ben', 'traité', 'festival', 'espaces', 'von', 'loisirs', 'naturelle', 'défaut', 'support', 'baisse', 'Israël', 'phpBB', 'rencontres', 'O', 'Cour', '1968', 'Résultats', 'découvert', 'comptes', 'plat', 'Antoine', 'jolie', 'crée', 'Modèle', 'annoncé', 'victimes', 'avions', 'recettes', 'installer', 'lait', 'dehors', 'biens', 'légales', 'impossible', 'croire', 'email', 'Alexandre', 'municipalité', 'établissements', 'Asie', 'domaines', 'tombe', 'week-end', 'intéressant', 'noire', 'arts', 'conférence', 'Car', 'considéré', 'allez', 'champion', 'magazine', 'clubs', 'Olivier', 'coups', 'Parc', 'arriver', 'Parmi', 'commercial', 'pouvant', 'World', 'post', 'Disney', 'Académie', 'salles', 'fortement', 'résidence', 'artistique', 
'champs', 'tourisme', 'proposé', 'In', 'CD', 'davantage', 'lancer', 'conflit', 'aventure', 'séries', 'serveur', 'rêve', 'civile', 'faveur', 'enregistrer', 'connue', '1979', 'Ça', 'tenir', 'japonais', 'perte', 'fonctionne', 'Albert', 'mairie', 'termine', 'espagnol', 'lesquels', 'garder', 'Jouer', 'allemande', 'précise', 'montrer', 'déclaré', 'exercice', 'quatrième', 'vérité', 'basée', 'scientifiques', 'trouvent', 'importants', 'right', 'capable', 'prison', 'villages', 'catégories', 'Maurice', 'soin', 'actes', 'aurais', 'métier', 'C.', 'veulent', 'foi', 'quantité', 'chinois', 'masse', 'expédition', 'récemment', 'charme', 'revanche', 'stage', 'concert', 'complexe', 'milliers', 'was', 'accéder', 'tôt', 'van', 'pop', 'pensée', '1976', 'Comité', 'secrétaire', 'der', 'superbe', 'clé', 'particuliers', 'fini', 'printemps', 'demain', 'commission', 'originale', 'camp', 'Permalien', 'dessin', 'marchés', 'envers', 'réception', 'lois', 'Dr', 'religion', 'chansons', 'lycée', 'ambiance', 'Mars', 'Quel', 'dois', 'vivant', 'engagement', '›', 'juridique', 'mur', 'noter', 'ski', 'consulter', 'central', 'option', '1977', 'juge', 'Sous', 'absolument', 'entrer', 'viennent', 'meme', 'Type', 'jaune', 'élément', 'r', 'chaleureux', 'catholique', 'Note', 'vendre', 'invite', 'menu', 'rose', 'essentiel', '1950', 'tarifs', 'couverture', 'Archives', 'Saison', 'Se', 'évêque', 'Pologne', 'Livre', 'Berlin', 'difficultés', 'blanche', '400', 'chapelle', 'olympiques', 'organisé', '1972', 'procédure', 'présents', 'frères', 'performance', 'perdre', '1973', 'Message', 'notes', 'génération', 'Journal', 'voitures', 'au-dessus', 'médecine', 'bâtiments', 'condition', '®', 'Photo', '1974', 'rappelle', 'importantes', 'glace', 'cheval', 'durable', 'connaît', 'effectuer', 'quitte', 'contenant', 'pro', 'continuer', 'tradition', 'candidats', 'beaux', 'lancé', 'automobile', 'Trois', 'Malgré', 'coût', 'réunion', 'Ca', 'primaire', 'réduire', 'chat', 'obtenu', 'définition', 'Produits', 'résumé', 'chasse', 'apporter', 'Jésus', 'sucre', 'Espace', 'Général', 'Vincent', 'laissé', 'vérifier', 'lendemain', 'député', 'Salon', 'traduit', 'froid', 'actrice', 'clés', 'terres', 'reprises', 'reprise', 'chiffres', 'résistance', 'publiée', 'surprise', 'chocolat', 'alimentation', 'Nuit', 'Nouvelle', 'échelle', 'autorité', 'ceci', 'Fiche', 'capital', 'Etats-Unis', 'chaud', 'comité', 'Plusieurs', 'M2', 'licence', 'échanges', 'interne', 'épreuve', 'collège', 'joli', 'liées', 'âme', 'tiers', 'critiques', 'enfance', 'vélo', 'Arts', 'télévisée', 'envoyé', 'pourraient', 'côte', 'Royaume-Uni', 'murs', '65', 'bus', 'fabrication', 'Black', 'réalisateur', 'demeure', 'prince', 'piste', 'conduit', 'soldats', 'lecteur', 'États', 'hôte', 'Fédération', 'douche', 'batterie', 'salariés', 'cadeau', 'Gestion', 'aspect', 'home', 'sommet', 'connaissances', 'Alpes', 'Projet', 'essentiellement', 'oublier', 'Politique', 'philosophie', 'René', 'seuls', 'district', 'Grâce', 'religieux', 'sac', 'IP', 'occupe', 'Nantes', 'locales', 'duc', 'documentaire', 'Toutefois', 'chute', 'méthodes', 'scénario', 'planète', 'parking', 'sympathique', 'héros', '2007Sujet', 'garantie', 'label', 'pêche', 'comportement', 'renseignements', 'cycle', 'humaines', 'crédit', 'mélange', 'consiste', 'précédente', 'accueille', 'logiciels', 'ajouté', 'Me', '.....', 'visiteurs', 'boîte', 'forcément', 'Van', 'grave', 'Australie', 'tournée', 'exception', 'multiples', 'chiffre', 'Film', 'connexion', 'logique', 'restauration', 'somme', '64', 'préparer', 'mail', 'comté', 'équipée', '1962', 'eut', 'édité', 'moindre', 
'réflexion', 'portail', 'accessible', 'Actualités', 'vraie', 'anciennes', 'proposition', 'Mise', 'Inde', 'technologies', 'Leur', '1940', 'Bienvenue', 'financement', 'mouvements', 'modification', 'royaume', 'évidemment', 'acceptez', 'spécialiste', 'crème', 'Seconde', '1945', 'Voilà', 'File', 'etre', 'gérer', 'îles', 'découvre', 'affiche', 'Formation', 'équipement', 'Vos', '1967', 'augmentation', 'banque', 'règle', 'feuilles', 'agriculture', 'langage', 'Los', 'automatiquement', '3D', 'secret', 'simples', 'impact', 'Star', 'Ou', 'Description', 'till', 'certainement', 'régional', 'citer', 'info', 'rapports', 'portée', 'démarche', \"Qu'\", 'arrondissement', 'profil', 'hôpital', '1971', 'accueillir', 'suisse', 'expliquer', 'officielle', 'appareils', 'révolution', 'restaurants', 'violence', 'secondes', 'a-t-il', 'Durant', 'néanmoins', 'voulu', 'Pro', 'Brésil', 'veille', 'normal', 'animation', 'connais', 'Frédéric', \"--L'\", 'Roger', 'comporte', 'danse', '1969', 'inclus', 'Marine', 'apparition', 'bibliothèque', 'record', 'G', 'décrit', 'Strasbourg', 'score', 'Catherine', 'Bref', 'indépendance', 'archives', 'Henry', 'destination', 'Auteur', 'Genève', 'this', 'humains', 'composée', 'revenu', 'clairement', 'moments', 'f', 'VF', 'Dominique', 'faux', 'apprentissage', 'Aide', 'donnée', 'passion', 'achats', 'mauvaise', 'Attention', 'devoir', 'Royal', 'pilote', 'tome', 'Femme', 'chapitre', 'chaleur', 'faudra', 'permettra', 'USA', 'fournir', 'féminin', 'assure', 'reprend', 'thèmes', 'Radio', 'superficie', 'élus', 'séance', 'PS', 'investissement', 'commerces', 'producteur', 'citoyens', 'financière', 'Direction', 'indiqué', 'connecter', 'exactement', '1944', 'architecte', 'capitaine', 'Appartement', 'fondée', 'pire', 'publie', 'effectivement', 'science', 'meurt', 'heureux', 'initiative', 'météo', 'Maria', 'Révolution', 'conforme', 'entendre', 'arrivé', 'réforme', 'saisons', 'actif', 'accident', 'réalisée', 'matières', 'dessous', 'adultes', 'placé', 'rock', 'guitare', 'faudrait', 'truc', 'Place', 'text', 'Nice', 'bouche', 'nucléaire', 'réalise', 'hommage', 'acheté', 'essai', 'aimé', 'urgence', 'présidentielle', 'cuir', 'utiles', 'Collection', 'Var', 'reprendre', 'appartient', 'voyages', 'fondé', 'partenaire', 'tournoi', 'appelée', 'grosse', 'Banque', '1000', 'culturel', 'chômage', 'délai', 'principes', 'Quelle', 'pâte', 'eacute', 'piano', 'Sécurité', 'tours', 'décoration', '2008Sujet', 'WP', 'Y', 'frontière', 'difficulté', 'développé', 'étrangers', 'catalogue', 'faute', 'matériaux', 'spécial', 'missions', 'arabe', 'Anglais', 'circuit', 'four', 'Victor', 'permanent', 'réservation', 'étrangères', 'yoga', 'douce', 'auraient', 'Christophe', 'Jack', 'avantage', 'palais', 'responsables', 'Médecine', 'kg', 'classé', 'Park', 'pointe', 'supplémentaires', '\\ufeff', 'rares', 'bassin', 'Lille', 'Cours', 'sexy', 'avocat', 'pain', 'prennent', 'vêtements', '120', 'victime', 'pouvons', 'précis', 'One', 'Christ', 'profit', 'vouloir', 'disponibilité', 'parole', 'Eric', 'sent', 'marketing', 'arrêter', 'légèrement', 'signé', 'hausse', 'participé', 'Aux', 'robe', 'aurez', 'neige', 'Source', 'Catégories', 'Gérard', 'dynamique', 'transports', 'composition', 'classes', 'Guy', 'Générale', 'lesquelles', 'essais', 'produire', 'reconnaissance', 'devenue', 'propriétaires', 'visites', 'net', 'références', 'bains', 'vaste', 'spécifique', '1965', 'auquel', 'américains', 'conséquences', 'légende', 'convient', 'marine', 'elle-même', 'attend', 'restent', 'abbaye', 'Côte', 'empereur', '1966', '™', 'adapté', 'dessins', 'Max', 'Jules', 
'efficacité', 'very', 'Pascal', 'entraîneur', 'environs', '250', 'institutions', 'parfaite', ... (output truncated for readability — the cell goes on to print the remaining vocabulary tokens of cc.fr.300.10000.vec, roughly 10,000 French tokens in total) ...,
'préférable', 'répétition', 'Almouggar.com', 'Coeur', 'Disque', 'Géographie', 'chatte', 'Beau', 'Clermont-Ferrand', 'géré', 'Parker', 'Town', 'numero', 'accuse', 'enceintes', 'fondamental', 'Sud-Ouest', 'archéologiques', 'PêcheVoir', 'Anjou', 'Poser', 'Constantinople', 'Cliquer', 'autrui', 'entends', 'manoir', 'Princesse', 'vendue', 'XVIIIe', 'passa', 'duplicate', 'ci-après', '--Espace', 'partiel', 'kms', 'Populaire', 'Utiliser', 'recevez', 'casser', 'médiéval', 'valoir', 'tuto', 'manifestants', 'ad', 'coloniale', 'traversé', 'Paradis', 'sexuelles', 'sévère', 'oubli', 'Créez', 'roche', 'mien', 'admissibilité', 'Sud-Est', 'entame', 'tactile', 'alphabétique', 'solitaire', 'Élisabeth', 'adorable', 'commenter', 'XII', 'introduire', 'éducatif', 'Limousin', 'intellectuels', 'diplomatique', 'Saint-Pétersbourg', 'about', 'consultant', 'Wallonie', 'championne', 'correcte', 'Italien', 'dynamiques', 'Monique', 'accompagnés', 'intéressent', 'Beaux-Arts', 'porc', 'Jerry', 'fonctionnalité', 'amies', 'dénoncer', 'instants', 'encyclopédique', 'Opération', 'mat', 'quotidiens', 'Royale', 'positifs', 'Forums', 'maturité', 'mondiaux', 'enseigner', 'organisateurs', 'détriment', '127', 'PSPsexy', 'Vegas', 'Istanbul', 'pots', 'instances', 'puce', 'Revenir', 'signale', 'minuit', 'quoique', 'généralistes', 'matinée', 'pouce', 'renoncer', 'mental', 'budgétaire', 'essayez', 'imposant', 'Back', '--A', 'remercié', 'Meuse', 'achève', 'angles', 'Special', 'Flandre', 'Mail', 'Rica', 'Azure', 'Alberto', 'Malte', 'Who', 'audit', 'intervalle', '117', 'Stéphanie', 'municipalités', 'annoncée', 'épais', 'espérance', 'fonctionnaire', 'dettes', 'technicien', 'ouvrent', 'Gard', 'orale', 'approfondie', 'tomes', 'dessiner', 'basilique', 'Crédits', 'méchant', 'recommandation', 'Batman', 'tribunaux', 'dira', 'vain', 'Jusqu', '--Histoire', 'PDG', 'Hunter', 'escalade', 'tre', 'minorité', 'Rights', 'améliore', 'Universal', 'Rousseau', 'Faso', '--Questions', 'wikipédia', 'localement', 'Parfait', 'athlètes', 'Puy', 'volets', 'Rachel', 'liaisons', 'Métropole', 'SAS', 'Ingénieur', 'Lorient', 'Chantal', 'acheteur', 'lingerie', 'planètes', 'enregistrée', 'Quelque', 'congé', 'confondre', '1883', 'Caire', 'prétend', 'ranger', 'châssis', 'chaos', 'électoral', 'générer', '19h', 'impasse', 'mutuelle', 'radical', 'Prendre', 'icone', 'renouveler', 'périmètre', 'Bush', 'Conception', 'Téléchargez', 'Pauline', 'interroge', 'carrefour', 'Villeneuve', 'regards', 'scénarios', 'Annecy', 'vous-même', 'digital', 'poètes', 'Fonction', 'Bruges', 'possédait', 'entrent', 'cotisations', 'demandeurs', 'convivial', 'moine', 'barrière', '1863', 'résolu', '26Localisation', 'Stratégie', 'SE', 'nuage', 'Ann', 'détour', 'triangle', 'licences', 'Alexandra', '1Sauter', 'intégrés', 'décida', 'Stockholm', 'négocier', 'empreinte', '2013Age', 'mâles', 'Book', 'Gallery', 'désignation', 'segment', 'Posted', 'pourriez', 'sexualité', 'semblables', 'Actu', 'Professionnel', 'flèche', 'semestre', 'Constantin', 'chaudes', 'toiture', 'confluence', 'Retourner', '6e', 'Explorer', 'compact', 'quotidiennement', 'incarne', 'InvitéInvité', 'oeuf', 'démontré', 'Audio', 'allais', 'Marvel', 'buffet', 'coucou', 'Otto', 'DO', 'finissent', 'réguliers', 'alimenter', 'humides', 'discrimination', 'Micro', 'tribune', '♦', 'Comics', 'démocrate', 'Personnellement', 'Damien', 'franchir', 'vigne', 'para', 'gr', 'blason', 'cycles', 'caoutchouc', 'investi', 'saumon', 'consacrés', '9h', 'étendu', 'Aude', 'Roberto', 'cinquantaine', 'Bons', 'préoccupations', 'pirates', 'Equitation', 'rassembler', 'piliers', 
'simulation', 'cinéaste', 'pompiers', '-10,5', 'réplique', 'aéronautique', 'guider', 'végétation', 'Full', 'désordre', 'LG', 'Catalogne', 'envoyée', 'Charme', 'Cologne', 'unis', 'pensait', 'complot', 'israélien', 'dictature', 'details', 'free', 'Chronique', 'Sociétés', 'croisière', 'fun', 'représentés', 'Berne', 'porno', 'appliquent', 'accroître', 'Metal', 'profondes', 'cliniques', 'there', 'prototype', 'vraies', '1865', 'Abs', 'Edouard', 'over', 'enchères', 'priorités', 'VOTRE', 'inégalités', 'séduire', 'injection', 'Effectivement', 'passez', 'éclat', 'Salvador', '1862', 'potable', 'synthétique', '11h', 'confier', 'restés', 'chauffer', 'aborde', 'EDF', 'Troyes', 'ressenti', 'jambe'])\n", + "[-0.0842, -0.0388, 0.0456, -0.0559, -0.0366, 0.0241, 0.0919, -0.0214, 0.0179, -0.1384, -0.0202, -0.1276, -0.0163, 0.0644, -0.1042, 0.0152, -0.0191, 0.0761, -0.0149, 0.0261, 0.0354, -0.077, -0.0034, 0.0941, -0.0169, 0.1621, 0.2469, -0.009, 0.0335, 0.0022, -0.0168, -0.0063, 0.0149, -0.0182, 0.0205, 0.0628, -0.3591, -0.0155, 0.0188, 0.0503, -0.0251, 0.0328, 0.04, 0.0639, -0.1502, 0.1655, 0.0538, 0.0762, -0.1086, -0.0351, 0.0534, 0.0267, 0.0255, 0.038, 0.0026, 0.3703, 0.0797, -0.0189, 0.4854, 0.0882, 0.0483, 0.224, 0.0077, -0.2437, -0.0396, -0.0343, -0.1632, -0.0818, -0.0074, 0.0008, -0.0255, -0.0482, -0.4431, -0.0576, -0.0413, -0.0182, -0.0852, -0.0737, 0.2608, -0.0044, -0.0147, -0.0486, -0.2496, -1.3323, -0.0243, -0.0382, 0.0852, 0.0166, 0.0292, -0.0092, 0.0345, -0.0205, 0.0806, -0.0287, 0.0068, -0.3224, -0.0187, -0.0661, -0.043, 0.4115, 0.021, 0.0019, 0.0826, 0.0753, 0.0254, 0.0634, 0.0524, -0.0342, -0.0224, 0.3635, 0.0102, -0.0121, -0.3234, 0.1405, 0.0347, 0.029, -0.0187, 0.0473, -0.067, 0.0084, -0.0503, -0.0469, -0.1019, 0.1343, -0.0289, 0.0632, 0.0699, 0.0675, 0.0196, -0.0432, 0.0576, 0.0173, 0.0264, 0.0001, 0.026, -0.0262, -0.3346, -0.025, 0.1202, 0.0655, 0.0264, -0.0396, 0.0032, -0.0192, -0.0364, -0.0285, 0.0278, 0.0017, -0.0048, -0.0001, -0.0395, 0.002, -0.1174, 0.0715, 0.0118, -0.0433, 0.0497, -0.0519, 0.0654, -0.0596, 0.006, 0.1493, 0.01, 0.0117, -0.1024, -0.0334, 0.0252, -0.2275, -0.0043, -0.0623, 0.3386, 0.0622, 0.0344, -0.3352, -0.0398, -0.161, -0.0401, -0.2124, 0.0329, 0.0056, -0.0218, -0.007, 0.1279, 0.0429, -0.0155, 0.0529, 0.1669, 0.0851, -0.4496, -0.0199, 0.1243, 0.0296, 0.0625, 0.5931, -0.0495, -0.0263, 0.0038, 0.0456, -0.0591, 0.0706, 0.046, 0.0196, 0.0271, 0.0136, 0.0427, 0.1151, 0.0651, 0.0513, 0.3261, -0.0095, -0.1681, 0.0631, 0.4491, 0.0119, -0.0168, -0.0606, -0.2383, -0.0494, 0.1051, 0.0095, -0.0175, -0.0459, 0.094, 0.0788, 0.0581, -0.0833, 0.0291, 0.0228, 0.004, -0.2135, -0.045, -0.2637, -0.0708, -0.0272, 0.0321, -0.0116, 0.0079, -0.0634, 0.1234, -0.0904, 0.0501, -0.0339, -0.0494, 0.0714, 0.1486, 0.1024, 0.0903, 0.0458, -0.0289, -0.0185, -0.034, 0.0427, -0.033, -0.0147, -0.2744, -0.0971, 0.0208, 0.0127, -0.0412, 0.0009, -0.0658, 0.0333, -0.0383, 0.0523, -0.019, 0.0391, 0.0702, 0.0231, 0.0573, 0.083, -0.1997, -0.0273, -0.0001, 0.002, -0.0557, 0.0669, -0.0026, 0.1349, 0.0173, -0.0312, -0.0388, 0.032, 0.0129, -0.0233, 0.0034, -0.0373, 0.0239, -0.07, 0.0412, 0.0402, 0.0019, -0.0405, -0.0111, -0.0038, 0.008, 0.1887, 0.0118, 0.3069, -0.0106, 0.0579]\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "### 1.2 Build the weight matrix\n", + "\n", + "We have a list of words associated to vector.\n", + "Now we need to specifically retrieve the vectors for the words present in our data, there is no need to keep vectors for all the words.\n", + "We thus build a matrix over the 
dataset’s vocabulary, associating each word present in the dataset with its vector.\n",
+ "For each word in the dataset’s vocabulary, we check if it is in FastText’s vocabulary:\n",
+ "* if yes: load its pre-trained word vector.\n",
+ "* else: initialize a random vector.\n",
+ "\n",
+ "\n",
+ "**Question:** Examine the coverage, i.e.:\n",
+ "* print the number of tokens from FastText found in the training set\n",
+ "* and the number of unknown words."
+ ],
+ "metadata": {
+ "id": "GTA0vXeevSuO"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Load the weight matrix: modify the code below to check the coverage of the\n",
+ "# pre-trained embeddings\n",
+ "emb_dim = 300\n",
+ "matrix_len = len(train.vocab)\n",
+ "weights_matrix = np.zeros((matrix_len, emb_dim))\n",
+ "\n",
+ "for i in range(0, len(train.vocab)):\n",
+ " word = train.vocab.lookup_token(i)\n",
+ " try:\n",
+ " weights_matrix[i] = vectors[word]\n",
+ " except KeyError:\n",
+ " weights_matrix[i] = np.random.normal(scale=0.6, size=(emb_dim, ))\n",
+ "weights_matrix = torch.from_numpy(weights_matrix).to( torch.float32)\n",
+ "\n",
+ "print(weights_matrix)"
+ ],
+ "metadata": {
+ "id": "4XXFTaRxvRNk",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "a0e1947e-1007-4fe5-dcb0-cadcdf151c01"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "tensor([[-0.5379, 0.5573, 0.3196, ..., -0.0434, -1.2529, -1.0325],\n",
+ " [-0.0842, -0.0388, 0.0456, ..., 0.3069, -0.0106, 0.0579],\n",
+ " [-0.0386, 0.0706, 0.0421, ..., -0.3886, 0.0417, 0.0771],\n",
+ " ...,\n",
+ " [-0.9572, 0.9099, 0.6799, ..., -0.2349, 1.2547, -0.3426],\n",
+ " [ 0.3485, -0.2591, 0.2503, ..., -0.2570, -0.5629, 0.9814],\n",
+ " [ 0.0772, -0.4581, 0.1934, ..., 0.2854, -0.7186, -0.3906]])\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "---------------------------------------------------\n",
+ "SOLUTION"
+ ],
+ "metadata": {
+ "id": "yqUw4CPDKzpk"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Load the weight matrix: modify the code below to check the coverage of the\n",
+ "# pre-trained embeddings\n",
+ "emb_dim = 300\n",
+ "matrix_len = len(train.vocab)\n",
+ "weights_matrix = np.zeros((matrix_len, emb_dim))\n",
+ "words_found, words_unk = 0, 0\n",
+ "\n",
+ "for i in range(0, len(train.vocab)):\n",
+ " word = train.vocab.lookup_token(i)\n",
+ " try:\n",
+ " weights_matrix[i] = vectors[word]\n",
+ " words_found += 1\n",
+ " except KeyError:\n",
+ " weights_matrix[i] = np.random.normal(scale=0.6, size=(emb_dim, ))\n",
+ " words_unk += 1\n",
+ "weights_matrix = torch.from_numpy(weights_matrix).to( torch.float32)\n",
+ "print( \"Weight matrix size:\", weights_matrix.size() )\n",
+ "print( \"Words found:\", words_found )\n",
+ "print( \"Unk words:\", words_unk )"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "o-_cCnGtK0Ax",
+ "outputId": "6c0c2ea3-254a-45de-c1c6-99ed6dbbedbe"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Weight matrix size: torch.Size([43072, 300])\n",
+ "Words found: 5586\n",
+ "Unk words: 37486\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### 1.3 Exercise: Model definition\n",
+ "\n",
+ "#### a/ Define the embedding layer:\n",
+ "Now modify your model to add this embedding layer in the __init__() function below:\n",
+ "\n",
+ "* Define *self.embedding_bag*: a layer combining the word embeddings of a document. 
Here we just give the definition of the layer, i.e.:\n",
+ " * we use pre-initialized weights\n",
+ " * we combine the embeddings by averaging them\n",
+ "See ```nn.EmbeddingBag.from_pretrained( ..)```, https://pytorch.org/docs/stable/generated/torch.nn.EmbeddingBag.html\n",
+ "* Retrieve the *embedding dimension* to be used as a parameter for the first linear function (look at the *EmbeddingBag* class definition).\n",
+ "\n",
+ "#### b/ Use the embedding layer\n",
+ "Now you need to tell the model when to use this embedding layer: modify the *forward()* function so that it first *embeds* the input before going through the linear and non-linear layers.\n",
+ "\n",
+ "Look at the example in the doc: https://pytorch.org/docs/stable/generated/torch.nn.EmbeddingBag.html\n",
+ "Note that this embedding layer needs the offset information to retrieve the individual sequences / documents in the batch.\n",
+ "\n"
+ ],
+ "metadata": {
+ "id": "VcLWQgu877rQ"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "class FeedforwardNeuralNetModel(nn.Module):\n",
+ " def __init__(self, hidden_dim, output_dim, weights_matrix):\n",
+ " # calls the init function of nn.Module. Don't get confused by syntax,\n",
+ " # just always do it in an nn.Module\n",
+ " super(FeedforwardNeuralNetModel, self).__init__()\n",
+ "\n",
+ " # Embedding layer\n",
+ " # ....\n",
+ " # ----- SOLUTION\n",
+ " # mode (string, optional) – \"sum\", \"mean\" or \"max\". Default=mean.\n",
+ " self.embedding_bag = nn.EmbeddingBag.from_pretrained(\n",
+ " weights_matrix,\n",
+ " mode='mean')\n",
+ " embed_dim = self.embedding_bag.embedding_dim\n",
+ "\n",
+ " # Linear function\n",
+ " self.fc1 = nn.Linear(embed_dim, hidden_dim)\n",
+ "\n",
+ " # Non-linearity\n",
+ " self.sigmoid = nn.Sigmoid()\n",
+ "\n",
+ " # Linear function (readout)\n",
+ " self.fc2 = nn.Linear(hidden_dim, output_dim)\n",
+ "\n",
+ " def forward(self, text, offsets):\n",
+ " # Embedding layer\n",
+ " # ....\n",
+ " # ----- SOLUTION\n",
+ " embedded = self.embedding_bag(text, offsets)\n",
+ "\n",
+ " # Linear function\n",
+ " out = self.fc1(embedded)\n",
+ "\n",
+ " # Non-linearity\n",
+ " out = self.sigmoid(out)\n",
+ "\n",
+ " # Linear function (readout)\n",
+ " out = self.fc2(out)\n",
+ " return out"
+ ],
+ "metadata": {
+ "id": "fXOPuCv_vZrr"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### 1.4 Exercise: Train and evaluation functions (code given)\n",
+ "\n",
+ "Look at the code below that performs the training and evaluation of your model.\n",
+ "Note that:\n",
+ "* one epoch is one iteration over the entire training set\n",
+ "* each *input* is here a batch of several documents (here 2)\n",
+ "* the model computes a loss after making a prediction for each input / batch. We accumulate this loss, and compute a score after seeing each batch\n",
+ "* at the end of each round / epoch, we print the accumulated loss and accuracy:\n",
+ " * A good indicator that your model is doing what it is supposed to do is the loss: it should decrease during training. At the same time, the accuracy on the training set should increase.\n",
+ "* in the evaluation procedure, we have to compute scores for batches of data, that's why the code is slightly modified (use of *extend* to have a single list of predictions)\n",
+ "\n",
+ "Note: here we need to take into account the offsets in the training and evaluation procedures.\n",
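+ "\n",
+ "For intuition, here is a minimal sketch of how the offsets delimit the documents inside a flattened batch (toy values, not taken from the TP data):\n",
+ "```python\n",
+ "import torch\n",
+ "import torch.nn as nn\n",
+ "bag = nn.EmbeddingBag(num_embeddings=10, embedding_dim=3, mode='mean')\n",
+ "tokens = torch.tensor([1, 2, 3, 4, 5])  # two documents flattened: [1, 2] and [3, 4, 5]\n",
+ "offsets = torch.tensor([0, 2])  # document i starts at position offsets[i]\n",
+ "print(bag(tokens, offsets).shape)  # torch.Size([2, 3]): one averaged vector per document\n",
+ "```"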
+ ],
+ "metadata": {
+ "id": "UsXmIGqApbxj"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import os\n",
+ "\n",
+ "def my_plot(epochs, loss):\n",
+ " plt.plot(epochs, loss)\n",
+ " #fig.savefig(os.path.join('./lossGraphs', 'train.jpg'))\n",
+ "\n",
+ "def training(model, train_loader, optimizer, num_epochs=5, plot=False ):\n",
+ " loss_vals = []\n",
+ " for epoch in range(num_epochs):\n",
+ " train_loss, total_acc, total_count = 0, 0, 0\n",
+ " for input, label, offsets in train_loader:\n",
+ " # Step 1. Clearing the accumulated gradients\n",
+ " optimizer.zero_grad()\n",
+ " # Step 2. Forward pass to get output/logits\n",
+ " outputs = model( input, offsets ) # <---- extra offsets argument\n",
+ " # Step 3. Compute the loss, gradients, and update the parameters by\n",
+ " # calling optimizer.step()\n",
+ " # - Calculate Loss: softmax --> cross entropy loss\n",
+ " loss = criterion(outputs, label)\n",
+ " # - Getting gradients w.r.t. parameters\n",
+ " loss.backward()\n",
+ " # - Updating parameters\n",
+ " optimizer.step()\n",
+ " # Accumulating the loss over time\n",
+ " train_loss += loss.item()\n",
+ " total_acc += (outputs.argmax(1) == label).sum().item()\n",
+ " total_count += label.size(0)\n",
+ " # Compute accuracy on train set at each epoch\n",
+ " # NB: train_loss sums the mean loss of each batch, so its scale depends on the batch size\n",
+ " print('Epoch: {}. Loss: {}. ACC {} '.format(epoch, train_loss/len(train), total_acc/len(train)))\n",
+ " loss_vals.append(train_loss/len(train))\n",
+ " total_acc, total_count = 0, 0\n",
+ " train_loss = 0\n",
+ " if plot:\n",
+ " # plotting\n",
+ " my_plot(np.linspace(1, num_epochs, num_epochs).astype(int), loss_vals)\n",
+ "\n",
+ "\n",
+ "def evaluate( model, dev_loader ):\n",
+ " predictions = []\n",
+ " gold = []\n",
+ " with torch.no_grad():\n",
+ " for input, label, offsets in dev_loader:\n",
+ " probs = model(input, offsets) # <---- forward function with offsets\n",
+ " # -- to deal with batches\n",
+ " predictions.extend( torch.argmax(probs, dim=1).cpu().numpy() )\n",
+ " gold.extend([int(l) for l in label])\n",
+ " print(classification_report(gold, predictions))\n",
+ " return gold, predictions"
+ ],
+ "metadata": {
+ "id": "US_0JmN5phqs"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### 1.5 Exercise: run experiments\n",
+ "\n",
+ "Look at the code below; it runs experiments with the following values for the hyper-parameters:\n",
+ " * batch size = 2\n",
+ " * hidden dimension = 4\n",
+ " * learning rate = 0.1\n",
+ " * number of epochs = 5\n",
+ " * using the Cross Entropy loss function\n",
+ " * using SGD as the optimizer algorithm\n",
+ "\n",
+ "Questions:\n",
+ " * What is the input dimension?\n",
+ " * What is the output dimension?\n",
+ " * What are the hyper-parameters that could be tuned? Propose a set of values to be tested for each one of them.\n",
+ " * Run the code: what is the behaviour of the loss and accuracy?\n",
+ " * What do you think about the performance of this model?\n",
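+ "\n",
+ "A quick, empirical way to check the first two questions (a sketch using the model class defined above):\n",
+ "```python\n",
+ "m = FeedforwardNeuralNetModel(hidden_dim=4, output_dim=2, weights_matrix=weights_matrix)\n",
+ "print(m.embedding_bag.embedding_dim)  # input dimension of fc1: 300, the FastText vector size\n",
+ "print(m.fc2.out_features)  # output dimension: 2, one logit per class\n",
+ "```"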
+ ],
+ "metadata": {
+ "id": "NC2VtTmv-Q_c"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Set the values of the hyperparameters\n",
+ "hidden_dim = 4\n",
+ "learning_rate = 0.1\n",
+ "num_epochs = 5\n",
+ "criterion = nn.CrossEntropyLoss()\n",
+ "output_dim = 2"
+ ],
+ "metadata": {
+ "id": "Jod8FnWPs_Vi"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Initialize the model\n",
+ "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n",
+ "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n",
+ "model_ffnn = model_ffnn.to(device)\n",
+ "# Train the model\n",
+ "training( model_ffnn, train_loader, optimizer, num_epochs=5 )\n",
+ "# Evaluate on dev\n",
+ "gold, pred = evaluate( model_ffnn, dev_loader )"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "1Xug7ygbpAhS",
+ "outputId": "db7acdf4-0c22-4050-b960-abc7e2a1cfea"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Epoch: 0. Loss: 0.7095046862233626. ACC 0.5110403819375373 \n",
+ "Epoch: 1. Loss: 0.6778815342007775. ACC 0.566341754525562 \n",
+ "Epoch: 2. Loss: 0.6441416264388815. ACC 0.6252237915257609 \n",
+ "Epoch: 3. Loss: 0.6270657096088513. ACC 0.6467077779988064 \n",
+ "Epoch: 4. Loss: 0.6168690368104526. ACC 0.6572508454346528 \n",
+ " precision recall f1-score support\n",
+ "\n",
+ " 0 0.55 0.69 0.62 230\n",
+ " 1 0.73 0.60 0.66 319\n",
+ "\n",
+ " accuracy 0.64 549\n",
+ " macro avg 0.64 0.65 0.64 549\n",
+ "weighted avg 0.66 0.64 0.64 549\n",
+ "\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "**Note:** We don't apply a SoftMax over the output of the final layer to obtain class probabilities: the SoftMax is already applied inside the chosen loss function (*nn.CrossEntropyLoss()*, which expects raw logits). Be careful: this is not the case for all the loss functions available in PyTorch."
+ ],
+ "metadata": {
+ "id": "OBqQaAf6mxEI"
+ }
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "## Part 2 - Exercise: Tuning your model\n",
+ "\n",
+ "The model comes with a variety of hyper-parameters. To find the best model, we need to test different values for these free parameters.\n",
+ "\n",
+ "Be careful:\n",
+ "* you always optimize / fine-tune your model on the **development set**.\n",
+ "* Then you compare the results obtained with the different settings on the dev set to choose the best setting.\n",
+ "* finally, you report the results of the best model on the test set.\n",
+ "* you always keep track of your experiments, for reproducibility purposes: report the values tested for each hyper-parameter and the values used by your best model.\n",
+ "\n",
+ "In this part, you have to test different values for the following hyper-parameters:\n",
+ "\n",
+ "1. Batch size\n",
+ "2. Max number of epochs (with best batch size)\n",
+ "3. Size of the hidden layer\n",
+ "4. Activation function\n",
+ "5. Optimizer\n",
+ "6. Learning rate\n",
+ "\n",
+ "Form some hypotheses about the influence of these parameters by inspecting how they affect the loss during training and the performance of the model.\n",
+ "\n",
+ "Once done, modify your model to test a variation of the architecture. You don't have to tune your whole model again; for example, keeping the best values found previously for the hyper-parameters, just try:\n",
+ "\n",
+ "7. 
Try with 1 additional hidden layer\n",
+ "\n",
+ "**Note:** (not done below) Here you are building a report on the performance of your model. Try to organise your code to keep track of what you're doing:\n",
+ "* give a different name to each model, to be able to run them again\n",
+ "* save the results in a dictionary or a file, to be able to use them later (a minimal sketch is given at the start of section 1 below): \n",
+ " * keep in mind that you should be able to provide e.g. plots of your results (for example, plotting the accuracy for different values of a specific hyper-parameter), or analyses of your results (e.g. by inspecting the predictions of your model), so you need to be able to access the results."
+ ],
+ "metadata": {
+ "id": "1HmIthzRumir"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "from sklearn.metrics import accuracy_score, f1_score"
+ ],
+ "metadata": {
+ "id": "bS_br1eLi-X_"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# epochs, hidden, lr, batch, activation, optimizer, acc, macro-F1\n",
+ "experiments = []\n",
+ "\n",
+ "class Expe:\n",
+ " def __init__(self, epochs, hidden, lr, batch, act, opt):\n",
+ " self.epochs = epochs\n",
+ " self.hidden = hidden\n",
+ " self.lr = lr\n",
+ " self.batch = batch\n",
+ " self.activation = act\n",
+ " self.optimizer = opt\n",
+ " self.acc = None\n",
+ " self.macroF1 = None\n",
+ " self.model = None\n",
+ "\n",
+ " def set_acc(self, acc ):\n",
+ " self.acc = acc\n",
+ "\n",
+ " def set_f1( self, f1 ):\n",
+ " self.macroF1 = f1\n",
+ "\n",
+ " def set_model( self, model ):\n",
+ " self.model = model\n",
+ "\n",
+ " def set_scores( self, gold, pred ):\n",
+ " self.acc = accuracy_score( gold, pred )\n",
+ " self.macroF1 = f1_score( gold, pred, average='macro')\n",
+ "\n",
+ " def is_better( self, other_exp, score='f1' ):\n",
+ " # returns the experiment with the better dev score\n",
+ " if score == 'f1':\n",
+ " if self.macroF1 >= other_exp.macroF1:\n",
+ " return self\n",
+ " return other_exp\n",
+ " elif score == 'acc':\n",
+ " if self.acc >= other_exp.acc:\n",
+ " return self\n",
+ " return other_exp\n"
+ ],
+ "metadata": {
+ "id": "jLy2TCP3f4fE"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# For now, we keep a medium number of epochs, e.g. 50\n",
+ "num_epochs = 50"
+ ],
+ "metadata": {
+ "id": "mPl550bHgYE7"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### 1. BATCH SIZE\n",
+ "\n",
+ "We need to reload the data to change the size of the batch.\n",
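+ "\n",
+ "Below is a minimal sketch of the kind of result tracking suggested in the note above (the dictionary layout is hypothetical; the scores shown are the dev macro-F1 values obtained in the runs of this section):\n",
+ "```python\n",
+ "# after each run: results[('batch_size', batch_size)] = f1_score(gold, pred, average='macro')\n",
+ "results = {('batch_size', 2): 0.57, ('batch_size', 10): 0.53, ('batch_size', 100): 0.60}\n",
+ "sizes = [2, 10, 100]\n",
+ "plt.plot(sizes, [results[('batch_size', b)] for b in sizes])\n",
+ "plt.xlabel('batch size'); plt.ylabel('dev macro-F1')\n",
+ "```"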
+ ],
+ "metadata": {
+ "id": "YXarvcQk4uEo"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Hyperparameters\n",
+ "hidden_dim = 4\n",
+ "learning_rate = 0.1"
+ ],
+ "metadata": {
+ "id": "V1RYBCEm4wNu"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "-----> BATCH SIZE 2"
+ ],
+ "metadata": {
+ "id": "XbB7T1Un5ZET"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# To optimize\n",
+ "batch_size = 2\n",
+ "\n",
+ "train_loader = DataLoader(train, batch_size=batch_size, shuffle=True, collate_fn=collate_fn) # Note: the batch size is set here, with shuffle=True for training!\n",
+ "dev_loader = DataLoader(dev, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)"
+ ],
+ "metadata": {
+ "id": "y1uRBZ5t5MsC"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Initialize the model\n",
+ "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n",
+ "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n",
+ "model_ffnn = model_ffnn.to(device)\n",
+ "# Train the model\n",
+ "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs, plot=True )\n",
+ "# Evaluate on dev\n",
+ "gold, pred = evaluate( model_ffnn, dev_loader )"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "717e5ae1-81f6-446c-cf9b-0da303ff8d70",
+ "id": "RujCyodz4wNv"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Epoch: 0. Loss: 0.3443748711508122. ACC 0.5384921424308733 \n",
+ "Epoch: 1. Loss: 0.3238032021863052. ACC 0.6248259399244082 \n",
+ "Epoch: 2. Loss: 0.3119724961444187. ACC 0.6510841456136861 \n",
+ "Epoch: 3. Loss: 0.30653937585742286. ACC 0.6719713546847026 \n",
+ "Epoch: 4. Loss: 0.3030025432349846. ACC 0.66938531927591 \n",
+ "Epoch: 5. Loss: 0.3010095448503858. ACC 0.6741595384921424 \n",
+ "Epoch: 6. Loss: 0.2988497476623882. ACC 0.6851004575293416 \n",
+ "Epoch: 7. Loss: 0.29819395815238864. ACC 0.6793316093097275 \n",
+ "Epoch: 8. Loss: 0.2980856440660492. ACC 0.6805251641137856 \n",
+ "Epoch: 9. Loss: 0.2954139497789908. ACC 0.6845036801273126 \n",
+ "Epoch: 10. Loss: 0.2945998609125294. ACC 0.6839069027252834 \n",
+ "Epoch: 11. Loss: 0.2934940134095273. ACC 0.6884821961408395 \n",
+ "Epoch: 12. Loss: 0.2927938977151558. ACC 0.6974338571712752 \n",
+ "Epoch: 13. Loss: 0.29124776529333857. ACC 0.6924607121543664 \n",
+ "Epoch: 14. Loss: 0.2906765501863954. ACC 0.6932564153570718 \n",
+ "Epoch: 15. Loss: 0.29035230220010366. ACC 0.6930574895563955 \n",
+ "Epoch: 16. Loss: 0.2914984108932407. ACC 0.6928585637557191 \n",
+ "Epoch: 17. Loss: 0.2891622580364065. ACC 0.6920628605530137 \n",
+ "Epoch: 18. Loss: 0.289319252017652. ACC 0.696240302367217 \n",
+ "Epoch: 19. Loss: 0.2874025212341828. ACC 0.6922617863536901 \n",
+ "Epoch: 20. Loss: 0.2873479837082254. ACC 0.6948478217624826 \n",
+ "Epoch: 21. Loss: 0.28698403910261894. ACC 0.7020091505868311 \n",
+ "Epoch: 22. Loss: 0.2863148364384985. ACC 0.6998209667793913 \n",
+ "Epoch: 23. Loss: 0.2871159102981871. ACC 0.6998209667793913 \n",
+ "Epoch: 24. Loss: 0.2862042659926609. ACC 0.6998209667793913 \n",
+ "Epoch: 25. Loss: 0.2856487465593671. ACC 0.7024070021881839 \n",
+ "Epoch: 26. Loss: 0.28580112889948556. ACC 0.7020091505868311 \n",
+ "Epoch: 27. Loss: 0.2858768165732547. ACC 0.7032027053908892 \n",
+ "Epoch: 28. Loss: 0.2857147929743481. ACC 0.6982295603739805 \n",
+ "Epoch: 29. 
Loss: 0.28580333072390546. ACC 0.705390889198329 \n", + "Epoch: 30. Loss: 0.28389336485419403. ACC 0.7014123731848021 \n", + "Epoch: 31. Loss: 0.28368568674296063. ACC 0.7000198925800676 \n", + "Epoch: 32. Loss: 0.28407661078869406. ACC 0.7041973343942709 \n", + "Epoch: 33. Loss: 0.28380351586914665. ACC 0.7020091505868311 \n", + "Epoch: 34. Loss: 0.2823989510896347. ACC 0.7085737020091506 \n", + "Epoch: 35. Loss: 0.28255766987436. ACC 0.7006166699820967 \n", + "Epoch: 36. Loss: 0.2825257646711738. ACC 0.7111597374179431 \n", + "Epoch: 37. Loss: 0.2827194988978047. ACC 0.7045951859956237 \n", + "Epoch: 38. Loss: 0.2819667985492004. ACC 0.7075790730057688 \n", + "Epoch: 39. Loss: 0.2813402628642808. ACC 0.7018102247861547 \n", + "Epoch: 40. Loss: 0.2808800577325922. ACC 0.7045951859956237 \n", + "Epoch: 41. Loss: 0.27942367505207316. ACC 0.7063855182017108 \n", + "Epoch: 42. Loss: 0.27922166472362764. ACC 0.7089715536105032 \n", + "Epoch: 43. Loss: 0.2795443612313695. ACC 0.7043962601949473 \n", + "Epoch: 44. Loss: 0.27878284501995626. ACC 0.7085737020091506 \n", + "Epoch: 45. Loss: 0.27964969383012067. ACC 0.709369405211856 \n", + "Epoch: 46. Loss: 0.2784189229935959. ACC 0.7071812214044162 \n", + "Epoch: 47. Loss: 0.27862977107046777. ACC 0.7067833698030634 \n", + "Epoch: 48. Loss: 0.27821397670806464. ACC 0.7109608116172668 \n", + "Epoch: 49. Loss: 0.2767134781579315. ACC 0.709966182613885 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.62 0.30 0.41 230\n", + " 1 0.63 0.87 0.73 319\n", + "\n", + " accuracy 0.63 549\n", + " macro avg 0.63 0.58 0.57 549\n", + "weighted avg 0.63 0.63 0.60 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "pgkEIDRgiv1W" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "UQCZKqFZFHev" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----> BATCH SIZE 100" + ], + "metadata": { + "id": "Wvf6nYZ55ff3" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "batch_size = 100\n", + "\n", + "train_loader = DataLoader(train, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)\n", + "dev_loader = DataLoader(dev, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)" + ], + "metadata": { + "id": "q7nMbTzl5ff4" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "75280b62-c873-477e-a14c-b9c5ce3028b5", + "id": "ZuMZ7YyV5ff4" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.007073940957116635. ACC 0.5084543465287448 \n", + "Epoch: 1. Loss: 0.007029072923453494. ACC 0.5126317883429481 \n", + "Epoch: 2. 
Loss: 0.007021559311567207. ACC 0.5201909687686493 \n", + "Epoch: 3. Loss: 0.007013624080113869. ACC 0.530336184603143 \n", + "Epoch: 4. Loss: 0.007002492050112468. ACC 0.5398846230356077 \n", + "Epoch: 5. Loss: 0.006991920661850386. ACC 0.5621643127113587 \n", + "Epoch: 6. Loss: 0.0069761488602226385. ACC 0.5528148000795703 \n", + "Epoch: 7. Loss: 0.006963936500481021. ACC 0.5892182216033419 \n", + "Epoch: 8. Loss: 0.0069445139232534574. ACC 0.6095086532723294 \n", + "Epoch: 9. Loss: 0.006923313166480048. ACC 0.5987666600358067 \n", + "Epoch: 10. Loss: 0.0069005172414383165. ACC 0.6363636363636364 \n", + "Epoch: 11. Loss: 0.006863713620172383. ACC 0.6208474239108812 \n", + "Epoch: 12. Loss: 0.0068298692918568405. ACC 0.6272130495325243 \n", + "Epoch: 13. Loss: 0.006794634694391012. ACC 0.6421324845832505 \n", + "Epoch: 14. Loss: 0.0067576949084661156. ACC 0.6530734036204495 \n", + "Epoch: 15. Loss: 0.006705810114524937. ACC 0.6538691068231549 \n", + "Epoch: 16. Loss: 0.006661873440489229. ACC 0.6574497712353292 \n", + "Epoch: 17. Loss: 0.006612105969052821. ACC 0.659637955042769 \n", + "Epoch: 18. Loss: 0.006558903671767002. ACC 0.6602347324447981 \n", + "Epoch: 19. Loss: 0.006500782574387462. ACC 0.668191764471852 \n", + "Epoch: 20. Loss: 0.006468181039079037. ACC 0.66938531927591 \n", + "Epoch: 21. Loss: 0.006398217683287105. ACC 0.6685896160732047 \n", + "Epoch: 22. Loss: 0.00635060680441385. ACC 0.6757509448975532 \n", + "Epoch: 23. Loss: 0.006308472675856952. ACC 0.672170280485379 \n", + "Epoch: 24. Loss: 0.006279352860444105. ACC 0.6856972349313706 \n", + "Epoch: 25. Loss: 0.0062244532595388405. ACC 0.6854983091306942 \n", + "Epoch: 26. Loss: 0.006182033960639494. ACC 0.6821165705191964 \n", + "Epoch: 27. Loss: 0.0061473599930560825. ACC 0.6829122737219018 \n", + "Epoch: 28. Loss: 0.006125714389045219. ACC 0.6892778993435449 \n", + "Epoch: 29. Loss: 0.00609686327473838. ACC 0.6912671573503083 \n", + "Epoch: 30. Loss: 0.006095695369922311. ACC 0.6886811219415158 \n", + "Epoch: 31. Loss: 0.006049742109618558. ACC 0.691665008951661 \n", + "Epoch: 32. Loss: 0.00603084385644045. ACC 0.6972349313705988 \n", + "Epoch: 33. Loss: 0.0059978616671602985. ACC 0.6984284861746568 \n", + "Epoch: 34. Loss: 0.005998012048244381. ACC 0.6986274119753332 \n", + "Epoch: 35. Loss: 0.005985496981802721. ACC 0.6932564153570718 \n", + "Epoch: 36. Loss: 0.005963753297501111. ACC 0.6952456733638352 \n", + "Epoch: 37. Loss: 0.005971982082126391. ACC 0.6998209667793913 \n", + "Epoch: 38. Loss: 0.005939266916200658. ACC 0.6964392281678934 \n", + "Epoch: 39. Loss: 0.005949360749961985. ACC 0.6982295603739805 \n", + "Epoch: 40. Loss: 0.005934212485008551. ACC 0.7012134473841257 \n", + "Epoch: 41. Loss: 0.005918450631321319. ACC 0.6948478217624826 \n", + "Epoch: 42. Loss: 0.005912830631847241. ACC 0.6972349313705988 \n", + "Epoch: 43. Loss: 0.005908906068530594. ACC 0.6954445991645116 \n", + "Epoch: 44. Loss: 0.005885356933809261. ACC 0.7026059279888601 \n", + "Epoch: 45. Loss: 0.005892672879287541. ACC 0.7036005569922419 \n", + "Epoch: 46. Loss: 0.0058904265359070755. ACC 0.7010145215834493 \n", + "Epoch: 47. Loss: 0.005880098729904779. ACC 0.7020091505868311 \n", + "Epoch: 48. Loss: 0.005885767656883663. ACC 0.695643524965188 \n", + "Epoch: 49. Loss: 0.005888840853160102. 
ACC 0.7039984085935946 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.64 0.36 0.46 230\n", + " 1 0.65 0.86 0.74 319\n", + "\n", + " accuracy 0.65 549\n", + " macro avg 0.64 0.61 0.60 549\n", + "weighted avg 0.65 0.65 0.62 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "-kF0PCLZjyRj" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----> BATCH SIZE 10" + ], + "metadata": { + "id": "jemPmZxi62n0" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "batch_size = 10\n", + "\n", + "train_loader = DataLoader(train, batch_size=batch_size, shuffle=True, collate_fn=collate_fn) #<-- use shuffle = True instead\n", + "dev_loader = DataLoader(dev, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)" + ], + "metadata": { + "id": "Mx8IMlLO62n2" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "7c1d6248-09c6-49bd-c8b5-cd57df4ac8d2", + "id": "LBjLVvoM62n2" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06943608308516655. ACC 0.5130296399443007 \n", + "Epoch: 1. Loss: 0.06835846744903115. ACC 0.5575890192958026 \n", + "Epoch: 2. Loss: 0.06575914487607298. ACC 0.6242291625223791 \n", + "Epoch: 3. Loss: 0.06329712385516625. ACC 0.6427292619852795 \n", + "Epoch: 4. Loss: 0.06130664807938666. ACC 0.668191764471852 \n", + "Epoch: 5. Loss: 0.06052716653508743. ACC 0.673363835289437 \n", + "Epoch: 6. Loss: 0.060176474734568705. ACC 0.682514422120549 \n", + "Epoch: 7. Loss: 0.05978312572048985. ACC 0.6880843445394867 \n", + "Epoch: 8. Loss: 0.05943953453225788. ACC 0.6823154963198726 \n", + "Epoch: 9. Loss: 0.05929612556721786. ACC 0.6892778993435449 \n", + "Epoch: 10. Loss: 0.05913474476429917. ACC 0.6880843445394867 \n", + "Epoch: 11. Loss: 0.05913642076106628. ACC 0.685896160732047 \n", + "Epoch: 12. Loss: 0.05894601099624391. ACC 0.6896757509448975 \n", + "Epoch: 13. Loss: 0.05861684172509654. ACC 0.6942510443604536 \n", + "Epoch: 14. Loss: 0.05863150297374167. ACC 0.6860950865327233 \n", + "Epoch: 15. Loss: 0.05871834742731239. ACC 0.6898746767455739 \n", + "Epoch: 16. Loss: 0.05839123122908378. ACC 0.6896757509448975 \n", + "Epoch: 17. Loss: 0.05847804964193416. ACC 0.6884821961408395 \n", + "Epoch: 18. Loss: 0.058277733841852045. ACC 0.6908693057489557 \n", + "Epoch: 19. Loss: 0.058386488127523464. ACC 0.691665008951661 \n", + "Epoch: 20. Loss: 0.05812547799486609. ACC 0.6974338571712752 \n", + "Epoch: 21. Loss: 0.05797829210272076. ACC 0.6974338571712752 \n", + "Epoch: 22. Loss: 0.0580150423086446. ACC 0.695643524965188 \n", + "Epoch: 23. Loss: 0.05798782616326082. 
ACC 0.6920628605530137 \n",
+ "Epoch: 24. Loss: 0.05793982845000583. ACC 0.6988263377760096 \n",
+ "Epoch: 25. Loss: 0.05766732708251047. ACC 0.6998209667793913 \n",
+ "Epoch: 26. Loss: 0.05790334768031591. ACC 0.6976327829719514 \n",
+ "Epoch: 27. Loss: 0.05761318465741842. ACC 0.6960413765665406 \n",
+ "Epoch: 28. Loss: 0.05759310899611755. ACC 0.6986274119753332 \n",
+ "Epoch: 29. Loss: 0.057751722742032885. ACC 0.6944499701611299 \n",
+ "Epoch: 30. Loss: 0.05742728516314218. ACC 0.6986274119753332 \n",
+ "Epoch: 31. Loss: 0.05744654730463303. ACC 0.6976327829719514 \n",
+ "Epoch: 32. Loss: 0.057447121611451546. ACC 0.6988263377760096 \n",
+ "Epoch: 33. Loss: 0.05743587963240081. ACC 0.7012134473841257 \n",
+ "Epoch: 34. Loss: 0.057487206849611755. ACC 0.700815595782773 \n",
+ "Epoch: 35. Loss: 0.05745081912102935. ACC 0.7030037795902129 \n",
+ "Epoch: 36. Loss: 0.05718231361832893. ACC 0.6978317087726278 \n",
+ "Epoch: 37. Loss: 0.05748365801372418. ACC 0.6958424507658644 \n",
+ "Epoch: 38. Loss: 0.057017978526072356. ACC 0.7067833698030634 \n",
+ "Epoch: 39. Loss: 0.057033468252008994. ACC 0.7022080763875075 \n",
+ "Epoch: 40. Loss: 0.05710869398829902. ACC 0.6996220409787149 \n",
+ "Epoch: 41. Loss: 0.056759040229880646. ACC 0.7006166699820967 \n",
+ "Epoch: 42. Loss: 0.057205665720986495. ACC 0.700815595782773 \n",
+ "Epoch: 43. Loss: 0.05693956865144103. ACC 0.7063855182017108 \n",
+ "Epoch: 44. Loss: 0.05696175061425656. ACC 0.7067833698030634 \n",
+ "Epoch: 45. Loss: 0.057026335483804855. ACC 0.7055898149990054 \n",
+ "Epoch: 46. Loss: 0.056927547490167554. ACC 0.6998209667793913 \n",
+ "Epoch: 47. Loss: 0.05687895283230451. ACC 0.7037994827929183 \n",
+ "Epoch: 48. Loss: 0.05670617793814335. ACC 0.7065844440023871 \n",
+ "Epoch: 49. Loss: 0.05673999334154723. ACC 0.7041973343942709 \n",
+ " precision recall f1-score support\n",
+ "\n",
+ " 0 0.63 0.23 0.33 230\n",
+ " 1 0.62 0.90 0.73 319\n",
+ "\n",
+ " accuracy 0.62 549\n",
+ " macro avg 0.62 0.56 0.53 549\n",
+ "weighted avg 0.62 0.62 0.57 549\n",
+ "\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# epochs, hidden, lr, batch, act, opt\n",
+ "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n",
+ "exp.set_model( model_ffnn )\n",
+ "exp.set_scores( gold, pred )\n",
+ "experiments.append( exp )"
+ ],
+ "metadata": {
+ "id": "jnf8dVIRkJUI"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "------> Some performance loss compared to batch size 2, but much faster. We keep this one (we could have tried a few more values here)"
+ ],
+ "metadata": {
+ "id": "VmXdgAMC7Ap0"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# So now we keep the data loaded with batches of size 10\n",
+ "batch_size = 10\n",
+ "\n",
+ "train_loader = DataLoader(train, batch_size=batch_size, shuffle=True, collate_fn=collate_fn) # <-- shuffle=True for training, as above\n",
+ "dev_loader = DataLoader(dev, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)"
+ ],
+ "metadata": {
+ "id": "PaaocPgK7Gfy"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "#### 3. 
HIDDEN SIZE" + ], + "metadata": { + "id": "NOCfrCXHXuHF" + } + }, + { + "cell_type": "code", + "source": [ + "# Already optimized\n", + "batch_size = 10" + ], + "metadata": { + "id": "eoDeM5ldX2gk" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Hyper-parameters\n", + "learning_rate = 0.1\n", + "criterion = nn.CrossEntropyLoss()\n", + "output_dim = 2" + ], + "metadata": { + "id": "Decj_K3OXuHG" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----> HIDDEN DIM 4" + ], + "metadata": { + "id": "qz3TaBqMXuHH" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "hidden_dim = 4" + ], + "metadata": { + "id": "_CMEWtIuX-cQ" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "1eab3011-a409-43e5-b5a1-399724a17bbe", + "id": "m72hePuXXuHI" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06924030699928167. ACC 0.519395265565944 \n", + "Epoch: 1. Loss: 0.06755568319792678. ACC 0.5820568927789934 \n", + "Epoch: 2. Loss: 0.0643479654749606. ACC 0.6327829719514622 \n", + "Epoch: 3. Loss: 0.06209694384722295. ACC 0.6562562164312712 \n", + "Epoch: 4. Loss: 0.061107959345602626. ACC 0.6660035806644121 \n", + "Epoch: 5. Loss: 0.060512718390536305. ACC 0.66938531927591 \n", + "Epoch: 6. Loss: 0.06015065681728047. ACC 0.6735627610901134 \n", + "Epoch: 7. Loss: 0.0598090323511165. ACC 0.6787348319076985 \n", + "Epoch: 8. Loss: 0.059600808611204495. ACC 0.6829122737219018 \n", + "Epoch: 9. Loss: 0.05937369354254497. ACC 0.6860950865327233 \n", + "Epoch: 10. Loss: 0.05924973485838331. ACC 0.6866918639347523 \n", + "Epoch: 11. Loss: 0.05908976297552125. ACC 0.6878854187388104 \n", + "Epoch: 12. Loss: 0.059003198890247806. ACC 0.6878854187388104 \n", + "Epoch: 13. Loss: 0.05887418596460446. ACC 0.6884821961408395 \n", + "Epoch: 14. Loss: 0.05877320526228146. ACC 0.6894768251442212 \n", + "Epoch: 15. Loss: 0.058660052744892166. ACC 0.6908693057489557 \n", + "Epoch: 16. Loss: 0.05857120004638374. ACC 0.6920628605530137 \n", + "Epoch: 17. Loss: 0.05846636946736782. ACC 0.6930574895563955 \n", + "Epoch: 18. Loss: 0.0583727004301816. ACC 0.6924607121543664 \n", + "Epoch: 19. Loss: 0.05829021169244354. ACC 0.6930574895563955 \n", + "Epoch: 20. Loss: 0.058212152936241844. ACC 0.6938531927591008 \n", + "Epoch: 21. Loss: 0.05814118327334692. ACC 0.6940521185597772 \n", + "Epoch: 22. Loss: 0.05808073776081776. ACC 0.6944499701611299 \n", + "Epoch: 23. Loss: 0.058024488542149644. ACC 0.695046747563159 \n", + "Epoch: 24. Loss: 0.05797648725107458. ACC 0.696837079769246 \n", + "Epoch: 25. Loss: 0.05792635114856475. ACC 0.6964392281678934 \n", + "Epoch: 26. Loss: 0.05788604337414525. ACC 0.6974338571712752 \n", + "Epoch: 27. Loss: 0.05784654567510583. ACC 0.6970360055699224 \n", + "Epoch: 28. Loss: 0.057807977029526475. ACC 0.6982295603739805 \n", + "Epoch: 29. 
Loss: 0.05776816355748136. ACC 0.6986274119753332 \n",
+ "Epoch: 30. Loss: 0.05773178896634603. ACC 0.6988263377760096 \n",
+ "Epoch: 31. Loss: 0.057694174670926475. ACC 0.6992241893773622 \n",
+ "Epoch: 32. Loss: 0.05765648317522004. ACC 0.6998209667793913 \n",
+ "Epoch: 33. Loss: 0.05762090813646455. ACC 0.6998209667793913 \n",
+ "Epoch: 34. Loss: 0.057591354637087186. ACC 0.6998209667793913 \n",
+ "Epoch: 35. Loss: 0.057559573142884714. ACC 0.700218818380744 \n",
+ "Epoch: 36. Loss: 0.057518420771715865. ACC 0.700815595782773 \n",
+ "Epoch: 37. Loss: 0.057511935004994484. ACC 0.7000198925800676 \n",
+ "Epoch: 38. Loss: 0.057468027884286225. ACC 0.7012134473841257 \n",
+ "Epoch: 39. Loss: 0.057451791339539995. ACC 0.700815595782773 \n",
+ "Epoch: 40. Loss: 0.05742049626880074. ACC 0.7012134473841257 \n",
+ "Epoch: 41. Loss: 0.05740073836060815. ACC 0.7006166699820967 \n",
+ "Epoch: 42. Loss: 0.05737620045514142. ACC 0.7010145215834493 \n",
+ "Epoch: 43. Loss: 0.05734994709313876. ACC 0.7012134473841257 \n",
+ "Epoch: 44. Loss: 0.057335071667350024. ACC 0.7018102247861547 \n",
+ "Epoch: 45. Loss: 0.05730531897795279. ACC 0.7016112989854785 \n",
+ "Epoch: 46. Loss: 0.057294652366913265. ACC 0.7020091505868311 \n",
+ "Epoch: 47. Loss: 0.05726205780939148. ACC 0.7020091505868311 \n",
+ "Epoch: 48. Loss: 0.05724721420752687. ACC 0.7028048537895365 \n",
+ "Epoch: 49. Loss: 0.05722641405989072. ACC 0.7026059279888601 \n",
+ " precision recall f1-score support\n",
+ "\n",
+ " 0 0.64 0.34 0.44 230\n",
+ " 1 0.64 0.86 0.74 319\n",
+ "\n",
+ " accuracy 0.64 549\n",
+ " macro avg 0.64 0.60 0.59 549\n",
+ "weighted avg 0.64 0.64 0.61 549\n",
+ "\n"
+ ]
+ }
+ ]
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# epochs, hidden, lr, batch, act, opt\n",
+ "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n",
+ "exp.set_model( model_ffnn )\n",
+ "exp.set_scores( gold, pred )\n",
+ "experiments.append( exp )"
+ ],
+ "metadata": {
+ "id": "-_r5Ct13kTEu"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "-----> HIDDEN DIM 10"
+ ],
+ "metadata": {
+ "id": "CHMNOXBcaYuE"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# To optimize\n",
+ "hidden_dim = 10"
+ ],
+ "metadata": {
+ "id": "6LLePhm2aYuF"
+ },
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Initialize the model\n",
+ "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n",
+ "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n",
+ "model_ffnn = model_ffnn.to(device)\n",
+ "# Train the model\n",
+ "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n",
+ "# Evaluate on dev\n",
+ "gold, pred = evaluate( model_ffnn, dev_loader )"
+ ],
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "outputId": "c498c00a-62f8-4d5e-baac-5ce2f5bf0056",
+ "id": "kLkludZnaYuF"
+ },
+ "execution_count": null,
+ "outputs": [
+ {
+ "output_type": "stream",
+ "name": "stdout",
+ "text": [
+ "Epoch: 0. Loss: 0.06896211096940232. ACC 0.5315297394072012 \n",
+ "Epoch: 1. Loss: 0.06637562880441117. ACC 0.6091108016709768 \n",
+ "Epoch: 2. Loss: 0.06341122994015984. ACC 0.6439228167893376 \n",
+ "Epoch: 3. Loss: 0.061803018321141066. ACC 0.65904117764074 \n",
+ "Epoch: 4. Loss: 0.06101972030696941. ACC 0.6660035806644121 \n",
+ "Epoch: 5. Loss: 0.06050558253953773. ACC 0.6689874676745574 \n",
+ "Epoch: 6. Loss: 0.060153545039962154. 
ACC 0.6723692062860553 \n", + "Epoch: 7. Loss: 0.059852006238806855. ACC 0.6763477222995823 \n", + "Epoch: 8. Loss: 0.05964551357689875. ACC 0.6809230157151382 \n", + "Epoch: 9. Loss: 0.05943929881112391. ACC 0.6835090511239308 \n", + "Epoch: 10. Loss: 0.059304041406708495. ACC 0.6849015317286652 \n", + "Epoch: 11. Loss: 0.059154149585816454. ACC 0.6852993833300179 \n", + "Epoch: 12. Loss: 0.05905471958838046. ACC 0.686492938134076 \n", + "Epoch: 13. Loss: 0.058940565435507436. ACC 0.6866918639347523 \n", + "Epoch: 14. Loss: 0.05886337150392936. ACC 0.6878854187388104 \n", + "Epoch: 15. Loss: 0.058780789244879256. ACC 0.6884821961408395 \n", + "Epoch: 16. Loss: 0.05871899085986773. ACC 0.6892778993435449 \n", + "Epoch: 17. Loss: 0.05865352336848638. ACC 0.6892778993435449 \n", + "Epoch: 18. Loss: 0.0586048074619467. ACC 0.6908693057489557 \n", + "Epoch: 19. Loss: 0.058552351295770175. ACC 0.6942510443604536 \n", + "Epoch: 20. Loss: 0.05850790909087419. ACC 0.6932564153570718 \n", + "Epoch: 21. Loss: 0.058467955742959876. ACC 0.6936542669584245 \n", + "Epoch: 22. Loss: 0.05843013882494741. ACC 0.6944499701611299 \n", + "Epoch: 23. Loss: 0.05839480876756611. ACC 0.696240302367217 \n", + "Epoch: 24. Loss: 0.058362456553257884. ACC 0.6964392281678934 \n", + "Epoch: 25. Loss: 0.058333588234586696. ACC 0.6970360055699224 \n", + "Epoch: 26. Loss: 0.0583048580547486. ACC 0.6970360055699224 \n", + "Epoch: 27. Loss: 0.058276971534964625. ACC 0.6970360055699224 \n", + "Epoch: 28. Loss: 0.05825454208896865. ACC 0.696837079769246 \n", + "Epoch: 29. Loss: 0.05823096667935319. ACC 0.6980306345733042 \n", + "Epoch: 30. Loss: 0.05820865520360054. ACC 0.6978317087726278 \n", + "Epoch: 31. Loss: 0.05818741211175966. ACC 0.6982295603739805 \n", + "Epoch: 32. Loss: 0.05816647641602344. ACC 0.6982295603739805 \n", + "Epoch: 33. Loss: 0.058146640851053255. ACC 0.6980306345733042 \n", + "Epoch: 34. Loss: 0.05812764217584632. ACC 0.6986274119753332 \n", + "Epoch: 35. Loss: 0.058108912041677216. ACC 0.6988263377760096 \n", + "Epoch: 36. Loss: 0.058093623242796544. ACC 0.6988263377760096 \n", + "Epoch: 37. Loss: 0.058077310618240646. ACC 0.6990252635766859 \n", + "Epoch: 38. Loss: 0.058061877924001117. ACC 0.6998209667793913 \n", + "Epoch: 39. Loss: 0.05804592890319232. ACC 0.6990252635766859 \n", + "Epoch: 40. Loss: 0.0580320079342755. ACC 0.6994231151780386 \n", + "Epoch: 41. Loss: 0.058014122017434895. ACC 0.6992241893773622 \n", + "Epoch: 42. Loss: 0.05799840642344959. ACC 0.6998209667793913 \n", + "Epoch: 43. Loss: 0.05798272412364422. ACC 0.6996220409787149 \n", + "Epoch: 44. Loss: 0.0579593316008088. ACC 0.6998209667793913 \n", + "Epoch: 45. Loss: 0.057939109239991814. ACC 0.7000198925800676 \n", + "Epoch: 46. Loss: 0.057923655896958. ACC 0.700218818380744 \n", + "Epoch: 47. Loss: 0.05789986932778041. ACC 0.6998209667793913 \n", + "Epoch: 48. Loss: 0.057867199544145875. ACC 0.6980306345733042 \n", + "Epoch: 49. Loss: 0.057830534624915865. 
ACC 0.6984284861746568 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.63 0.35 0.45 230\n", + " 1 0.64 0.85 0.73 319\n", + "\n", + " accuracy 0.64 549\n", + " macro avg 0.64 0.60 0.59 549\n", + "weighted avg 0.64 0.64 0.61 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "m64cYF23kVjf" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----> HIDDEN DIM 30" + ], + "metadata": { + "id": "ciuX1XwFatK1" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "hidden_dim = 30" + ], + "metadata": { + "id": "AC0RfEpZatK1" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "e3752e14-9fc9-45a2-b293-8622628cf104", + "id": "5P5QvBFAatK2" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06916412696322366. ACC 0.5267555201909687 \n", + "Epoch: 1. Loss: 0.06722096759423551. ACC 0.5904117764074001 \n", + "Epoch: 2. Loss: 0.06412798812476166. ACC 0.6339765267555202 \n", + "Epoch: 3. Loss: 0.06214847299932651. ACC 0.6526755520190969 \n", + "Epoch: 4. Loss: 0.06124751349555016. ACC 0.65904117764074 \n", + "Epoch: 5. Loss: 0.06071575187844739. ACC 0.6650089516610305 \n", + "Epoch: 6. Loss: 0.06035083478977316. ACC 0.6713745772826736 \n", + "Epoch: 7. Loss: 0.06005531560124947. ACC 0.6751541674955241 \n", + "Epoch: 8. Loss: 0.05984238558434956. ACC 0.6789337577083748 \n", + "Epoch: 9. Loss: 0.05964589461621978. ACC 0.6799283867117565 \n", + "Epoch: 10. Loss: 0.05950711206007791. ACC 0.6829122737219018 \n", + "Epoch: 11. Loss: 0.05936534083551172. ACC 0.6847026059279888 \n", + "Epoch: 12. Loss: 0.059265624250833523. ACC 0.6860950865327233 \n", + "Epoch: 13. Loss: 0.05915929378561123. ACC 0.6856972349313706 \n", + "Epoch: 14. Loss: 0.05908176730351157. ACC 0.6856972349313706 \n", + "Epoch: 15. Loss: 0.059000206131192404. ACC 0.6854983091306942 \n", + "Epoch: 16. Loss: 0.058936901528389006. ACC 0.686492938134076 \n", + "Epoch: 17. Loss: 0.05887285243072874. ACC 0.6874875671374577 \n", + "Epoch: 18. Loss: 0.05881847355530327. ACC 0.6874875671374577 \n", + "Epoch: 19. Loss: 0.058766317896371524. ACC 0.6878854187388104 \n", + "Epoch: 20. Loss: 0.05872137367025815. ACC 0.6888800477421921 \n", + "Epoch: 21. Loss: 0.058679233980814365. ACC 0.6898746767455739 \n", + "Epoch: 22. Loss: 0.058641203697856054. ACC 0.6912671573503083 \n", + "Epoch: 23. Loss: 0.058605874156249974. ACC 0.6904714541476029 \n", + "Epoch: 24. Loss: 0.05857295907703337. ACC 0.691068231549632 \n", + "Epoch: 25. Loss: 0.05854129607477789. ACC 0.6914660831509847 \n", + "Epoch: 26. Loss: 0.05851208630105196. ACC 0.6918639347523374 \n", + "Epoch: 27. 
Loss: 0.058484796124138275. ACC 0.6924607121543664 \n", + "Epoch: 28. Loss: 0.0584593109767243. ACC 0.6930574895563955 \n", + "Epoch: 29. Loss: 0.05843541669352334. ACC 0.6930574895563955 \n", + "Epoch: 30. Loss: 0.05841283296552263. ACC 0.6928585637557191 \n", + "Epoch: 31. Loss: 0.058391570612144963. ACC 0.6934553411577482 \n", + "Epoch: 32. Loss: 0.058371144431187995. ACC 0.6940521185597772 \n", + "Epoch: 33. Loss: 0.05835203282492095. ACC 0.6948478217624826 \n", + "Epoch: 34. Loss: 0.058334296909743566. ACC 0.695046747563159 \n", + "Epoch: 35. Loss: 0.0583169173122472. ACC 0.695046747563159 \n", + "Epoch: 36. Loss: 0.058300359036402176. ACC 0.6952456733638352 \n", + "Epoch: 37. Loss: 0.05828488306447149. ACC 0.695643524965188 \n", + "Epoch: 38. Loss: 0.05826912266918317. ACC 0.6958424507658644 \n", + "Epoch: 39. Loss: 0.05825412129345627. ACC 0.696240302367217 \n", + "Epoch: 40. Loss: 0.058240195824845256. ACC 0.6966381539685698 \n", + "Epoch: 41. Loss: 0.05822649668077363. ACC 0.6964392281678934 \n", + "Epoch: 42. Loss: 0.05821330964221427. ACC 0.6954445991645116 \n", + "Epoch: 43. Loss: 0.05820075450537336. ACC 0.6960413765665406 \n", + "Epoch: 44. Loss: 0.058187138644179175. ACC 0.6964392281678934 \n", + "Epoch: 45. Loss: 0.05817102754545278. ACC 0.696837079769246 \n", + "Epoch: 46. Loss: 0.05815539926569149. ACC 0.6970360055699224 \n", + "Epoch: 47. Loss: 0.058140018178616554. ACC 0.6964392281678934 \n", + "Epoch: 48. Loss: 0.058125598893764126. ACC 0.6970360055699224 \n", + "Epoch: 49. Loss: 0.0581111740124541. ACC 0.6972349313705988 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.62 0.27 0.37 230\n", + " 1 0.62 0.88 0.73 319\n", + "\n", + " accuracy 0.62 549\n", + " macro avg 0.62 0.57 0.55 549\n", + "weighted avg 0.62 0.62 0.58 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "1fK2cOC-kZxM" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----> HIDDEN DIM 64" + ], + "metadata": { + "id": "vdAowqjnaM6m" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "hidden_dim = 64" + ], + "metadata": { + "id": "s3d0b5JOaM6n" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "39645515-7814-4c72-928c-f07881f21ea9", + "id": "dPVElrroaM6n" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06918686282120146. ACC 0.5289437039984086 \n", + "Epoch: 1. Loss: 0.06731588176023849. ACC 0.5965784762283668 \n", + "Epoch: 2. Loss: 0.06427250460741746. ACC 0.6327829719514622 \n", + "Epoch: 3. Loss: 0.062290343412803754. ACC 0.6532723294211259 \n", + "Epoch: 4. Loss: 0.06136146768177625. 
ACC 0.6610304356475035 \n", + "Epoch: 5. Loss: 0.06083343859953887. ACC 0.6642132484583251 \n", + "Epoch: 6. Loss: 0.06044989986141752. ACC 0.6689874676745574 \n", + "Epoch: 7. Loss: 0.06017542718916165. ACC 0.6731649094887607 \n", + "Epoch: 8. Loss: 0.05993544316111584. ACC 0.6765466481002586 \n", + "Epoch: 9. Loss: 0.05976648447004682. ACC 0.6787348319076985 \n", + "Epoch: 10. Loss: 0.05960057801818051. ACC 0.6803262383131092 \n", + "Epoch: 11. Loss: 0.05948651382362058. ACC 0.6827133479212254 \n", + "Epoch: 12. Loss: 0.05936432487689076. ACC 0.6847026059279888 \n", + "Epoch: 13. Loss: 0.05927935623259437. ACC 0.6849015317286652 \n", + "Epoch: 14. Loss: 0.05918656135734185. ACC 0.6847026059279888 \n", + "Epoch: 15. Loss: 0.05911824261902672. ACC 0.6851004575293416 \n", + "Epoch: 16. Loss: 0.05904605627795064. ACC 0.6868907897354287 \n", + "Epoch: 17. Loss: 0.058988605759502856. ACC 0.6872886413367814 \n", + "Epoch: 18. Loss: 0.05893096965682988. ACC 0.6882832703401631 \n", + "Epoch: 19. Loss: 0.05888151869945553. ACC 0.6876864929381341 \n", + "Epoch: 20. Loss: 0.05883267014382823. ACC 0.6880843445394867 \n", + "Epoch: 21. Loss: 0.05878903034289694. ACC 0.6880843445394867 \n", + "Epoch: 22. Loss: 0.058747436989840585. ACC 0.6888800477421921 \n", + "Epoch: 23. Loss: 0.0587054798551836. ACC 0.6894768251442212 \n", + "Epoch: 24. Loss: 0.058667122392880175. ACC 0.6894768251442212 \n", + "Epoch: 25. Loss: 0.05863180214115656. ACC 0.6898746767455739 \n", + "Epoch: 26. Loss: 0.05859861508832909. ACC 0.6898746767455739 \n", + "Epoch: 27. Loss: 0.05856748351833563. ACC 0.6904714541476029 \n", + "Epoch: 28. Loss: 0.058538229263443015. ACC 0.6912671573503083 \n", + "Epoch: 29. Loss: 0.058510525766793456. ACC 0.6908693057489557 \n", + "Epoch: 30. Loss: 0.05848419636055184. ACC 0.6906703799482793 \n", + "Epoch: 31. Loss: 0.058458312939810235. ACC 0.691068231549632 \n", + "Epoch: 32. Loss: 0.05843393267185142. ACC 0.6922617863536901 \n", + "Epoch: 33. Loss: 0.05841100349302009. ACC 0.6928585637557191 \n", + "Epoch: 34. Loss: 0.05838928854984817. ACC 0.6930574895563955 \n", + "Epoch: 35. Loss: 0.05836856827623023. ACC 0.6928585637557191 \n", + "Epoch: 36. Loss: 0.058349247443076034. ACC 0.6942510443604536 \n", + "Epoch: 37. Loss: 0.058329760109379916. ACC 0.6944499701611299 \n", + "Epoch: 38. Loss: 0.05830992062570372. ACC 0.6944499701611299 \n", + "Epoch: 39. Loss: 0.058295121457578175. ACC 0.6948478217624826 \n", + "Epoch: 40. Loss: 0.058276735309912335. ACC 0.6944499701611299 \n", + "Epoch: 41. Loss: 0.05826076196466973. ACC 0.6942510443604536 \n", + "Epoch: 42. Loss: 0.058244150036344575. ACC 0.6946488959618062 \n", + "Epoch: 43. Loss: 0.058228511761215164. ACC 0.6948478217624826 \n", + "Epoch: 44. Loss: 0.058213053106289395. ACC 0.6948478217624826 \n", + "Epoch: 45. Loss: 0.05819999741848691. ACC 0.695643524965188 \n", + "Epoch: 46. Loss: 0.0581852136465482. ACC 0.696240302367217 \n", + "Epoch: 47. Loss: 0.058173370629294105. ACC 0.6964392281678934 \n", + "Epoch: 48. Loss: 0.05816079556100146. ACC 0.6966381539685698 \n", + "Epoch: 49. Loss: 0.058147606986973334. 
ACC 0.6972349313705988 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.62 0.27 0.37 230\n", + " 1 0.62 0.88 0.73 319\n", + "\n", + " accuracy 0.62 549\n", + " macro avg 0.62 0.57 0.55 549\n", + "weighted avg 0.62 0.62 0.58 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "YTvL74nskeKl" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----> HIDDEN DIM 100" + ], + "metadata": { + "id": "SPLx4mwpZ8Mt" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "hidden_dim = 100" + ], + "metadata": { + "id": "1N62Mw2nZ8Mu" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "e0bdc61a-14ae-496a-adb0-ac1f94ccbfef", + "id": "TZMuIksNZ8Mu" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06917759010752414. ACC 0.520588820370002 \n", + "Epoch: 1. Loss: 0.06705762690475263. ACC 0.5961806246270142 \n", + "Epoch: 2. Loss: 0.06400140089592214. ACC 0.6401432265764869 \n", + "Epoch: 3. Loss: 0.062146776276695244. ACC 0.6552615874278894 \n", + "Epoch: 4. Loss: 0.061300162472733455. ACC 0.6612293614481798 \n", + "Epoch: 5. Loss: 0.06078929286461263. ACC 0.6650089516610305 \n", + "Epoch: 6. Loss: 0.060443362819524796. ACC 0.6697831708772628 \n", + "Epoch: 7. Loss: 0.06016214593566149. ACC 0.6737616868907897 \n", + "Epoch: 8. Loss: 0.05996537376925133. ACC 0.6769444997016113 \n", + "Epoch: 9. Loss: 0.05977933667563102. ACC 0.6787348319076985 \n", + "Epoch: 10. Loss: 0.05965504488960184. ACC 0.6807240899144619 \n", + "Epoch: 11. Loss: 0.05952140737509476. ACC 0.6817187189178436 \n", + "Epoch: 12. Loss: 0.05943459829066178. ACC 0.6827133479212254 \n", + "Epoch: 13. Loss: 0.059335262828302926. ACC 0.6841058285259598 \n", + "Epoch: 14. Loss: 0.059268764776526185. ACC 0.6837079769246072 \n", + "Epoch: 15. Loss: 0.05919364758720686. ACC 0.6851004575293416 \n", + "Epoch: 16. Loss: 0.05913926471600552. ACC 0.6849015317286652 \n", + "Epoch: 17. Loss: 0.05908131185620401. ACC 0.685896160732047 \n", + "Epoch: 18. Loss: 0.059034374656842296. ACC 0.6852993833300179 \n", + "Epoch: 19. Loss: 0.058986146157888664. ACC 0.685896160732047 \n", + "Epoch: 20. Loss: 0.05894476813802383. ACC 0.6866918639347523 \n", + "Epoch: 21. Loss: 0.05890504864970993. ACC 0.6874875671374577 \n", + "Epoch: 22. Loss: 0.0588692116682824. ACC 0.6874875671374577 \n", + "Epoch: 23. Loss: 0.05883541230133805. ACC 0.6878854187388104 \n", + "Epoch: 24. Loss: 0.05880412687025986. ACC 0.6888800477421921 \n", + "Epoch: 25. Loss: 0.05877470903755159. ACC 0.6896757509448975 \n", + "Epoch: 26. Loss: 0.058747126226375845. ACC 0.6902725283469267 \n", + "Epoch: 27. 
Loss: 0.05872110533078764. ACC 0.6918639347523374 \n", + "Epoch: 28. Loss: 0.05869652181774858. ACC 0.6918639347523374 \n", + "Epoch: 29. Loss: 0.058673218518098355. ACC 0.6914660831509847 \n", + "Epoch: 30. Loss: 0.058651078759924376. ACC 0.6920628605530137 \n", + "Epoch: 31. Loss: 0.0586299903769367. ACC 0.6920628605530137 \n", + "Epoch: 32. Loss: 0.058609860541452015. ACC 0.6922617863536901 \n", + "Epoch: 33. Loss: 0.05859060251430605. ACC 0.6932564153570718 \n", + "Epoch: 34. Loss: 0.05857215594386922. ACC 0.6934553411577482 \n", + "Epoch: 35. Loss: 0.05855453916091152. ACC 0.6934553411577482 \n", + "Epoch: 36. Loss: 0.05853714394194719. ACC 0.6932564153570718 \n", + "Epoch: 37. Loss: 0.05852007168127098. ACC 0.6930574895563955 \n", + "Epoch: 38. Loss: 0.05850374254597948. ACC 0.6934553411577482 \n", + "Epoch: 39. Loss: 0.058488114088364664. ACC 0.6938531927591008 \n", + "Epoch: 40. Loss: 0.05847297409360016. ACC 0.6946488959618062 \n", + "Epoch: 41. Loss: 0.05845842594794021. ACC 0.6946488959618062 \n", + "Epoch: 42. Loss: 0.058444301698372426. ACC 0.6944499701611299 \n", + "Epoch: 43. Loss: 0.058430569669183974. ACC 0.6944499701611299 \n", + "Epoch: 44. Loss: 0.05841701759219241. ACC 0.695046747563159 \n", + "Epoch: 45. Loss: 0.05840258591687772. ACC 0.6948478217624826 \n", + "Epoch: 46. Loss: 0.05838884583685302. ACC 0.6948478217624826 \n", + "Epoch: 47. Loss: 0.05837544693917433. ACC 0.6952456733638352 \n", + "Epoch: 48. Loss: 0.0583625853784947. ACC 0.695046747563159 \n", + "Epoch: 49. Loss: 0.05834997150849129. ACC 0.6954445991645116 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.62 0.26 0.36 230\n", + " 1 0.62 0.89 0.73 319\n", + "\n", + " accuracy 0.62 549\n", + " macro avg 0.62 0.57 0.55 549\n", + "weighted avg 0.62 0.62 0.58 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "d9x0oUT3kice" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "-----> HIDDEN DIM 512" + ], + "metadata": { + "id": "-n9xrbgOaBBW" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "hidden_dim = 512" + ], + "metadata": { + "id": "j8xOZDOWaBBY" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "8d92a2d5-9781-4a02-dcd3-def3f01f531c", + "id": "ZA52chWCaBBY" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.0686950459024032. ACC 0.5365028844241098 \n", + "Epoch: 1. Loss: 0.06554919333066195. ACC 0.6210463497115576 \n", + "Epoch: 2. Loss: 0.06287430665211766. ACC 0.6465088521981301 \n", + "Epoch: 3. Loss: 0.06164849479724428. ACC 0.6566540680326238 \n", + "Epoch: 4. Loss: 0.06106089448890891. 
ACC 0.6630196936542669 \n", + "Epoch: 5. Loss: 0.06063713147266973. ACC 0.6671971354684703 \n", + "Epoch: 6. Loss: 0.06036763339908692. ACC 0.6697831708772628 \n", + "Epoch: 7. Loss: 0.06010269768934779. ACC 0.6743584642928188 \n", + "Epoch: 8. Loss: 0.059954377874266634. ACC 0.6771434255022877 \n", + "Epoch: 9. Loss: 0.059766372481610396. ACC 0.6797294609110802 \n", + "Epoch: 10. Loss: 0.059672780935205144. ACC 0.6809230157151382 \n", + "Epoch: 11. Loss: 0.05953542923438803. ACC 0.6817187189178436 \n", + "Epoch: 12. Loss: 0.05946852925735794. ACC 0.6835090511239308 \n", + "Epoch: 13. Loss: 0.05936599172605822. ACC 0.6843047543266362 \n", + "Epoch: 14. Loss: 0.05931339195501693. ACC 0.6841058285259598 \n", + "Epoch: 15. Loss: 0.05923553461793263. ACC 0.6852993833300179 \n", + "Epoch: 16. Loss: 0.059191132726502366. ACC 0.6860950865327233 \n", + "Epoch: 17. Loss: 0.05913125236133517. ACC 0.6860950865327233 \n", + "Epoch: 18. Loss: 0.05909214182570651. ACC 0.6854983091306942 \n", + "Epoch: 19. Loss: 0.059045274599850095. ACC 0.6860950865327233 \n", + "Epoch: 20. Loss: 0.05901029463527642. ACC 0.686492938134076 \n", + "Epoch: 21. Loss: 0.05897261184846428. ACC 0.6866918639347523 \n", + "Epoch: 22. Loss: 0.05894137235102322. ACC 0.6874875671374577 \n", + "Epoch: 23. Loss: 0.058910085372619374. ACC 0.6884821961408395 \n", + "Epoch: 24. Loss: 0.058882338112145695. ACC 0.6890789735428685 \n", + "Epoch: 25. Loss: 0.05885559505417495. ACC 0.6896757509448975 \n", + "Epoch: 26. Loss: 0.05883099805391022. ACC 0.6896757509448975 \n", + "Epoch: 27. Loss: 0.058807639518883494. ACC 0.6894768251442212 \n", + "Epoch: 28. Loss: 0.058785773026722665. ACC 0.691068231549632 \n", + "Epoch: 29. Loss: 0.05876506839235805. ACC 0.6908693057489557 \n", + "Epoch: 30. Loss: 0.058745517879161296. ACC 0.691068231549632 \n", + "Epoch: 31. Loss: 0.05872697606816168. ACC 0.6908693057489557 \n", + "Epoch: 32. Loss: 0.05870937866530979. ACC 0.691665008951661 \n", + "Epoch: 33. Loss: 0.05869264146881832. ACC 0.6920628605530137 \n", + "Epoch: 34. Loss: 0.058676699977142095. ACC 0.691665008951661 \n", + "Epoch: 35. Loss: 0.058661491692552324. ACC 0.6926596379550428 \n", + "Epoch: 36. Loss: 0.0586469627432162. ACC 0.6936542669584245 \n", + "Epoch: 37. Loss: 0.0586330647826171. ACC 0.6934553411577482 \n", + "Epoch: 38. Loss: 0.05861975381572132. ACC 0.6938531927591008 \n", + "Epoch: 39. Loss: 0.05860698842637127. ACC 0.6938531927591008 \n", + "Epoch: 40. Loss: 0.05859473278301004. ACC 0.6940521185597772 \n", + "Epoch: 41. Loss: 0.05858295441551239. ACC 0.6934553411577482 \n", + "Epoch: 42. Loss: 0.05857162248407701. ACC 0.6932564153570718 \n", + "Epoch: 43. Loss: 0.05856070942733595. ACC 0.6934553411577482 \n", + "Epoch: 44. Loss: 0.05855019049400692. ACC 0.6936542669584245 \n", + "Epoch: 45. Loss: 0.05854004134779923. ACC 0.6940521185597772 \n", + "Epoch: 46. Loss: 0.05853024176676704. ACC 0.6940521185597772 \n", + "Epoch: 47. Loss: 0.05852077155860679. ACC 0.6942510443604536 \n", + "Epoch: 48. Loss: 0.05851161174634733. ACC 0.6940521185597772 \n", + "Epoch: 49. Loss: 0.058502746672949975. 
ACC 0.6936542669584245 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.62 0.26 0.36 230\n", + " 1 0.62 0.89 0.73 319\n", + "\n", + " accuracy 0.62 549\n", + " macro avg 0.62 0.57 0.55 549\n", + "weighted avg 0.62 0.62 0.58 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'sigmoid', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "PFeWsyCTkl1I" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 4. ACTIVATION FUNCTION" + ], + "metadata": { + "id": "SU5hZaAGa-oN" + } + }, + { + "cell_type": "code", + "source": [ + "# Already optimized\n", + "batch_size = 10\n", + "hidden_dim = 10" + ], + "metadata": { + "id": "oFXMMG-xbOS4" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Hyper-parameters\n", + "learning_rate = 0.1\n", + "criterion = nn.CrossEntropyLoss()\n", + "output_dim = 2" + ], + "metadata": { + "id": "hGgQ8IVAbOS4" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "------> RELU" + ], + "metadata": { + "id": "2JOuEGpAbHvD" + } + }, + { + "cell_type": "code", + "source": [ + "class FeedforwardNeuralNetModel(nn.Module):\n", + " def __init__(self, hidden_dim, output_dim, weights_matrix):\n", + " # calls the init function of nn.Module. Don't get confused by the syntax:\n", + " # always do this when subclassing nn.Module\n", + " super(FeedforwardNeuralNetModel, self).__init__()\n", + "\n", + " # Embedding layer\n", + " # ....\n", + " # ----- SOLUTION\n", + " # mode (string, optional) – \"sum\", \"mean\" or \"max\". Default=mean.\n", + " self.embedding_bag = nn.EmbeddingBag.from_pretrained(\n", + " weights_matrix,\n", + " mode='mean')\n", + " embed_dim = self.embedding_bag.embedding_dim\n", + "\n", + " # Linear function\n", + " self.fc1 = nn.Linear(embed_dim, hidden_dim)\n", + "\n", + " # Non-linearity\n", + " self.activation = nn.ReLU()\n", + "\n", + " # Linear function (readout)\n", + " self.fc2 = nn.Linear(hidden_dim, output_dim)\n", + "\n", + " def forward(self, text, offsets):\n", + " # Embedding layer\n", + " # ....\n", + " # ----- SOLUTION\n", + " embedded = self.embedding_bag(text, offsets)\n", + "\n", + " # Linear function\n", + " out = self.fc1(embedded)\n", + "\n", + " # Non-linearity\n", + " out = self.activation(out)\n", + "\n", + " # Linear function (readout)\n", + " out = self.fc2(out)\n", + " return out" + ], + "metadata": { + "id": "HNaP18nEZNTu" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "3jC2S26dbdD7", + "outputId": "1aa5c0e5-1b35-49da-8c78-891cd84d9877" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06942072910889731. ACC 0.5054704595185996 \n", + "Epoch: 1. 
Loss: 0.06856277003448574. ACC 0.5546051322856574 \n", + "Epoch: 2. Loss: 0.06591766325692806. ACC 0.6206484981102048 \n", + "Epoch: 3. Loss: 0.06291877846962561. ACC 0.6500895166103043 \n", + "Epoch: 4. Loss: 0.061278633457636865. ACC 0.6606325840461508 \n", + "Epoch: 5. Loss: 0.06041950248856181. ACC 0.6646111000596777 \n", + "Epoch: 6. Loss: 0.05980401388190721. ACC 0.6711756514819972 \n", + "Epoch: 7. Loss: 0.05930565481089161. ACC 0.6765466481002586 \n", + "Epoch: 8. Loss: 0.05886232488336064. ACC 0.6807240899144619 \n", + "Epoch: 9. Loss: 0.05843683607729572. ACC 0.6876864929381341 \n", + "Epoch: 10. Loss: 0.0579850317790633. ACC 0.6920628605530137 \n", + "Epoch: 11. Loss: 0.057525022178799394. ACC 0.6966381539685698 \n", + "Epoch: 12. Loss: 0.05711436834870311. ACC 0.6970360055699224 \n", + "Epoch: 13. Loss: 0.056641527731138314. ACC 0.7012134473841257 \n", + "Epoch: 14. Loss: 0.05623253186346167. ACC 0.7018102247861547 \n", + "Epoch: 15. Loss: 0.05579468431827537. ACC 0.7034016311915655 \n", + "Epoch: 16. Loss: 0.05532180878509933. ACC 0.7051919633976527 \n", + "Epoch: 17. Loss: 0.05489827995156113. ACC 0.7095683310125324 \n", + "Epoch: 18. Loss: 0.054420895312779556. ACC 0.7107618858165904 \n", + "Epoch: 19. Loss: 0.05386669599431301. ACC 0.7145414760294411 \n", + "Epoch: 20. Loss: 0.053418611796020345. ACC 0.7175253630395863 \n", + "Epoch: 21. Loss: 0.05291474591447839. ACC 0.722498508056495 \n", + "Epoch: 22. Loss: 0.0524292953930438. ACC 0.7244877660632584 \n", + "Epoch: 23. Loss: 0.051777515430703705. ACC 0.7294609110801671 \n", + "Epoch: 24. Loss: 0.05116368355927942. ACC 0.7338372786950468 \n", + "Epoch: 25. Loss: 0.05050421508982472. ACC 0.7358265367018102 \n", + "Epoch: 26. Loss: 0.04980646493256104. ACC 0.7390093495126318 \n", + "Epoch: 27. Loss: 0.04914311262824698. ACC 0.7453749751342749 \n", + "Epoch: 28. Loss: 0.048441451190360765. ACC 0.7487567137457728 \n", + "Epoch: 29. Loss: 0.04776734764933562. ACC 0.7571115973741794 \n", + "Epoch: 30. Loss: 0.046898810354122376. ACC 0.7640740003978516 \n", + "Epoch: 31. Loss: 0.04618029834997922. ACC 0.7702407002188184 \n", + "Epoch: 32. Loss: 0.04542325366816797. ACC 0.7774020290431669 \n", + "Epoch: 33. Loss: 0.04459663130593437. ACC 0.7833698030634574 \n", + "Epoch: 34. Loss: 0.0439375044516026. ACC 0.7851601352695444 \n", + "Epoch: 35. Loss: 0.043228709056857856. ACC 0.7913268350905113 \n", + "Epoch: 36. Loss: 0.042445782698751655. ACC 0.792918241495922 \n", + "Epoch: 37. Loss: 0.04166124297499728. ACC 0.8004774219216232 \n", + "Epoch: 38. Loss: 0.04107313388256423. ACC 0.8046548637358265 \n", + "Epoch: 39. Loss: 0.04042646231365792. ACC 0.8040580863337975 \n", + "Epoch: 40. Loss: 0.03963960715783906. ACC 0.8138054505669385 \n", + "Epoch: 41. Loss: 0.038990876658704564. ACC 0.8167893375770837 \n", + "Epoch: 42. Loss: 0.0382939168158122. ACC 0.8211657051919634 \n", + "Epoch: 43. Loss: 0.03765396263820171. ACC 0.825542072806843 \n", + "Epoch: 44. Loss: 0.036975571820862564. ACC 0.8285259598169883 \n", + "Epoch: 45. Loss: 0.036351039455820085. ACC 0.8297195146210463 \n", + "Epoch: 46. Loss: 0.03568981379946397. ACC 0.8317087726278098 \n", + "Epoch: 47. Loss: 0.03518347225020604. ACC 0.835289437039984 \n", + "Epoch: 48. Loss: 0.03447389968298339. ACC 0.8444400238710961 \n", + "Epoch: 49. Loss: 0.033695441180321764. 
ACC 0.8442410980704197 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.56 0.52 0.54 230\n", + " 1 0.67 0.71 0.69 319\n", + "\n", + " accuracy 0.63 549\n", + " macro avg 0.62 0.61 0.61 549\n", + "weighted avg 0.62 0.63 0.63 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'relu', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "NZf4IZtyksmP" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "------> Hardtanh" + ], + "metadata": { + "id": "D4tkSoOXckfU" + } + }, + { + "cell_type": "code", + "source": [ + "class FeedforwardNeuralNetModel(nn.Module):\n", + " def __init__(self, hidden_dim, output_dim, weights_matrix):\n", + " # calls the init function of nn.Module. Don't get confused by the syntax:\n", + " # always do this when subclassing nn.Module\n", + " super(FeedforwardNeuralNetModel, self).__init__()\n", + "\n", + " # Embedding layer\n", + " # ....\n", + " # ----- SOLUTION\n", + " # mode (string, optional) – \"sum\", \"mean\" or \"max\". Default=mean.\n", + " self.embedding_bag = nn.EmbeddingBag.from_pretrained(\n", + " weights_matrix,\n", + " mode='mean')\n", + " embed_dim = self.embedding_bag.embedding_dim\n", + "\n", + " # Linear function\n", + " self.fc1 = nn.Linear(embed_dim, hidden_dim)\n", + "\n", + " # Non-linearity\n", + " self.activation = nn.Hardtanh()\n", + "\n", + " # Linear function (readout)\n", + " self.fc2 = nn.Linear(hidden_dim, output_dim)\n", + "\n", + " def forward(self, text, offsets):\n", + " # Embedding layer\n", + " # ....\n", + " # ----- SOLUTION\n", + " embedded = self.embedding_bag(text, offsets)\n", + "\n", + " # Linear function\n", + " out = self.fc1(embedded)\n", + "\n", + " # Non-linearity\n", + " out = self.activation(out)\n", + "\n", + " # Linear function (readout)\n", + " out = self.fc2(out)\n", + " return out" + ], + "metadata": { + "id": "85rJ0LeNcnVh" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "c09b51cc-7d0e-422f-96c4-56606a02ce6e", + "id": "AG3KHam2cnVi" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06912730077827382. ACC 0.5265565943902925 \n", + "Epoch: 1. Loss: 0.06693142557656383. ACC 0.6029441018500099 \n", + "Epoch: 2. Loss: 0.06379574281168145. ACC 0.6399443007758107 \n", + "Epoch: 3. Loss: 0.061910910589310336. ACC 0.6574497712353292 \n", + "Epoch: 4. Loss: 0.06103705234050466. ACC 0.6646111000596777 \n", + "Epoch: 5. Loss: 0.06050537584409909. ACC 0.6691863934752338 \n", + "Epoch: 6. Loss: 0.060124982433715585. ACC 0.6713745772826736 \n", + "Epoch: 7. Loss: 0.059825400157692846. ACC 0.6769444997016113 \n", + "Epoch: 8. Loss: 0.059586842123642104. ACC 0.6811219415158146 \n", + "Epoch: 9. Loss: 0.059389169170625505. 
ACC 0.6829122737219018 \n", + "Epoch: 10. Loss: 0.05923326789253816. ACC 0.6852993833300179 \n", + "Epoch: 11. Loss: 0.05909600550976145. ACC 0.685896160732047 \n", + "Epoch: 12. Loss: 0.05898585811817601. ACC 0.6856972349313706 \n", + "Epoch: 13. Loss: 0.05888758094950559. ACC 0.687089715536105 \n", + "Epoch: 14. Loss: 0.05880669621342761. ACC 0.6884821961408395 \n", + "Epoch: 15. Loss: 0.0587295797370598. ACC 0.6896757509448975 \n", + "Epoch: 16. Loss: 0.058668571952467344. ACC 0.6914660831509847 \n", + "Epoch: 17. Loss: 0.058608014249934434. ACC 0.691665008951661 \n", + "Epoch: 18. Loss: 0.05855635845426397. ACC 0.6924607121543664 \n", + "Epoch: 19. Loss: 0.05850269709331557. ACC 0.6934553411577482 \n", + "Epoch: 20. Loss: 0.058469376952695114. ACC 0.6942510443604536 \n", + "Epoch: 21. Loss: 0.05842616894455252. ACC 0.6942510443604536 \n", + "Epoch: 22. Loss: 0.05838953687486863. ACC 0.6946488959618062 \n", + "Epoch: 23. Loss: 0.05835356955644931. ACC 0.6964392281678934 \n", + "Epoch: 24. Loss: 0.058319245331894934. ACC 0.6982295603739805 \n", + "Epoch: 25. Loss: 0.05828667389533144. ACC 0.6986274119753332 \n", + "Epoch: 26. Loss: 0.05825627778523576. ACC 0.6980306345733042 \n", + "Epoch: 27. Loss: 0.05822576936656731. ACC 0.6978317087726278 \n", + "Epoch: 28. Loss: 0.05819523083136858. ACC 0.6974338571712752 \n", + "Epoch: 29. Loss: 0.0581726550000644. ACC 0.6972349313705988 \n", + "Epoch: 30. Loss: 0.05813115125554254. ACC 0.696837079769246 \n", + "Epoch: 31. Loss: 0.05809444772431218. ACC 0.696837079769246 \n", + "Epoch: 32. Loss: 0.0580429145112198. ACC 0.6974338571712752 \n", + "Epoch: 33. Loss: 0.05801715195380553. ACC 0.696837079769246 \n", + "Epoch: 34. Loss: 0.05796332862758011. ACC 0.6972349313705988 \n", + "Epoch: 35. Loss: 0.05790024081043868. ACC 0.6982295603739805 \n", + "Epoch: 36. Loss: 0.057840524337295036. ACC 0.6978317087726278 \n", + "Epoch: 37. Loss: 0.05779010203289232. ACC 0.6980306345733042 \n", + "Epoch: 38. Loss: 0.05775263634002159. ACC 0.6990252635766859 \n", + "Epoch: 39. Loss: 0.05771525803844474. ACC 0.6976327829719514 \n", + "Epoch: 40. Loss: 0.05767879191795684. ACC 0.6978317087726278 \n", + "Epoch: 41. Loss: 0.05764709149531965. ACC 0.6978317087726278 \n", + "Epoch: 42. Loss: 0.05761691455412319. ACC 0.6984284861746568 \n", + "Epoch: 43. Loss: 0.0575916110840853. ACC 0.6980306345733042 \n", + "Epoch: 44. Loss: 0.05755931099271689. ACC 0.7000198925800676 \n", + "Epoch: 45. Loss: 0.05753736929815712. ACC 0.6996220409787149 \n", + "Epoch: 46. Loss: 0.05750204396575114. ACC 0.6994231151780386 \n", + "Epoch: 47. Loss: 0.05749101606306035. ACC 0.6988263377760096 \n", + "Epoch: 48. Loss: 0.057454643712879436. ACC 0.7004177441814203 \n", + "Epoch: 49. Loss: 0.057430451222672624. ACC 0.6998209667793913 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.64 0.37 0.46 230\n", + " 1 0.65 0.85 0.74 319\n", + "\n", + " accuracy 0.65 549\n", + " macro avg 0.64 0.61 0.60 549\n", + "weighted avg 0.64 0.65 0.62 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'hardtanh', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "jMUbVLU0kxEI" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 5. 
LEARNING RATE" + ], + "metadata": { + "id": "T-MTCs64c9R-" + } + }, + { + "cell_type": "code", + "source": [ + "# Already optimized\n", + "batch_size = 10\n", + "hidden_dim = 10" + ], + "metadata": { + "id": "EGKtAcFfc9R_" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "----> learning_rate = 0.0001" + ], + "metadata": { + "id": "l7QZ17-fdJYY" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "learning_rate = 0.0001" + ], + "metadata": { + "id": "jYkx9YVsc9R_" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "lo6ootHAdJx1", + "outputId": "3fad47bb-b328-4729-81b3-6e0507eabf86" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.07287653228362512. ACC 0.49094887606922616 \n", + "Epoch: 1. Loss: 0.07263582983997771. ACC 0.49094887606922616 \n", + "Epoch: 2. Loss: 0.07241138401481108. ACC 0.49094887606922616 \n", + "Epoch: 3. Loss: 0.07220213146932514. ACC 0.49094887606922616 \n", + "Epoch: 4. Loss: 0.07200707030718431. ACC 0.49094887606922616 \n", + "Epoch: 5. Loss: 0.07182526268303596. ACC 0.49094887606922616 \n", + "Epoch: 6. Loss: 0.07165583276312323. ACC 0.49094887606922616 \n", + "Epoch: 7. Loss: 0.07149796235008839. ACC 0.49094887606922616 \n", + "Epoch: 8. Loss: 0.07135088048447001. ACC 0.49094887606922616 \n", + "Epoch: 9. Loss: 0.07121387364163843. ACC 0.49094887606922616 \n", + "Epoch: 10. Loss: 0.07108626437751583. ACC 0.49074995026854984 \n", + "Epoch: 11. Loss: 0.07096742720496757. ACC 0.49074995026854984 \n", + "Epoch: 12. Loss: 0.07085677660647458. ACC 0.49074995026854984 \n", + "Epoch: 13. Loss: 0.0707537580110312. ACC 0.49094887606922616 \n", + "Epoch: 14. Loss: 0.07065785842978602. ACC 0.49094887606922616 \n", + "Epoch: 15. Loss: 0.07056859308145666. ACC 0.49094887606922616 \n", + "Epoch: 16. Loss: 0.07048551033665794. ACC 0.49094887606922616 \n", + "Epoch: 17. Loss: 0.07040819114877045. ACC 0.49094887606922616 \n", + "Epoch: 18. Loss: 0.07033624280535347. ACC 0.49094887606922616 \n", + "Epoch: 19. Loss: 0.07026929603506088. ACC 0.49074995026854984 \n", + "Epoch: 20. Loss: 0.0702070084579996. ACC 0.49074995026854984 \n", + "Epoch: 21. Loss: 0.07014905833714237. ACC 0.49074995026854984 \n", + "Epoch: 22. Loss: 0.07009514721055995. ACC 0.49094887606922616 \n", + "Epoch: 23. Loss: 0.07004499387997198. ACC 0.49094887606922616 \n", + "Epoch: 24. Loss: 0.06999833974635267. ACC 0.49074995026854984 \n", + "Epoch: 25. Loss: 0.0699549406879532. ACC 0.49094887606922616 \n", + "Epoch: 26. Loss: 0.06991456991781504. ACC 0.49094887606922616 \n", + "Epoch: 27. Loss: 0.06987701522111181. ACC 0.49074995026854984 \n", + "Epoch: 28. Loss: 0.06984207867058366. ACC 0.49074995026854984 \n", + "Epoch: 29. Loss: 0.06980957615226122. ACC 0.49074995026854984 \n", + "Epoch: 30. Loss: 0.06977933678447738. ACC 0.49094887606922616 \n", + "Epoch: 31. Loss: 0.06975120021449374. 
ACC 0.49114780186990253 \n", + "Epoch: 32. Loss: 0.0697250187053153. ACC 0.49134672767057885 \n", + "Epoch: 33. Loss: 0.06970065204907962. ACC 0.49094887606922616 \n", + "Epoch: 34. Loss: 0.0696779736140767. ACC 0.49035209866719714 \n", + "Epoch: 35. Loss: 0.06965686177220903. ACC 0.49035209866719714 \n", + "Epoch: 36. Loss: 0.06963720468917993. ACC 0.49055102446787346 \n", + "Epoch: 37. Loss: 0.06961889941151203. ACC 0.49114780186990253 \n", + "Epoch: 38. Loss: 0.06960184765734728. ACC 0.49154565347125523 \n", + "Epoch: 39. Loss: 0.06958596083191629. ACC 0.49114780186990253 \n", + "Epoch: 40. Loss: 0.06957115430065479. ACC 0.49055102446787346 \n", + "Epoch: 41. Loss: 0.06955735069727931. ACC 0.49015317286652077 \n", + "Epoch: 42. Loss: 0.06954447916494536. ACC 0.49035209866719714 \n", + "Epoch: 43. Loss: 0.06953247025777977. ACC 0.49154565347125523 \n", + "Epoch: 44. Loss: 0.06952126337515803. ACC 0.49254028247463694 \n", + "Epoch: 45. Loss: 0.06951079884129394. ACC 0.49293813407598963 \n", + "Epoch: 46. Loss: 0.06950102417754064. ACC 0.493137059876666 \n", + "Epoch: 47. Loss: 0.06949188797237243. ACC 0.49134672767057885 \n", + "Epoch: 48. Loss: 0.06948334502728008. ACC 0.49015317286652077 \n", + "Epoch: 49. Loss: 0.06947535163772398. ACC 0.49055102446787346 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.44 0.70 0.54 230\n", + " 1 0.63 0.37 0.46 319\n", + "\n", + " accuracy 0.50 549\n", + " macro avg 0.53 0.53 0.50 549\n", + "weighted avg 0.55 0.50 0.50 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# epochs, hidden, lr, batch, act, opt\n", + "exp = Expe( num_epochs, hidden_dim, learning_rate, batch_size, 'hardtanh', 'SGD' )\n", + "exp.set_model( model_ffnn )\n", + "exp.set_scores( gold, pred )\n", + "experiments.append( exp )" + ], + "metadata": { + "id": "llwJEQ-Zk5H0" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "----> learning_rate = 0.5" + ], + "metadata": { + "id": "8ofbW-LidnCo" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "learning_rate = 0.5" + ], + "metadata": { + "id": "1bFV5uzXdiMU" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "5b5eee14-5bec-4d03-c28a-756ebfae8b9d", + "id": "SnipbtnodiMV" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06899163271006852. ACC 0.539486771434255 \n", + "Epoch: 1. Loss: 0.06517702153740665. ACC 0.617266759498707 \n", + "Epoch: 2. Loss: 0.06297323326479258. ACC 0.6399443007758107 \n", + "Epoch: 3. Loss: 0.061934177013996325. ACC 0.6520787746170679 \n", + "Epoch: 4. Loss: 0.06119394636732792. ACC 0.6564551422319475 \n", + "Epoch: 5. Loss: 0.06070206175154949. ACC 0.6628207678535906 \n", + "Epoch: 6. Loss: 0.06026897386312247. ACC 0.6677939128704993 \n", + "Epoch: 7. Loss: 0.05987513110987206. ACC 0.672170280485379 \n", + "Epoch: 8. Loss: 0.05942055651770213. 
ACC 0.6777402029043167 \n", + "Epoch: 9. Loss: 0.05892780445099636. ACC 0.6841058285259598 \n", + "Epoch: 10. Loss: 0.05839201176002909. ACC 0.6900736025462503 \n", + "Epoch: 11. Loss: 0.05783625583490273. ACC 0.6908693057489557 \n", + "Epoch: 12. Loss: 0.057291302460193726. ACC 0.6948478217624826 \n", + "Epoch: 13. Loss: 0.056798706483670205. ACC 0.6964392281678934 \n", + "Epoch: 14. Loss: 0.056330172334296914. ACC 0.6994231151780386 \n", + "Epoch: 15. Loss: 0.05592512204333094. ACC 0.7043962601949473 \n", + "Epoch: 16. Loss: 0.05553743654355534. ACC 0.7091704794111796 \n", + "Epoch: 17. Loss: 0.055203254224766976. ACC 0.7119554406206485 \n", + "Epoch: 18. Loss: 0.0548827683288913. ACC 0.7143425502287647 \n", + "Epoch: 19. Loss: 0.05460903216290479. ACC 0.7181221404416153 \n", + "Epoch: 20. Loss: 0.05431669450420401. ACC 0.7213049532524368 \n", + "Epoch: 21. Loss: 0.05409027660061404. ACC 0.723692062860553 \n", + "Epoch: 22. Loss: 0.05384888120702562. ACC 0.7250845434652875 \n", + "Epoch: 23. Loss: 0.05359562722510696. ACC 0.7274716530734037 \n", + "Epoch: 24. Loss: 0.053376591028336244. ACC 0.7292619852794907 \n", + "Epoch: 25. Loss: 0.053156857639817943. ACC 0.7302566142828725 \n", + "Epoch: 26. Loss: 0.05292541265890806. ACC 0.7348319076984284 \n", + "Epoch: 27. Loss: 0.05272208951578146. ACC 0.7348319076984284 \n", + "Epoch: 28. Loss: 0.052520521241531996. ACC 0.7364233141038393 \n", + "Epoch: 29. Loss: 0.05235762931206413. ACC 0.737417943107221 \n", + "Epoch: 30. Loss: 0.052175618854862854. ACC 0.7380147205092501 \n", + "Epoch: 31. Loss: 0.05198571057769729. ACC 0.7392082753133081 \n", + "Epoch: 32. Loss: 0.05180545676494418. ACC 0.7394072011139845 \n", + "Epoch: 33. Loss: 0.05163319956089598. ACC 0.7406007559180425 \n", + "Epoch: 34. Loss: 0.05147196178778232. ACC 0.7402029043166899 \n", + "Epoch: 35. Loss: 0.05131960003791473. ACC 0.7386114979112791 \n", + "Epoch: 36. Loss: 0.051201086443177696. ACC 0.7394072011139845 \n", + "Epoch: 37. Loss: 0.05102002929272888. ACC 0.7406007559180425 \n", + "Epoch: 38. Loss: 0.050804019208478005. ACC 0.7396061269146609 \n", + "Epoch: 39. Loss: 0.05062438897686259. ACC 0.7417943107221007 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.58 0.23 0.33 230\n", + " 1 0.61 0.88 0.72 319\n", + "\n", + " accuracy 0.61 549\n", + " macro avg 0.59 0.55 0.52 549\n", + "weighted avg 0.60 0.61 0.56 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "----> learning_rate = 1" + ], + "metadata": { + "id": "QJJ__xWZdu7L" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "learning_rate = 1" + ], + "metadata": { + "id": "oMu4WtyOdu7M" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "f96c8962-cd27-45bc-8894-93119cddac4d", + "id": "4aMJb_Pldu7M" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.07048397742898037. ACC 0.5333200716132882 \n", + "Epoch: 1. Loss: 0.06720985362822383. 
ACC 0.5971752536303958 \n", + "Epoch: 2. Loss: 0.064870651991438. ACC 0.6210463497115576 \n", + "Epoch: 3. Loss: 0.06366148338193137. ACC 0.6363636363636364 \n", + "Epoch: 4. Loss: 0.06282020589810658. ACC 0.6471056296001592 \n", + "Epoch: 5. Loss: 0.06211517086044704. ACC 0.6614282872488562 \n", + "Epoch: 6. Loss: 0.06135640189463656. ACC 0.668788541873881 \n", + "Epoch: 7. Loss: 0.06065581123872047. ACC 0.6731649094887607 \n", + "Epoch: 8. Loss: 0.059938330245534965. ACC 0.6801273125124329 \n", + "Epoch: 9. Loss: 0.059223499554441915. ACC 0.6882832703401631 \n", + "Epoch: 10. Loss: 0.05856797975338499. ACC 0.695643524965188 \n", + "Epoch: 11. Loss: 0.05786031688097957. ACC 0.7006166699820967 \n", + "Epoch: 12. Loss: 0.0571405140085393. ACC 0.705987666600358 \n", + "Epoch: 13. Loss: 0.056542831515848505. ACC 0.7111597374179431 \n", + "Epoch: 14. Loss: 0.055870077107935366. ACC 0.7167296598368809 \n", + "Epoch: 15. Loss: 0.0551753983254316. ACC 0.7242888402625821 \n", + "Epoch: 16. Loss: 0.05453141237736365. ACC 0.7312512432862542 \n", + "Epoch: 17. Loss: 0.054280134731978145. ACC 0.7358265367018102 \n", + "Epoch: 18. Loss: 0.05365139386194801. ACC 0.7402029043166899 \n", + "Epoch: 19. Loss: 0.053165083512091746. ACC 0.7411975333200717 \n", + "Epoch: 20. Loss: 0.052540554120167365. ACC 0.7517406007559181 \n", + "Epoch: 21. Loss: 0.05197549119565035. ACC 0.7598965585836484 \n", + "Epoch: 22. Loss: 0.051534760308379515. ACC 0.7583051521782375 \n", + "Epoch: 23. Loss: 0.05152658173328654. ACC 0.7636761487964989 \n", + "Epoch: 24. Loss: 0.05128098886237008. ACC 0.7676546648100259 \n", + "Epoch: 25. Loss: 0.050514043869959425. ACC 0.770638551820171 \n", + "Epoch: 26. Loss: 0.050716691796623595. ACC 0.7702407002188184 \n", + "Epoch: 27. Loss: 0.05002721436158703. ACC 0.7807837676546648 \n", + "Epoch: 28. Loss: 0.049793344056624673. ACC 0.7789934354485777 \n", + "Epoch: 29. Loss: 0.05015849819699955. ACC 0.7768052516411379 \n", + "Epoch: 30. Loss: 0.04937954791933624. ACC 0.7847622836681918 \n", + "Epoch: 31. Loss: 0.0507163125921581. ACC 0.7783966580465487 \n", + "Epoch: 32. Loss: 0.049295397164488684. ACC 0.7817783966580466 \n", + "Epoch: 33. Loss: 0.04904123517881574. ACC 0.787746170678337 \n", + "Epoch: 34. Loss: 0.050237354649234914. ACC 0.7879450964790133 \n", + "Epoch: 35. Loss: 0.05178575980344942. ACC 0.7690471454147603 \n", + "Epoch: 36. Loss: 0.04931285798694899. ACC 0.7885418738810424 \n", + "Epoch: 37. Loss: 0.048327607255535. ACC 0.7970956833101254 \n", + "Epoch: 38. Loss: 0.05123219209324301. ACC 0.7776009548438433 \n", + "Epoch: 39. Loss: 0.05082752190860811. 
ACC 0.7853590610702208 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.49 0.70 0.58 230\n", + " 1 0.69 0.48 0.56 319\n", + "\n", + " accuracy 0.57 549\n", + " macro avg 0.59 0.59 0.57 549\n", + "weighted avg 0.60 0.57 0.57 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "----> learning_rate = 0.2" + ], + "metadata": { + "id": "PBHmPwG7eEyF" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "learning_rate = 0.2" + ], + "metadata": { + "id": "KNmltKiMeEyF" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "3cbffae1-c6a3-4adb-d28f-527629a7897a", + "id": "PmDE6j86eEyF" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06899349832240739. ACC 0.5325243684105828 \n", + "Epoch: 1. Loss: 0.06535281002912792. ACC 0.6166699820966779 \n", + "Epoch: 2. Loss: 0.06280231211421171. ACC 0.6445195941913666 \n", + "Epoch: 3. Loss: 0.061588076105026736. ACC 0.6560572906305948 \n", + "Epoch: 4. Loss: 0.06084509976156336. ACC 0.6652078774617067 \n", + "Epoch: 5. Loss: 0.06039042388773923. ACC 0.6677939128704993 \n", + "Epoch: 6. Loss: 0.060018786022623755. ACC 0.6741595384921424 \n", + "Epoch: 7. Loss: 0.05978551142989273. ACC 0.6761487964989059 \n", + "Epoch: 8. Loss: 0.05955219064267274. ACC 0.6805251641137856 \n", + "Epoch: 9. Loss: 0.05941371199077798. ACC 0.6823154963198726 \n", + "Epoch: 10. Loss: 0.05925566872190191. ACC 0.6847026059279888 \n", + "Epoch: 11. Loss: 0.059158015925817375. ACC 0.6860950865327233 \n", + "Epoch: 12. Loss: 0.05904056864299854. ACC 0.6876864929381341 \n", + "Epoch: 13. Loss: 0.05895315326182629. ACC 0.6900736025462503 \n", + "Epoch: 14. Loss: 0.05884284056131755. ACC 0.6906703799482793 \n", + "Epoch: 15. Loss: 0.058768001623838544. ACC 0.6912671573503083 \n", + "Epoch: 16. Loss: 0.058660021839877924. ACC 0.6918639347523374 \n", + "Epoch: 17. Loss: 0.058595194200267604. ACC 0.6920628605530137 \n", + "Epoch: 18. Loss: 0.0585134467263807. ACC 0.691665008951661 \n", + "Epoch: 19. Loss: 0.05843780175672508. ACC 0.6920628605530137 \n", + "Epoch: 20. Loss: 0.058367803998132906. ACC 0.6934553411577482 \n", + "Epoch: 21. Loss: 0.058319510096509576. ACC 0.6938531927591008 \n", + "Epoch: 22. Loss: 0.05823388837406833. ACC 0.6926596379550428 \n", + "Epoch: 23. Loss: 0.05819618467761006. ACC 0.6944499701611299 \n", + "Epoch: 24. Loss: 0.058123201463813924. ACC 0.6948478217624826 \n", + "Epoch: 25. Loss: 0.05807964422248907. ACC 0.6948478217624826 \n", + "Epoch: 26. Loss: 0.05803262060409942. ACC 0.6966381539685698 \n", + "Epoch: 27. Loss: 0.058006321421100196. ACC 0.6964392281678934 \n", + "Epoch: 28. Loss: 0.057945672189214724. ACC 0.6980306345733042 \n", + "Epoch: 29. Loss: 0.0579122100295901. ACC 0.6964392281678934 \n", + "Epoch: 30. Loss: 0.05785082062765265. ACC 0.6984284861746568 \n", + "Epoch: 31. Loss: 0.05783017722232835. ACC 0.6982295603739805 \n", + "Epoch: 32. Loss: 0.05776693871878762. 
ACC 0.6988263377760096 \n", + "Epoch: 33. Loss: 0.057726877369320084. ACC 0.7004177441814203 \n", + "Epoch: 34. Loss: 0.05768596271124469. ACC 0.7000198925800676 \n", + "Epoch: 35. Loss: 0.0576562900292558. ACC 0.6990252635766859 \n", + "Epoch: 36. Loss: 0.05760319688957113. ACC 0.6994231151780386 \n", + "Epoch: 37. Loss: 0.057580855810099196. ACC 0.6990252635766859 \n", + "Epoch: 38. Loss: 0.057520069213043465. ACC 0.7006166699820967 \n", + "Epoch: 39. Loss: 0.05750346502727511. ACC 0.7004177441814203 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.62 0.32 0.42 230\n", + " 1 0.64 0.86 0.73 319\n", + "\n", + " accuracy 0.63 549\n", + " macro avg 0.63 0.59 0.58 549\n", + "weighted avg 0.63 0.63 0.60 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "#### 6. OPTIMIZER" + ], + "metadata": { + "id": "enMCV7JAeU1k" + } + }, + { + "cell_type": "markdown", + "source": [ + "Best results with SGD:\n", + "\n", + "```\n", + " precision recall f1-score support\n", + "\n", + " 0 0.65 0.34 0.45 230\n", + " 1 0.65 0.87 0.74 319\n", + "\n", + " accuracy 0.65 549\n", + " macro avg 0.65 0.61 0.60 549\n", + "weighted avg 0.65 0.65 0.62 549\n", + "```" + ], + "metadata": { + "id": "DWf55kPme54B" + } + }, + { + "cell_type": "code", + "source": [ + "# Already optimized\n", + "num_epochs = 40\n", + "batch_size = 10\n", + "hidden_dim = 10\n", + "learning_rate = 0.1\n", + "# activation = hardtanh\n" + ], + "metadata": { + "id": "t38eX3rheWnZ" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "\n", + "# --> Adam\n", + "optimizer = torch.optim.Adam(model_ffnn.parameters(), lr=learning_rate)\n", + "\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "JBnYT4fJeswN", + "outputId": "142ff51c-5bb2-4ab7-b5d3-4cce5dd2b4f6" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.07264429911813988. ACC 0.570917047941118 \n", + "Epoch: 1. Loss: 0.0698777247855363. ACC 0.6085140242689476 \n", + "Epoch: 2. Loss: 0.06719673022565582. ACC 0.6427292619852795 \n", + "Epoch: 3. Loss: 0.06782082412395607. ACC 0.6347722299582256 \n", + "Epoch: 4. Loss: 0.06586369816958608. ACC 0.6514819972150387 \n", + "Epoch: 5. Loss: 0.06640658026065745. ACC 0.6602347324447981 \n", + "Epoch: 6. Loss: 0.06676655723201083. ACC 0.655062661627213 \n", + "Epoch: 7. Loss: 0.065749119306941. ACC 0.6606325840461508 \n", + "Epoch: 8. Loss: 0.0638378215714717. ACC 0.6658046548637359 \n", + "Epoch: 9. Loss: 0.06394622921469388. ACC 0.6757509448975532 \n", + "Epoch: 10. Loss: 0.06449328309585685. ACC 0.6791326835090511 \n", + "Epoch: 11. Loss: 0.06400464364441057. ACC 0.681320867316491 \n", + "Epoch: 12. Loss: 0.06352166555227569. ACC 0.6892778993435449 \n", + "Epoch: 13. Loss: 0.062298728503999735. ACC 0.691665008951661 \n", + "Epoch: 14. Loss: 0.06232551923709004. ACC 0.6924607121543664 \n", + "Epoch: 15. Loss: 0.06114841115469958. ACC 0.6944499701611299 \n", + "Epoch: 16. Loss: 0.05976851396699869. ACC 0.7085737020091506 \n", + "Epoch: 17. Loss: 0.06054833280013384. ACC 0.7006166699820967 \n", + "Epoch: 18. 
Loss: 0.059087974456757134. ACC 0.7119554406206485 \n", + "Epoch: 19. Loss: 0.059151375325745326. ACC 0.7103640342152377 \n", + "Epoch: 20. Loss: 0.058917334649703695. ACC 0.7175253630395863 \n", + "Epoch: 21. Loss: 0.059819571482677546. ACC 0.7034016311915655 \n", + "Epoch: 22. Loss: 0.05843818292054358. ACC 0.7177242888402626 \n", + "Epoch: 23. Loss: 0.05739628991639375. ACC 0.7219017306544658 \n", + "Epoch: 24. Loss: 0.05801852424196441. ACC 0.7228963596578476 \n", + "Epoch: 25. Loss: 0.05775874622072784. ACC 0.7232942112592003 \n", + "Epoch: 26. Loss: 0.05823313734754075. ACC 0.719116769444997 \n", + "Epoch: 27. Loss: 0.056928637741170254. ACC 0.7256813208673165 \n", + "Epoch: 28. Loss: 0.05637881731877916. ACC 0.7286652078774617 \n", + "Epoch: 29. Loss: 0.056092530804657904. ACC 0.737417943107221 \n", + "Epoch: 30. Loss: 0.056385003240760144. ACC 0.7304555400835488 \n", + "Epoch: 31. Loss: 0.05450110447965229. ACC 0.7413964591207479 \n", + "Epoch: 32. Loss: 0.05588278492906351. ACC 0.7330415754923414 \n", + "Epoch: 33. Loss: 0.05449453544932093. ACC 0.7475631589417148 \n", + "Epoch: 34. Loss: 0.053483706472324095. ACC 0.7537298587626815 \n", + "Epoch: 35. Loss: 0.05486349593859106. ACC 0.7509448975532127 \n", + "Epoch: 36. Loss: 0.05452731025768531. ACC 0.7455739009349512 \n", + "Epoch: 37. Loss: 0.05441718007861584. ACC 0.7493534911478019 \n", + "Epoch: 38. Loss: 0.05406648226664795. ACC 0.7515416749552417 \n", + "Epoch: 39. Loss: 0.053826684245229454. ACC 0.7589019295802666 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.61 0.25 0.35 230\n", + " 1 0.62 0.89 0.73 319\n", + "\n", + " accuracy 0.62 549\n", + " macro avg 0.62 0.57 0.54 549\n", + "weighted avg 0.62 0.62 0.57 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "\n", + "# --> RMSprop\n", + "optimizer = torch.optim.RMSprop(model_ffnn.parameters(), lr=learning_rate)\n", + "\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "5bsQDKcMfC2u", + "outputId": "d615f772-4ee7-4efa-e13c-3a7f22b839de" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.09697006490753687. ACC 0.5058683111199522 \n", + "Epoch: 1. Loss: 0.08242940319943243. ACC 0.5237716331808235 \n", + "Epoch: 2. Loss: 0.07632671735598312. ACC 0.5834493733837278 \n", + "Epoch: 3. Loss: 0.07356333009926823. ACC 0.6077183210662422 \n", + "Epoch: 4. Loss: 0.07211291304764843. ACC 0.6341754525561966 \n", + "Epoch: 5. Loss: 0.06881588072574753. ACC 0.6532723294211259 \n", + "Epoch: 6. Loss: 0.06823393546588762. ACC 0.6532723294211259 \n", + "Epoch: 7. Loss: 0.06615830103155157. ACC 0.6705788740799682 \n", + "Epoch: 8. Loss: 0.06583924692324955. ACC 0.672767057887408 \n", + "Epoch: 9. Loss: 0.06481681103103555. ACC 0.6801273125124329 \n", + "Epoch: 10. Loss: 0.06337317771600497. ACC 0.6920628605530137 \n", + "Epoch: 11. Loss: 0.06232508947034942. ACC 0.6958424507658644 \n", + "Epoch: 12. Loss: 0.060841072802043754. ACC 0.7137457728267357 \n", + "Epoch: 13. Loss: 0.061758819855608806. ACC 0.7051919633976527 \n", + "Epoch: 14. Loss: 0.06073459279757739. 
ACC 0.7097672568132086 \n", + "Epoch: 15. Loss: 0.05904280062618895. ACC 0.7195146210463497 \n", + "Epoch: 16. Loss: 0.057938514900937904. ACC 0.7324447980903123 \n", + "Epoch: 17. Loss: 0.05864429787343792. ACC 0.7246866918639348 \n", + "Epoch: 18. Loss: 0.05763244996321991. ACC 0.7316490948876069 \n", + "Epoch: 19. Loss: 0.05643269296951173. ACC 0.7352297592997812 \n", + "Epoch: 20. Loss: 0.05568546101210818. ACC 0.7429878655261587 \n", + "Epoch: 21. Loss: 0.05648875765625468. ACC 0.7433857171275114 \n", + "Epoch: 22. Loss: 0.05520759497200231. ACC 0.7400039785160135 \n", + "Epoch: 23. Loss: 0.05451818002320816. ACC 0.7517406007559181 \n", + "Epoch: 24. Loss: 0.05449903000354388. ACC 0.7525363039586235 \n", + "Epoch: 25. Loss: 0.05405041901200198. ACC 0.7533320071613289 \n", + "Epoch: 26. Loss: 0.0540137551295098. ACC 0.751143823353889 \n", + "Epoch: 27. Loss: 0.053885131552652406. ACC 0.7549234135667396 \n", + "Epoch: 28. Loss: 0.05264411930840367. ACC 0.7612890391883828 \n", + "Epoch: 29. Loss: 0.05198502114480975. ACC 0.7664611100059677 \n", + "Epoch: 30. Loss: 0.05111291729949781. ACC 0.774020290431669 \n", + "Epoch: 31. Loss: 0.05044663274634399. ACC 0.7785955838472249 \n", + "Epoch: 32. Loss: 0.05148766371075998. ACC 0.7702407002188184 \n", + "Epoch: 33. Loss: 0.053206163160436214. ACC 0.7577083747762084 \n", + "Epoch: 34. Loss: 0.052041790745107515. ACC 0.7634772229958225 \n", + "Epoch: 35. Loss: 0.05049813932883519. ACC 0.7783966580465487 \n", + "Epoch: 36. Loss: 0.050170299291954566. ACC 0.7809826934553411 \n", + "Epoch: 37. Loss: 0.04990425775198498. ACC 0.7817783966580466 \n", + "Epoch: 38. Loss: 0.04984352514384897. ACC 0.7851601352695444 \n", + "Epoch: 39. Loss: 0.049690401037443176. ACC 0.782574099860752 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.57 0.21 0.31 230\n", + " 1 0.61 0.89 0.72 319\n", + "\n", + " accuracy 0.60 549\n", + " macro avg 0.59 0.55 0.51 549\n", + "weighted avg 0.59 0.60 0.55 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "These alternatives do not work better here" + ], + "metadata": { + "id": "fXzs8Ty_fed_" + } + }, + { + "cell_type": "markdown", + "source": [ + "#### 2. 
NUMBER OF EPOCHS" + ], + "metadata": { + "id": "I3fnAViV7Gfx" + } + }, + { + "cell_type": "code", + "source": [ + "# Hyperparameters\n", + "hidden_dim = 4\n", + "learning_rate = 0.1\n", + "criterion = nn.CrossEntropyLoss()\n", + "output_dim = 2" + ], + "metadata": { + "id": "k9LyIz0C7RJX" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "------> EPOCHS 5" + ], + "metadata": { + "id": "wVCcgJ4W7T9F" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "num_epochs = 5" + ], + "metadata": { + "id": "b73SD6T27S6g" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=5 )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "946088ea-438b-42a9-9120-0d7f3b49ae64", + "id": "LIwM6tKg7Gfz" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06961566724382624. ACC 0.4993037596976328 \n", + "Epoch: 1. Loss: 0.06948280193375526. ACC 0.5084543465287448 \n", + "Epoch: 2. Loss: 0.0693099447289826. ACC 0.5184006365625622 \n", + "Epoch: 3. Loss: 0.06904770855119173. ACC 0.5335189974139646 \n", + "Epoch: 4. Loss: 0.06864595733344353. ACC 0.5528148000795703 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.49 0.64 0.56 230\n", + " 1 0.67 0.53 0.59 319\n", + "\n", + " accuracy 0.58 549\n", + " macro avg 0.58 0.58 0.57 549\n", + "weighted avg 0.60 0.58 0.58 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "-------> EPOCHS 50" + ], + "metadata": { + "id": "rI9VK8mI7dUO" + } + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "num_epochs = 5" + ], + "metadata": { + "id": "P_B5hb6M7bED" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs ) # <---- bien modifie ici !\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "524df7c0-c715-4dbb-f40d-0dcf81eb77a3", + "id": "UCzagdA-7bEE" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06958532500556337. ACC 0.5050726079172468 \n", + "Epoch: 1. Loss: 0.0694332154579231. ACC 0.5128307141436245 \n", + "Epoch: 2. Loss: 0.06921998313152934. ACC 0.5203898945693256 \n", + "Epoch: 3. Loss: 0.06889650951874572. ACC 0.5363039586234335 \n", + "Epoch: 4. Loss: 0.06840828495659268. 
ACC 0.5627610901133877 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.56 0.49 0.52 230\n", + " 1 0.66 0.72 0.69 319\n", + "\n", + " accuracy 0.62 549\n", + " macro avg 0.61 0.60 0.61 549\n", + "weighted avg 0.62 0.62 0.62 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [ + "# To optimize\n", + "num_epochs = 50" + ], + "metadata": { + "id": "JkLqF0TO7Z5O" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "0e1cf32d-6454-48e3-ab6b-dc495302c852", + "id": "f3j9943U7Z5P" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06974182658672617. ACC 0.4991048338969564 \n", + "Epoch: 1. Loss: 0.06959589539784572. ACC 0.5030833499104834 \n", + "Epoch: 2. Loss: 0.06954592943001818. ACC 0.5048736821165705 \n", + "Epoch: 3. Loss: 0.06946516128285447. ACC 0.5106425303361846 \n", + "Epoch: 4. Loss: 0.06933077723410344. ACC 0.5174060075591804 \n", + "Epoch: 5. Loss: 0.06910902816637007. ACC 0.5253630395862343 \n", + "Epoch: 6. Loss: 0.06875421752127213. ACC 0.5450566938531928 \n", + "Epoch: 7. Loss: 0.06821451198753553. ACC 0.5691267157350308 \n", + "Epoch: 8. Loss: 0.06745087245550738. ACC 0.5957827730256614 \n", + "Epoch: 9. Loss: 0.06646774455190582. ACC 0.6182613885020887 \n", + "Epoch: 10. Loss: 0.06533211761922895. ACC 0.630992639745375 \n", + "Epoch: 11. Loss: 0.06415400347131987. ACC 0.6482991844042172 \n", + "Epoch: 12. Loss: 0.0630416821088899. ACC 0.658444400238711 \n", + "Epoch: 13. Loss: 0.062068336386317306. ACC 0.6660035806644121 \n", + "Epoch: 14. Loss: 0.06126375362700915. ACC 0.6705788740799682 \n", + "Epoch: 15. Loss: 0.060623588169785414. ACC 0.6745573900934951 \n", + "Epoch: 16. Loss: 0.06012473420947879. ACC 0.6779391287049931 \n", + "Epoch: 17. Loss: 0.059738411500483826. ACC 0.6817187189178436 \n", + "Epoch: 18. Loss: 0.059437829628364455. ACC 0.6827133479212254 \n", + "Epoch: 19. Loss: 0.05920113778764077. ACC 0.6854983091306942 \n", + "Epoch: 20. Loss: 0.0590116542602643. ACC 0.685896160732047 \n", + "Epoch: 21. Loss: 0.05885704045198963. ACC 0.6872886413367814 \n", + "Epoch: 22. Loss: 0.058728317605446036. ACC 0.6888800477421921 \n", + "Epoch: 23. Loss: 0.058619045448843894. ACC 0.6898746767455739 \n", + "Epoch: 24. Loss: 0.05852469103531262. ACC 0.6918639347523374 \n", + "Epoch: 25. Loss: 0.05844213044475034. ACC 0.6924607121543664 \n", + "Epoch: 26. Loss: 0.05836920895205593. ACC 0.695643524965188 \n", + "Epoch: 27. Loss: 0.058304359027398896. ACC 0.6946488959618062 \n", + "Epoch: 28. Loss: 0.05824632115361229. ACC 0.695046747563159 \n", + "Epoch: 29. Loss: 0.058194001341805156. ACC 0.6954445991645116 \n", + "Epoch: 30. Loss: 0.058146420063686956. ACC 0.6966381539685698 \n", + "Epoch: 31. Loss: 0.05810272050397117. ACC 0.6970360055699224 \n", + "Epoch: 32. Loss: 0.05806217686020643. ACC 0.6964392281678934 \n", + "Epoch: 33. Loss: 0.05802419174611509. ACC 0.6980306345733042 \n", + "Epoch: 34. Loss: 0.057988279923924115. 
ACC 0.6978317087726278 \n", + "Epoch: 35. Loss: 0.05795405742565775. ACC 0.6994231151780386 \n", + "Epoch: 36. Loss: 0.05792121719906149. ACC 0.6996220409787149 \n", + "Epoch: 37. Loss: 0.05788951218187489. ACC 0.7000198925800676 \n", + "Epoch: 38. Loss: 0.057858746260944074. ACC 0.7000198925800676 \n", + "Epoch: 39. Loss: 0.05782876131855661. ACC 0.7000198925800676 \n", + "Epoch: 40. Loss: 0.0577994254644666. ACC 0.700218818380744 \n", + "Epoch: 41. Loss: 0.057770634037802845. ACC 0.700218818380744 \n", + "Epoch: 42. Loss: 0.05774229846158131. ACC 0.6998209667793913 \n", + "Epoch: 43. Loss: 0.05771434464795181. ACC 0.7006166699820967 \n", + "Epoch: 44. Loss: 0.05768671226899857. ACC 0.7004177441814203 \n", + "Epoch: 45. Loss: 0.05765934953970347. ACC 0.7014123731848021 \n", + "Epoch: 46. Loss: 0.05763221215082491. ACC 0.7012134473841257 \n", + "Epoch: 47. Loss: 0.057605264087024016. ACC 0.7012134473841257 \n", + "Epoch: 48. Loss: 0.05757847185255354. ACC 0.700815595782773 \n", + "Epoch: 49. Loss: 0.05755180793940155. ACC 0.6998209667793913 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.65 0.33 0.44 230\n", + " 1 0.64 0.87 0.74 319\n", + "\n", + " accuracy 0.64 549\n", + " macro avg 0.65 0.60 0.59 549\n", + "weighted avg 0.65 0.64 0.61 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "--------> EPOCHS 40" + ], + "metadata": { + "id": "lzGveQ6eZdNH" + } + }, + { + "cell_type": "code", + "source": [ + "# Final value\n", + "num_epochs = 40" + ], + "metadata": { + "id": "m-mwa66bZZpG" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "outputId": "673d5ded-f25d-4fcf-b390-9372d7e22a3d", + "id": "3x_KKznZZZpH" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Epoch: 0. Loss: 0.06965734783403675. ACC 0.5002983887010145 \n", + "Epoch: 1. Loss: 0.06959862697899923. ACC 0.4985080564949274 \n", + "Epoch: 2. Loss: 0.06953820148092012. ACC 0.5040779789138651 \n", + "Epoch: 3. Loss: 0.0694468540132271. ACC 0.5086532723294211 \n", + "Epoch: 4. Loss: 0.06930082560960972. ACC 0.516610304356475 \n", + "Epoch: 5. Loss: 0.06906634956888776. ACC 0.5279490749950269 \n", + "Epoch: 6. Loss: 0.06870004659811688. ACC 0.5436642132484584 \n", + "Epoch: 7. Loss: 0.06815687391661687. ACC 0.570917047941118 \n", + "Epoch: 8. Loss: 0.06740806327796679. ACC 0.6007559180425701 \n", + "Epoch: 9. Loss: 0.06646292948784495. ACC 0.618460314302765 \n", + "Epoch: 10. Loss: 0.06537891070980757. ACC 0.6317883429480804 \n", + "Epoch: 11. Loss: 0.06424785023753392. ACC 0.6433260393873085 \n", + "Epoch: 12. Loss: 0.06316567815414878. ACC 0.655062661627213 \n", + "Epoch: 13. Loss: 0.0622045275454543. ACC 0.6644121742590013 \n", + "Epoch: 14. Loss: 0.06139975504607691. ACC 0.6677939128704993 \n", + "Epoch: 15. Loss: 0.06075341642341629. ACC 0.6749552416948478 \n", + "Epoch: 16. Loss: 0.06024713083577009. ACC 0.6757509448975532 \n", + "Epoch: 17. Loss: 0.059854913373246735. 
ACC 0.6801273125124329 \n", + "Epoch: 18. Loss: 0.059551304985472855. ACC 0.682514422120549 \n", + "Epoch: 19. Loss: 0.05931491827618561. ACC 0.6854983091306942 \n", + "Epoch: 20. Loss: 0.05912910468516208. ACC 0.6860950865327233 \n", + "Epoch: 21. Loss: 0.05898135331959001. ACC 0.6874875671374577 \n", + "Epoch: 22. Loss: 0.05886238056337096. ACC 0.6880843445394867 \n", + "Epoch: 23. Loss: 0.0587652856760099. ACC 0.6892778993435449 \n", + "Epoch: 24. Loss: 0.0586849166499423. ACC 0.6902725283469267 \n", + "Epoch: 25. Loss: 0.0586173931125241. ACC 0.6912671573503083 \n", + "Epoch: 26. Loss: 0.058559769678950574. ACC 0.6934553411577482 \n", + "Epoch: 27. Loss: 0.05850979673926485. ACC 0.6946488959618062 \n", + "Epoch: 28. Loss: 0.05846574890084738. ACC 0.6952456733638352 \n", + "Epoch: 29. Loss: 0.058426297378511585. ACC 0.6972349313705988 \n", + "Epoch: 30. Loss: 0.05839041295169195. ACC 0.6976327829719514 \n", + "Epoch: 31. Loss: 0.05835730454569335. ACC 0.6986274119753332 \n", + "Epoch: 32. Loss: 0.05832635464714753. ACC 0.7000198925800676 \n", + "Epoch: 33. Loss: 0.058297083389457896. ACC 0.7000198925800676 \n", + "Epoch: 34. Loss: 0.05826912110407215. ACC 0.700218818380744 \n", + "Epoch: 35. Loss: 0.058242175233798064. ACC 0.7006166699820967 \n", + "Epoch: 36. Loss: 0.058216016217162124. ACC 0.7000198925800676 \n", + "Epoch: 37. Loss: 0.05819046414346659. ACC 0.7000198925800676 \n", + "Epoch: 38. Loss: 0.058165371755920965. ACC 0.700218818380744 \n", + "Epoch: 39. Loss: 0.05814062157634335. ACC 0.6998209667793913 \n", + " precision recall f1-score support\n", + "\n", + " 0 0.66 0.33 0.44 230\n", + " 1 0.64 0.87 0.74 319\n", + "\n", + " accuracy 0.65 549\n", + " macro avg 0.65 0.60 0.59 549\n", + "weighted avg 0.65 0.65 0.62 549\n", + "\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "---> Loss continues to decrease slightly, accuracy starts to even decrease around 40. We keep 40." + ], + "metadata": { + "id": "iw-cOfbUZDoq" + } + } + ] +} \ No newline at end of file diff --git a/notebooks/TP5_m2LiTL_learningWithNN_2324.ipynb b/notebooks/TP5_m2LiTL_learningWithNN_2324.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..a2603e7ba2890cb5db77441de19562d5099a229b --- /dev/null +++ b/notebooks/TP5_m2LiTL_learningWithNN_2324.ipynb @@ -0,0 +1,665 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "cell_type": "markdown", + "source": [ + "# TP 5 : machine learning using neural network for text data\n", + "\n", + "In this practical session, we are going to build simple neural models able to classify reviews as positive or negative. The dataset used comes from AlloCine.\n", + "The goals are to understand how to use pretrained embeddings, and to correctly tune a neural model.\n", + "\n", + "you need to load:\n", + "- Allocine: Train, dev and test sets\n", + "- Embeddings: cc.fr.300.10000.vec (10,000 first lines of the original file)\n", + "\n", + "## Part 1- Pre-trained word embeddings\n", + "Define a neural network that takes as input pre-trained word embeddings (here FastText embeddings). Words are represented by real-valued vectors from FastText. 
A review is represented by a vector that is the average or the sum of the word vectors.\n", + "\n", + "So instead of having an input vector of size 5000, we now have an input vector of size e.g. 300, that represents the ‘average’, combined meaning of all the words in the document taken together.\n", + "\n", + "## Part 2- Tuning report\n", + "Tune the model built on pre-trained word embeddings by testing several values for the different hyper-parameters, and by testing the addition of a hidden layer.\n", + "\n", + "Describe the performance obtained by reporting the scores for each setting on the development set, plotting the loss function against the hyper-parameter values, and reporting the score of the best model on the test set.\n", + "\n", + "-------------------------------------" + ], + "metadata": { + "id": "jShhTl5Mftkw" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Useful imports\n", + "\n", + "Here we also:\n", + "* Look at the availability of a GPU. Reminder: in Colab, you have to go to Edit/Notebook settings to enable the use of a GPU\n", + "* Set a seed, for reproducibility: https://pytorch.org/docs/stable/notes/randomness.html\n" + ], + "metadata": { + "id": "mT2uF3G6HXko" + } + }, + { + "cell_type": "code", + "source": [ + "import time\n", + "import pandas as pd\n", + "import numpy as np\n", + "# torch and torch modules to deal with text data\n", + "import torch\n", + "import torch.nn as nn\n", + "from torchtext.data.utils import get_tokenizer\n", + "from torchtext.vocab import build_vocab_from_iterator\n", + "from torch.utils.data import DataLoader\n", + "# you can use scikit to print scores\n", + "from sklearn.metrics import classification_report\n", + "\n", + "# For reproducibility, set a seed\n", + "torch.manual_seed(0)\n", + "\n", + "# Check for GPU\n", + "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n", + "print(device)" + ], + "metadata": { + "id": "nB_k89m8xAOt" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "Paths to data:" + ], + "metadata": { + "id": "taGY9N-PJvWS" + } + }, + { + "cell_type": "code", + "source": [ + "# Data files\n", + "train_file = \"allocine_train.tsv\"\n", + "dev_file = \"allocine_dev.tsv\"\n", + "test_file = \"allocine_test.tsv\"\n", + "# embeddings\n", + "embed_file='cc.fr.300.10000.vec'" + ], + "metadata": { + "id": "kGty4hWCJurB" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## 1- Read and load the data\n" + ], + "metadata": { + "id": "Wv6H41YoFycw" + } + }, + { + "cell_type": "markdown", + "source": [ + "### The class Dataset is defined below." + ], + "metadata": { + "id": "eXiJRrw_zsFD" + } + }, + { + "cell_type": "code", + "source": [ + "# Here we create a custom Dataset class that inherits from the Dataset class in PyTorch\n", + "# A custom Dataset class must implement three functions: __init__, __len__, and __getitem__\n", + "\n", + "\n", + "class Dataset(torch.utils.data.Dataset):\n", + "\n", + "    def __init__(self, tsv_file, vocab=None ):\n", + "        \"\"\" (REQUIRED) Here we save the location of our input file,\n", + "        load the data, i.e. retrieve the list of texts and associated labels,\n", + "        build the vocabulary if none is given,\n", + "        and define the pipelines used to prepare the data \"\"\"\n", + "        self.tsv_file = tsv_file\n", + "        self.data, self.label_list = self.load_data( )\n", + "        # splits the string on spaces; couldn't make the French tokenizer work\n", + "        self.tokenizer = get_tokenizer( None )\n", + "        self.vocab = vocab\n", + "        if not vocab:\n", + "            self.build_vocab()\n", + "        # pipelines for text and label\n", + "        self.text_pipeline = lambda x: self.vocab(self.tokenizer(x)) # return a list of indices from a text\n", + "        self.label_pipeline = lambda x: int(x) # simple mapping to self\n", + "\n", + "    def load_data( self ):\n", + "        \"\"\" Read a tsv file and return the list of texts and associated labels \"\"\"\n", + "        data = pd.read_csv( self.tsv_file, header=0, delimiter=\"\\t\", quoting=3)\n", + "        instances = []\n", + "        label_list = []\n", + "        for i in data.index:\n", + "            label_list.append( data[\"sentiment\"][i] )\n", + "            instances.append( data[\"review\"][i] )\n", + "        return instances, label_list\n", + "\n", + "    def build_vocab(self):\n", + "        \"\"\" Build the vocabulary, i.e. retrieve the list of unique tokens\n", + "        appearing in the corpus (= training set). We also add a specific index\n", + "        corresponding to unknown words. \"\"\"\n", + "        self.vocab = build_vocab_from_iterator(self.yield_tokens(), specials=[\"<unk>\"])\n", + "        self.vocab.set_default_index(self.vocab[\"<unk>\"])\n", + "\n", + "    def yield_tokens(self):\n", + "        \"\"\" Iterator on tokens \"\"\"\n", + "        for text in self.data:\n", + "            yield self.tokenizer(text)\n", + "\n", + "    def __len__(self):\n", + "        \"\"\" (REQUIRED) Return the length of the data,\n", + "        i.e. the total number of instances \"\"\"\n", + "        return len(self.data)\n", + "\n", + "    def __getitem__(self, index):\n", + "        \"\"\" (REQUIRED) Return a specific instance in a format that can be\n", + "        processed by PyTorch, i.e. torch tensors \"\"\"\n", + "        return (\n", + "            tuple( [torch.tensor(self.text_pipeline( self.data[index] ), dtype=torch.int64),\n", + "            torch.tensor( self.label_pipeline( self.label_list[index] ), dtype=torch.int64) ] )\n", + "        )" + ], + "metadata": { + "id": "GdK1WAmcFYHS" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### The function to generate data batches and iterators is defined below.\n" + ], + "metadata": { + "id": "bG3T9LQFTD73" + } + }, + { + "cell_type": "code", + "source": [ + "# This function defines how we process data to make batches of instances\n", + "# - The list of texts / reviews that is returned is similar to a list of lists:\n", + "#   each element is a batch, i.e. a set of BATCH_SIZE texts. 
But instead of\n", + "# creating sublists, PyTorch concatenates all the tensors corresponding to\n", + "# each text sequence into one tensor.\n", + "# - The list of labels is the list of list of labels for each batch\n", + "# - The offsets are used to save the position of each individual instance\n", + "# within the big tensor\n", + "def collate_fn(batch):\n", + " label_list, text_list, offsets = [], [], [0]\n", + " for ( _text, _label) in batch:\n", + " text_list.append( _text )\n", + " label_list.append( _label )\n", + " offsets.append(_text.size(0))\n", + " label = torch.tensor(label_list, dtype=torch.int64) #tensor of labels for a batch\n", + " offsets = torch.tensor(offsets[:-1]).cumsum(dim=0) #tensor of offset indices for a batch\n", + " text_list = torch.cat(text_list) # <--- here we concatenate the reviews in the batch\n", + " return text_list.to(device), label.to(device), offsets.to(device) #move the data to GPU" + ], + "metadata": { + "id": "oG0ZEYvYccBr" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### We load the data:" + ], + "metadata": { + "id": "U0ueXxdpZcqx" + } + }, + { + "cell_type": "code", + "source": [ + "# Load the training and development data\n", + "train = Dataset( train_file )\n", + "dev = Dataset( dev_file, vocab=train.vocab )\n", + "\n", + "train_loader = DataLoader(train, batch_size=2, shuffle=False, collate_fn=collate_fn) #<-- use shuffle = True instead\n", + "dev_loader = DataLoader(dev, batch_size=2, shuffle=False, collate_fn=collate_fn)\n", + "\n", + "\n", + "print(train[0])\n", + "print(train[1])\n", + "for input, label, offset in train_loader:\n", + " print( input, label, input.size(), offset )\n", + " break" + ], + "metadata": { + "id": "sGAiiL2rY7hD" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### The functions to load the embeddings vectors and build the weight matrix are defined below." 
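, + "\n", + "The cell below keeps two counters, `words_found` and `words_unk`, while filling the weight matrix. As a minimal sketch of the coverage check mentioned in its comments (assuming only those two counters), you could add at the end of that cell:\n", + "```\n", + "# fraction of the vocabulary covered by the pre-trained embeddings\n", + "coverage = words_found / (words_found + words_unk)\n", + "print(f\"Embedding coverage: {coverage:.1%} of the vocabulary\")\n", + "```"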
+ ], + "metadata": { + "id": "RX2DkAqws1gU" + } + }, + { + "cell_type": "code", + "source": [ + "import io\n", + "\n", + "def load_vectors(fname):\n", + " fin = io.open(fname, 'r', encoding='utf-8', newline='\\n', errors='ignore')\n", + " n, d = map(int, fin.readline().split())\n", + " print(\"Originally we have: \", n, 'tokens, and vectors of',d, 'dimensions') #here in fact only 10000 words\n", + " data = {}\n", + " for line in fin:\n", + " tokens = line.rstrip().split(' ')\n", + " data[tokens[0]] = [float(t) for t in tokens[1:]]\n", + " return data\n", + "\n", + "vectors = load_vectors( embed_file )\n", + "print( 'Version with', len( vectors), 'tokens')\n", + "print(vectors.keys() )\n", + "print( vectors['de'] )\n", + "\n", + "# Load the weight matrix: modify the code below to check the coverage of the\n", + "# pre-trained embeddings\n", + "emb_dim = 300\n", + "matrix_len = len(train.vocab)\n", + "weights_matrix = np.zeros((matrix_len, emb_dim))\n", + "words_found, words_unk = 0,0\n", + "\n", + "for i in range(0, len(train.vocab)):\n", + " word = train.vocab.lookup_token(i)\n", + " try:\n", + " weights_matrix[i] = vectors[word]\n", + " words_found += 1\n", + " except KeyError:\n", + " weights_matrix[i] = np.random.normal(scale=0.6, size=(emb_dim, ))\n", + " words_unk += 1\n", + "weights_matrix = torch.from_numpy(weights_matrix).to( torch.float32)\n", + "print( \"Words found:\", weights_matrix.size() )\n", + "print( \"Unk words:\", words_unk )" + ], + "metadata": { + "id": "yd2EEjECv4vk" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Model definition\n" + ], + "metadata": { + "id": "VcLWQgu877rQ" + } + }, + { + "cell_type": "code", + "source": [ + "class FeedforwardNeuralNetModel(nn.Module):\n", + " def __init__(self, hidden_dim, output_dim, weights_matrix):\n", + " # calls the init function of nn.Module. Dont get confused by syntax,\n", + " # just always do it in an nn.Module\n", + " super(FeedforwardNeuralNetModel, self).__init__()\n", + "\n", + " # Embedding layer\n", + " # mode (string, optional) – \"sum\", \"mean\" or \"max\". 
Default=mean.\n", + "        self.embedding_bag = nn.EmbeddingBag.from_pretrained(\n", + "            weights_matrix,\n", + "            mode='mean')\n", + "        embed_dim = self.embedding_bag.embedding_dim\n", + "\n", + "        # Linear function\n", + "        self.fc1 = nn.Linear(embed_dim, hidden_dim)\n", + "\n", + "        # Non-linearity\n", + "        self.sigmoid = nn.Sigmoid()\n", + "\n", + "        # Linear function (readout)\n", + "        self.fc2 = nn.Linear(hidden_dim, output_dim)\n", + "\n", + "    def forward(self, text, offsets):\n", + "        # Embedding layer\n", + "        embedded = self.embedding_bag(text, offsets)\n", + "\n", + "        # Linear function\n", + "        out = self.fc1(embedded)\n", + "\n", + "        # Non-linearity\n", + "        out = self.sigmoid(out)\n", + "\n", + "        # Linear function (readout)\n", + "        out = self.fc2(out)\n", + "        return out" + ], + "metadata": { + "id": "fXOPuCv_vZrr" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Train and evaluation functions are defined below:" + ], + "metadata": { + "id": "UsXmIGqApbxj" + } + }, + { + "cell_type": "code", + "source": [ + "import matplotlib.pyplot as plt\n", + "import os\n", + "\n", + "def my_plot(epochs, loss):\n", + "    plt.plot(epochs, loss)\n", + "    #fig.savefig(os.path.join('./lossGraphs', 'train.jpg'))\n", + "\n", + "def training(model, train_loader, optimizer, num_epochs=5, plot=False ):\n", + "    loss_vals = []\n", + "    for epoch in range(num_epochs):\n", + "        train_loss, total_acc, total_count = 0, 0, 0\n", + "        for input, label, offsets in train_loader:\n", + "            # Step 1. Clearing the accumulated gradients\n", + "            optimizer.zero_grad()\n", + "            # Step 2. Forward pass to get output/logits\n", + "            outputs = model( input, offsets ) # <---- extra offsets argument\n", + "            # Step 3. Compute the loss, gradients, and update the parameters by\n", + "            # calling optimizer.step()\n", + "            # - Calculate Loss: softmax --> cross entropy loss\n", + "            loss = criterion(outputs, label)\n", + "            # - Getting gradients w.r.t. parameters\n", + "            loss.backward()\n", + "            # - Updating parameters\n", + "            optimizer.step()\n", + "            # Accumulating the loss over time\n", + "            train_loss += loss.item()\n", + "            total_acc += (outputs.argmax(1) == label).sum().item()\n", + "            total_count += label.size(0)\n", + "        # Compute accuracy on the train set at each epoch\n", + "        print('Epoch: {}. Loss: {}. ACC {} '.format(epoch, train_loss/len(train), total_acc/len(train)))\n", + "        loss_vals.append(train_loss/len(train))\n", + "        total_acc, total_count = 0, 0\n", + "        train_loss = 0\n", + "    if plot:\n", + "        # plotting\n", + "        my_plot(np.linspace(1, num_epochs, num_epochs).astype(int), loss_vals)\n", + "\n", + "\n", + "def evaluate( model, dev_loader ):\n", + "    predictions = []\n", + "    gold = []\n", + "    with torch.no_grad():\n", + "        for input, label, offsets in dev_loader:\n", + "            probs = model(input, offsets) # <---- forward function called with offsets\n", + "            # -- to deal with batches\n", + "            predictions.extend( torch.argmax(probs, dim=1).cpu().numpy() )\n", + "            gold.extend([int(l) for l in label])\n", + "    print(classification_report(gold, predictions))\n", + "    return gold, predictions" + ], + "metadata": { + "id": "US_0JmN5phqs" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Set the values of the hyperparameters\n", + "hidden_dim = 4\n", + "learning_rate = 0.1\n", + "num_epochs = 5\n", + "criterion = nn.CrossEntropyLoss()\n", + "output_dim = 2" + ], + "metadata": { + "id": "Jod8FnWPs_Vi" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# Initialize the model\n", + "model_ffnn = FeedforwardNeuralNetModel( hidden_dim, output_dim, weights_matrix)\n", + "optimizer = torch.optim.SGD(model_ffnn.parameters(), lr=learning_rate)\n", + "model_ffnn = model_ffnn.to(device)\n", + "# Train the model\n", + "training( model_ffnn, train_loader, optimizer, num_epochs=num_epochs )\n", + "# Evaluate on dev\n", + "gold, pred = evaluate( model_ffnn, dev_loader )" + ], + "metadata": { + "id": "1Xug7ygbpAhS" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## 2 - Exercise: Tuning your model\n", + "\n", + "The model comes with a variety of hyper-parameters. To find the best model, we need to test different values for these free parameters.\n", + "\n", + "Be careful:\n", + "* You always optimize / fine-tune your model on the **development set**.\n", + "* Then you compare the results obtained with the different settings on the dev set, to choose the best setting.\n", + "* Finally, you report the results of the best model on the test set.\n", + "* You always keep track of your experiments, for reproducibility purposes: report the values tested for each hyper-parameter and the values used by your best model.\n", + "\n", + "In this part, you have to test different values for the following hyper-parameters:\n", + "\n", + "1. Batch size\n", + "2. Max number of epochs (with best batch size)\n", + "3. Size of the hidden layer\n", + "4. Activation function\n", + "5. Optimizer\n", + "6. Learning rate\n", + "\n", + "Inspect your model to form some hypotheses about the influence of these parameters, by looking at how they affect the loss during training and the performance of the model.\n", + "\n", + "**Note:** (not done below) Here you are trying to make a report on the performance of your model. Try to organise your code to keep track of what you're doing:\n", + "* give a different name to each model, to be able to run them again\n", + "* save the results in a dictionary or a file, to be able to use them later:\n", + "  * keep in mind that you should be able to provide e.g. plots of your results (for example, plotting the accuracy for different values of a specific hyper-parameter), or analyses of your results (e.g. by inspecting the predictions of your model), so you need to be able to access the results." + ], + "metadata": { + "id": "1HmIthzRumir" + } + }, + { + "cell_type": "code", + "source": [ + "from sklearn.metrics import accuracy_score, f1_score" + ], + "metadata": { + "id": "bS_br1eLi-X_" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# For now, we keep a medium number of epochs, e.g. 50\n", + "num_epochs = 50" + ], + "metadata": { + "id": "mPl550bHgYE7" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 1. BATCH SIZE\n", + "\n", + "We need to reload the data to change the size of the batch." + ], + "metadata": { + "id": "YXarvcQk4uEo" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "JXPJko1y24mp" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 3. HIDDEN SIZE" + ], + "metadata": { + "id": "NOCfrCXHXuHF" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "BOszjLHe3GSR" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 4. ACTIVATION FUNCTION" + ], + "metadata": { + "id": "SU5hZaAGa-oN" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "8qoX14qE3Gy5" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 5. LEARNING RATE" + ], + "metadata": { + "id": "T-MTCs64c9R-" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "M07mV-2B3HXM" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 6. OPTIMIZER" + ], + "metadata": { + "id": "enMCV7JAeU1k" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "piL2YOdw3IIw" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 2. NUMBER OF EPOCHS" + ], + "metadata": { + "id": "I3fnAViV7Gfx" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "wfjyhRp53NGw" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Additional exercise\n", + "\n", + "Modify your model to test a variation of the architecture. You don't have to tune the whole model again; just try it, for example, while keeping the best values found previously for the hyper-parameters:\n", + "\n", + "7. 
Try with 1 additional hidden layer" + ], + "metadata": { + "id": "VwGKy1zH0mYT" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "vdGOW2U83OdU" + }, + "execution_count": null, + "outputs": [] + } + ] +} \ No newline at end of file diff --git a/notebooks/TP6_m2LiTL_transformers_data_2324.ipynb b/notebooks/TP6_m2LiTL_transformers_data_2324.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..b1263c8ba2399fad680e37d43245c6178ffbc7a7 --- /dev/null +++ b/notebooks/TP6_m2LiTL_transformers_data_2324.ipynb @@ -0,0 +1,576 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "-bb49S7B50eh" + }, + "source": [ + "# TP 6: Introduction to transformers\n", + "\n", + "In this session, we will see how to use the HuggingFace Transformers library and pretrained models.\n", + "\n", + "We will again work on the sentiment analysis task, this time on the English IMDB data.\n", + "In HF terms, this is a sequence classification task.\n", + "\n", + "We will rely on the HuggingFace library and on Transformer language models (i.e. BERT).\n", + "\n", + "- https://huggingface.co/ : an open-source NLP library that offers a very rich API to use different architectures and models for the classic problems of classification, sequence tagging, generation... Feel free to browse the existing demos and models: https://huggingface.co/tasks/text-classification\n", + "- A rather large number of datasets is also accessible directly via the API, notably for text and images; see the datasets at https://huggingface.co/datasets and the documentation to manage them: https://huggingface.co/docs/datasets/index\n", + "\n", + "The code below installs:\n", + "- the *transformers* module, which contains the language models: https://pypi.org/project/transformers/\n", + "- the datasets library, to access datasets\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "9UoSnFV250el" + }, + "outputs": [], + "source": [ + "!pip install -U transformers\n", + "!pip install datasets" + ] + }, + { + "cell_type": "markdown", + "source": [ + "Finally, if the installation is successful, we can import the transformers library:" + ], + "metadata": { + "id": "StClx_Hh9PDm" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ZBQcA9Ol50en" + }, + "outputs": [], + "source": [ + "import transformers\n", + "from transformers import pipeline\n", + "from datasets import load_dataset\n", + "import numpy as np" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# 1. Sentiment analysis with a pretrained model\n", + "\n", + "Many NLP tasks are made easy to perform within HuggingFace using the Pipeline abstraction.\n", + "\n", + "Useful resource: the course made available on the HuggingFace website, e.g. the part on pipelines: https://huggingface.co/course/chapter1/3?fw=pt#working-with-pipelines\n", + "\n", + "\n", + "For example for text classification, we can very simply get access to pretrained models for varied tasks, including sentiment analysis:\n", + "https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.TextClassificationPipeline\n", + "\n", + "Let's try!"
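, + "\n", + "Before the exercises, here is the general shape of the pipeline API and of what it returns (a sketch; the label and score shown in the comment are illustrative, not real output):\n", + "```\n", + "classifier = pipeline(\"sentiment-analysis\")  # a default model is selected\n", + "classifier(\"This movie is great!\")\n", + "# returns a list of dicts, one per input text, e.g.:\n", + "# [{'label': 'POSITIVE', 'score': 0.99}]\n", + "```"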
+ ], + "metadata": { + "id": "4avqXNnF73M0" + } + }, + { + "cell_type": "markdown", + "source": [ + "#### 1.1 ▶▶ Exercise: Default model\n", + "\n", + "You can test pipelines by simply specifying the task you want to perform, a model is chosen by default.\n", + "\n", + "Run the code below:\n", + "* what is the name of the chosen pretrained model?\n", + "* what language?\n", + "* run the next lines and look at the predictions of the model, does it seem alright? Can you produce an example that is not well predicted?" + ], + "metadata": { + "id": "TxAzsZLjA6P_" + } + }, + { + "cell_type": "code", + "source": [ + "classifier = pipeline(\"sentiment-analysis\")" + ], + "metadata": { + "id": "y-Y4a8Dn_6n7" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "classifier(\"This movie is disgustingly good !\")" + ], + "metadata": { + "id": "nRDF7Sd4ArdG" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "classifier(\"This movie is not as good as expected !\")" + ], + "metadata": { + "id": "iNcy1YsjArko" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "O9ZL4YKMD4ra" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "XadsLGxUD4uM" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "_DerR4loD4w1" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "#### 1.3 Specifying a pretrained model for English\n", + "\n", + "You can specify the pretrained model you want to use.\n", + "HuggingFace makes available tons of models for NLP (and other domains).\n", + "You can browse them on this page, here restricted to English model for Text classification tasks: https://huggingface.co/models?language=en&pipeline_tag=text-classification&sort=downloads\n", + "\n", + "▶▶ Exercise: use the same model as before, but using the parameter of the pipeline to specify its name\n", + "\n", + "Hint: look at the doc https://huggingface.co/learn/nlp-course/chapter1/3?fw=pt#using-any-model-from-the-hub-in-a-pipeline" + ], + "metadata": { + "id": "ipX_Nwxi_q9D" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "GTcKK78itzYQ" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### 1.4 ▶▶ Exercise: use a pretrained model for French\n", + "\n", + "Now, take a look at the models page and find a suitable model for the task in French: we want to try an adapted version of **FlauBERT**.\n", + "\n", + "* Find the model in the database, look at the documentation: how has been built this model?\n", + "* load it. You will need to install sacremoses library using ```!pip install sacremoses```\n", + "* Then try it on a few examples." + ], + "metadata": { + "id": "dQo8pS93BJKf" + } + }, + { + "cell_type": "code", + "source": [ + "!pip install sacremoses" + ], + "metadata": { + "id": "i5t_Ik688rIX" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "fQcEX6OCuOsg" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# 1.5 Exploring a dataset\n", + "\n", + "In this part, we will focus on exploring datasets that are part of the HuggingFace hub." 
+ ], + "metadata": { + "id": "RLOpYtKavaio" + } + }, + { + "cell_type": "markdown", + "source": [ + "## 1.5.1 Load a dataset\n", + "\n", + "▶▶ Exercise: Find the dataset corresponding to IMDB and load it.\n", + "\n", + "Doc: https://huggingface.co/datasets and https://huggingface.co/docs/datasets/load_hub\n" + ], + "metadata": { + "id": "QVx1g9QN3CjG" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "njVoUS1vwmtY" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## 1.5.2 Print statistics on the dataset\n", + "\n", + "▶▶ Exercise:\n", + "* Print the number of classes\n", + "* Print the first 2 examples of the dataset (advice: shuffle the dataset..)\n", + "* Print the distribution\n", + "* Count the total number of tokens and unique tokens\n", + "\n", + "Hint: start by simply 'printing' the dataset object, i.e.:\n", + "```\n", + "dataset\n", + "```\n", + "It will show you the structure of this object. " + ], + "metadata": { + "id": "5UAvXlgLvxvh" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "UksRQxWdvBom" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "Ix_za-YVvB1L" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## 1.5.3 Tokenizer\n", + "\n", + "The text in the dataset is not tokenized.\n", + "In fact, transformers models have been trained using a specifc tokenization, and it is crucial to rely on the same tokenization when using a transformer model.\n" + ], + "metadata": { + "id": "bBQ6u5i41ROT" + } + }, + { + "cell_type": "markdown", + "source": [ + "### ▶▶ Exercise: Load the pretrained model for English and test it on the first example" + ], + "metadata": { + "id": "q9LjYU-C3ibO" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "ZexcERzqvJeL" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Notes on tokenizers\n", + "\n", + "Notez que la librairie HuggingFace définit des *Auto Classes*: elles permettent d'inférer directement l'architecture requise selon le type de modèle spécifié en argument.\n", + "* Par exemple ici, le tokenizer est spécifique au modèle DistilBERT, plus précisément il est identique à celui de BERT, et hérite beaucoup de méthodes de la classe *PreTrainedTokenizerFast*.\n", + "* On utilise la classe *class transformers.AutoModelForSequenceClassification* pour un modèle d'étiquetage de séquence.\n", + "\n", + "Le tokenizer est en charge de préparer les données d'entrée, et notamment dans le cas de BERT, de découper les tokens en sous-tokens, mais aussi d'assigner des ids à chaque sous-token, de permettre le mapping dans un sens et dans l'autre...\n", + "\n", + "- Les *Auto Classes*: https://huggingface.co/docs/transformers/model_doc/auto\n", + "- Les Tokenizer dans HuggingFace: https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/tokenizer\n", + "- *Bert tokenizer*: https://huggingface.co/docs/transformers/v4.25.1/en/model_doc/bert#transformers.BertTokenizer\n", + "- Classe *PreTrainedTokenizerFast*: https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/tokenizer#transformers.PreTrainedTokenizerFast" + ], + "metadata": { + "id": "NUus9JUNB3Qq" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "9XwH5If4B3Qq" + }, + "outputs": [], + "source": [ + "from transformers import AutoTokenizer\n", + "\n", + "# 
Defining the tokenizer using Auto Classes\n", + "tokenizer = AutoTokenizer.from_pretrained(pretrained_model)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "### ▶▶ Exercice: Tester le tokenizer\n", + "\n", + "**Utiliser le tokenizer pour :**\n", + "- encoder une phrase (en anglais) :\n", + "- convertir dans l'autre sens : d'une liste d'ids de tokens en texte\n", + " * que se passe-t-il dans le cas de mots longs ?\n", + " * de mots inconnus ?\n", + " * Que répresentent les éléments entre crochets ?\n", + "\n", + "\n", + "Hint: regardez les méthodes 'encode' et 'decode' dans la doc https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/tokenizer (et éventuellement 'convert_ids_to_tokens()')." + ], + "metadata": { + "id": "V8C5djpXB3Qr" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "uO8UkY4yvksn" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "3pLoN8VHvkvb" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Compute the vocabulary using the tokenizer\n", + "\n", + "The function below will tokenize the entire dataset.\n", + "\n", + "▶▶ Exercise: compute the total number of tokens and unique tokens." + ], + "metadata": { + "id": "0GNXQIm9vuNX" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "4b_ICjruwKj5" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "-Kj0bW3_50et" + }, + "outputs": [], + "source": [ + "def tokenize_function(examples):\n", + " return tokenizer(examples[\"text\"])\n", + "\n", + "\n", + "tokenized_datasets = dataset.map(tokenize_function)" + ] + }, + { + "cell_type": "code", + "source": [ + "tokenized_datasets" + ], + "metadata": { + "id": "TKTi2eO8d-JJ" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "Notez que le tokenizer retourne deux éléments:\n", + "\n", + "- input_ids: the numbers representing the tokens in the text.\n", + "- attention_mask: indicates whether a token should be masked or not.\n", + "\n", + "Plus d'info sur les datasets: https://huggingface.co/docs/datasets/use_dataset" + ], + "metadata": { + "id": "ATFZVbiYwD34" + } + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "VhabMuyjwSHe" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# Additional notes about HuggingFace dataset" + ], + "metadata": { + "id": "-bUnXTbbGp5e" + } + }, + { + "cell_type": "markdown", + "source": [ + "#### Available corpora\n", + "\n", + "Note that many corpora are available directly from HuggingFace, for example for text classification tasks:\n", + "https://huggingface.co/models?pipeline_tag=text-classification&sort=downloads\n", + "\n", + "\n", + "In particular you can directly load the full AlloCine corpus:\n", + "https://huggingface.co/datasets/allocine" + ], + "metadata": { + "id": "bsbgcxgTJsW2" + } + }, + { + "cell_type": "markdown", + "source": [ + "#### Some preprocessing\n", + "\n", + "The library allows to perform some preprocessing directly on the Dataset object, very easily.\n", + "Take alook at the doc: https://huggingface.co/course/chapter5/3?fw=pt\n", + "\n", + "For example here we can compute the lenght of each review and filter our dataset to excluse outliers, e.g. reviews with too few words." 
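, + "\n", + "Side note (a sketch, not required here): `Dataset.map` also accepts `batched=True`, which passes batches of examples to the mapped function and is usually much faster, e.g. with the `tokenize_function` defined above:\n", + "```\n", + "# the tokenizer accepts a list of texts, so the same function works batched\n", + "tokenized_datasets = dataset.map(tokenize_function, batched=True)\n", + "```"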
+ ], + "metadata": { + "id": "FLvU5EYUCnVK" + } + }, + { + "cell_type": "code", + "source": [ + "def compute_review_length(example):\n", + " return {\"review_length\": len(example[\"review\"].split())}\n", + "\n", + "dataset = dataset.map(compute_review_length) #Add the column review_lenght\n", + "# Inspect the first training example\n", + "dataset[\"train\"][0]" + ], + "metadata": { + "id": "SgeXPXp6JmZU" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "Some review are very short... Dataset.filter() can be used to remove some examples." + ], + "metadata": { + "id": "6fL34GWd53ij" + } + }, + { + "cell_type": "code", + "source": [ + "dataset[\"train\"].sort(\"review_length\")[:3]" + ], + "metadata": { + "id": "56Lv3xpAJmb5" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "filtered_dataset = dataset.filter(lambda x: x[\"review_length\"] > 10)\n", + "print(filtered_dataset.num_rows)" + ], + "metadata": { + "id": "UuDpP1JyF-6a" + }, + "execution_count": null, + "outputs": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "visual", + "language": "python", + "name": "visual" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.5" + }, + "colab": { + "provenance": [], + "toc_visible": true + }, + "accelerator": "GPU", + "gpuClass": "standard" + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/slides/MasterLiTL_Course1_281123.pdf b/slides/MasterLiTL_2324_Course1_281123.pdf similarity index 100% rename from slides/MasterLiTL_Course1_281123.pdf rename to slides/MasterLiTL_2324_Course1_281123.pdf diff --git a/slides/MasterLiTL_2324_Course4_090124.pdf b/slides/MasterLiTL_2324_Course4_090124.pdf new file mode 100644 index 0000000000000000000000000000000000000000..69763d8d451cc102e36b7c6c74f3e12fbfab4ece Binary files /dev/null and b/slides/MasterLiTL_2324_Course4_090124.pdf differ