Commit 8a118958 authored by Millian Poquet

initial commit

LICENSE 0 → 100644
The MIT License (MIT)
=====================
Copyright © `2022` `Millian Poquet`
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the “Software”), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
---
title: Frugal prediction of the load of HPC centers
author:
firstname: Millian
lastname: Poquet
title: Maître de conférences
mail: millian.poquet@irit.fr
date: September 28, 2022
bibliography: biblio.bib
---
# Context {-}
High Performance Computing (HPC) centers are large-scale computational platforms composed of many nodes and cores.
Many different users execute their applications on them, notably to conduct scientific simulation studies.
Users do not access such platforms directly. Instead, they use middleware called a resource manager to reserve computational resources and to execute their applications on them; [Slurm] is the resource manager used in most HPC centers.
Resource managers can make many decisions about the execution of applications (when to execute them, on which resources) but also about the resources themselves (the frequency the processors should run at, when to shut down or boot resources).
Resource managers are therefore a key component when one wants to optimize the overall behavior of an HPC center, as tuning resource management policies can lead to significant gains.
The power consumption of HPC centers is substantial (*e.g.*, [Frontier] consumes more than 20 MW) and we are interested in reducing this energy footprint.
In particular, node shutdown policies are rarely implemented in HPC centers, as they can be detrimental to performance if they are not well adapted to the present and future load of the center.
# Objective of the internship {-}
The main objective of this internship is to develop a system that predicts the load of an HPC center.
Here, the load can be defined as the amount of computation (in core×hour) in the applications that are being executed and in the applications that are in the queue (*i.e.*, that are waiting to be executed).
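
The definition of the load given here can be made concrete with a minimal Python sketch. The `Job` record and its field names are hypothetical, and the sketch assumes one possible reading of the definition: a running job contributes only its remaining work, while a queued job contributes all of its (estimated) work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical job record; the field names are illustrative only."""
    cores: int                 # number of cores requested or allocated
    runtime_s: float           # (estimated) total execution time, in seconds
    start_s: Optional[float]   # start time in seconds, None if still queued


def load_core_hours(jobs, now_s):
    """Return (running_load, queued_load) in core×hour at time now_s.

    Assumption: a running job counts for its remaining work only,
    a queued job counts for its whole (estimated) work.
    """
    running = 0.0
    queued = 0.0
    for job in jobs:
        if job.start_s is None:
            queued += job.cores * job.runtime_s / 3600.0
        else:
            remaining_s = max(0.0, job.start_s + job.runtime_s - now_s)
            running += job.cores * remaining_s / 3600.0
    return running, queued


jobs = [
    Job(cores=4, runtime_s=7200, start_s=0),     # running, 1 h left at t = 3600 s
    Job(cores=8, runtime_s=1800, start_s=None),  # queued, 0.5 h of work per core
]
running, queued = load_core_hours(jobs, now_s=3600)
print(running, queued)
```

Other readings of the definition are possible, for instance counting the total work of running jobs rather than their remaining work.
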
The quality of the developed system will be evaluated against three criteria:
**1** accuracy of the predictions on various time horizons (10/30/60 minutes),
**2** time and energy cost of the system, both during the learning phase and when requesting predictions at runtime, and
**3** understandability of the method, of the trained model, and of the results.
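
As an illustration of the first criterion, the sketch below computes the mean absolute error of a naive persistence forecast (the future load is assumed to equal the current load) at the three horizons. The load series is synthetic, the sampling step of 10 minutes is an assumption, and the persistence forecast is only a possible baseline, not the method to be developed.

```python
def persistence_mae(series, step_min, horizon_min):
    """Mean absolute error of the forecast 'load(t + horizon) = load(t)'.

    series: load values sampled every step_min minutes (assumed regular).
    """
    k = horizon_min // step_min  # horizon expressed in number of samples
    errors = [abs(series[t + k] - series[t]) for t in range(len(series) - k)]
    return sum(errors) / len(errors)


# Synthetic load in core×hour, one sample every 10 minutes (illustrative data).
load = [10, 12, 15, 15, 20, 18, 18, 25]
maes = {h: persistence_mae(load, step_min=10, horizon_min=h) for h in (10, 30, 60)}
print(maes)
```

A learned model would be expected to do at least as well as such a baseline for its extra time and energy cost to be justified.
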
We are interested in the tradeoffs enabled by various machine learning methods on this problem.
In particular, methods that can associate an uncertainty value with each prediction interest us the most, as they should enable us to develop more robust node shutdown algorithms in the long run. Ideally, the system developed during the internship will be able to estimate the probability that the load lies within a given interval (*e.g.*, between 10 and 50 core×hour) at a given time (*e.g.*, in 10 minutes).
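
To show what such an estimate could look like, here is a minimal sketch that assumes the model outputs a Gaussian predictive distribution (a mean and a standard deviation) for the load at the target time. The Gaussian assumption and the numerical values are illustrative only; other methods (quantile regression, ensembles, etc.) would yield the interval probability differently.

```python
from statistics import NormalDist


def interval_probability(mean, std, low, high):
    """P(low <= load <= high) under an assumed Gaussian prediction."""
    dist = NormalDist(mu=mean, sigma=std)
    return dist.cdf(high) - dist.cdf(low)


# Hypothetical prediction for "in 10 minutes": mean 30 and std 10 core×hour.
p = interval_probability(mean=30.0, std=10.0, low=10.0, high=50.0)
print(f"P(10 <= load <= 50) = {p:.3f}")
```
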
Traces from the [Parallel Workloads Archive][Parallel Workload Archive] will be used as data sources for this work.
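
The archive distributes its logs in the Standard Workload Format (SWF): one job per line with whitespace-separated fields, header lines starting with `;`, and `-1` for unknown values. The sketch below reads the fields needed to reconstruct the load (submit time, wait time, run time, and number of allocated processors, which are fields 2 to 5). The sample lines are synthetic, and dropping jobs with unknown values is a simplifying assumption.

```python
import io


def parse_swf(stream):
    """Parse SWF lines into dicts, skipping comments and incomplete jobs."""
    jobs = []
    for line in stream:
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # header comment or blank line
        fields = line.split()
        submit, wait, run, procs = (float(x) for x in fields[1:5])
        if wait < 0 or run < 0 or procs < 0:
            continue  # -1 marks an unknown value; such jobs are dropped here
        jobs.append({
            "submit_s": submit,
            "start_s": submit + wait,
            "run_s": run,
            "cores": int(procs),
        })
    return jobs


# Two synthetic jobs (18 fields each); the second has unknown wait/run times.
sample = io.StringIO(
    "; Synthetic example, not taken from a real trace\n"
    "1 0 10 3600 4 -1 -1 4 7200 -1 1 1 1 -1 1 -1 -1 -1\n"
    "2 60 -1 -1 8 -1 -1 8 3600 -1 0 2 1 -1 1 -1 -1 -1\n"
)
jobs = parse_swf(sample)
print(jobs)
```
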
The prediction of the execution time of applications has been studied in the literature [[ZCLT22]] [[GGRT15]] and could be used to design the load prediction system, as the prediction of execution times can be seen as a subproblem.
\pagebreak
# Expected skills and profile {-}
- Machine learning skills, especially on time series
- Programming skills in Python or R (C++ is a plus)
- Taste for experimental methods (chocolate is a plus)
- Fluent French or English
# Practical details {-}
The internship will take place at [IRIT], the largest computer science research institute in Toulouse, France.
Our team [SEPIA] works on resource management in various distributed systems (cloud datacenters, HPC centers, edge architectures, IoT, etc.) and is especially interested in the ecological transition, notably in reducing energy consumption and CO2 emissions and in using renewable energy.
The internship will be supervised by Millian Poquet and Georges Da Costa in a convivial atmosphere `:)`.
A computer and an office will be provided, as well as a monthly internship stipend of 591 €.
The internship lasts 5 to 6 months.
You can send us your application (cover letter + resume / short curriculum vitæ) by email to **[millian.poquet@irit.fr](mailto:millian.poquet@irit.fr)** or **[georges.da-costa@irit.fr](mailto:georges.da-costa@irit.fr)**.
# Bibliography {-}
- [[ZCLT22]] Salah Zrigui, Raphael Y. de Camargo, Arnaud Legrand and Denis Trystram. *Improving the Performance of Batch Schedulers Using Online Job Runtime Classification.* Journal of Parallel and Distributed Computing, Elsevier, 2022, 164, pp.83-95.
- [[GGRT15]] Eric Gaussier, David Glesser, Valentin Reis, and Denis Trystram. *Improving Backfilling by using Machine Learning to predict Running Times.* SuperComputing 2015, Nov 2015, Austin, TX, United States.
[Slurm]: https://slurm.schedmd.com/overview.html
[Frontier]: https://en.wikipedia.org/wiki/Frontier_(supercomputer)
[SEPIA]: https://www.irit.fr/en/departement/dep-architecture-systems-and-networks/sepia-team/
[IRIT]: https://www.irit.fr/en
[Parallel Workload Archive]: https://www.cs.huji.ac.il/labs/parallel/workload/logs.html
[[ZCLT22]]: https://hal.archives-ouvertes.fr/hal-03023222
[[GGRT15]]: https://hal.archives-ouvertes.fr/hal-01221186
File added
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Generator: Adobe Illustrator 16.0.4, SVG Export Plug-In . SVG Version: 6.00 Build 0) -->
<svg
version="1.1"
id="Calque_1"
x="0px"
y="0px"
width="503.39401"
height="503.396"
viewBox="0 0 503.39401 503.396"
enable-background="new 0 0 600 600"
xml:space="preserve"
sodipodi:docname="irit-logo.svg"
inkscape:version="1.1.2 (0a00cf5339, 2022-02-04)"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns:i="&amp;ns_ai;"><defs
id="defs33" /><sodipodi:namedview
id="namedview31"
pagecolor="#ffffff"
bordercolor="#999999"
borderopacity="1"
inkscape:pageshadow="0"
inkscape:pageopacity="0"
inkscape:pagecheckerboard="0"
showgrid="false"
inkscape:zoom="1.0264833"
inkscape:cx="177.79149"
inkscape:cy="142.72029"
inkscape:window-width="1916"
inkscape:window-height="1032"
inkscape:window-x="1920"
inkscape:window-y="22"
inkscape:window-maximized="1"
inkscape:current-layer="Calque_1" />
<switch
id="switch28"
transform="translate(-47.992,-48)">
<foreignObject
requiredExtensions="http://ns.adobe.com/AdobeIllustrator/10.0/"
x="0"
y="0"
width="1"
height="1">
</foreignObject>
<g
i:extraneous="self"
id="g26">
<g
id="g24">
<path
fill="#f84b0f"
d="M 47.992,299.696 C 47.992,160.686 160.683,48 299.69,48 c 139.019,0 251.696,112.686 251.696,251.696 0,139.016 -112.678,251.7 -251.696,251.7 -139.007,0 -251.698,-112.684 -251.698,-251.7 z"
id="path2" />
<path
fill="#ffffff"
d="m 207.659,220.17 c 0,15.042 -12.198,27.221 -27.246,27.221 -15.024,0 -27.198,-12.179 -27.198,-27.221 0,-15.03 12.174,-27.221 27.198,-27.221 15.048,0 27.246,12.191 27.246,27.221 z"
id="path4" />
<rect
x="335.76001"
y="261.608"
fill="#ffffff"
width="29.603001"
height="142.067"
id="rect6" />
<rect
x="143.37601"
y="261.608"
fill="#ffffff"
width="29.597"
height="142.067"
id="rect8" />
<rect
x="113.777"
y="261.608"
fill="#ffffff"
width="59.195999"
height="29.594999"
id="rect10" />
<g
id="g16">
<rect
x="190.73199"
y="261.608"
fill="#ffffff"
width="29.597"
height="142.067"
id="rect12" />
<rect
x="190.73199"
y="261.608"
fill="#ffffff"
width="53.277"
height="29.594999"
id="rect14" />
</g>
<rect
x="418.63501"
y="267.52499"
fill="#ffffff"
width="29.596001"
height="136.14999"
id="rect18" />
<path
fill="#ffffff"
d="m 319.783,308.688 c 0,-27.345 -10.786,-47.081 -55.668,-47.081 h -3.736 l -0.109,29.595 h 2.496 c 17.903,0 25.223,4.345 25.223,20.904 0.018,16.178 -7.676,20.534 -24.829,20.534 h -25.071 v 25.268 l 20.879,-0.008 25.888,47.375 33.458,-0.013 -28.452,-51.084 c 23.691,-7.51 29.921,-25.858 29.921,-45.49 z"
id="path20" />
<rect
x="380.16"
y="261.608"
fill="#ffffff"
width="106.553"
height="29.594999"
id="rect22" />
</g>
</g>
</switch>
</svg>
\documentclass[a4paper, 11pt]{article}
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[hidelinks]{hyperref}
\usepackage{wasysym}
\usepackage{marvosym}
\usepackage[french,english]{babel}
\usepackage{lmodern}
\usepackage[overlay, absolute]{textpos}
\setlength{\TPHorizModule}{10mm}
\setlength{\TPVertModule}{\TPHorizModule}
\textblockorigin{0mm}{0mm} % start everything near the top-left corner
\usepackage{calc}
\usepackage{graphicx}
\usepackage{xcolor}
\definecolor{blue-irit}{RGB}{0,86,112}
\definecolor{orange-irit}{RGB}{255,76,0}
\newcommand\crule[3][black]{\textcolor{#1}{\rule{#2}{#3}}}
\usepackage{sectsty}
\sectionfont{\color{blue-irit}}
\usepackage{fancyhdr}
\pagestyle{fancy}
\setlength{\headwidth}{.85\paperwidth}
\fancyhf{}
\renewcommand{\headrule}{}
\renewcommand{\footrule}{}
\fancyhead{}
\fancyfoot[L]{\hskip -\hoffset \hskip -38mm \hskip -\oddsidemargin \crule[orange-irit]{.75\paperwidth}{1pt} %
\raisebox{-.5ex+.5pt}{\textcolor{orange-irit}{\href{mailto:$author.mail$}{\texttt{\textbf{$author.mail$}}}}}}
\usepackage{geometry}
\geometry{
a4paper,
left=20mm,
right=20mm,
top=20mm,
bottom=25mm
}
\hypersetup{
colorlinks=true,
linkcolor=orange-irit,
filecolor=orange-irit,
urlcolor=orange-irit,
pdftitle={$title$},
pdfauthor={$author.firstname$ $author.lastname$}
}
%\let\oldhref\href
%\renewcommand{\href}[2]{\oldhref{#1}{\bfseries#2}}
\usepackage{titlesec}
\titlespacing{\section}{0pt}{0pt}{0pt}
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\newcommand{\biglogowithadress}{
\begin{textblock*}{0mm}(20mm,10mm)
\noindent%
\includegraphics[width=30mm]{irit-logo.pdf}
\end{textblock*}
\begin{textblock*}{150mm}(60mm,10mm)
\noindent\textsf{\raggedright\textcolor{blue-irit}{\textbf{Institut de Recherche en Informatique de Toulouse}}}
\end{textblock*}
\begin{textblock*}{150mm}(60mm,17mm)
\noindent%
$author.firstname$ \textsc{$author.lastname$}\\
$if(author.title)$
$author.title$\\
$endif$
Université Paul Sabatier, IRIT\\
118 Route de Narbonne, 31062 Toulouse Cedex 9, France\\
\href{mailto:$author.mail$}{\texttt{\textbf{$author.mail$}}}
\end{textblock*}
}
\newcommand{\firstpagehead}{
\biglogowithadress
\mbox{}
\vskip 1.5cm%
}
\makeatletter
\setlength{\parskip}{10pt}
\date{\today{}}
\author{$author.firstname$ $author.lastname$}
\begin{document}
\firstpagehead
\noindent\makebox[\linewidth]{\rule{\paperwidth}{1pt}}
\begin{center}
\Large\textbf{$title$}
\end{center}
$body$
\nocite{*}
\end{document}