Commit 2faaed4f authored by Millian Poquet

example update

parent 4852894b
@@ -12,7 +12,8 @@ bibliography: biblio.bib
# Context {-}
High Performance Computing (HPC) centers are large-scale computational platforms composed of many nodes and cores.
Many different users execute their applications on them, notably to conduct scientific simulation studies.
-Users do not directly access such platforms but use a middleware called a resource manager to reserve computational resources and to execute their applications on them — [Slurm] is the resource manager used in most HPC centers.
+Users do not directly access such platforms but use a middleware called a resource manager to reserve computational resources and to execute their applications on them.
+Examples of such resource managers include [OAR], [Slurm], [PBS], [Flux]...
Resource managers can make many decisions about the execution of applications (when to execute them, on which resources) but also about the resources themselves (the frequency the processors should run at, when to shut down or boot resources).
Resource managers are therefore a key component when one wants to optimize the whole behavior of an HPC center, as tuning resource management policies can lead to significant gains.
@@ -39,7 +40,7 @@ The prediction of the execution time of applications has been studied in the lit
- Machine learning skills, especially on time series
- Programming skills in Python or R (C++ is a plus)
-- Taste for experimental methods (chocolate is a plus)
+- A taste for experimental methods (a taste for chocolate is a plus)
- Fluent French or English
# Practical details {-}
@@ -57,9 +58,12 @@ You can send us your application (cover letter + resume / short curriculum vitæ
- [[ZCLT22]] Salah Zrigui, Raphael Y. de Camargo, Arnaud Legrand and Denis Trystram. *Improving the Performance of Batch Schedulers Using Online Job Runtime Classification.* Journal of Parallel and Distributed Computing, Elsevier, 2022, 164, pp.83-95.
- [[GGRT15]] Eric Gaussier, David Glesser, Valentin Reis, and Denis Trystram. *Improving Backfilling by using Machine Learning to predict Running Times.* SuperComputing 2015, Nov 2015, Austin, TX, United States.
+[OAR]: https://oar.imag.fr
[Slurm]: https://slurm.schedmd.com/overview.html
+[PBS]: https://www.openpbs.org
+[Flux]: https://flux-framework.org
[Frontier]: https://en.wikipedia.org/wiki/Frontier_(supercomputer)
-[SEPIA]: https://www.irit.fr/en/departement/dep-architecture-systems-and-networks/sepia-team/
+[SEPIA]: https://www.irit.fr/en/departement/dep-architecture-systems-and-networks/sepia-team
[IRIT]: https://www.irit.fr/en
[Parallel Workload Archive]: https://www.cs.huji.ac.il/labs/parallel/workload/logs.html
[[ZCLT22]]: https://hal.archives-ouvertes.fr/hal-03023222