===== Docker =====

Since November 2019, a new way of conducting experiments has been available in CorteXlab.

==== Why ====
+ | |||
+ | The legacy way of running experiments is to run one (or more) commands on each node of the experiment. These commands are run from the minus task. This has some drawbacks: | ||
+ | * The development workflow, to develop and debug an experiment code, is: | ||
+ | * create the task | ||
+ | * submit the task | ||
+ | * wait the task end | ||
+ | * unzip the task's results | ||
+ | * look in the task's results (stdout, stderr) and in the minus log to understand issues, bugs, errors, exceptions | ||
+ | * fix issues | ||
+ | * repeat the process | ||
+ | * ... This workflow is painful, not interactive, | ||
+ | * The experimenter needs to pack in the task everything needed. This includes potentially big datasets, and if these datasets are different for each node, then the task will contain the union of all the datasets, which can be huge. | ||
+ | * The executable code has to be in the task. For simple scripts, it's ok, but for binaries, or as soon as there are some dependencies (libraries, which may also have dependencies of their own), building the task may become pretty difficult or impossible. See [[embedding_oot_modules_or_custom_libraries_binaries_in_minus_scenario]]). In particular, [[https:// | ||
+ | * All the results are gathered with the task directory, compressed, and sent back to airlock. This means that the results may include huge unneeded things, such as experiment code, input datasets, etc. | ||
+ | |||
+ | ==== Proposed solution | ||
- | The legacy | + | The proposed solution is to use [[https:// |
  * [[https://
  * Preparing an image is a much more convenient process than preparing a task when it comes to complex software bundles such as TensorFlow, OpenBTS, etc. One just needs to install the dependencies and build as if on a real machine; there is no need to tweak or hack build steps, it works directly.
  * Image preparation can be an interactive process, or it can be automated with a [[https://
  * The exact same image can be used to test things on an experimenter's machine.
  * When running a task, images are instantiated to [[https://
  * The experiment results are structured differently. For each node, there is one directory per container, containing the stdout/stderr of the container.
  * The users' home directories are NFS mounted.
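
As an illustration of the image-preparation step described above, here is a minimal hypothetical Dockerfile sketch. The base image, the package names and the script name are placeholders chosen for illustration, not taken from the CorteXlab documentation:

<code docker>
# Hypothetical example: base image, packages and script name are placeholders
FROM python:3.8-slim

# Install the experiment's dependencies, exactly as one would on a real machine
RUN pip install --no-cache-dir numpy scipy

# Copy the experiment code into the image
COPY experiment.py /experiment/experiment.py

# Command executed when a container is started from this image
CMD ["python", "/experiment/experiment.py"]
</code>

The image could then be built and tried interactively on the experimenter's own machine with ''docker build -t my-experiment .'' and ''docker run --rm -it my-experiment'' (image name hypothetical), before being used in CorteXlab.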
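
To make the per-node, per-container result structure above concrete, a hypothetical layout (directory and file names are illustrative, not taken from the CorteXlab documentation) could look like:

<code>
task_results/
  node1/
    container1/
      stdout
      stderr
  node2/
    container1/
      stdout
      stderr
</code>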
docker.txt · Last modified: 2023/09/28 17:24 by cmorin