tool monkey

adventures of an unfrozen caveman programmer

Rethinking the Semantics of Group Quotas and Slot Weights for Heterogeneous and Multidimensional Compute Resources

The HTCondor semantic for accounting group quotas and slot weights is currently cpu-centric. This is an artifact of the historic primacy of cpu as the most commonly considered limiting resource in computations. For example, the SlotWeight attribute currently defaults to Cpus, and when slot weights are disabled, matchmaking activates logic that sums the available cpus on slots to avoid ‘undercounting’ total pool quotas.

However, HTCondor slots, the core model of computational resources in an HTCondor pool, manage four resources by default: cpu, memory, disk and swap. Furthermore, slots may now be configured with arbitrary custom resources. As recently mentioned by Matthew Farrellee, there is growing pressure to provide robust support not just for traditional cpu-centric resource presentation, usage and allocation, but also for memory-centric, gpu-centric or ‘*’-centric allocation policies that coexist seamlessly with it, and more generally for allocation policies that are simultaneously aware of all resource dimensions.

This goal immediately raises some questions for the semantics of accounting groups and slot weights when matching jobs against slots during matchmaking.

Consider a pool where 50% of the slots are ‘weighted’ in a traditional cpu-centric way, but the other 50% are intended to be allocated in a memory-centric way. This is currently possible, as the SlotWeight attribute can be configured appropriately to be a function of either Cpus or Memory.

But a scenario where slots are weighted as functions of heterogeneous resource dimensions raises a semantic question: when we sum these weights to obtain the pool-wide available quota, what ‘real world’ quantity does this total represent, if any? Is it a purely heuristic numeric value with no well-defined unit attached?

This question has import. Understanding what the answer is, or should be, impacts what story we tell to users about what their accounting group configuration actually means. When I assign a quota to an accounting group in such a heterogeneous environment, what is that quota regulating? When a job matches a cpu-centric slot, does the cost of that match have a different meaning than when matching against a memory-centric slot? When the slots are partitionable, a match implies a certain multi-dimensional slice of resources allocated from that slot. What is the cost of that slice? Does the sum of costs add up to the original weight on the partitionable slot? If not, how does that affect our understanding of quota semantics?

It may be possible to unify all of these ideas by adopting the perspective that a slot’s weight is a measure of the maximum number of jobs that can be matched against it. The cost of a match is W(S)-W(S’), where W(S) is the weight function evaluated on the slot prior to the match, and W(S’) is the corresponding weight after the match has extracted its requested resources. The pool’s total quota is just the sum of W(S) over all slots S in the pool. Note that this implies that the ‘unit’ attached to both slot weights and accounting group quotas is ‘jobs’.

Consider a simple example from the traditional cpu-centric configuration: a partitionable slot is configured with 8 cpus, and SlotWeight is just its default, Cpus. Under this model, the allocation policy is ‘each match must use >= 1 cpu’, and other resource requests are assumed not to be limiting. The maximum number of matches is 8 jobs, each requesting 1 cpu. However, a job might also request 2 cpus. In this case, the cost of the match is 2, since the remaining slot has 6 cpus, and so W(S’) now evaluates to 6. So, the cost of the match is how many fewer possible jobs the original slot can support after the match takes its requested resources.
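The cpu-centric example above can be sketched in a few lines of Python. This is only an illustration of the proposed semantic, not HTCondor code: the `Slot` class and the functions `cpu_weight` and `match_cost` are hypothetical names, and the weight function simply mirrors the default of SlotWeight being Cpus.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Slot:
    """Hypothetical stand-in for a partitionable slot (cpu dimension only)."""
    cpus: int


def cpu_weight(slot: Slot) -> int:
    # Mirrors SlotWeight = Cpus: under a policy of '>= 1 cpu per match',
    # the slot can support at most `cpus` jobs.
    return slot.cpus


def match_cost(slot: Slot, request_cpus: int) -> int:
    # Cost of a match is W(S) - W(S'), where S' is the slot after the
    # match has carved out its requested resources.
    remaining = replace(slot, cpus=slot.cpus - request_cpus)
    return cpu_weight(slot) - cpu_weight(remaining)


slot = Slot(cpus=8)
print(cpu_weight(slot))     # 8: weight of the unmatched slot, in jobs
print(match_cost(slot, 1))  # 1: a 1-cpu job displaces one possible job
print(match_cost(slot, 2))  # 2: a 2-cpu job displaces two possible jobs
```

The point of writing the cost as a difference of weights, rather than reading it off the request directly, is that the same `match_cost` shape works unchanged for any other weight function.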

This approach can be applied equally well to a memory-centric strategy, or a disk-centric strategy, or a gpu-based strategy, or any combination simultaneously. All weights evaluate to a value with unit ‘jobs’. All match costs are differences between weights (before and after match), and so their values are also in units of ‘jobs’. Therefore, the semantics of the sum of weights over a pool is always well defined: it is a number of jobs, and specifically a measure of the maximum number of jobs that might match against all the slots in the pool. When a match acquires resources that reduce this maximum by more than 1 job, that is not in any way inconsistent. It means the job used resources that might have supported two or more ‘smaller’ jobs. This means that accounting group quotas (and submitter shares) also have a well-defined unit and semantic, which is ‘how many (or what fraction of) the maximum possible jobs is this group guaranteed by my accounting policy’.
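To make the heterogeneous case concrete, here is a minimal sketch of a two-slot pool where one slot is weighted by cpus and the other by memory. Everything in it is an assumption made for illustration: the dictionary representation of a slot, the function names, the slot sizes, and in particular the 2048 MB memory quantum, which stands in for whatever minimum allocation a memory-centric policy would enforce.

```python
MEMORY_QUANTUM_MB = 2048  # assumed minimum memory allocation per match


def cpu_weight(slot: dict) -> int:
    # cpu-centric policy: one job per cpu
    return slot["cpus"]


def memory_weight(slot: dict) -> int:
    # memory-centric policy: one job per memory quantum
    return slot["memory_mb"] // MEMORY_QUANTUM_MB


# Each slot is paired with the weight function its policy uses.
pool = [
    ({"cpus": 8, "memory_mb": 65536}, cpu_weight),
    ({"cpus": 32, "memory_mb": 16384}, memory_weight),
]


def pool_quota(pool) -> int:
    # Sum of W(S) over all slots; every term is a count of jobs.
    return sum(weight(slot) for slot, weight in pool)


def match_cost(slot: dict, weight, request: dict) -> int:
    # W(S) - W(S'): how many fewer jobs the slot supports after the match.
    after = {name: slot[name] - request.get(name, 0) for name in slot}
    return weight(slot) - weight(after)


total = pool_quota(pool)
print(total)         # 16 jobs: 8 from the cpu slot, 8 from the memory slot
print(0.25 * total)  # 4.0: a group with a 25% quota is guaranteed 4 jobs

mem_slot, mem_w = pool[1]
print(match_cost(mem_slot, mem_w, {"cpus": 1, "memory_mb": 4096}))  # 2
```

Because both weight functions return a number of jobs, the sum is meaningful even though the two slots measure different physical resources.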

One implication of this proposed semantic for quotas and weights is that the measure for the maximum number of jobs that may match against any given slot must be some finite number. It implies that all resource dimensions are quantized in some way by the allocation policy. This scheme would not support a real-valued resource dimension that had no minimum quantization. I do not think that this is a very heavyweight requirement, and in fact we have already been moving in that direction with features such as MODIFY_REQUEST_EXPR_xxx.

When a slot’s resource allocation policy is defined over all its resources, what bounds this measure of maximum possible matches? In a case where each job match must use at least one non-zero quantum of each resource dimension, then the limit is the resource with the minimum quantized levels. In a case where jobs may request a zero amount of resources, then the limit is the resource with the maximum quantized levels. (Note that it is required that each match use at least one quantum of at least one resource; otherwise the maximum is not properly bounded.)
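The first case, where every match must take at least one quantum of every resource dimension, can be sketched as follows. The function name `max_matches`, the resource names and the quantum sizes are all hypothetical, chosen only to show the counting; the sketch does not attempt the second case, where zero-sized requests are allowed.

```python
def max_matches(amounts: dict, quanta: dict) -> tuple:
    """Return (limiting resource, maximum matches) for one slot.

    Assumes each match consumes at least one quantum of every dimension,
    so the dimension with the fewest quantized levels bounds the count.
    """
    # Number of whole quanta available in each resource dimension.
    levels = {name: amounts[name] // quanta[name] for name in quanta}
    limiting = min(levels, key=levels.get)
    return limiting, levels[limiting]


# Hypothetical slot and quantization policy.
amounts = {"cpus": 8, "memory_mb": 8192, "disk_gb": 100}
quanta = {"cpus": 1, "memory_mb": 2048, "disk_gb": 10}

print(max_matches(amounts, quanta))  # ('memory_mb', 4)
```

Here the slot has 8 cpu levels and 10 disk levels but only 4 memory levels, so under this policy its weight would be 4 jobs regardless of how many cpus remain.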