Device not shown if occupied by a task #25583

Open
ruspaul013 opened this issue Apr 2, 2025 · 2 comments

@ruspaul013

Nomad version

[root@base-machine ~]# nomad version
Nomad v1.9.6

Operating system and Environment details

Plugin "nomad-usb-device" v0.4.0
OS: AlmaLinux 9.5

Issue

When you inspect a node, the device section of the output looks something like this:

[root@base-machine ~]# nomad node status 1f5a5301
Device Resource Utilization
0627/usb/0001[001-002-001-0627-0001]  <none>

Even when the device is in use, the output does not show it, so it is not possible to tell which devices are currently allocated. I know that allocations can be rejected for lack of resources, but device usage cannot be traced at the node level.

Reproduction steps

  1. Install a device plugin (USB or NVIDIA).
  2. Run a job with a device block.
  3. Check the node status.

Expected Result

I expect to see which task uses the resources.

[root@base-machine ~]# nomad node status 1f5a5301
Device Resource Utilization
0627/usb/0001[001-002-001-0627-0001]  <alloc_id>

Actual Result

It shows either the device stats or <none>, depending on the device used.

[root@base-machine ~]# nomad node status 1f5a5301
Device Resource Utilization
0627/usb/0001[001-002-001-0627-0001]  <none>
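As a possible stopgap, the device-to-allocation mapping could probably be rebuilt from the node's allocation list. The sketch below is illustrative only. It assumes that the JSON returned by `GET /v1/node/<node_id>/allocations` carries `AllocatedResources.Tasks[<task>].Devices` entries with `Vendor`, `Type`, `Name` and `DeviceIDs` fields. The `devices_in_use` helper and the sample allocation IDs are hypothetical, and the payload is hand-written rather than captured from a real cluster.

```python
from collections import defaultdict


def devices_in_use(allocations):
    """Map 'vendor/type/name[device_id]' to a list of (alloc_id, task_name).

    `allocations` is a list of allocation dicts, assumed to have the shape
    of the Nomad node allocations API response (field names are an assumption).
    """
    usage = defaultdict(list)
    for alloc in allocations:
        # Only allocations still running on the client hold their devices.
        if alloc.get("ClientStatus") != "running":
            continue
        resources = alloc.get("AllocatedResources") or {}
        tasks = resources.get("Tasks") or {}
        for task_name, task in tasks.items():
            for dev in task.get("Devices") or []:
                group = f'{dev["Vendor"]}/{dev["Type"]}/{dev["Name"]}'
                for device_id in dev.get("DeviceIDs") or []:
                    key = f"{group}[{device_id}]"
                    usage[key].append((alloc["ID"], task_name))
    return dict(usage)


# Hand-written sample payload (hypothetical IDs, not real cluster output).
sample = [
    {
        "ID": "alloc-1",
        "ClientStatus": "running",
        "AllocatedResources": {
            "Tasks": {
                "redis": {
                    "Devices": [
                        {
                            "Vendor": "0627",
                            "Type": "usb",
                            "Name": "0001",
                            "DeviceIDs": ["001-002-001-0627-0001"],
                        }
                    ]
                }
            }
        },
    },
    {
        # A finished allocation: skipped because it is no longer running.
        "ID": "alloc-2",
        "ClientStatus": "complete",
        "AllocatedResources": {"Tasks": {"redis": {"Devices": None}}},
    },
]

for device, users in devices_in_use(sample).items():
    print(device, users)
```

The keys deliberately mirror the `vendor/type/name[device_id]` form that `nomad node status` prints, so the result can be read next to that output.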

Job file (if appropriate)

job "redis" {
  datacenters = ["dc1"]
  type        = "service"

  group "redis" {
    network {
      port "redis" { to = 6379 }
    }

    task "redis" {
      driver = "podman"

      config {
        image = "docker://redis"
        ports = ["redis"]
      }

      resources {
        memory = 1024
        cores  = 2
        device "0627/usb/0001" {}
      }
    }
  }
}
@Juanadelacuesta
Member

Hi @ruspaul013, I'm looking at the issue you are reporting. Could you confirm that you installed the Nomad USB Device Plugin build that matches your distribution under the plugin_dir on your Nomad clients, and that you configured it as recommended?

@ruspaul013
Author

@Juanadelacuesta Yes. This is my Nomad config:

name       = "test"
data_dir   = "/var/lib/nomad"
plugin_dir = "/var/lib/nomad/plugins/"
log_level  = "info"

bind_addr = "0.0.0.0" # the default

client {
  enabled  = true
  cni_path = "/opt/cni/bin"
  servers  = ["10.150.92.131:4647"]
}

plugin "nomad-driver-podman" {
    config {
        disable_log_collection = false
        socket_path = "unix://run/podman/podman.sock"
        client_http_timeout = "30s"
        extra_labels = ["job_name", "job_id", "task_group_name", "task_name", "namespace", "node_name", "node_id"]

        gc {
            container = true
        }

        volumes {
            enabled      = true
            selinuxlabel = "z"
        }
    }
}

plugin "nomad-driver-usb" {
    config {
        enabled = true

        included_vendor_ids = [0x1d6b,0x0627]
        excluded_vendor_ids = []

        included_product_ids = [0x0001,0x0001]
        excluded_product_ids = []

        fingerprint_period = "1m"
    }
}

server {
  enabled          = true
  bootstrap_expect = 1
}

plugin_dir:

[root@base-machine bp_agent]# ls /var/lib/nomad/plugins/
nomad-driver-nvidia  nomad-driver-podman  nomad-driver-usb

I may have labeled it wrong. I don't think it is a bug; it is probably more of a feature request.
