Roadmap

This is an unordered list of open items (TBDs) that should be addressed to make the project more flexible and to support more configuration options.

Must Haves

Required features that are not yet fully working:

  • Flamenco Manager
    • Figure out how to support HTTPS and WSS (WebSocket Secure, i.e. WebSockets over TLS) with the NGINX Ingress Controller.
    • Deregister workers (tear down the workers deployment) whenever the cluster is powered down.

Active Development

  • Query the Flamenco Manager API for the number of jobs and use KEDA to scale up Workers when jobs are enqueued.
  • Benchmarking using the Blender Open Data benchmarking scripts locally/offline
  • Argo Workflow for building and testing Flamenco from source
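
As a rough illustration of the job-based scaling item above, the sketch below turns a list of job statuses into a target Worker replica count. It is only a sketch under assumptions: the function name `desired_workers`, its parameters, and the status strings `"queued"` and `"active"` are hypothetical placeholders, not the confirmed Flamenco Manager API, and how the resulting number reaches KEDA is left open.

```python
import math


def desired_workers(job_statuses, jobs_per_worker=1, min_workers=0, max_workers=10):
    """Return a target Worker replica count for the given job statuses.

    Assumption: statuses are plain strings, and "queued"/"active" mark jobs
    that still need render capacity. Adjust to whatever the Manager reports.
    """
    if jobs_per_worker < 1:
        raise ValueError("jobs_per_worker must be at least 1")

    # Count only the jobs that still need Workers.
    pending = sum(1 for status in job_statuses if status in ("queued", "active"))

    # Round up so a partially filled Worker still gets scheduled.
    wanted = math.ceil(pending / jobs_per_worker)

    # Clamp to the configured bounds so the cluster is never over-committed.
    return max(min_workers, min(max_workers, wanted))
```

With `min_workers=0` the deployment can scale to zero when nothing is queued, which is the behaviour usually wanted for an idle render farm.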

Future Enhancements

The playbooks are currently very focused on Ubuntu 22.04 and Intel GPUs. The items below widen the scope to NVIDIA GPUs and potentially to non-Ubuntu operating systems and platforms on which to build a cluster. They also include scaling and performance considerations that are not strictly required but would make the project more flexible.

Ansible Playbook Additions

  • UPnP support to detect Flamenco Workers on Home Network
  • UPnP support within cluster to allow for dynamic Worker registration
  • cleanly remove node from cluster (deprovision)
  • Let's Encrypt for the Docker SSL cert
  • Let's Encrypt for the gateway device NGINX
  • multiple Blender version support (Flamenco assumes one version)
  • Grafana status page to show node status
  • k8s NVIDIA DCGM GPU support
  • document strategies for k8s upgrades
  • break up gateway playbook into smaller plays
  • investigate why building a Docker container via the playbook takes much longer and uses a huge amount of disk space
  • create some Blender benchmark test renders
  • GPU metrics from Intel i915 into Prometheus.
  • support different types of Flamenco/Blender deployments (1 per node, 1 per CPU, 1 per GPU)
  • if Manager restarts, re-register all workers (or restart them all)
  • co-locate Alembic (baked) files nearer to or on the render node (rsync jobs to local PV?)
  • Time of Day rendering… to use off-peak electricity for example
  • Blue/green deployments and dev vs. prod workspaces
  • Alerting (k8s and Flamenco) using Alertmanager and some third party (e.g. Google Chat, Signal, PagerDuty)
  • investigate why TLS verification failed on metrics-server, which required overriding defaultArgs to skip it
  • MinIO “Prometheus URL is unreachable” on tenant metrics
  • Rook mgr console HTTP->HTTPS, and the problem with the mgr only working with HTTPS/SSL
  • pod security context for the USB sensor should be more restrictive
  • MinIO HTTPS
  • MinIO tenant creds and bucket creation via Ansible (currently done manually)
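
The time-of-day rendering item in the list above could be gated by a small window check like the one sketched here. This is a minimal sketch, not part of the project: the function name `is_off_peak` and the default 22:00 to 06:00 window are hypothetical, and real tariff windows would need to come from configuration.

```python
from datetime import time


def is_off_peak(now, start=time(22, 0), end=time(6, 0)):
    """Return True if `now` falls inside the off-peak window [start, end).

    Assumption: the window is defined by local wall-clock times and may
    wrap past midnight (as the hypothetical 22:00-06:00 default does).
    """
    if start <= end:
        # Window lies within a single day, e.g. 12:00-14:00.
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00-06:00.
    return now >= start or now < end
```

A scheduler could call this before scaling Workers up and hold queued jobs until the window opens.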