Earlier this week, Intel publicly disclosed a new class of processor vulnerabilities known as L1 Terminal Fault (L1TF). L1TF variants affect many single-tenant and multi-tenant environments, including some of Linode's infrastructure and Linodes themselves.
We have begun mitigation efforts and expect our fleet to be fully mitigated within the coming weeks. We believe we can achieve this without any interruption to your running systems and without requiring any coordination on your part. However, the situation is still evolving, and we will learn more as we proceed. Early results from our mitigations have been encouraging.
While this protects our side of things, you should make sure you are running a Linux kernel with the mitigations in place. See our guide on updating your kernel.
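One way to check this from inside a Linode is to read the status line that patched kernels expose through sysfs. The following is a minimal sketch, not an official Linode tool: it assumes the kernel provides `/sys/devices/system/cpu/vulnerabilities/l1tf` (described in the kernel's L1TF admin guide), and the helper names `classify_l1tf` and `read_l1tf_status` are hypothetical, chosen only for illustration. The exact status wording can vary between kernel versions, so the classification is deliberately coarse.

```python
from pathlib import Path

# sysfs file described in the kernel admin guide for L1TF.
# Assumption: kernels that predate the L1TF patches do not have this file.
L1TF_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/l1tf")


def classify_l1tf(status: str) -> str:
    """Map the kernel's status line to a coarse label (hypothetical helper)."""
    text = status.strip()
    if text.startswith("Not affected"):
        return "not affected"
    if text.startswith("Mitigation"):
        return "mitigated"
    if text.startswith("Vulnerable"):
        return "vulnerable"
    # Empty or unrecognized text: the kernel is likely too old to report L1TF.
    return "unknown"


def read_l1tf_status(path: Path = L1TF_STATUS) -> str:
    """Return the raw status line, or an empty string if it cannot be read."""
    try:
        return path.read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    raw = read_l1tf_status()
    detail = raw.strip() or "no L1TF status file found"
    print(f"{classify_l1tf(raw)}: {detail}")
```

If the script reports "unknown", the running kernel most likely predates the L1TF fixes and should be updated.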
Over the coming weeks, as we continue our mitigation efforts, we will keep posting updates here on our blog. Stay tuned!
Comments (8)
Thanks for the hard work in dealing with this. Though I am not sure it’s enough to just update the Kernel on OS side, need microcode updates – I guess from host node OS level too https://www.linode.com/community/questions/17120/how-is-linode-handling-l1tf-what-actions-can-we-take-to-mitigate#answer-66869 ?
Wouldn’t the microcode updates require host node level reboots ?
You are correct! We’re able to transparently move VMs to patched infrastructure using live migrations.
Ah sweet – live migration feature is awesome. One of many reasons I have stuck with Linode for 4+ yrs now 🙂
What are your plans regarding HyperThreading?
One of the things that has me shocked about L1TF is that there does not yet appear to be any publicly-available, complete mitigation to either of the major open-source hypervisors (KVM and Xen) that does not require HyperThreading to be disabled.
L1TF is not fully mitigated if unrelated guests can run as hyper-siblings (or if an untrusted guest–which is all guests for a cloud VM provider–can run as a hyper-sibling of a hypervisor thread). Technically, this could be enforced by a scheduler, but the most unequivocal statement of a scheduler that will do so comes from, of all places, Microsoft, and therefore Azure (https://blogs.technet.microsoft.com/virtualization/2018/08/14/hyper-v-hyperclear/).
Google also indicates that individual cores are never concurrently shared between VMs (https://cloud.google.com/blog/products/gcp/protecting-against-the-new-l1tf-speculative-vulnerabilities). Certainly, they have the wherewithal to pull this off with custom internal kernel changes, so there’s no particular reason to doubt them. (I didn’t find any clear statement from AWS on shared cores, but they already have their custom Nitro hypervisor, so plausibly they have a custom modification.)
Unfortunately, the current docs applicable to KVM don’t provide any good solution for a cloud VM provider other than disabling HyperThreading: https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html
Am I wrong about this?
I too would like to know more about the hyperthreading story. We have multiple internal deployments of openstack and vmware that would suffer if we have to disable HT. Did Linode disable HT?
I am very happy with Linode being able to live migrate things with no downtime to customers. That is a massive improvement over the past migration queues.
Thanks for implementing these security fixes.
The new “live” migrations are certainly interesting – is this a new feature that you’re now able to use? It’s certainly much less painful than the existing migration queues and forced downtime.
Furthermore, will live migration be introduced for other server moves, such as upgrades and downgrades?
Our current plan for L1TF mitigation is to disable HyperThreading.
Yes, live migrations are a feature that we are now able to use. We are evaluating the different use cases for this one, but currently it cannot be used for upgrades/downgrades with plan resizing.
Thanks for keeping us informed and patching the hosts. We appreciate the effort and due diligence. I’m sure these projects at large scale are never fun.