
CPU Vulnerabilities: Meltdown & Spectre


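This week, Google's Project Zero team and others disclosed a number of serious security vulnerabilities affecting many CPU architectures (CVE-2017-5753, CVE-2017-5715, and CVE-2017-5754). Our team is working with vendors and our own engineers to determine the impact on our platform, but we expect that a fleet-wide reboot will be necessary to protect against these issues.

As we work through our response plan, please understand that the nature and severity of these issues may require a rapid response. As always, we will provide as much advance notice as we can. If a reboot of your Linodes is necessary, we will communicate with you directly and provide scheduling information.

We will continue to keep you updated here as more information becomes available.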

Information about these vulnerabilities can be found at the following sites:


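Update: January 4, 2018

We are continuing to investigate this issue and want to provide a brief update on our progress:

  • We are postponing all unrelated maintenance so we can focus our attention and resources on mitigating this issue.
  • As the Scaleway team discussed earlier today, because the information provided by hardware manufacturers is incomplete, we have joined forces with other potentially affected cloud hosting providers, including Scaleway, Packet, and OVH. We have created a dedicated communication channel to share information and work together on the Meltdown & Spectre vulnerabilities.
  • We are continuing our internal evaluation and testing of mitigations.
  • We are prepared for in-depth discussions with our hardware vendors tomorrow.

We will continue to provide updates here as appropriate.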


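Update: January 5, 2018

We are continuing to make progress and want to share the latest with you:

  • The latest stable and long-term Linux kernels were released today with the KPTI / Meltdown patches included. Accordingly, we have made the 4.14.12 kernel available to you and set it as the latest version. If you are leveraging a Linode kernel, upon your next reboot your Linode will be upgraded to this version. This does not fully mitigate your exposure to the Meltdown and Spectre vulnerabilities, but provides us a good foundation to work with while planning for full remediation.
  • We have held planning sessions with our hardware vendors and have been working on an implementation plan for kernel, hypervisor, and firmware updates. All of these are needed to get us to a remediated state, but not all of them are available yet.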
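As a rough illustration of the kernel point above, the following Python sketch checks whether a kernel release string (the output of `uname -r`) looks like a Linode-supplied kernel older than 4.14.12. The `-linode` suffix test and the function names are assumptions made for this example, based on release strings such as `4.14.12-x86_64-linode92`; this is not an official Linode tool.

```python
import re

# 4.14.12 is the KPTI-patched kernel named in the update above.
MIN_PATCHED = (4, 14, 12)


def parse_release(release):
    """Return ((major, minor, patch), is_linode_kernel) for a `uname -r` string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = tuple(int(part or 0) for part in match.groups())
    # Assumption: Linode-supplied kernels carry "-linode" in the release string.
    return version, "-linode" in release


def needs_reboot_for_kpti(release):
    """True if a Linode-supplied kernel is older than the patched 4.14.12.

    Returns None for other kernels (distribution-supplied or custom), which
    this sketch cannot judge because vendors backport fixes to older versions.
    """
    version, is_linode = parse_release(release)
    if not is_linode:
        return None
    return version < MIN_PATCHED
```

On a running system the function could be called as `needs_reboot_for_kpti(platform.release())`. The `None` result is deliberate: a plain version comparison says nothing reliable about a distribution kernel.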

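While we wait on external dependencies, we don't anticipate much movement over the weekend, but we will be sure to provide updates here if there is any. If not, we will share more updates here next Monday.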


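Update: January 8, 2018

Our internal testing continues to make progress, but we are still waiting on microcode updates from our hardware vendors. Both the microcode updates and the kernel updates are required to properly mitigate all three variants of Meltdown and Spectre.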
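To see how a system reports on the three variants, a minimal Python sketch along these lines may help. It assumes the running kernel exposes per-vulnerability status files under `/sys/devices/system/cpu/vulnerabilities`, a sysfs interface found in newer Linux kernels and not mentioned in this post. The function names are hypothetical, and a missing file is reported as "unknown" rather than as safe.

```python
from pathlib import Path

# Assumption: the kernel provides this sysfs directory; older kernels do not.
SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")

# Kernel file names for the three variants, mapped to their CVE identifiers.
VARIANTS = {
    "spectre_v1": "CVE-2017-5753",
    "spectre_v2": "CVE-2017-5715",
    "meltdown": "CVE-2017-5754",
}


def mitigation_report(sysfs_dir=SYSFS_DIR):
    """Map each CVE to the kernel's status string, or "unknown" if unreadable."""
    report = {}
    for name, cve in VARIANTS.items():
        status_file = Path(sysfs_dir) / name
        try:
            report[cve] = status_file.read_text().strip()
        except OSError:
            report[cve] = "unknown"
    return report


def fully_mitigated(report):
    """True only if every variant is reported as mitigated or not affected."""
    return all(
        status.startswith(("Mitigation", "Not affected"))
        for status in report.values()
    )
```

Calling `mitigation_report()` with no arguments reads the live system. The directory parameter exists so the logic can be exercised against a temporary directory. The report reflects only what the kernel believes about itself inside the guest; it says nothing about the state of the underlying host.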


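Update: January 9, 2018

We spent much of today preparing a plan to deploy Meltdown mitigations across Linode's fleet. Over the next day, we will apply the fix to a subset of the fleet, monitor the impact, and then continue rolling it out to the rest. The Meltdown mitigation requires rebooting our physical hardware, which in turn reboots the Linodes hosted on it. A portion of the Linodes in the Tokyo 1, Frankfurt, and Singapore data centers will be rebooted as part of this initial group. If you are affected, you will receive a support ticket and an email with scheduling information.

This week's reboots address Meltdown only. We are testing and planning in parallel to address Spectre. Additional reboots will be required over the coming weeks to properly mitigate all Spectre variants.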


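Update: January 10, 2018

So far, the rollout of Meltdown mitigations to a subset of our fleet has gone well. We are continuing with this plan and will reboot the rest of the fleet over the coming days. Affected customers will receive a support ticket and an email with the reboot window for their Linodes, with at least 24 hours' notice.

  • Due to the ongoing nature of this issue, we have created a status page.
  • We will soon publish a document that discusses Meltdown and Spectre in more depth, explaining what they mean for you and what you can do to prepare your Linodes. We will share the link in an upcoming blog update.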

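Update: January 11, 2018

The Meltdown mitigation process continues to move forward, and we are making progress across the fleet every day. A new guide is available with more information about these vulnerabilities and how you can protect your Linode: What You Need to Do to Mitigate Meltdown and Spectre.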


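Update: January 12, 2018

We are continuing to move forward with the Meltdown mitigation process and have scheduled reboots over the weekend. Our schedule runs through January 18th. Unless there is other actionable news, we will pause daily blog updates until it is complete.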


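Update: February 8, 2018

As a reminder, all of our KVM hosts now have proper mitigations in place for Meltdown. We are continuing to work on proper mitigations for the Spectre vulnerabilities and will share an updated plan on the blog once we have one.

For more information on these vulnerabilities, the status of our fleet, and how to protect your Linode, please refer to the Meltdown & Spectre guide.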

Comments (77)

  1. Author Photo

    Wishlist: alternate-CPU-architecture Linode hosts 😉

    -Eugene

  2. Christopher Aker

    Sadly, portions of this affect both AMD and Intel, and likely others, fwiw.

  3. Author Photo
    Never Gonna Give You Up

    My wishlist would be that all VPS providers could get the same early notifications and patches that a certain trendy yet slower provider received two weeks ago.

  4. Author Photo

    Yup, ARM’s response is https://developer.arm.com/support/security-update

    fwiw, Scaleway is patching & mass rebooting hypervisors 1/4/2018

  5. Author Photo

    Sadly, it appears that even ARM CPUs are affected (to some degree) by this design flaw – so even if other CPU architectures were available, a reboot would still likely be necessary.

  6. Author Photo

    I could be wrong, but I think Eugene is referring to an alternative architecture like RISC-V…?

    https://www.codasip.com/2016/09/22/what-is-risc-vwhy-do-we-care-and-why-you-should-too/

  7. Author Photo

    Thanks for the update. One more question, the patch says 5-30% performance hit depending on workload. Do we need to add more VMs to deal with the load?

  8. Author Photo

    @Scott, @Eugene – it seems that “all CPUs” may very well be most modern CPUs; RedHat advisory claims that POWER and even SystemZ (as used in IBM mainframes) may be impacted by Spectre.

    https://access.redhat.com/security/vulnerabilities/speculativeexecution

    Basically if your core does speculative execution for performance gain then it may be vulnerable. The BOOM RISC-V core ( https://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-157.html ) can do out of order execution and so _may_ be vulnerable. It would require a deeper look at other implementations of the spec to see if they are vulnerable or not.

  9. Author Photo

    Will this require patching the OS in our Linodes?

    • Nathan Melehan

      We’re still planning our full mitigation strategy, but Linodes will need an updated kernel, and we’re working on providing one from the Linode Manager. If you currently use a distribution-supplied or custom-compiled kernel, you will need to take separate actions to update it.

  10. Author Photo
  11. Author Photo
    Never Gonna Give You Up

    It appears that patching the guest is required to mitigate meltdown on Xen VM’s. If KVM, it should just be the hypervisor that needs the patches. Am I reading this correctly?

    https://www.theregister.co.uk/2018/01/04/intel_amd_arm_cpu_vulnerability/

  12. Author Photo
    Never Gonna Give You Up

    Correction, all VM’s have to be patched because of Meltdown, not just those on Xen.

  13. Author Photo

    Thanks Linode for your quick response and adapting 🙂

    Related tweet by WikiLeaks at https://twitter.com/wikileaks/status/948723793324838914

    Official website: https://meltdownattack.com

    Article: https://www.theregister.co.uk/2018/01/04/intel_amd_arm_cpu_vulnerability/

    Severe design flaws that allow stealing of sensitive data from memory have been discovered in Intel chipsets, affecting Xen, KVM, and more.

  14. Author Photo

    Linode will fix this, I trust in them. So no worries here.

  15. Author Photo

    Linode, can you please provide the details of patching hypervisors?
    I guess mitigation of memory reads between different VMs is often even more important than within a single VM.

  16. Author Photo

    I am relatively new to Linode, and have a low volume site. When critical issues like this are discovered and eventually fixed, it would be awesome to get an email with directions on what to do, if anything. Thanks !!

  17. Author Photo

    Could you clarify how “[upgrading your VM kernel] provides us a good foundation to work with while planning for full remediation”? I mean, less attack surface is great and all, but, how does it factor into your planning? I ass-u-me you’re going to be rebooting my host at some point anyway.

  18. Author Photo

    It’s great to see that you are on top of it. As long as we are informed ahead of time, we are fine with the reboot and security patches. As you know our customers don’t like downtimes. Also, please try to minimize fleet-wide reboot, which makes services completely unavailable.

  19. Author Photo

    Thanks Linode for the updates.

    I guess being on a shared (virtualized) server, all linodes on that (physical) server have to apply the new kernel, for the protection to be really effective.

    That’s a good start anyway.

  20. Author Photo

    Thanks for updated Kernel. Looks like it doesn’t support Redhat/CentOS KPTI tunables to be able to control KPTI and related patch operations https://community.centminmod.com/posts/57936/. Would be nice to have though – details https://access.redhat.com/articles/3311301

    cat /sys/kernel/debug/x86/pti_enabled
    cat: /sys/kernel/debug/x86/pti_enabled: No such file or directory

    cat /sys/kernel/debug/x86/ibpb_enabled
    cat: /sys/kernel/debug/x86/ibpb_enabled: No such file or directory

    cat /sys/kernel/debug/x86/ibrs_enabled
    cat: /sys/kernel/debug/x86/ibrs_enabled: No such file or directory

  21. Author Photo

    were any Spectre fixes added to the kernel ? PoC at https://github.com/crozone/SpectrePoC successfully runs = not fixed on updated linode with 4.14.12-x86_64-linode92 on centos 7.4 64bit

    but on dedicated elsewhere with centos 7.4 64bit and 3.10.0-693.11.6.el7.x86_64 the PoC fails to read = fixed

  22. Author Photo

    @George: One of the Spectre vulns requires you to either recompile EVERYTHING with mitigations or apply a microcode patch. The other Spectre vuln isn’t fixable without newly architected hardware (that doesn’t exist yet).

    Meltdown is the one you apply the KPTI patch for (Intel only).

    So Linode will probably have to issue a second round of reboots when their motherboard/OEM’s get around to issuing a CPU microcode patch. (Or they recompile everything)

  23. Author Photo

    Re: previous comment, ah, the upstream Linux kernel hasn’t tackled Spectre yet according to http://kroah.com/log/blog/2018/01/06/meltdown-status/ but some distro backported kernels have, i.e. Redhat/CentOS

  24. Author Photo

    you say “If you are leveraging a Linode kernel, upon your next reboot your Linode will be upgraded to this version.”.

    Is the easiest way to tell that by doing a `uname -a` and seeing if the string contains `-linode` e.g. “4.9.50-x86_64-linode86”

  25. Author Photo

    @Patrick,

    If `uname -a` is showing “somethingsomething-linode”, your kernel is coming from Linode. The particular version is assigned each time the instance boots, so if it is out of date, just reboot your Linode.

    If you aren’t sure about the kernel source or want to change it, you can log into the Linode Manager and click the “edit” link for your Linode’s Configuration Profile. The Kernel option under Boot is where this particular setting is stored.

  26. Author Photo

    Some of the OS patches are requiring microcode changes. But I think the linux ones work without and differently once the microcode has changed.

    The microcode will come in the form of a firmware update from the vendor.

  27. Author Photo

    Is there a reason why the Intel microcode update cannot be used directly rather than waiting for vendors to repackage it?

    Ie, https://downloadcenter.intel.com/download/27337/Linux-Processor-Microcode-Data-File

  28. Author Photo

    Apparently the earlier mentioned link is not the latest version, not sure of the direct Intel link but it’s apparently included in eg https://launchpad.net/ubuntu/+source/intel-microcode/3.20171215.1

    The question remains the same regardless.

  29. Author Photo
  30. Author Photo

    Hi,

    Any idea how long the reboots will take once they are scheduled?

    Thanks!

    Neil

    • Author Photo

      We’ve allocated a two hour window for maintenance, however in many cases the actual downtime will be less. That being said, we would still recommend preparing for a full two hours of downtime.

  31. Author Photo

    The link given in “Jan 11 update” gives misleading information.

    “Spectre targets the way modern CPUs work, regardless of speculative execution” is incorrect.

    Both Spectre and Meltdown take advantage of “speculative execution”.
    While Meltdown exploits a race condition based on the code after an exception is triggered, Spectre relies on the code speculatively run after a ‘if’ branch (uncached) condition that “usually” goes through happens to be false (since the next code accesses an out of bounds array).

  32. Author Photo

    I used to use linode vps, very good network speed.

  33. Author Photo

    > Our schedule runs through January 18th.

    Are you kidding me.. Why so long to patch Meltdown?

    Rather let your customers stay vulnerable to the exploit than lose a few customers because of insufficient server capacity?

    • Author Photo

      Hey Krian!

      Due to the scope of this vulnerability, we are rolling out the patch in waves to balance downtime for customers as well as ensure the patches work effectively across the entire fleet. With the hasty release of the kernel patches, we are making sure the patches don’t cause more issues for our customers than they fix.

  34. Author Photo

    So, my service is spread across 5 linodes. Bringing down my VMs, one at a time, at unknown intervals spread over the next several days, is going to cause me to have possibly *five* outages in the worst case (if my VMs are all on different physical hosts, which I have no way of knowing), rather than one.

  35. Author Photo

    @Neil Ticktin

    It took one minute to fix.

    According to Uptime Robot, the monitor (my linode VM, Linode 2048) is back UP (Keyword Exists) (It was down for 0 minutes and 49 seconds).

  36. Author Photo

    I am relatively new to Linode, and have a low volume site. Today I got the email with this subject: “Linode Support Ticket 9678973 – Critical Maintenance for CPU Vulnerabilities (Meltdown)”. It would be great if you provided an email explaining exactly what to do, if anything.

    Thanks !!

    • Author Photo

      Hey Maneesh, first of all, welcome to Linode! For these Critical Maintenance for CPU Vulnerabilities (Meltdown) tickets, there is no action required on your end. That being said, we do recommend making sure your Linode is set to the latest kernel, which you can read more about how to do here. We would also recommend taking a look at the Reboot Survival Guide to ensure these reboots and migrations have as little impact on your Linode as possible.

  37. Author Photo

    I’m still seeing “Maintenance is not yet scheduled” on my dashboard.

    Warning would be good it sounds like there is a schedule.

    • Author Photo

      Hi Adrian! We don’t have a full schedule of exactly which host will experience the mitigation at what time just yet. However, once we do set the schedule for a host your Linode is on, you will be alerted with a ticket and via the dashboard.

  38. Author Photo

    How long reboots will take, once it is scheduled? Thanks in advance!

    • Author Photo

      The maintenance window for hosts is 2 hours, however we expect the reboots to not take the full two hours. Beyond that, I’m afraid I can’t really give a more accurate assessment of how long the reboot will take. Hope this helps!

  39. Author Photo

    Oh. It is serious issue that we need to mitigate.

  40. Author Photo

    The whole operation, from shutdown to server back up and running took 11 minutes.

  41. Author Photo

    My reboot took around 45 minutes. I have moved my servers to DigitalOcean for now. I do not know what DO is doing, but as of the moment they do not promise downtime.

  42. Author Photo

    @Romel

    That is hilarious. DO hasn’t done their reboots yet. So you leave a hosting provider for a serious, unavoidable reboot to another that has to do the same thing!

    Needed that laugh this morning.

  43. Author Photo

    You’ve scheduled 2/3 of my cluster hosts for the same window, and given me no way to reschedule that. Support has not responded to my message about preventing downtime on my cluster by either rescheduling or migrating one of my nodes to another host. That means in 11 hours my cluster will be broken.

    What happened linode? You used to be good about handling these outages, but last summer you started to suck. Please improve, I’d hate to end a 4 year relationship over this.

    • Author Photo

      Hey Zach, I’m terribly sorry we haven’t gotten back to your ticket yet. Can you let us know the ticket number so I can take a look and see what we can do for you? Please also feel free to call our support line for more immediate assistance as well. Thank you for being patient with us during this process!

  44. Author Photo

    reaaaally wish you’d had a pool of servers that we could migrate onto, on our own schedule instead of this middle-of-the-night stuff. You’ve done it before.

    • Author Photo

      I apologize, due to the scope and severity of these vulnerabilities, we are unable to be as flexible as we would like with the mitigation process. If you open a ticket or call the support help line, however, we will be happy to see what we can do for you.

  45. Author Photo

    @Zach: Ask support for server migration for one of the cluster nodes to another hypervisor before reboot…

  46. Author Photo

    Hmm.

    The migration of my first Linode, once scheduled, could be started when it fit my schedule. Why not the next? (Both nodes are located in London)

    /Henning

  47. Author Photo

    I have a question about notifications. In the past, when downtime was scheduled for one of our systems, we’d receive an e-mail giving us plenty of advance notice. With the reboots for Meltdown/Spectre, however, we were left to polling our Linode Manager page to see what, if anything is scheduled.

    From reading an article at Ars Technica, it appears that Linode learned of these vulnerabilities like the rest of us — with no advance notice. I cannot imagine the scrambling that must have caused!

    As you would have appreciated advance notice, so too would we.

    We’ve been hosted on Linode coming up on 4 years and currently have 9 servers here. One of the reboots took down 3 of our nodes at once. As one of these handled DNS, the lack of sufficient notice prevented us from redirecting to our redundant system (TLS propagation issues). Fortunately, that reboot happened quickly and our loss of service resulted in limited down time for our site (less than 15 minutes).

    As further phases of remediation are planned, we expect that each of our systems will see at least one more reboot.

    How can we get email notification of upcoming server reboots?

    • Author Photo

      I completely understand where you’re coming from. For this round of reboots we aimed for at least 24 hours notice via ticket. For future reboots we’ll be able to provide notice further in advance, which will allow you more time to plan for any maintenance.

      You’ll be notified of any ticket updates via the contact email address on file. If you’re not receiving notices via email let us know in a ticket and we’ll be happy to take a look.

  48. Author Photo

    Thank you for the prompt response!

    Best wishes to you as you try and deal with rolling out fixes to thousands(?) of systems and deal with anxious customers.

    We have continued to receive e-mail messages informing of system reboots… *after* they occurred. The last e-mail message *predicting* a reboot was sent on Jan 11, 2018. We had 7 systems reboot after that, and only learned of those ahead of time by constant scanning of our Linode Manager page.

    This is both error-prone and time-consuming.

    Our Linode Manager Notifications tab ( https://manager.linode.com/profile/notifications ) currently shows:
    Linode Events Email
    Events Email Notification
    Notifications are currently: Enabled

    So we *should* be receiving e-mail notifications in *advance* from now on? Or is there something else that must be done?

    • Author Photo

      Yes, you should receive an email notification before any reboots from here on out. Email notifications should be much more timely moving forward. Even if there is a delay, the increased notice we’ll be providing for future maintenance will allow ample time for email notifications to reach you before any reboots take place.

  49. Author Photo

    That is a tremendous relief — Thank You!

    That is one thing Linode has made a name for itself — being open and forthright in dealings with its clients. Thanks for holding your standards high!

  50. Author Photo

    This is unbearable. Our linodes go down with no notice. Went down twice in the last 12 hours for about an hour each time. The scheduled downtimes have been VEEERRRYYY slowish, also up to 1 hour which is unheard of in professional hosting… Lots of our services affected, our users outraged… What are u going to do about it folks??

    • Author Photo

      Hello George, we are terribly sorry for the inconvenience. Due to the scope and serious nature of the vulnerabilities, we had to perform the maintenance in an expeditious manner. You should have had tickets open informing you of the downtime with at least 24 hours’ notice. The maintenance window to address the vulnerability was 2 hours; however, the maintenance itself was often shorter, finishing at about the one-hour mark. Do you currently have a ticket open so we can take a look at your account and confirm exactly what happened? I thank you for being patient through this process.

  51. Author Photo

    Hello, my maintenance status shows phase 1 complete and future maintenance is pending. I want to know when the migration plan will come to an end.

    Now when I try to ping my Linode IP it always shows request timeout, thank you!

    • Author Photo

      This round of maintenance has completed, and your Linode should have returned to its previously booted state. I’m sorry to see the pings are timing out when you attempt to connect to the Linode. Do you currently have a ticket open with us so we can take a deeper look at this?

  52. Author Photo

    I have opened a ticket with you, please help me to solve the problem, thanks!

  53. Author Photo

    Hello, my maintenance status shows phase 1 complete and future maintenance is pending. I want to know when the migration plan will come to an end.

    Now when I try to ping my Linode IP it always shows request timeout,
    my ticket number 9869831
    thank you!

    • Author Photo

      We don’t currently have an ETA on the next maintenance window but tickets will be sent out once that is determined. We are currently waiting for patches to come from our hardware vendors.

  54. Author Photo

    We have a medium size fleet of Linodes and are seeing random loss of connectivity (requiring a reboot from Linode control panel to fix) across this fleet. This is causing major problems; last one to go was a primary database server which took out most of the fleet, and the nature of the networking problems is not playing nice with our cluster failover (eg the failures are inconsistent so the cluster might think the master it elected is fine, but outside the cluster its not visible – so the whole application is broken).

    Linode support is aware of networking issues and advised us to reboot onto latest kernel versions which, unfortunately, do not seem to have resolved these issues.

  55. Author Photo

    Hi
    I’ve set passwordauthentication NO

    But now I can’t log in with my public key any more.
    The public key remained the same and nothing changed.

    It show:
    PuTTY Fatal Error
    Disconnected: No supported authentication methods available (server sent: publickey,gssapi-keyex,gssapi-with-mic)

    Should I wait until the maintenance done so I can log in?

  56. Author Photo
  57. Author Photo

    @paul, what would be a medium sized fleet?

  58. Author Photo

    Linode, can you please provide the details of patching hypervisors?
    I guess mitigation of memory reads between different VMs is often even more important than within a single VM.

  59. Author Photo

    None of your servers need BIOS/firmware updates for any of the recent CPU vulnerabilities? Would you consider providing at least some hardware with all AMD & Intel “management” features disabled? That seems like it would be a 100% unique offering for a cloud host.

  60. Author Photo

    Also, I would like to know if you use Intel ME for any management done in the datacenters – or you use other tools (which I think are more suited to managing datacenters).

    • Author Photo

      Hi Wayne – In addition to switching to our latest patched kernel (5.1.5), we are addressing these vulnerabilities at the host level during scheduled maintenance windows. This guide has additional detailed information on these vulnerabilities as well as their mitigation.

      As far as providing “hardware with all AMD & Intel ‘management’ features disabled,” I have added your suggestion to our internal tracker.

      Regarding your last question about using Intel ME or other tools, we aren’t able to discuss specific information like this. Though if you have any other questions, let us know and we’ll be happy to provide as much information as we’re able to.
