A number of serious security vulnerabilities affecting many CPU architectures (CVE-2017-5753, CVE-2017-5715, CVE-2017-5754) were disclosed this week by Google's Project Zero team and others. Our team is working with our vendors and our own engineers to determine the impact to our platform, but we anticipate that fleet-wide reboots will be necessary to protect against these issues.
While we work through our response plan, please understand that the nature and severity of these issues may require us to respond quickly. As always, we will provide as much advance notice as we can. If your Linode requires a reboot, we will contact you directly with scheduling information.
We will continue to post updates here as more information becomes available.
Information about these vulnerabilities can be found at the following sites:
Update: January 4, 2018
We are continuing to investigate this issue and wanted to provide a brief update on where we stand:
- We are postponing all unrelated maintenance so we can focus our efforts and resources on mitigating these issues.
- As discussed by the Scaleway team this morning, due to incomplete information from hardware manufacturers we have teamed up with other potentially affected cloud hosting providers: Scaleway, Packet, and OVH. We have created a dedicated communication channel to share information and collaborate on addressing the Meltdown and Spectre vulnerabilities.
- We are continuing our internal assessment and testing of mitigations.
- We have set up a discussion with our hardware vendors tomorrow for a deeper dive.
We will provide updates here as appropriate.
Update: January 5, 2018
We continue to make progress and wanted to share the latest with you:
- The latest stable and long-term Linux kernels were released today with the KPTI (Meltdown) patches in place. Accordingly, kernel 4.14.12 is now available and has been made our latest kernel. If you are leveraging a Linode kernel, your Linode will be upgraded to this version upon its next reboot. While this does not fully mitigate the Meltdown and Spectre vulnerabilities, it provides a good foundation to work with while we plan for full remediation.
- We held a planning session with our hardware vendor and are putting together an implementation plan for kernel, hypervisor, and firmware updates. All of these are needed to reach a fully fixed state, but not all of them are available yet.
We don't expect much movement on this over the weekend while we wait on external dependencies, but if there is any, we will certainly provide updates here. Otherwise, more updates will be shared next Monday.
Update: January 8, 2018
We continue to make progress with internal testing, but we are still waiting on microcode updates from our hardware vendors. Both the microcode updates and the kernel updates are needed to ensure we have proper mitigations for all three variants of Meltdown and Spectre.
Update: January 9, 2018
We spent today preparing our plan to deploy the Meltdown mitigation across Linode's fleet. Over the coming days we will implement the fix on a subset of the fleet, monitor the impact, and then continue the rollout to the remainder. The Meltdown mitigation requires rebooting the physical hardware, which in turn reboots the Linodes hosted on that hardware. A subset of Linodes in our Tokyo 1, Frankfurt, and Singapore data centers will be rebooted as part of this initial group. Those affected will receive a support ticket and an email with scheduling information.
The reboots this week address Meltdown only. Testing and planning to address Spectre are happening in parallel. Additional reboots will be needed over the coming weeks to properly mitigate all of the Spectre variants.
Update: January 10, 2018
The rollout of the Meltdown mitigation to a subset of our fleet has gone well so far. We will continue with this plan and perform reboots across the rest of the fleet over the next few days. Affected customers will receive a support ticket and an email with the reboot windows for their Linodes, with at least 24 hours' notice.
- Due to the ongoing nature of this issue, we have created a dedicated status page.
- We will soon be publishing a guide that explains Meltdown and Spectre in more depth, what they mean for you, and what you can do to prepare on your Linode. We will share a link to it in an upcoming blog post.
Update: January 11, 2018
The mitigation process for Meltdown continues, and we are making progress across the fleet every day. We have a new guide with more information about these vulnerabilities and how to protect your Linode: What You Need to Do to Mitigate Meltdown and Spectre.
Update: January 12, 2018
We are continuing the Meltdown mitigation process, with reboots scheduled over the weekend. Our schedule runs through January 18th. Unless other actionable news becomes available, we will pause the daily blog updates until this work is complete.
Update: February 8, 2018
As a reminder, all of our KVM hosts are now properly mitigated against Meltdown. We are continuing to work toward proper mitigations for the Spectre vulnerabilities, and we will provide an updated plan on the blog once they become available.
For more information about these vulnerabilities, the status of our fleet, and how to protect your Linode, please refer to our Meltdown and Spectre guide.
Comments (77)
Wishlist: alternate-CPU-architecture Linode hosts 😉
-Eugene
Sadly, portions of this affect both AMD and Intel, and likely others, fwiw.
My wishlist would be that all VPS providers could get the same early notifications and patches that a certain trendy yet slower provider received two weeks ago.
Yup, ARM’s response is https://developer.arm.com/support/security-update
fwiw, Scaleway is patching & mass rebooting hypervisors 1/4/2018
Sadly, it appears that even ARM CPUs are affected (to some degree) by this design flaw – so even if other CPU architectures were available, a reboot would still likely be necessary.
I could be wrong, but I think Eugene is referring to an alternative architecture like RISC-V…?
https://www.codasip.com/2016/09/22/what-is-risc-vwhy-do-we-care-and-why-you-should-too/
Thanks for the update. One more question, the patch says 5-30% performance hit depending on workload. Do we need to add more VMs to deal with the load?
@Scott, @Eugene – it seems that “all CPUs” may very well be most modern CPUs; RedHat advisory claims that POWER and even SystemZ (as used in IBM mainframes) may be impacted by Spectre.
https://access.redhat.com/security/vulnerabilities/speculativeexecution
Basically if your core does speculative execution for performance gain then it may be vulnerable. The BOOM RISC-V core ( https://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-157.html ) can do out of order execution and so _may_ be vulnerable. It would require a deeper look at other implementations of the spec to see if they are vulnerable or not.
Will this require patching the OS in our Linodes?
We’re still planning our full mitigation strategy, but Linodes will need an updated kernel, and we’re working on providing one from the Linode Manager. If you currently use a distribution-supplied or custom-compiled kernel, you will need to take separate actions to update it.
@Ruben Yes.
It appears that patching the guest is required to mitigate Meltdown on Xen VMs. If KVM, it should just be the hypervisor that needs the patches. Am I reading this correctly?
https://www.theregister.co.uk/2018/01/04/intel_amd_arm_cpu_vulnerability/
Correction: all VMs have to be patched because of Meltdown, not just those on Xen.
Thanks Linode for your quick response and adapting 🙂
Related tweet by WikiLeaks at https://twitter.com/wikileaks/status/948723793324838914
Official website: https://meltdownattack.com
Article: https://www.theregister.co.uk/2018/01/04/intel_amd_arm_cpu_vulnerability/
A severe design flaw that allows stealing of sensitive data from memory has been discovered in Intel chipsets, affecting Xen, KVM, and more.
Linode will fix this, I trust in them. So no worries here.
Linode, can you please provide the details of patching hypervisors?
I guess mitigation of memory reads between different VMs is often even more important than within a single VM.
I am relatively new to Linode, and have a low volume site. When critical issues like this are discovered and eventually fixed, it would be awesome to get an email with directions on what to do, if anything. Thanks !!
Could you clarify how “[upgrading your VM kernel] provides us a good foundation to work with while planning for full remediation”? I mean, less attack surface is great and all, but, how does it factor into your planning? I ass-u-me you’re going to be rebooting my host at some point anyway.
It’s great to see that you are on top of it. As long as we are informed ahead of time, we are fine with the reboot and security patches. As you know our customers don’t like downtimes. Also, please try to minimize fleet-wide reboot, which makes services completely unavailable.
Thanks Linode for the updates.
I guess being on a shared (virtualized) server, all linodes on that (physical) server have to apply the new kernel, for the protection to be really effective.
That’s a good start anyway.
Thanks for the updated kernel. Looks like it doesn’t support the Redhat/CentOS KPTI tunables that allow controlling KPTI and related patch operations https://community.centminmod.com/posts/57936/. Would be nice to have though – details at https://access.redhat.com/articles/3311301
cat /sys/kernel/debug/x86/pti_enabled
cat: /sys/kernel/debug/x86/pti_enabled: No such file or directory
cat /sys/kernel/debug/x86/ibpb_enabled
cat: /sys/kernel/debug/x86/ibpb_enabled: No such file or directory
cat /sys/kernel/debug/x86/ibrs_enabled
cat: /sys/kernel/debug/x86/ibrs_enabled: No such file or directory
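Those debugfs toggles are Red Hat-specific patches; mainline kernels instead report mitigation status read-only under sysfs. A sketch, assuming a kernel new enough (4.15+ mainline, or a distro backport) to expose that directory:

```shell
# List each known CPU vulnerability and the kernel's reported mitigation
# status. On kernels without this sysfs interface, the directory is absent.
for f in /sys/devices/system/cpu/vulnerabilities/*; do
  printf '%s: %s\n' "${f##*/}" "$(cat "$f")"
done
```

Typical entries include `meltdown: Mitigation: PTI` or `spectre_v2: Vulnerable` depending on the kernel and microcode in place.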
Were any Spectre fixes added to the kernel? The PoC at https://github.com/crozone/SpectrePoC successfully runs (= not fixed) on an updated Linode with 4.14.12-x86_64-linode92 on CentOS 7.4 64-bit,
but on a dedicated server elsewhere with CentOS 7.4 64-bit and 3.10.0-693.11.6.el7.x86_64 the PoC fails to read (= fixed).
@George: One of the Spectre vulns requires that you either recompile EVERYTHING with mitigations or apply a microcode patch. The other Spectre vuln isn’t fixable without a new hardware architecture (which doesn’t exist yet).
Meltdown is the one you apply the KPTI patch for (Intel only).
So Linode will probably have to issue a second round of reboots when their motherboard OEMs get around to issuing a CPU microcode patch. (Or they recompile everything.)
Re: previous comment, ah, the upstream Linux kernel hasn’t tackled Spectre yet according to http://kroah.com/log/blog/2018/01/06/meltdown-status/ but some distro backported kernels have, i.e. Redhat/CentOS.
You say “If you are leveraging a Linode kernel, upon your next reboot your Linode will be upgraded to this version.”
Is the easiest way to tell by doing a `uname -a` and seeing if the string contains `-linode`, e.g. “4.9.50-x86_64-linode86”?
@Patrick,
If `uname -a` is showing “somethingsomething-linode”, your kernel is coming from Linode. The particular version is assigned each time the instance boots, so if it is out of date, just reboot your Linode.
If you aren’t sure about the kernel source or want to change it, you can log into the Linode Manager and click the “edit” link for your Linode’s Configuration Profile. The Kernel option under Boot is where this particular setting is stored.
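The check described above can be put into a short script; the version string in the comment is only an example, and yours will differ:

```shell
# Report whether the running kernel is Linode-supplied or a
# distribution/custom kernel, based on the "-linode" marker in its version.
kernel="$(uname -r)"                      # e.g. 4.9.50-x86_64-linode86
case "$kernel" in
  *linode*) echo "Linode-supplied kernel: $kernel" ;;
  *)        echo "Distribution or custom kernel: $kernel" ;;
esac
```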
Some of the OS patches require microcode changes. But I think the Linux patches work without them, and behave differently once the microcode has been updated.
The microcode will come in the form of a firmware update from the vendor.
Is there a reason why the Intel microcode update cannot be used directly rather than waiting for vendors to repackage it?
Ie, https://downloadcenter.intel.com/download/27337/Linux-Processor-Microcode-Data-File
Apparently the earlier mentioned link is not the latest version, not sure of the direct Intel link but it’s apparently included in eg https://launchpad.net/ubuntu/+source/intel-microcode/3.20171215.1
The question remains the same regardless.
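For reference, you can see which microcode revision the CPU is currently running from inside a Linux guest. A sketch; note that on a virtualized host like Linode, microcode is applied by the host, so the value a guest sees reflects what the hypervisor exposes rather than anything the guest can update itself:

```shell
# Print the microcode revision reported for the first CPU.
# /proc/cpuinfo repeats this field once per logical core; -m1 stops early.
grep -m1 '^microcode' /proc/cpuinfo
```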
Excellent!
Hi,
Any idea how long the reboots will take once they are scheduled?
Thanks!
Neil
We’ve allocated a two hour window for maintenance, however in many cases the actual downtime will be less. That being said, we would still recommend preparing for a full two hours of downtime.
The link given in “Jan 11 update” gives misleading information.
“Spectre targets the way modern CPUs work, regardless of speculative execution” is incorrect.
Both Spectre and Meltdown take advantage of “speculative execution”.
While Meltdown exploits a race condition in the code that runs after an exception is triggered, Spectre relies on code speculatively executed after an ‘if’ branch whose (uncached) condition “usually” holds but this time happens to be false, causing a speculative out-of-bounds array access.
I used to use linode vps, very good network speed.
> Our schedule runs through January 18th.
Are you kidding me.. Why so long to patch Meltdown?
Rather let your customers stay vulnerable to the exploit than lose a few customers because of insufficient server capacity?
Hey Krian!
Due to the scope of this vulnerability, we are rolling out the patch in waves to balance downtime for customers as well as ensure the patches work effectively across the entire fleet. With the hasty release of the kernel patches, we are making sure the patches don’t cause more issues for our customers than they fix.
So, my service is spread across 5 Linodes. Bringing down my VMs, one at a time, at unknown intervals spread over the next several days, is going to cause me possibly *five* outages in the worst case (if my VMs are all on different physical hosts, which I have no way of knowing), rather than one.
@Neil Ticktin
It took one minute to fix.
According to Uptime Robot, the monitor (my linode VM, Linode 2048) is back UP (Keyword Exists) (It was down for 0 minutes and 49 seconds).
I am relatively new to Linode, and have a low-volume site. Today I got the email with the subject “Linode Support Ticket 9678973 – Critical Maintenance for CPU Vulnerabilities (Meltdown)”. It would be great if you could provide an email explaining exactly what to do, if anything.
Thanks !!
Hey Maneesh, first of all, welcome to Linode! For these Critical Maintenance for CPU Vulnerabilities (Meltdown) tickets, there is no action required on your end. That being said, we do recommend making sure your Linode is set to the latest kernel, which you can read more about how to do here. We would also recommend taking a look at the Reboot Survival Guide to ensure these reboots and migrations have as little impact on your Linode as possible.
I’m still seeing “Maintenance is not yet scheduled” on my dashboard.
A warning would be good; it sounds like there is a schedule.
Hi Adrian! We don’t have a full schedule of exactly which host will undergo the mitigation at what time just yet; however, once we do set the schedule for the host your Linode is on, you will be alerted with a ticket and via the dashboard.
How long will the reboots take, once they are scheduled? Thanks in advance!
The maintenance window for hosts is 2 hours, however we expect the reboots to not take the full two hours. Beyond that, I’m afraid I can’t really give a more accurate assessment of how long the reboot will take. Hope this helps!
Oh. It is serious issue that we need to mitigate.
The whole operation, from shutdown to server back up and running took 11 minutes.
My reboot took around 45 minutes.. I’ve moved my servers to DigitalOcean for now. I do not know what DO is doing, but as of the moment they do not promise downtime..
@Romel
That is hilarious. DO hasn’t done their reboots yet. So you leave a hosting provider for a serious, unavoidable reboot to another that has to do the same thing!
Needed that laugh this morning.
You’ve scheduled 2/3 of my cluster hosts for the same window, and given me no way to reschedule. Support has not responded to my message about preventing downtime on my cluster by either rescheduling or migrating one of my nodes to another host. That means in 11 hours my cluster will be broken.
What happened linode? You used to be good about handling these outages, but last summer you started to suck. Please improve, I’d hate to end a 4 year relationship over this.
Hey Zach, I’m terribly sorry we haven’t gotten back to your ticket yet. Can you let us know the ticket number so I can take a look and see what we can do for you? Please also feel free to call our support line for more immediate assistance as well. Thank you for being patient with us during this process!
reaaaally wish you’d had a pool of servers that we could migrate onto, on our own schedule instead of this middle-of-the-night stuff. You’ve done it before.
I apologize, due to the scope and severity of these vulnerabilities, we are unable to be as flexible as we would like with the mitigation process. If you open a ticket or call the support help line, however, we will be happy to see what we can do for you.
@Zach: Ask support for server migration for one of the cluster nodes to another hypervisor before reboot…
Hmm.
The migration of my first Linode, once scheduled, could be started whenever it fit my schedule. Why not the next? (Both nodes are located in London.)
/Henning
I have a question about notifications. In the past, when downtime was scheduled for one of our systems, we’d receive an e-mail giving us plenty of advance notice. With the reboots for Meltdown/Spectre, however, we were left to polling our Linode Manager page to see what, if anything is scheduled.
From reading an article at Ars Technica, it appears that Linode learned of these vulnerabilities like the rest of us — with no advance notice. I cannot imagine the scrambling that must have caused!
As you would have appreciated advance notice, so too would we.
We’ve been hosted on Linode coming up on 4 years and currently have 9 servers here. One of the reboots took down 3 of our nodes at once. As one of these handled DNS, the lack of sufficient notice prevented us from redirecting to our redundant system (TTL propagation issues). Fortunately, that reboot happened quickly and our loss of service resulted in limited down time for our site (less than 15 minutes).
As further phases of remediation are planned, we expect that each of our systems will see at least one more reboot.
How can we get email notification of upcoming server reboots?
I completely understand where you’re coming from. For this round of reboots we aimed for at least 24 hours notice via ticket. For future reboots we’ll be able to provide notice further in advance, which will allow you more time to plan for any maintenance.
You’ll be notified of any ticket updates via the contact email address on file. If you’re not receiving notices via email let us know in a ticket and we’ll be happy to take a look.
Thank you for the prompt response!
Best wishes to you as you try and deal with rolling out fixes to thousands(?) of systems and deal with anxious customers.
We have continued to receive e-mail messages informing of system reboots… *after* they occurred. The last e-mail message *predicting* a reboot was sent on Jan 11, 2018. We had 7 systems reboot after that, and only learned of those ahead of time by constant scanning of our Linode Manager page.
This is both error-prone and time-consuming.
Our Linode Manager Notifications tab ( https://manager.linode.com/profile/notifications ) currently shows:
Linode Events Email
Events Email Notification
Notifications are currently: Enabled
So we *should* be receiving e-mail notifications in *advance* from now on? Or is there something else that must be done?
Yes, you should receive an email notification before any reboots from here on out. Email notifications should be much more timely moving forward. Even if there is a delay, the increased notice we’ll be providing for future maintenance will allow ample time for email notifications to reach you before any reboots take place.
That is a tremendous relief — Thank You!
That is one thing Linode has made a name for itself — being open and forthright in dealings with its clients. Thanks for holding your standards high!
This is unbearable. Our linodes go down with no notice. Went down twice in the last 12 hours for about an hour each time. The scheduled downtimes have been VEEERRRYYY slowish, also up to 1 hour which is unheard of in professional hosting… Lots of our services affected, our users outraged… What are u going to do about it folks??
Hello George, we are terribly sorry for the inconvenience. Due to the scope and serious nature of the vulnerabilities, we had to perform the maintenance in an expeditious manner. You should have had tickets informing you of the downtime with at least 24 hours’ notice. The maintenance window to address the vulnerability was 2 hours; however, the maintenance itself was often shorter, around the hour mark. Do you currently have a ticket open so we can take a look at your account and confirm exactly what happened? I thank you for being patient through this process.
Hello, my maintenance status shows phase 1 complete and future maintenance pending. I want to know when the migration plan will come to an end.
Now when I try to ping my Linode IP it always shows request timeout. Thank you!
This round of maintenance has completed, and your Linode should have returned to its previously booted state. I’m sorry to see the pings are timing out when you attempt to connect to the Linode. Do you currently have a ticket open with us so we can take a deeper look at this?
I have opened a ticket with you. Please help me solve the problem, thanks!
Can you please provide the ticket number? Thank you.
Hello, my maintenance status shows phase 1 complete and future maintenance pending. I want to know when the migration plan will come to an end.
When I try to ping my Linode IP it always shows request timeout.
My ticket number is 9869831.
Thank you!
We don’t currently have an ETA on the next maintenance window but tickets will be sent out once that is determined. We are currently waiting for patches to come from our hardware vendors.
We have a medium-sized fleet of Linodes and are seeing random loss of connectivity (requiring a reboot from the Linode control panel to fix) across this fleet. This is causing major problems; the last one to go was a primary database server, which took out most of the fleet, and the nature of the networking problems is not playing nice with our cluster failover (e.g. the failures are inconsistent, so the cluster might think the master it elected is fine, but outside the cluster it’s not visible), so the whole application is broken.
Linode support is aware of networking issues and advised us to reboot onto latest kernel versions which, unfortunately, do not seem to have resolved these issues.
Hi
I’ve set PasswordAuthentication no,
but now I can’t log in with my public key any more.
The public key remains the same and nothing has changed.
It shows:
PuTTY Fatal Error
Disconnected: No supported authentication methods available (server sent: publickey,gssapi-keyex,gssapi-with-mic)
Should I wait until the maintenance is done so I can log in?
We’d like to look into this with you. Could you open a ticket for us and let us know the ticket number so we can locate it on our end?
You may be able to use Lish to log into your Linode if it’s up and running, even if SSH isn’t working:
https://www.linode.com/docs/guides/using-the-lish-console/
ticket 9859656
@paul, what would be a medium sized fleet?
Linode, can you please provide the details of patching hypervisors?
I guess mitigation of memory reads between different VMs is often even more important than within a single VM.
We just completed fleet-wide reboots for Spectre v1/v2 mitigation and are now working through a few rounds of migrations in order to close this out. More information can be found on the status page here: https://status.linode.com/incidents/8dbtk37dwm67/.
None of your servers need BIOS/firmware updates for any of the recent CPU vulnerabilities? Would you consider providing at least some hardware with all AMD & Intel “management” features disabled? That seems like it would be a 100% unique offering for a cloud host.
Also, I would like to know if you use Intel ME for any management done in the datacenters – or you use other tools (which I think are more suited to managing datacenters).
Hi Wayne – In addition to switching to our latest patched kernel (5.1.5), we are addressing these vulnerabilities at the host level during scheduled maintenance windows. This guide has additional detailed information on these vulnerabilities as well as their mitigation.
As far as providing “hardware with all AMD & Intel ‘management’ features disabled,” I have added your suggestion to our internal tracker.
Regarding your last question about using Intel ME or other tools, we aren’t able to discuss specific information like this. Though if you have any other questions, let us know and we’ll be happy to provide as much information as we’re able to.