Testing report – Case ID: 102254267403 – 25/4/2024

PCIe Card Model: SON-FUS-SSD-4X4-E3 Sonnet M.2 4x4 PCIe 3.0 x16 Card Silent Version

NVMe Drives: 4 x Samsung 2TB V-NAND SSD 970 EVO Plus NVMe M.2

Background and issue summary:

The issue arose in late 2023, we now assume after a MacOS update, and has persisted across multiple OS updates since. Drives on this particular PCIe card would not initialise or appear in Finder or Disk Utility, usually after a restart.

On a small number of occasions, the drives have disappeared one by one over around 5 seconds while the Mac and OS were running, but this is unusual. Indexing or Time Machine may have been running when this happened, but I can’t be sure and I can’t reproduce that behaviour reliably.

A reliable step to reproduce is a restart. Usually after one or two restarts, the drives will not be present when MacOS starts.

Sonnet technical support worked with me and, early this year, said that they had replicated the issue and that it was a problem introduced with a MacOS update in November 2023.

Sonnet confirmed that Apple were aware and that the fix would come in the form of a MacOS update. They also confirmed that this issue did not just affect Sonnet cards, but also competitors' equivalents. I have replicated the issue with an OWC card, so I can confirm this.

Sonnet were expecting a fix early 2024, but as of the time of writing this, no fix has been offered.

I’m engaging with Apple Support to confirm that this is indeed the issue, and to try to ascertain what can be done to remedy it if a MacOS fix is not forthcoming.

Troubleshooting:

  • Multiple OS updates tested from November 2023 when issue first occurred, up to 14.5 public beta.

Problem persists.

  • New PCIe card purchased and installed. 

I believed this would solve the problem, as we thought it to be a bridge chipset incompatibility with MacOS since November 2023.

Problem persists.

Note: the drives have now been moved over to the OWC card. For the rest of the troubleshooting after this point, the 4 drives listed above are installed on an OWC Accelsior 4M2 PCIe card, since the issue persists and the behaviour is exactly the same.

OWC stated in a text chat that there was no issue with this card on the 2019 Mac Pro running Sonoma.

I also confirmed with another user that his similar OWC card, the OWC Accelsior 8M2 PCIe 4.0 NVMe M.2 SSD adapter card with M.2 NVMe SSDs, was working with no problems on his 2019 Mac Pro on Sonoma 14.4.

I’d also confirmed that the card I was testing had the same bridge chipset as the card of the user who was experiencing no issues.

Unfortunately, the problem persisted, exhibiting the exact same behaviour.

After I reported this to OWC, they said that they were, after all, aware of an issue and, as Sonnet had stated, it was an Apple MacOS issue that needed a fix via a software update.

Problem persists.

  • Moved card to an 8 lane PCIe slot (from the 16 lane one)

Problem persists.

  • Create partition and perform fresh MacOS installation.

1. Catalina 10.15.7

Performed multiple restarts and power cycles.

Drives behave as expected, problem is NOT present.

2. Sonoma Update performed, test partition is now MacOS 14.4.1

Mac shut down for the night.

First start up from power off:

Drives initialise and show as expected.

Restart Mac, first restart:

Drives did not initialise or show. 

Full shutdown of Mac – wait for 30 seconds – power up

Drives initialise and appear in Finder and Disk Utility

Restart

Drives do not initialise / appear

2nd Restart

Drives DO initialise / appear

3rd Restart

Drives DO NOT initialise / appear
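
The Sonoma 14.4.1 sequence above can be summarised with a short Python sketch. The "cold" and "warm" labels and the `tally` helper are illustrative names only. The six data points are transcribed directly from the log above, and a sample this small is indicative at best.

```python
# Observations transcribed from the MacOS 14.4.1 test log above.
# Each entry is (boot type, drives appeared?).
# "cold" = power-up from a full shutdown, "warm" = restart (labels are illustrative).
observations = [
    ("cold", True),   # first start-up from power off
    ("warm", False),  # first restart
    ("cold", True),   # full shutdown, 30 second wait, power up
    ("warm", False),  # restart
    ("warm", True),   # 2nd restart
    ("warm", False),  # 3rd restart
]


def tally(obs):
    """Return {boot_type: (times drives appeared, attempts)}."""
    result = {}
    for boot_type, ok in obs:
        successes, attempts = result.get(boot_type, (0, 0))
        result[boot_type] = (successes + int(ok), attempts + 1)
    return result


summary = tally(observations)
for boot_type, (successes, attempts) in summary.items():
    print(f"{boot_type} boot: drives appeared {successes} of {attempts} times")
```

In this log the drives appeared on both power-ups from a full shutdown, but on only one of the four restarts.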

Other Behaviour

When the drives do mount and are working, the machine has intermittently frozen while a DAW was streaming music sample library data from the drives, showing a sudden CPU overload on every core.

Sometimes this is accompanied by corrupt colourful graphics on all monitors, looking like a GPU failure, followed by a shutdown and restart with a serious error / panic report. I believe this to be linked to the issue, perhaps a wider PCIe handling problem, but this is only anecdotal at this stage. I have no testing or evidence to robustly link this problem with the one discussed above.

Conclusion

It is a safe assumption from behaviour and testing that the issue affects both the Sonnet and the OWC card in the same way. The troubleshooting carried out on the clean test partition with fresh installs of MacOS 10.15.7 and MacOS 14.4.1 is therefore assumed to produce the same results with both cards, even though only the OWC was tested.

The results of the tests support the information from Sonnet engineers that the issue arose with a MacOS update around November 2023 and affects multiple NVMe PCIe cards across multiple manufacturers.

The early assumption that the problem concerned a specific bridge chipset on some of these cards appears to be wrong, since the two cards tested here have different chipsets.

The tests show that the drives and card work as expected with no issues on a new MacOS 10.15.7 installation but fail to work properly and reliably once an update to MacOS 14.4.1 is performed.

Neither test partition installation had any additional software or drivers installed.

Update from Apple since this report was submitted

Apple conducted thorough troubleshooting and data collection with me, including data captures, screenshots and this report. They submitted it to their engineering team who, in turn, said they had to escalate it above their level. I'm now told that it is a case of waiting and there is no timeline.

This machine represents a significant business investment and the ability to run fast storage on a 16 lane PCIe IO was a major factor in the purchasing decision.

The fact that this functionality is now broken indefinitely is, in my view, unacceptable. Apple sell expensive machines and don't support the functions that people buy these machines to perform.

I've never heard of a PC manufacturer talking about the difficulty of using third party PCIe cards when issues like this occur. They've made and sold a machine with PCIe slots - it should work and continue to work.

I've spoken to the store manager of the Apple Store this machine was purchased from. His stance seemed indifferent and unsympathetic to me. He mentioned that the machine had worked well for a few years as if that was something I should be happy about. I wasn't sure what he meant by this, maybe that I shouldn't expect these machines to last more than 3 years?

This is not an acceptable or appropriate attitude in my opinion, and it is worrying that it seems to confirm Apple's lack of interest in solving this problem. I hope I'm proved wrong, although, putting myself in their position, I can understand that they may not like supporting the Intel Mac Pro.

The conspiracist in me hypothesises that Apple may even be delaying fixes like this on purpose to force us to upgrade machines. If anything it will push me to the dark side (PC). That's an alarming thought.

Apple: if by some small chance you're reading this, please reach out and help! This issue has caused delays and hardship, and loss of opportunity. 

Update from Apple engineering after going through the above and collecting data captures.

Engineering have found evidence of a drive issue in the logging

The drive is going into controller fatal state which in turn is causing the drive to be terminated

Apple engineering referring customer back to the third party manufacturer 

The issue is with the drive itself, not Mac OS or the hardware

I have replied asking the following:

  • My testing and evidence show that the drives and card work perfectly in Catalina and Ventura, but the problem reoccurs as soon as MacOS is updated to Sonoma. How, with this evidence, has engineering made a pronouncement that this is not an OS issue?
  • The drive consists of 4 M.2 drives and a host PCIe card. The card has been replaced with a different model, both models approved by Apple. The drives have been tested on the Mac Pro in question running an older OS and the problem is not present. The drives have been tested and gone through Samsung's testing process on a Windows machine and passed. They also work perfectly on the Mac Pro in question as long as it’s a previous OS. With this evidence, how is this not a MacOS issue?
  • Engineering state "Engineering have found evidence of a drive issue in the logging". There are 4 drives and a host card. It’s unclear which drive they found an issue with. Can you clarify, as all 4 are affected intermittently, and only under specific versions of MacOS?
  • I have now been through troubleshooting with Samsung. They are surprised by the statement from engineering and have deemed all four of my drives functional and fault free after we ran their diagnostics again. They have also mentioned that one or two drives being faulty is possible, but all 4 would be a statistical impossibility. This is why we seek further clarification.
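
Samsung's point about all four drives being faulty at once can be illustrated with a rough calculation. This is only a sketch: the per-drive fault rates below are made-up example values, not Samsung figures, and the calculation assumes the drives fail independently of each other. A shared cause, such as the OS or the host, would break that assumption.

```python
def all_faulty_probability(p_single: float, drives: int = 4) -> float:
    """Probability that every drive is faulty, assuming independent faults.

    p_single is a hypothetical chance that any one drive is faulty.
    """
    return p_single ** drives


# Hypothetical per-drive fault rates, chosen only for illustration.
for p in (0.01, 0.05):
    print(f"per-drive fault rate {p:.0%}: "
          f"all 4 faulty with probability {all_faulty_probability(p):.2e}")
```

Even with a pessimistic per-drive rate, four independent faults that all show the same symptom are far less likely than a single common cause.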

All drives pass all diagnostics and work perfectly under any MacOS version apart from Sonoma. They also work perfectly under Windows 11.

Two cards have been tested, different brands, different chipsets.

Apple appear to have closed this case and will not be investigating any further. I hope they will reconsider this when they read the above.

FURTHER UPDATE

Engineering have reopened my case and I'm told I'll be updated very soon. I dare not hope for any great news at this point. After all, if there was a fix, I'd have seen an OS update pop up.