I posted this on Connect feedback yesterday, but it might be useful here as well. There is a recent similar post, but my thoughts go in a different direction.
The specific situation is as follows: SQL Server 2016 EE, RTM + CU2, build 2164, on Windows Server 2012 R2 Standard Edition. Hardware is an HP ProLiant DL380 Gen9, 2 sockets, 18 cores per socket (Xeon E5-2699 v3, 2.3GHz), HT disabled (36 physical cores and 36 logical processors total), no VM. System memory is 512GB, SQL Server max server memory is 494GB (506,000MB), 20GB page file. LPIM is set. Storage is all-flash, local PCI-E SSDs. The environment uses availability groups with 2 local nodes and 1 DR node.
Typical operating conditions: SQL Server at or near target memory (489 of 494GB), the OS with 9GB available physical memory and 25GB available page file (implying 4GB of page file in use?). There is very low disk IO to the data files even during full transaction load periods. The plan cache is 14GB, well under the cache pressure limit of 75% of the first 4GB, plus 10% of 4-64GB, plus 5% of memory over 64GB, which works out to just over 30GB. The plan cache split is about 3.5GB ad hoc, 7.6GB prepared (probably from Entity Framework parameterized SQL) and 3.2GB procedures. Most of the ad hoc plans (90%) are single use, and more than half of the prepared plans (57%) are single use. Only a tiny fraction of procedure plans are single use.
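For reference, the cache pressure limit above can be checked by arithmetic, and the single-use breakdown by object type can be reproduced from sys.dm_exec_cached_plans. A sketch; the DMV columns are the documented ones, but the grouping is just one way to slice it:

```sql
-- Plan cache pressure limit for 494GB max server memory:
-- 75% of first 4GB + 10% of 4-64GB + 5% of the remainder
-- = 0.75*4 + 0.10*60 + 0.05*430 = 3 + 6 + 21.5 = 30.5GB
SELECT cp.objtype,
       COUNT(*) AS total_plans,
       SUM(CASE WHEN cp.usecounts = 1 THEN 1 ELSE 0 END) AS single_use_plans,
       CAST(SUM(cp.size_in_bytes) / 1048576.0 AS decimal(10,1)) AS cache_mb,
       CAST(SUM(CASE WHEN cp.usecounts = 1 THEN cp.size_in_bytes ELSE 0 END)
            / 1048576.0 AS decimal(10,1)) AS single_use_mb
FROM sys.dm_exec_cached_plans AS cp
GROUP BY cp.objtype
ORDER BY cache_mb DESC;
```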
Virtual address space reserved is 2.4TB after SQL Server has been running 60-70 days, but growth appears to come in spurts, not steadily. VAS committed is never observed above 496GB. There are no home-grown CLR assemblies, but there is infrequent, light use of the spatial geography STDistance function. Most EF calls specify a network packet size of 4096 or 8000 bytes, but there might be infrequent older .NET Framework clients that do not specify network packet size, hence defaulting to 8192 bytes. Does anything else make direct OS VAS allocations instead of using the buffer pool?
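The VAS figures can be tracked over time from sys.dm_os_process_memory, which also confirms that LPIM is actually in effect. A sketch using the documented columns:

```sql
-- Reserved vs committed VAS for the SQL Server process,
-- plus locked page allocations (nonzero confirms LPIM is in use)
SELECT virtual_address_space_reserved_kb  / 1048576.0 AS vas_reserved_gb,
       virtual_address_space_committed_kb / 1048576.0 AS vas_committed_gb,
       locked_page_allocations_kb         / 1048576.0 AS locked_pages_gb,
       memory_utilization_percentage
FROM sys.dm_os_process_memory;
```

Logging this periodically would show whether the reservation spurts line up with specific workloads (e.g. the spatial calls).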
The Microsoft web site states that when an operation needs to make a memory allocation via the OS VAS while SQL Server is at the target memory limit, the allocation is allowed, temporarily exceeding the specified memory limit, and memory is then gradually released elsewhere.
The problem experienced is intermittent occasions when the plan cache is forced down for unknown reasons, getting as low as 100MB. This results in a majority of queries requiring a compile, greatly increasing overall system CPU (from 20-30% to near 100%?) as many procedures involve complex SQL. An even worse occurrence is compiles being blocked. Perhaps whatever memory pressure caused this results in SQL Server taking a lock on the entire plan cache to force out plans?
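One way to see, after the fact, what kind of pressure triggered such an event is the resource monitor ring buffer, which records low-memory notifications with flags for internal (process) versus external (system) pressure. A sketch, assuming the usual RING_BUFFER_RESOURCE_MONITOR record layout:

```sql
-- Recent low-memory notifications from the resource monitor ring buffer
SELECT x.rec.value('(Record/@id)[1]', 'int') AS record_id,
       x.rec.value('(Record/ResourceMonitor/Notification)[1]', 'varchar(60)') AS notification,
       x.rec.value('(Record/ResourceMonitor/IndicatorsProcess)[1]', 'int')    AS indicators_process,
       x.rec.value('(Record/ResourceMonitor/IndicatorsSystem)[1]', 'int')     AS indicators_system
FROM (SELECT CAST(record AS xml) AS rec
      FROM sys.dm_os_ring_buffers
      WHERE ring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR') AS x
ORDER BY record_id DESC;
```

A nonzero IndicatorsProcess at the time the cache collapsed would point at internal pressure (SQL Server's own allocations) rather than the OS.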
It is not certain, but suspected, that SQL Server did not flush data pages from the buffer cache. The 14GB of plan cache flushed by the memory pressure event had severe negative consequences. Flushing out 14GB of data pages instead would probably have resulted in only a minor increase in disk IO, and even that would be far below what the flash storage could have supported. Also, just flushing the single-use ad hoc and prepared plans would have reduced the plan cache by about 50% (7GB) with probably minor impact.
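On that point, two standard options target exactly the single-use plans (sketches, not something tested on this system): the 'optimize for ad hoc workloads' setting, which caches only a small stub for an ad hoc batch until its second execution (it does not help the prepared plans), and DBCC FREESYSTEMCACHE on the SQL Plans store, which flushes the ad hoc and prepared caches on demand while leaving procedure plans and the buffer pool alone:

```sql
-- Cache only a plan stub on first execution of ad hoc batches
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'optimize for ad hoc workloads', 1;
RECONFIGURE;

-- On-demand flush of just the ad hoc and prepared plan store
DBCC FREESYSTEMCACHE('SQL Plans');
```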
The question is: what is SQL Server's internal strategy for responding to memory pressure? Does it automatically flush the plan cache? That would not be a good choice here. Does it also consider whether to flush data pages? In this particular case, with disk IO normally low and storage on all-flash, that would be my preferred course of action.
jchang