Published: 2023-07-20
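CXL (Compute Express Link) is a new device interconnect standard that has become the industry's leading new technology for breaking through the memory bottleneck. It is used not only to expand memory capacity and bandwidth, but also for heterogeneous interconnect and for disaggregating data center resource pools. In the data center, CXL can interconnect different compute and storage resources and address data center memory problems with higher system performance and efficiency.
The rapid growth of cloud computing, big data analytics, artificial intelligence, and machine learning applications has caused explosive growth in the data that data centers must store and process. The traditional DDR memory interface is limited in total bandwidth, in average bandwidth per core, and in capacity scalability. These memory constraints are most acute in the data center, which is where CXL emerged as a new memory interface technology.
In the data center, CPUs and memory are tightly coupled: each CPU generation adopts a new memory technology to reach higher capacity and bandwidth. Since 2012, core counts have grown rapidly, but memory bandwidth and capacity per core have not kept pace and have in fact declined. The trend is expected to continue, and memory capacity is also growing faster than memory bandwidth, which has a large impact on system performance.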
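In addition, the large gap in latency and cost between direct-attached DRAM and SSD means that expensive memory is often poorly utilized. A mismatched compute-to-memory ratio easily leaves memory stranded, that is, idle capacity that cannot be rented out. The data center business is one of the most capital-intensive industries in the world, so low utilization is a heavy burden. According to Microsoft, DRAM accounts for 50% of total server cost, yet despite that cost 25% of DRAM is wasted. Internal statistics from Meta, shown in the figure below, point to a similar situation. Memory's share of total system cost keeps rising, and memory rather than the CPU has become the main system cost. A CXL memory pool can effectively mitigate the problem: allocating memory to systems dynamically improves the compute-to-memory ratio and lowers TCO.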
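To put the two Microsoft figures quoted above side by side, the following minimal sketch multiplies them to estimate how much of a server's cost sits in DRAM that is never used. It assumes that DRAM cost scales linearly with capacity, which is a simplification, and the function name is purely illustrative.

```python
def stranded_dram_cost_share(dram_cost_share: float, stranded_fraction: float) -> float:
    """Fraction of total server cost tied up in DRAM that goes unused.

    Assumes DRAM cost is proportional to capacity (a simplification).
    """
    return dram_cost_share * stranded_fraction


# Figures quoted in the text: DRAM is 50% of server cost, 25% of DRAM is wasted.
share = stranded_dram_cost_share(dram_cost_share=0.50, stranded_fraction=0.25)
print(f"{share:.1%} of server cost sits in unused DRAM")
```

Under that assumption, roughly one eighth of the server cost buys memory that does no work, which is why pooling is attractive.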

Latency profile of different memory technologies

Memory bandwidth and capacity growth over time
Memory share of rack TCO and power across generations

Memory stranding in Microsoft Azure
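Given these problems with traditional memory, the industry has been pursuing new memory interface technologies and system architectures.
On the memory interface side, PCI Express (Peripheral Component Interconnect Express, PCIe) has become the first choice. PCIe is a serial bus, and from a performance and software standpoint the overhead of communication between devices over PCIe is relatively high. The good news is that the roadmap is accelerating: PCIe 7.0, which PCI-SIG is targeting for 2025, is planned to provide up to 256 GB/s in each direction over an x16 link, and PCIe 5.0 followed the 16 GT/s PCIe 4.0 by less than two years. Cloud computing demand is the main driver of this accelerated roadmap; historically, PCIe doubled its data rate only every 3 to 4 years, or even 7 years.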
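The 256 GB/s figure follows from the per-lane signalling rate. The sketch below is a back-of-the-envelope calculation, assuming an x16 link and ignoring encoding and protocol overhead; it uses the published per-lane rates of 16, 32, 64, and 128 GT/s for PCIe 4.0 through 7.0, and the helper name is illustrative.

```python
# Per-lane raw signalling rate in GT/s for each PCIe generation.
PCIE_GT_PER_S = {"4.0": 16, "5.0": 32, "6.0": 64, "7.0": 128}


def raw_bandwidth_gb_per_s(gen: str, lanes: int = 16) -> float:
    """Raw one-direction bandwidth in GB/s, ignoring encoding overhead.

    Each transfer moves one bit per lane, and there are 8 bits per byte.
    """
    return PCIE_GT_PER_S[gen] * lanes / 8


for gen in PCIE_GT_PER_S:
    print(f"PCIe {gen} x16: {raw_bandwidth_gb_per_s(gen):.0f} GB/s per direction")
```

Real links deliver somewhat less than these raw numbers once encoding and packet overhead are accounted for.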

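PCIe vs. DDR bandwidth comparison
System architecture has evolved over several generations. Early attempts to let multiple servers share a resource pool typically used RDMA over commodity Ethernet or InfiniBand. These approaches have higher latency (tens of nanoseconds for local memory versus several microseconds for RDMA) and lower bandwidth, and they cannot provide key features such as memory coherence.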

Achievable round-trip latency (total) on a network with 40 Gbps links,
and the components that contribute to it (a 100 Gbps link would cut data transmission time by 0.5 µs)
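In 2016, CCIX emerged as a potential industry standard. It was driven by the need for an interconnect faster than the technologies then available, and for cache coherence so that memory could be accessed more quickly in heterogeneous multiprocessor systems. The biggest advantage of the CCIX specification was that it built on the PCI Express specification, but it lacked support from key industry players and never really took off.
CXL, by contrast, builds on the existing PCIe 5.0 physical and electrical layer standards and ecosystem, and adds cache coherence and low latency for memory load/store transactions. Because it established an industry-standard protocol backed by most of the major players, CXL makes the transition to heterogeneous computing possible and has gained broad industry support. AMD's Genoa and Intel's Sapphire Rapids added CXL 1.1 support in late 2022 and early 2023. With that, CXL has become one of the most promising technologies in both industry and academia for solving this problem.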
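CXL is built on the PCIe physical layer and inherits the existing PCIe physical and electrical interface, providing high bandwidth and high scalability. Compared with a conventional PCI Express (PCIe) interconnect, CXL also has lower latency, and it offers a unique set of new features that let the CPU communicate with peripheral devices (memory expanders, accelerators, and the memory attached to them) in a cache-coherent way with load/store semantics. The technology maintains coherence between the CPU memory space and the memory on attached devices, which allows resource sharing, delivers higher performance, and reduces software stack complexity. Memory-related device expansion is one of CXL's main target scenarios.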

CXL builds on the existing PCIe physical and electrical interface

Memory expansion and pooling with CXL/PCIe
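CXL actually comprises three protocols, and not all of them are a cure-all for latency. CXL.io, which runs over the physical layer of the PCIe bus, still has the same kind of latency as before, but the other two protocols, CXL.cache and CXL.mem, take a faster path that reduces latency. Most CXL memory controllers add roughly 100 to 200 ns of latency, and each additional retimer adds a few tens of nanoseconds more, depending on the distance between the device and the CPU.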
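The latency figures above can be combined into a rough additive estimate. The sketch below is only an illustration: the 80 ns local DRAM latency, the 20 ns per-retimer penalty, and the 25% share of accesses going to CXL memory are assumed values, not measurements from the source, and both function names are hypothetical.

```python
def cxl_access_latency_ns(local_dram_ns: float, controller_ns: float,
                          retimers: int = 0, retimer_ns: float = 20.0) -> float:
    """Additive estimate: DRAM access + CXL controller + one penalty per retimer."""
    return local_dram_ns + controller_ns + retimers * retimer_ns


def blended_latency_ns(local_ns: float, cxl_ns: float, cxl_fraction: float) -> float:
    """Average latency when `cxl_fraction` of accesses are served from CXL memory."""
    return (1.0 - cxl_fraction) * local_ns + cxl_fraction * cxl_ns


LOCAL_NS = 80.0  # assumed local DRAM latency, not a figure from the text

best = cxl_access_latency_ns(LOCAL_NS, controller_ns=100.0)               # no retimer
worst = cxl_access_latency_ns(LOCAL_NS, controller_ns=200.0, retimers=1)  # one retimer
average = blended_latency_ns(LOCAL_NS, best, cxl_fraction=0.25)

print(f"CXL access: {best:.0f} to {worst:.0f} ns; blended average: {average:.0f} ns")
```

Even in the worse case the estimate stays in the hundreds of nanoseconds, well below the several microseconds quoted for RDMA.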

Latency added by CXL is close to that of a NUMA hop

System architecture for CXL/PCIe memory expansion
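CXL multiplexes different protocols at the PCIe PHY layer. The CXL 1.0/1.1 specifications define three protocols: CXL.io, CXL.cache, and CXL.mem, and most CXL devices use a combination of them. CXL.io uses the same Transaction Layer Packets (TLPs) and Data Link Layer Packets (DLLPs) as PCIe; the TLPs and DLLPs are carried in the payload portion of the CXL flit. CXL defines policies for providing the required quality of service (QoS) across the different protocol stacks. Multiplexing the protocols at the PHY level ensures that latency-sensitive protocols such as CXL.cache and CXL.mem have the same low latency as a native CPU-to-CPU symmetric coherency link. For these latency-sensitive protocols, CXL defines an upper bound on pin-to-pin response time, so that platform performance does not suffer from large latency differences between devices that implement coherence and memory semantics.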
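CXL.cache allows a CXL device to coherently access and cache the host CPU's memory so that it can safely use its local copy; think of a GPU caching data directly from the CPU's memory.
CXL.mem allows the host CPU to coherently access the device's memory; think of the CPU using a dedicated storage-class memory device, or the memory on a GPU or accelerator.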

From left to right: CXL Type 1, CXL Type 3, CXL Type 2
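The text names the three device types only in the caption above. As a small sketch, the mapping below encodes the protocol combination commonly described for each type in the CXL specification (Type 1: CXL.io and CXL.cache; Type 2: all three; Type 3: CXL.io and CXL.mem). The Python names are illustrative and are not part of any CXL software interface.

```python
from enum import Flag, auto


class Protocol(Flag):
    IO = auto()     # CXL.io: PCIe-style discovery, configuration, DMA
    CACHE = auto()  # CXL.cache: device coherently caches host memory
    MEM = auto()    # CXL.mem: host coherently loads/stores device memory


DEVICE_TYPES = {
    "Type 1": Protocol.IO | Protocol.CACHE,                 # caching device without host-managed memory
    "Type 2": Protocol.IO | Protocol.CACHE | Protocol.MEM,  # accelerator with its own memory
    "Type 3": Protocol.IO | Protocol.MEM,                   # memory expander
}


def device_can_cache_host_memory(device_type: str) -> bool:
    return Protocol.CACHE in DEVICE_TYPES[device_type]


def host_can_load_store_device_memory(device_type: str) -> bool:
    return Protocol.MEM in DEVICE_TYPES[device_type]


for name in DEVICE_TYPES:
    print(name, device_can_cache_host_memory(name), host_can_load_store_device_memory(name))
```

Type 3 is the class that matters for memory expansion and pooling, since it exposes memory to the host without needing CXL.cache.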
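CXL 2.0 adds support for memory pooling and CXL switching, allowing many hosts and devices to be linked and to communicate with one another, which significantly increases the number of devices that can be attached to a CXL network. Multiple hosts can connect to a switch, which in turn connects to various devices; if a CXL device is multi-headed and connected to the root ports of several hosts, the same can be achieved without a switch. With an SLD (Single Logical Device), each host uses its own separate memory pool, whereas an MLD (Multiple Logical Device) is designed to let multiple hosts share the same physical memory pool.
The distributed memory network is governed by a Fabric Manager, which handles memory allocation and device orchestration. It acts as a control plane or coordinator, resides on a separate chip or in the switch, and usually does not need high performance because it does not touch the data plane. The Fabric Manager provides a standard API for controlling and managing the system, enabling fine-grained resource allocation, hot plug, and dynamic capacity expansion, so that hardware can be dynamically assigned and moved between hosts without any reboot. Putting all of this together, Microsoft reports that CXL-based memory pooling could reduce overall memory demand by 10%, with the potential to lower total server cost by 5%.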
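The control-plane role described above can be illustrated with a toy model. The class below is a hypothetical sketch, not the Fabric Manager API defined by the CXL specification: it only keeps track of how much pooled capacity each host holds, and shows capacity moving between hosts without either host restarting.

```python
class ToyFabricManager:
    """Control-plane bookkeeping only; no data passes through this object."""

    def __init__(self, pool_gib: int):
        self.pool_gib = pool_gib
        self.assigned = {}  # host name -> GiB currently assigned from the pool

    @property
    def free_gib(self) -> int:
        return self.pool_gib - sum(self.assigned.values())

    def allocate(self, host: str, gib: int) -> bool:
        """Assign pooled capacity to a host; refuse if the pool cannot cover it."""
        if gib <= 0 or gib > self.free_gib:
            return False
        self.assigned[host] = self.assigned.get(host, 0) + gib
        return True

    def release(self, host: str, gib: int) -> None:
        """Return capacity to the pool so another host can take it."""
        remaining = max(self.assigned.get(host, 0) - gib, 0)
        if remaining:
            self.assigned[host] = remaining
        else:
            self.assigned.pop(host, None)


fm = ToyFabricManager(pool_gib=1024)
first = fm.allocate("host-a", 512)    # succeeds
second = fm.allocate("host-b", 768)   # refused: only 512 GiB are free
fm.release("host-a", 256)             # host-a gives half back
third = fm.allocate("host-b", 768)    # now succeeds
print(first, second, third, fm.free_gib)
```

A real implementation would also have to program switch bindings and notify hosts of hot-add and hot-remove events, which this sketch leaves out.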


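CXL 2.0 memory pooling (switch vs. direct-connect mode)
CXL has strong momentum, and major vendors such as Samsung, SK hynix, Marvell, Rambus, and AMD are stepping up their efforts. All hyperscalers, including public cloud providers, have begun experimenting with CXL-attached memory pools to reduce memory stranding and to add bandwidth and capacity dynamically and flexibly. However, there are currently few technologies for scheduling, managing, and monitoring tiered memory that let applications use a mixed pool of local and external resources. Cloud providers that intend to deploy CXL-based pooling at scale must therefore either build such a system themselves or find suitable system software and hardware vendors. Major cloud providers such as Microsoft and Meta are already ahead in this area.
Microsoft's Pond uses machine learning to predict whether a VM is latency-sensitive and how much of its memory will go untouched, and from that decides whether to schedule the VM on local memory or on remote CXL memory. It works together with a performance monitoring system that continuously adjusts placement and migrates VMs.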

Control-plane workflow of Microsoft's Pond
(A) The VM scheduler uses ML-based predictions to identify latency-sensitive VMs and their likely amount of untouched memory, and uses them to decide VM placement.
(B) Monitoring: if quality of service (QoS) is not met, the Mitigation Manager reconfigures the VM.
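Steps (A) and (B) can be sketched as a simple decision procedure. This is an illustrative simplification under stated assumptions, not Pond's actual algorithm: the two predictions are taken as given inputs rather than produced by a model, the placement policy (latency-insensitive VMs go entirely to the pool, latency-sensitive VMs put only their predicted untouched memory there) is one plausible choice, and the 5% slowdown limit is a hypothetical QoS threshold.

```python
from dataclasses import dataclass


@dataclass
class VMPrediction:
    name: str
    memory_gib: int
    latency_sensitive: bool    # assumed output of an ML classifier
    untouched_fraction: float  # assumed predicted share of memory never touched


def place_vm(pred: VMPrediction) -> tuple:
    """Step (A): return (local_gib, pool_gib) for a new VM."""
    if not pred.latency_sensitive:
        return 0, pred.memory_gib  # whole VM can live in the CXL pool
    pool = int(pred.memory_gib * pred.untouched_fraction)
    return pred.memory_gib - pool, pool


def mitigate(pred: VMPrediction, measured_slowdown: float,
             qos_limit: float = 0.05) -> tuple:
    """Step (B): if monitoring shows a QoS violation, move the VM to local memory."""
    if measured_slowdown > qos_limit:
        return pred.memory_gib, 0
    return place_vm(pred)


db = VMPrediction("db", memory_gib=32, latency_sensitive=True, untouched_fraction=0.25)
batch = VMPrediction("batch", memory_gib=64, latency_sensitive=False, untouched_fraction=0.5)

print(place_vm(db))       # local and pool split for the sensitive VM
print(place_vm(batch))    # insensitive VM goes to the pool
print(mitigate(batch, measured_slowdown=0.12))  # misprediction corrected
```

The mitigation path matters because predictions can be wrong: a VM misclassified as latency-insensitive is pulled back to local memory once monitoring detects the slowdown.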
TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory
Pond: CXL-Based Memory Pooling Systems for Cloud Platforms
Demystifying CXL Memory with Genuine CXL-Ready Systems and Devices
Compute Express Link™ Specification 3.0 whitepaper
Design and Analysis of CXL Performance Models for Tightly-Coupled Heterogeneous Computing
Memory Disaggregation: Advances and Open Challenges
Network Requirements for Resource Disaggregation
A Case for CXL-Centric Server Processors
