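Professor Xiaokang Yang's Team at the School of Electronic Information and Electrical Engineering Proposes a Continual Spatiotemporal Predictive Learning Framework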
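In recent years, predictive learning, which aims to learn general-purpose representations of an environment, has been applied more and more widely to spatiotemporal decision-making tasks in settings such as industrial manufacturing and autonomous driving. To address spatiotemporal predictive learning under a continual task-learning setting, the team led by Professor Xiaokang Yang of the Artificial Intelligence Institute, School of Electronic Information and Electrical Engineering, introduced and improved existing continual learning methods and proposed CPL (Continual Predictive Learning), a pioneering framework for continual spatiotemporal predictive learning. The resulting paper, "Continual Predictive Learning from Videos", supervised by Professor Xiaokang Yang and Assistant Professor Yunbo Wang, has been accepted to CVPR 2022 and selected for an oral presentation (orals account for roughly 5% of submissions each year).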
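CVPR (IEEE Conference on Computer Vision and Pattern Recognition) is a top conference in computer vision and pattern recognition and is recommended as a Class A conference by the China Computer Federation. In the 2021 ranking of academic journals and conferences by impact published by Google Scholar, CVPR ranked 4th among all academic publications and conferences.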
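Predictive learning was first proposed by Turing Award winner Yann LeCun in his keynote at NIPS 2016. Its core idea can be summarized as follows: by solving the unsupervised task of predicting future consecutive frames from a given video clip, an agent can learn the dynamic priors contained in the environment that produced the data, such as how objects move under applied forces, and these priors in turn support the agent's decision-making and reasoning about future actions. Existing research usually assumes that all training data for the different environments and prediction tasks is available in advance, and trains the model only afterward.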
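In real-world scenarios, however, as shown in Figure 1, the environment or task a model faces may change dynamically. The prediction tasks to be learned may arrive sequentially and in a non-stationary way: a robotic arm, for example, may first need to learn a pushing action and then learn grasping and stacking separately. The model has to learn a series of different tasks one after another, and while it learns the current task, the training data of earlier tasks is unavailable or available only in small amounts. In this continual learning setting, most existing predictive learning methods suffer from severe catastrophic forgetting: as the model works through the task sequence, it gradually forgets what it learned on earlier tasks, and its test performance on those tasks drops. The researchers also found that directly applying continual learning methods designed for the image domain to spatiotemporal prediction does not work well.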
Figure 1. Problem definition of continual spatiotemporal prediction and the test-time pipeline of the proposed architecture
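To address these problems, the research team proposed CPL (Continual Predictive Learning), a continual spatiotemporal predictive learning framework whose overall structure is shown in Figure 2. For the network architecture, the team designed a Mixture World Model that introduces category labels to separate the spatiotemporal dynamics of different tasks. To compensate for data from earlier tasks that is no longer available, the team proposed a Predictive Experience Replay strategy, which combines single-frame image generation with reuse of the world model to regenerate data of previously learned tasks under a limited memory budget, lifting the data restriction. Finally, in the test pipeline, the team introduced an adaptive Non-parametric Task Inference mechanism, which further alleviates the forgetting of task labels at prediction time.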
Figure 2. Overall framework of CPL
Key innovations
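1. Mixture world model
To better separate the spatiotemporal dynamics of different tasks, and thereby reduce the representation confusion that arises when the model learns the data distributions, the researchers first assign a distinct task label to each task. They then learn task-specific priors in the form of a mixture of Gaussians and use these priors for prediction, which gives the world model greater expressive power.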
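The idea of one Gaussian component per task label can be illustrated with a toy sketch. This is only an illustration under stated assumptions, not the team's implementation: the class name `TaskGaussianPrior`, the two-dimensional latent, and the fixed means and standard deviations are all hypothetical, whereas in the actual framework the priors are learned by the world model.

```python
import random


class TaskGaussianPrior:
    """Toy mixture prior: one diagonal Gaussian component per task label.

    Hypothetical stand-in for a learned prior; parameters here are fixed by hand.
    """

    def __init__(self, dim):
        self.dim = dim
        self.components = {}  # task label -> (means, stds)

    def add_task(self, task, means, stds):
        if len(means) != self.dim or len(stds) != self.dim:
            raise ValueError("component size must match the latent dimension")
        self.components[task] = (list(means), list(stds))

    def sample(self, task, rng):
        # Sampling is conditioned on the task label, so each task draws
        # its latent from its own component of the mixture.
        means, stds = self.components[task]
        return [rng.gauss(m, s) for m, s in zip(means, stds)]


# Two tasks with well-separated (assumed) components.
prior = TaskGaussianPrior(dim=2)
prior.add_task("boxing", means=[-2.0, 0.0], stds=[0.1, 0.1])
prior.add_task("running", means=[2.0, 0.0], stds=[0.1, 0.1])

rng = random.Random(0)
z_boxing = prior.sample("boxing", rng)
z_running = prior.sample("running", rng)
```

Because each label selects its own component, latents drawn for different tasks stay in separate regions of the latent space, which is the intuition behind using labels to keep task dynamics apart.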
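2. Predictive experience replay
To mitigate catastrophic forgetting in spatiotemporal predictive learning, the researchers train the mixture world model with a replay-based method. While training on the current task, data from previously learned tasks is regenerated by other means, and the generated data is mixed with the real data of the current task and fed to the model, which alleviates catastrophic forgetting.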
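The mixing step of replay-based training can be sketched as follows. This is a minimal illustration, assuming scalar "samples" and hand-written generator functions; the name `build_replay_batch` and its parameters are hypothetical. In the framework described here, the regenerated data would instead come from single-frame image generation combined with the world model.

```python
import random


def build_replay_batch(real_samples, generators, current_task, replay_per_task, rng):
    """Mix real data of the current task with regenerated data of earlier tasks."""
    batch = [(current_task, x) for x in real_samples]
    for task, generate in generators.items():
        if task == current_task:
            continue  # the current task is covered by real data
        batch.extend((task, generate(rng)) for _ in range(replay_per_task))
    rng.shuffle(batch)  # interleave old and new tasks within the batch
    return batch


# Hypothetical generators standing in for models of earlier tasks.
generators = {
    "hammer": lambda rng: rng.gauss(-1.0, 0.05),
    "assembly": lambda rng: rng.gauss(0.0, 0.05),
}

rng = random.Random(0)
batch = build_replay_batch(
    real_samples=[0.9, 1.1, 1.0],
    generators=generators,
    current_task="sweep",
    replay_per_task=2,
    rng=rng,
)
task_counts = {}
for task, _ in batch:
    task_counts[task] = task_counts.get(task, 0) + 1
```

Each training batch then contains both the current task and every earlier task, so gradient updates for the new task are balanced against reminders of the old ones.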
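3. Non-parametric task inference
At test time, using a separate video classification model to infer the task would expose that classifier to catastrophic forgetting as well. To avoid this, the research team proposed a non-parametric task inference method that uses the mixture world model itself to infer the task by trial and error.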
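Trial-and-error task inference can be sketched as trying every candidate task label and keeping the one whose predictions best match the observed frames. The example below is a toy sketch under stated assumptions: frames are reduced to scalars, the per-task predictors are hypothetical hand-written functions, and the one-step squared error is an assumed scoring rule rather than the criterion used in the paper.

```python
def infer_task(observed, predictors):
    """Return the task label whose predictor best explains the observed sequence."""

    def error(predict):
        # One-step-ahead squared error over consecutive observed frames.
        return sum((predict(prev) - nxt) ** 2 for prev, nxt in zip(observed, observed[1:]))

    # No classifier is trained: every candidate label is simply tried in turn.
    return min(predictors, key=lambda task: error(predictors[task]))


# Hypothetical task-conditioned dynamics: each task moves at a different speed.
predictors = {
    "walking": lambda x: x + 1.0,
    "jogging": lambda x: x + 2.0,
    "running": lambda x: x + 3.0,
}

observed = [0.0, 2.1, 3.9, 6.0]
inferred = infer_task(observed, predictors)
```

Since the selection relies only on the prediction error of the world model, there are no extra classifier parameters that could be forgotten as new tasks arrive.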
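To verify the algorithm's spatiotemporal prediction ability in complex scenes, the researchers ran quantitative and qualitative experiments on RoboNet, a real-world robotic arm dataset, and KTH, a human action dataset. On KTH, the model learned the tasks in the order boxing -> handclapping -> handwaving -> walking -> jogging -> running. After the model had learned the last task, running, the researchers tested its video prediction on the first task, boxing. In Figure 3, the Ground Truth (GT) in the upper left is the true future sequence. Compared with the other methods, the proposed CPL-full model predicts video clips that closely match the ground truth and carry the correct action semantics (boxing). The other models tend either to produce blurry predictions (e.g., PredRNN+LwF) or to generate results that contain the wrong action (e.g., PredRNN). These results show that the proposed model effectively alleviates catastrophic forgetting of earlier tasks during learning.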
Figure 3. Comparison of predictions from different models when tested on the boxing task (CPL-full is the proposed method)
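To further demonstrate how the proposed model performs in industrial settings, the researchers tested it in the Meta-World robot simulation environment and visualized the results of continual learning on a robotic arm. They first used pretrained reinforcement learning policies to sample the different tasks and obtain video sequences, and then trained the model on spatiotemporal prediction in the order hammer -> assembly -> sweep. After the last task, sweep, had been learned, video prediction was tested again on all learned tasks, as shown in Figure 4. After continual learning over the task sequence, the baseline model CPL-base clearly forgot appearance information on the first task, hammer (the object disappears), and forgot both appearance and motion information on the second task, assembly (the object disappears and the arm's motion is inconsistent). The proposed CPL-full model, in contrast, still generates clear predictions for all learned tasks after continual learning.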
Figure 4. Continual learning prediction results of the model in the robot simulation environment (CPL-full is the proposed method)
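In summary, the proposed continual spatiotemporal predictive learning framework CPL effectively alleviates knowledge forgetting in spatiotemporal prediction models during sequential learning. The model can learn and retain dynamic priors across multiple scenes and tasks. By predicting future spatiotemporal states, these priors can support long-horizon spatiotemporal planning and decision-making and improve the ability of agents such as intelligent robots to learn and operate in real-world scenarios, giving the work broad application prospects and value for the development of the intelligent manufacturing industry.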
ÂÛÎĵØÖ·£ºhttps://arxiv.org/abs/2204.05624
Project: https://github.com/jc043/CpL
School of Electronic Information and Electrical Engineering