Book Summary

Which new technologies will dramatically change the world? We present remarkable innovations emerging from the world's leading research institutes.

Since the earliest days of computing, people have imagined robotic servants able to follow high-level, Alexa-type commands, such as "Go to the kitchen and fetch me a coffee cup." But, as MIT engineers explain, carrying out such high-level tasks means that robots will have to perceive their physical environment as humans do.

To function in the world, you need a mental model of the environment around you. This is effortless for humans. But for robots, it's a painfully hard problem, one that requires transforming the pixel values they see through a camera into an understanding of the world.

Fortunately, these MIT engineers have developed a representation of spatial perception for robots that is modeled after the way humans perceive and navigate the world. The new model, called 3D Dynamic Scene Graphs, enables a robot to quickly generate a 3D map of its surroundings that also includes semantic labels for what it contains, such as people, rooms, walls, tables, chairs, and other structures the robot is likely to see in its environment. The model also allows the robot to extract relevant information from the 3D map and to query the location of objects, rooms, or moving people in its path.
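As a rough illustration of what "querying" such a map could look like, here is a minimal Python sketch of a layered graph whose nodes carry a semantic label, a position, and a link to the node that contains them. The class names, the layer names ("room", "object", "human"), and the sample coordinates are all assumptions made for this example; they are not taken from the MIT team's software.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    node_id: str
    layer: str                             # hypothetical layer name: "room", "object", "human"
    label: str                             # semantic label, e.g. "kitchen" or "cup"
    position: Tuple[float, float, float]   # (x, y, z) in the map frame
    parent: Optional[str] = None           # id of the node that contains this one


class SceneGraph:
    """Toy layered scene graph: nodes indexed by id, linked upward via `parent`."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def find(self, layer: str, label: Optional[str] = None) -> List[Node]:
        """Return all nodes in a layer, optionally filtered by semantic label."""
        return [
            n for n in self.nodes.values()
            if n.layer == layer and (label is None or n.label == label)
        ]

    def room_of(self, node_id: str) -> Optional[Node]:
        """Walk up the parent links until a room node is found."""
        current = self.nodes[node_id]
        while current.parent is not None:
            current = self.nodes[current.parent]
            if current.layer == "room":
                return current
        return None


graph = SceneGraph()
graph.add(Node("room_1", "room", "kitchen", (4.0, 2.0, 0.0)))
graph.add(Node("obj_7", "object", "cup", (4.5, 2.2, 0.9), parent="room_1"))
graph.add(Node("human_2", "human", "person", (3.0, 1.0, 0.0), parent="room_1"))

cups = graph.find("object", "cup")
print(cups[0].position, graph.room_of("obj_7").label)
```

With a structure like this, a command such as "fetch me a coffee cup" reduces to a lookup by label followed by a walk up to the containing room, rather than a search through raw geometry.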

This compressed representation of the environment is useful because it allows a robot to quickly make decisions and plan its path. This is not too far from what we do as humans. If you need to plan a path from your home to work, you don't plan every single position you need to take. You just think at the level of streets and landmarks, which helps you plan your route faster.
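The "streets and landmarks" idea can be sketched as a search over a small graph of places instead of over every position in a map. The example below uses a plain breadth-first search; the room names and their connections are made up for illustration, and the researchers' actual planner is not described in this article.

```python
from collections import deque


def plan_route(adjacency, start, goal):
    """Breadth-first search over a coarse graph of places.

    Returns the list of places from start to goal with the fewest hops,
    or None if the goal cannot be reached.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in adjacency.get(path[-1], []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical room-level connectivity of a small home.
rooms = {
    "bedroom": ["hallway"],
    "hallway": ["bedroom", "kitchen", "living_room"],
    "living_room": ["hallway", "kitchen"],
    "kitchen": ["hallway", "living_room"],
}

route = plan_route(rooms, "bedroom", "kitchen")
print(route)
```

Because the graph has only a handful of nodes, the search finishes almost immediately; a detailed path inside each room would only need to be worked out for the rooms on the chosen route.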

Beyond domestic helpers, the researchers say robots that adopt this new kind of mental model of the environment may also be suited for other high-level jobs, such as working side-by-side with people on a factory floor or exploring a disaster site for survivors.

The research was presented recently at the 2020 Robotics: Science and Systems virtual conference.

Why is this important? Until now, robotic vision and navigation have advanced mainly along two routes. The first involves 3D mapping, which enables robots to reconstruct their environment in three dimensions as they explore in real time. The second uses semantic segmentation, which helps a robot classify features in its environment as semantic objects, such as a car versus a bicycle, and which so far has mostly been done with 2D images. The new MIT model of spatial perception is the first to generate a 3D map of the environment in real time while also labeling objects, people, and structures within that 3D map.

The key component of the team's new model is Kimera, an open-source library that the team previously developed to simultaneously construct a 3D geometric model of an environment, while encoding the likelihood that an object is, say, a chair versus a desk. Like the mythical creature that is a mix of different animals, the team wanted Kimera to be a mix of mapping and semantic understanding in 3D.
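One common way to "encode the likelihood" that something is a chair versus a desk is to keep a probability for each label and update it each time a new observation arrives. The sketch below uses a simple Bayesian update for that purpose. It is an illustrative assumption, not Kimera's documented update rule, and the function name and the numbers are invented for the example.

```python
def fuse_label(prior, observation_likelihood):
    """Bayesian update of a label distribution.

    prior: dict mapping label -> current probability.
    observation_likelihood: dict mapping label -> likelihood of the new
    observation under that label. Labels missing from it get likelihood 0.
    """
    unnormalised = {
        label: prior[label] * observation_likelihood.get(label, 0.0)
        for label in prior
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        # The observation is incompatible with every label; keep the prior.
        return dict(prior)
    return {label: p / total for label, p in unnormalised.items()}


# Start undecided, then fold in three observations that each favour "chair".
belief = {"chair": 0.5, "desk": 0.5}
for _ in range(3):
    belief = fuse_label(belief, {"chair": 0.7, "desk": 0.3})

print(belief)
```

Each consistent observation pushes the belief further toward one label, so a single misclassified frame does not flip the label once enough evidence has accumulated.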

Kimera works by taking in streams of images from a robot's camera, as well as inertial measurements from onboard sensors, to estimate the trajectory of the robot or camera and to reconstruct the scene as a 3D mesh, all in real time.

To generate a semantic 3D mesh, Kimera uses an existing neural network, trained on millions of real-world images, to predict the label of each set of pixels. It then projects these labels into 3D using a technique known as ray-casting, which is commonly used in computer graphics for real-time rendering.
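The projection step can be illustrated with a deliberately simplified ray-caster. It assumes a pinhole camera at the origin looking along +z, a set of occupied voxels standing in for the reconstructed geometry, and a fixed-step march along each ray instead of an exact voxel traversal. All function names, the intrinsics, and the voxel coordinates are hypothetical and do not come from Kimera.

```python
import math
from collections import Counter, defaultdict


def pixel_ray(u, v, fx, fy, cx, cy):
    """Unit direction of the ray through pixel (u, v) for a pinhole camera."""
    d = ((u - cx) / fx, (v - cy) / fy, 1.0)
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)


def cast_ray(origin, direction, occupied, voxel_size, max_range, step):
    """March along the ray in fixed steps; return the first occupied voxel hit."""
    t = 0.0
    while t <= max_range:
        point = tuple(o + t * d for o, d in zip(origin, direction))
        voxel = tuple(math.floor(c / voxel_size) for c in point)
        if voxel in occupied:
            return voxel
        t += step
    return None


def project_labels(labelled_pixels, occupied, intrinsics,
                   voxel_size=0.1, max_range=5.0):
    """Assign each hit voxel the label most pixels voted for."""
    votes = defaultdict(Counter)
    for (u, v), label in labelled_pixels.items():
        ray = pixel_ray(u, v, *intrinsics)
        hit = cast_ray((0.0, 0.0, 0.0), ray, occupied,
                       voxel_size, max_range, step=voxel_size / 2)
        if hit is not None:
            votes[hit][label] += 1
    return {voxel: counts.most_common(1)[0][0] for voxel, counts in votes.items()}


# Two labelled pixels and two occupied voxels about 2 m in front of the camera.
intrinsics = (100.0, 100.0, 50.0, 50.0)          # fx, fy, cx, cy (assumed values)
occupied = {(0, 0, 20), (2, 0, 20)}
labelled_pixels = {(50, 50): "chair", (60, 50): "table"}

voxel_labels = project_labels(labelled_pixels, occupied, intrinsics)
print(voxel_labels)
```

A half-voxel step keeps the march from skipping over a voxel along these near-axis rays; a production system would more likely use an exact grid traversal and cast rays against the mesh itself.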

The result is a map of a robot's environment that resembles a dense, three-dimensional mesh, where each face is color-coded as part of the objects, structures, and people within the environment.

If a robot were to rely on this mesh alone to navigate through its environment, it would be a computationally expensive and time-consuming task. So the researchers developed algorithms to construct 3D dynamic "scene graphs" from Kimera's initial, highly dense, 3D semantic mesh. These algorithms abstract, or break down, Kimera's detailed 3D semantic mesh into distinct semantic layers, such that a robot can "see" a scene through a particular layer, or lens. This layered representation spares a robot from having to make sense of billions of points and faces in the original 3D mesh. Within the layer of objects and people, the researchers have also developed algorithms that track the movement and the shape of humans in the environment in real time.
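To give a feel for this kind of abstraction, the sketch below collapses labeled mesh vertices into one compact summary per label: a centroid, a bounding box, and a vertex count. It assumes, for simplicity, that each label corresponds to a single object, which a real system would not. The function name and the sample vertices are invented for the example, and the researchers' actual algorithms are not reproduced here.

```python
from collections import defaultdict


def summarise_mesh(vertices):
    """Collapse labelled vertices (x, y, z, label) into one node per label."""
    groups = defaultdict(list)
    for x, y, z, label in vertices:
        groups[label].append((x, y, z))

    nodes = {}
    for label, points in groups.items():
        xs, ys, zs = zip(*points)
        n = len(points)
        nodes[label] = {
            "centroid": (sum(xs) / n, sum(ys) / n, sum(zs) / n),
            "bbox_min": (min(xs), min(ys), min(zs)),
            "bbox_max": (max(xs), max(ys), max(zs)),
            "vertex_count": n,
        }
    return nodes


# A tiny stand-in for a dense semantic mesh.
mesh_vertices = [
    (0.0, 0.0, 0.0, "chair"),
    (2.0, 0.0, 0.0, "chair"),
    (1.0, 3.0, 0.0, "chair"),
    (4.0, 4.0, 1.0, "table"),
    (6.0, 4.0, 1.0, "table"),
]

object_layer = summarise_mesh(mesh_vertices)
print(object_layer["chair"]["centroid"], object_layer["table"]["bbox_max"])
```

However many vertices the mesh contains, the object layer holds only one small record per object, which is what makes reasoning at that level cheap.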

This essentially enables robots to build mental models similar to the ones humans use. It is expected to impact many applications, including self-driving cars, search and rescue, collaborative manufacturing, and domestic robots.

References
Robotics: Science and Systems, July 12-16, 2020, "3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans," by Antoni Rosinol, et al. © 2020 RSS. All rights reserved.

To view or purchase this article, please visit:
https://www.researchgate.net/publication/342881852_3D_Dynamic_Scene_Graphs_Actionable_Spatial_Perception_with_Places_Objects_and_Humans

To view a related video from the Massachusetts Institute of Technology, please visit:
https://www.youtube.com/watch?v=SWbofjhyPzI
