Carnegie Mellon University
July 30, 2024

3D-LFM: Lifting Foundation Model

Researchers: Mosam Dabhi, Laszlo Jeni, and Simon Lucey

By Ashlyn Lacovara

Imagine a piece of software capable of looking at a flat, 2D image and intuitively understanding its 3D form, even if it has never seen anything like it before. That is exactly what CMU researchers Mosam Dabhi, Laszlo Jeni, and Simon Lucey have developed: the 3D Lifting Foundation Model (3D-LFM). A lifting model is a sophisticated computer vision technology that transforms 2D images into intricate 3D structures without relying on category-specific knowledge.

What sets 3D-LFM apart from traditional computer vision (CV) models is its ability to reconstruct objects it never encountered during training. Traditional models require extensive training on specific subjects or features to accurately recognize and detect objects, and they depend heavily on that prior knowledge and training data to function correctly. By contrast, 3D-LFM is designed to reconstruct new objects autonomously, showcasing a higher level of adaptability and intelligence.

During their research, the team evaluated the model on various categories sourced from the internet, including random pictures as well as videos from OpenAI's Sora. To thoroughly assess the model's capabilities, they conducted over 30 experiments, examining 3D-LFM's performance across diverse landmarks and unique background settings. The results are illustrated in the accompanying images. The three images in the carousel on the left are GIFs that show 3D-LFM working on random categories from the internet, while the image on the right displays static 3D reconstructions. In the rightmost image, red denotes the ground truth, while blue represents the predictions made by 3D-LFM.

[Images: 3dlfm.gif, 3dlfmstill.png]

This technology opens up new horizons for applications across various fields, including robotics, augmented reality, and beyond. Its ability to seamlessly interpret and recreate unknown objects positions 3D-LFM as a revolutionary tool in the ongoing evolution of computer vision technology.

You can dive deeper into this research by using this link: link