Apple Vision Pro. Is it a game changer for 3D?


Apple Vision Pro is the all-new device Apple will release next year. In some ways it resembles existing headsets, yet in others it is something entirely new.

Basically, Apple Vision Pro is an AR (Augmented Reality) device, but built with highly advanced VR (Virtual Reality) technology. This sidesteps the big drawbacks that keep AR headsets from unleashing their full potential, making it the best AR headset. At any moment the device can turn into the sharpest VR headset; it can be either one, or mix them both. It is the ultimate Mixed Reality device.

Like the HoloLens 2 (the most advanced AR headset until now), the headset presents an interface above the real world. But the Vision Pro interface is flatter than the HoloLens one, relying mainly on 2D interfaces; icons have depth layers, but they remain essentially 2D. The first HoloLens already had a fully 3D interface, with 3D icons and tools presented on an angled canvas instead of a completely flat one. In Vision Pro, the user can move each app's canvas to the sides, and it will rotate a bit, but it will still look flat. Instead, 3D is used to project realistic shadows and illumination onto the room.

Being a VR headset at its core, it offers the possibility of replacing the real world and gaining immersion with a simple turn of a digital crown. This lets you choose how much of the virtual or real world you want to see.

AR or VR? Why not a transition between the two? You can adjust the mix of real and virtual environments with the digital crown.

Interaction with apps is done with voice, eye tracking, and mid-air finger gestures. Looking at icons or interface options slightly enlarges them or makes them appear bolder. First hands-on reports describe a smooth experience: eye tracking and gestures are intuitive, and the interface worked impressively smoothly for a prototype (most experts estimate 120 Hz), with no sign of lag in interactions.

The device is smart enough to detect when another person is in sight, interacting with you. At that moment, an external lenticular multiview 3D display (yes, a 3D display on the outside of the device) shows the other person the user's eyes in 3D. People will think they are looking at your real eyes through glass, but they are looking at a 3D display. Let's hope Apple also uses the glasses-free 3D display for other indications while you're using the device (like 3D icons or emojis, or selectable "do not disturb" warnings).

The EyeSight feature lets the user interact with the other person naturally, without having to remove the headset. When someone approaches to hand you something, the interface vanishes around that person. Everything is automatic.

It looks like transparent glass, but the user's face and eyes are shown in 3D on a lenticular multiview external display.

The ability to see the real world at an unprecedented resolution lets you use real keyboards with the system or interact with other displays, like your smartphone, without any trouble reading small text. The first thing you notice when you put on the headset is the impressive resolution, much better than any other headset's. The video pass-through appeared to have zero latency and was sharp, crisp, smooth, and clear. When you look at a Mac, its programs automatically jump off the screen and fill the real world so you can interact with them better. But, unlike with HoloLens, the interface stays the same: only 2D windows. Interaction also works with iPhone and iPad, but we don't know how it will look.

The headset can capture spatial photos and videos, and Apple says Vision Pro is its first 3D camera, but that's not true: TrueDepth 3D cameras are integrated into many Apple devices. You can also capture images with a depth map using the back cameras of most iPhone models, including recording video in ProRes mode. We have even been using the iPhone 11 Pro as a traditional two-lens 3D camera almost since its launch. Those who wore the headset said the spatial video captured with Vision Pro looks incredible.

You can take spatial photos and videos with the touch of a button

In addition to spatial media, you can watch panoramas taken with your iPhone in an immersive mode (in 3D?). But they didn't mention photos captured in portrait mode, which contain a depth map for reconstructing the depth of the image (numerous apps already do this); it's strange that they don't take advantage of it. At least Leia has confirmed it will bring Leia's LIF 3D pictures to the Vision Pro (maybe even LeiaLink will do the job from iPhone to Vision Pro).
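As a rough illustration of what such a depth map enables, here is a minimal, hypothetical sketch (not Apple's or Leia's actual pipeline) that synthesizes a stereo pair from one image plus its depth map by shifting pixels horizontally in proportion to depth, a simplified form of depth-image-based rendering. The arrays stand in for an already-decoded portrait photo:

```python
import numpy as np

def stereo_pair_from_depth(rgb, depth, max_shift=8):
    """Synthesize left/right views from an RGB image and its depth map
    by shifting pixels horizontally in proportion to depth (simplified
    depth-image-based rendering).

    rgb       : (H, W, 3) uint8 array
    depth     : (H, W) float array, 0.0 = far, 1.0 = near
    max_shift : disparity in pixels for the nearest objects
    """
    h, w, _ = rgb.shape
    # Disparity grows with proximity: near pixels shift the most.
    disparity = (depth * max_shift).astype(int)
    cols = np.arange(w)
    left = np.zeros_like(rgb)
    right = np.zeros_like(rgb)
    for y in range(h):
        # Forward-warp each row; clip shifted columns to the image.
        lx = np.clip(cols + disparity[y], 0, w - 1)
        rx = np.clip(cols - disparity[y], 0, w - 1)
        left[y, lx] = rgb[y]
        right[y, rx] = rgb[y]
    return left, right

# Tiny synthetic example: a 4x8 image whose right half is "near".
rgb = np.arange(4 * 8 * 3, dtype=np.uint8).reshape(4, 8, 3)
depth = np.zeros((4, 8))
depth[:, 4:] = 1.0
left, right = stereo_pair_from_depth(rgb, depth, max_shift=2)
```

A real implementation would also fill the occlusion holes the warp leaves behind (here they simply stay black) and smooth the disparity map.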

Likewise, it is also strange that FaceTime calls only work in 3D between Vision Pro users, not with users of devices with a TrueDepth camera. First impressions of FaceTime in 3D are as impressive as spatial videos. The system uses a prior scan of your face and then applies your expressions to that model in real time.
Apple devices have used front TrueDepth cameras for years, so they could technically send a 3D image from those devices to any FaceTime call.

With your face previously scanned, FaceTime sends a 3D reconstruction of your face, mimicking your gestures and expressions in real time.

Spatial images and videos appear inside a resizable rectangular window, but it seems you can't fill your entire space with them the way you can with panoramas.

Judging by how spatial videos and panoramas are presented, it's fairly certain you won't be able to see very many angles of the image (nor will you need to).

Let’s talk about movies. You can watch a movie and your room will darken, with realistic reflections of the movie’s images, as if there were a real display in your room. Or you can completely replace your environment with a virtual one, simply using the digital crown to adjust how much of the real and virtual environments is visible.

Now, the important part. With Vision Pro, you can watch 3D movies with “incredible depth and crisp motion” […] They explained this while showing images from Avatar: The Way of Water. So it seems we will finally be able to watch Avatar 2 in 3D (and HFR?) at home.

Beyond traditional video, they showed Apple Immersive Video (180-degree 3D 8K recordings with Spatial Audio): NBA games, soccer, concerts, documentaries, and so on. This content, made specifically for Vision Pro, allows, for example, a giant screen with a Jurassic environment on your wall, from which a dinosaur emerges and enters your room, all in stereoscopic 3D, of course.

Then they mentioned games, with 100 Apple Arcade titles available at launch; unfortunately, they appear to be 2D. Let’s hope developers can make games display in 3D on the virtual screen, instead of being played on a virtual 2D screen. Games can be played with controllers, such as the Xbox and DualSense controllers.

Using a 3D device to play games in 2D… a lost opportunity!

Disconnected from the segment about 3D movies, Disney showcased its Disney+ app for Vision Pro. But instead of offering its content in 3D, it has chosen to add extra (and very distracting) elements around the program you’re watching. If you’re watching 2D content and 3D elements and environments are added outside it, the viewer will pay more attention to those distractions than to the actual movie or series.

The 3D environment will steal attention from the flat 2D content you’re watching.

If only they would use the new Leia Media SDK we talked about in our previous article… They could instantly bring their shows to 3D, instead of adding 3D distractions. Disney didn’t even mention anything about its catalog of 3D movies… a total disappointment. Fortunately, Apple is bringing us 3D movies through its store.

The external glasses-free 3D display

The device packs two micro-OLED displays, each beyond 4K, with 64 times the pixel density of the iPhone’s Retina display. That’s 23 million pixels, triple the resolution of current VR headsets. The sound is also 3D: spatial audio takes your surroundings into account, using audio ray tracing to make the sound realistic for your room. All the cameras and sensors are driven by two processors: the well-known, powerful M2, and a new R1 specialized in processing all the spatial data within 12 milliseconds, so interactions have no lag.

Everything is private on the device: iris recognition authenticates the user just like Face ID does, and, just like Face ID, all the information is stored securely on the device only. No information is sent to any server. Apps cannot access your biometric data, nor can they capture or map your room. Everything is processed locally, and apps have no access to the data captured by the sensors and cameras; they can work without knowing the real environment the device is processing. Websites don’t even know where you’re looking until you virtually click a button (Chrome and Windows users are tracked even by mouse movements).

The operating system, called visionOS, is designed from the ground up; it manages real-time data, spatial audio, a multi-app 3D engine, and spatial frameworks, along with a foveated renderer and the traditional iOS frameworks. Developers can use Apple’s existing development tools, SwiftUI, Xcode, ARKit, RealityKit, and the new Reality Composer Pro (which allows environments to be simulated), as well as Unity (with access to all features). Third parties can also use the Apple spatial video format.

Example uses include, of course, visualization of animated and interactive 3D models (even at huge scale), a spatial version of djay with 3D mixers and 3D buttons for effects, and a virtual planetarium on the ceiling of your room.

Launch is expected in early 12024 (Holocene calendar), so there’s still time to improve things, but we have mixed feelings about what it offers. The purpose of this expensive first generation is to give developers time to create a good ecosystem of apps and solutions, while Apple works to improve the size, weight, battery, and, above all, price of successive generations.

Our conclusion is optimistic, but Apple should take even more advantage of the 3D information its devices already process. It should use the depth maps from portrait-mode photos and TrueDepth cameras, and interfaces should be truly spatial and use 3D objects. Another obvious, welcome addition would be using the polygonal information in video games to render them in stereoscopic 3D, or using a second camera (Unity is already capable of both).
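For the second-camera idea, the standard technique is to render the scene twice, with the eye positions offset horizontally and asymmetric (off-axis) projection frustums. The sketch below is illustrative only, with made-up function names rather than Unity’s actual API, and the sign convention for the frustum shift is one common choice among several:

```python
import numpy as np

def frustum(l, r, b, t, n, f):
    """OpenGL-style (possibly asymmetric) perspective projection matrix."""
    m = np.zeros((4, 4))
    m[0, 0] = 2 * n / (r - l); m[0, 2] = (r + l) / (r - l)
    m[1, 1] = 2 * n / (t - b); m[1, 2] = (t + b) / (t - b)
    m[2, 2] = -(f + n) / (f - n); m[2, 3] = -2 * f * n / (f - n)
    m[3, 2] = -1.0
    return m

def stereo_projections(fov_y_deg, aspect, near, far, ipd, convergence):
    """Left/right off-axis projections for eyes offset by +/- ipd/2,
    converging on a plane `convergence` units in front of the viewer."""
    top = near * np.tan(np.radians(fov_y_deg) / 2)
    right = top * aspect
    # How far each frustum slides sideways so the two views line up
    # exactly at the convergence plane.
    shift = (ipd / 2) * near / convergence
    left_eye = frustum(-right + shift, right + shift, -top, top, near, far)
    right_eye = frustum(-right - shift, right - shift, -top, top, near, far)
    return left_eye, right_eye

# Example: 64 mm interpupillary distance, converging 2 m away.
left_m, right_m = stereo_projections(60.0, 16 / 9, 0.1, 100.0, 0.064, 2.0)
```

Each projection is paired with a view matrix whose camera is translated by -ipd/2 (left) or +ipd/2 (right) along the camera’s x-axis; the engine then presents the two renders to the two eye displays.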

Now is Microsoft’s opportunity to take advantage of its fully 3D ecosystem for HoloLens; ever since the first HoloLens there have been very few 2D elements in the system, and most things are rendered in 3D. Microsoft could present a revolutionary HoloLens 3, transitioning its AR hardware to a high-resolution VR headset, so it can match Apple’s hardware possibilities but with its perfected 3D ecosystem. Microsoft is the pioneer here: the first HoloLens appeared back in 2015, and its interfaces were already using more 3D elements than the new Vision Pro.

That way, there will be a friendly war between the two platforms, and users will benefit from both brands’ efforts to outdo each other. Android devices will also try to enter that war (with better approaches than today’s headsets, which are already obsolete after this presentation).

The information in this article was carefully curated from official data and from several first-hand impressions by well-known XR experts who have already tested the device.

