Search — rendering
Issues
21 matches
- github:google-deepmind/mujoco · 5/14/2026 · rendering
Request to add native point cloud import and visualization (e.g., PLY/PCD) to the MuJoCo viewer as an efficient vertex rendering overlay. Current workaround renders each vertex as a sphere, which is inefficient for ground-truth verification.
Tags: mujoco, point-cloud, pcd, ply, viewer, debugging, rendering
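The sphere-per-vertex workaround can at least stay interactive if the cloud is decimated and the markers are written into the passive viewer's user scene instead of the model. A minimal sketch using the documented `mujoco.viewer` / `mjv_initGeom` API; the point-cloud file and decimation factor are illustrative, and PLY/PCD parsing is left out:

```python
import numpy as np
import mujoco
import mujoco.viewer

model = mujoco.MjModel.from_xml_string("<mujoco><worldbody/></mujoco>")
data = mujoco.MjData(model)

points = np.load("cloud_xyz.npy")  # hypothetical (N, 3) array in meters
points = points[::50]              # decimate: one marker per 50 vertices

with mujoco.viewer.launch_passive(model, data) as viewer:
    n = min(len(points), viewer.user_scn.maxgeom)
    for i in range(n):
        mujoco.mjv_initGeom(
            viewer.user_scn.geoms[i],
            type=mujoco.mjtGeom.mjGEOM_SPHERE,
            size=np.array([0.005, 0, 0]),   # 5 mm marker radius
            pos=points[i].astype(np.float64),
            mat=np.eye(3).ravel(),
            rgba=np.array([0.2, 0.8, 0.2, 1.0], dtype=np.float32),
        )
    viewer.user_scn.ngeom = n
    while viewer.is_running():
        mujoco.mj_step(model, data)
        viewer.sync()
```

Even decimated, this is exactly the per-geom overhead the request wants replaced by a true vertex-buffer overlay.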
- github:newton-physics/newton · 5/13/2026 · rendering
The cloth_franka example simulates in centimeters but only partially converts data back to meters for visualization. Debug overlays like COM markers and joint/contact arrows appear meters away due to a cm→m mismatch.
Tags: rendering, newton
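The mismatch above is a one-sided unit conversion: the cloth mesh gets scaled back to meters but the overlay anchor points do not, so markers land 100x too far from the body. A sketch of the obvious guard, with a hypothetical renderer interface (none of these names are Newton's API):

```python
import numpy as np

CM_TO_M = 0.01  # simulation integrates in centimeters; the renderer expects meters

def to_render_frame(pos_cm):
    """Convert a simulation-space position (cm) into render-space (m)."""
    return np.asarray(pos_cm, dtype=float) * CM_TO_M

def draw_debug_overlays(renderer, com_cm, contacts_cm):
    """Route every overlay anchor through the same conversion as the mesh,
    so COM markers and contact arrows cannot drift meters away from it."""
    renderer.draw_marker(to_render_frame(com_cm))              # hypothetical call
    for point_cm, force in contacts_cm:
        renderer.draw_arrow(to_render_frame(point_cm), force)  # hypothetical call
```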
- nvidia-forum:simulation · 5/13/2026 · crashes-stability
Isaac Sim is reported to crash when using RTX Sensors. The post provides no additional details beyond the crash condition.
Tags: crash, rendering, isaac-sim
- github:isaac-sim/IsaacLab · 5/13/2026 · crashes-stability
TacSL force-field readings in an Isaac Lab demo do not increase smoothly with stepped applied normal force; they look irregular instead. The reporter is on the Isaac Lab main branch with Isaac Sim 5.1. A simple ramp test is sketched below.
Tags: crash, usd, rendering, sensors, isaac-sim, isaac-lab
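The irregularity can be characterized with a force ramp: step the applied normal force and check that the aggregated force-field reading never drops by more than a noise tolerance. A sketch with hypothetical apply_force / read_force_field hooks standing in for the TacSL demo's actual calls:

```python
import numpy as np

def ramp_test(apply_force, read_force_field, forces=np.linspace(1.0, 10.0, 10)):
    """Step the applied normal force (N) and record the summed taxel reading.
    Returns the readings and whether any step decreased beyond a 5% tolerance."""
    readings = []
    for f in forces:
        apply_force(f)                                    # hypothetical: set normal force
        readings.append(float(read_force_field().sum()))  # hypothetical: per-taxel array
    readings = np.asarray(readings)
    drops = np.diff(readings) < -0.05 * np.maximum(readings[:-1], 1e-9)
    return readings, bool(drops.any())
```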
- github:newton-physics/newton · 5/13/2026 · crashes-stability
A dexterous hand imported via URDF fails to grasp and lift a bottle; the object slides and remains unliftable. The same bottle can be lifted using a Franka example, suggesting contact/friction or grasp modeling differences for the hand.
Tags: crash, usd, rendering, manipulation, isaac-lab, newton
- github:newton-physics/newton · 5/13/2026 · crashes-stability
A dexterous hand imported via URDF cannot grasp a bottle reliably; the bottle slides and cannot be lifted. The reporter notes the Franka example can lift the same object, implying a hand-specific contact/friction issue.
Tags: crash, usd, rendering, hardware, manipulation, isaac-lab, newton, warp
- github:isaac-sim/IsaacLab · 5/12/2026 · asset-pipeline
According to the report, relative texture paths do not resolve in the Isaac Lab beta even when the image sits in the same folder as the USD file; loading the asset through Isaac Lab code raises errors. A path-rewriting workaround is sketched below.
Tags: usd, rendering, hardware, isaac-sim, isaac-lab
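Until the loader is fixed, one workaround sketch is to rewrite relative asset paths into absolute ones before the stage reaches Isaac Lab, using the standard pxr API. Whether this sidesteps the reported error is an assumption; the file name is illustrative:

```python
import os
from pxr import Usd, Sdf

def absolutize_asset_paths(usd_path):
    """Rewrite every relative asset path (textures included) to an absolute
    path anchored at the USD file's own directory, then save in place."""
    stage = Usd.Stage.Open(usd_path)
    base = os.path.dirname(os.path.abspath(usd_path))
    for prim in stage.Traverse():
        for attr in prim.GetAttributes():
            if attr.GetTypeName() != Sdf.ValueTypeNames.Asset:
                continue
            asset = attr.Get()
            if asset and asset.path and not os.path.isabs(asset.path):
                attr.Set(Sdf.AssetPath(os.path.join(base, asset.path)))
    stage.GetRootLayer().Save()

absolutize_asset_paths("scene.usd")  # illustrative path
```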
- github:isaac-sim/IsaacLab · 5/11/2026 · rendering
In a CloudXR + OpenXR setup, frames stream correctly but inbound messages and hand-tracking poses are silently dropped between client and Isaac Sim’s OpenXR plugin. This blocks teleop commands and hand tracking for interactive workflows.
Tags: rendering, hardware, deployment, integration, isaac-sim, isaac-lab
- github:isaac-sim/IsaacLab · 5/11/2026 · crashes-stability
In Isaac Lab v3.0.0-beta, lift_cube_sm.py ignores the --viz Kit option, and no Kit/Isaac Sim window opens even though the process keeps running. A one-line change to AppLauncher initialization appears to fix it locally (see the sketch below).
Tags: crash, rendering, hardware, docs, isaac-sim, isaac-lab
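The issue does not spell out the one-line fix, but for reference the documented Isaac Lab pattern is to let AppLauncher register its own CLI flags and then construct it from the full parsed namespace, so launcher-level options actually reach Kit. A sketch of that pattern (the --viz flag itself belongs to the script and is not reproduced here):

```python
import argparse
from isaaclab.app import AppLauncher

parser = argparse.ArgumentParser(description="lift-cube state machine (sketch)")
# ... script-specific arguments would be added here ...
AppLauncher.add_app_launcher_args(parser)  # registers headless/livestream/etc. flags
args_cli = parser.parse_args()

# Passing the complete namespace is what lets launcher-level options take
# effect; rebuilding or filtering the namespace can silently drop them,
# which would match the symptom of no Kit window ever opening.
app_launcher = AppLauncher(args_cli)
simulation_app = app_launcher.app
```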
- github:isaac-sim/IsaacLab · 5/9/2026 · training-infra · Tags: rl, rendering, hardware, docs, integration, isaac-sim, isaac-lab
- github:isaac-sim/IsaacSim · 5/8/2026 · training-infra · Tags: rl, usd, rendering, hardware, deployment, locomotion, isaac-sim, unitree
- github:newton-physics/newton · 5/8/2026 · rendering · Tags: rendering, newton
- kamino_basic_heterogeneous: rigid box exhibits collision glitches after settling on platform · Friction · github:newton-physics/newton · 5/8/2026 · crashes-stability · Tags: crash, usd, rendering, hardware, manipulation, mujoco, newton, warp
- github:newton-physics/newton · 5/8/2026 · rendering · Tags: rendering, hardware, mujoco, newton, warp
- github:isaac-sim/IsaacSim · 5/8/2026 · asset-pipeline · Tags: usd, rendering, hardware, sensors, isaac-sim
- github:isaac-sim/IsaacSim · 5/6/2026 · rendering · Tags: rendering, hardware, docs, isaac-sim, isaac-lab
- github:NVIDIA/warp · 5/6/2026 · rendering · Tags: rendering, hardware, warp
- github:NVIDIA/warp · 5/5/2026 · rendering · Tags: rendering, hardware, integration, warp
- github:isaac-sim/IsaacSim · 5/5/2026 · rendering · Tags: rendering, hardware, deployment, isaac-sim
- github:isaac-sim/IsaacSim · 5/4/2026 · crashes-stability · Tags: crash, rendering, hardware, deployment, integration, isaac-sim
- start crash · Pain · github:isaac-sim/IsaacSim · 5/3/2026 · crashes-stability · Tags: crash, rendering, hardware, integration, isaac-sim
Papers
11 matches
- DiffPhD: A Unified Differentiable Solver for Projective Heterogeneous Materials in Elastodynamics with Contact-Rich GPU-Acceleration · arXiv:2605.14526 · 5/14/2026 · Shih-Yu Lai, Sung-Han Tien, Jui-I Huang, Yen-Chen Tseng …
Differentiable simulation of soft bodies is a foundation for system identification, trajectory optimization, and Real2Sim transfer. Yet, existing methods such as the differentiable Projective Dynamics (DiffPD) struggle when faced with heterogeneous materials with extreme stiffness contrasts, hyperelasticity under large deformations, and contact-rich interactions, which are common scenarios in the real world. We present DiffPhD, a unified GPU-accelerated differentiable Projective Dynamics framework for heterogeneous materials that tackles these intertwined challenges simultaneously. Our key insight is a careful integration of: (i) stiffness-aware projective weights to embed heterogeneity into the global system; (ii) trust-region eigenvalue filtering lifted to the backward pass for stable hyperelastic gradients and a type-II Anderson Acceleration scheme with dual-gate convergence to stabilize forward iteration under large stiffness contrasts; and (iii) a unified GPU pipeline that reuses a single sparse factor across forward, backward, and contact computations, with stiffness-amplified Rayleigh damping folded into the same factor for heterogeneity-aware dissipation at zero recurring cost. DiffPhD achieves strict gradient accuracy while delivering up to an order-of-magnitude speedup over prior differentiable solvers on heterogeneous, hyperelastic, contact-rich benchmarks. Crucially, this speedup does not come at the cost of stability: DiffPhD remains convergent on stiffness contrasts up to 100x where prior PD solvers degrade. This unlocks end-to-end gradient-based optimization on regimes previously bottlenecked by either solver fragility or per-iteration cost -- shell--joint composite creatures, soft characters wielding stiff weapons, and soft-gripper robotic manipulation -- all handled within a single forward--backward pass.
Tags: rendering, manipulation, integration
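Among the ingredients listed in the abstract, the type-II Anderson Acceleration is the easiest to illustrate in isolation: it extrapolates a fixed-point iteration x = g(x) from a short history of residuals. A minimal NumPy sketch of plain type-II AA, without DiffPhD's dual-gate convergence safeguard or trust-region filtering:

```python
import numpy as np

def anderson_type2(g, x0, m=5, tol=1e-10, max_iter=200):
    """Type-II Anderson Acceleration for the fixed point x = g(x).

    Keeps the last m residuals f_i = g(x_i) - x_i, solves the small
    least-squares problem min ||sum_i a_i f_i|| subject to sum_i a_i = 1,
    and extrapolates x_{k+1} = sum_i a_i g(x_i).
    """
    X, G = [], []                      # histories of iterates and images g(x)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        gx = g(x)
        X.append(x); G.append(gx)
        if len(X) > m:
            X.pop(0); G.pop(0)
        F = np.stack([gi - xi for gi, xi in zip(G, X)], axis=1)  # residual matrix
        if np.linalg.norm(F[:, -1]) < tol:
            return gx
        k = F.shape[1]
        ones = np.ones(k)
        a0 = ones / k
        if k == 1:
            a = a0                     # no history yet: plain fixed-point step
        else:
            # Parametrize {a : sum(a) = 1} as a = a0 + N c, columns of N sum to 0.
            N = np.eye(k)[:, :-1] - np.outer(ones, ones[:-1]) / k
            c, *_ = np.linalg.lstsq(F @ N, -F @ a0, rcond=None)
            a = a0 + N @ c
        x = np.stack(G, axis=1) @ a    # extrapolated next iterate
    return x
```

For instance, `anderson_type2(np.cos, np.array([1.0]))` reaches the fixed point of cos in a handful of iterations where plain iteration needs dozens.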
- What Limits Vision-and-Language Navigation? · arXiv:2605.13328 · 5/13/2026 · Yunheng Wang, Yuetong Fang, Taowen Wang, Lusong Li …
Vision-and-Language Navigation (VLN) is a cornerstone of embodied intelligence. However, current agents often suffer from significant performance degradation when transitioning from simulation to real-world deployment, primarily due to perceptual instability (e.g., lighting variations and motion blur) and under-specified instructions. While existing methods attempt to bridge this gap by scaling up model size and training data, we argue that the bottleneck lies in the lack of robust spatial grounding and cross-domain priors. In this paper, we propose StereoNav, a robust Vision-Language-Action framework designed to enhance real-world navigation consistency. To address the inherent gap between synthetic training and physical execution, we introduce Target-Location Priors as a persistent bridge. These priors provide stable visual guidance that remains invariant across domains, effectively grounding the agent even when instructions are vague. Furthermore, to mitigate visual disturbances like motion blur and illumination shifts, StereoNav leverages stereo vision to construct a unified representation of semantics and geometry, enabling precise action prediction through enhanced depth awareness. Extensive experiments on R2R-CE and RxR-CE demonstrate that StereoNav achieves state-of-the-art egocentric RGB performance, with SR and SPL scores of 81.1% and 68.3%, and 67.5% and 52.0%, respectively, while using significantly fewer parameters and less training data than prior scaling-based approaches. More importantly, real-world robotic deployments confirm that StereoNav substantially improves navigation reliability in complex, unstructured environments. Project page: https://yunheng-wang.github.io/stereonav-public.github.io.
Tags: rendering, deployment
- MoCCA: A Movable Circle Probability of Collision Approximation · arXiv:2605.13125 · 5/13/2026 · Tobias Kern, Christian Birkner
In automated driving, crash mitigation is crucial to ensure passenger safety. Accurate avoidance requires precise knowledge of the object's position and orientation. However, sensor noise and occlusions often result in tracking and prediction uncertainties. To account for these uncertainties, estimating the Probability of Collision (POC) is a critical requirement. While Monte Carlo sampling is a common estimation technique, its high computational demand and stochastic nature often render it unsuitable for real-time applications. Analytical POC calculations are simplified by approximating vehicle geometries using circular bounds. While multi-circle approximations offer higher fidelity than a single circumscribed circle, they significantly increase computational complexity. This paper proposes a shape approximation algorithm, MoCCA, which utilizes a single circle for each vehicle, optimized to minimize the relative distance between them. MoCCA maintains a computational efficiency comparable to standard single-circle techniques while reducing over-conservatism. To address the potential underestimation of POC inherent in partial coverage, we establish an upper bound for the approximation error, demonstrating that it depends primarily on inter-vehicle distance and orientation variance. Furthermore, we introduce a safety distance margin that can be calibrated solely based on orientation variance.
Tags: crash, rendering
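The Monte Carlo baseline the abstract contrasts against is easy to state: sample the uncertain positions, bound each vehicle by a circle, and count overlaps. A minimal sketch for the single-circle case with Gaussian position uncertainty (radii and covariances are illustrative):

```python
import numpy as np

def poc_monte_carlo(mu_a, cov_a, r_a, mu_b, cov_b, r_b, n=100_000, seed=0):
    """Estimate the Probability of Collision between two circle-bounded
    vehicles whose center positions are Gaussian-distributed."""
    rng = np.random.default_rng(seed)
    ca = rng.multivariate_normal(mu_a, cov_a, size=n)
    cb = rng.multivariate_normal(mu_b, cov_b, size=n)
    dist = np.linalg.norm(ca - cb, axis=1)
    # Circles overlap when the centers are closer than the sum of radii.
    return float(np.mean(dist < r_a + r_b))

poc = poc_monte_carlo(mu_a=[0.0, 0.0], cov_a=np.diag([0.3, 0.1]), r_a=1.2,
                      mu_b=[2.5, 0.5], cov_b=np.diag([0.2, 0.2]), r_b=1.2)
```

MoCCA's contribution is then to place and size the single circle per vehicle so the approximation error stays bounded; the sketch above is only the sampling baseline it is compared against.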
- What to Ignore, What to React: Visually Robust RL Fine-Tuning of VLA Models · arXiv:2605.13105 · 5/13/2026 · Yuanfang Peng, Jingjing Fu, Chuheng Zhang, Li Zhao …
Reinforcement learning (RL) fine-tuning has shown promise for Vision-Language-Action (VLA) models in robotic manipulation, but deployment-time visual shifts pose practical challenges. A key difficulty is that standard task rewards supervise task success, but offer limited guidance on whether a visual change is task-irrelevant or changes the behavior required for manipulation. We propose PAIR-VLA (Paired Action Invariance & Sensitivity for Visually Robust VLA), an RL fine-tuning framework to address this difficulty by adding two auxiliary objectives over paired visual variants during PPO optimization: an invariance term that reduces the discrepancy between action distributions for a task-preserving pair (e.g., different distractors), and a sensitivity objective that encourages separable action distributions for a task-altering pair (e.g., target object in a different pose). Together, these objectives turn visual variants from mere observation diversity into behavior-level guidance on policy responses during RL fine-tuning. We evaluate on ManiSkill3 across two representative VLA architectures, OpenVLA and $π_{0.5}$, under diverse out-of-distribution visual shifts including unseen distractors, texture changes, target object pose variation, viewpoint shifts, and lighting changes. Our method consistently improves over standard PPO, achieving average improvements of 16.62% on $π_{0.5}$ and 9.10% on OpenVLA. Notably, ablations further show generalization across visual shifts: invariance guidance learned from distractor and texture variants transfers to target-pose and lighting shifts, while adding sensitivity guidance on target-pose variants further improves robustness to nuisance shifts, highlighting the broader transferability of behavior-level RL guidance.
Tags: rl, rendering, deployment, manipulation, vla
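The two auxiliary objectives can be sketched generically: pull action distributions together across a task-preserving visual pair, and push them apart across a task-altering pair. A PyTorch sketch over distribution-valued policy heads; the policy interface, loss weights, and the hinge/margin realization of "separable distributions" are all assumptions, not the paper's code:

```python
import torch
import torch.distributions as D

def paired_aux_losses(policy, obs, obs_preserving, obs_altering,
                      w_inv=1.0, w_sens=0.1, margin=1.0):
    """Invariance: minimize KL between action distributions across a
    task-preserving visual variant. Sensitivity: hinge the KL across a
    task-altering variant so the distributions stay separable."""
    pi = policy(obs)                 # assumed to return a torch Distribution
    pi_pres = policy(obs_preserving)
    pi_alt = policy(obs_altering)
    inv_loss = D.kl_divergence(pi, pi_pres).mean()            # pull together
    kl_alt = D.kl_divergence(pi, pi_alt).mean()
    sens_loss = torch.clamp(margin - kl_alt, min=0.0)         # push apart
    return w_inv * inv_loss + w_sens * sens_loss
```

In the paper these terms are added to the PPO objective during fine-tuning; the margin form here is one plausible reading of "encourages separable action distributions".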
- NavOL: Navigation Policy with Online Imitation Learning · arXiv:2605.11762 · 5/12/2026 · Xiaofei Wei, Chun Gu, Li Zhang
Learning robust navigation policies remains a core challenge in robotics. Offline imitation learning suffers from distribution shift and compounding errors at rollout, while reinforcement learning requires reward engineering and learns inefficiently. In this paper, we propose NavOL, an online imitation learning paradigm that interacts with a simulator and updates itself using expert demonstrations gathered online. Built upon a pretrained navigation diffusion policy that maps local observations to future waypoints, NavOL trains in a rollout-update loop: during rollout, the policy acts in the simulator and queries a global planner, which has privileged access to the global environment, for the optimal path segment to use as ground-truth trajectory labels; during update, the policy is trained on the online-collected observation-trajectory pairs. This online imitation loop removes the need for reward design, improves learning efficiency, and mitigates distribution shift by training on the policy's own explored rollouts. Built on IsaacLab with fast, high-fidelity parallel rendering and domain randomization of camera pose and start-goal pairs, our system scales across 50 scenes on 8 RTX 4090 GPUs, collecting over 2,000 new trajectories per hour, each averaging more than 400 steps. We also introduce an indoor visual navigation benchmark with predefined start and goal positions for zero-shot generalization. Extensive evaluations on simulation benchmarks, including the NavDP benchmark and our proposed benchmark, as well as carefully designed real-world experiments, demonstrate the effectiveness of NavOL, showing consistent performance gains in online imitation learning.
Tags: sim2real, rl, rendering, sensors, isaac-lab
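The rollout-update loop the abstract describes is essentially DAgger with a privileged global planner as the expert. A schematic sketch; every interface name here is hypothetical, and the optimizer calls assume a PyTorch-style API:

```python
def navol_loop(policy, sim, global_planner, buffer, optimizer, n_rounds=100):
    """Online imitation: roll out the current policy, label each visited
    state with the privileged planner's optimal path segment, then update."""
    for _ in range(n_rounds):
        obs = sim.reset()
        for _ in range(sim.max_steps):                        # rollout phase
            waypoints = policy.act(obs)                       # policy proposes waypoints
            expert_segment = global_planner.query(sim.state)  # privileged label
            buffer.add(obs, expert_segment)
            obs, done = sim.step(waypoints)  # act with the *policy*, so data
            if done:                         # covers its own visitation distribution
                break
        for batch in buffer.sample_batches():                 # update phase
            loss = policy.imitation_loss(batch)               # e.g. diffusion loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```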
- Introducing Environmental Constraints to Grasping Strategies for Paper-Like Flexible Materials Using a Soft Gripper · arXiv:2605.11714 · 5/12/2026 · Yi Dong, Yang Li, Jinjun Duan, Zhendong Dai
Robotic manipulation of flexible objects is widely required in both industrial and service applications. Among such objects, paper-like materials exhibit distinct mechanical characteristics compared to cloth, being more sensitive to compressive stress, where minor variations in physical properties can significantly affect grasping. This study systematically investigates grasping strategies for paper-like materials using a universal soft gripper by exploiting environmental constraints. Based on manipulation primitives employed in existing grasping strategies, we propose systematic grasping strategies for flexible materials that exploit environmental constraints, and we analyze their mechanical and kinematic models. To investigate the influence of materials and working conditions on grasping, an evaluation system measuring grasping force and success rate was defined and experimentally validated. Finally, we summarize the specific workspaces and characteristics of the different strategies, which can satisfy various task requirements and lead to potential applications in household service robots for grasping planar flexible objects.
Tags: rendering, manipulation
- JACoP: Joint Alignment for Compliant Multi-Agent Prediction · arXiv:2605.11385 · 5/11/2026 · Qingze Liu, Alen Mrdovic, Danrui Li, Mathew Schwartz …
Stochastic Human Trajectory Prediction (HTP) using generative modeling has emerged as a significant area of research. Although state-of-the-art models excel in optimizing the accuracy of individual agents, they often struggle to generate predictions that are collectively compliant, leading to output trajectories marred by social collisions and environmental violations, thus rendering them impractical for real-world applications. To bridge this gap, we present JACoP: Joint Alignment for Compliant Multi-Agent Prediction, an innovative multi-stage framework that ensures scene-level plausibility. JACoP incorporates an Anchor-Based Agent-Centric Profiler for effective initial compliance filtering and employs a Markov Random Field (MRF) based aligner to formalize the joint selection for scene predictions. By representing inter-agent spatial and social costs as MRF energy potentials, we successfully infer and sample from the joint trajectory distribution, achieving prediction with optimal scene compliance. Comprehensive experiments show that JACoP not only achieves competitive accuracy, but also sets a new standard in reducing both environmental violations and social collisions, thereby confirming its ability to produce collectively feasible and practically applicable trajectory predictions.
Tags: crash, rendering, multi-agent
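JACoP's joint selection step can be shown at toy scale: with K candidate trajectories per agent and pairwise collision costs as the energy potentials, choose one candidate per agent to minimize the total energy. The sketch below brute-forces the joint space rather than running MRF inference, so it is only viable for a handful of agents:

```python
import itertools
import numpy as np

def collision_cost(traj_a, traj_b, radius=0.5):
    """Count timesteps where two (T, 2) trajectories come within a radius."""
    return float((np.linalg.norm(traj_a - traj_b, axis=1) < radius).sum())

def min_energy_selection(candidates):
    """candidates: list of (K, T, 2) arrays, one per agent.
    Returns the chosen candidate index for each agent."""
    n_agents = len(candidates)
    best, best_idx = np.inf, None
    for idx in itertools.product(*(range(len(c)) for c in candidates)):
        energy = 0.0
        for i in range(n_agents):          # sum pairwise energy potentials
            for j in range(i + 1, n_agents):
                energy += collision_cost(candidates[i][idx[i]],
                                         candidates[j][idx[j]])
        if energy < best:
            best, best_idx = energy, idx
    return best_idx
```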
- MAGS-SLAM: Monocular Multi-Agent Gaussian Splatting SLAM for Geometrically and Photometrically Consistent Reconstruction · arXiv:2605.10760 · 5/11/2026 · Zhihao Cao, Qi Shao, Shuhao Zhai, Jing Zhang …
Collaborative photorealistic 3D reconstruction from multiple agents enables rapid large-scale scene capture for virtual production and cooperative multi-robot exploration. While recent 3D Gaussian Splatting (3DGS) SLAM algorithms can generate high-fidelity real-time mapping, most existing multi-agent Gaussian SLAM methods still rely on RGB-D sensors to obtain metric depth and simplify cross-agent alignment, which limits deployment on lightweight, low-cost, or power-constrained robotic platforms. To address this challenge, we propose MAGS-SLAM, the first RGB-only multi-agent 3DGS SLAM framework for collaborative scene reconstruction. Each agent independently builds local monocular Gaussian submaps and transmits compact submap summaries rather than raw observations or dense maps. To facilitate robust collaboration in the presence of monocular scale ambiguity, our framework integrates compact submap communication, geometry- and appearance-aware loop verification, and occupancy-aware Gaussian fusion, enabling coherent global reconstruction without active depth sensors. We further introduce the ReplicaMultiagent Plus benchmark for evaluating collaborative Gaussian SLAM. Extensive experiments on synthetic and real-world datasets show that MAGS-SLAM achieves competitive tracking accuracy and comparable or superior rendering quality to state-of-the-art RGB-D collaborative Gaussian SLAM methods while relying only on RGB images.
Tags: rendering, deployment, perception, multi-agent
- PaMoSplat: Part-Aware Motion-Guided Gaussian Splatting for Dynamic Scene Reconstruction · arXiv:2605.10307 · 5/11/2026 · Yinan Deng, Jianyu Dou, Jiahui Wang, Jingyu Zhao …
Dynamic scene reconstruction represents a fundamental yet demanding challenge in computer vision and robotics. While recent progress in 3DGS-based methods has advanced dynamic scene modeling, obtaining high-fidelity rendering and accurate tracking in scenarios with substantial, intricate motions remains significantly challenging. To address these challenges, we propose PaMoSplat, a novel dynamic Gaussian splatting framework incorporating part awareness and motion priors. Our approach is grounded in two key observations: 1) Parts serve as primitives for scene deformation, and 2) Motion cues from optical flow can effectively guide part motion. Specifically, PaMoSplat initializes by lifting multi-view segmentation masks into 3D space via graph clustering, establishing coherent Gaussian parts. For subsequent timestamps, we leverage a differential evolutionary algorithm to estimate the rigid motion of these parts using multi-view optical flow cues, providing a robust warm-start for further optimization. Additionally, PaMoSplat introduces an adaptive iteration count mechanism, internal learnable rigidity, and flow-supervised rendering loss to accelerate and optimize the training process. Comprehensive evaluations across diverse scenes, including real-world environments, demonstrate that PaMoSplat delivers superior rendering quality, improved tracking precision, and faster convergence compared to existing methods. Furthermore, it enables multiple part-level downstream applications, such as 4D scene editing.
Tags: rendering, perception
- SceneFactory: GPU-Accelerated Multi-Agent Driving Simulation with Physics-Based Vehicle Dynamics · arXiv:2605.08528 · 5/8/2026 · Yicheng Zhu, Yang Chen, Tao Li, Zilin Bian
Autonomous-driving simulators typically trade physical fidelity for scalable parallelism. Physics-based platforms such as CARLA and MetaDrive provide articulated vehicle dynamics and contact, but their non-vectorized interfaces make batched training difficult. GPU-batched systems such as Waymax and GPUDrive scale to hundreds of scenarios by replacing rigid-body physics with simplified kinematic models, omitting tire--road interaction, suspension, contact dynamics, and road-condition-dependent friction. We introduce SceneFactory, a GPU-vectorized platform for procedural scene construction, physics-based multi-agent simulation, and RL in autonomous-driving environments. Built on NVIDIA Isaac Sim + Isaac Lab, SceneFactory represents worlds and agents as batched tensors: control, observations, rewards, resets, and policy inference run as GPU tensor operations over the Isaac Lab tensor API. SceneFactory converts Waymo Open Motion Dataset road topologies into simulation-ready USD worlds, runs many worlds concurrently on one GPU, populates each with multiple articulated PhysX vehicles, and maps precipitation and road-surface type to PhysX material friction coefficients. With GPU vectorization, SceneFactory achieves up to 127$\times$ higher throughput than a non-vectorized PhysX baseline on the same GPU and physics solver, reaching 19,250 controlled-agent simulation steps per second at 256 worlds $\times$ 16 agents. Cross-simulator transfer reveals an asymmetric dynamics gap: physics-grounded RL policies transfer to a simplified kinematic bicycle model with 99.5% success, whereas reverse transfer drops to 47.3%. Under wet-road friction, friction-aware policies reduce mean peak DRAC from 58.7 to 27.8 m/s$^2$ without sacrificing goal reach. SceneFactory shows that scalable autonomous-driving training need not discard articulated rigid-body dynamics or physically grounded road-condition variation.
Tags: crash, rl, usd, rendering, multi-agent, isaac-sim, isaac-lab
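The road-condition variation in the abstract reduces to a lookup from (surface, precipitation) to a friction coefficient applied per batched world. A sketch of that mapping; the coefficient values are typical textbook ranges, not numbers from the paper:

```python
# Illustrative friction lookup: (surface, precipitation) -> coefficient.
# The values SceneFactory actually writes into PhysX materials are not
# given in the abstract; these are common textbook ranges.
FRICTION = {
    ("asphalt", "dry"): 0.9,
    ("asphalt", "rain"): 0.6,
    ("concrete", "dry"): 0.85,
    ("concrete", "rain"): 0.55,
    ("gravel", "dry"): 0.6,
    ("gravel", "rain"): 0.45,
}

def material_friction(surface: str, precipitation: str) -> float:
    """Resolve the friction coefficient for one world's road material."""
    return FRICTION.get((surface, precipitation), 0.7)  # conservative default

# Per-world batch, e.g. to fill a (num_worlds,) buffer consumed by the sim:
conditions = [("asphalt", "rain"), ("gravel", "dry"), ("concrete", "rain")]
mu = [material_friction(s, p) for s, p in conditions]
```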
- A Hybrid Approach for Closing the Sim2real Appearance Gap in Game Engine Synthetic Datasets · arXiv:2605.02291 · 5/4/2026 · Stefanos Pasios
Video game engines have been an important source for generating large volumes of visual synthetic datasets for training and evaluating computer vision algorithms that are to be deployed in the real world. While the visual fidelity of modern game engines has been significantly improved with technologies such as ray-tracing, a notable sim2real appearance gap between the synthetic and the real-world images still remains, which limits the utilization of synthetic datasets in real-world applications. In this letter, we investigate the ability of a state-of-the-art image generation and editing diffusion model (FLUX.2-4B Klein) to enhance the photorealism of synthetic datasets and compare its performance against a traditional image-to-image translation model (REGEN). Furthermore, we propose a hybrid approach that combines the strong geometry and material transformations of diffusion-based methods with the distribution-matching capabilities of image-to-image translation techniques. Through experiments, it is demonstrated that REGEN outperforms FLUX.2-4B Klein and that by combining both FLUX.2-4B Klein and REGEN models, better visual realism can be achieved compared to using each model individually, while maintaining semantic consistency. The code is available at: https://github.com/stefanos50/Hybrid-Sim2Real
Tags: sim2real, rendering