June 30, 2025
Shorten Automotive Design Cycle with High Fidelity Machine Learning Solutions

Upstream team
CFD
Machine learning (ML) is transforming many industries, and automotive aerodynamics in particular, where it enables a shift left in the design cycle. With near real-time predictions, designers and engineers can quickly assess the aerodynamic effect of geometry changes and run fast optimisations. Because ML models process large amounts of data quickly and return immediate approximations, they are a valuable tool for improving aerodynamic performance. This faster analysis not only shortens design timelines but also brings aerodynamic considerations into the process earlier, reinforcing the synergy between aesthetic design and aerodynamic efficiency.
However, a significant challenge remains: the limited quantity and quality of open-source aerodynamic training data for realistic road cars. To tackle this issue, we teamed up with partners to generate DrivAerML, an open-source dataset for automotive aerodynamics created using high-fidelity computational fluid dynamics (CFD) methods. Upstream CFD offers services that leverage high-fidelity simulations to support ML applications.
DrivAerML Dataset Website
How Accurate Data Matters in Machine Learning
While ML offers tremendous benefits, the adage "poor quality inputs lead to poor quality outputs" is particularly pertinent. The accuracy and quality of the training data directly influence the reliability of ML predictions. High-fidelity data is essential; without it, the predictions generated by models can be misleading and ineffective. The relevance of the training data to the design in question is also vital: as designs diverge from the parameter range covered by the training data, the model should be equipped to warn when it lacks confidence in its predictions, so that new synthetic training data can be generated automatically to extend its validity into the new parameter space.
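One simple way to picture such a warning is a novelty check in parameter space. The sketch below is only an illustration under stated assumptions, not the method used by Upstream CFD or in DrivAerML: it uses the distance to the nearest training design as a stand-in for model confidence, and the function names, the range-based scaling and the threshold value are all hypothetical.

```python
import numpy as np

def novelty_scores(train_params, query_params):
    """Distance from each query design to its nearest training design.

    Parameters are scaled by the range seen in the training set, so a
    score of 1.0 corresponds to one full training range in one parameter.
    """
    train = np.asarray(train_params, dtype=float)
    query = np.asarray(query_params, dtype=float)
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span[span == 0.0] = 1.0  # guard against constant parameters
    t = (train - lo) / span
    q = (query - lo) / span
    # Pairwise Euclidean distances, shape (n_query, n_train)
    dist = np.linalg.norm(q[:, None, :] - t[None, :, :], axis=2)
    return dist.min(axis=1)

def needs_new_simulation(train_params, query_params, threshold=0.25):
    """Flag designs that sit too far from the training data.

    The threshold is an illustrative assumption; in practice it would
    have to be calibrated against observed prediction errors.
    """
    return novelty_scores(train_params, query_params) > threshold
```

Designs flagged by a check like this would be the candidates for new high-fidelity simulations, whose results are then added to the training set. A production system would more likely rely on a model-based uncertainty estimate, but the control flow would be similar.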
DrivAerML Project: High-Fidelity Open-Source Dataset

The DrivAerML dataset was created through a collaborative initiative involving multiple partners to provide a high-quality, public-domain resource for research in automotive aerodynamics and machine learning. The dataset includes 500 parametrically morphed variants of the widely recognised DrivAer notchback generic vehicle. Parametric geometry morphing and mesh generation were carried out automatically using the commercial software ANSA. Scale-resolving CFD simulations were performed using a validated automatic workflow provided by Upstream CFD, with substantial cloud HPC resources committed by AWS. To our knowledge, DrivAerML is the first large public-domain dataset for complex automotive configurations generated using high-fidelity CFD.
Use the link to learn how to access the DrivAerML project dataset hosted on Hugging Face.
The dataset not only supports individual research projects but also encourages collaborative workshops (such as the Automotive CFD Prediction Workshop series). The concept is both simple and powerful: users can deploy this single public dataset to train their models and share results, thereby facilitating a collective evaluation of performance in automotive machine learning applications.
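For results to be comparable across groups, everyone needs to report the same error measures on the same cases. As a rough sketch, assuming each participant has reference and predicted drag coefficients for a shared set of held-out geometries, a common summary could look like the following. The function name and the choice of metrics are assumptions made for illustration; they are not taken from the workshop or the dataset documentation.

```python
import numpy as np

def drag_error_summary(cd_reference, cd_predicted):
    """Summarise how well predicted drag coefficients match the CFD reference."""
    ref = np.asarray(cd_reference, dtype=float)
    pred = np.asarray(cd_predicted, dtype=float)
    if ref.shape != pred.shape:
        raise ValueError("reference and prediction must have the same shape")
    err = pred - ref
    ss_res = float(np.sum(err ** 2))                # residual sum of squares
    ss_tot = float(np.sum((ref - ref.mean()) ** 2))  # total sum of squares
    return {
        "mae": float(np.mean(np.abs(err))),
        "max_abs_error": float(np.max(np.abs(err))),
        # R^2 is undefined when the reference values are all identical
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0.0 else float("nan"),
    }
```

Reporting both an average and a worst-case error helps, since a model can look good on average while still failing badly on a few unusual geometries.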
Upstream CFD Offers High-Fidelity Solutions for ML Applications
We provide services that leverage high-fidelity simulations to support machine learning applications. We have created workflows to generate proprietary datasets tailored to specific customer needs (e.g., AI startups), and can integrate our high-fidelity workflows into ML-driven design environments. This is particularly valuable for incrementally expanding model validity as innovative designers explore new territory beyond the scope of the original training data. Our offerings are designed not as one-off projects but as scalable solutions that accommodate ongoing simulation demands, with pricing based on CFD Work Units that scale with the algorithm deployed. By focusing on high-fidelity simulations and automated workflows, we continue to advance the use of ML in automotive aerodynamics and other domains.
For more information on the DrivAerML dataset, refer to the paper "DrivAerML: High-Fidelity Computational Fluid Dynamics Dataset for Road-Car External Aerodynamics".
Related: How Meancalc runtime control enabled enormous HPC savings for the DrivAerML dataset