pith: machine review for the scientific record
cs.CE
Computational Engineering, Finance, and Science
Covers applications of computer science to the mathematical modeling of complex systems in the fields of science, engineering, and finance. Papers here are interdisciplinary and applications-oriented, focusing on techniques and tools that enable challenging computational simulations to be performed, for which the use of supercomputers or distributed computing platforms is often required. Includes material in ACM Subject Classes J.2, J.3, and J.4 (economics).
Forward propagation of input uncertainties in physics-based wildfire models is computationally prohibitive, limiting the use of high-fidelity simulators in risk assessment workflows. This work introduces a geometry-aligned bi-fidelity surrogate framework that addresses the convection-dominated nature of wildfire spread by mapping low- and high-fidelity solution snapshots onto a common reference domain prior to basis selection and reconstruction. Unlike conventional bi-fidelity schemes, which combine spatially shifted snapshots and thus suffer from oscillations and excess basis requirements near sharp fronts, the proposed mapping aligns the dominant front geometry through per-variable shift/stretch transforms in 1D and an activity indicator-based affine alignment in 2D, so that reduced bases compare physically corresponding structures rather than displaced ones. Building on the ADfiRe physics-based simulator, we demonstrate the method on 1D and 2D test cases in which low- and high-fidelity models differ in mesh resolution and physical completeness. Across both settings, the geometry-aligned surrogate reproduces full-field temperature and fuel composition with substantially lower error than its unmapped counterpart, eliminates Gibbs-type oscillations near steep gradients, and recovers high-fidelity probability density functions for key quantities of interest (e.g., maximum temperature, evaporated moisture, and burned area). After offline training, online predictions are roughly three orders of magnitude cheaper than direct high-fidelity evaluation, making the framework a practical building block for many-query uncertainty quantification once the offline cost is amortized over enough queries. We discuss the conditions under which the geometric alignment is most effective, its limitations for non-convex or topologically complex fronts, and the path toward validation against real data.
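To make the alignment-before-reduction idea concrete, the following minimal sketch aligns synthetic 1D traveling fronts before extracting a POD basis; the cross-correlation shift detector and all names here are illustrative assumptions, not the paper's ADfiRe pipeline.

```python
# Minimal sketch: align 1D snapshots with a single advecting front to a
# common reference before basis extraction. Illustrative only.
import numpy as np

def align_snapshots(snapshots, reference):
    """Shift each snapshot so its front overlaps the reference front."""
    aligned = []
    for u in snapshots:
        # Estimate the shift via the peak of the cross-correlation.
        corr = np.correlate(u - u.mean(), reference - reference.mean(), mode="full")
        shift = corr.argmax() - (len(u) - 1)
        aligned.append(np.roll(u, -shift))
    return np.stack(aligned)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 400)
# Synthetic traveling fronts: tanh profiles at random positions.
snapshots = np.stack([np.tanh((x - c) / 0.02) for c in rng.uniform(0.3, 0.7, 50)])
aligned = align_snapshots(snapshots, snapshots[0])

# POD bases on raw vs. aligned snapshots: alignment concentrates energy
# in far fewer modes because the fronts now overlap.
for name, data in [("raw", snapshots), ("aligned", aligned)]:
    s = np.linalg.svd(data - data.mean(0), compute_uv=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    print(name, "modes for 99.9% energy:", int(np.searchsorted(energy, 0.999)) + 1)
```

On data of this kind the aligned snapshots typically need far fewer modes to reach a given energy threshold, which is the effect the abstract exploits.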
In ride-pooling, a fleet of vehicles is dynamically dispatched to bring travelers from A to B, trying to pool riders with similar itineraries to improve the use of resources compared to taxis or private cars. Ride-pooling is considered a core building block of future transport systems with autonomous vehicles.
In this paper, we introduce Mt-KaRRi, a novel dispatcher for dynamic ride-pooling that leverages state-of-the-art shortest-path algorithms to process millions of travelers per hour. We add a simple mode choice model and use realistic travel demand in three different urban areas for extensive experiments. We find that our dispatcher scales well, with a response time per request of around 1 ms even for our largest instances. We show how this scalability can be used to conduct ride-pooling studies at unprecedented scale. For instance, we determine how ride quality and vehicle resource utilization develop for tens of thousands of vehicles and millions of travelers.
We envision Mt-KaRRi as a tool for future ride-pooling simulation studies at scale.
Full-vehicle crash simulations are computationally expensive, limiting their use in iterative design exploration. This work investigates learned hybrid surrogate models (MeshTransolver, MeshGeoTransolver, and MeshGeoFLARE) for predicting time-resolved structural deformation fields in an industrial lateral pole-impact benchmark. We evaluate whether neural surrogates can reproduce full-field crash kinematics with sufficient accuracy, spatial regularity, and structural plausibility for engineering interpretation. The proposed architectures combine local mesh message passing, geometry-aware global attention, and sparse contact-aware correction for autoregressive crash rollout.
We compare mesh-based graph neural networks, attention-based geometric models, and hybrid architectures under a common training and hyperparameter configuration. The hybrid models capture both short-range structural interactions and long-range deformation patterns, while a sparse contact-aware variant assesses the effect of dynamic proximity interactions during rollout.
On a 25-sample full-vehicle test set, the best hybrid model achieves a temporal mean root-mean-square error of 3.20 mm. While geometry-aware attention baselines are quantitatively competitive, qualitative side-view inspection shows they can introduce local spatial noise and deformation irregularities that complicate structural interpretation. In contrast, hybrid mesh-attention models provide the best balance between scalar accuracy, survival-space consistency, and physically interpretable displacement fields.
These results suggest that crash surrogate assessment should combine global error metrics with downstream safety-relevant quantities and qualitative field inspection. The proposed methodology enables fast full-field predictions while preserving essential structural information for industrial crash-engineering analysis.
Dimensionality reduction is essential in simulation-based shape design, where high-dimensional parameterizations hinder optimization, surrogate modeling, and systematic design-space exploration. Parametric Model Embedding (PME) addresses this issue by constructing reduced variables from geometric information while preserving an explicit backmapping to the original design parameters. However, PME is intrinsically linear and may become inefficient when the sampled design space is governed by nonlinear geometric variability. This paper introduces a nonlinear extension of PME, denoted NLPME. The proposed framework preserves the defining principle of PME -- geometry-driven latent variables and parameter-mediated reconstruction -- while replacing the linear reduced subspace with a nonlinear latent representation. Geometry is not reconstructed directly from the latent variables; instead, the latent representation is decoded into admissible design parameters, and the corresponding geometry is recovered through a forward parametric map. The method is assessed on a bio-inspired autonomous underwater glider with a 32-dimensional parametric shape description and a CAD-based geometry-generation process. NLPME reaches a 5\% reconstruction-error threshold with \(N=5\) latent variables, compared with \(N=8\) for linear PME, and a 1\% threshold with \(N=9\), compared with \(N=15\) for PME. Comparison with a deep autoencoder shows that most of the nonlinear compression gain can be retained while preserving an explicit backmapping to the original design variables. The results establish NLPME as a compact, admissible, and engineering-compatible nonlinear reduced representation for parametric shape design spaces.
LLM inference is still evaluated mainly as a model or software problem: accuracy, latency, throughput, and hardware utilization. This is incomplete. At deployment scale, the relevant output is a quality-conditioned token produced under joint constraints from effective compute, delivered data-center power, cooling capacity, PUE, and utilization.
We argue that the ML community should treat inference as \emph{energy-to-token production}. We formalize this view with a dimensionally consistent Token Production Function in which token rate is bounded by both compute-per-token and energy-per-token ceilings. Listed API prices vary by over an order of magnitude across providers, but we use price dispersion only as directional motivation, not as causal evidence of marginal cost. The core physical question is instead: under fixed quality and service targets, when does the binding constraint move from theoretical peak compute toward delivered power, cooling, and operational efficiency?
Under this framing, system optimizations -- latent KV-cache compression, sparse or heavily compressed attention, quantization, routing, and difficulty-adaptive reasoning -- are not merely local engineering tricks. They are energy-to-token levers because they reduce FLOPs/token, joules/token, memory traffic, or utilization losses under fixed quality and service targets $(q^{*},s^{*})$. We therefore call for inference papers and benchmarks to report joules/token, the active binding constraint, PUE-adjusted delivered power, and utilization-adjusted token output alongside accuracy and latency.
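The min-of-ceilings structure of such a Token Production Function can be illustrated with a toy calculation; every number and name below is an assumed placeholder, not a measurement from the paper.

```python
# Toy illustration of a min-of-ceilings token production function.
def token_rate(peak_flops, flops_per_token, delivered_power_w, pue,
               joules_per_token, utilization):
    """Tokens/s bounded by both a compute ceiling and an energy ceiling."""
    compute_ceiling = utilization * peak_flops / flops_per_token
    # PUE-adjusted power actually available to the accelerators.
    it_power = delivered_power_w / pue
    energy_ceiling = it_power / joules_per_token
    binding = "compute" if compute_ceiling < energy_ceiling else "energy"
    return min(compute_ceiling, energy_ceiling), binding

rate, binding = token_rate(peak_flops=2e15, flops_per_token=4e11,
                           delivered_power_w=1.2e6, pue=1.3,
                           joules_per_token=300.0, utilization=0.4)
print(f"{rate:.0f} tokens/s, binding constraint: {binding}")
```

Sweeping the inputs shows exactly the crossover the abstract asks about: when delivered power, PUE, or joules/token tighten, the binding constraint flips from compute to energy.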
Artificial intelligence radio access networks (AI-RANs) are a promising architecture for fostering a thriving edge AI ecosystem. A well-designed incentive mechanism can further ensure the sustainable development of this ecosystem. However, incentive mechanism design faces two major challenges: 1) information asymmetry, where AI-RAN operators have only partial knowledge of AI users' utility functions, and 2) competition, as multiple AI-RAN operators coexist in real-world markets. Notably, chaotic and adversarial competition can erode AI-RAN operators' utility. To this end, we develop a matching-with-contracts framework for incentive mechanism design in AI-RAN service markets. The framework extends the static matching-with-contracts model by jointly characterizing the contract design of multiple competitive operators, user-operator matching, and dynamic evolution of the market state. Specifically, the incentive mechanism offered by each AI-RAN operator takes the form of a contract menu, where each contract item consists of an AI service latency agreement and a corresponding price. We model the AI service process as three independent queues and characterize the violation probability of the latency agreement using queueing theory and the Chernoff bound. To derive an effective incentive mechanism, we further propose a mixed stable matching-with-contracts algorithm that jointly updates user-side matching decisions and operator-side contract menus. Simulation results for a teleoperation-oriented AIGC service demonstrate the effectiveness and robustness of the proposed method. Compared with benchmark schemes, our method improves the total utility of AI-RAN operators by at least 56.8\% under representative settings.
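As a generic illustration of the Chernoff-bound step, the sketch below bounds the probability that the total latency across three independent stages exceeds an agreed deadline; exponential stage delays are our simplifying assumption, not the paper's queueing model.

```python
# Generic Chernoff bound on latency-agreement violation for three
# independent service stages with (assumed) exponential delays.
import numpy as np
from scipy.optimize import minimize_scalar

rates = np.array([50.0, 30.0, 80.0])  # effective service rates (1/s) per stage
deadline = 0.25                        # latency agreement (s)

def log_bound(theta):
    # log E[exp(theta * sum D_i)] - theta*d, with D_i ~ Exp(rate_i);
    # the MGF of Exp(r) is r / (r - theta) for theta < r.
    return np.sum(np.log(rates / (rates - theta))) - theta * deadline

# Optimize theta over (0, min rate) to tighten the bound.
res = minimize_scalar(log_bound, bounds=(1e-6, rates.min() - 1e-6),
                      method="bounded")
print(f"P(latency > {deadline}s) <= {np.exp(res.fun):.4f}")
```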
Domain, particle, and space-time decompositions produce distinct communication costs that favor different particle-to-mode ratios in Landau damping simulations.
We present and compare distributed parallelization strategies for the particle-in-Fourier (PIF) schemes used in kinetic plasma simulations. The different strategies are: i) domain decomposition, where both the particles and Fourier modes are split between the MPI ranks; ii) particle decomposition, where only the particles are split between the ranks and each rank carries all the modes; and iii) space-time decomposition, in which time parallelization based on the parareal algorithm is added on top of the particle decomposition. We describe the different communication patterns involved in each of the strategies, the parameter regimes where they work best, and explain their advantages and disadvantages. We implement the strategies within the open-source, performance-portable library IPPL and conduct scaling studies with 3D-3V Landau damping and Penning trap benchmark problems on the Alps and JUWELS Booster supercomputers. We analyze the dominant component timings in each of the strategies and identify areas for future optimizations.
The nonuniform fast Fourier transform (NUFFT) enables spectral methods for problems with irregularly spaced samples, with applications in medical imaging, molecular dynamics, and kinetic plasma simulations. Existing implementations are limited to shared-memory execution, restricting problem sizes to what fits on a single node. We present the first distributed, performance-portable NUFFT for heterogeneous supercomputers. Our Kokkos-based implementation runs without modification on NVIDIA and AMD GPUs. We develop multiple spreading and interpolation kernels optimized for different accuracy requirements and architectures. Our spreading kernels match or exceed the single-GPU throughput of the state-of-the-art CUDA-based NUFFT library cuFINUFFT at production particle densities, while our Kokkos-based implementation additionally supports AMD GPUs. Strong scaling experiments on Alps (NVIDIA GH200), JUWELS Booster (NVIDIA A100), and LUMI (AMD MI250X) demonstrate scaling up to 1024 GPUs. At scale, the distributed FFT is a significant part of the total runtime, making higher NUFFT accuracy less expensive. We apply the method to massively parallel Particle-in-Fourier simulations of Landau damping with up to $1024^3$ Fourier modes and 8.6 billion particles on Alps, JUWELS, and LUMI, demonstrating that distributed NUFFTs enable kinetic plasma simulations at resolutions previously inaccessible to spectral particle methods.
This article presents LibrePiLogger, an open-source data logging platform based on the Raspberry Pi for environmental monitoring using Modbus sensors over RS-485. The system combines the AtmosPyre Python library for sensor communication with Ansible-based deployment automation, allowing researchers to deploy sensor networks by editing a single YAML inventory file. Two hardware configurations are described: a minimal setup using a Raspberry Pi Zero with an RS-485 HAT, and a maximal setup using a Raspberry Pi 4 with a USB-to-RS-485 converter. Currently implemented sensors include the Vaisala GMP252 for CO$_2$ and the RadonTech AlphaTRACER for $^{222}$Rn, with new sensors requiring approximately 100 lines of Python following a provided driver template. Data is logged to timestamped CSV files with JSON metadata. The system has been deployed for continuous CO$_2$ and $^{222}$Rn monitoring in a karst environment since spring 2025 and remains in active operation, demonstrating reliable long-term performance. All hardware designs, software, and deployment scripts are released under the GNU General Public License v3.0. Total hardware costs range from 54 to 63 EUR (excluding housing), depending on the configuration.
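For a sense of what an approximately 100-line driver might look like, here is a hypothetical minimal sketch using the minimalmodbus library; the register address, scaling, and class interface are placeholders and do not reflect the actual AtmosPyre driver template.

```python
# Hypothetical minimal Modbus RTU driver in the spirit of the described
# template. Register address and settings are placeholders, not the
# actual AtmosPyre driver API.
import minimalmodbus

class CO2Sensor:
    """Reads a CO2 concentration from a Modbus RTU sensor over RS-485."""

    def __init__(self, port="/dev/ttyUSB0", slave_address=1):
        self.instrument = minimalmodbus.Instrument(port, slave_address)
        self.instrument.serial.baudrate = 19200
        self.instrument.serial.timeout = 1.0  # seconds

    def read_co2_ppm(self):
        # Hypothetical holding register 0 holding a float (2 registers).
        return self.instrument.read_float(0, functioncode=3)

if __name__ == "__main__":
    sensor = CO2Sensor()
    print("CO2:", sensor.read_co2_ppm(), "ppm")
```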
Subseasonal precipitation forecasting is inherently uncertain due to chaotic atmospheric dynamics, making reliable uncertainty estimation essential for real-world applications. Existing approaches typically represent uncertainty through ensemble forecasts rather than directly modeling predictive distributions. However, due to systematic model biases, raw ensemble outputs are often not well calibrated and cannot be directly interpreted as reliable uncertainty estimates. As a result, operational systems rely on post-hoc calibration based on reforecast datasets, which are computationally expensive to generate and maintain. To address these limitations, we propose QuantWeather, an end-to-end probabilistic forecasting framework with a dual-head design. The probabilistic and deterministic heads are supervised with separate objectives and optimized jointly. The framework further supports stochastic sampling, enabling probabilistic outputs even with a single stochastic forward pass and allowing optional multi-sample aggregation. Extensive experiments show that QuantWeather demonstrates superior probabilistic forecasting skill while substantially reducing inference-time computational and storage costs.
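A generic dual-head design of this kind can be sketched as a shared trunk with a deterministic MSE head and a quantile (pinball-loss) head; QuantWeather's actual architecture, objectives, and sampling mechanism are not specified in the abstract, so everything below is an assumed illustration of the pattern.

```python
# Generic dual-head sketch: shared trunk, deterministic head (MSE), and
# probabilistic quantile head (pinball loss), trained jointly.
import torch
import torch.nn as nn

class DualHead(nn.Module):
    def __init__(self, d_in, d_hidden, quantiles=(0.1, 0.5, 0.9)):
        super().__init__()
        self.register_buffer("q", torch.tensor(quantiles))
        self.trunk = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.det_head = nn.Linear(d_hidden, 1)
        self.prob_head = nn.Linear(d_hidden, len(quantiles))

    def forward(self, x):
        h = self.trunk(x)
        return self.det_head(h).squeeze(-1), self.prob_head(h)

def pinball_loss(pred_q, y, q):
    # pred_q: (batch, n_quantiles), y: (batch,), q: (n_quantiles,)
    err = y.unsqueeze(-1) - pred_q
    return torch.maximum(q * err, (q - 1) * err).mean()

model = DualHead(d_in=16, d_hidden=64)
x, y = torch.randn(32, 16), torch.randn(32)
det, quants = model(x)
loss = nn.functional.mse_loss(det, y) + pinball_loss(quants, y, model.q)
loss.backward()
```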
Real delay data from a Japanese regional city shows most kids can get to school without cars, though suburban bus routes lose the most time.
Realistic assessments of school commuting accessibility in areas with infrequent public transport services require accounting for operational delays; however, the impact of these delays has not been sufficiently examined. This study evaluates high-school accessibility in Matsumoto City, a regional city in Japan, using GTFS data representing both scheduled timetables and actual operating conditions. Accessibility levels are assessed under scheduled operations, while the effects of delays are examined through a comparative analysis based on actual delay measurements over a five-day workweek. Furthermore, a sensitivity analysis of travel-time thresholds was conducted. Results show that, when walking, cycling to stations, and public transport use are allowed, 78% of children under 15 can reach at least one high school within a 90-minute round trip, and 67% within a 60-minute round trip. Extending the threshold to 120 minutes enables access to nearly all schools in the city center, but the overall proportion increases only marginally to 81%. Delay impacts are particularly pronounced along bus routes connecting the central station with suburban areas, while in some areas, delays generate idiosyncratic events, where irregular transfers and reduced waiting times result in improved accessibility. Results underscore the need for both short-term measures, such as adjusting school start times, prioritizing buses, and introducing dedicated school routes, and long-term strategies, such as incorporating public transport accessibility into school consolidation decisions, to guarantee fair access to education opportunities without relying on private vehicles.
Physics-based simulation underpins engineering analysis but remains difficult to deploy in practice due to complex setup, parameterization, and interpretation. While Large Language Model-based agentic systems have shown promise in automating engineering computing workflows, they have primarily targeted structured, mesh-based problems. We present the first agentic AI workflow for meshless simulation in computational mechanics, demonstrated on debris flow modeling using Smoothed Particle Hydrodynamics (SPH) with the software DualSPHysics. By integrating tool orchestration, multimodal inputs (text and sketches), and human-in-the-loop interaction, the framework enables end-to-end simulation workflows for a class of problems that are inherently less structured and more challenging to automate. Results show that multimodal inputs not only enhance user experience but also reduce failure modes relative to text-only descriptions. Human-in-the-loop interaction is critical for resolving ambiguities and handling SPH-specific configurations. We further introduce a cognitive-task-based evaluation of post-processing, showing strong performance in visualization and data extraction, with remaining gaps in higher-level SPH-specific physical reasoning that are amenable to improvement through domain-aware modeling. These results establish the viability of agentic AI for particle-based simulation and underscore its potential to transform the accessibility and efficiency of computational mechanics workflows.
LLM-based financial agents increasingly rely on both numerical market data and textual signals for sequential trading and stock prediction. However, financial misinformation often appears as subtle textual perturbations rather than explicit falsehoods, making it difficult to detect while still capable of significantly altering agent reasoning and decisions. To study this risk, we propose AutoRedTrader, an autonomous red-teaming framework that generates finance-specific misinformation through behavioral bias manipulation, minor textual perturbations, and rewriting strategies, with agent feedback used to strengthen attacks over time. We evaluate AutoRedTrader in a POMDP-based financial agent simulation environment, and further examine a time-series-informed grounding setting for robustness analysis. The framework enables systematic evaluation of how subtle misinformation affects financial agents and whether historical market evidence can stabilize decisions under misleading textual signals. We evaluate the framework on Bitcoin transaction data. The results show that AutoRedTrader achieves the strongest attack performance with 69.00% misinformation exposure rate and 26.67% attack success rate, outperforming general-purpose misinformation and red-teaming baselines. Ablation studies further show that all modules contribute to generating retrievable and decision-effective financial misinformation.
In light transport simulation, Markov chain Monte Carlo methods are particularly effective at exploring regions with complex lighting characteristics. However, estimator variance is a central concern across Monte Carlo methods in general. In light transport, high variance directly manifests as increased noise or, equivalently, longer rendering times at fixed image quality. Variance reduction techniques based on Rao-Blackwellization (RB) have proven particularly effective. In practice, however, the RB approach traditionally used in light transport, waste-recycling, can yield little to no measurable variance reduction, a fact we empirically confirm in this work. Motivated by this lack of effective variance reduction, we introduce a novel RB technique for the general-purpose Metropolis-Hastings (MH) algorithm that is computationally efficient and achieves substantial variance reduction. We show that this method consistently outperforms waste-recycling in terms of both variance reduction and convergence speed. Building on this result, we adapt the proposed RB approach to the recently introduced general-purpose Jump Restore algorithm, where it similarly achieves substantial variance reduction and accelerated convergence. Through extensive experiments in light transport simulation, we demonstrate that our RB technique significantly outperforms the traditional approaches for both MH-based light transport algorithms and Jump Restore Light Transport, under both equal-time and equal-sample-count comparisons.
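To make the waste-recycling baseline concrete, the toy sketch below compares the standard Metropolis-Hastings estimator with the classic waste-recycling RB estimator on a 1D Gaussian target; the paper's proposed RB technique is not reproduced here.

```python
# Toy comparison: standard MH estimator vs. the classic waste-recycling
# Rao-Blackwellized estimator on a 1D Gaussian target (true mean of x^2 is 1).
import numpy as np

rng = np.random.default_rng(1)
log_pi = lambda x: -0.5 * x**2          # standard normal target
f = lambda x: x**2                      # integrand

x, n_steps, step = 0.0, 50_000, 2.0
mh_sum, wr_sum = 0.0, 0.0
for _ in range(n_steps):
    y = x + step * rng.normal()
    alpha = min(1.0, np.exp(log_pi(y) - log_pi(x)))
    # Waste recycling: average over both the accept and reject outcomes.
    wr_sum += alpha * f(y) + (1.0 - alpha) * f(x)
    if rng.random() < alpha:
        x = y
    mh_sum += f(x)                      # standard estimator uses the chain state

print("MH estimate:", mh_sum / n_steps)
print("Waste-recycling estimate:", wr_sum / n_steps)
```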
Graph Neural Network (GNN) benchmarks often report single point estimates, even when performance differences are small relative to variation across random seeds, train/test splits, and datasets. Confidence intervals, paired comparisons, multiple-comparison correction, and rank-based aggregation are standard statistical tools, but they are rarely the default output of graph-learning benchmark suites. We introduce GraphNetz, a benchmarking framework whose default output is a structured statistical report rather than a raw accuracy table. GraphNetz currently includes 63 dataset loaders, four task types, and five canonical GNN architectures, while also supporting custom datasets and models. The framework standardizes multi-seed evaluation and automatically returns per-cell confidence intervals, Holm-corrected paired tests, and Friedman-Nemenyi critical-difference diagrams across tasks. In a cross-category benchmark over ten heterogeneous tasks, apparent rank differences among four canonical node-level encoders fall within a single Nemenyi clique, indicating that none is significantly better than the others at $\alpha = 0.05$. GraphNetz therefore provides researchers with a reproducible computational and statistical pipeline to benchmark new graph-learning methods against standard architectures, over different tasks and a wide set of applications, while reporting principled statistical evidence that accounts for seed uncertainty. The framework thus offers the graph-learning community reproducible, honest model comparisons that can be reported directly in papers.
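The statistical components named in the abstract can be assembled directly from scipy, as in the sketch below on fabricated accuracies; this is not the GraphNetz API, only the underlying tests.

```python
# Sketch of the statistical report ingredients on fake accuracy data
# (models x tasks x seeds); not the GraphNetz API itself.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
acc = rng.normal(0.80, 0.02, size=(4, 10, 5))  # 4 models, 10 tasks, 5 seeds
per_task = acc.mean(axis=2)                    # aggregate over seeds

# Friedman test across tasks: are the model ranks distinguishable at all?
chi2, p = stats.friedmanchisquare(*per_task)
print(f"Friedman: chi2={chi2:.2f}, p={p:.3f}")

# Holm-corrected paired comparisons against model 0
# (step-down multipliers; monotonicity enforcement omitted for brevity).
pvals = [stats.ttest_rel(per_task[0], per_task[j]).pvalue for j in range(1, 4)]
order = np.argsort(pvals)
m = len(pvals)
for rank, idx in enumerate(order):
    adjusted = min(1.0, (m - rank) * pvals[idx])
    print(f"model 0 vs {idx + 1}: Holm-adjusted p = {adjusted:.3f}")
```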
We propose an overlapping Schwarz space-time refinement framework for the material point method (OS-MPM) to improve computational efficiency in problems with strongly localized deformation, contact, and large geometric nonlinearity. The method decomposes the domain into overlapping coarse and fine subdomains with heterogeneous spatial and temporal resolutions, while retaining standard MPM discretizations within each subdomain. Coarse-fine coupling is achieved through an MPM-specific Schwarz iteration combining mass-weighted spatial transmission and temporal interpolation for sub-cycling. In contrast to refinement strategies based on modified basis functions, transition kernels, or strongly enforced interface constraints, the proposed approach preserves the modular structure of standard MPM and shifts the coupling complexity to nonmatching-grid interface operators within the Schwarz alternating procedure. Numerical examples, including a gravity-driven cantilever beam, Hertzian contact, and an elastic inclusion problem, show that the method reproduces analytical or fine-resolution reference solutions with good accuracy and convergence behavior. In the inclusion benchmark, the proposed framework achieves comparable or slightly lower error than single-domain fine simulations at the finest tested resolutions, while reducing computational cost by up to 9.15 times. A three-dimensional folding example further demonstrates the generality of the framework. These results indicate that the proposed method provides an accurate, modular, and efficient route for local space-time refinement in MPM.
Score-based generative modeling (SBGM) has achieved state-of-the-art performance in image generation, with the quality of generated images being highly dependent on the design of the forward (diffusion) process. Among these, models based on stochastic differential equations (SDEs) have proven particularly effective. While traditional methods aim to progressively destroy all image information to enable reconstruction from pure noise, we propose a class of anisotropic stochastic partial differential equations (SPDEs) that preserve the geometric structure of the data over longer time scales throughout the transformation. These SPDEs consist of a drift term that enforces deterministic destruction via structured smoothing, and a diffusion coefficient that enables random destruction through noise injection. Both components are governed by anisotropy coefficients, enabling controlled, direction-dependent information degradation. This framework provides the theoretical foundation for a novel anisotropic score-based generative model. By retaining geometric structure for longer time scales, the backward generative process can exploit residual geometric cues, leading to improved reconstruction fidelity. We empirically validate this improvement in a proof-of-concept implementation on unconditional image generation, showing that anisotropic diffusion can achieve superior image quality metrics. We demonstrate consistent improvements in both pixel and latent space experiments over the SDE-driven baseline as well as over the state-of-the-art Flow Matching approach. Finally, we demonstrate the effectiveness of the introduced anisotropy in a conditional stroke-to-image generation task.
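One plausible generic form of such an anisotropic forward SPDE is sketched below; the paper's exact coefficients and operators are not given here, so this is only an illustrative template.

```latex
% Illustrative generic form of an anisotropic forward SPDE:
% drift = structured, direction-dependent smoothing (deterministic destruction),
% diffusion = anisotropic noise injection (random destruction).
\begin{equation}
  \mathrm{d}u_t
    = \nabla \cdot \bigl( A(x)\, \nabla u_t \bigr)\, \mathrm{d}t
    + B^{1/2}(x)\, \mathrm{d}W_t ,
\end{equation}
```

where $A(x)$ and $B(x)$ are direction-dependent (anisotropy) coefficients controlling the deterministic and stochastic components of information degradation, respectively.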
Protein-protein interaction (PPI) modeling has been widely studied as a binary or multi-label classification task. While emerging multimodal large language models (LLMs) can now describe single proteins, they remain unable to generate free-form descriptions of interactions between protein pairs. Moving beyond controlled vocabulary annotations, we propose to model PPI using free-text description, enabling richer expressiveness, improved interpretability, and better integration with literature knowledge bases. We present PPI2Text, a multimodal LLM for free-form PPI captioning from amino acid sequences that encodes each protein using an ESM3 encoder, constructs a pair map from the two representations to capture interactions across all residue pairs, and autoregressively generates descriptions using a Qwen3 language decoder. We further introduce PaCo-RoPE, a coordinate-aligned positional encoding that aligns each axis of the pair grid with the residue positions of the corresponding protein. In addition, we release PPI2Text-Dataset, a 351k-pair corpus of free-form PPI descriptions aggregated from ten curated biological databases and further synthesized with Gemini under evidence-tiered prompting. PPI2Text consistently outperforms strong baselines across multiple ablation settings and evaluation protocols. It not only achieves higher scores on linguistic metrics against synthesized references, but also excels on factuality metrics, where an LLM-based judge evaluates outputs against raw biological evidence.
We present Diffusion Restore, a real-time framework for diffusion-based MCMC light transport. MCMC methods are highly suitable for sampling from complex high-dimensional distributions and for approximating integrals over them. In practice, they are often the only viable solution when direct sampling is not possible and alternative methods are either inefficient or cannot be applied due to the structure of the target distribution. However, controlling the exploration of the target distribution in MCMC methods remains challenging. Efficient exploration requires a balance between local exploration and global discovery, and local dynamics must rapidly explore individual modes without getting stuck or exhibiting excessive backtracking. The problem of global discovery has recently been addressed by the introduction of the Restore framework. In this work, we build on this framework and focus on improving local exploration. We show how to choose diffusion-based local dynamics within the Restore framework while completely avoiding Metropolis-adjustment, which is known to slow down convergence. Furthermore, we model these dynamics as nonreversible, introducing momentum in the drift and thereby enabling more directed exploration of the target distribution compared to reversible, random-walk-like dynamics. We provide a theoretical justification for the validity of our choice of local dynamics. Empirically, we demonstrate across diverse scenes that Diffusion Restore outperforms all existing MCMC light transport methods and establishes a new state of the art. In addition, we present a GPU implementation in ray tracing and compute shaders and achieve real-time frame rates. This demonstrates that Diffusion Restore is not only superior in offline rendering, but also outperforms traditional Path Tracing methods in real-time rendering settings, such as interactive applications and games.
Biomolecular generators are often adapted with reward feedback to improve task-specific utility, but pushing utility alone can concentrate generation on a narrow family of candidates. Maintaining diversity is difficult because sample diversity is a set-level property. We introduce Supergroup Relative Policy Optimization (SGRPO), a flexible GRPO-style framework that directly constructs rewards from set-level diversity. For each condition, SGRPO samples a supergroup of candidate sets, compares their diversity under the same condition, and redistributes the group diversity reward to individual rollouts through leave-one-out diversity contributions before combining it with rollout-level utility. This design decouples SGRPO from a particular generator, utility reward, or diversity metric, and allows instantiation with different GRPO-style approaches. We evaluate SGRPO on de novo small-molecule design, pocket-based small-molecule design, and de novo protein design, instantiating it with both GRPO and Coupled-GRPO across autoregressive and discrete diffusion generators. Across decoding sweeps, SGRPO expands the utility-diversity Pareto frontier and achieves the best frontier-level metrics relative to pretrained generators, GRPO, and memory-assisted GRPO when applicable. Our analyses further show that direct set-level diversity rewards remain effective with small groups and help preserve broader generation-distribution coverage during post-training. The code is available at https://github.com/IDEA-XL/SGRPO.
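The leave-one-out redistribution can be sketched in a few lines; mean pairwise distance stands in for whatever diversity metric an instantiation uses, and the utilities are placeholders (the released code at the linked repository is the authoritative reference).

```python
# Sketch of leave-one-out diversity contributions redistributed to
# per-rollout rewards; metric and values are illustrative placeholders.
import numpy as np

def diversity(embs):
    # Mean pairwise Euclidean distance of a candidate set.
    d = np.linalg.norm(embs[:, None, :] - embs[None, :, :], axis=-1)
    n = len(embs)
    return d.sum() / (n * (n - 1))

def sgrpo_rewards(embs, utilities, lam=0.5):
    """Per-rollout reward = utility + lambda * leave-one-out diversity gain."""
    d_full = diversity(embs)
    contributions = np.array([
        d_full - diversity(np.delete(embs, i, axis=0)) for i in range(len(embs))
    ])
    return utilities + lam * contributions

rng = np.random.default_rng(0)
embs = rng.normal(size=(8, 16))        # 8 rollouts in an embedding space
utilities = rng.uniform(size=8)
print(sgrpo_rewards(embs, utilities))
```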
Texture shapes how we perceive and like food, yet clear links between mechanical measurements and sensory perception of texture remain elusive. Here we combine sensory data from a blind tasting with 101 participants with mechanical texture profile analysis across six burgers to identify the textural features that drive consumer perception and liking. We compare five burgers whose recipes were generated with artificial intelligence -- animal-based, plant-based, mushroom-based, and hybrid animal-mushroom patties -- against the classical Big\,Mac. Three main findings emerge: First, animal-based burgers occupy a distinctive and coherent sensory-mechanical region associated with attributes such as firm, fatty, and holds together. Second, mushroom- and plant-based burgers deviate from this region in protein-dependent ways: mushroom-based burgers associate with springy and gummy textures, while plant-based burgers associate with dry, brittle, and crumbly textures. Hybrid animal-mushroom burgers, however, maintain sensory profiles comparable to fully animal-based burgers. Third, resilience emerges as the strongest mechanical correlate of perceived meatiness and sensory texture, while stiffness and hardness show no statistically significant association with consumer perception. Texture independently predicts overall liking alongside flavor: increasing texture liking by one point increases overall liking by 0.28 points. Among all sensory attributes, meatiness is the dominant predictor of texture liking. These findings identify resilience as a promising target for texture engineering and establish texture as a critical design objective for sustainable alternative proteins.
QIEO delivers better structure recovery and outlier resistance in signal and regression problems.
The escalating complexity of modern machine learning necessitates solving challenging non-convex optimization problems, particularly in high-dimensional regimes and scenarios contaminated by gross outliers. Traditional approaches, relying on convex relaxations or specialized local search heuristics, frequently succumb to suboptimal local minima and fail to recover the true underlying discrete structures. In this paper, we propose treating these non-convex challenges as a global search problem and introduce a unified framework based on Quantum-Inspired Evolutionary Optimization (QIEO). By leveraging a probabilistic representation inspired by quantum superposition, QIEO maintains a global view of the search space, enabling it to tunnel through local optima that trap conventional gradient-based and greedy solvers. We comprehensively evaluate QIEO across diverse non-convex applications, including sparse signal recovery (gene expression analysis and compressed sensing) and robust linear regression. Extensive benchmarking against state-of-the-art continuous solvers (ADAM, Differential Evolution), classical metaheuristics (Genetic Algorithms), and specialized non-convex algorithms (Iterative Hard Thresholding) demonstrates that QIEO consistently achieves superior structural fidelity, lower mean squared error, and enhanced robustness without support inflation. Our findings suggest that embracing a quantum-inspired global search provides a resilient, unified paradigm for overcoming the inherent intractability of discrete nonconvex machine learning landscapes.
The simulation of fluid flows is computationally expensive due to the complexity of its governing partial differential equations. Machine learning models offer a potential surrogate, enabling learning from simulations and significantly faster predictions of flow fields. However, these models require large training datasets, which introduces a trade-off between dataset generation cost and predictive accuracy. In this work, we investigate the relationship between training-set size and prediction accuracy when learning steady flow fields in an industrial-scale stirred vessel. A dataset of steady flows is generated using Reynolds-averaged Navier-Stokes (RANS) simulations in a range of realistic operating conditions, including impeller speeds and liquid heights. We train implicit neural representations of flow fields and compare purely data-driven and constrained variants. Model performance is evaluated using global mean squared error (MSE), qualitative spatial comparisons of predicted and reference flow fields, and tracer transport simulations. We find that the prediction error decreases monotonically with increasing training data, but also that it exhibits clear diminishing returns beyond moderate dataset sizes. Physics-based constraints significantly improve accuracy and reduce variability across training runs in low-data regimes, and they lead to more stable tracer-transport behavior. Furthermore, reasonable interpolation can be achieved over different impeller speeds and liquid heights. However, these benefits come with an increase in the complexity of training, and their relative advantage diminishes as the training set grows.
Shape-morphing metamaterials enable adaptive structures capable of complex functional deformations, with applications ranging from reconfigurable structures and soft robotics to medical devices. However, their design remains challenging due to an inherent trade-off between deformation programmability and computational scalability. Periodic architectures offer computational tractability but are limited in their programmability, whereas aperiodic metamaterials provide richer deformation spaces at the cost of substantially increased design complexity. To bridge this gap, we propose a scalable active metamaterial (SAM) design framework that decouples the design problem into two scales by exploiting the local deformation independence of units isolated by stiff structural members. At the macroscale, global shape deformation is determined by iteratively solving a constrained mesh optimization problem incorporating data-driven constraints. At the microscale, the local infill geometry is obtained through inverse design via either a conditional diffusion model or an adjustable search strategy. This hierarchical decomposition enables fast, accurate, and scalable design of aperiodic shape-morphing metamaterials, offering a new computational paradigm for the design of programmable material systems.
Neural operators have achieved promising performance on partial differential equations (PDEs), but most existing models are built on fixed Eulerian coordinates. This mismatch between evolving physical structures and static coordinates creates spatial misalignment, leading to unnecessarily non-local operator mappings and reinforcing a smoothness preference near sharp transitions. Inspired by adaptive coordinate transformations in classical PDE analysis, we propose the Adaptive Coordinate Transform (ACT) block, a plug-and-play module for data-driven geometric adaptation in neural operators. ACT blocks resolve this structural limitation by learning adaptive coordinate systems within the operator learning pipeline. Specifically, given an input feature, the ACT block learns a coordinate transformation and represents the same feature under the transformed coordinates via differentiable sampling. This operation preserves the underlying signal while changing its spatial representation, equivalent to expressing the same physical quantity in different coordinate systems. By adapting the coordinate system to the data, ACT allows the network to better track evolving structures, reduce operator complexity, and dynamically focus on critical features to improve learning. We evaluate the proposed approach across diverse PDE benchmarks and multiple neural operator architectures. Experimental results demonstrate consistent and significant improvements in predictive accuracy, indicating that learning coordinate systems provides a powerful mechanism for enhancing operator learning.
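A minimal ACT-style block can be sketched as a learned coordinate transform applied through differentiable sampling; the affine transform family and pooling-based predictor below are our stand-ins for whatever transformation class the actual ACT block learns.

```python
# Minimal sketch of an ACT-style block: predict a coordinate transform
# from the input and re-express the same field under it via
# differentiable sampling. The affine family is an illustrative choice.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ACTBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.loc = nn.Sequential(
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(channels * 16, 6),
        )
        # Initialize to the identity transform.
        nn.init.zeros_(self.loc[-1].weight)
        self.loc[-1].bias.data = torch.tensor([1., 0., 0., 0., 1., 0.])

    def forward(self, u):
        theta = self.loc(u).view(-1, 2, 3)        # per-sample affine map
        grid = F.affine_grid(theta, u.size(), align_corners=False)
        # Same signal, new coordinates: differentiable resampling.
        return F.grid_sample(u, grid, align_corners=False)

u = torch.randn(2, 3, 32, 32)
print(ACTBlock(3)(u).shape)   # torch.Size([2, 3, 32, 32])
```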
Automated market makers (AMMs) quote prices from pool state rather than from a limit order book. AMM pools often stay close to a reference price because arbitrageurs correct profitable mispricing. A large part of decentralized finance therefore relies on a simple economic premise: once the AMM price drifts away from the reference price, arbitrage incentives push it back. This paper studies when that premise is strong enough to guarantee block-scale stability. We model the gap between the reference price and the AMM price as a stochastic tracking error, treat arbitrage as the corrective input, and place blockchain execution inside the loop through fees, discrete blocks, transaction ordering, delays, and transaction failure. The detailed execution layer is reduced to the total successful correction confirmed in each block. Under a block-level correction condition, we prove geometric ergodicity of the tracking error and obtain explicit one-step bounds that connect tracking quality to liquidity and execution quality. We also show in a constant-product example how fees, fixed execution costs, and local liquidity map into the no-trade band and the optimal corrective trade. Finally, we build empirical proxies for the theorem quantities from realized block data and use them to organize reduced and mechanism-focused simulations whose comparative statics are consistent with the theory. The contribution is to turn a basic economic intuition behind decentralized finance into a quantitative stability statement together with a tractable calibration interface.
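The constant-product mapping from fees to the no-trade band and the corrective trade admits a short worked example; the fixed execution costs and local-liquidity effects discussed in the paper are omitted here for brevity.

```python
# Worked constant-product example: fee-induced no-trade band and the
# corrective trade that realigns the pool price with the reference.
import math

def no_trade_band(pool_price, fee):
    # Arbitrage is unprofitable while the reference price stays inside
    # [p*(1-fee), p/(1-fee)] for a proportional fee.
    return pool_price * (1 - fee), pool_price / (1 - fee)

def corrective_trade(x, y, ref_price):
    """Reserve changes for a fee-less trade moving the pool price to ref_price."""
    k = x * y
    x_new = math.sqrt(k / ref_price)   # since price = y/x and x*y = k
    y_new = math.sqrt(k * ref_price)
    return x_new - x, y_new - y        # signed token flows into the pool

x, y, fee = 1_000.0, 2_000_000.0, 0.003   # pool price = y/x = 2000
lo, hi = no_trade_band(y / x, fee)
ref = 2100.0
print(f"no-trade band: [{lo:.1f}, {hi:.1f}]")
if not lo <= ref <= hi:
    dx, dy = corrective_trade(x, y, ref)
    print(f"corrective trade: dx={dx:+.3f}, dy={dy:+.1f}")
```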
Optimizing Reconfigurable Intelligent Surfaces (RIS) is a high-dimensional combinatorial challenge. Current quantum algorithms often simplify this problem by ignoring physical constraints like mutual coupling, which significantly degrades real-world performance. Rather than targeting a fully realistic RIS description, we embed progressively more physics-informed models of mutual coupling into Quadratic Unconstrained Binary Optimization (QUBO) formulations. We evaluate four Ising interaction models ($J_{ij}$) for the Quantum Approximate Optimization Algorithm (QAOA), ranging from idealized phase-only to fully dense physical models. Analyzing a $5 \times 5$ grid, our results expose a critical trade-off between spatial pointing accuracy and quantum hardware feasibility. While complete global coupling maximizes beamforming precision, dense Hamiltonians introduce prohibitive routing overhead and complicate convergence on near-term processors. Ultimately, we demonstrate that while physics-informed quantum optimization is mathematically viable, sparse, distance-penalized models remain a necessary compromise for execution on current noisy intermediate-scale quantum (NISQ) devices.
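A distance-penalized sparse coupling matrix of the kind the conclusion favors can be sketched as follows; the exponential decay and cutoff radius are illustrative assumptions, not the paper's calibrated $J_{ij}$ models.

```python
# Sketch of a distance-penalized sparse Ising coupling matrix for a
# 5x5 RIS grid; coupling form and decay are illustrative assumptions.
import numpy as np

n = 5
coords = np.array([(i, j) for i in range(n) for j in range(n)], dtype=float)
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

J = np.exp(-dist)                     # couplings decay with element distance
J[dist == 0] = 0.0                    # no self-coupling
J[dist > 2.0] = 0.0                   # sparsify: drop long-range couplings

density = np.count_nonzero(J) / (J.size - n * n)
print(f"nonzero couplings: {np.count_nonzero(J)} ({density:.0%} of off-diagonal)")
```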
The primary objective of this study is to remove duplicated monomial contributions that proliferate in Carleman linearization as state dimension and truncation order increase. To do so, we adopt a shift-and-lift architecture, since it exposes repeated exponent targets and allows duplicate-aware coefficient coalescing during lifted-operator assembly. This architecture also makes high-order truncation practical, but that regime intensifies local convergence and closure sensitivity for higher-order nonlinearities. We therefore pair shift-and-lift with a moving-center expansion so that shift and lift are updated jointly around evolving local centers, improving validity of the truncated model along the trajectory. The resulting workflow combines symmetry-reduced monomial bases, packed exponent-key indexing, and sparse triplet coalescing to preserve truncated affine dynamics while reducing index-resolution overhead and write-path irregularity. We analyze variable growth, preprocessing complexity, and truncation-induced error mechanisms, and we compare against Jacobian linearization through fixed-step error, admissible step size, and cost-at-target-accuracy criteria. Two benchmarks (bilinear driver and logistic interaction) show convergence under refinement for both approaches, with regime-dependent accuracy gains for the proposed method rather than universal superiority.
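Duplicate-aware coalescing with packed exponent keys can be illustrated with sparse triplets, exploiting the fact that scipy's COO-to-CSR conversion sums duplicate entries; the packing scheme below is a stand-in for the paper's indexing.

```python
# Sketch of duplicate-aware coefficient coalescing with packed exponent
# keys; the packing scheme is an illustrative stand-in.
import numpy as np
from scipy.sparse import coo_matrix

MAX_DEG = 16  # assumed bound on per-variable exponents for packing

def pack(exponents):
    """Pack a monomial exponent tuple into a single integer key."""
    key = 0
    for e in exponents:
        key = key * MAX_DEG + e
    return key

# Duplicate monomial contributions: (row_key, col_key, coeff) triplets in
# which the same (row, col) pair appears several times.
rows = [pack((1, 0)), pack((1, 0)), pack((0, 2))]
cols = [pack((0, 1)), pack((0, 1)), pack((1, 1))]
vals = [0.5, 0.25, -1.0]

A = coo_matrix((vals, (rows, cols))).tocsr()  # duplicates are summed here
print(A[pack((1, 0)), pack((0, 1))])          # -> 0.75
```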
Elastic ribbons, slender structures whose length ($L$), width ($W$), and thickness ($b$) satisfy $L \gg W \gg b$, exhibit mechanical behaviors intermediate between one-dimensional rods ($L \gg W, b$) and two-dimensional plates ($L, W \gg b$). In quadratic Kirchhoff-type rod-based frameworks, such as Discrete Elastic Rods (DER), the governing equilibrium equations are independent of width, and therefore these models cannot capture width-dependent mechanical effects. Reduced centerline-based ribbon models attempt to capture width dependence via coupled bending-twisting energies. However, their relative accuracy remains unclear due to the absence of a unified simulation framework. In this work, we formulate a framework grounded in discrete differential geometry where the energy is expressed as a function of coupled bending-twisting strain measures along the centerline, rather than a linear sum of quadratic bending and twisting energies as in DER. We derive analytical gradients and Hessians of the energy that enable implicit time integration. Within this unified setting, we compare five ribbon models: Kirchhoff, Sadowsky, Wunderlich, Sano, and Audoly. As a benchmark, a straight ribbon is longitudinally constrained into a pre-buckled arch and subjected to transverse displacement, inducing a supercritical pitchfork bifurcation. Predicted bifurcation thresholds are compared against shell-based finite element simulations, with the Sano model providing the closest agreement in capturing width-dependent shifts. Our high-performance JAX-based implementation achieves $\mathcal{O}(N)$ per-iteration cost and also confirms that the Sano model introduces negligible per-iteration overhead relative to standard DER.
We compare different Poisson solvers within the context of an electrostatic Vlasov-Poisson system. These schemes are implemented as part of the IPPL (Independent Parallel Particle Layer) library (Frey et al., 2024), which provides performance portable and dimension independent building blocks for scientific simulations requiring particle-mesh methods, with Eulerian (mesh-based) and Lagrangian (particle-based) approaches. The simulation used to compare the performance and portability of the schemes is Landau damping, part of a set of mini-applications implemented to benchmark and showcase the capabilities of the IPPL library (Muralikrishnan et al., 2024). We use grid-sizes of $512^3$ and $1024^3$ with 8 particles per cell, running with different algorithms in the solve phase of the Particle-in-Cell (PIC) loop: a Fast Fourier Transform (FFT) pseudo-spectral solver, a matrix-free finite difference Preconditioned Conjugate Gradient (PCG) solver, and a matrix-free Finite Element (FEM) solver. We also compare these PIC schemes to the novel Particle-in-Fourier (PIF) scheme, which performs interpolations using non-uniform FFTs thereby avoiding a grid in the real space. We obtain results on different computing architectures, such as AMD GPUs (LUMI at CSC), and Nvidia GPUs (Alps at CSCS and JUWELS Booster at J\"ulich Supercomputing Center), showcasing portability. In terms of absolute time the FFT solver is advantageous, but is limited in its applicability. All other field solvers in the PIC scheme are an order-of-magnitude more expensive in terms of time, but scale similarly to the FFT case in the electrostatic PIC context. The PIF scheme serves as a high fidelity alternative to standard PIC, and while it is costlier than the FFT-based PIC scheme, it shows excellent scalability on all the architectures.
Data assimilation provides a systematic framework for combining dynamical models with partial and noisy observations to infer the evolving state of a system. In this work, we undertake a comparative study of Data Assimilation with Transfer Operators (DATO) and Quantum Mechanical Data Assimilation (QMDA), focusing on their mathematical formulation, algorithmic structure, and empirical performance. Both methods are first cast within a common operator-theoretic framework, which makes it possible to compare, on a unified basis, their representations of uncertainty, forecast propagation, and assimilation updates. We then analyse their principal similarities and differences with respect to state-space structure, update mechanisms, structural preservation properties, and computational cost. To complement the theoretical analysis, we assess both approaches on benchmark dynamical systems across a range of observational settings, including noisy, sparse, and partially observed regimes. Our results show that, despite their shared operator-theoretic motivation, DATO and QMDA embody substantially different assimilation paradigms, leading to distinct advantages and limitations in terms of interpretability, robustness, and scalability. The present study helps delineate the regimes in which each framework is most effective and offers broader insight into the design of operator-based methodologies for data assimilation.
Density-based topology optimization methods such as SIMP enable efficient topological exploration but produce diffuse material boundaries that require interpretation before manufacturing. Level-set methods maintain sharp interfaces but are sensitive to the initial design. This paper presents a sequential framework that addresses these complementary limitations through a signed distance function (SDF)-based geometry transfer, formulated for three-dimensional meshes. The SIMP density distribution is converted into an SDF that initializes subsequent level-set boundary refinement. From the level-set perspective, the SIMP-derived initialization mitigates sensitivity to the initial design. From the SIMP perspective, the level-set stage acts as optimization-driven post-processing that produces manufacturing-ready boundaries. Validation on three-dimensional cantilever and MBB benchmarks demonstrates compliance comparable to standalone level-set optimization, with up to 4.6x wall-clock speedup on the cantilever case. The full implementation is released under an open-source license to support reproducibility.
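The density-to-SDF geometry transfer can be sketched with two Euclidean distance transforms around a thresholded density field; the 0.5 threshold and the random test field below are illustrative, not the paper's settings.

```python
# Sketch of the density-to-SDF geometry transfer: threshold the SIMP
# density field and build a signed distance function from distance
# transforms. Threshold and test field are illustrative.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
density = ndimage.gaussian_filter(rng.random((64, 64, 64)), sigma=4)

solid = density >= 0.5                    # material indicator at the 0.5 isocontour
inside = ndimage.distance_transform_edt(solid)
outside = ndimage.distance_transform_edt(~solid)
sdf = np.where(solid, -inside, outside)   # negative inside the material

# The zero level set of `sdf` initializes the level-set refinement stage.
print("sdf range:", sdf.min(), sdf.max())
```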
TAFES platform anchors events on Ethereum Layer-2 and stores evidence on IPFS to produce auditable certification trails across stakeholders.
Purpose: This study addresses the lack of trust in ethical product labels by designing a blockchain platform grounded in the TAFES principles (Transparency, Accountability, Fairness, Ethics, Safety). It aims to bridge the gap between blockchain's theoretical transparency and a responsible, real-world implementation for certification ecosystems.
Design/Methodology/Approach: Using Action Design Research (ADR), we developed a proof-of-concept platform for label authentication. A hybrid architecture records critical events on an Ethereum Layer-2 network for security, while supporting evidence is stored off-chain via IPFS and linked via content identifiers. The solution was validated through a coffee supply chain scenario.
Findings: The proof of concept demonstrates how a TAFES-aligned blockchain platform can support verification of label claims without requiring trust in a single intermediary by creating tamper-evident provenance records and auditable certification evidence across multiple stakeholders. The design supports low-cost, near-real-time anchoring of supply chain events while mitigating adoption barriers related to scalability, privacy, and operational viability.
Originality/Value: This research contributes an integrated ethical and technical blueprint for trustworthy label authentication systems by translating TAFES into implementable design requirements and evaluation checks, and validating them through an ADR-driven proof of concept. It advances prior work by moving from the question of whether blockchain can help to the question of how it should be implemented responsibly in multi-stakeholder certification ecosystems.
Named Entity Recognition (NER) is a critical component of Natural Language Processing with diverse applications in information extraction and conversational AI. However, NER in specific domains for low-resource languages faces challenges such as limited annotated data and heterogeneous label sets. This study addresses these issues by proposing a hybrid neurosymbolic framework that integrates rule-based processing with deep learning models for Vietnamese NER. The core idea involves a two-stage pipeline: first, a rule-based component reduces label complexity by grouping relational and special categories; second, pre-trained language models are fine-tuned for high-precision extraction. A post-processing module is then utilized to restore fine-grained labels, preserving expressiveness for application-level usability. To mitigate data scarcity, a scalable data augmentation strategy leveraging Large Language Models (LLMs) is introduced to expand the label set without full re-annotation, which is a significant novelty of this work. The effectiveness of this method was evaluated across five specific-domain datasets, including logistics, wildlife, and healthcare. Experimental results demonstrate substantial improvements over strong RoBERTa-based baselines. Specifically, the proposed system achieved F1 scores of 90 percent in Customer Service, up from 83 percent; 84 percent in GAM, up from 73 percent; 83 percent in AI Fluent, up from 80 percent; 94 percent in PhoNER_Covid19, up from 91 percent; and 60 percent in Rare Wildlife, up from 36 percent. These findings confirm that the hybrid approach effectively captures the linguistic complexity of Vietnamese and contextual nuances in specialized domains, offering a robust contribution to low-resource NER research.
An ice shelf is a floating extension of a land-based ice sheet into the ocean. It plays a crucial role in slowing down the flow of land ice into the sea, thus stabilizing the ice sheet. However, this stabilizing effect can be weakened by ice calving, a process in which large fragments of ice detach from the ice shelf. Although ice calving is widely acknowledged as a major contributor to ice mass loss, and its frequency and magnitude are highly sensitive to the environmental forcing, the underlying physics-based mechanisms remain poorly understood, particularly under ocean wave actions. In this context, we developed a nonlocal peridynamics (PD) framework to model the ice calving process subjected to wave-induced frontal corrosion. The proposed physics-based PD framework enables investigation of the coupled effects of self-weight bending, buoyancy-induced foot loosening, and the ice calving process. To the authors' best knowledge, this work represents the first attempt to employ a physics-based peridynamics framework for simulating ice calving processes. Compared with conventional finite element methods (FEM), the PD framework naturally captures crack initiation, interaction, and propagation without the need for special numerical treatments, thereby providing a robust tool for simulating fracture phenomena under large deformations and long-term environmental loading. To quantitatively resolve fracture processes, we implemented a static first Piola-Kirchhoff virial stress formulation within the PD framework, allowing direct evaluation of stress concentration and energy release at evolving crack tips. Subsequently, the model is rigorously validated through one-to-one comparisons with finite-element stress fields, analytical beam-theory solutions, and recent field observations of wave-driven ice-shelf failure reported by Sartore et al. (2025).
Point discretization of curved surfaces is required in many applications ranging from object rendering to the solution of surface partial differential equations (PDEs). These applications often impose that surfaces are sampled with local regularity and global curvature adaptivity to maintain robustness and efficiency. Computing numerically well-conditioned point discretization is non-trivial, even for simple analytic curved surfaces. We present an algorithm for finding near-optimal surface point distributions governed by a prescribed length field on curved surfaces. The algorithm works by approximately minimizing a global potential over local point-point interactions. The optimization problem is solved using gradient descent, accelerated by line search to find optimal step sizes. We use a level-set method to describe the surface and perform all required projections without requiring additional surface-attractive forces. To further accelerate convergence, the algorithm dynamically fuses and inserts points where a local excess or lack of points is detected using an integral support measure. We test the proposed algorithm on a variety of shapes, ranging from parametric to non-parametric surfaces. We compute point distributions with different curvature adaptivity and show that the algorithm achieves low average deviation from the prescribed target spacing locally. Overall, the presented algorithm rapidly and robustly converges to the final number and distribution of surface points.
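As a concrete illustration of the core loop described above, the sketch below performs plain gradient descent on a short-range repulsive inter-point potential with closest-point projection onto a level set (the unit sphere, phi(x) = |x| - 1). It is a minimal sketch under assumed simplifications: a fixed step size instead of the paper's line search, a uniform target spacing, and no dynamic point fusion or insertion.

```python
# Minimal sketch (not the paper's code): potential minimization with
# level-set projection; fixed step size, uniform target spacing.
import numpy as np

rng = np.random.default_rng(0)
n = 200
pts = rng.normal(size=(n, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)  # start on the sphere

def project(p):
    """Closest-point projection for the level set phi(x) = |x| - 1."""
    return p / np.linalg.norm(p, axis=1, keepdims=True)

h = 0.35      # prescribed target spacing (uniform here for simplicity)
step = 0.05   # fixed step size; the paper accelerates this with line search
for it in range(300):
    diff = pts[:, None, :] - pts[None, :, :]          # pairwise offsets
    dist = np.linalg.norm(diff, axis=-1) + np.eye(n)  # avoid divide-by-zero
    # Short-range repulsion whose force vanishes beyond the support radius 2h.
    w = np.maximum(0.0, 2 * h - dist) / dist
    np.fill_diagonal(w, 0.0)
    grad = (w[:, :, None] * diff).sum(axis=1)         # repulsive "force"
    pts = project(pts + step * grad)                  # step, then project

nn = np.sort(np.linalg.norm(pts[:, None] - pts[None], axis=-1), axis=1)[:, 1]
print("mean nearest-neighbor spacing:", nn.mean())
```

The nearest-neighbor statistic printed at the end is the kind of local deviation from the prescribed spacing that the paper reports.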
AI data centers are increasingly becoming tightly coupled compute-energy systems, where workload placement, cooling demand, electricity procurement, storage operation, and carbon emissions interact over time. This paper studies carbon-aware compute-power scheduling for geographically distributed AI data centers with microgrid prosumer capabilities. We propose a mixed-integer linear programming (MILP) framework that jointly schedules rigid training jobs, routes elastic inference workloads, dispatches local generation and battery storage, and manages bidirectional grid interaction under latency, continuity, power-balance, and carbon-budget constraints. The model captures two key features of emerging AI infrastructure: heterogeneous workload flexibility and site-level energy prosumer operation. Experiments on synthetic yet practically motivated instances show that the proposed joint MILP substantially improves total operational benefit over compute-only and energy-only baselines while reducing emissions. The results further indicate that inference-routing flexibility is a major source of value, battery storage provides useful temporal flexibility, and local-generation-rich settings are particularly favorable. The framework provides a tractable optimization abstraction for sustainable and grid-interactive AI data centers.
Endovascular treatment of cerebral aneurysms aims to achieve functional occlusion and isolation of the aneurysm sac from blood flow. In clinical practice, treatment success is assessed primarily through digital subtraction angiography (DSA), which visualizes contrast-agent inflow and washout but does not directly resolve the thrombus formation driving early occlusion. We present a computational framework that couples acute fibrin thrombus formation with virtual angiography, enabling early thrombus growth to be interpreted through clinically familiar DSA-like imaging. Three common treatment strategies, namely endovascular coiling, flow diversion, and stent-assisted coiling, are modeled under pulsatile hemodynamics and linked to simulated contrast transport. Across three representative aneurysm morphologies, the simulations demonstrate that while devices reduce inflow, residual contrast access and trapping may persist, with early thrombus formation contributing substantially to perfusion suppression and altered washout patterns. These effects are clearly reflected in the virtual angiographic imaging. The importance of vortical structures in device-induced thrombosis is highlighted in one of the cases. By seeking to align modelling and simulation tools with clinically relevant metrics, with a particular focus on occlusion outcome, this work presents a good starting point for bridging the gap between these two paradigms.
Feedstock deformation during 3D printing of continuous fiber composites is a critical challenge in path planning and a main driver of manufacturing defects. This work addresses feedstock deformation during deposition through several experimental and numerical pathways. Experimental setups and numerical simulations are used to identify the main phenomena driving feedstock deformation: residual stress relief, drying, crystallization, and thermal stresses. A hybrid physics-based and data-driven modeling effort is undertaken, using Kelvin-Voigt viscoelastic modeling of the composite prepregs and a stabilized neural ODE for modeling drying and crystallization. The hybrid models identified from DMA and DSC experiments are used in robotic 3D printing to validate the deposition of a composite prepreg in real printing settings. The results show the model's ability to reproduce prepreg behavior far above the temperatures used in training, showcasing its robustness and generalization capability.
We present two new classes of causal models of decision-making agents. Our approach is motivated by the needs of modeling the economics of computing systems. These systems are composed of subsystems and can exhibit endogenous limits on cognitive resources and value discounting. Structural Causal Decision Models (SCDMs) expand on Structural Causal Influence Models (SCIMs). Like SCIMs, they explicitly represent the causal relationships between model variables and the payoffs of agent decisions. Additionally, agent decisions can be constrained by their causal antecedents, and SCDMs can have open root variables for which no probability distribution or structural equation is given. We show that SCDMs have a well-defined and computationally useful property of composability. Building on SCDMs, we then define a Structural Causal Decision Process (SCDP) as a recurring SCDM with a discount variable. SCDPs benefit from the useful composition properties of SCDMs. Moreover, SCDPs are strictly more expressive than POMDPs because they do not assume rational belief formation. Indeed, an SCDP can endogenously model the memory-formation process, and is thus useful for modeling resource-rational agents in dynamic settings. SCDPs are also capable of modeling variable discounting, a tool used widely in social scientific modeling. We posit that SCDPs are a useful framework for policy simulation for the digital economy, mechanism design for information systems, and digital twin modeling of cyberinfrastructure.
Physics-informed neural networks provide a mesh-free framework for solving partial differential equation-governed problems in solid mechanics. However, most existing formulations in linear elasticity still learn the displacement field directly, which does not explicitly exploit the analytic structure of two-dimensional elasticity and becomes restrictive for fracture problems with crack face discontinuities and crack tip singularities. Moreover, existing Kolosov-Muskhelishvili informed neural network formulations still rely on residual-based loss functions with multiple boundary and interface terms, whereas a variational concept has not yet been established. To address these issues, a variational Kolosov-Muskhelishvili informed neural network framework for two-dimensional linear elastic problems with and without cracks is proposed in this work. The solution is represented by two holomorphic Kolosov-Muskhelishvili potentials and trained through an energy-based loss function derived from the principle of minimum total potential energy. For crack problems, a discontinuous stress potential representation is further introduced to embed the crack face condition and crack tip singularity directly into the solution ansatz. The proposed framework is validated on a series of benchmark problems with and without cracks. The results show that the variational Kolosov-Muskhelishvili informed neural network can accurately predict stress and displacement fields as well as stress intensity factors. Compared with traditional neural network models, it achieves higher accuracy, simpler loss construction, and faster convergence in the considered cases. Overall, the proposed variational Kolosov-Muskhelishvili informed neural network provides an effective and physically consistent variational framework for two-dimensional linear elastic fracture analysis.
Hybrid conditioning on stress fields and loads produces accurate topologies without repeated simulations.
This work presents a diffusion transformer framework for data-driven structural topology optimization that combines the accuracy of physics-based methods with the efficiency of generative deep learning. Conventional approaches such as the Solid Isotropic Material with Penalization (SIMP) method require repeated finite element analyses at every iteration, making large-scale or real-time optimization computationally expensive. We propose a hybrid conditioning diffusion transformer (DiT) model that learns to generate near-optimal topologies directly from problem definitions, eliminating iterative analysis during inference. The model integrates spatially distributed conditioning through concatenated stress and strain fields and global conditioning via adaptive layer normalization (AdaLN) using scalar descriptors such as load position, magnitude, and prescribed volume fraction. A dataset of 30,000 two-dimensional SIMP-optimized structures was generated for training and evaluation. Results demonstrate that the proposed DiT achieves less than 1% compliance errors relative to ground-truth SIMP solutions while maintaining accurate volume fractions and structural connectivity. Deterministic DDIM sampling enables high-fidelity topology generation in seconds using as few as five denoising steps, enabling near-real-time performance. The hybrid conditioning diffusion transformer thus provides an efficient and scalable alternative to traditional topology optimization methods, with strong potential for integration into interactive computer-aided design workflows.
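A minimal sketch of the AdaLN-style global conditioning named above (module and tensor names are hypothetical, not the authors' code): scalar descriptors such as load position, magnitude, and prescribed volume fraction are embedded into a per-channel scale and shift that modulate normalized token activations.

```python
# Illustrative AdaLN sketch, assuming hypothetical names and sizes.
import torch
import torch.nn as nn

class AdaLN(nn.Module):
    def __init__(self, dim: int, cond_dim: int):
        super().__init__()
        # Affine parameters come from the condition, not from LayerNorm itself.
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.to_scale_shift = nn.Linear(cond_dim, 2 * dim)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); cond: (batch, cond_dim) global descriptors
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=-1)
        return self.norm(x) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)

x = torch.randn(2, 64, 128)            # token embeddings of a topology patch
cond = torch.tensor([[0.3, 1.5, 0.4],  # e.g. load x-position, magnitude,
                     [0.7, 0.8, 0.5]]) # prescribed volume fraction (made up)
print(AdaLN(128, 3)(x, cond).shape)    # torch.Size([2, 64, 128])
```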
Cathode particle fracture is widely recognised as a major degradation mechanism in lithium-ion batteries, yet cracking also permits electrolyte wetting of newly exposed internal surfaces, modifying interfacial reaction pathways. The mechanistic role of electrolyte wetting in redistributing reactions within cracked particles remains unclear. Here, we isolate this effect through a controlled comparison between (i) a fully coupled electro-chemo-mechanical model resolving lithium concentration, electrostatic potential, and stress fields in both the active material and the electrolyte inside and outside cracks, and (ii) a single-particle chemo-mechanical model employing the conventional uniform flux assumption. The coupled model predicts strong spatial heterogeneity in interfacial reaction rates, with flux amplification approximately 8x relative to the imposed uniform flux at the crack tip. Reaction redistribution, and thus lithium flux, is governed predominantly by local solid-state lithium concentration and stress variations, while electrolyte potential gradients inside cracks remain secondary under the conditions considered. Uniform flux models can underpredict delivered capacity by 25% at 1C-rate; this discrepancy increases at higher rates. They also underestimate tensile stresses throughout the delithiation process by 10%, directly affecting crack driving conditions. These results demonstrate that neglecting crack-electrolyte coupling leads to systematic underestimation of both utilisation limits and fatigue-relevant stress histories.
VGRSI derived from visibility graphs on asset prices generates trading signals yielding ~$340,000 total profit across DJI30, EUR/USD and XAU/USD.
Traditional technical analysis indicators, although widely used by market participants, are often not sufficiently effective. We propose the Visibility Graphs Relative Strength Index (VGRSI), based on backward visibility relations in the price of a financial instrument. Rescaled to the 0-100 range, it can generate profitable trading signals. The performance of the indicator was evaluated using an automated trading strategy based on a 30-day optimisation window and a 7-day test window for three instruments representing different asset classes: DJI30, EUR/USD and XAU/USD over the 2024-2025 period (503 trading days). The strategy based on VGRSI signals generated a profit of about USD 146,000 for DJI30, USD 69,000 for EUR/USD, and USD 125,000 for XAU/USD. This gives a total result of about USD 340,000, which corresponds to an average profit of about USD 676 per trading day, with a fixed investment of USD 1,000 to open a single trade. For all three assets, the strategy generated substantial profits while maintaining a moderate drawdown (10-18% relative to a portfolio value of USD 10,000), a relatively low trading intensity (3.3-4.8 trades per day) and high Sharpe ratio values (2.55-3.6). These results indicate that VGRSI constitutes a promising technical analysis tool that goes beyond the classical trend-following approach by exploiting the geometric properties of asset price fluctuations.
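The paper's exact VGRSI formula is not reproduced here, so the following is only a hedged sketch of the underlying idea: count, for each bar, how many earlier bars are visible under the natural-visibility criterion, then rescale the count to 0-100 over a rolling window.

```python
# Hedged sketch of a backward-visibility indicator; the published VGRSI
# definition may differ in its visibility rule and rescaling.
import numpy as np

def backward_visibility(y):
    """For each bar b, count earlier bars a visible from b under the
    natural-visibility criterion: no intermediate bar pokes above the
    straight line joining (a, y[a]) and (b, y[b])."""
    n = len(y)
    counts = np.zeros(n)
    for b in range(1, n):
        for a in range(b - 1, -1, -1):
            c = np.arange(a + 1, b)
            line = y[b] + (y[a] - y[b]) * (b - c) / (b - a)
            if c.size == 0 or np.all(y[c] < line):
                counts[b] += 1
    return counts

def vgrsi(y, window=30):
    v = backward_visibility(y)
    out = np.full(len(y), np.nan)
    for t in range(window, len(y)):
        w = v[t - window:t + 1]
        span = w.max() - w.min()
        out[t] = 100.0 * (v[t] - w.min()) / span if span > 0 else 50.0
    return out

prices = np.cumsum(np.random.default_rng(1).normal(size=200)) + 100
print(vgrsi(prices)[-5:])  # most recent indicator values in [0, 100]
```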
The co-optimization of geometry and physical parameters remains challenging in transient multiphysics systems involving moving boundaries, nonlinear material response, phase transitions, and competing objectives. Existing methods often optimize geometry and physical variables separately, rely on simplified steady-state physics, or require offline data generation and reduced design spaces. Here, we present an end-to-end differentiable co-optimization framework that couples an implicit neural representation of geometry with a JAX-compiled Eulerian multiphysics solver. Geometry is represented as a signed distance field using Fourier-feature-encoded spatial coordinates, while boundary conditions, initial conditions, process controls, and material parameters are optimized within the same differentiable loop. Continuous relaxations represent non-smooth physical transitions while preserving compatibility with reverse-mode automatic differentiation and backpropagation through time. We demonstrate the framework using a transient hamburger-cooking benchmark, selected as an interpretable multiphysics problem rather than a culinary optimization exercise. The benchmark combines conductive and convective heat transfer, latent energy effects, moisture and fat transport, shrinkage-induced geometry evolution, evolving contact boundary conditions, flipping-induced boundary-condition changes, and competing quality objectives. Results show that geometry-only optimization modifies shape to relieve thermal bottlenecks, while joint co-optimization distributes the design response across geometry, material state, process variables, and boundary conditions through gradients propagated over the full transient rollout.
We introduce HyCOP, a modular framework that learns parametric PDE solution operators by composing simple modules (advection, diffusion, learned closures, boundary handling) in a query-conditioned way. Rather than learning a monolithic map, HyCOP learns a policy over short programs (which module to apply and for how long), conditioned on regime features and state statistics. Modules may be numerical sub-solvers or learned components, enabling hybrid surrogates evaluated at arbitrary query times without autoregressive rollout. Across diverse PDE benchmarks, HyCOP produces interpretable programs, delivers order-of-magnitude OOD improvements over monolithic neural operators, and supports modular transfer through dictionary updates (e.g., boundary swaps, residual enrichment). Our theory characterizes expressivity and gives an error decomposition that separates composition error from module error and doubles as a process-level diagnostic.
The reconstruction of physically valid transport fields from subject-specific imaging data is a fundamental challenge in image-based computational modeling due to measurement noise, modeling uncertainties and discretization errors. Without a methodology to construct models that faithfully reflect the underlying physics, mechanistic understanding of complex biological systems is inherently limited. In this work, we address this challenge in the glymphatic system, the brain's waste-clearance network, where cerebrospinal fluid (CSF) is transported through perivascular spaces into the brain parenchyma to facilitate metabolic waste removal. We introduce a computational framework for the high-fidelity reconstruction of subject-specific glymphatic transport fields from spatiotemporal imaging data. The formulation utilizes an advection-diffusion model with a velocity decomposition that imposes mass conservation, enabling the recovery of solenoidal (divergence-free) velocity fields through the solution of a constrained inverse problem. The system is discretized using immersed isogeometric analysis with quadratic B-spline basis functions, providing smooth, high-continuity solutions and inherent regularization of imaging noise. We demonstrate the framework's utility by using contrast-enhanced magnetic resonance imaging of tracer transport in a mouse brain, obtaining spatially varying estimates of CSF velocity, diffusivity, and clearance parameters. Forward simulations using the recovered fields show close agreement with experimental observations, validating the framework's ability to characterize complex transport dynamics while preserving physical integrity. This approach provides a generalizable methodology for the robust inference of physically consistent transport fields from imperfect imaging data, with broad applicability to the image-guided modeling of biological and engineering systems.
The purpose of the current work is the development of an approach to account for quasi-static mechanical equilibrium in empirical (i.e., data-based) models for the stress field employing neural approximations (NAs), which include neural networks (NNs) and neural operators (NOs), in particular Fourier NOs (FNOs). Rather than including such constraints from physics in the loss function as done in the (now standard) physics-informed approach, the current approach incorporates or "encodes" such constraints directly into the architecture of the NA. As a result, both NA training and output are physically constrained in the physics-encoded approach, in contrast to the physics-informed approach, in which only training is physically constrained. For the current constraint of divergence-free stress, a novel encoding approach based on a stress potential is proposed. As a "proof-of-concept" example application of the current approach, a physics-encoded FNO (PeFNO) is developed for a heterogeneous polycrystalline material consisting of isotropic elastic grains and subject to uniaxial extension. Stress field data for this purpose are obtained from the numerical solution of corresponding boundary-value problems for quasi-static mechanical equilibrium. For comparison with the PeFNO, this data is also employed to develop an analogous physics-guided FNO (PgFNO) and physics-informed FNO (PiFNO). As expected theoretically, and confirmed by this computational comparison, for comparable accuracy of the stress field itself as compared to the data, the stress field output by the trained and tested PeFNO is significantly more accurate in satisfying mechanical equilibrium than the output of either the PgFNO or the PiFNO.
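The abstract does not spell out the proposed stress potential, so as an illustrative classical analogue (not necessarily the authors' novel construction) the snippet below uses the 2D Airy potential: deriving sigma_xx = d2phi/dy2, sigma_yy = d2phi/dx2, sigma_xy = -d2phi/dxdy makes div(sigma) vanish identically, which is exactly the kind of equilibrium constraint a physics-encoded architecture can build into its output head.

```python
# Airy-potential sketch of divergence-free stress (classical analogue only).
import numpy as np

n, L = 128, 1.0
x = np.linspace(0, L, n)
X, Y = np.meshgrid(x, x, indexing="ij")
phi = np.sin(2 * np.pi * X) * np.cos(2 * np.pi * Y)  # any smooth potential

dx = x[1] - x[0]
d = lambda f, ax: np.gradient(f, dx, axis=ax)
sxx = d(d(phi, 1), 1)        # d^2 phi / dy^2
syy = d(d(phi, 0), 0)        # d^2 phi / dx^2
sxy = -d(d(phi, 0), 1)       # -d^2 phi / dx dy

# Discrete divergence of the stress field: zero to roundoff, because the
# along-axis difference operators commute regardless of the choice of phi.
div1 = d(sxx, 0) + d(sxy, 1)
div2 = d(sxy, 0) + d(syy, 1)
print(np.abs(div1).max(), np.abs(div2).max())  # ~1e-13: equilibrium encoded
```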
Understanding the behavior of HPC facility users and how computational resources are requested and utilized is not only crucial for cluster productivity but also essential for designing and constructing future exascale HPC systems. This paper tackles Challenge 4, 'Analyzing Resource Utilization and User Behavior on Titan Supercomputer', of the 2021 Smoky Mountains Conference Data Challenge. Specifically, we dig into the records of Titan to discover patterns and extract relationships. We explore the workload distribution and usage patterns from resource manager system logs, GPU traces, and scientific-area information collected from the Titan supercomputer, and we examine how resource utilization and user behavior change over time. Using data science methods such as correlation analysis, clustering, and neural networks, we investigate how projects, jobs, nodes, GPUs, and memory are related. We provide insights about the seasonality of resource usage and a predictive model for forecasting utilization of the Titan supercomputer. In addition, the described methodology can be easily adopted in other HPC clusters.
Modern package designs make use of technologies such as backside power delivery (BSPD) and 3D stacked chiplets that require accounting for the heterogeneity in back end of the line (BEOL) structures in hot-spot prediction. Multiscale homogenization strategies have been demonstrated to be effective for steady-state simulations; however, accurate 3D transient simulations that include BEOL structures remain an open challenge.
In this work, we demonstrate a transient thermal workflow that accounts for 3D heterogeneous BEOL structures in problems with strong and weak temporal scale separation, under the assumption of temperature-independent constitutive properties. Our workflow, based on Bloomfield et al. (2025), automatically extracts, meshes, and homogenizes thermal properties from GDSII and OASIS files to construct thermal property maps.
Property maps (heat capacity and conductivity) have been generated for a 1 mm by 1 mm SoC-style model die constructed with LibreLane, for 100 by 100 grids with 5 micron by 5 micron representative volume elements (RVEs) and 50 by 50 grids with 10 micron by 10 micron RVEs. Expressions for a transient effective conductivity are provided, and the impact of transient effects is demonstrated for a single RVE. Finally, transient conductivity maps are provided for a time-integration timestep of dt = 0.001.
Optimization of a quarter-car model shows common rebound-to-compression ratios work best for specific severe conditions rather than all scenarios.
Asymmetric damping is widely used in passive vehicle suspensions, with rebound damping often recommended to exceed compression damping by a factor of two to three. Despite its prevalence, this guideline remains largely empirical and lacks a systematic derivation based on vehicle dynamics and excitation conditions. This paper presents a scenario-driven optimization framework that provides a principled explanation for the effectiveness of asymmetric damping. A minimal quarter-car model is employed to isolate the key mechanisms governing the trade-off between ride comfort, road holding, and transient response, using standardized ISO 8608 road excitations. Rebound and compression damping ratios are treated as independent design variables, and optimal configurations are identified via a stochastic Cross-Entropy algorithm applied to a non-convex, simulation-based objective function. Performance is assessed through ISO 2631 weighted RMS acceleration, tire-ground contact force variability, and settling time. The results show that symmetric damping is often sufficient under moderate excitation, whereas asymmetric damping becomes necessary under severe conditions, with commonly cited rebound-to-compression ratios emerging as scenario-dependent near-optimal solutions rather than universal constants.
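For readers unfamiliar with the Cross-Entropy method mentioned above, the toy sketch below shows the sample/elite-update loop on two damping ratios; the objective is a made-up placeholder, not the paper's ISO-weighted quarter-car cost.

```python
# Cross-Entropy method sketch; cost() is a stand-in, not the paper's model.
import numpy as np

def cost(zeta):
    # Placeholder objective penalizing distance from a fictitious optimum.
    zr, zc = zeta  # rebound and compression damping ratios
    return (zr - 0.9) ** 2 + (zc - 0.35) ** 2 + 0.1 * abs(zr * zc - 0.3)

rng = np.random.default_rng(0)
mu, sigma = np.array([0.5, 0.5]), np.array([0.3, 0.3])
n_samples, n_elite = 100, 10
for it in range(30):
    pop = rng.normal(mu, sigma, size=(n_samples, 2)).clip(0.05, 2.0)
    elite = pop[np.argsort([cost(p) for p in pop])[:n_elite]]
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6  # refit sampler

print("optimal (rebound, compression):", mu, "ratio:", mu[0] / mu[1])
```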
Design Structure Matrix (DSM) modularization, the task of partitioning system elements into cohesive modules, is a fundamental combinatorial challenge in engineering design. Traditional methods treat modularization as a pure graph optimization, without access to the engineering context embedded in the system. Building on prior work on LLM-based combinatorial optimization for DSM sequencing, this paper extends the method to modularization across five cases and three backbone LLMs. Our method achieves near-reference quality within 30 iterations without requiring specialized optimization code. Counterintuitively, domain knowledge, beneficial in sequencing, consistently impairs performance on more complex DSMs. We attribute this to semantic misalignment between the LLM's functional priors and the purely structural optimization objective, and propose the semantic-alignment hypothesis as a testable condition governing knowledge effectiveness with LLMs. Ablation studies identify the most effective input representation, objective formulation, and solution pool design for practical deployment. These findings offer practical guidance for deploying LLMs in engineering design optimization.
Adaptive mesh refinement (AMR) is indispensable for efficient finite element analyses. However, its performance depends not only on the refinement itself but also on the strategy used to mark elements for refinement and the way it is tuned. This work compares classical marking methods (maximum, Dörfler bulk-chasing, quantile) with non-classical, statistically based approaches (z-score, Isolation Forest), all driven by the residual-based Kelly error estimator and tested on steady solid and fluid mechanics problems. The study finds quantile and z-score marking to be the most robust, Dörfler effective for large bulk parameters, and maximum marking sensitive to irregular fields. Isolation Forest can rival the top classical methods with a generous contamination level but may fail under aggressive settings. These results offer practical guidance for selecting marking strategies that balance refinement aggressiveness and computational cost in adaptive FEM workflows.
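The three classical rules and the z-score rule compared above are easy to state directly on a vector of per-element error indicators; the thresholds below are illustrative defaults, not the paper's tuned values.

```python
# Marking-strategy sketch over per-element Kelly indicators eta.
import numpy as np

def mark_maximum(eta, theta=0.5):
    """Mark elements within a fraction theta of the largest indicator."""
    return eta >= theta * eta.max()

def mark_doerfler(eta, theta=0.6):
    """Bulk chasing: smallest set whose indicator sum reaches theta of total."""
    order = np.argsort(eta)[::-1]
    k = np.searchsorted(np.cumsum(eta[order]), theta * eta.sum()) + 1
    mask = np.zeros(eta.size, dtype=bool)
    mask[order[:k]] = True
    return mask

def mark_quantile(eta, q=0.9):
    """Mark the top (1 - q) fraction of elements."""
    return eta >= np.quantile(eta, q)

def mark_zscore(eta, k=1.5):
    """Mark statistical outliers above mean + k standard deviations."""
    return eta >= eta.mean() + k * eta.std()

eta = np.random.default_rng(0).lognormal(size=1000)
for f in (mark_maximum, mark_doerfler, mark_quantile, mark_zscore):
    print(f.__name__, int(f(eta).sum()), "elements marked")
```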
The verification of bullion coin authenticity is essential for maintaining integrity within the precious metals market; however, the increasing sophistication of counterfeits has rendered traditional inspection methods insufficient. This paper proposes a non-destructive verification framework based on acoustic frequency analysis and deep neural networks. The methodology leverages the unique acoustic fingerprint of a coin, a physical signature determined by its material composition, mass, and geometry, captured through mechanical excitation. We implement a synergistic dual-model architecture consisting of an autoencoder that reconstructs the spectrum for anomaly detection and a deep learning classifier for coin type identification. To address the challenges of environmental noise and limited dataset diversity, a dynamically calculated anomaly threshold and data augmentation techniques were employed. Experimental results demonstrate that the integrated system achieves high precision in distinguishing authentic specimens from high-quality counterfeits, maintaining stability across varying recording conditions and devices. Beyond bullion authentication, the study highlights the scalability of the proposed non-destructive testing method for assessing the safety of critical components in the automotive and aerospace industries.
The pressure on Water Resource Recovery Facility (WRRF) operators to treat wastewater efficiently is greater than ever because of the water crisis, driven by climate change effects and increasingly restrictive regulations. Technicians and researchers need to evaluate WRRF performance to ensure maximum efficiency. For this purpose, numerical techniques such as CFD have been widely applied in the wastewater sector to model biological reactors and secondary settling tanks with high spatial and temporal accuracy. However, limitations such as complexity and a steep learning curve prevent wider CFD adoption among wastewater modeling experts. This paper presents HydroSludge, a framework that provides a series of tools to simplify the implementation of processes and workflows in a WRRF. This work leverages HydroSludge to preprocess existing data, aid the meshing process, and perform CFD simulations. Its intuitive interface makes it an effective tool for increasing the efficiency of wastewater treatment.
In this article, we propose a simple and efficient hyperreduced strain-space model order reduction (MOR) approach for hyperelastic representative volume elements (RVEs), called Empirical Material Sampling and Linearisation (EMSL). The approach is conceptually motivated by the Empirically Corrected Cluster Cubature (E3C) of Wulfinghoff and Hauck [36], but also draws on ideas from previous work on incremental variational structure-preserving strain-space model order reduction techniques to achieve rapid evaluations in the online phase.
As in E3C, we group the material domain into regions of similar behaviour, and query the material routine at one reference strain value per region. However, we sample these strains only once per load increment, at empirically estimated expected strain values. We use the reference material tangent and strain modes obtained via the Proper Orthogonal Decomposition (POD) to compute a linearised estimate of the stress response in the remainder of the material cluster. In contrast to E3C, which approximately integrates the exact material law, EMSL could therefore be said to exactly integrate an approximation of the material behaviour. The resulting reduced problem is affine in each load step, allowing for integration over the entire computational domain via operations which can readily be preprocessed in the offline phase. Since a linear equation system is obtained in each load increment, no Newton iterations are required in the online phase.
For benchmark comparisons, we propose a variant of two popular reduced cubature schemes in strain space and recall the E3C algorithm proposed by Wulfinghoff et al. On an example hyperelastic RVE problem with a porous geometry, we show that EMSL Pareto-dominates competing strain-space approaches in terms of the tradeoff between accuracy and runtime.
FeatureFox combines binary edge classification on B-Rep graphs with connected-component instance recovery to deliver sample-efficient panoptic feature recognition.
Automatic feature recognition (AFR) on B-Rep 3D-CAD models is central to CAD/CAM automation, yet most learning-based methods are complex, data-hungry, and evaluate instance grouping and semantic labeling separately. We present FeatureFox, a panoptic AFR pipeline that outputs machining instances with semantic labels: a calibrated binary edge classifier on enriched edge attributes localizes feature boundaries, instances are recovered as connected components in a pruned face-adjacency graph, and a per-instance classifier predicts the machining class from aggregated subgraph attributes. We evaluate on MFInstSeg using Panoptic Quality (PQ), which jointly scores instance separation and semantic correctness. FeatureFox is substantially more sample- and compute-efficient than the deep baseline AAGNet, reaching PQ > 0.9 with ~250 training parts versus ~5,000 for AAGNet, and training on the full MFInstSeg set takes seconds on a GPU. On the full training set, AAGNet surpasses FeatureFox marginally in PQ, while FeatureFox remains slightly ahead in feature-level recognition and localization accuracy. Finally, leveraging its low data requirement, we train FeatureFox on 270 manually labeled industrial CAD parts and show qualitative generalization to an unseen real industrial part, indicating practical real-world applicability.
Technology mapping is a critical yet challenging stage in logic synthesis. While Large Language Models (LLMs) have been applied to generate optimization scripts, their potential for core algorithm enhancement remains untapped. We introduce MappingEvolve, an open-source framework that pioneers the use of LLMs to directly evolve technology mapping code. Our method abstracts the mapping process into distinct optimization operators and employs a hierarchical agent-based architecture, comprising a Planner, Evolver, and Evaluator, to guide the evolutionary search. This structured approach enables strategic and effective code modifications. Experiments show our method significantly outperforms direct evolution and strong baselines, achieving 10.04% area reduction versus ABC and 7.93% versus mockturtle, with 46.6%-96.0% S_overall improvement on EPFL benchmarks, while explicitly navigating the area-delay trade-off. Our code and data are available at https://github.com/Flians/MappingEvolve.
Matrix-free Galerkin GMG solver for single-GPU 3D SIMP elasticity achieves 1.62x-3.12x speedups over Jacobi-PCG, with pass rates of 7/9 to 1/9 as problem size grows.
Large 3D SIMP studies require repeated elasticity solves for density-dependent operators whose finest matrices are expensive to assemble and whose conditioning degrades under high contrast. We study this linear-solver layer rather than claiming end-to-end optimization acceleration. The solver builds a matrix-free Galerkin geometric multigrid (GMG) hierarchy around a fused fine operator: the finest level remains matrix-free, the first coarse level is assembled by local Galerkin aggregation, and deeper levels use sparse Galerkin products. The practical default is FP32-GMG; BF16 is evaluated as a guarded mixed-precision variant and diagnostic stress test, not as the main speed mechanism. In a 27-case heterogeneous cantilever sweep, pass rates under a 200-iteration budget are 7/9, 4/9, and 1/9 at 64k, 216k, and 512k elements; converged-only mean iteration counts are about 112, 134, and 146. On uniform ρ = 0.5, p = 3 solves, FP32-GMG gives 1.62x, 1.75x, and 3.12x wall-time ratios relative to the capped flat Jacobi-PCG baseline at the same sizes; that non-converged baseline reaches the 200-iteration cap in all timed trials. BF16-GMG is not faster than FP32-GMG. In 18 fixed-seed heterogeneous BF16 validation cases, 7/18 converge, matching the FP64 count, and 11 cases that pass the spectral screen still fail the 500-iteration cap; the screen is therefore diagnostic rather than a convergence certificate. The largest reported solve is a 1M-element uniform-modulus system solved in 1.50 ± 0.58 s with an 8.66 GiB hierarchy-allocation delta during setup, not a peak-memory trace; this point is reported as uniform scaling, not heterogeneous robustness evidence. The contribution is therefore a bounded single-GPU solver result built on an inherited Level 0 matrix-free operator: a Galerkin GMG hierarchy, direct BF16 guard evidence, and an explicit failure-mode screen for structured 3D SIMP linear systems.
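The "sparse Galerkin products" used for the deeper levels follow the standard algebra A_c = P^T A P; the tiny dense 1D sketch below (not the paper's GPU code) shows a linear-interpolation prolongation applied to a Poisson matrix yielding the expected tridiagonal coarse operator.

```python
# Galerkin coarse-operator sketch on a 1D Poisson matrix (dense, for clarity).
import numpy as np

def laplacian(n):
    """Standard 1D Laplacian stencil [-1, 2, -1] on n interior nodes."""
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def prolongation(nc):
    """Linear interpolation from nc coarse to 2*nc + 1 fine interior nodes."""
    nf = 2 * nc + 1
    P = np.zeros((nf, nc))
    for j in range(nc):
        i = 2 * j + 1        # fine node coinciding with coarse node j
        P[i, j] = 1.0
        P[i - 1, j] = 0.5    # neighbors get half weight
        P[i + 1, j] = 0.5
    return P

nc = 7
A = laplacian(2 * nc + 1)
P = prolongation(nc)
Ac = P.T @ A @ P              # Galerkin coarse-level operator
print(np.round(Ac[:3, :3], 2))  # tridiag(-0.5, 1, -0.5), as expected
```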
Mapping systems to laser phases lets the LPU emulator cut time-to-solution for multi-banded sparse matrices compared to GPU solvers.
Solving large, sparse linear systems is a fundamental workload in scientific computing and engineering simulations, often dominating runtime and energy consumption in high-performance computing (HPC) applications. In this work, we explore an alternative computing paradigm based on analog optical processing, implemented through the Laser Processing Unit (LPU). The LPU encodes linear systems into the dynamics of coupled lasers within an optical cavity, where the steady-state phases of the optical fields correspond to the solution of Ax = b. We present a mapping of general linear systems, both dense and sparse, onto the LPU architecture and evaluate its performance using representative matrices from the SuiteSparse collection. Using an LPU emulator, we benchmark convergence behavior and time-to-solution for sparse, multi-banded matrices against established Krylov subspace methods (CG, GMRES, BiCGSTAB, and others) executed on a modern GPU platform. Our results indicate that the LPU can achieve significantly lower time-to-solution for selected problem classes, highlighting the potential of optical analog computing for accelerating iterative linear solvers. These findings suggest that optical processors such as the LPU could serve as accelerators for linear systems, in particular those that are structured and/or repeatedly solved, offering advantages in latency, parallelism, and energy efficiency. We discuss current limitations, including scaling constraints and precision considerations, and outline directions toward hybrid optical-digital computing systems.
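As a conceptual toy only (the LPU's actual laser-cavity dynamics are not reproduced here), an analog steady state solving Ax = b can be pictured as the gradient flow dx/dt = b - Ax, whose fixed point is the solution for a symmetric positive definite A:

```python
# Toy analog-steady-state analogy: Euler integration of dx/dt = b - Ax.
import numpy as np

rng = np.random.default_rng(0)
n = 50
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)          # SPD test matrix
b = rng.normal(size=n)

x = np.zeros(n)
dt = 0.5 / np.linalg.norm(A, 2)      # stable step below 2 / lambda_max
for _ in range(2000):
    x += dt * (b - A @ x)            # the "physics" relaxes to the solution

print("residual norm:", np.linalg.norm(A @ x - b))
```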
Localized features such as singularities, sharp gradients, discontinuities, and moving sources require adaptive finite element discretizations. Conventional refinement strategies introduce significant computational overhead through mesh-topology modifications, constraint handling for non-matching interfaces, and repeated remeshing with state transfer. This work presents an unfitted multi-level hp-refinement strategy that enriches a fixed base discretization by independently positioned overlay meshes. The global approximation space is constructed by superposition of the active spaces across all refinement levels, while homogeneous constraints on artificial overlay boundaries ensure global $C^0$ continuity. Coupling between non-matching meshes is assembled over admissible integration regions defined by intersections of element partitions, enabling reuse of standard element-level finite element routines within a lightweight superposition framework. In contrast to fitted multi-level approaches, overlay boundaries are not required to align with underlying mesh interfaces. This reduces inter-level coupling and allows refinement zones to be inserted, translated, and removed without modifying the base discretization. Numerical studies for discontinuous and singular benchmark problems, as well as a moving source, demonstrate the performance of the method. The unfitted approach retains exponential convergence for non-smooth problems and achieves improved error-to-cost ratios compared to fitted multi-level hp-refinement. For representative cases, comparable accuracy is obtained with substantially fewer degrees of freedom, while localized high-order refinement accurately tracks moving features.
The time-dependent deformation of concrete, particularly creep, remains a key challenge for reliable and material-efficient design. Experimental results show that tailored preloading, i.e., short-term loads exceeding the subsequent sustained load, can reduce both the magnitude and variability of creep strains, an effect that may be associated with beneficial microstructural changes. Building on these insights, this article employs Gaussian Process Regression (GPR) to calibrate analytical creep models, incorporating the effects of preloading intensity, timing, and concrete age into conventional predictions. The study pursues three main objectives: (i) calibrating a creep model using GPR based on experimental data, (ii) evaluating the impact of training data selection and preparation, and (iii) analysing model performance depending on the available experimental duration. The results demonstrate that GPR can improve model accuracy, quantify uncertainties, and support optimal test planning, while also enhancing understanding of preloading effects and contributing to more reliable and sustainable concrete creep predictions.
With the rapid advancement of computer technologies enabling fast calculations of complex structures, numerical methods have become a central tool in engineering sciences, while physical models have increasingly receded into the background. Nevertheless, owing to their clarity and comprehensibility, these former engineering tools remain of great value and their use can still be highly relevant today. Using the example of the scale model of the Lillebælt Bridge, developed by the Copenhagen engineers Christen Ostenfeld and Wriborg Jønson and given for research purposes to the Bauhaus-Universität Weimar, this paper illustrates how physical models can still serve as useful instruments in research and teaching. By applying operational modal analysis, the natural frequencies and damping ratios of the bridge model are experimentally determined, which in turn can serve as reference data for the calibration and validation of numerical models.
In practical early-stage battery-electric vehicle studies, analysis workflows may become fragmented across spreadsheets, notebooks, and project-specific scripts, making reuse, audit, and extension harder. VEHRON is an open-source Python framework for a deterministic, traceable workflow built around prescribed-speed longitudinal simulation of battery-electric vehicles using validated YAML configuration, packaged drive-cycle resources, interchangeable subsystem models, and auditable case outputs. VEHRON currently runs as a command-line workflow in which a vehicle definition and a testcase definition are combined to execute a simulation, emit a flat time series, and write a case package containing copied inputs, resolved configuration, summary metadata, and standard plots. Architecturally, VEHRON is organized around a small simulation engine, a shared state bus, a registry of model selections, schema-based configuration loading, and extension points for custom battery and HVAC models loaded from external Python files. VEHRON currently focuses on battery-electric longitudinal simulation with low-order battery, thermal, auxiliary-load, and HVAC models. This paper explains how VEHRON is structured, how it is used, which models it implements, and where its present limits lie. Source code is available at https://github.com/vehron-dev/vehron, with archived release metadata recorded under DOI https://doi.org/10.5281/zenodo.19820111.
Mean-variance optimization still leads in cumulative return and Sharpe ratio when fed accurate model inputs.
This study proposes a portfolio optimization framework that integrates advanced deep learning architectures with traditional financial models to enhance risk-adjusted performance. Using historical data from 2015-2023 across equities, ETFs, and bonds, the research evaluates the predictive power of Graph Neural Networks (GNNs), Deep Reinforcement Learning (DRL), Transformers, and Autoencoders. The models jointly address covariance estimation, return forecasting, dynamic asset allocation, and dimensionality reduction. Hybrid approaches such as Transformer+GNN and Autoencoder+DRL are also explored to capture both relational and temporal market structures. Performance is assessed through backtesting using metrics including volatility, cumulative return, maximum drawdown, annualized return, and Sharpe ratio across seven strategies, including Equal-Weighted, 60/40 allocation, and Mean-Variance Optimization (MVO). Results show that hybrid models provide superior stability and risk control, with Transformer+GNN achieving the lowest volatility and drawdown. MVO, when paired with well-calibrated inputs, delivers the highest cumulative return and Sharpe ratio, highlighting the continued relevance of traditional methods. Standalone DRL underperforms due to limited structural awareness, while Autoencoders exhibit behavior similar to Equal-Weight strategies, emphasizing the need for dynamic policy learning. These findings align with existing literature on relational modeling and feature compression in finance. Overall, the study demonstrates that combining deep learning with financial theory yields robust and adaptive portfolio strategies and suggests exploring latent representations within traditional optimization frameworks to improve scalability and performance.
Wind-traffic interactions strongly influence the dynamic response of long-span bridges, yet the two loads are often analysed independently. This work models concurrent wind and traffic and demonstrates that the combined response differs from linear superposition. Traffic is synthesised from volumes, composition, and vehicle dynamics, with vehicles represented as 3D systems. Vehicle-pavement interaction adopts ISO roughness with transverse coherence, and wind turbulence follows the Kármán spectrum with Davenport coherence. A quasi-steady aerodynamic model supports time-history analysis under combined actions. Results indicate non-linear interactions that change the response, revealing limitations of conventional design assumptions. The framework enables accurate performance assessment and informs serviceability criteria and design optimisation for long-span bridges.
Gaussian basis functions provide an efficient and flexible alternative to spline activations in KANs. In this work, we introduce the partition-of-unity Gaussian KAN (PU-GKAN), a Shepard-type normalized Gaussian KAN in which the Gaussian basis values on each edge are divided by their local sum over fixed centers. This produces a partition-of-unity feature map with trainable coefficients, while preserving the standard edge-based KAN structure. The normalized construction gives exact constant reproduction at the edge level and admits an explicit finite-feature kernel interpretation.
We formulate both the standard Gaussian KAN (GKAN) and PU-GKAN from a finite-feature and additive-kernel viewpoint, making the induced layer kernels and empirical feature matrices explicit. Using the first-layer feature matrix as the reference object, we adopt a practical scale-selection interval for ε, with the lower endpoint determined by adjacent-center overlap and the upper endpoint determined by a conservative conditioning threshold. Numerical experiments show that PU-GKAN reduces sensitivity to ε, improves validation accuracy for most smooth and moderately non-smooth targets, and gives more stable training behavior. The benefit persists across sample-size and center-number sweeps, higher-dimensional architectures, Matérn RBF bases, and physics-informed examples involving Helmholtz and wave equations. These results indicate that Shepard-type partition-of-unity normalization is a simple and effective stabilization mechanism for RBF-based KANs.
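The Shepard-type normalization described above is a one-line change to the feature map; the sketch below verifies the partition-of-unity property and the exact constant reproduction it implies at the edge level.

```python
# PU-GKAN feature-map sketch: normalized Gaussian features on fixed centers.
import numpy as np

def gaussian_features(x, centers, eps):
    """Raw GKAN features: exp(-(eps * (x - c_j))^2) for each center c_j."""
    return np.exp(-(eps * (x[:, None] - centers[None, :])) ** 2)

def pu_gaussian_features(x, centers, eps):
    """Shepard normalization: divide each feature by the local sum."""
    phi = gaussian_features(x, centers, eps)
    return phi / phi.sum(axis=1, keepdims=True)

x = np.linspace(-1, 1, 200)
centers = np.linspace(-1, 1, 12)
psi = pu_gaussian_features(x, centers, eps=4.0)
print(np.allclose(psi.sum(axis=1), 1.0))  # partition of unity -> True

# An edge learns coefficients c so its function is psi @ c; with all c_j
# equal, the edge reproduces that constant exactly.
c = np.full(12, 3.7)
print(np.allclose(psi @ c, 3.7))          # exact constant reproduction -> True
```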
Learning to solve the Alternating Current Optimal Power Flow (AC-OPF) problem by neural networks (NNs) is a promising approach in real-time applications. Existing methods to ensure the physical feasibility of NN outputs embed a power flow (PF) solver within networks. However, the gradient through the PF solver, namely, implicit differentiation, needs manual Jacobian derivation and the solution of linear systems, which is computationally prohibitive and hinders integration with modern automatic differentiation (AD) frameworks. To address these challenges, we propose FPL-OPF, a novel unsupervised learning framework that incorporates a Fast Physics-aware Layer for AC-OPF problems. FPL-OPF embeds a fast PF iterative solver within the NN and takes solely the last few or even the final iterations into the AD graph. This design ensures high computational efficiency for both the forward and backward passes, circumventing complex custom backward implementations. Theoretically, we rigorously prove that the gradient from this design serves as a high-fidelity surrogate of the true implicit gradient under mild conditions. Extensive experiments demonstrate that FPL-OPF achieves significant speedups over state-of-the-art unsupervised learning approaches, while maintaining near-zero constraint violations and competitive optimality. Our code is available at https://github.com/wowotou1998/fpl-opf
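The core trick, keeping only the last few solver iterations in the AD graph, can be sketched on a toy fixed-point map (this stands in for the power flow solver and is not the FPL-OPF code):

```python
# Truncated-backprop sketch: bulk iterations run outside the AD graph,
# only the final K iterations are differentiated.
import torch

def fixed_point_map(v, theta):
    """Toy contraction standing in for one PF iteration; theta mimics the
    NN output that parameterizes the solve (hypothetical names)."""
    return 0.5 * v + 0.25 * torch.tanh(theta - v)

def solve_with_truncated_grad(theta, n_iters=100, keep_last=3):
    v = torch.zeros_like(theta)
    with torch.no_grad():                    # bulk of the solve: no graph
        for _ in range(n_iters - keep_last):
            v = fixed_point_map(v, theta)
    for _ in range(keep_last):               # only these enter the AD graph
        v = fixed_point_map(v, theta)
    return v

theta = torch.tensor([0.8, -0.3], requires_grad=True)
v = solve_with_truncated_grad(theta)
v.sum().backward()
print(v.detach(), theta.grad)  # surrogate gradient from the last 3 iterations
```

Near the fixed point each extra retained iteration changes the gradient less and less, which is the intuition behind the paper's claim that the truncated gradient is a high-fidelity surrogate of the implicit one.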
Recent years have seen growing application potential for Lattice-skin Plate Structures in advanced manufacturing fields such as aerospace and automotive engineering. For multiscale performance evaluation of such structures, conventional homogenization methods for lattice-filled volume structures are often used for equivalent analysis. However, in finite-thickness Lattice-skin Plate Structures, periodic boundary conditions imposed along the three orthogonal directions of the representative cell cannot adequately capture the boundary effect of the free surfaces in the thickness direction, which introduces bias into the prediction of effective properties. To reduce this bias, this study develops and open-sources a homogenization method for Lattice-skin Plate Structures, forming an open-source computational framework for this class of structures. Representative numerical examples show that the framework can stably extract effective plate/shell stiffness matrices and can be extended to predict multiphase material properties and analyze steady-state heat conduction. The tool provides an open and reusable analysis foundation for the high-fidelity design of multifunctional lightweight structures.
Under the 6G wireless network evolution, the low-altitude Internet of Things (IoT), supported by unmanned aerial vehicles (UAVs) with Integrated Sensing and Communication (ISAC) capabilities, provides ground sensing networks with advanced real-time monitoring and data collection. To maximize the data collection volume from distributed IoT nodes, AI-powered data collection technology plays a critical role in enabling intelligent decision-making, and deep reinforcement learning (DRL) has gained particular attention. However, existing DRL-based work on UAV-assisted IoT data collection rarely addresses problems such as unknown interference and dynamic data volume. Moreover, these DRL models have high computational requirements and slow convergence, making them difficult to deploy on UAVs with limited payload and computing power. To address these challenges, a hierarchical deep reinforcement learning (HDRL) approach, which converges quickly with smaller models, is designed to optimize UAV trajectories and bandwidth allocation to maximize data collection volume. First, the proposed scenario incorporates interference from jammers, dynamic data volumes at IoT nodes, and multiple types of obstacles. The task is hierarchically structured: the upper level makes flight trajectory decisions at a coarse temporal granularity, while the lower level makes bandwidth allocation decisions at a finer temporal granularity. Second, a trajectory and bandwidth allocation optimization algorithm based on hierarchical deep deterministic policy gradients (TBH-DDPG) is proposed to solve the problem. Finally, simulation results demonstrate that the proposed algorithm improves convergence speed by 44.44% and reduces computational cost by 58.05% compared to a non-hierarchical algorithm.
In-context modeling trained on physics equations assimilates new measurements as context for accurate single-pass inference and scales with data diversity and compute.
Building models that generalize across physical systems without retraining remains a central challenge in computational science. Here we introduce In-Context Modeling (ICM), a retrain-free paradigm that infers physical relationships directly from observational fields. Rather than encoding system-specific behavior in fixed parameters, ICM assimilates measurements as physical context and performs inference through a single forward pass. Trained in a physics-informed, label-free manner using governing equations, a single model generalizes across unseen materials, geometries, and loading conditions. Demonstrated on hyperelasticity, ICM integrates with finite-element simulations and is validated using experimental full-field measurements. Moreover, performance improves with increasing data diversity and computational budget, exhibiting favorable scaling behavior analogous to foundation models. By recasting physical modeling as in-context inference, this work establishes a transferable paradigm for retrain-free scientific learning and a foundation for scalable modeling across computational science.
By adapting kernel bandwidths to measure trust between data-driven and theory models, particle filters converge even with limited physics samples.
AI and data-driven models have large potential for data assimilation applications by creating fast and accurate forecasts. Their tendency to produce spurious, inaccurate, nonphysical results (hallucinations), however, raises serious questions about their long-term use; such methods can be categorized as untrustworthy. Theory-driven methods, on the other hand, are slow but capable of staying physically realistic due to their mathematical underpinning, and can be categorized as trustworthy. We argue that by using these methods in tandem, it is possible to build a relative measure of trust between the theory-driven and data-driven methods that results in a combined trustworthy methodology. We argue, and then show, that the bandwidth scaling factors in the kernel density estimates can be used to represent our trust in the theory-driven and data-driven models. We provide ways in which these measures of trust can be adaptively computed through an expectation-maximization approach. We combine these ideas to create the multifidelity ensemble Gaussian mixture filter and its adaptive-trust version, which are particle filters capable of high-dimensional data assimilation. We validate our ideas on both a static banana problem and a sequential filtering example with the Lorenz '96 equations, showing that it is possible to create a particle filter capable of high-dimensional convergent inference in the undersampled regime, when the number of theory-driven samples is less than the dimension of the system.
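A compact sketch of the trust idea on a 1D toy (bandwidths fixed for brevity, whereas the paper adapts them as the trust measure): two kernel density estimates are mixed, and an EM-style update learns the mixture weight, i.e. the relative trust, from observations.

```python
# EM sketch for a two-component KDE mixture weight (toy, not the paper's
# multifidelity ensemble Gaussian mixture filter).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
truth = rng.normal(0.0, 1.0, size=500)         # observations
theory = rng.normal(0.1, 1.0, size=40)         # few, nearly unbiased samples
data_driven = rng.normal(0.6, 1.0, size=400)   # many, biased samples

def kde(samples, x, h):
    """Gaussian kernel density estimate with bandwidth h."""
    return norm.pdf((x[:, None] - samples[None, :]) / h).mean(axis=1) / h

w = 0.5                                        # initial trust in theory model
for _ in range(50):                            # EM updates of the weight
    p_th = w * kde(theory, truth, h=0.5)
    p_dd = (1 - w) * kde(data_driven, truth, h=0.3)
    r = p_th / (p_th + p_dd)                   # responsibilities (E-step)
    w = r.mean()                               # weight update (M-step)

print("learned trust in theory-driven model:", round(float(w), 3))
```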
Design coupling analysis identifies dominant variable links to support decomposition strategies that match full optimization results with far less computational effort.
This work presents a design coupling analysis (DCA) framework to investigate the interactions among control and plant design variables in floating offshore wind turbines (FOWTs) and to support the formulation of tractable control co-design (CCD) optimization strategies. DCA provides quantitative information that reveals the relationships and dependencies among design variables and their influence on the objective function, enabling improved design variable selection, identification of dominant variables that drive system interactions, and informed selection of optimization solution strategies. However, applying DCA to complex systems is challenging because the models used to describe their dynamics are computationally expensive, and constructing DCA information requires exhaustive model evaluations and optimizations. Here, a surrogate model of the FOWT system is employed to make the repeated model evaluations required for DCA computationally feasible. Using this framework, the bidirectional couplings between control and plant design variables, as well as the couplings among plant design variables, are estimated. The results reveal strong interactions among various design variables and identify the most influential plant design variables affecting system performance. These insights guide the development of two DCA-based optimization strategies for large CCD problems: a sequential decomposition approach that preserves dominant design variable couplings while reducing problem size at each stage, and a reduced-dimensional optimization approach that focuses collectively on the most influential variables. The results demonstrate that these strategies significantly reduce computational complexity while achieving solutions comparable to those obtained through full simultaneous optimization, underscoring the value of DCA for understanding and solving complex design problems.
AI is increasingly used to accelerate engineering design by improving decision-making and shortening iteration cycles. Application to marine propeller design, however, remains challenging due to scarce training data and the lack of widely available pretrained models. We address this gap with a physics-based data generation pipeline and a generative-AI framework for direct performance-to-design generation tailored to marine propellers. First, we build a database of over 20,000 four- and five-bladed propeller geometries, each accompanied by simulated open-water performance curves. On top of this dataset, we develop a three-module design framework: (1) A Conditional Generation Model that proposes candidate geometries conditioned on design specifications such as target thrust, power, and diameter. (2) A Performance Prediction Model, implemented as a neural-network surrogate, that predicts thrust, torque, and efficiency in milliseconds, enabling rapid evaluation of generated designs. (3) A design refinement stage that applies evolutionary optimization to enforce practical constraints such as required thrust under power limits and bounds on blade-area ratio and thickness. Experimental results over a range of operating conditions show that the framework can generate hydrodynamically plausible propeller designs that match prescribed performance targets while substantially reducing design-iteration time relative to traditional expert-guided refinement. The latent diffusion-based generator produces more diverse designs under the same conditions than the conditional variational autoencoder, suggesting a stronger capacity for design-space exploration with diffusion models. By coupling physics-based data synthesis with modular AI models, the proposed approach streamlines the propeller design cycle and confines expensive high-fidelity simulations to the final validation stages.
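The three-module flow can be caricatured in a few lines of Python. Here `generate` and `predict` are hypothetical placeholders for the conditional generator and the surrogate, and the mutation-based loop is a minimal stand-in for the evolutionary refinement stage; constraint handling is omitted.

```python
import numpy as np

def design_loop(generate, predict, targets, n=64, generations=20, rng=None):
    """Minimal sketch: conditional generation, surrogate performance
    prediction, and evolutionary refinement chained together."""
    rng = rng or np.random.default_rng(0)
    pop = generate(targets, n)                            # module 1: candidate geometries
    for _ in range(generations):                          # module 3: refinement loop
        err = np.linalg.norm(predict(pop) - targets, axis=1)  # module 2: fast surrogate
        elite = pop[np.argsort(err)[: n // 4]]            # keep the best quarter
        pop = np.repeat(elite, 4, axis=0) + 0.02 * rng.standard_normal(pop.shape)
    err = np.linalg.norm(predict(pop) - targets, axis=1)
    return pop[np.argmin(err)]

# Toy usage: 3-parameter "geometry", 2 performance targets.
rng = np.random.default_rng(1)
gen = lambda t, n: rng.standard_normal((n, 3))
pred = lambda p: p[:, :2] ** 2                            # stand-in surrogate
best = design_loop(gen, pred, targets=np.array([1.0, 0.5]), rng=rng)
```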
Lab tests of optimized beams show ductile failure and point to 33% material reduction potential without added depth.
The production of concrete generates roughly 8% of anthropogenic CO2 globally, largely because of the massive quantities that are manufactured. New design methods must be developed and deployed to improve the material efficiency of reinforced concrete structures and reduce concrete's carbon impact. This research uses topology optimization, a free-form structural optimization method, for improved structural design. Two topology optimization frameworks are developed specifically for reinforced concrete design and construction. The automated design algorithms are used to generate geometries for materially efficient reinforced concrete beams, which are fabricated and tested to compare their performance to conventional design. The optimized results exhibit ductile failure and reach loads 36%-42% higher than the conventional design with the same material consumption. Comparison with analytical models shows a potential material reduction of around 33% while maintaining today's performance requirements without adding structural depth, indicating a viable path toward carbon neutrality in reinforced concrete construction.
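For readers unfamiliar with the underlying machinery, a single density-update step of the standard SIMP-style optimality-criteria scheme is sketched below in Python. This is generic textbook topology optimization, not the paper's reinforced-concrete-specific frameworks.

```python
import numpy as np

def oc_update(x, dc, volfrac, move=0.2):
    """One optimality-criteria density update, the standard workhorse of
    SIMP-style topology optimization. x holds element densities in [0, 1]
    and dc the compliance sensitivities (non-positive by construction)."""
    l1, l2 = 1e-9, 1e9
    while (l2 - l1) / (l1 + l2) > 1e-4:
        lmid = 0.5 * (l1 + l2)
        # Fixed-point update clipped to move limits and the [0, 1] box.
        xnew = np.clip(x * np.sqrt(-dc / lmid),
                       np.maximum(x - move, 0.0),
                       np.minimum(x + move, 1.0))
        if xnew.mean() > volfrac:   # bisect the volume-constraint multiplier
            l1 = lmid
        else:
            l2 = lmid
    return xnew

# Example: 100 elements, synthetic sensitivities, 40% volume target.
x = np.full(100, 0.4)
dc = -np.linspace(1.0, 2.0, 100)   # more negative = more load-bearing
print(oc_update(x, dc, volfrac=0.4).mean())  # ~0.4
```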
Engineering structures are increasingly designed using numerical optimisation. However, traditional optimisation methods can struggle with multiple objectives and many parameters. In machine learning, stable training of artificial neural networks with millions or billions of parameters is achieved using automatic differentiation frameworks such as JAX and PyTorch. Because these frameworks provide accelerated numerical linear algebra with automatic gradient tracking, they also enable differentiable implementations of numerical methods to be built. This facilitates faster gradient-based optimisation of geometry and materials, as well as the solution of inverse problems. We demonstrate JAX-BEM, a differentiable Boundary Element Method (BEM) solver, showing that it matches the error of existing BEM codes on a benchmark problem and enables gradient-based geometry optimisation. Although the demonstrated examples are for acoustic simulations, the concept could readily be extended to electromagnetic waves.
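The core idea, differentiating through the solver to drive geometry optimisation, can be sketched in a few lines of JAX. The `far_field_pressure` function below is a toy placeholder for the actual BEM solve; the point is only that any chain of jnp operations from geometry to acoustic output is differentiable end-to-end.

```python
import jax
import jax.numpy as jnp

def far_field_pressure(params):
    """Toy stand-in for a differentiable BEM solve mapping geometry
    parameters to an acoustic quantity (JAX-BEM replaces this)."""
    boundary = params[0] * jnp.linspace(0.0, 1.0, 64)   # toy "geometry"
    return jnp.mean(jnp.sin(boundary)) * params[1]

def objective(params):
    # Match a target pressure of 1.0 at a receiver.
    return (far_field_pressure(params) - 1.0) ** 2

grad_fn = jax.jit(jax.grad(objective))
params = jnp.array([1.0, 0.5])
for _ in range(200):                 # plain gradient descent on the geometry
    params = params - 0.1 * grad_fn(params)
print(params, objective(params))
```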
Kolmogorov--Arnold Networks (KANs) have recently attracted attention as edge-based neural architectures in which learnable univariate functions replace conventional fixed activation functions. A key source of flexibility in KANs is the choice of basis functions used to parameterize the learnable edge functions. In this context, Gaussian basis functions provide a simple and efficient alternative to splines. However, their performance depends strongly on the scale (shape) parameter \(\epsilon\), whose role has not been studied systematically. In this paper, we investigate how \(\epsilon\) affects Gaussian KANs through first-layer feature geometry, conditioning, and approximation behavior. Our central observation is that scale selection is governed primarily by the first layer, since it is the only layer constructed directly on the input domain and any loss of distinguishability introduced there cannot be recovered by later layers. From this viewpoint, we analyze the first-layer feature matrix and identify a practical operating interval, \[ \epsilon \in \left[\frac{1}{G-1},\frac{2}{G-1}\right], \] where \(G\) denotes the number of Gaussian centers. We interpret this interval not as a universal optimality result, but as a stable and effective design rule, and validate it through brute-force sweeps over \(\epsilon\) across function-approximation problems with different collocation densities, grid resolutions, network architectures, and input dimensions, as well as physics-informed problems. We further show that this range is useful for fixed-scale selection, variable-scale constructions, constrained training of \(\epsilon\), and efficient scale search using early training MSE. In this way, the paper positions scale selection as a practical design principle for Gaussian KANs rather than as an ad hoc hyperparameter choice.
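A quick way to see the first-layer effect is to build the Gaussian feature matrix and sweep \(\epsilon\) around the suggested interval, watching the conditioning. The width convention and center placement below are assumptions for illustration; the paper's exact parameterization may differ.

```python
import numpy as np

def first_layer_features(x, G, eps):
    """Gaussian edge-basis features phi_k(x) = exp(-((x - c_k)/eps)^2)
    with centers c_k on a uniform grid over [-1, 1] (assumed convention)."""
    centers = np.linspace(-1.0, 1.0, G)
    return np.exp(-((x[:, None] - centers[None, :]) / eps) ** 2)

x = np.linspace(-1.0, 1.0, 400)
G = 10
for scale in [0.5, 1.0, 2.0, 8.0]:   # sweep around [1/(G-1), 2/(G-1)]
    eps = scale / (G - 1)
    Phi = first_layer_features(x, G, eps)
    print(f"eps = {eps:.3f}  cond(Phi) = {np.linalg.cond(Phi):.2e}")
```

Too small an \(\epsilon\) leaves gaps between centers where inputs become indistinguishable from zero features, while too large an \(\epsilon\) makes the columns nearly collinear and the matrix ill-conditioned, which is consistent with the interval above acting as a stable middle ground.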
The advent of NMT has expanded the scope of translation beyond isolated sentences, enabling context to be preserved across paragraphs and documents. However, current evaluation metrics largely remain restricted to the sentence level and typically depend on reference translations. Without references, existing metrics cannot provide a clear basis for their quality assessments. To address these limitations, we propose an evaluation framework that independently extracts and compares latent topic structures within source and translated texts, using various topic modelling techniques including LSA, LDA, and BERTopic. Our methodology captures statistical frequency information and semantic context, providing a comprehensive evaluation of the entire document. It aligns key topic tokens across languages using a bilingual dictionary and quantifies thematic consistency via cosine similarity. This allows us to evaluate how faithfully the translation maintains the thematic integrity of the source text, even in the absence of reference translations. For evaluation, we used a large-scale dataset of 9.38 million Korean-to-English sentence pairs from AI Hub, which includes pre-evaluated BLEU scores. We also calculated CometKiwi, a state-of-the-art reference-free metric, for this dataset in order to conduct a comparative analysis with our proposed topic-based framework. Through this analysis, we confirmed that, unlike existing metrics, our framework evaluates the differentiated attribute of document-level thematic units. Furthermore, visualising the key tokens that underpin the quantitative evaluation score provides clear insight into translation quality. Consequently, this study contributes to effectively complementing the existing translation evaluation system by proposing a new metric that intuitively identifies whether the document's theme has been preserved.
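A toy version of the dictionary-alignment-plus-cosine step might look as follows in Python; the token weighting and helper names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def thematic_consistency(src_tokens, tgt_tokens, bilingual_dict,
                         src_weights, tgt_weights):
    """Map source topic tokens into the target language with a bilingual
    dictionary, then compare weighted token vectors by cosine similarity."""
    mapped = [bilingual_dict.get(tok, tok) for tok in src_tokens]
    vocab = {tok: i for i, tok in enumerate(sorted(set(mapped) | set(tgt_tokens)))}
    def vec(tokens, weights):
        v = np.zeros(len(vocab))
        for tok, w in zip(tokens, weights):
            v[vocab[tok]] += w
        return v
    a, b = vec(mapped, src_weights), vec(tgt_tokens, tgt_weights)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Example with a two-token topic.
score = thematic_consistency(["기후", "정책"], ["climate", "policy"],
                             {"기후": "climate", "정책": "policy"},
                             [0.6, 0.4], [0.5, 0.5])
print(score)  # close to 1.0 when the themes align
```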