arxiv: 2605.08758 · v1 · submitted 2026-05-09 · 💻 cs.RO · cs.AI

Recognition: 2 theorem links

Omni-scale Learning-based Sequential Decision Framework for Order Fulfillment of Tote-handling Robotic Systems

Jiaxin Liu, Peng Yang, Yuping Li, Xinyue Xie

Pith reviewed 2026-05-12 00:53 UTC · model grok-4.3

classification 💻 cs.RO cs.AI

keywords tote-handling robotic systems · order fulfillment · sequential decision making · multi-agent reinforcement learning · combinatorial optimization · warehouse automation · scalable decision framework

The pith

A hybrid framework of combinatorial optimization and multi-agent reinforcement learning coordinates order, tote, and robot decisions to deliver near-optimal performance on small systems and consistent gains on large ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks a single decision framework that handles the sequence of choosing orders, assigning totes, and directing robots in automated warehouses where totes replace pallets as the main unit. It builds the framework by pairing exact optimization routines for structured subproblems with learning agents that manage dynamic interactions across multiple robots. On small setups the method stays within 3.5 percent of the best possible solution in two different layouts. On large setups it reduces total tote movements by 8-12 percent versus standard heuristics and more than 30 percent versus rule-based methods while still deciding fast enough for live operation. The result is a unified approach that avoids the need to redesign the logic every time the warehouse size or layout changes.

Core claim

The OLSF-TRS framework integrates structured combinatorial optimization with multi-agent reinforcement learning to coordinate the sequential decisions on orders, totes, and robots. On small-scale systems it produces average optimality gaps below 3.5 percent across two configurations; on large-scale systems of two types it reduces tote movements by 8-12 percent against heuristics and by more than 30 percent against state-of-the-art rule-based methods, all while preserving real-time responsiveness.
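The two headline metrics in this claim are simple ratios, and a minimal sketch makes their definitions explicit. The function names and all numbers below are hypothetical illustrations, not values or code from the paper; the sketch assumes a minimization objective (total tote movements) with positive costs.

```python
def optimality_gap(method_cost: float, optimal_cost: float) -> float:
    """Relative excess of a method's cost over the optimal cost (minimization)."""
    if optimal_cost <= 0:
        raise ValueError("optimal_cost must be positive")
    return (method_cost - optimal_cost) / optimal_cost


def relative_reduction(method_cost: float, baseline_cost: float) -> float:
    """Fraction of the baseline's cost that the method saves."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (baseline_cost - method_cost) / baseline_cost


# Hypothetical (method_cost, optimal_cost) pairs for three small instances.
small_instances = [(103.0, 100.0), (51.0, 50.0), (208.0, 200.0)]
gaps = [optimality_gap(m, o) for m, o in small_instances]
average_gap = sum(gaps) / len(gaps)

# Hypothetical large instance: 90 tote moves against a heuristic's 100.
reduction = relative_reduction(90.0, 100.0)

print(f"average optimality gap: {average_gap:.1%}")
print(f"tote-move reduction vs. baseline: {reduction:.1%}")
```

Averaging per-instance gaps, as done here, is one common convention; the paper's exact aggregation is not stated in the material above.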

What carries the argument

OLSF-TRS, the omni-scale sequential decision framework that decomposes order-tote-robot coordination into a hybrid of combinatorial optimization for fixed subproblems and multi-agent reinforcement learning for adaptive coordination across scales.

If this is right

Lower total tote movements translate directly into reduced energy use and operating costs for fulfillment centers.
Real-time responsiveness supports stable high-throughput operation even when order volumes fluctuate.
The same structure applies to both small pilot installations and full-scale production warehouses without redesign.
Improved coordination stability reduces delays that arise from mismatched order, tote, and robot choices.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same decomposition pattern could be tested on other multi-robot tasks such as bin picking or sortation lines.
Adding short-term demand forecasts as inputs to the learning agents might further tighten the optimality gap.
Hardware experiments on physical tote robots would reveal whether simulation-to-real transfer preserves the reported margins.
The approach could reduce the engineering effort needed when a warehouse expands from one to multiple aisles.

Load-bearing premise

The order-tote-robot decisions can be split into an optimization-plus-learning structure that stays stable and transfers to new system sizes and layouts without per-system retraining or multi-agent instability.

What would settle it

A new tote-handling system configuration where the framework either exceeds a 10 percent optimality gap on small instances or loses real-time responsiveness on large instances.

Figures

Figures reproduced from arXiv: 2605.08758 by Jiaxin Liu, Peng Yang, Xinyue Xie, Yuping Li.

**Figure 1.** Overview of the Omni-scale Learning-based Sequential Decision Framework for Order Fulfillment of Tote-handling Robotic Systems (OLSF-TRS). (a) Conceptual comparison between conventional system-specific fulfillment solutions and the proposed omni-scale learning-based framework. Traditional approaches rely on operations research methods, rule-based logic and heuristic pipelines that are typically tailored …

**Figure 2.** System-level performance analysis of OLSF-TRS under the 2D Multi-Tote Handling Robotic Systems (Hairobotic). (a) Relative tote-move cost matrix comparing OLSF-TRS with baseline methods across large-scale instances (L-1 to L-9), where each entry denotes the normalized cost ratio with respect to the corresponding baseline. (b) Runtime-quality trade-off illustrating the Pareto efficiency of OLSF-TRS in terms of to…

**Figure 3.** System-level performance analysis of OLSF-TRS under the 3D Rack-Climbing Robotic Systems (Exotec). (a) Relative tote-move cost matrix comparing OLSF-TRS with baseline methods across large-scale instances (L-1 to L-9), where each entry denotes the normalized cost ratio with respect to the corresponding baseline. (b) Runtime-quality trade-off illustrating the Pareto efficiency of OLSF-TRS in terms of total tote m…

**Figure 4.** Architecture of the OLSF-TRS for tote-handling robotic fulfillment. (a) Heterogeneous warehouse entities, including orders, totes, and robots, are represented by their relevant attributes such as SKUs, priority, arrival sequence, capacities, and current loads. These entities are encoded into a unified feature-action representation. The original large-scale MDP state space is then abstracted into a reduced BQ…

Original abstract

Driven by the rapid expansion of e-commerce and small-batch production, the size of the intralogistics load unit of finished goods, semi-finished goods and raw materials is steadily shrinking. Totes are gradually replacing pallets as the primary handling and storage container. This shift has propelled tote-handling robotic systems to the forefront of automation order fulfillment centers. The order-fulfillment decisions of tote-handling robotic systems share a common order-tote-robot sequential decision-making nature. Existing studies primarily focus on decision mechanisms tailored to particular systems, making it difficult to generalize or transfer them to other contexts. We propose an Omni-scale Learning-based Sequential Decision Framework for Order Fulfillment of Tote-handling Robotic Systems (OLSF-TRS), a generalized and scalable sequential decision framework that combines structured combinatorial optimization with multi-agent reinforcement learning to coordinate order, tote, and robot decisions. On small-scale tote-handling robotic systems, OLSF-TRS achieves near-optimal performance with average optimality gaps below 3.5% across two distinct system configurations. In large-scale scenarios, OLSF-TRS consistently outperforms heuristic baselines across two different system types, reducing total tote movements by 8-12% and over 30% compared to SOTA rule-based approaches, while maintaining real-time responsiveness. These improvements translate into tangible operational benefits, including cost reduction, lower energy consumption, and enhanced throughput stability. The proposed framework delivers an efficient and unified order fulfillment decision-making framework for widely deployed tote-handling robotic systems, supporting high-quality order fulfillment in both e-commerce and industrial logistics sectors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The hybrid combinatorial-opt plus multi-agent RL framework is a reasonable practical step for tote-robot order fulfillment, but the omni-scale generalization without retraining lacks direct supporting evidence.

The paper introduces OLSF-TRS as a unified way to handle the sequential order-tote-robot decisions in tote-handling robotic systems by splitting the problem between structured combinatorial optimization and multi-agent RL. That decomposition is the main new element, since most prior work stays tied to one specific system layout rather than claiming cross-scale applicability. The reported results give it some weight: under 3.5% optimality gap on the two small configurations and 8-12% fewer tote movements than heuristics on the large ones, plus bigger gains against certain rule-based baselines, all while staying fast enough for real-time use. Those numbers line up with the practical motivation around e-commerce and small-batch logistics. The hybrid split looks sensible for keeping the optimization tractable while letting RL handle the dynamic coordination.

The soft spot is the generalization claim. The abstract and strongest statements assert that the same learned policies work across scales without extensive per-system retraining, yet nothing in the provided details shows an ablation that isolates zero-shot transfer or tests stability when agent count and state space grow. Multi-agent RL is known to be sensitive to exactly those changes, so if the large-scale runs used separate training or retuning, the omni-scale property does not fully land. The optimality-gap calculations and baseline implementations also need the full experimental protocol to judge how tight the comparisons really are.

This is for people working on warehouse robotics or hybrid optimization-RL methods in logistics. A reader who needs a concrete starting point for sequential decisions in automated fulfillment would get usable ideas and numbers from it. I would send it to peer review because the core problem is well-motivated, the claims are specific enough to test, and the gaps are addressable with targeted experiments on transfer and training details.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes OLSF-TRS, a hybrid sequential decision framework that integrates structured combinatorial optimization with multi-agent reinforcement learning to coordinate order, tote, and robot decisions in tote-handling robotic systems. It claims near-optimal performance with average optimality gaps below 3.5% on small-scale systems across two configurations, and consistent outperformance of heuristic and SOTA rule-based baselines in large-scale scenarios (8-12% and >30% reductions in tote movements) while preserving real-time responsiveness.

Significance. If the central claims hold, the work offers a potentially generalizable hybrid approach for intralogistics automation that could yield measurable gains in throughput, energy use, and cost. The combination of exact optimization subproblems with learned policies is a constructive direction for scalable robotic order fulfillment, and the reported quantitative improvements over external baselines are a positive feature.

major comments (2)

[Large-scale evaluation] Large-scale evaluation section: the omni-scale claim (no extensive per-system retraining) is load-bearing for the title and abstract but unsupported by explicit zero-shot transfer results or ablations isolating the MARL component. The paper must clarify whether the multi-agent policies trained on the two small-scale configurations were applied unchanged to the two large-scale system types, or whether scale-specific retraining or hyperparameter retuning occurred; without this, the generalization property cannot be assessed.
[Method] Framework description (method section): the interface between the combinatorial optimization layer and the multi-agent RL layer is not specified in sufficient detail to determine how state/action spaces remain stable under changes in agent count and system size. MARL non-stationarity is a known risk; the manuscript should provide the exact state representation and reward structure that purportedly enables scale-invariance.

minor comments (2)

[Abstract] Abstract and introduction: the two small-scale configurations and two large-scale system types are referenced but never named or characterized (e.g., layout topology, tote capacity, robot fleet size). Adding one sentence of concrete description would aid reproducibility.
[Notation] Notation and terminology: ensure that all acronyms (OLSF-TRS, MARL, etc.) are defined on first use and used consistently; a small table of symbols would reduce ambiguity in the decision variables.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment below, providing clarifications on the experimental setup and framework design while indicating the revisions we will make to improve transparency and reproducibility.

Referee: [Large-scale evaluation] Large-scale evaluation section: the omni-scale claim (no extensive per-system retraining) is load-bearing for the title and abstract but unsupported by explicit zero-shot transfer results or ablations isolating the MARL component. The paper must clarify whether the multi-agent policies trained on the two small-scale configurations were applied unchanged to the two large-scale system types, or whether scale-specific retraining or hyperparameter retuning occurred; without this, the generalization property cannot be assessed.

Authors: We appreciate the referee drawing attention to the need for explicit documentation of the transfer procedure. In the experiments, the multi-agent policies were trained solely on the two small-scale configurations and applied unchanged to the large-scale system types with no retraining or hyperparameter retuning. This zero-shot transfer was central to demonstrating the omni-scale property. To make this fully transparent, we will revise the large-scale evaluation section to explicitly describe the training and transfer protocol, state that no scale-specific retraining occurred, and add discussion of how the MARL component contributes to generalization across scales. If additional ablations are required beyond what is feasible in the current results, we will note this limitation.
revision: partial
Referee: [Method] Framework description (method section): the interface between the combinatorial optimization layer and the multi-agent RL layer is not specified in sufficient detail to determine how state/action spaces remain stable under changes in agent count and system size. MARL non-stationarity is a known risk; the manuscript should provide the exact state representation and reward structure that purportedly enables scale-invariance.

Authors: We agree that greater detail on the interface is essential for assessing stability and scale-invariance. In the revised manuscript, we will expand the method section to specify: (i) the exact state representation, including normalized features that encode system size and agent count in a scale-invariant manner; (ii) the action spaces for order, tote, and robot agents; (iii) the reward structure; and (iv) the precise interface by which combinatorial optimization outputs (e.g., assignments or schedules) are fed into the MARL agents as part of the state or as constraints. We will also describe the centralized-training decentralized-execution paradigm and structured state features used to mitigate non-stationarity.
revision: yes
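To make the idea of "normalized features that encode system size in a scale-invariant manner" concrete, here is a minimal sketch. It is not the paper's state representation, which is not specified in the material above; the class `RobotObservation`, the function `encode_robot`, and every field and parameter name are hypothetical. The only point illustrated is that dividing raw quantities by their system-level maxima yields the same feature vector for proportionally identical situations in warehouses of different sizes.

```python
from dataclasses import dataclass


@dataclass
class RobotObservation:
    """Hypothetical raw observation for one robot."""
    load: int       # totes currently carried
    capacity: int   # maximum totes the robot can carry
    x: float        # position along the warehouse length
    y: float        # position along the warehouse width


def encode_robot(obs, length, width, pending_orders, order_buffer_size):
    """Map raw quantities to ratios so the vector does not depend on scale."""
    return [
        obs.load / obs.capacity,             # fill level of the robot
        obs.x / length,                      # relative position, first axis
        obs.y / width,                       # relative position, second axis
        pending_orders / order_buffer_size,  # relative order backlog
    ]


# The same relative situation in a small and a ten-times larger system.
small = encode_robot(RobotObservation(1, 2, 5.0, 10.0), 10.0, 20.0, 5, 10)
large = encode_robot(RobotObservation(4, 8, 50.0, 100.0), 100.0, 200.0, 50, 100)

print(small)
print(large)
```

Ratio features alone do not address non-stationarity from a changing agent count; they only keep the input range of a shared policy network stable across system sizes.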

Circularity Check

0 steps flagged

No significant circularity in derivation or performance claims

The paper describes a hybrid framework of combinatorial optimization plus multi-agent RL for order-tote-robot decisions, with all reported metrics (optimality gaps <3.5% on small scales, 8-12% and >30% improvements on large scales) obtained via direct comparison against external heuristic and SOTA rule-based baselines. No equations, fitted parameters presented as predictions, self-citations used as load-bearing uniqueness theorems, or self-referential definitions appear in the abstract or strongest claims. The derivation chain therefore remains independent of its own outputs and does not reduce to tautology by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard domain assumptions about sequential decision structure in logistics robotics and typical RL training assumptions; no new physical entities are postulated and free parameters are the usual RL hyperparameters left unspecified in the abstract.

free parameters (1)

multi-agent RL hyperparameters (learning rate, discount factor, etc.)
Standard for any MARL implementation; not enumerated in the abstract but required for the learning component.

axioms (1)

Domain assumption: Order-fulfillment decisions of tote-handling robotic systems share a common order-tote-robot sequential decision-making nature.
Explicitly invoked in the abstract as the foundation for proposing a unified framework.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

OLSF-TRS integrates BQ-MDP for principled state abstraction, BQ-NCO for structured combinatorial decisions, and MAPPO for cooperative control... minimizing ZFinal (total tote movements)
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

bisimulation quotienting... abstract states if they produce identical transition distributions and expected rewards
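
The quoted passage describes merging states that have identical expected rewards and identical transition distributions. A minimal sketch of that idea on a finite MDP is plain partition refinement, shown below. This is a generic textbook-style construction, not the BQ-MDP or BQ-NCO implementation from the paper; the function name `bisimulation_quotient` and the four-state toy MDP are hypothetical.

```python
from collections import defaultdict


def bisimulation_quotient(states, actions, reward, trans):
    """Group states with equal rewards and equal transition mass per block.

    reward[(s, a)] is a float; trans[(s, a)] maps next states to probabilities.
    Returns a dict that maps each state to the id of its equivalence block.
    """
    block_of = {s: 0 for s in states}  # start with a single block
    while True:
        labels = {}
        new_block_of = {}
        for s in states:
            signature = [block_of[s]]  # keeping the old block makes this a refinement
            for a in actions:
                mass = defaultdict(float)
                for nxt, p in trans[(s, a)].items():
                    mass[block_of[nxt]] += p
                rounded = tuple(sorted((b, round(m, 9)) for b, m in mass.items()))
                signature.append((reward[(s, a)], rounded))
            new_block_of[s] = labels.setdefault(tuple(signature), len(labels))
        if len(labels) == len(set(block_of.values())):
            return new_block_of  # no block was split, so the partition is stable
        block_of = new_block_of


# Toy MDP: "a" and "b" behave identically, "c" leads to them, "g" is absorbing.
states = ["a", "b", "c", "g"]
actions = ["go"]
reward = {("a", "go"): 1.0, ("b", "go"): 1.0, ("c", "go"): 0.0, ("g", "go"): 0.0}
trans = {
    ("a", "go"): {"g": 1.0},
    ("b", "go"): {"g": 1.0},
    ("c", "go"): {"a": 0.5, "b": 0.5},
    ("g", "go"): {"g": 1.0},
}

blocks = bisimulation_quotient(states, actions, reward, trans)
print(blocks)
```

In the toy example, "a" and "b" end up in the same block while "c" and "g" are separated, because they share an immediate reward but reach different blocks. The appeal of such a quotient is that a policy can be learned on the smaller abstract state space.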

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
