arxiv: 2604.21893 · v1 · submitted 2026-04-23 · 📊 stat.ML · cs.LG · q-fin.RM

Recognition: unknown

Revealing Geography-Driven Signals in Zone-Level Claim Frequency Models: An Empirical Study using Environmental and Visual Predictors

Cristián Bravo, Kristina G. Stankova, Sherly Alfonso-Sánchez


Pith reviewed 2026-05-08 14:08 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG · q-fin.RM

keywords geographic information · claim frequency modeling · motor insurance · environmental features · image embeddings · zone-level models · predictive accuracy · MTPL


The pith

Geographic features from maps and imagery improve accuracy in zone-level motor insurance claim frequency models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether geographic context can strengthen predictions of motor insurance claims when individual location details are scarce in public data. It tests this on Belgian zone-level records by feeding coordinates, land-cover indicators from open maps, and image embeddings into linear models and gradient-boosted trees. Sympathetic readers would care because clearer location signals could support more accurate risk pricing without requiring private address data. The experiments show gains across model types, with the largest lifts coming from combining coordinates and moderate-scale environmental features.

Core claim

Geographic information constructed from OpenStreetMap indicators, CORINE Land Cover, and Belgian orthoimagery augments standard actuarial variables to raise predictive accuracy in zone-level Motor Third Party Liability claim frequency models. Linear and tree-based models both improve, with the strongest results from latitude-longitude paired with environmental features at the 5 km scale; smaller neighborhoods still help baselines. Image embeddings add value mainly when environmental features are unavailable, and overall performance hinges more on how geography is represented than on model complexity.

What carries the argument

Zone-level aggregation of claims paired with constructed geographic predictors—coordinates, scale-specific environmental features, and pretrained vision-transformer embeddings—added to GLM, regularized GLM, and gradient-boosted tree baselines.
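
As a minimal sketch of the zone-level aggregation described above, the following Python snippet sums policy-level claims and exposure by postcode and derives a claim frequency per zone. The record layout (`postcode`, `claims`, `exposure`) and the toy values are assumptions made for illustration; they are not the paper's schema or data.

```python
from collections import defaultdict


def aggregate_by_zone(policies):
    """Sum claims and exposure per postcode and compute claims per unit exposure."""
    totals = defaultdict(lambda: {"claims": 0, "exposure": 0.0})
    for p in policies:
        zone = totals[p["postcode"]]
        zone["claims"] += p["claims"]
        zone["exposure"] += p["exposure"]
    return {
        pc: {**t, "frequency": t["claims"] / t["exposure"] if t["exposure"] > 0 else 0.0}
        for pc, t in totals.items()
    }


# Hypothetical toy records (not taken from the dataset)
policies = [
    {"postcode": "1140", "claims": 1, "exposure": 1.0},
    {"postcode": "1140", "claims": 0, "exposure": 0.5},
    {"postcode": "9000", "claims": 0, "exposure": 1.0},
]
zones = aggregate_by_zone(policies)
```

Geographic predictors would then be joined to each zone row by postcode before fitting any of the baseline models.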

If this is right

Coordinates combined with 5 km environmental features deliver the largest accuracy lift for both linear and tree-based models.
Environmental features at smaller neighborhood scales still improve baseline specifications.
Pretrained image embeddings raise accuracy and stability for regularized GLMs only when environmental features are absent.
The predictive contribution of geography depends less on model type than on the chosen representation of location.
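
One way to picture a scale-specific environmental feature is a count of mapped objects within a fixed radius of a zone's coordinates. The sketch below assumes great-circle (haversine) distance and a plain list of point locations; the paper's actual feature construction from OpenStreetMap and CORINE may differ, and the 1 km radius and the sample points are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_within_radius(center, points, radius_km):
    """Count points whose distance to the center is at most radius_km."""
    lat0, lon0 = center
    return sum(1 for lat, lon in points if haversine_km(lat0, lon0, lat, lon) <= radius_km)


center = (50.87064, 4.39674)  # coordinates quoted for postcode 1140 in Figure 8
points = [                    # hypothetical point features
    (50.87064, 4.39674),      # at the center
    (50.88064, 4.39674),      # about 1.1 km north
    (50.97064, 4.39674),      # about 11 km north
]
features = {radius: count_within_radius(center, points, radius) for radius in (1, 5)}
```

Repeating the count at several radii gives one feature column per scale, which is what makes the choice of scale a tunable parameter.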

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same open-data approach could be tested in other insurance lines where location influences risk but detailed addresses are restricted.
Moving from zone aggregates to policy-level data might show whether the geographic signals strengthen or weaken without averaging.
Widespread use of these public sources could lower dependence on proprietary location datasets for actuarial work.

Load-bearing premise

The observed accuracy gains truly reflect location-based risk differences rather than dataset-specific correlations or the particular feature construction choices.

What would settle it

Repeating the same feature additions on an independent insurance dataset from another country or time period and finding no gain or a loss in held-out predictive metrics.

Figures

Figures reproduced from arXiv: 2604.21893 by Cristián Bravo, Kristina G. Stankova, Sherly Alfonso-Sánchez.

**Figure 1.** ResNet blocks. Because the images used in our experiments are grayscale, we adapt the first convolutional layer from three input channels to one, by averaging the pretrained RGB weights across the channel dimension. All subsequent layers of the ResNet18 backbone remain unchanged. After the final residual stage, a global average pooling layer produces a 512 dimensional embedding that summarizes the visual i…

**Figure 2.** General scheme of the employed ResNet18 model.

**Figure 3.** Extended cross validation scheme. In the following subsections, we introduce the tabular data used in the analysis, the construction of the environmental features, the acquisition of the imagery, and the role of image embeddings in the modeling framework. 3.1 Tabular Data The original dataset used in our research is the 1997 Belgian MTPL dataset (beMTPL97), available in the R package CASdatasets (Dutang, C…

**Figure 4.** Histogram of the aggregated number of claims and exposure.

**Figure 5.** Histogram of frequency and summary statistics.

**Figure 6.** Histogram and bar plots of some environmental features at radius 5 km

**Figure 7.** Orthoimages tiles centered on the given postcodes’ coordinates, with apothem lengths of 0.5 km, 1 km and 3 km.

**Figure 8.** Detailed example for the squares generated for the postcode 1140 with latitude and longitude, 50.87064 N and 4.39674 E, respectively.
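
The square tiles in Figures 7 and 8 can be approximated with a few lines of Python. This is a rough sketch that assumes a local flat-Earth conversion from kilometres to degrees; the authors may well have used a projected coordinate system instead, and the function name `square_bbox` is hypothetical.

```python
import math

KM_PER_DEG_LAT = 6371.0088 * math.pi / 180.0  # roughly 111.2 km per degree of latitude


def square_bbox(lat, lon, apothem_km):
    """Return (south, west, north, east) of a square whose apothem is apothem_km.

    The apothem is the distance from the center to the midpoint of a side,
    so the side length is twice the apothem.
    """
    dlat = apothem_km / KM_PER_DEG_LAT
    dlon = apothem_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return (lat - dlat, lon - dlon, lat + dlat, lon + dlon)


# Postcode 1140 coordinates and the three apothem lengths quoted in the captions
boxes = {a: square_bbox(50.87064, 4.39674, a) for a in (0.5, 1.0, 3.0)}
```

Because a degree of longitude shrinks with the cosine of latitude, the longitude half-width is larger in degrees than the latitude half-width at Belgian latitudes.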

Original abstract

Geographic context is often consider relevant to motor insurance risk, yet public actuarial datasets provide limited location identifiers, constraining how this information can be incorporated and evaluated in claim-frequency models. This study examines how geographic information from alternative data sources can be incorporated into actuarial models for Motor Third Party Liability (MTPL) claim prediction under such constraints. Using the BeMTPL97 dataset, we adopt a zone-level modeling framework and evaluate predictive performance on unseen postcodes. Geographic information is introduced through two channels: environmental indicators from OpenStreetMap and CORINE Land Cover, and orthoimagery released by the Belgian National Geographic Institute for academic use. We evaluate the predictive contribution of coordinates, environmental features, and image embeddings across three baseline models: generalized linear models (GLMs), regularized GLMs, and gradient-boosted trees, while raw imagery is modeled using convolutional neural networks. Our results show that augmenting actuarial variables with constructed geographic information improves accuracy. Across experiments, both linear and tree-based models benefit most from combining coordinates with environmental features extracted at 5 km scale, while smaller neighborhoods also improve baseline specifications. Generally, image embeddings do not improve performance when environmental features are available; however, when such features are absent, pretrained vision-transformer embeddings enhance accuracy and stability for regularized GLMs. Our results show that the predictive value of geographic information in zone-level MTPL frequency models depends less on model complexity than on how geography is represented, and illustrate that geographic context can be incorporated despite limited individual-level spatial information.
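
For the convolutional part, the Figure 1 caption describes two concrete operations: averaging the pretrained RGB kernels of the first layer into a single grayscale kernel, and global average pooling of the final feature maps into an embedding. The sketch below illustrates both on tiny nested lists in plain Python. The shapes and values are toy assumptions (one 2×2 filter rather than ResNet18's first-layer filters, two feature maps rather than 512), and the function names are hypothetical.

```python
def average_rgb_kernels(weights):
    """weights[out][in][kh][kw] -> [out][1][kh][kw], averaging over the input channels."""
    averaged = []
    for out_filter in weights:
        n_in = len(out_filter)
        kh, kw = len(out_filter[0]), len(out_filter[0][0])
        mean_kernel = [
            [sum(out_filter[c][i][j] for c in range(n_in)) / n_in for j in range(kw)]
            for i in range(kh)
        ]
        averaged.append([mean_kernel])
    return averaged


def global_average_pool(feature_maps):
    """feature_maps[channel][h][w] -> one mean value per channel."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


# One output filter with three input (R, G, B) kernels of size 2x2 (toy values)
rgb_weights = [[
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[3.0, 2.0], [1.0, 0.0]],
]]
gray_weights = average_rgb_kernels(rgb_weights)

# Two toy feature maps pooled into a 2-dimensional embedding
pooled = global_average_pool([[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 4.0]]])
```

Averaging rather than summing the three kernels means a grayscale input produces one third of the response the original filter would give on the same image replicated across three identical channels.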

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper shows that public geographic data can lift zone-level MTPL frequency predictions on BeMTPL97, but the gains look sensitive to post-hoc scale choices and non-spatial hold-outs.


The central finding is that adding coordinates plus environmental features from OpenStreetMap and CORINE improves accuracy over actuarial baselines for both GLMs and boosted trees, with the 5 km scale performing best across the experiments they ran. Smaller neighborhoods also help, while image embeddings add little once environmental features are present but can stabilize regularized GLMs when those features are absent. The work is a direct empirical check on how to bring in alternative location signals when individual addresses are unavailable in public datasets like BeMTPL97. It does a clean job comparing three model families and using postcode-level hold-outs for evaluation, which is a reasonable way to test generalization under the stated constraints. The observation that representation of geography matters more than model complexity is a usable takeaway for practitioners.

The soft spots are modest but real. The abstract reports no actual metrics, confidence intervals, or feature-selection details, so the size of the lift is hard to judge. Highlighting the 5 km scale as best after mentioning multiple scales and neighborhoods tested suggests the result may reflect post-hoc selection rather than a pre-specified choice. The postcode hold-out is better than random splits, yet without geographic blocking the shared map-derived features could still leak signal between nearby zones. Zone-level aggregation itself may also smooth over individual-level spatial effects that would change the picture.

This paper is for actuaries and spatial ML researchers who need concrete examples of open geo data in insurance modeling. It deserves a serious referee because the setup is reproducible with public sources and the question is practically relevant, even if the gains require tighter validation to be fully convincing. I would send it for review with requests for the missing numbers and a clearer description of how scales were chosen.

Referee Report

2 major / 2 minor

Summary. This paper investigates incorporating geographic information from public sources (OpenStreetMap, CORINE Land Cover, Belgian orthoimagery) into zone-level MTPL claim-frequency models on the BeMTPL97 dataset. It evaluates GLMs, regularized GLMs, and gradient-boosted trees on unseen postcodes and claims that augmenting actuarial baselines with coordinates plus environmental features (especially at the 5 km scale) improves accuracy, while image embeddings help mainly when environmental features are absent. It concludes that predictive value depends more on how geography is represented than on model complexity.

Significance. If the reported gains hold after addressing selection and leakage concerns, the work would provide actionable evidence that alternative geographic data can enhance actuarial models when individual-level location identifiers are limited. It would illustrate practical trade-offs between feature construction and model class, with potential to inform risk pricing in motor insurance using publicly available spatial layers.

major comments (2)

[Abstract] The central claim that 'both linear and tree-based models benefit most from combining coordinates with environmental features extracted at 5 km scale' is presented without any quantitative metrics (e.g., change in Poisson deviance, log-loss, or AUC), confidence intervals, or a table of results across all tested scales; this omission makes it impossible to judge the magnitude or robustness of the improvement that underpins the paper's main contribution.
[Evaluation methodology] (Implied in abstract and results description.) Hold-out on 'unseen postcodes' is not described as spatially blocked or geographically stratified; because environmental features are extracted from fixed external maps at fixed radii, random postcode splits permit spatial autocorrelation leakage between train and test sets, which directly threatens the claim that observed gains reflect genuine location-based risk signals rather than correlated predictors.
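
To make the first comment concrete, here is a minimal sketch of a mean Poisson deviance for zone-level counts with exposure, together with the relative reduction between two sets of predictions. The inputs are toy numbers invented for illustration, not results from the paper, and the function assumes every predicted mean is strictly positive.

```python
import math


def poisson_deviance(claims, exposure, pred_freq):
    """Mean Poisson deviance; the expected count is exposure times predicted frequency."""
    total = 0.0
    for y, e, f in zip(claims, exposure, pred_freq):
        mu = e * f  # expected number of claims, assumed > 0
        log_term = y * math.log(y / mu) if y > 0 else 0.0  # y*log(y/mu) is 0 when y == 0
        total += 2.0 * (log_term - (y - mu))
    return total / len(claims)


# Hypothetical zones: observed claims, exposure, and two sets of predicted frequencies
claims = [0, 2, 1]
exposure = [10.0, 20.0, 5.0]
baseline = [0.1, 0.1, 0.1]
augmented = [0.05, 0.1, 0.15]

d_base = poisson_deviance(claims, exposure, baseline)
d_aug = poisson_deviance(claims, exposure, augmented)
relative_reduction = (d_base - d_aug) / d_base
```

Reporting `relative_reduction` with an interval over folds, for every tested scale, is the kind of table the comment asks for.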

minor comments (2)

[Abstract] Grammatical error ('is often consider relevant' should be 'is often considered relevant').
[Abstract] The statement 'smaller neighborhoods also improve baseline specifications' is imprecise; it should specify the radii tested and report the corresponding performance deltas to allow readers to assess the scale-sensitivity claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive comments. We address each major comment below, indicating where revisions will be made to strengthen the manuscript.


Referee: [Abstract] The central claim that 'both linear and tree-based models benefit most from combining coordinates with environmental features extracted at 5 km scale' is presented without any quantitative metrics (e.g., change in Poisson deviance, log-loss, or AUC), confidence intervals, or a table of results across all tested scales; this omission makes it impossible to judge the magnitude or robustness of the improvement that underpins the paper's main contribution.

Authors: We agree that the abstract would be strengthened by including quantitative indicators of the reported improvements. The results section of the manuscript already contains tables and figures with performance metrics (Poisson deviance, log-loss) across models, feature sets, and spatial scales. In the revised version we will update the abstract to report the key numerical gains (e.g., relative reduction in Poisson deviance for the best coordinate-plus-5 km environmental configuration versus the actuarial baseline) and will explicitly reference the corresponding results table. revision: yes

Referee: [Evaluation methodology] (Implied in abstract and results description.) Hold-out on 'unseen postcodes' is not described as spatially blocked or geographically stratified; because environmental features are extracted from fixed external maps at fixed radii, random postcode splits permit spatial autocorrelation leakage between train and test sets, which directly threatens the claim that observed gains reflect genuine location-based risk signals rather than correlated predictors.

Authors: We acknowledge the validity of this concern. The current evaluation splits postcodes into train and test sets to evaluate performance on unseen locations, but a purely random postcode split does not explicitly enforce spatial separation. Because environmental features are derived from fixed-radius buffers, nearby postcodes can share highly correlated predictors, raising the possibility of leakage. In the revision we will replace the simple hold-out with a spatially blocked or geographically stratified procedure (e.g., blocking by larger administrative units or using a distance-based split) and will report the updated performance metrics together with a discussion of how this change affects the interpretation of the geographic signals. revision: yes
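
A spatially blocked split of the kind discussed in this exchange could look roughly like the sketch below, which groups zones into grid cells and holds out whole cells. The cell size, the zone names, and the coordinates are hypothetical, and a simple grid is only one of several possible blocking schemes; the rebuttal does not commit to a specific one.

```python
import math
import random


def block_id(lat, lon, cell_deg):
    """Identify the grid cell that contains a coordinate."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def spatial_block_split(zones, cell_deg=0.25, test_fraction=0.2, seed=0):
    """Assign whole grid cells to train or test so that no cell is split across both."""
    blocks = {}
    for name, (lat, lon) in zones.items():
        blocks.setdefault(block_id(lat, lon, cell_deg), []).append(name)
    block_keys = sorted(blocks)
    random.Random(seed).shuffle(block_keys)
    n_test = max(1, round(test_fraction * len(block_keys)))
    test = [z for b in block_keys[:n_test] for z in blocks[b]]
    train = [z for b in block_keys[n_test:] for z in blocks[b]]
    return train, test


# Hypothetical zones with illustrative coordinates; A and B fall in the same cell
zones = {
    "A": (50.87, 4.40),
    "B": (50.85, 4.35),
    "C": (51.22, 4.40),
    "D": (50.63, 5.57),
    "E": (51.05, 3.72),
}
train, test = spatial_block_split(zones)
```

A grid does not by itself guarantee separation for zones that sit near a cell border, so a buffer distance between train and test cells may still be needed when features are built from 5 km neighborhoods.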

Circularity Check

0 steps flagged

No significant circularity in empirical geographic feature augmentation


The paper is a standard empirical ML study on the BeMTPL97 dataset. It augments zone-level claim frequency models with coordinates, environmental features from public sources (OpenStreetMap, CORINE), and image embeddings, then reports predictive performance on held-out postcodes using GLMs, regularized GLMs, and gradient-boosted trees. No mathematical derivation chain exists that reduces predictions or results to inputs by construction. No self-citations are load-bearing, no fitted parameters are relabeled as independent predictions, and no ansatzes or uniqueness theorems are invoked. The reported accuracy gains are data-driven outcomes measured on held-out postcodes, so the analysis is self-contained.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard statistical assumptions plus domain assumptions about geographic relevance to insurance risk; free parameters include chosen spatial scales and model hyperparameters.

free parameters (2)

environmental feature extraction scale (5 km)
Selected neighborhood size for feature aggregation; performance reported as best but chosen from tested options.
model hyperparameters (regularization, tree parameters)
Tuned or selected to optimize performance on the dataset.

axioms (2)

domain assumption: Zone-level aggregation preserves predictive signals without introducing bias that geographic features merely compensate for
Invoked by the zone-level modeling framework choice.
domain assumption: Environmental and visual features from public sources are causally or predictively relevant to claim frequency
Core premise for testing their contribution.



Reference graph

Works this paper leans on

55 extracted references · 42 canonical work pages

[1]

Long term care intercompany study, January 2015. URL https://www.soa.org/4a6a75/globalassets/assets/files/resources/experience-studies/2019/ltc-intercompany-study.pdf. Accessed: 2025-02-14

2015
[2]

Statistics Surveys 4:40--79

Arlot, S. and Celisse, A. A survey of cross-validation procedures for model selection. Statistics Surveys, 4: 40--79, 2010. ISSN 1935-7516. doi:10.1214/09-SS054

work page doi:10.1214/09-ss054 2010
[3]

Asabere, N. Y., Asare, I. O., Lawson, G., Balde, F., Duodu, N. Y., Tsoekeku, G., Afriyie, P. O., and Ganiu, A. R. A. Geo-insurance: Improving big data challenges in the context of insurance services using a geographical information system (GIS). Human Behavior and Emerging Technologies, 2024(1): 9015012, 2024. doi:10.1155/2024/9015012

work page doi:10.1155/2024/9015012 2024
[4]

Ayuso, M., Guillen, M., and Nielsen, J. P. Improving automobile insurance ratemaking using telematics: incorporating mileage and driver behaviour data. Transportation, 46(3): 735--752, 2019. doi:10.1007/s11116-018-9890-7

work page doi:10.1007/s11116-018-9890-7 2019
[5]

Benedek, B. and Nagy, B. Z. Traditional versus AI-based fraud detection: cost efficiency in the field of automobile insurance. Financial and Economic Review, 22(2): 77--98, 2023. doi:10.33893/FER.22.2.77

work page doi:10.33893/fer.22.2.77 2023
[6]

Deep learning, volume 1

Bengio, Y., Goodfellow, I., Courville, A., et al. Deep learning, volume 1. MIT Press, Cambridge, MA, USA, 2017

2017
[7]

AI revolution in insurance: bridging research and reality

Bhattacharya, S., Castignani, G., Masello, L., and Sheehan, B. AI revolution in insurance: bridging research and reality. Frontiers in Artificial Intelligence, 8: 1568266, 2025. doi:10.3389/frai.2025.1568266

work page doi:10.3389/frai.2025.1568266 2025
[8]

Geographic ratemaking with spatial embeddings

Blier-Wong, C., Cossette, H., Lamontagne, L., and Marceau, E. Geographic ratemaking with spatial embeddings. ASTIN Bulletin: The Journal of the IAA, 52(1): 1--31, 2022. doi:10.1017/asb.2021.25

work page doi:10.1017/asb.2021.25 2022
[9]

A representation-learning approach for insurance pricing with images

Blier-Wong, C., Lamontagne, L., and Marceau, E. A representation-learning approach for insurance pricing with images. ASTIN Bulletin: The Journal of the IAA, 54(2): 280--309, 2024. doi:10.1017/asb.2024.9

work page doi:10.1017/asb.2024.9 2024
[10]

Burka, D., Kovács, L., and Szepesváry, L. Modelling MTPL insurance claim events: Can machine learning methods overperform the traditional GLM approach? Hungarian Statistical Review, 4(2), 2021. doi:10.35618/hsr2021.02.en034

work page doi:10.35618/hsr2021.02.en034 2021
[11]

Belgian national geospatial data portal

Cartesius / National Geographic Institute (NGI Belgium). Belgian national geospatial data portal. https://www.cartesius.be. Accessed 04.12.2025

2025
[12]

Chen, T. and Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785--794, 2016. doi:10.1145/2939672.2939785

work page doi:10.1145/2939672.2939785 2016
[13]

Clemente, C., Guerreiro, G. R., and Bravo, J. M. Modelling motor insurance claim frequency and severity using gradient boosting. Risks, 11(9): 163, 2023. doi:10.3390/risks11090163

work page doi:10.3390/risks11090163 2023
[14]

CORINE Land Cover 2000 (CLC 2000)

Copernicus Land Monitoring Service / European Environment Agency. CORINE Land Cover 2000 (CLC 2000). European Union’s Copernicus Land Monitoring Service, 2020. URL https://land.copernicus.eu/en/products/corine-land-cover/clc-2000. Accessed 02.12.2025

2000
[15]

Automobile insurance fraud detection based on PSO-XGBoost model and interpretable machine learning method

Ding, N., Ruan, X., Wang, H., and Liu, Y. Automobile insurance fraud detection based on PSO-XGBoost model and interpretable machine learning method. Insurance: Mathematics and Economics, 120: 51--60, 2025. ISSN 0167-6687. doi:10.1016/j.insmatheco.2024.11.006. URL https://www.sciencedirect.com/science/article/pii/S0167668724001112

work page doi:10.1016/j.insmatheco.2024.11.006 2025
[16]

Dong, P. and Quan, Z. Automated machine learning in insurance. Insurance: Mathematics and Economics, 120: 17--41, 2025. ISSN 0167-6687. doi:10.1016/j.insmatheco.2024.10.002. URL https://www.sciencedirect.com/science/article/pii/S0167668724001057

work page doi:10.1016/j.insmatheco.2024.10.002 2025
[17]

Dubey, A., Parida, T., Birajdar, A., Prajapati, A. K., and Rane, S. Smart underwriting system: An intelligent decision support system for insurance approval & risk assessment. In 2018 3rd International Conference for Convergence in Technology (I2CT), pages 1--6. IEEE, 2018. doi:10.1109/I2CT.2018.8529792

work page doi:10.1109/i2ct.2018.8529792 2018
[18]

Dutang, C. and Charpentier, A. CASdatasets: Insurance datasets, 2024. R package version 1.2-0

2024
[19]

Insurance dataset

Dutang, C., Charpentier, A., and Gallic, E. Insurance dataset. 2024

2024
[20]

Belgium postcode boundaries

Environmental Systems Research Institute (Esri). Belgium postcode boundaries. https://www.arcgis.com/home/item.html?id=e385aeef974a4aea8ae7fb1b0efc1341, 2022. GIS dataset accessed January 2026

2022
[21]

Fouad, M. M., Malawany, K., Osman, A. G., Amer, H. M., Abdulkhalek, A. M., and Eldin, A. B. Automated vehicle inspection model using a deep learning approach. Journal of Ambient Intelligence and Humanized Computing, 14(10): 13971--13979, 2023. doi:10.1007/s12652-022-04105-3

work page doi:10.1007/s12652-022-04105-3 2023
[22]

Gao, G., Wang, H., and Wüthrich, M. V. Boosting Poisson regression models with telematics car driving data. Machine Learning, 111(1): 243--272, 2022. doi:10.2139/ssrn.3596034

work page doi:10.2139/ssrn.3596034 2022
[23]

Gupta, S., Ghardallou, W., Pandey, D. K., and Sahu, G. P. Artificial intelligence adoption in the insurance industry: Evidence using the technology--organization--environment framework. Research in International Business and Finance, 63: 101757, 2022. doi:10.1016/j.ribaf.2022.101757

work page doi:10.1016/j.ribaf.2022.101757 2022
[24]

Haberman, S. and Renshaw, A. E. Generalized linear models and actuarial science. Journal of the Royal Statistical Society: Series D (The Statistician), 45(4): 407--436, 1996. doi:10.2307/2988543

work page doi:10.2307/2988543 1996
[25]

The Elements of Statistical Learning

Hastie, T., Tibshirani, R., Friedman, J., et al. The elements of statistical learning. Springer, New York, 2009. ISBN 978-0-387-84857-0. doi:10.1007/978-0-387-84858-7

work page doi:10.1007/978-0-387-84858-7 2009
[26]

Deep residual learning for image recognition

He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770--778, 2016. doi:10.1109/CVPR.2016.90

work page doi:10.1109/cvpr.2016.90 2016
[27]

Henckaerts, R. and Antonio, K. The added value of dynamically updating motor insurance prices with telematics collected driving behavior data. Insurance: Mathematics and Economics, 105: 79--95, 2022. doi:10.1016/j.insmatheco.2022.03.011

work page doi:10.1016/j.insmatheco.2022.03.011 2022
[28]

Boosting insights in insurance tariff plans with tree-based machine learning methods

Henckaerts, R., Côté, M.-P., Antonio, K., and Verbelen, R. Boosting insights in insurance tariff plans with tree-based machine learning methods. North American Actuarial Journal, 25(2): 255--285, 2021. doi:10.1080/10920277.2020.1745656

work page doi:10.1080/10920277.2020.1745656 2021
[29]

Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff

Holvoet, F., Antonio, K., and Henckaerts, R. Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff. North American Actuarial Journal, pages 1--44, 2025. doi:10.1080/10920277.2025.2451860

work page doi:10.1080/10920277.2025.2451860 2025
[30]

Evaluating XGBoost for competitive insurance pricing: A case study on motor third-party liability insurance

Ibrahim, J., Stanley, J., Murfi, H., Novkaniza, F., and Devila, S. Evaluating XGBoost for competitive insurance pricing: A case study on motor third-party liability insurance. In 2024 International Conference on Intelligent Cybernetics Technology & Applications (ICICyTA), pages 847--852. IEEE, 2024. doi:10.1109/icicyta64807.2024.10912952

work page doi:10.1109/icicyta64807.2024.10912952 2024
[31]

Islam, M. M., Ahamed, T., Matsushita, S., and Noguchi, R. A damage-based crop insurance system for flash flooding: a satellite remote sensing and econometric approach. In Remote sensing application II: A climate change perspective in agriculture, pages 121--163. Springer, 2024. doi:10.1007/978-981-97-1188-8_5

work page doi:10.1007/978-981-97-1188-8 2024
[32]

ISO 19109:2022 Geographic information -- Rules for application schema

ISO. ISO 19109:2022 Geographic information -- Rules for application schema. Standard, International Organization for Standardization, Geneva, Switzerland, 2022

2022
[33]

Impact of AI in the general insurance underwriting factors

Jaiswal, R. Impact of AI in the general insurance underwriting factors. Central European Management Journal, 31(2): 697--705, 2023

2023
[34]

Kita-Wojciechowska, K. and Kidziński. Google street view image predicts car accident risk. Central European Economic Journal, 6(53): 151--163, 2019. doi:10.2478/ceej-2019-0011

work page doi:10.2478/ceej-2019-0011 2019
[35]

LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11): 2278--2324, 1998. ISSN 0018-9219. doi:10.1109/5.726791

work page doi:10.1109/5.726791 1998
[36]

A survey of convolutional neural networks: analysis, applications, and prospects

Li, Z., Liu, F., Yang, W., Peng, S., and Zhou, J. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems, 33(12): 6999--7019, 2021. doi:10.1109/tnnls.2021.3084827

work page doi:10.1109/tnnls.2021.3084827 2021
[37]

Longley, P. A., Goodchild, M. F., Maguire, D. J., and Rhind, D. W. Geographic information science and systems. John Wiley & Sons, 2015

2015
[38]

Generalized linear models

McCullagh, P. Generalized linear models. Routledge, 2019. doi:10.1201/9780203753736

work page doi:10.1201/9780203753736 2019
[39]

Nguyen, Q. C., Belnap, T., Dwivedi, P., Deligani, A. H. N., Kumar, A., Li, D., Whitaker, R., Keralis, J., Mane, H., Yue, X., et al. Google street view images as predictors of patient health outcomes, 2017--2019. Big Data and Cognitive Computing, 6(1): 15, 2022. doi:10.3390/bdcc6010015

work page doi:10.3390/bdcc6010015 2017
[40]

Noll, A., Salzmann, R., and Wuthrich, M. V. Case study: French motor third-party liability claims. Available at SSRN 3164764, 2020. doi:10.2139/ssrn.3164764

work page doi:10.2139/ssrn.3164764 2020
[41]

Nomic embed vision: Expanding the latent space

Nussbaum, Z., Duderstadt, B., and Mulyar, A. Nomic embed vision: Expanding the latent space. arXiv preprint arXiv:2406.18587, 2024. doi:10.48550/arXiv.2406.18587

work page doi:10.48550/arxiv.2406.18587 2024
[42]

OpenStreetMap, 2025a

OpenStreetMap contributors. OpenStreetMap, 2025a. URL https://www.openstreetmap.org. Data licensed under the Open Database License (ODbL)

2025
[43]

OpenStreetMap Belgium Data Extract

OpenStreetMap contributors. OpenStreetMap Belgium Data Extract. Geofabrik GmbH, 2025b. URL https://download.geofabrik.de/europe/belgium.html. Distributed by Geofabrik. Licensed under ODbL

2025
[44]

Social network analytics for supervised fraud detection in insurance

Óskarsdóttir, M., Ahmed, W., Antonio, K., Baesens, B., Dendievel, R., Donas, T., and Reynkens, T. Social network analytics for supervised fraud detection in insurance. Risk Analysis, 42(8): 1872--1890, 2022. doi:10.1111/risa.13693

work page doi:10.1111/risa.13693 2022
[45]

Pérez-Zarate, S. A., Corzo-García, D., Pro-Martín, J. L., Álvarez-García, J. A., Martínez-del Amor, M. A., and Fernández-Cabrera, D. Automated car damage assessment using computer vision: Insurance company use case. Applied Sciences, 14(20): 9560, 2024. doi:10.3390/app14209560

work page doi:10.3390/app14209560 2024
[46]

On the validation of claims with excess zeros in liability insurance: A comparative study

Qazvini, M. On the validation of claims with excess zeros in liability insurance: A comparative study. Risks, 7(3): 71, 2019. doi:10.3390/risks7030071

work page doi:10.3390/risks7030071 2019
[47]

Rababaah, A. R. Investigation of deep learning models for vehicle damage classification. In 2023 10th International Conference on Signal Processing and Integrated Networks (SPIN), pages 25--30. IEEE, 2023. doi:10.1109/spin57001.2023.10116703

work page doi:10.1109/spin57001.2023.10116703 2023
[48]

Seyam, E. A. Predicting motor insurance claim incidence using generalized and tree-based models: A comparative statistical approach. Insurance Markets and Companies, 16(2): 38, 2025. doi:10.21511/ins.16(2).2025.04

work page doi:10.21511/ins.16(2).2025.04 2025
[49]

Deep residential representations: Using unsupervised learning to unlock elevation data for geo-demographic prediction

Stevenson, M., Mues, C., and Bravo, C. Deep residential representations: Using unsupervised learning to unlock elevation data for geo-demographic prediction. ISPRS Journal of Photogrammetry and Remote Sensing, 187: 378--392, 2022. ISSN 0924-2716. doi:10.1016/j.isprsjprs.2022.03.015. URL https://www.sciencedirect.com/science/article/pii/S...

work page doi:10.1016/j.isprsjprs.2022.03.015 2022
[50]

Thiran, P. and Thomas, I. Accidents de la route et distance au domicile. approche quantitative pour bruxelles. Les Cahiers Scientifiques du Transport-Scientific Papers in Transportation, 32, 1997. doi:10.46298/cst.11958

work page doi:10.46298/cst.11958 1997
[51]

Tufvesson, O., Lindström, J., and Lindström, E. Spatial statistical modelling of insurance risk: a spatial epidemiological approach to car insurance. Scandinavian Actuarial Journal, 2019(6): 508--522, 2019. doi:10.1080/03461238.2019.1576146

work page doi:10.1080/03461238.2019.1576146 2019
[52]

Claim frequency estimation in motor third-party liability (MTPL): Classical statistical models versus machine learning methods

Vít, O., Seif, L., and Štěpánek, L. Claim frequency estimation in motor third-party liability (MTPL): Classical statistical models versus machine learning methods. In Annals of Computer Science and Information Systems, volume 45, pages 161--166. Polish Information Processing Society, 2025. doi:10.15439/2025f5118

work page doi:10.15439/2025f5118 2025
[53]

Predictive analytics in long term care

Zail, H. Predictive analytics in long term care. In Actuarial Aspects of Long Term Care, pages 309--336. Springer, 2019. doi:10.1007/978-3-030-05660-5_13

work page doi:10.1007/978-3-030-05660-5 2019
[54]

Zhang, A., Lipton, Z. C., Li, M., and Smola, A. J. Dive into deep learning. Cambridge University Press, 2023

2023
[55]

Zou, H. and Hastie, T. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(2): 301--320, 2005. doi:10.1111/j.1467-9868.2005.00503.x

work page doi:10.1111/j.1467-9868.2005.00503.x 2005