arxiv: 2606.29991 · v1 · pith:LGYHJBTDnew · submitted 2026-06-29 · 💻 cs.IR

Behind the Content: Wikipedia Mobile Views and Tourism Activity

Pith reviewed 2026-06-30 04:12 UTC · model grok-4.3

classification 💻 cs.IR
keywords Wikipedia · mobile pageviews · tourism nowcasting · hotel demand · digital traces · France · device split

The pith

Mobile Wikipedia pageviews on city pages predict same-day hotel demand and attraction attendance while desktop views do not.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether Wikipedia traffic can serve as a high-frequency indicator of local tourism by separating mobile and desktop pageviews. It argues that mobile views more often reflect on-site, same-day information needs, whereas desktop views capture broader planning interest spread over time. Linking daily data for 704 French communes shows that mobile pageviews are positively associated with hotel room-nights and dominate desktop views when both enter the same specification; the link is stronger in leisure destinations and high-visibility locations. A separate check against daily attendance at six Orléans cultural sites reproduces the pattern at the individual attraction level, with mobile views predicting same-day counts and coefficients on nearby leads and lags close to zero. The work positions this device split as a transparent way to nowcast tourism activity from open data.

Core claim

Mobile pageviews are positively associated with same-day hotel demand and dominate desktop traffic in joint specifications. The relationship is stronger in leisure-oriented destinations and in places with higher Wikipedia visibility. A micro-validation using daily attendance at six cultural attractions in Orléans shows the same pattern: mobile pageviews predict same-day gate counts, while surrounding leads and lags are close to zero.
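The review text does not reproduce the paper's regression equations, so the following is only a minimal sketch of what a joint specification with commune and day fixed effects could look like, run on simulated data. All variable names, the panel size, and the simulated coefficients (0.5 for mobile, 0 for desktop) are assumptions for illustration, not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_communes, n_days = 50, 200  # hypothetical panel size, far smaller than the paper's

# Unobserved commune and day effects (e.g. destination size, seasonality).
commune_fe = rng.normal(size=(n_communes, 1))
day_fe = rng.normal(size=(1, n_days))

# Mobile and desktop (log) pageviews share a common component, so they are correlated.
common = rng.normal(size=(n_communes, n_days))
mobile = common + rng.normal(size=(n_communes, n_days))
desktop = common + rng.normal(size=(n_communes, n_days))

# Simulated (log) room-nights: only mobile views matter in this toy world.
TRUE_MOBILE, TRUE_DESKTOP = 0.5, 0.0
nights = (TRUE_MOBILE * mobile + TRUE_DESKTOP * desktop
          + commune_fe + day_fe
          + rng.normal(scale=0.5, size=(n_communes, n_days)))

def demean_two_way(a):
    """Remove commune and day means (within transformation, balanced panel)."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

# Joint specification: both device types enter the same regression.
X = np.column_stack([demean_two_way(mobile).ravel(),
                     demean_two_way(desktop).ravel()])
y = demean_two_way(nights).ravel()
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_mobile, beta_desktop = beta

print(f"mobile: {beta_mobile:.3f}  desktop: {beta_desktop:.3f}")
```

Because the two simulated series share a common component, a regression on desktop views alone would absorb part of the mobile effect; entering both regressors together is what lets one device type dominate the other.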

What carries the argument

Device composition of Wikipedia attention, where mobile pageviews proxy situated contemporaneous information needs and desktop pageviews proxy temporally diffuse interest.
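One public source for such a device split is the Wikimedia Pageviews REST API, whose per-article endpoint reports daily counts separately for the `desktop`, `mobile-web`, and `mobile-app` access methods. The sketch below only builds the request URLs. It assumes French Wikipedia and an illustrative article and date range; the review text does not say how the authors actually collected their traffic data, and `pageviews_url` is a hypothetical helper name.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"
ACCESS_METHODS = ("desktop", "mobile-web", "mobile-app")

def pageviews_url(article, access, start, end,
                  project="fr.wikipedia", agent="user"):
    """Build a daily per-article pageviews URL for one access method.

    start and end are dates in YYYYMMDD format.
    """
    if access not in ACCESS_METHODS:
        raise ValueError(f"unknown access method: {access}")
    # Article titles use underscores and must be percent-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{title}/daily/{start}/{end}"

# One URL per access method for an illustrative city page and month.
urls = {a: pageviews_url("Orléans", a, "20240101", "20240131")
        for a in ACCESS_METHODS}

for access, url in urls.items():
    print(access, url)
```

A mobile series would then be the sum of the `mobile-web` and `mobile-app` counts. Wikimedia asks API clients to send a descriptive User-Agent header when fetching these URLs.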

Load-bearing premise

Mobile device pageviews primarily capture on-site, real-time information seeking rather than other visitor behaviors.

What would settle it

A replication dataset in which mobile pageviews show no positive association with same-day hotel demand or gate counts, or in which leads and lags become statistically significant.
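The lead/lag check can be illustrated with a distributed lead-lag regression on simulated data. This is a sketch under stated assumptions, not the paper's specification: the series length, the AR(1) pageview process, the ±30-day window (61 terms, matching the count in the Figure 2 caption), and the simulated same-day effect of 1.0 are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days, K = 1500, 30  # K leads and K lags plus day 0 gives 61 terms

# Simulated pageviews with day-to-day persistence (AR(1), coefficient 0.6).
views = np.zeros(n_days)
for t in range(1, n_days):
    views[t] = 0.6 * views[t - 1] + rng.normal()

# Simulated attendance depends on same-day views only.
attendance = 1.0 * views + rng.normal(scale=0.5, size=n_days)

# Column k holds views at day t + k: k < 0 is a lag, k > 0 is a lead.
offsets = np.arange(-K, K + 1)
rows = np.arange(K, n_days - K)  # days with a full window on both sides
X = np.column_stack([views[rows + k] for k in offsets])
X = np.column_stack([np.ones(len(rows)), X])  # add intercept

coef, *_ = np.linalg.lstsq(X, attendance[rows], rcond=None)
profile = dict(zip(offsets.tolist(), coef[1:]))  # drop the intercept

print(f"same-day coefficient: {profile[0]:.3f}")
print(f"largest |lead/lag| coefficient: "
      f"{max(abs(v) for k, v in profile.items() if k != 0):.3f}")
```

In this toy setting the same-day coefficient is recovered and the surrounding terms stay near zero even though the pageview series is serially correlated. A replication in which the off-day coefficients were clearly non-zero would be the kind of evidence described above.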

Figures

Figures reproduced from arXiv: 2606.29991 by Lucas Eustache, Paul Favier.

Figure 2. Timing profile: daily attendance regressed on 61 leads/lags of site
read the original abstract

This study examines whether open digital traces can provide interpretable, high-frequency indicators of local tourism activity. We argue that the device composition of Wikipedia attention helps distinguish situated information use from remote planning: mobile pageviews are more likely to reflect on-site, contemporaneous information needs, whereas desktop pageviews capture temporally diffuse interest. Linking daily Accor hotel room-nights to Wikipedia city-page traffic for 704 French communes from 2018 to 2025, we find that mobile pageviews are positively associated with same-day hotel demand and dominate desktop traffic in joint specifications. The relationship is stronger in leisure-oriented destinations and in places with higher Wikipedia visibility. A micro-validation using daily attendance at six cultural attractions in Orl{\'e}ans shows the same pattern: mobile pageviews predict same-day gate counts, while surrounding leads and lags are close to zero. The findings position mobile Wikipedia traffic as a transparent, replicable nowcasting signal for tourism activity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper claims that Wikipedia mobile pageviews serve as a high-frequency nowcasting signal for local tourism activity because they capture on-site, contemporaneous information needs (in contrast to desktop views, which reflect temporally diffuse planning). Linking daily Accor hotel room-nights to city-page traffic across 704 French communes (2018–2025), it reports positive associations between mobile views and same-day hotel demand, with mobile dominating desktop in joint specifications; effects are stronger in leisure destinations and high-visibility locations. A micro-validation with daily gate counts at six Orléans cultural attractions reproduces the pattern: mobile views predict same-day attendance while leads and lags are near zero.

Significance. If the reported associations prove robust, the work supplies a transparent, replicable, open-data indicator for tourism nowcasting that leverages device-type differentiation to separate situated from remote information use. It contributes to digital-trace methods in tourism studies and information retrieval by demonstrating that Wikipedia traffic, when disaggregated by device, can yield timely, location-specific signals without proprietary data. The large-scale linkage and the timing-focused micro-validation are concrete strengths.

major comments (3)
  1. [Abstract and micro-validation description] The device-type proxy for on-site versus remote use is load-bearing for the central claim that mobile views dominate because they reflect contemporaneous needs. The abstract and micro-validation description present timing patterns but supply no direct test (geolocation, user surveys, or demographic controls) that would rule out confounding by user demographics or access patterns. This assumption therefore remains unvalidated.
  2. [Abstract (results summary)] The abstract states that mobile pageviews are positively associated with same-day hotel demand and dominate in joint specifications, yet provides no regression equations, full specification details, controls for weather/events/seasonality, or robustness checks. Without these, it is impossible to assess whether the reported dominance survives standard confounder adjustments.
  3. [Micro-validation paragraph] The micro-validation reports that surrounding leads and lags are close to zero, but does not report coefficient magnitudes, standard errors, or the exact set of controls and fixed effects used. This omission makes it difficult to judge the precision and robustness of the same-day predictive claim.
minor comments (1)
  1. [Abstract] The LaTeX artifact "Orl{\'e}ans" appears in the abstract; ensure consistent rendering in the final version.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the presentation of our results. We address each point below.

read point-by-point responses
  1. Referee: [Abstract and micro-validation description] The device-type proxy for on-site versus remote use is load-bearing for the central claim that mobile views dominate because they reflect contemporaneous needs. The abstract and micro-validation description present timing patterns but supply no direct test (geolocation, user surveys, or demographic controls) that would rule out confounding by user demographics or access patterns. This assumption therefore remains unvalidated.

    Authors: The central claim relies on the device-type distinction as a proxy, supported by the differential timing patterns observed in both the large-scale and micro-validation analyses. While direct validation via geolocation or surveys is not available in this dataset, the pattern of same-day associations with mobile but not desktop views, and the near-zero leads/lags in the micro-validation, provide convergent evidence for the interpretation. We will expand the discussion section to explicitly address potential confounding by demographics and access patterns as a limitation. revision: partial

  2. Referee: [Abstract (results summary)] The abstract states that mobile pageviews are positively associated with same-day hotel demand and dominate in joint specifications, yet provides no regression equations, full specification details, controls for weather/events/seasonality, or robustness checks. Without these, it is impossible to assess whether the reported dominance survives standard confounder adjustments.

    Authors: The abstract summarizes the key findings concisely as is conventional. The full paper details the regression models, including equations, the full set of controls (seasonality via fixed effects, weather, events), and robustness checks in the methods and results sections. To address this, we will revise the abstract to include a brief mention of the controls used. revision: yes

  3. Referee: [Micro-validation paragraph] The micro-validation reports that surrounding leads and lags are close to zero, but does not report coefficient magnitudes, standard errors, or the exact set of controls and fixed effects used. This omission makes it difficult to judge the precision and robustness of the same-day predictive claim.

    Authors: We agree that reporting the specific coefficients, standard errors, and detailed controls would improve transparency. In the revised version, we will add these details to the micro-validation section, including a table with the estimates and specification information. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical associations from external datasets with no self-referential derivations

full rationale

The paper reports regression associations between Wikipedia mobile/desktop pageviews and hotel demand or gate counts, obtained by linking independent external data sources (Accor bookings, Wikipedia traffic logs, Orléans attendance records). No equations, fitted parameters, or predictions are defined in terms of the target outcomes themselves. The mobile-as-on-site interpretation is presented as an argument to motivate the analysis, not as a quantity derived from or equivalent to the fitted coefficients. No self-citations, ansatzes, or uniqueness theorems appear in the provided text. This is a standard observational study whose central claims remain falsifiable against the raw linked data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the interpretive assumption that device type proxies for user location and timing of information need, plus standard regression assumptions for observational data linkage. No new physical entities or ad-hoc constants are introduced.

axioms (1)
  • domain assumption: Mobile Wikipedia pageviews reflect on-site, contemporaneous information needs while desktop pageviews reflect temporally diffuse remote interest.
    This premise is invoked in the abstract to justify why mobile traffic should track same-day tourism activity.



Reference graph

Works this paper leans on

26 extracted references · 24 canonical work pages · 1 internal anchor

  1. [1]

    Angrist, J. D., & Pischke, J.-S. (2009). Mostly harmless econometrics: An empiricist’s companion. Princeton University Press. 31e Conférence de l’Association Information et Management AIM 2026 : 20-22 mai à Neuchâtel (CH) 15

  2. [2]

    Bakos, J. Y. (1997). Reducing buyer search costs: Implications for electronic marketplaces. Management Science, 43(12), 1676–1692. https://doi.org/10.1287/mnsc.43.12.1676
    Bańbura, M., Giannone, D., Modugno, M., & Reichlin, L. (2013). Now-casting and the real-time data flow. In Handbook of economic forecasting (Vol. 2, pp. 195–237). Elsevier. https:/...

  3. [3]

    Bangwayo-Skeete, P. F., & Skeete, R. W. (2015). Can Google data improve the forecasting performance of tourist arrivals? Mixed-data sampling approach. Tourism Management, 46, 454–464. https://doi.org/10.1016/j.tourman.2014.07.014

  4. [4]

    Buhalis, D., & Law, R. (2008). Progress in information technology and tourism management: 20 years on and 10 years after the Internet: The state of eTourism research. Tourism Management, 29(4), 609–623. https://doi.org/10.1016/j.tourman.2008.01.005

  5. [5]

    Chevalier, J. A., & Mayzlin, D. (2006). The effect of word of mouth on sales: Online book reviews. Journal of Marketing Research, 43(3), 345–354. https://doi.org/10.1509/jmkr.43.3.345

  6. [6]

    Choi, H., & Varian, H. (2012). Predicting the present with Google Trends. Economic Record, 88, 2–9. https://doi.org/10.1111/j.1475-4932.2012.00809.x
    De Los Santos, B., Hortaçsu, A., & Wildenbeest, M. R. (2012). Testing models of consumer search using data on web browsing and purchasing behavior. American Economic Review, 102(6), 2955–2980. https://doi.org...

  7. [7]

    Dellarocas, C. (2003). The digitization of word of mouth: Promise and challenges of online feedback mechanisms. Management Science, 49(10), 1407–1424. https://doi.org/10.1287/mnsc.49.10.1407.17308

  8. [8]

    Dickinson, J. E., Hibbert, J. F., & Filimonau, V. (2016). Mobile technology and the tourist experience: (Dis)connection at the campsite. Tourism Management, 57, 193–201. https://doi.org/10.1016/j.tourman.2016.06.005

  9. [9]

    Driscoll, J. C., & Kraay, A. C. (1998). Consistent covariance matrix estimation with spatially dependent panel data. Review of Economics and Statistics, 80(4), 549–560. https://doi.org/10.1162/003465398557825

  10. [10]

    Dunn, O. J. (2012). Confidence intervals for the means of dependent, normally distributed variables. Journal of the American Statistical Association, 54(287), 613–621. https://doi.org/10.1080/01621459.1959.10501524

  11. [11]

    Gao, B., Wang, J., Ding, X., & Guo, Y. (2025). The pitfalls of review solicitation: Evidence from a natural experiment on TripAdvisor. Management Science, 71(2), 1671–1691. https://doi.org/10.1287/mnsc.2023.01006

  12. [12]

    Ghose, A., Goldfarb, A., & Han, S. P. (2013). How is the mobile Internet different? Search costs and local activities. Information Systems Research, 24(3), 613–631. https://doi.org/10.1287/isre.1120.0453

  13. [13]

    Electron and positron pair production of compact stars

    Ghose, A., Ipeirotis, P. G., & Li, B. (2012). Designing ranking systems for hotels on travel search engines by mining user-generated and crowdsourced content. Marketing Science, 31(3), 493–520. https://doi.org/10.1287/mksc.1110.0700

  14. [14]

    Giannone, D., Reichlin, L., & Small, D. (2008). Nowcasting: The real-time informational content of macroeconomic data. Journal of Monetary Economics, 55(4), 665–676. https://doi.org/10.1016/j.jmoneco.2005.10.021

  15. [15]

    Godes, D., & Mayzlin, D. (2004). Using online conversations to study word-of-mouth communication. Marketing Science, 23(4), 545–560. https://doi.org/10.1287/mksc.1040.0071

  16. [16]

    Greenstein, S., & Zhu, F. (2016). Open content, Linus’ Law, and neutral point of view. Information Systems Research, 27(3), 618–635. https://doi.org/10.1287/isre.2016.0643

  17. [17]

    Greenstein, S., & Zhu, F. (2018). Do experts or crowd-based models produce more bias? Evidence from Encyclopedia Britannica and Wikipedia. MIS Quarterly. https://doi.org/10.25300/MISQ/2018/14084

  18. [18]

    Hinnosaar, M., Hinnosaar, T., Kummer, M., & Slivko, O. (2023). Wikipedia matters. Journal of Economics & Management Strategy, 32(3), 657–669. https://doi.org/10.1111/jems.12421

  19. [19]

    Hollenbeck, B., Moorthy, S., & Proserpio, D. (2019). Advertising strategy in the presence of reviews: An empirical analysis. Marketing Science, 38(5), 793–811. https://doi.org/10.1287/mksc.2019.1180

  20. [20]

    Huertas, A., & Orden-Mejía, M. (2022). Do tourists seek the same information at destinations? Analysis of digital tourist information searches according to different types of tourists. European Journal of Tourism Research, 32, 3211–3211. https://doi.org/10.54055/ejtr.v32i.2492

  21. [21]

    Li, H., Hu, M., & Li, G. (2021). Forecasting tourism demand with multisource big data. Annals of Tourism Research, 83, Article 102912. https://doi.org/10.1016/j.annals.2020.102912

  22. [22]

    Owuor, I., Hochmair, H. H., & Paulus, G. (2023). Use of social media data, online reviews and Wikipedia page views to measure visitation patterns of outdoor attractions. Journal of Outdoor Recreation and Tourism, 44, Article 100681. https://doi.org/10.1016/j.jort.2023.100681

  23. [23]

    Wang, D., Xiang, Z., & Fesenmaier, D. R. (2016). Smartphones in tourism and hospitality marketing: A literature review. Journal of Travel & Tourism Marketing, 30(1–2), 178–192. https://doi.org/10.1080/10548408.2014.943458

  24. [24]

    Wooldridge, J. M. (2010). Econometric analysis of cross section and panel data (2nd ed.). MIT Press

  25. [25]

    Xiang, Z., Schwartz, Z., Gerdes, J. H., & Uysal, M. (2015). What can big data and text analytics tell us about hotel guest experience and satisfaction? International Journal of Hospitality Management, 44, 120–130. https://doi.org/10.1016/j.ijhm.2014.10.013

  26. [26]

    Yang, Y., Pan, B., & Song, H. (2015). Predicting hotel demand using destination marketing organization’s web traffic data. Journal of Travel Research, 54(4), 433–447. https://doi.org/10.1177/0047287514544190