AMIC: An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data -- Accepted Version

Huy Vo; Mai Vu; Nguyen Ho; Torben Bach Pedersen

arxiv: 1906.09995 · v2 · pith:P4Y27ACUnew · submitted 2019-06-24 · 💻 cs.DC


Pith reviewed 2026-05-25 16:59 UTC · model grok-4.3

classification 💻 cs.DC

keywords mutual information · time series correlation · multi-scale analysis · big data · adaptive streaming · temporal correlations · scalable correlation detection


The pith

AMIC identifies and ranks multi-scale temporal correlations in big time series using mutual information.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AMIC as a technique to detect correlations between large time series datasets at multiple time scales. It uses mutual information to measure these relationships and orders the discoveries by their strength so users can attend to the most significant ones first. An adaptive streaming technique reduces repeated calculations, supporting scalability for high-volume data. A sympathetic reader would care because analyzing big data requires efficient ways to uncover hidden relationships across scales without exhaustive manual effort.

Core claim

AMIC is a method based on mutual information to identify correlations at multiple temporal scales in large time series. Discovered correlations are suggested to users in an order based on the strength of the relationships. The method supports an adaptive streaming technique that minimizes duplicated computation and is implemented for scalability. Comprehensive evaluation uses both synthetic and real-world data sets to assess effectiveness and scalability.

What carries the argument

The AMIC method, which applies mutual information across different temporal scales to compute and rank correlations by strength while using adaptive streaming to avoid redundant work.
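The idea of scoring windows at several temporal scales and ranking them by mutual information can be sketched as follows. This is a minimal illustration, not AMIC's implementation: it swaps in a simple equal-width histogram (plug-in) estimator, and the names `binned_mi` and `rank_windows`, the bin count, the window sizes, and the demo data are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def binned_mi(xs, ys, bins=8):
    """Plug-in mutual information (nats) from an equal-width 2-D histogram."""
    def to_bins(vs):
        lo, hi = min(vs), max(vs)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((v - lo) / width), bins - 1) for v in vs]
    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    cx, cy, cxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    # sum over occupied cells of p(i,j) * log( p(i,j) / (p(i) p(j)) )
    return sum((c / n) * math.log(c * n / (cx[i] * cy[j]))
               for (i, j), c in cxy.items())

def rank_windows(x, y, scales, step):
    """Score every sliding window at every scale; strongest first."""
    scored = []
    for w in scales:
        for start in range(0, len(x) - w + 1, step):
            mi = binned_mi(x[start:start + w], y[start:start + w])
            scored.append((mi, start, w))
    return sorted(scored, reverse=True)

# Hypothetical demo data: y follows x only inside [200, 400).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(600)]
y = [x[t] + 0.1 * rng.gauss(0, 1) if 200 <= t < 400 else rng.gauss(0, 1)
     for t in range(600)]
ranking = rank_windows(x, y, scales=(100, 200), step=50)
```

One caveat with this stand-in estimator: its bias grows as windows shrink, so raw scores from different window sizes are not directly comparable. Any real multi-scale ranking has to account for that in some way.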

If this is right

Correlations are ranked by strength to direct user attention to the strongest relationships first.
The adaptive streaming technique minimizes duplicated computation for efficiency.
The approach handles the volume and velocity of big data through its scalable implementation.
Evaluation demonstrates both effectiveness in finding correlations and scalability on large datasets.
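To make "minimizing duplicated computation" concrete, here is a sketch of one generic way to avoid recomputing a sliding-window statistic from scratch: keep histogram counts and update them as points enter and leave the window. This is not AMIC's adaptive streaming technique, whose details are not given on this page. The class `SlidingBinnedMI`, the helper `batch_mi`, and the fixed bin range are assumptions for illustration only.

```python
import math
from collections import Counter, deque

class SlidingBinnedMI:
    """Plug-in MI (nats) over the most recent `size` pairs, with fixed bin edges."""

    def __init__(self, size, lo, hi, bins=8):
        self.size, self.lo, self.bins = size, lo, bins
        self.width = (hi - lo) / bins
        self.window = deque()
        self.cx, self.cy, self.cxy = Counter(), Counter(), Counter()

    def _bin(self, v):
        # Values outside [lo, hi] are clamped into the first or last bin.
        return min(max(int((v - self.lo) / self.width), 0), self.bins - 1)

    def _bump(self, cell, delta):
        i, j = cell
        self.cx[i] += delta
        self.cy[j] += delta
        self.cxy[cell] += delta

    def push(self, x, y):
        """Add one pair and evict the oldest if the window is full: O(1) work."""
        cell = (self._bin(x), self._bin(y))
        self.window.append(cell)
        self._bump(cell, +1)
        if len(self.window) > self.size:
            self._bump(self.window.popleft(), -1)

    def mi(self):
        n = len(self.window)
        return sum((c / n) * math.log(c * n / (self.cx[i] * self.cy[j]))
                   for (i, j), c in self.cxy.items() if c > 0)

def batch_mi(cells):
    """Reference computation that recounts every cell from scratch."""
    n = len(cells)
    cx = Counter(i for i, _ in cells)
    cy = Counter(j for _, j in cells)
    cxy = Counter(cells)
    return sum((c / n) * math.log(c * n / (cx[i] * cy[j]))
               for (i, j), c in cxy.items())
```

Each `push` touches three counters, and `mi()` costs time proportional to the number of occupied cells, not the window length. The incremental result should match `batch_mi` on the same window up to floating-point rounding.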

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the ranking by mutual information strength aligns with domain expert judgment, it could reduce the time spent reviewing irrelevant correlations.
The multi-scale aspect allows detection of both short-term and long-term relationships in the same analysis.
Extensions might include applying similar adaptive techniques to other correlation measures beyond mutual information.

Load-bearing premise

Mutual information appropriately captures relevant temporal correlations at multiple scales and the adaptive streaming technique maintains accuracy while minimizing duplicated computation without missing key relationships.

What would settle it

A dataset of synthetic time series with planted known correlations at specific scales, run through AMIC to verify if the method recovers and correctly ranks them without omissions from the streaming adaptation.
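A test harness for this kind of check could look like the sketch below. It only generates series with planted correlated segments and scores how well a set of reported windows matches them. The detector under test (AMIC or anything else) is deliberately left out. The function names, the noise level, and the Jaccard-style score are assumptions chosen for the example, not part of the paper.

```python
import random

def planted_pair(n, segments, noise=0.1, seed=0):
    """Two series that are independent except inside the planted (start, end) segments."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    for start, end in segments:
        for t in range(start, end):
            y[t] = x[t] + noise * rng.gauss(0, 1)  # y tracks x inside the segment
    return x, y

def covered(intervals):
    """Set of time indices covered by half-open (start, end) intervals."""
    return {t for start, end in intervals for t in range(start, end)}

def recovery_score(found, planted):
    """Jaccard overlap between reported and planted index sets (1.0 = exact recovery)."""
    f, p = covered(found), covered(planted)
    return len(f & p) / len(f | p) if f | p else 1.0
```

Planting segments of different lengths, for example `[(100, 150), (400, 800)]`, exercises more than one temporal scale. Comparing the score with and without the streaming adaptation would show whether the adaptation drops any planted relationship.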

Figures

Figures reproduced from arXiv: 1906.09995 by Huy Vo, Mai Vu, Nguyen Ho, Torben Bach Pedersen.

**Figure 5.** Illustrates the influenced region and influenced marginal region concepts, and explains how they can help to minimize computational cost. Consider a data set of seven data points p0, ..., p6 with their locations projected into boxed-array as in Fig. 5a. Let p0 (in red) be the reference point under monitoring, k = 2 be the nearest neighbor parameter, and the maximum norm be the distance metric between neig…

**Figure 7.** Mutual Information with different k. Accompanying text (Section 5.2, Parameters Setting), value of k: We use k ranging from 1 to 20 to compute MI for the variables extracted from the real data sets. The MI values produced by different k are compared together. We found that k between 1 and 4 produces high variance of MI value, while the MI becomes more stable with k from 5 to 10. The value k = 6 gives the most stable result, thus, is s…

**Figure 12.** Taxi Fare vs. Rain Precipitation

**Figure 15.** Taxi Trips vs. Collisions. Accompanying text (begins mid-sentence): … and 311 complaints are not correlated overall, but a weak positive correlation is found in the extracted windows. This might suggest the presence of hidden variables in the periods where these two are weakly correlated. Additionally, our findings suggest that the number of complaints made by 311 calls has a daily periodic pattern, where the complaints are significantly higher…

**Figure 17.** Stress Test and Scalability Test on Spark cluster. Accompanying text (Summary): In this section, we have performed an extensive evaluation on the performance of AMIC, verifying its capability in addressing Big Data challenges: variety, volume, velocity, and scalability. Specifically, the use of MI to measure correlations allows AMIC to uncover different types of relations and to work on any types of data, and thus to tackle t…
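The Figure 7 caption above describes sweeping the nearest-neighbor parameter k and comparing the resulting MI values. The sketch below shows what such a sweep could look like, assuming the estimator is the k-nearest-neighbor estimator of Kraskov, Stögbauer, and Grassberger (their first algorithm, with the maximum norm). That assumption is based on the Figure 5 caption and the paper's reference list; it is not stated on this page. The brute-force neighbor search and the names `ksg_mi` and `sweep_k` are illustrative only.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def digamma_table(n):
    """psi(m) for integers m = 1..n, using psi(m) = -gamma + sum_{i<m} 1/i."""
    table = [0.0, -EULER_GAMMA]  # index 0 is unused padding
    for m in range(2, n + 1):
        table.append(table[-1] + 1.0 / (m - 1))
    return table

def ksg_mi(xs, ys, k=6):
    """KSG (algorithm 1) mutual information estimate in nats, brute-force search."""
    n = len(xs)
    psi = digamma_table(n)
    acc = 0.0
    for i in range(n):
        # Distance to the k-th nearest neighbour in the joint space, maximum norm.
        joint = sorted(max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
                       for j in range(n) if j != i)
        eps = joint[k - 1]
        # Count points strictly closer than eps in each marginal space.
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        acc += psi[nx + 1] + psi[ny + 1]
    return psi[k] + psi[n] - acc / n

def sweep_k(xs, ys, ks=range(1, 11)):
    """MI estimate for each candidate k, to inspect how stable the estimate is."""
    return {k: ksg_mi(xs, ys, k) for k in ks}
```

Plotting or tabulating the output of `sweep_k` for a few variable pairs is one way to see where the estimate stops fluctuating. The brute-force search costs roughly O(n² log n) per call, so it is only suitable for small windows.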

Original abstract

Recent developments in computing, sensing and crowd-sourced data have resulted in an explosion in the availability of quantitative information. The possibilities of analyzing this so-called Big Data to inform research and the decision-making process are virtually endless. In general, analyses have to be done across multiple data sets in order to bring out the most value of Big Data. A first important step is to identify temporal correlations between data sets. Given the characteristics of Big Data in terms of volume and velocity, techniques that identify correlations not only need to be fast and scalable, but also need to help users in ordering the correlations across temporal scales so that they can focus on important relationships. In this paper, we present AMIC (Adaptive Mutual Information-based Correlation), a method based on mutual information to identify correlations at multiple temporal scales in large time series. Discovered correlations are suggested to users in an order based on the strength of the relationships. Our method supports an adaptive streaming technique that minimizes duplicated computation and is implemented on top of Apache Spark for scalability. We also provide a comprehensive evaluation on the effectiveness and the scalability of AMIC using both synthetic and real-world data sets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

AMIC combines mutual information with an adaptive streaming layer on Spark to rank multi-scale correlations in large time series by strength.


AMIC is a method that applies mutual information to detect temporal correlations at multiple scales in big time series data, then orders the findings by relationship strength. It adds an adaptive streaming technique meant to cut redundant computation and runs on Apache Spark for scale. The abstract frames this as a practical response to volume and velocity in big data analysis. The evaluation covers both synthetic and real-world sets for effectiveness and scalability. That combination of MI-based detection, adaptive efficiency, and distributed implementation is the concrete piece that stands out.

The paper does a reasonable job laying out a full pipeline from the core idea to claimed performance on real workloads. The focus on ordering results for users and the Spark layer address actual pain points in applied settings.

The soft spots sit in the assumptions that drive the method. Mutual information at selected scales is taken to surface the relevant relationships, and the adaptive decisions are assumed to preserve completeness without dropping important correlations. The abstract gives no derivation or proof for why the chosen scales or adaptation rules achieve this, so the experiments carry the full weight. If the adaptation uses heuristics that trade off accuracy for speed, that needs clear quantification against non-adaptive baselines. The ordering by strength is presented as helpful, but without any user-facing validation it remains an untested claim.

This work is aimed at practitioners in distributed data systems and time-series mining who need scalable correlation tools rather than theorists. A reader building Spark pipelines for large sensor or log data could extract usable implementation ideas. The claims are testable and the engineering is grounded enough that it deserves a serious referee to check the adaptation logic and experimental controls.

Referee Report

0 major / 1 minor

Summary. The manuscript introduces AMIC, an Adaptive Mutual Information-based Correlation method for identifying multi-scale temporal correlations in big time series data. It orders discovered correlations by relationship strength, uses an adaptive streaming technique to minimize duplicated computation, implements the approach on Apache Spark for scalability, and provides evaluation on synthetic and real-world data sets.

Significance. Should the method prove effective, it would offer a scalable, information-theoretic approach to prioritizing correlations across temporal scales in large datasets, which is relevant for big data analytics in distributed computing contexts.

minor comments (1)

[Abstract] The claim of a 'comprehensive evaluation' on effectiveness and scalability is stated without reference to specific metrics, baselines, or dataset characteristics that would allow assessment of the results.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their review of our manuscript on AMIC. The summary accurately captures the method's adaptive mutual information approach, ordering by relationship strength, streaming support, Spark implementation, and evaluation. The significance assessment aligns with our goals for scalable multi-scale correlation discovery in big time series data. The recommendation is listed as uncertain with no specific major comments provided in the report.

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper introduces AMIC as a mutual-information-based method for multi-scale temporal correlations in time series, with an adaptive streaming layer on Spark and evaluation on synthetic plus real-world datasets. No load-bearing steps reduce by construction to self-definitions, fitted inputs renamed as predictions, or self-citation chains. The central claims rest on external data evaluation rather than internal equivalence to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no details on specific parameters, axioms, or entities; any thresholds for scales or mutual information cutoffs would be free parameters but cannot be identified here.

pith-pipeline@v0.9.0 · 5742 in / 1189 out tokens · 34303 ms · 2026-05-25T16:59:36.645098+00:00 · methodology


Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

[1]

Statistical Methods for the Social Sciences

A. Agresti and B. Finlay, Statistical Methods for the Social Sciences. Pearson Education Limited, 2014

work page 2014
[2]

[Online]

NYC open data. [Online]. Available: https://opendata.cityofnewyork.us

work page
[3]

Application of some correlation coefficient techniques to time-series analysis,

W. E. Dean Jr and R. Y. Anderson, "Application of some correlation coefficient techniques to time-series analysis," Journal of the International Association for Mathematical Geology, vol. 6, no. 4, pp. 363-372, 1974

work page 1974
[4]

Application of pearson correlation coefficient (pcc) and kolmogorov-smirnov distance (ksd) metrics to identify disease-specific biomarker genes,

H.-C. Huang, S. Zheng, and Z. Zhao, "Application of pearson correlation coefficient (pcc) and kolmogorov-smirnov distance (ksd) metrics to identify disease-specific biomarker genes," BMC Bioinformatics, vol. 11, no. 4, p. 1, 2010

work page 2010
[5]

Analysis of covariance with qualitative data,

G. Chamberlain, "Analysis of covariance with qualitative data," 1979

work page 1979
[6]

Correlation analysis of spatial time series datasets: A filter-and-refine approach,

P. Zhang, Y. Huang, S. Shekhar, and V. Kumar, "Correlation analysis of spatial time series datasets: A filter-and-refine approach," in PAKDD Proc., 2003

work page 2003
[7]

Spatio-temporal correlation: theory and applications for wireless sensor networks,

M. C. Vuran, Ö. B. Akan, and I. F. Akyildiz, "Spatio-temporal correlation: theory and applications for wireless sensor networks," Computer Networks, vol. 45, no. 3, pp. 245-259, 2004

work page 2004
[8]

Spatio-temporal correlation-based fast coding unit depth decision for high efficiency video coding,

C. Zhou, F. Zhou, and Y. Chen, "Spatio-temporal correlation-based fast coding unit depth decision for high efficiency video coding," Journal of Electronic Imaging, vol. 22, no. 4, pp. 043 001-043 001, 2013

work page 2013
[9]

Spatiotemporal models for data-anomaly detection in dynamic environmental monitoring campaigns,

E. W. Dereszynski and T. G. Dietterich, "Spatiotemporal models for data-anomaly detection in dynamic environmental monitoring campaigns," ACM Transactions on Sensor Networks (TOSN), vol. 8, no. 1, p. 3, 2011

work page 2011
[10]

Towards sustainable solutions for applications in cloud computing and big data,

T. T. N. Ho, "Towards sustainable solutions for applications in cloud computing and big data," in Doctoral dissertation. Politecnico di Milano, Italy, 2017, http://hdl.handle.net/10589/131740

work page 2017
[11]

A data-value-driven adaptation framework for energy efficiency for data intensive applications in clouds,

T. T. N. Ho and B. Pernici, "A data-value-driven adaptation framework for energy efficiency for data intensive applications in clouds," in Technologies for Sustainability (SusTech), 2015 IEEE Conference on. IEEE, 2015, pp. 47-52

work page 2015
[12]

Finding related tables,

A. Das Sarma, L. Fang, N. Gupta, A. Halevy, H. Lee, F. Wu, R. Xin, and C. Yu, "Finding related tables," in SIGMOD Proc., 2012, pp. 817-828

work page 2012
[13]

Fusing data with correlations,

R. Pochampally, A. Das Sarma, X. L. Dong, A. Meliou, and D. Srivastava, "Fusing data with correlations," in SIGMOD Proc., 2014

work page 2014
[14]

Helping scientists reconnect their datasets,

A. Alawini, D. Maier, K. Tufte, and B. Howe, "Helping scientists reconnect their datasets," in SSDBM Proc., 2014

work page 2014
[15]

A formal approach to finding explanations for database queries,

S. Roy and D. Suciu, "A formal approach to finding explanations for database queries," in SIGMOD Proc., 2014

work page 2014
[16]

δ-clusters: Capturing subspace correlation in a large data set,

J. Yang, W. Wang, H. Wang, and P. Yu, "δ-clusters: Capturing subspace correlation in a large data set," in Data Engineering Proc., 2002.

Copyright (c) 2019 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org. This is the author's version of an article that has been publi...

work page doi:10.1109/tbda 2002
[17]

A fast and effective method to find correlations among attributes in databases,

E. P. de Sousa, C. Traina Jr, A. J. Traina, L. Wu, and C. Faloutsos, "A fast and effective method to find correlations among attributes in databases," Data Mining and Knowledge Discovery, vol. 14, no. 3, pp. 367-407, 2007

work page 2007
[18]

Efficient sentinel mining using bitmaps on modern processors,

M. Middelfart, T. B. Pedersen, and J. Krogsgaard, "Efficient sentinel mining using bitmaps on modern processors," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 10, pp. 2231-2244, 2013

work page 2013
[19]

Data polygamy: the many-many relationships among urban spatio-temporal data sets,

F. Chirigati, H. Doraiswamy, T. Damoulas, and J. Freire, "Data polygamy: the many-many relationships among urban spatio-temporal data sets," in SIGMOD Proc., 2016

work page 2016
[20]

The sliding window correlation procedure for detecting hidden correlations: existence of behavioral subgroups illustrated with aged rats,

D. Schulz and J. P. Huston, "The sliding window correlation procedure for detecting hidden correlations: existence of behavioral subgroups illustrated with aged rats," Journal of neuroscience methods, vol. 121, no. 2, pp. 129-137, 2002

work page 2002
[21]

Fast window correlations over uncooperative time series,

R. Cole, D. Shasha, and X. Zhao, "Fast window correlations over uncooperative time series," in Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, 2005, pp. 743-749

work page 2005
[22]

Estimating mutual information on data streams,

F. Keller, E. Müller, and K. Böhm, "Estimating mutual information on data streams," in SSDBM Proc., 2015

work page 2015
[23]

Local correlation detection with linearity enhancement in streaming data,

Q. Xie, S. Shang, B. Yuan, C. Pang, and X. Zhang, "Local correlation detection with linearity enhancement in streaming data," in Proceedings of the 22nd ACM international conference on Information & Knowledge Management. ACM, 2013, pp. 309-318

work page 2013
[24]

Analysing real world data streams with spatio-temporal correlations: Entropy vs. pearson correlation,

M. Bermudez-Edo, P. Barnaghi, and K. Moessner, "Analysing real world data streams with spatio-temporal correlations: Entropy vs. pearson correlation," Automation in Construction, vol. 88, pp. 87-100, 2018

work page 2018
[25]

Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy,

H. Peng, F. Long, and C. Ding, "Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy," IEEE Trans. on pattern analysis and machine intelligence, vol. 27, no. 8, pp. 1226-1238, 2005

work page 2005
[26]

Normalized mutual information feature selection,

P. A. Estevez, M. Tesmer, C. A. Perez, and J. M. Zurada, "Normalized mutual information feature selection," IEEE Trans. on Neural Networks, vol. 20, no. 2, pp. 189-201, 2009

work page 2009
[27]

Information based clustering,

N. Slonim, G. S. Atwal, G. Tkacik, and W. Bialek, "Information based clustering," Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 51, pp. 18 297-18 302, 2005

work page 2005
[28]

An information-theoretic approach to quantitative association rule mining,

Y. Ke, J. Cheng, and W. Ng, "An information-theoretic approach to quantitative association rule mining," Knowledge and Information Systems, vol. 16, no. 2, pp. 213-244, 2008

work page 2008
[29]

Mutual information-based registration of medical images: a survey,

J. P. Pluim, J. A. Maintz, and M. A. Viergever, "Mutual information-based registration of medical images: a survey," IEEE Trans. on Medical Imaging, vol. 22, no. 8, pp. 986-1004, 2003

work page 2003
[30]

Information theoretic inference of large transcriptional regulatory networks,

P. E. Meyer, K. Kontos, F. Lafitte, and G. Bontempi, "Information theoretic inference of large transcriptional regulatory networks," EURASIP journal on bioinformatics and systems biology, vol. 2007, no. 1, pp. 1-9, 2007

work page 2007
[31]

Aracne: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context,

A. A. Margolin, I. Nemenman, K. Basso, C. Wiggins, G. Stolovitzky, R. D. Favera, and A. Califano, "Aracne: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context," BMC bioinformatics, vol. 7, no. Suppl 1, p. S7, 2006

work page 2006
[32]

Using time-delayed mutual information to discover and interpret temporal correlation structure in complex populations,

D. J. Albers and G. Hripcsak, "Using time-delayed mutual information to discover and interpret temporal correlation structure in complex populations," Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 22, no. 1

work page
[33]

Spatiotemporal dynamics of the magnetosphere during geospace storms: Mutual information analysis,

J. Chen, A. Sharma, J. Edwards, X. Shao, and Y. Kamide, "Spatiotemporal dynamics of the magnetosphere during geospace storms: Mutual information analysis," Journal of Geophysical Research: Space Physics, vol. 113, no. A5, 2008

work page 2008
[34]

Supporting correlation analysis on scientific datasets in parallel and distributed settings,

Y. Su, G. Agrawal, J. Woodring, A. Biswas, and H.-W. Shen, "Supporting correlation analysis on scientific datasets in parallel and distributed settings," in HPDC Proc., 2014

work page 2014
[35]

An adaptive information-theoretic approach for identifying temporal correlations in big data sets,

N. Ho, H. Vo, and M. Vu, "An adaptive information-theoretic approach for identifying temporal correlations in big data sets," in Big Data (Big Data), 2016 IEEE International Conference on. IEEE, 2016, pp. 666-675

work page 2016
[36]

T. M. Cover and J. A. Thomas, Elements of information theory. John Wiley & Sons, 2012

work page 2012
[37]

Some data analyses using mutual information,

D. R. Brillinger, "Some data analyses using mutual information," Brazilian Journal of Probability and Statistics, pp. 163-182, 2004

work page 2004
[38]

A comparative study of statistical methods used to identify dependencies between gene expression signals,

S. de Siqueira Santos, D. Y. Takahashi, A. Nakata, and A. Fujita, "A comparative study of statistical methods used to identify dependencies between gene expression signals," Briefings in bioinformatics, vol. 15, no. 6, pp. 906-918, 2013

work page 2013
[39]

Estimation of entropy and mutual information,

L. Paninski, "Estimation of entropy and mutual information," Neural computation, vol. 15, no. 6, pp. 1191-1253, 2003

work page 2003
[40]

Estimating mutual information,

A. Kraskov, H. Stogbauer, and P. Grassberger, "Estimating mutual information," Physical review E, vol. 69, no. 6, 2004

work page 2004
[41]

Evaluation of mutual information estimators for time series,

A. Papana and D. Kugiumtzis, "Evaluation of mutual information estimators for time series," International Journal of Bifurcation and Chaos, vol. 19, no. 12, pp. 4197-4215, 2009

work page 2009
[42]

Mutual information estimation in higher dimensions: A speed-up of a k-nearest neighbor based estimator,

M. Vejmelka and K. Hlavackova-Schindler, "Mutual information estimation in higher dimensions: A speed-up of a k-nearest neighbor based estimator," in ICANNGA Proc

work page
[43]

Efficient neighbor searching in nonlinear time series analysis,

T. Schreiber, "Efficient neighbor searching in nonlinear time series analysis," International Journal of Bifurcation and Chaos, vol. 05, no. 02, pp. 349-358, 1995

work page 1995
[44]

Probability distributions and maximum entropy,

K. Conrad, "Probability distributions and maximum entropy," Entropy, vol. 6, no. 452, 2004

work page 2004
[45]

[Online]

Center of urban science and progress, new york university. [Online]. Available: http://cusp.nyu.edu

work page
[46]

[Online]

Center of data-intensive system. [Online]. Available: http://www.daisy.aau.dk

work page
[47]

A mutual information approach to calculating nonlinearity,

R. Smith, "A mutual information approach to calculating nonlinearity," Stat, vol. 4, no. 1, pp. 291-303, 2015

work page 2015
[48]

Linear interpolation,

M. Hazewinkel, "Linear interpolation," in Encyclopaedia of Mathematics. Springer Science & Business Media, 1990

work page 1990
[49]

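Comparison between anns and linear mcp algorithms in the long-term estimation of the cost per kwh produced by a wind turbine at a candidate site: a case study in the canary islands,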

S. Velazquez, J. A. Carta, and J. Matias, "Comparison between anns and linear mcp algorithms in the long-term estimation of the cost per kwh produced by a wind turbine at a candidate site: a case study in the canary islands," Applied energy, vol. 88, no. 11, pp. 3869-3881, 2011

work page 2011
[50]

Measuring and testing dependence by correlation of distances,

G. J. Szekely, M. L. Rizzo, and N. K. Bakirov, "Measuring and testing dependence by correlation of distances," The annals of statistics, pp. 2769-2794, 2007.

Nguyen Ho is a Postdoc Research Associate at the Center for Data-Intensive Systems (Daisy) at the Department of Computer Science, Aalborg University, Denmark. Her research focuses on Big Data Analyt...

work page 2007

[1] [1]

Agresti and B

A. Agresti and B. Finlay, Statistical Methods for the Social Sciences. Pearson Education Limited, 2014

work page 2014

[2] [2]

[Online]

Nye open data. [Online]. Available: https:/ / opendata.cityofnewyork.us

work page

[3] [3]

Application of some correla tion coefficient techniques to time-series analysis,

W. E. Dean Jr and R. Y. Anderson, "Application of some correla tion coefficient techniques to time-series analysis," Journal of the International Association for Mathematical Geology, vol. 6, no. 4, pp. 363-372, 1974

work page 1974

[4] [4]

Application of pearson correlation coefficient (pee) and kolmogorov-smirnov distance (ksd) metrics to identify disease-specific biomarker genes,

H.-C. Huang, S. Zheng, and Z. Zhao, "Application of pearson correlation coefficient (pee) and kolmogorov-smirnov distance (ksd) metrics to identify disease-specific biomarker genes," BMC Bioinformatics, vol. 11, no. 4, p. 1, 2010

work page 2010

[5] [5]

Analysis of covariance with qualitative data,

G. Chamberlain, "Analysis of covariance with qualitative data," 1979

work page 1979

[6] [6]

Correlation analy sis of spatial time series datasets: A filter-and-refine approach,

P. Zhang, Y. Huang, S. Shekhar, and V. Kumar, "Correlation analy sis of spatial time series datasets: A filter-and-refine approach," in PAKDD Proc., 2003

work page 2003

[7] [7]

Spatio-temporal correlation: theory and applications for wireless sensor networks,

M. C. Vuran, 6. B. Akan, and I. F. Akyildiz, "Spatio-temporal correlation: theory and applications for wireless sensor networks," Computer Networks, vol. 45, no. 3, pp. 245-259, 2004

work page 2004

[8] [8]

Spatio-temporal correlation-based fast coding unit depth decision for high efficiency video coding,

C. Zhou, F. Zhou, and Y. Chen, "Spatio-temporal correlation-based fast coding unit depth decision for high efficiency video coding," Journal of Electronic Imaging, vol. 22, no. 4, pp. 043 001-043 001, 2013

work page 2013

[9] [9]

Spatiotemporal models for data-anomaly detection in dynamic environmental monitoring campaigns,

E. W. Dereszynski and T. G. Dietterich, "Spatiotemporal models for data-anomaly detection in dynamic environmental monitoring campaigns," ACM Transactions on Sensor Networks (TOSN), vol. 8, no. 1, p. 3, 2011

work page 2011

[10] [10]

Towards sustainable solutions for applications in cloud computing and big data,

T. T. N. HO, "Towards sustainable solutions for applications in cloud computing and big data," in Doctoral dissertation. Politec nico di Milano, Italy, 2017, http:/ /hdl.handle.net/10589/131740

work page 2017

[11] [11]

A data-value-driven adaptation framework for energy efficiency for data intensive applications in clouds,

T. T. N. Ho and B. Pernici, "A data-value-driven adaptation framework for energy efficiency for data intensive applications in clouds," in Technologies for Sustainability (SusTech), 2015 IEEE Conference on. IEEE, 2015, pp. 47-52

work page 2015

[12] [12]

Finding related tables,

A. Das Sarma, L. Fang, N. Gupta, A. Halevy, H. Lee, F. Wu, R. Xin, and C. Yu, "Finding related tables," in S/GMOD Proc., 2012, pp. 817-828

work page 2012

[13] [13]

Fusing data with correlations,

R. Pochampally, A. Das Sarma, X. L. Dong, A. Meliou, and D. Srivastava, "Fusing data with correlations," in S/GMOD Proc., 2014

work page 2014

[14] [14]

Helping scientists reconnect their datasets,

A. Alawini, D. Maier, K. Tufte, and B. Howe, "Helping scientists reconnect their datasets," in SSDBM Proc., 2014

work page 2014

[15] [15]

A formal approach to finding explanations for database queries,

S. Roy and D. Suciu, "A formal approach to finding explanations for database queries," in SIGMOD Proc., 2014

work page 2014

[16] [16]

a-clusters: Capturing subspace correlation in a large data set,

J. Yang, W. Wang, H. Wang, and P. Yu, "a-clusters: Capturing subspace correlation in a large data set," in Data Engineering Proc., 2002. Copyright ( c) 2019 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org. This is the author's version of an article that has been publi...

work page doi:10.1109/tbda 2002

[17] [17]

A fast and effective method to find correlations among attributes in databases,

E. P. de Sousa, C. Traina Jr, A. J. Traina, L. Wu, and C. Faloutsos, "A fast and effective method to find correlations among attributes in databases," Data Mining and Knowledge Discovery, vol. 14, no. 3, pp. 367-407, 2007

work page 2007

[18] [18]

Efficient sen tinel mining using bitmaps on modern processors,

M. Middelfart, T. B. Pedersen, and J. Krogsgaard, "Efficient sen tinel mining using bitmaps on modern processors," IEEE Transac tions on Knowledge and Data Engineering, vol. 25, no. 10, pp. 2231- 2244, 2013

work page 2013

[19] [19]

Dat a polygamy: the many-many relationships among urban spatio temporal data sets,

F. Chirigati, H. Dor aiswamy, T. Damoulas, and J. Freire, "Dat a polygamy: the many-many relationships among urban spatio temporal data sets," in SIGMOD Proc., 2016

work page 2016

[20] [20]

Th e sliding wi ndow correlation procedure for detecting hidden corr elations: existence of behav ioral subgroups illustrated with aged rats,

D. Schulz and J. P. Huston, "Th e sliding wi ndow correlation procedure for detecting hidden corr elations: existence of behav ioral subgroups illustrated with aged rats, " Journal of neuroscience methods, vol. 121, no. 2, pp. 129-137, 2002

work page 2002

[21] [21]

Fast wi ndow correlations over uncooperative time series,

R. Cole, D. Shasha, and X. Zhao, " Fast wi ndow correlations over uncooperative time series," in Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. AC M, 2005, pp. 743-749

work page 2005

[22] [22]

Estimating mutual information on data streams,

F. Keller, E. Mi.iller, and K. Bol:un, " Estimating mutual information on data streams," in SSDBM Proc., 2015

work page 2015

[23] [23]

Local correla tion detection wi th linearity enhancement in streaming data,

Q. Xie, S. Shang, B. Yuan, C. Pang, and X. Zhang, " Local correla tion detection wi th linearity enhancement in streaming data," in Proceedings of the 22nd ACM international conference on Information & Knowledge Management. ACM, 2013, pp. 309-318

work page 2013

[24] M. Bermudez-Edo, P. Barnaghi, and K. Moessner, "Analysing real world data streams with spatio-temporal correlations: Entropy vs. pearson correlation," Automation in Construction, vol. 88, pp. 87-100, 2018.

[25] H. Peng, F. Long, and C. Ding, "Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy," IEEE Trans. on pattern analysis and machine intelligence, vol. 27, no. 8, pp. 1226-1238, 2005.

[26] P. A. Estevez, M. Tesmer, C. A. Perez, and J. M. Zurada, "Normalized mutual information feature selection," IEEE Trans. on Neural Networks, vol. 20, no. 2, pp. 189-201, 2009.

[27] N. Slonim, G. S. Atwal, G. Tkacik, and W. Bialek, "Information based clustering," Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 51, pp. 18297-18302, 2005.

[28] Y. Ke, J. Cheng, and W. Ng, "An information-theoretic approach to quantitative association rule mining," Knowledge and Information Systems, vol. 16, no. 2, pp. 213-244, 2008.

[29] J. P. Pluim, J. A. Maintz, and M. A. Viergever, "Mutual information-based registration of medical images: a survey," IEEE Trans. on Medical Imaging, vol. 22, no. 8, pp. 986-1004, 2003.

[30] P. E. Meyer, K. Kontos, F. Lafitte, and G. Bontempi, "Information theoretic inference of large transcriptional regulatory networks," EURASIP journal on bioinformatics and systems biology, vol. 2007, no. 1, pp. 1-9, 2007.

[31] A. A. Margolin, I. Nemenman, K. Basso, C. Wiggins, G. Stolovitzky, R. D. Favera, and A. Califano, "Aracne: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context," BMC bioinformatics, vol. 7, no. Suppl 1, p. S7, 2006.

[32] D. J. Albers and G. Hripcsak, "Using time-delayed mutual information to discover and interpret temporal correlation structure in complex populations," Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 22, no. 1.

[33] J. Chen, A. Sharma, J. Edwards, X. Shao, and Y. Kamide, "Spatiotemporal dynamics of the magnetosphere during geospace storms: Mutual information analysis," Journal of Geophysical Research: Space Physics, vol. 113, no. A5, 2008.

[34] Y. Su, G. Agrawal, J. Woodring, A. Biswas, and H.-W. Shen, "Supporting correlation analysis on scientific datasets in parallel and distributed settings," in HPDC Proc., 2014.

[35] N. Ho, H. Vo, and M. Vu, "An adaptive information-theoretic approach for identifying temporal correlations in big data sets," in Big Data (Big Data), 2016 IEEE International Conference on. IEEE, 2016, pp. 666-675.

[36] T. M. Cover and J. A. Thomas, Elements of information theory. John Wiley & Sons, 2012.

[37] D. R. Brillinger, "Some data analyses using mutual information," Brazilian Journal of Probability and Statistics, pp. 163-182, 2004.

[38] S. de Siqueira Santos, D. Y. Takahashi, A. Nakata, and A. Fujita, "A comparative study of statistical methods used to identify dependencies between gene expression signals," Briefings in bioinformatics, vol. 15, no. 6, pp. 906-918, 2013.

[39] L. Paninski, "Estimation of entropy and mutual information," Neural computation, vol. 15, no. 6, pp. 1191-1253, 2003.

[40] A. Kraskov, H. Stogbauer, and P. Grassberger, "Estimating mutual information," Physical review E, vol. 69, no. 6, 2004.

[41] A. Papana and D. Kugiumtzis, "Evaluation of mutual information estimators for time series," International Journal of Bifurcation and Chaos, vol. 19, no. 12, pp. 4197-4215, 2009.

[42] M. Vejmelka and K. Hlavackova-Schindler, "Mutual information estimation in higher dimensions: A speed-up of a k-nearest neighbor based estimator," in ICANNGA Proc.

[43] T. Schreiber, "Efficient neighbor searching in nonlinear time series analysis," International Journal of Bifurcation and Chaos, vol. 05, no. 02, pp. 349-358, 1995.

[44] K. Conrad, "Probability distributions and maximum entropy," Entropy, vol. 6, no. 452, 2004.

[45] Center of urban science and progress, new york university. [Online]. Available: http://cusp.nyu.edu

[46] Center of data-intensive system. [Online]. Available: http://www.daisy.aau.dk

[47] R. Smith, "A mutual information approach to calculating nonlinearity," Stat, vol. 4, no. 1, pp. 291-303, 2015.

[48] M. Hazewinkel, "Linear interpolation," in Encyclopaedia of Mathematics. Springer Science & Business Media, 1990.

[49] S. Velazquez, J. A. Carta, and J. Matias, "Comparison between anns and linear mcp algorithms in the long-term estimation of the cost per kwh produced by a wind turbine at a candidate site: a case study in the canary islands," Applied energy, vol. 88, no. 11, pp. 3869-3881, 2011.

[50] G. J. Szekely, M. L. Rizzo, and N. K. Bakirov, "Measuring and testing dependence by correlation of distances," The annals of statistics, pp. 2769-2794, 2007.

Nguyen Ho is a Postdoc Research Associate at the Center for Data-Intensive Systems (Daisy) at the Department of Computer Science, Aalborg University, Denmark. Her research focuses on Big Data Analyt...