Fig 1. Overview of Linearized Encoding Analysis (LEA)
Methodological issues
Reverse double-dipping
This article (Kim, 2025) describes a methodological pitfall of cross-validation when evaluating predictive models applied to naturalistic neuroimaging data, namely ‘reverse double-dipping’ (RDD). More broadly, the problem is also known as ‘leakage in training examples’ and is difficult to detect in practice. RDD can occur when predictive modeling is applied to data from a conventional neuroscientific design, characterized by a limited set of stimuli repeated across trials and/or participants. When the same stimuli appear in both the training and test sets, the model can overfit to the repeated stimulus-driven signals, which yields spurious predictive performance even in the presence of independent noise. Using a theoretical formulation followed by comprehensive simulations and real-world examples, the article shows how such information leakage can occur and how severely it can compromise results and conclusions when combined with widespread informal reverse inference. The article concludes with practical recommendations for researchers to avoid RDD in their experimental design and analysis.
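The leakage described above can be illustrated with a toy simulation. The sketch below is not the article's own simulation code; it is a minimal example under stated assumptions: a small set of stimuli repeated over trials, features that are random and unrelated to the response, and a closed-form ridge regression. All function names and parameter values (`simulate_rdd`, `n_stim`, `n_rep`, `n_feat`, `alpha`) are hypothetical choices made for illustration. It compares a cross-validation split by repetition (every stimulus appears in both training and test sets) with a split by stimulus (test stimuli are never seen in training).

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression with an intercept; returns test predictions."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    w = np.linalg.solve(gram, Xc.T @ (y_train - y_mean))
    return (X_test - x_mean) @ w + y_mean


def cv_correlation(X, y, groups, alpha=1.0):
    """Leave-one-group-out CV; Pearson r between pooled predictions and y."""
    y_pred = np.empty_like(y)
    for g in np.unique(groups):
        test = groups == g
        y_pred[test] = ridge_fit_predict(X[~test], y[~test], X[test], alpha)
    return np.corrcoef(y_pred, y)[0, 1]


def simulate_rdd(n_stim=10, n_rep=20, n_feat=20, noise_sd=1.0, seed=0):
    """Return (r_leaky, r_clean) for one simulated data set (toy assumptions)."""
    rng = np.random.default_rng(seed)
    stim_id = np.repeat(np.arange(n_stim), n_rep)  # which stimulus each trial used
    rep_id = np.tile(np.arange(n_rep), n_stim)     # which repetition each trial is

    # Assumption: features are random per stimulus and carry no true
    # relationship to the response; they merely identify the stimulus.
    features = rng.standard_normal((n_stim, n_feat))
    # Stimulus-driven response that repeats across trials, plus independent noise.
    signal = rng.standard_normal(n_stim)
    X = features[stim_id]
    y = signal[stim_id] + noise_sd * rng.standard_normal(stim_id.size)

    r_leaky = cv_correlation(X, y, rep_id)   # same stimuli in train and test
    r_clean = cv_correlation(X, y, stim_id)  # test stimuli unseen in training
    return r_leaky, r_clean


if __name__ == "__main__":
    results = np.array([simulate_rdd(seed=s) for s in range(20)])
    print("mean r, split by repetition:", results[:, 0].mean())
    print("mean r, split by stimulus:  ", results[:, 1].mean())
```

Under these assumptions, the split by repetition should tend to give a clearly positive prediction accuracy although the features are meaningless, whereas the split by stimulus should stay near (or slightly below) zero. Averaging over several seeds is advisable because a single run with only ten stimuli is noisy.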
Fig 2. Reverse double-dipping: data dips you, twice!
Time series prediction
Fig 3. Spurious correlation between smooth time series
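The point of the figure can be reproduced with a small numerical sketch. This is an illustrative example only, assuming a simple moving-average filter as the source of smoothness; the helper names (`moving_average`, `correlation_spread`) and the parameter values are hypothetical. It draws pairs of independent series and compares how widely their Pearson correlations scatter before and after smoothing.

```python
import numpy as np


def moving_average(x, width):
    """Smooth a 1-D series with a boxcar kernel (valid samples only)."""
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="valid")


def correlation_spread(n_pairs=500, n_samples=300, width=30, seed=0):
    """Std of Pearson r across independent pairs: (white noise, smoothed)."""
    rng = np.random.default_rng(seed)
    r_white = np.empty(n_pairs)
    r_smooth = np.empty(n_pairs)
    for i in range(n_pairs):
        # Two series that are independent by construction.
        a, b = rng.standard_normal((2, n_samples))
        r_white[i] = np.corrcoef(a, b)[0, 1]
        r_smooth[i] = np.corrcoef(
            moving_average(a, width), moving_average(b, width)
        )[0, 1]
    return r_white.std(), r_smooth.std()


if __name__ == "__main__":
    sd_white, sd_smooth = correlation_spread()
    print("spread of r, white noise:", sd_white)
    print("spread of r, smoothed:   ", sd_smooth)
```

Because smoothing reduces the effective number of independent samples, large correlations between unrelated smooth series arise by chance far more often than the nominal sample size suggests.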
Resources
Kim, 2024-09-07, Linearized Encoding Modeling: a Predictive Analysis Methodology for Music Perception, Korean Society for Music Perception and Cognition (KSMPC) Summer School 24, Session 3 lecture. [slides][code][repo]
References
2025
preprint
Reverse Double-Dipping: When Data Dips You, Twice—Stimulus-Driven Information Leakage in Naturalistic Neuroimaging
This article elucidates a methodological pitfall of cross-validation for evaluating predictive models applied to naturalistic neuroimaging data, namely ‘reverse double-dipping’ (RDD). In a broader context, this problem is also known as ‘leakage in training examples’, which poses challenges in detecting it in practice. This issue can occur when predictive modeling is employed with data from a conventional neuroscientific design, characterized by a limited set of stimuli repeated across trials and/or participants, resulting in spurious predictive performances due to overfitting to repeated signals, even in the presence of independent noise. Through comprehensive simulations and real-world examples following theoretical formulation, the article underscores how such information leakage can occur and how severely it could compromise the analysis when it is combined with widely spread informal reverse inference. The article concludes with practical recommendations for researchers to avoid RDD in their experiment design and analysis.
Competing Interest Statement: The authors have declared no competing interest.
@article{kim2025rdd,
  author = {Kim, Seung-Goo},
  doi = {10.1101/2025.04.01.646146},
  elocation-id = {2025.04.01.646146},
  eprint = {https://www.biorxiv.org/content/early/2025/04/05/2025.04.01.646146.full.pdf},
  journal = {bioRxiv},
  publisher = {Cold Spring Harbor Laboratory},
  title = {Reverse Double-Dipping: When Data Dips You, Twice{\textemdash}Stimulus-Driven Information Leakage in Naturalistic Neuroimaging},
  year = {2025},
  bdsk-url-1 = {https://www.biorxiv.org/content/early/2025/04/05/2025.04.01.646146},
  bdsk-url-2 = {https://doi.org/10.1101/2025.04.01.646146}
}
@article{kim2024cc,
  journal = {Cerebral Cortex},
  author = {Kim, Seung-Goo and De Martino, Federico and Overath, Tobias},
  date-modified = {2024-04-25 14:43:05 +0200},
  doi = {10.1093/cercor/bhae155},
  title = {Linguistic modulation of the neural encoding of phonemes},
  year = {2024}
}
Proceedings – The Joint Conference of the 17th International Conference on Music Perception and Cognition (ICMPC) and the 7th Conference of the Asia-Pacific Society for the Cognitive Sciences of Music (APSCOM), 2023
@article{kim2023icmpc,
  author = {Kim, Seung-Goo and Overath, Tobias and Sammler, Daniela},
  journal = {Proceedings -- The Joint Conference of the 17th International Conference on Music Perception and Cognition (ICMPC) and the 7th Conference of the Asia-Pacific Society for the Cognitive Sciences of Music (APSCOM)},
  date = {2023-08-01},
  date-modified = {2024-04-25 18:48:05 +0200},
  organization = {International Conference on Music Perception and Cognition (ICMPC)},
  title = {Emotion-relevant Representations of Music Extracted by Convolutional Neural Networks Are Encoded in Medial Prefrontal Cortex},
  year = {2023}
}
This article discusses recent developments and advances in the neuroscience of music to understand the nature of musical emotion. In particular, it highlights how system identification techniques and computational models of music have advanced our understanding of how the human brain processes the textures and structures of music and how the processed information evokes emotions. Musical models relate physical properties of stimuli to internal representations called features, and predictive models relate features to neural or behavioral responses and test their predictions against independent unseen data. The new frameworks do not require orthogonalized stimuli in controlled experiments to establish reproducible knowledge, which has opened up a new wave of naturalistic neuroscience. The current review focuses on how this trend has transformed the domain of the neuroscience of music.
@article{kim2022fn,
  author = {Kim, Seung-Goo},
  doi = {10.3389/fnins.2022.928841},
  issn = {1662-453X},
  journal = {Frontiers in Neuroscience},
  title = {On the encoding of natural music in computational models and human brains},
  volume = {16},
  year = {2022},
  bdsk-url-1 = {https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2022.928841},
  bdsk-url-2 = {https://doi.org/10.3389/fnins.2022.928841}
}