Conference Talks

C1 - Are we already FAIR? – The future of data sharing

Iris Pigeot

Leibniz Institute, Germany
Short bio

‘Data sharing’ is becoming increasingly important for the efficient use of resources. In 2007, the OECD called for easy access to research data for the scientific community. In 2016, the FAIR principles (findable, accessible, interoperable, reusable) for research data were published. In 2018, the German government decided to establish a National Research Data Infrastructure (NFDI), within which NFDI4Health is responsible for personal health data. This talk will present the infrastructures that have been realized so far and discuss potential (statistical) hurdles using illustrative examples. Further European developments, such as the European Health Data Space, will also be addressed.

C4 - Statistical Modeling of Complex Networks

Andressa Cerqueira

UFSCar
Short bio

Complex networks have received a great deal of attention from the statistical community, especially for analyzing and describing the interactions of random complex systems. Networks are everywhere, from social networks to biological networks and transportation systems. Understanding and analyzing these complex networks, however, remains a scientific challenge. In this talk we will discuss how statistical techniques allow us to uncover underlying patterns, identify communities, measure robustness, and predict the future behavior of complex networks.

C5 - Statistical Pitfalls in Measuring Biological Aging with Epigenetic Clocks: Insights from a Chronic Disease Setting

Fernanda Schumacher

Ohio State University, US
Short bio

The third goal of the United Nations 2030 Agenda for Sustainable Development is to “Ensure healthy lives and promote well-being for all at all ages”. Its targets are a good representation of how vast health care is, covering maternal mortality, epidemics of infectious diseases, premature mortality from chronic diseases, prevention and treatment of substance abuse, access to sexual and reproductive health care, and affordable health care, just to name a few. In addition to the essential and undeniable role of statisticians in measuring all the diverse indicators related to this goal, how can statisticians contribute to reaching the targets?

In this talk, I will focus on statistical challenges faced when working with chronic diseases, with particular emphasis on people with Multiple Sclerosis (MS). Beyond chronological age, biological age reflects the cumulative damage to a person’s cells over time and is increasingly recognized as a critical factor in understanding health disparities and the progression of chronic conditions. Several biomarkers have recently been proposed to measure biological age and assess the cumulative burden of aging, which directly affects not only life expectancy but also quality of life, especially for people with debilitating diseases. Among the most popular of these markers, epigenetic “clocks” use statistical and machine learning tools to detect DNA methylation (DNAm) patterns.

Epigenetic modifications represent a reversible mechanism in regulating the function of the genome without altering the underlying DNA sequence and have been linked to aging through several factors, allowing DNAm to be affected by environmental exposures and lifestyle habits. Since the first proposed epigenetic clock model in 2011, multiple epigenetic clocks have been reported with increasing accuracy, precision, and broader application prospects in aging research. Still, they are based on regression coefficients determined on general training populations, which are then used for out-of-sample prediction. Commonly, the predictions are then regressed on chronological age and technical variables, and the corresponding residuals are called Epigenetic Age Acceleration (EAA), which are then often used for hypothesis testing. 
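The residual-based definition of EAA described above can be sketched in a few lines. This is a minimal illustration on simulated data, not the speaker's analysis: the function name `epigenetic_age_acceleration`, the simulated ages, and the choice to regress on chronological age alone (technical variables are left out for brevity) are all assumptions.

```python
import random


def epigenetic_age_acceleration(predicted_age, chronological_age):
    """Residuals of a simple least-squares fit of clock-predicted age
    on chronological age (technical covariates omitted in this sketch)."""
    n = len(predicted_age)
    mean_x = sum(chronological_age) / n
    mean_y = sum(predicted_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chronological_age)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(chronological_age, predicted_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_age, predicted_age)]


# Simulated (hypothetical) data: a clock that tracks age with noise.
rng = random.Random(0)
age = [rng.uniform(20, 70) for _ in range(200)]
dnam_age = [5 + 0.9 * a + rng.gauss(0, 3) for a in age]

eaa = epigenetic_age_acceleration(dnam_age, age)
```

By construction these residuals average zero and are uncorrelated with chronological age within the sample, so EAA computed this way is a relative measure for the cohort being analyzed rather than an absolute one.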

The lack of interval estimates for individual predictions, the availability of several algorithms, the lack of a gold standard measure of biological age, and the use of prediction-based inference add to the statistical challenges of these markers. This talk will discuss such challenges, illustrated using data from a clinical study on biological aging in people with MS. I will also discuss the role of statisticians in ensuring that such issues are properly considered, especially when such measures could evolve to be outcomes in clinical trials for anti-aging treatment and the prevention of chronic disease progression.

C7 - Influence of Local Ancestry on Gene Expression in Cell Types of the Brazilian Population

Benilton de Sá Carvalho

UNICAMP
Short bio

The genetic diversity of the Brazilian population, the result of a complex history of migration and admixture, offers a unique opportunity to study how genetic ancestry influences gene expression. This study proposes an integrated analysis of single-cell RNA-seq (scRNA-seq) and whole-genome sequencing (WGS) data to investigate the relationship between local ancestry and differential gene expression across cell types. The talk will discuss preprocessing methods for scRNA-seq data, including filtering of low-quality cells, normalization, batch-effect removal, and cell-type annotation. In parallel, it will present preprocessing methods for whole-genome sequencing data, genotyping, and local ancestry inference. A regression model for negative binomial responses will be discussed and used to integrate the two data modalities, allowing gene expression to be compared between groups with different local ancestries. Extensions that accommodate covariates measured with error and correlated observations will also be discussed. This strategy will allow the identification of genes whose expression is significantly influenced by local ancestry in different cell types. These findings may provide new insights into the genetic basis of the phenotypic differences observed in the Brazilian population and contribute to the understanding of gene-environment interactions. This study demonstrates an innovative, integrated approach to exploring the influence of genetic ancestry on gene expression at the cellular level, using scRNA-seq and WGS data. Applying these methodologies may open new perspectives for research in population genetics and personalized medicine, especially in genetically diverse populations such as Brazil's.
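As a rough illustration of what a regression model for negative binomial responses looks like, the sketch below evaluates the log-likelihood of gene counts under a log link. It is not the speaker's model: the single covariate `ancestry_dosage` (assumed here to be the number of haplotype copies, 0, 1 or 2, of a given ancestry at a locus), the parameter names, the toy counts, and the absence of library-size offsets or cell-type terms are all assumptions made for brevity.

```python
import math


def nb_log_pmf(y, mu, size):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion parameter `size` (variance = mu + mu**2 / size)."""
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(size / (size + mu))
            + y * math.log(mu / (size + mu)))


def nb_log_likelihood(counts, ancestry_dosage, beta0, beta1, size):
    """Log-likelihood with a log link:
    log(mu_i) = beta0 + beta1 * ancestry_dosage_i."""
    total = 0.0
    for y, a in zip(counts, ancestry_dosage):
        mu = math.exp(beta0 + beta1 * a)
        total += nb_log_pmf(y, mu, size)
    return total


# Hypothetical toy data: expression counts for one gene in six cells.
counts = [0, 2, 1, 5, 7, 3]
ancestry_dosage = [0, 0, 1, 1, 2, 2]

ll_no_effect = nb_log_likelihood(counts, ancestry_dosage,
                                 beta0=1.0, beta1=0.0, size=2.0)
ll_with_effect = nb_log_likelihood(counts, ancestry_dosage,
                                   beta0=0.0, beta1=0.8, size=2.0)
```

In practice the coefficients and the dispersion would be estimated, for example by maximizing this log-likelihood, and the coefficient on ancestry would be tested per gene and per cell type.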

Funding: BRAINN/FAPESP 2013/07559-3

C8 - Recurrent Event Process Models: change point models and clustering of events

Elizabeth Juarez-Colunga

University of Colorado, US
Short bio

Recurrent event data arise when an event may occur repeatedly over time. Examples include recurrence of bladder cancer tumors, epileptic seizures, or pulmonary exacerbations. This talk will mainly discuss two projects. The first focuses on modeling pulmonary exacerbations and their relationship to a longitudinal binary outcome, and the second aims to understand the clustering of events within individuals.

The first project was motivated by a study of cystic fibrosis, a hereditary lung disease characterized by progressive loss of lung function. Chronic Pseudomonas aeruginosa (PA) infection is associated with worse clinical outcomes, including more frequent pulmonary exacerbations (PE). The longitudinal progression of PA infection and recurrent PE events are likely intrinsically linked, but their temporal interrelationship has not been fully characterized. It is known that the rate of PA progression increases as individuals age, with potential sharp changes in its trajectory. Using data from the Early Pseudomonas Infection Control Observational Study, we propose a joint model to examine longitudinal PA and recurrent PE events. This model incorporates individual-specific random effects in the longitudinal sub-model, linked to those in the recurrent event sub-model. The longitudinal sub-model includes two change points to represent sharp changes in the trajectory, while the recurrent event sub-model employs a counting process for recurrent events and accommodates delayed entry. The results indicate that children experience a modest increase of 5.13% per year in the odds of PA starting at age 6.9, followed by a more pronounced rise of 27.12% around age 14.5. Additionally, an increased probability of PA is associated with a higher risk of experiencing subsequent PE events.

The second project focuses on epileptic seizures, with the primary goal of understanding the clustering of seizures within individuals. We model clustering using a self-exciting stochastic process.
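To make the idea of a self-exciting process concrete, the sketch below evaluates a conditional intensity in which every past event temporarily raises the rate of new events. The abstract does not specify the form of the process; the exponentially decaying kernel (a Hawkes-type process), the parameter names `baseline`, `jump` and `decay`, and the event times are assumptions chosen only for illustration.

```python
import math


def self_exciting_intensity(t, past_events, baseline, jump, decay):
    """Conditional intensity at time t:
    baseline + sum over earlier events s of jump * exp(-decay * (t - s))."""
    return baseline + sum(jump * math.exp(-decay * (t - s))
                          for s in past_events if s < t)


# Hypothetical seizure times (in days) for one individual.
seizure_times = [1.0, 1.2, 1.3, 6.0]

for t in (0.5, 1.4, 4.0, 6.1):
    rate = self_exciting_intensity(t, seizure_times,
                                   baseline=0.2, jump=0.8, decay=1.5)
    print(t, round(rate, 3))
```

Under this assumed form, the intensity equals the baseline before the first event, rises sharply after a burst of closely spaced events, and decays back toward the baseline during quiet periods, which is the mechanism that produces clusters.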

C9 - Causal Inference on Flexible Non-mixture Cure Rate Modeling with Piecewise Hazard and Gaussian Process

Dipak Dey

University of Connecticut, US
Short bio

In the field of oncology, survival analysis often requires the inclusion of a cure fraction to account for individuals who are effectively cured of their disease. The concept of a cure fraction in survival analysis was introduced in a study examining long-term survival following cancer therapy. This early work laid the foundation for defining cured individuals as those no longer at risk of cancer recurrence after a certain period. Recently, there has been increasing interest in semiparametric mixture cure models, which relax some parametric assumptions.

The non-mixture cure rate model, another branch of cure rate modeling, represents a significant advance in the field. Unlike the traditional mixture cure rate model, it introduces a latent variable, often interpreted as the unobserved count of cancer cells, to indirectly estimate an individual’s cure status. This latent factor approach offers several benefits, including the flexibility to integrate a proportional hazards structure and enhanced computational efficiency. Over time, the non-mixture cure rate model has been extended extensively to handle complex data, leveraging semiparametric methods for modeling the survival function. The model typically incorporates covariates into the cure rate parameter through a log-linear form, assuming a Poisson distribution for the unobserved cancer cell count. However, the common use of a linear functional form for covariate effects can be restrictive, particularly for continuous covariates that often display nonlinear relationships.
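The structure described above can be sketched as follows: with a Poisson-distributed latent count whose mean is log-linear in the covariates, the probability of a zero count is the cure fraction, and the population survival function takes the form exp(-theta * F(t)). This is a minimal sketch under stated assumptions, not the speaker's implementation: the exponential form of F(t), the `rate` parameter, the function names, and the example covariates are hypothetical, and the talk's title indicates a piecewise hazard rather than the exponential used here.

```python
import math


def latent_count_mean(x, beta):
    """Poisson mean theta of the latent count, log-linear in covariates."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def cure_fraction(x, beta):
    """Probability that the latent count is zero: exp(-theta)."""
    return math.exp(-latent_count_mean(x, beta))


def population_survival(t, x, beta, rate=1.0):
    """Population survival exp(-theta * F(t)), where F is assumed here to be
    an exponential distribution function with the given rate."""
    theta = latent_count_mean(x, beta)
    promotion_cdf = 1.0 - math.exp(-rate * t)
    return math.exp(-theta * promotion_cdf)


# Hypothetical covariate vector (intercept, one covariate) and coefficients.
x = [1.0, 0.5]
beta = [0.2, -0.4]

plateau = cure_fraction(x, beta)
curve = [population_survival(t, x, beta) for t in (0.0, 1.0, 2.0, 50.0)]
```

The survival curve starts at one and levels off at the cure fraction instead of dropping to zero, which is the defining feature of a cure rate model. Replacing the linear predictor with a smooth function of the covariate is where a more flexible specification would enter.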

In modeling cure rates, it is often assumed that the effects of continuous covariates vary smoothly over their domain. However, the exact relationship between these covariates and the event of interest is not typically known a priori and may exhibit complex, nonlinear patterns. To flexibly capture these nonlinear covariate effects, we impose a Gaussian Process prior over the effects of the continuous covariates.

In this presentation, we consider non-mixture cure rate models with a Gaussian Process prior and further develop a causal approach to decide the order of treatment procedures. The methodology is illustrated with a breast cancer study.

Iris Pigeot

Leibniz Institute, Germany

Prof. Dr. Iris Pigeot is the Director of the Leibniz Institute for Prevention Research and Epidemiology – BIPS and a professor at the University of Bremen. Prof. Pigeot is the president of the International Biometric Society (IBS). Her research focuses on innovative statistical approaches in areas such as pharmacoepidemiology, prevention research, and complex data analysis. Prof. Pigeot is widely recognized for her dedication to advancing biostatistics and fostering collaboration among researchers globally.

Personal website

Andressa Cerqueira

UFSCar

Andressa Cerqueira has been a professor in the Department of Statistics at the Universidade Federal de São Carlos since 2020. Her research interests include statistical inference for random graphs and networks with applications in the biological sciences. She received her PhD in Statistics from the Universidade de São Paulo in 2018. During her PhD, she was a visiting student at the Institut de Mathématiques de Toulouse, France. She completed a postdoctoral fellowship at the Universidade de Campinas and subsequently held a postdoctoral position at the University of Michigan.

Lattes CV

Fernanda Schumacher

Ohio State University, US

Fernanda Lang Schumacher is an assistant professor in the Division of Biostatistics of The Ohio State University College of Public Health. She completed her Ph.D. in Statistics in 2021 at the University of Campinas, Brazil, where she also obtained a master’s degree in Statistics in 2016. Her research interests include robust models, longitudinal data, skewed distributions, models for censored data, missing data, variable selection for mixed models, and multiple sclerosis.

Personal website

Benilton de Sá Carvalho

UNICAMP

Benilton Carvalho is an Associate Professor and Head of the Department of Statistics at UNICAMP. He holds a PhD in Biostatistics from Johns Hopkins University, completed postdoctoral fellowships at the University of Cambridge and UNICAMP, and has worked extensively on the integration of statistics and genomics. He pioneered the development of computational tools such as the oligo and crlmm packages, which are widely used in genomic analyses. He is also a co-founder of the Brazilian Initiative on Precision Medicine (BIPMed), which promotes the sharing of genomic data from underrepresented populations.

Benilton is currently a Principal Investigator on two large projects with international impact. In the JAGUAR project, funded by the Chan Zuckerberg Initiative, he coordinates the admixture mapping, eQTL, and genotyping efforts to map the diversity of immune cells in Latin America, contributing to advances in precision medicine in diverse populations. In the INCT Model3D, he combines innovation in bioinformatics and biology to understand the mechanisms of chronic diseases.

Lattes CV

Elizabeth Juarez-Colunga

University of Colorado, US

Elizabeth Juarez-Colunga is an Associate Professor in the Department of Biostatistics and Informatics at the Colorado School of Public Health, University of Colorado Anschutz Medical Campus. Originally from Mexico, she earned her undergraduate degree in Applied Mathematics from the Universidad Autónoma de Querétaro and her MSc in Mathematical Sciences from the Universidad Nacional Autónoma de México. She obtained her PhD in Statistics from Simon Fraser University in Canada.

Dr. Juarez-Colunga’s research areas of interest include survival analysis, recurrent events, longitudinal data, and joint models. She has been a collaborative biostatistician at the University of Colorado Anschutz Medical Campus since 2012. She has collaborated on projects in geriatrics, palliative care, seizures, and health services research in general. Additionally, she has served in various roles within the Western North American Region (WNAR) of the International Biometric Society, including president.

Personal website

Dipak Dey

University of Connecticut, US

Prof. Dipak K. Dey is a Board of Trustees Distinguished Professor in the Department of Statistics at the University of Connecticut (UConn). A prominent statistician, he is best known for his pioneering work in Bayesian analysis, decision science, and model selection. With over 320 research articles published in reputable national and international journals, and over 10 books and edited volumes to his name, he has made a significant impact on the field of statistics and data science. Prof. Dey earned his Bachelor's and Master's degrees in Statistics from the Indian Statistical Institute and a Ph.D. in Statistics from Purdue University, under the supervision of Prof. Jim Berger. Before joining UConn in 1985, Prof. Dey held academic positions at Stanford University, the University of Kentucky, and Texas Tech University, and he has also held visiting appointments at several universities and institutions worldwide. He is a fellow of the American Association for the Advancement of Science, the American Statistical Association, the Institute of Mathematical Statistics, the International Society for Bayesian Analysis, and the International Statistical Institute, and has received numerous awards and honors for his work. Prof. Dey is a dedicated mentor to students and colleagues. He has supervised over 45 Ph.D. students and has collaborated with practically every colleague in his department in a career spanning more than 40 years, helping tenure-track faculty and Ph.D. students achieve their professional goals. His broad range of interests and expertise, combined with his devotion to his peers, has been instrumental to many in the statistical community. One of his many awards was the Marth Award for mentorship at UConn. Prof. Dey has held multiple leadership positions. He served for fourteen years as department head and for five years as Associate Dean for Research in the College of Liberal Arts and Sciences at UConn.
He has been a highly effective leader while maintaining an extremely active academic career. Among his many accomplishments in his leadership roles, Prof. Dey oversaw an expansion of the statistics department; he started a Biostatistics program, a partnership with UConn Health; he developed collaborative research programs with various other schools, colleges, and institutes (e.g., CHIP, IMS, Center for Environmental Science); and he initiated corporate partnerships with Pfizer, CIGNA, and Travelers. Prof. Dey has served as an associate editor for several statistical journals, including the Journal of the American Statistical Association (1997-1999) and the Journal of Statistical Planning and Inference (2001-2003). Since 2016, he has been the editor-in-chief of Sankhya, Series A and Series B, the official journal of the Indian Statistical Institute and the second oldest statistics journal in the world. Prof. Dey has a clear long-term vision for the field of statistics and data science based on his many years of experience as a researcher, mentor, teacher, and interdisciplinary collaborator. He believes that statistics should be introduced at an early age in schools so that students develop statistical thinking and learn how to apply it in real-life situations. His goal for the profession is to make it broadly understood, well beyond STEM programs. At the college level, data science education must include statistics, mathematics, and computational skills in order to train students who plan to pursue a professional career as data scientists in industry, government, and academia.

Personal website