Introduction

Solar radiation is the major source of human exposure to ultraviolet radiation (UVR)1, and the major risk factor for cutaneous melanoma and keratinocyte skin cancers2,3. Exposure to artificial UVR (indoor tanning) also increases skin cancer risk, and is classified as carcinogenic to humans4.

Identification of biomarkers indicating past exposures is important in the study of chronic diseases and their etiology. In epidemiological studies, DNA methylation has been a strong marker of environmental exposure5,6. Exposure to smoking, air pollution, and heavy metals has consistently been linked to epigenetic changes, mainly to DNA methylation7,8. UVR exposure has also been linked to DNA methylation, as it has been demonstrated to change the epigenetic profile of the epidermis9. An assessment of ambient UVR exposure and DNA methylation in CD4+ T-cells in European American individuals10 demonstrated an epigenome-wide significant association for cg26930596 (PRKCZ), but the association failed to replicate in an independent sample. An Australian study found an association between UVR exposure and total LINE-1 hypomethylation11. LINE-1 has often been used as a marker of genomic integrity, and a loss of methylation in LINE-1 is associated with global hypomethylation and with structural instability of the genome.

Global hypomethylation has been associated with multiple cancers, including bladder, liver, breast, kidney, colon and melanoma12. With UVR exposure as the main risk factor for melanoma, it is of interest to investigate if UVR exposure can affect epigenetic profiles, and if DNA methylation mediates the association between UVR exposure and the risk of melanoma. Our aim was to assess the former, i.e., whether DNA methylation in blood leucocytes is associated with life history of UVR exposure. We used data from the Norwegian Women and Cancer (NOWAC) study, a population-based cohort, with information on lifetime UVR exposure, which has shown consistent associations with skin cancer13,14,15,16,17,18. We studied genome-wide DNA methylation as well as global methylation, including imputation of LINE-1 specific CpGs, in whole blood. UVR exposure is the main driver for skin photoaging, and we also examined if lifetime UVR exposure could result in an accelerated epigenetic age, estimated from DNA methylation in leucocytes19. Analyses were performed in the discovery set and two replication sets.

Materials and Methods

Study samples

The NOWAC cohort includes 172 000 women who were aged 30–70 years (born 1927–1965) at enrollment in 1991–2006, drawn from a nationwide random sample (response 54%)20. Host characteristics and lifetime UVR exposure were collected through questionnaires at baseline and every 4–6 years. Approximately 50 000 women in the NOWAC cohort donated blood samples and constitute the postgenome cohort21. The present paper includes controls from three data sets from the postgenome cohort. All were cancer-free at the time of blood sampling and were selected as controls in case-control studies of melanoma (discovery set, n = 183 controls), breast cancer (replication set R1, n = 191 controls)22, and lung cancer (replication set R2, n = 125 controls)5,23. Matching factors were time since blood sampling and year of birth (1943–1947, 1948–1952, 1953–1957).

The women gave written informed consent to donate blood samples for biomarker analyses. We confirm that all methods employed in the study were performed in accordance with the relevant guidelines and regulations.

UVR exposure

On the basis of ambient UVR hours at place of residence, ambient UVR is categorized as low (northern Norway), medium-low (central Norway), medium (southwestern Norway), and highest (southeastern Norway)24,25. In the baseline and follow-up questionnaires, participants reported history of severe sunburns (never, 1, 2–3, 4–5, ≥6 times per year), average number of weeks spent on sunbathing vacations per year (never, 1, 2–3, 4–6, ≥7 weeks) and average use of an indoor tanning device (never; rarely; 1, 2, or 3–4 times/month; >1 time/week) in childhood (≤9 years), adolescence (10–19 years), and adulthood (>19 years)24. The reported frequencies of indoor tanning, sunbathing vacations, and sunburns were transformed into equivalents of yearly sessions and multiplied by the length of each interval16. The participants were then classified into five categories: non-exposed and quartiles of the exposed. To capture the tail of the distribution, the upper quartile was further divided into two equally sized groups (i.e. six categories in total). Cumulative UVR exposure was constructed by summing the category scores (0–5) for indoor tanning and sunbathing vacations16.
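The scoring scheme described above can be illustrated with a minimal sketch. It assumes that cut points are computed among exposed participants only and that "summarizing the categories" means adding the two scores. The function names `exposure_scores` and `cumulative_score` are hypothetical and are not taken from the study.

```python
from statistics import quantiles

def exposure_scores(doses):
    """Map lifetime doses to scores 0-5.

    0 = non-exposed, 1-3 = lower three quartiles of the exposed,
    4-5 = the upper quartile split into two equally sized groups.
    Assumption: cut points are computed among exposed participants only.
    """
    exposed = sorted(d for d in doses if d > 0)
    # Seven cut points dividing the exposed into eight equal parts.
    octiles = quantiles(exposed, n=8, method="inclusive")
    # Keep the 25th, 50th, 75th and 87.5th percentiles.
    cuts = [octiles[1], octiles[3], octiles[5], octiles[6]]
    return [0 if d <= 0 else 1 + sum(d > c for c in cuts) for d in doses]

def cumulative_score(tanning_scores, vacation_scores):
    """Sum the two category scores per participant (assumed reading)."""
    return [a + b for a, b in zip(tanning_scores, vacation_scores)]

# Illustrative data: two non-exposed and sixteen exposed participants.
doses = [0, 0] + list(range(1, 17))
print(exposure_scores(doses))
```

With tied dose values the groups will not be exactly equal in size, so a real analysis would need an explicit rule for ties.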

Covariates

Participants reported education (≤10, 11–13, ≥14 years), smoking (never, former, current smoker), and hair color (dark brown/black, brown, blond/yellow, red), the last of which is the best measure of skin sensitivity to UVR in the NOWAC cohort13,15.

DNA methylation analyses

DNA was extracted at the HUNT Biobank, Levanger, Norway, and methylation arrays were analyzed at the Institute for Genomic Medicine, Torino, Italy. DNA was extracted from the blood samples using the QIAsymphony DNA Midi Kit (Qiagen, Crawley, UK), and 1000 ng (discovery) and 500 ng (R1 and R2) of DNA were bisulfite-converted (EZ-96 DNA Methylation-Gold™ Kit, Zymo Research, Orange, CA, USA) according to the manufacturer’s instructions.

The samples for the discovery set were randomly placed on the plates, and randomly assigned to a row/column position, with equally many cases and controls on each column and plate. The Illumina Infinium MethylationEPIC BeadChips were hybridized according to the manufacturer’s protocol. All predicted cross-hybridizing probes (44 210)26, out-of-band probes (2843), and all probes with at least one CpG with detection p-value above 0.8 (5504 CpGs) were removed. This left 775 528 CpGs in samples from 183 controls. DNA methylation at LINE-1 CpGs was imputed in the discovery set using the R-package REMP27 and its default pipeline, without removing cross-hybridizing probes. We assessed only LINE-1 methylation and not other repetitive elements, since this is by far the most studied marker of association between UVR and DNA methylation.

For R1 and R2, the Illumina Infinium HumanMethylation450 BeadChips were hybridized according to the manufacturer’s protocol. Plate-specific batch effects were corrected using ComBat28,29. After quality control that included removal of CpGs with >20% missing values and non-specific CpGs, 416 412 autosomal CpGs remained for R1 and 450 890 for R2. Quality controls have been described in detail for R1 (ref. 22) and R2 (ref. 5).

All three data sets had background subtraction and control normalization performed with minfi to reduce background noise and dye bias30. Beta mixture quantile normalization31 using the wateRmelon R-package32 was performed for type I and type II probes in the three sets jointly. Cell type composition was estimated using the Houseman algorithm33 with a reference data set from Reinius et al.34. White blood cell composition estimates were obtained for CD4+ and CD8+ T-cells, NK cells, B cells, monocytes, granulocytes, and we estimated the granulocytes-to-leukocytes ratio.

Statistical analysis

Correlations between the five UVR variables were estimated using Pearson’s correlation coefficient, r. Linear regression was used to study associations between UVR exposure variables and estimated fraction of each cell type component, as well as the lymphocyte to neutrophil ratio, adjusting for age at sampling, smoking status, time in freezer, and data set.
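As a small illustration of the correlation step, the following sketch computes Pearson's r between two exposure variables. The function name `pearson_r` and the example scores are hypothetical and are not study data.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical category scores for two UVR variables in four participants.
tanning = [1, 2, 3, 4]
vacations = [1, 3, 2, 4]
print(round(pearson_r(tanning, vacations), 2))  # 0.8
```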

The methylation values were transformed from beta-values to M-values using a base-2 logit transform, M = log2(beta / (1 − beta)). Smoking leaves a strong, well-known pattern in DNA methylation, so as a quality control we performed linear regression with smoking status as the main exposure and DNA methylation as the outcome.
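The beta-to-M transformation is the standard base-2 logit, sketched below together with its inverse. The clipping constant `eps` is an assumption added here to avoid infinite values at beta = 0 or 1; the text does not say how boundary values were handled.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta-value (0-1) to an M-value.

    M = log2(beta / (1 - beta)). Betas are clipped to [eps, 1 - eps];
    the clipping is an assumption of this sketch, not a documented step.
    """
    b = min(max(beta, eps), 1 - eps)
    return math.log2(b / (1 - b))

def m_to_beta(m):
    """Inverse transform: beta = 2^M / (1 + 2^M)."""
    return 2 ** m / (1 + 2 ** m)

print(beta_to_m(0.5))  # 0.0, i.e. half-methylated
```

M-values are unbounded and closer to homoscedastic than beta-values, which is the usual reason for using them as the outcome in linear models.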

In the genome-wide analysis, DNA methylation was modelled as the outcome and UVR as the covariate in a linear regression model for each CpG, adjusting for age, smoking status, and time in freezer. Additional adjustment was performed for hair color, as a marker of skin sensitivity15, and for cell type composition. We tested for interactions between cumulative UVR exposure and hair color in each CpG, and similarly between lifetime sunburns and hair color.
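The per-CpG model can be sketched as an ordinary least squares fit in which the M-value is the outcome and the UVR score is one column of the design matrix. This stdlib-only illustration uses hypothetical helper names (`ols`, `invert`). Categorical covariates such as smoking status would need dummy coding, which is not shown.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    p = len(a)
    m = [row[:] + [float(i == j) for j in range(p)] for i, row in enumerate(a)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        d = m[c][c]
        m[c] = [v / d for v in m[c]]
        for r in range(p):
            if r != c:
                f = m[r][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [row[p:] for row in m]

def ols(X, y):
    """Least squares fit. Each row of X starts with 1 for the intercept.

    Returns (coefficients, standard errors), one entry per column of X.
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [yi - sum(b * x for b, x in zip(coef, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - p)  # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(p)]
    return coef, se

# Hypothetical data for one CpG: columns are intercept and UVR score.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
m_values = [0.0, 1.0, 1.0, 2.0]
coef, se = ols(X, m_values)
print(coef[1], se[1])  # UVR effect estimate and its standard error
```

In a genome-wide analysis the same design matrix is reused for every CpG, so the inverse of X'X needs to be computed only once.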

We present estimated regression coefficients with standard errors (SE). Note that, as we are testing for trends through ordered categorical exposure variables, the estimated regression coefficients should be interpreted with caution. Furthermore, we used non-negative matrix factorization to summarize the UVR exposure variables and hair color, and to cluster the individuals into three exposure groups. Analysis of variance (ANOVA) was used to test for differences between these groups with regard to each CpG.
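For the group comparison, a one-way ANOVA at a single CpG reduces to the F statistic sketched below. The sketch covers only the ANOVA step and assumes that group membership is already known; the non-negative matrix factorization and clustering are not shown. The function name `anova_f` and the example values are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical M-values at one CpG in three exposure clusters.
clusters = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(anova_f(clusters))
```

The p-value follows from the F distribution with k − 1 and n − k degrees of freedom, which requires a statistics library and is omitted here.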

All p-value adjustments for multiple testing were done with the Benjamini-Hochberg false discovery rate (FDR) procedure. A CpG site was defined as significant if the FDR adjusted p-value was <0.05, and as replicated if the nominal p-value in any of the replication sets was <0.05. Replication was attempted for the 20 CpGs with lowest p-values in the discovery set.
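The Benjamini-Hochberg adjustment and the two decision rules stated above can be sketched as follows. The function names and the example p-values are hypothetical illustrations, not study results.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def is_significant(p_adjusted, alpha=0.05):
    """Discovery rule: FDR-adjusted p-value below alpha."""
    return p_adjusted < alpha

def is_replicated(p_nominal_by_set, alpha=0.05):
    """Replication rule: nominal p-value below alpha in any replication set."""
    return any(p < alpha for p in p_nominal_by_set)

# Hypothetical nominal p-values for four CpGs.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because replication uses nominal p-values in either of two sets, some of the 20 attempted CpGs per exposure can be expected to replicate by chance.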

We attempted replication of the previously reported association between ambient UVR exposure and cg26930596 in the PRKCZ gene10 in all three sets using linear regression.

We assessed global DNA methylation by two indicators: the average over all measured CpGs, and by imputing methylation at CpG sites in LINE-1. Average methylation levels were analyzed using linear regression, with UVR as the exposure and average methylation as the outcome, adjusting for age, smoking, and time in freezer. The association between UVR exposure and LINE-1 CpGs was modeled with two models, one at the level of individual CpGs with linear regression and one at the level of LINE-1 subfamilies using linear mixed models, with subfamilies as grouping factor, adjusting for age, smoking, and time in freezer for both models.

Biological age (PhenoAge) was estimated based on the 513 CpGs published by Levine et al.19, out of which 512 were available in the discovery set, 506 in R1 and 505 in R2. Age acceleration phenotype was defined as the difference between the chronological age and the estimated biological age35. Linear regression was used with age acceleration as the outcome and UVR as the exposure, adjusting for smoking and time in freezer. All analyses were performed using R software36.

Ethics approval

The Medical Ethical Committees of North Norway have approved the NOWAC study and the storage of human biological material, as well as each sub-study used in this project.

Results

Women in the discovery and replication sets were older than women invited to the postgenome cohort (Table 1). Furthermore, women in R2 were older, were recruited earlier, and had shorter time in freezer, lower education, and more non-smokers compared to the discovery and R1 sets. UVR exposures in the three sets are presented in Table 2, and R2 had the lowest proportion of women from the region with highest ambient UVR. Low correlation was found between residential ambient UVR and the other four UVR variables (−0.06 ≤ r ≤ 0.14), and between lifetime sunburns and the other UVR variables (0.09 ≤ r ≤ 0.16). Indoor tanning and sunbathing vacations were moderately correlated (r = 0.30).

Table 1 Characteristics of the women in the discovery and replication sets, and the women invited to participate in the postgenome cohort.
Table 2 Ultraviolet radiation (UVR) exposure in the discovery and the replication sets.

When testing each cell type independently, the UVR exposure variables were not significantly associated with cell type composition in any of the three sets (0.06 ≤ padjusted ≤ 0.98) (Supplementary Table S1). The lymphocyte to neutrophil ratio was also not significantly associated with any UVR exposure (0.07 ≤ padjusted ≤ 1). A total of 326 758 CpGs were present in all three sets. In the analysis of smoking, 113 CpG sites had padjusted < 0.05, of which 58 were replicated in at least one of the two replication sets (Supplementary Table S2).

Differentially methylated CpG sites

The top 20 CpGs associated with each UVR exposure are listed in Supplementary Table S3. Two of the top 20 CpGs replicated in either R1 (sunburns and cg00033666) or R2 (ambient UVR and cg05860019). One CpG (cg01884057) was genome-wide significantly associated with UVR exposure (cumulative UVR) in the discovery set (padjusted = 0.03), but it was replicated neither in R1 (pnominal = 0.64) nor in R2 (pnominal = 0.42) (Table 3). After further adjustment for hair color, the CpG associated with lifetime sunburns (cg00033666) was replicated (Supplementary Table S4). One new CpG replicated in this model, for sunbathing vacations (cg19577365) (Supplementary Table S4).

Table 3 Differentially methylated CpGs in the discovery set, either genome-wide significant, or replicated in replication sets R1 or R2.

We tested for interaction between lifetime sunburns and hair color and found no interaction for any of the CpGs (padjusted ≥ 0.42, discovery set). When testing the interaction between lifetime cumulative UVR and hair color, significant interaction was found for one CpG (cg15277477, padjusted = 4.1e-3), but this was not replicated (pnominal = 0.81 in R1 and pnominal = 0.99 in R2). After adjustment for cell type composition (Supplementary Table S3), no substantial differences were observed. The correlation coefficients between the top 20 effect estimates in the model with and without this adjustment ranged from 0.80 to 0.99.

The ANOVA comparing each CpG between the groups from the cluster analyses identified two CpGs among the top 20 that were replicated: cg21452538 (pnominal = 3.69e-5 in discovery) was replicated in R1 (pnominal = 0.03) and cg05967123 (pnominal = 2.75e-5 in discovery) in R2 (pnominal = 0.02). The main driver of these associations was a factor composed of sunbathing vacations and cumulative UVR exposure.

CpG site indicated in the literature

The CpG cg26930596 in the PRKCZ gene, previously reported to be associated with ambient UVR exposure, was significantly associated with ambient UVR exposure in R1 (pnominal = 9.34e-3), but not in the discovery set (pnominal = 0.65) or in R2 (pnominal = 0.28).

Global DNA methylation

Average methylation was not associated with any of the UVR exposure variables in the discovery or replication sets (0.06 ≤ pnominal ≤ 0.93) (Supplementary Table S5). Indoor tanning and cumulative UVR exposure had negative effect estimates in all three sets, sunbathing vacation had positive effect estimates in all three sets, while lifetime sunburns and ambient UVR had a positive effect estimate in the discovery set and negative estimates in both replication sets. In the discovery set, no LINE-1 CpG was significantly associated with any of the UVR exposure variables (data not shown). No LINE-1 subfamily was significantly associated with any of the UVR exposure variables (Supplementary Table S6).

Accelerated aging

Accelerated aging was associated with sunbathing vacations in R2 (regression coefficient = 1.8, SE = 0.48, pnominal = 1.20e-3), but not in the other two sets (0.08 ≤ pnominal ≤ 0.32). The remaining four UVR exposure variables were not significantly associated with accelerated aging (0.06 ≤ pnominal ≤ 0.88; with the lowest p-value for cumulative UVR in R2).

Discussion

We investigated the association between five UVR exposure variables (residential ambient UVR exposure, lifetime sunburns, lifetime sunbathing vacations, lifetime indoor tanning, and cumulative UVR exposure) and DNA methylation in lymphocytes in a discovery and two replication sets from the NOWAC cohort.

Only one CpG site (cg01884057) was significantly associated with cumulative UVR exposure, but this finding was not replicated. Additionally, two CpGs were suggestively associated with the other four UVR exposure variables and replicated in one of the replication sets.

The CpG associated with cumulative UVR in our study lies in a DNase hypersensitive region 7 kb upstream of the Adenylate Cyclase 3 (ADCY3) gene, shown to be a potential oncogene37. However, no robust association with skin cancer has been indicated. Ambient UVR exposure was suggestively associated with a CpG (cg05860019) about 10 kb upstream, in the shore of a CpG Island associated with the One cut homeobox 1 (ONECUT1) gene. This gene is mainly transcribed in liver cells, but is important for cell cycle regulation and potentially associated with tumorigenesis or metastasis of malignant tumors38. The CpG suggestively associated with lifetime sunburns (cg00033666) lies in the shore of a CpG Island next to the master regulator gene Nuclear Receptor subfamily 2, group F (NR2F2). This gene has been suggested as an inhibitor target for melanoma and other cancers39. Somatic mutations in NR2F2 have been observed in about 1% of melanomas40.

There are few studies on UVR exposure and DNA methylation, and most of the existing studies focus on cell lines or short-term exposure to UVR. The most similar study to ours in terms of design is the study by Aslibekyan et al.10, who investigated ambient UVR exposure and DNA methylation in CD4+ T-cells, which have been shown to express the CCR10 receptor when stimulated with sun-induced vitamin D3 (ref. 41). One CpG from Aslibekyan et al.10 was nominally significant (cg26930596) in one of our samples, but with an effect estimate in the opposite direction.

The average beta-value across all methylation probes was not associated with any of the UVR exposure variables, but we observed an indication (not statistically significant) of hypomethylation. This is in line with previous research, which has observed a loss of DNA methylation after UVR exposure11. UVR exposure has been linked to LINE-1 hypomethylation in previous studies, but this has not been translated into an increased risk of melanoma42. LINE-1 methylation is often used as an indicator for global methylation. In this study, we used both average methylation over all observed CpGs, and imputed CpG levels at LINE-1. Neither was significantly associated with any of the UVR exposures.

UVR exposure is a primary driver of photoaging in the skin, and it can be hypothesized that other tissues could also show accelerated aging after UVR exposure. However, the association between sunbathing vacations and accelerated aging observed in R2 was not very strong, and was not found in the discovery or the R1 sets. No other UVR exposures showed a significant association.

An important strength of our study is the detailed life history of solar and artificial UVR exposure in the population-based NOWAC cohort, which has been consistently associated with risk of cutaneous melanoma13,15,16,17 and squamous cell carcinoma14,18. Indoor tanning irradiances are high in UVA radiation43 while UVB is the main cause of sunburns44. The intensity of the UVR exposure could not be directly assessed through the questionnaires, but since the UVR questions were segmented into age intervals in decades for each individual, estimates of dose were obtained.

Smoking, an exposure with a well-demonstrated, strong epigenetic footprint, has been extensively studied, including in the NOWAC study5. As a quality control of the methylation data, we studied the associations with smoking in all three data sets. All significant probes that replicated across our sets have been previously reported in a large meta-analysis on methylation and smoking45, demonstrating that the data were of sufficient quality and sample size to find biomarkers of strong exposures.

A weakness of our study is the lack of a directly exposed tissue, that is, the use of whole blood rather than skin samples. When studying the epigenetic patterns relating to environmental exposures or diseases, being as close as possible (in time and space) to the affected tissue is important since the epigenetic profile differs between tissues. Different cell types will also respond differently to the same environmental exposure. However, this has to be balanced against the availability of biosamples. Large-scale, general-purpose biobanks suitable for pre-diagnostic sampling will usually store only blood samples. While skin is the primary exposed tissue to UVR, and thus the most relevant tissue for studying direct effects of the UVR exposure, secondary effects of chronic UVR exposure might also be observed elsewhere, including in circulating lymphocytes46. The suppression of the immune system by UVR exposure is documented and is used as treatment for some autoimmune diseases47. Under the hypothesis that sustained UVR exposure influences the immune system, the cell type composition would be affected by the UVR exposure, and thus act as a mediator on the path from exposure to outcome. This places cell type composition on the causal pathway between UVR exposure and DNA methylation. Thus, after adjusting for cell type composition we cannot identify the total effect of the UVR exposure on DNA methylation, but only the direct effect that is not caused by changes in the immune cell composition. We did not observe any strong associations between UVR exposures and the estimated white blood cell composition. The only UVR exposure variable showing some potential association with cell type composition was residential ambient UVR, which is also potentially confounded by population substructure. Given that no other UVR exposure showed a consistent association with cell type composition, this association is more likely caused by factors other than UVR exposure. Moreover, additional adjustment for cell type composition did not change the results.

The three data sets were collected as controls for case-control studies of melanoma (discovery), breast cancer (R1) and lung cancer (R2), which explains the older age of these sets compared to all women invited to the postgenome cohort and also the differences in time in freezer. The long-term storage of whole blood and DNA in biobanks may have a negative effect on DNA yield, but the integrity of DNA methylation does not seem to be affected by this48. However, there may be systematic differences between the three samples that are reflected in the time spent in freezer, such as differences in lab procedures, and this variable may serve as a proxy for such differences. To make the three samples more comparable, we therefore included this adjustment in all models.

UVR exposure is the main risk factor for skin cancer, but whether this risk is mediated by DNA methylation has not yet been determined. We have made an extensive analysis of the potential association between UVR exposure and DNA methylation in blood, investigating the problem using different statistical approaches. Thus, we feel confident that long-term UVR exposure has little effect on DNA methylation in blood, and if DNA methylation is acting as a mediator of the melanoma risk from chronic UVR exposure, this is not reflected in DNA methylation in white blood cells.