ZFP148 is a transcriptional repressor of cytolytic effector CD8+ T cell differentiation

Ethics

All research conducted in this study complied with ethical regulations and was approved by The Ohio State University Institutional Animal Care and Use Committee (IACUC; protocol 2019A00000075), Institutional Biosafety Committee (IBC; protocol 2019R00000046), and Institutional Review Board (Buck-IRB; protocol 2018C0181).

Mice

C57BL/6J (strain 000664) mice were obtained from the Jackson Laboratory. CD8+ T cell-specific ZFP148-deficient mice were generated by crossing E8iCre (the Jackson Laboratory, strain 008766) mice with Zfp148fl/fl mice kindly provided by J. L. Merchant at University of Arizona. P14 mice were kindly provided by S. M. Kaech at The Salk Institute for Biological Studies. KLF2GFP reporter mice were kindly provided by S. C. Jameson at the University of Minnesota through W. Cui at Northwestern University and crossed with P14 mice at Northwestern University to generate P14 KLF2–EGFP mice. Mice were maintained in the animal facility at The Ohio State University under standard conditions (ambient temperature 20–24 °C, relative humidity 30–70%, 12 h dark–light cycle (lights on from 6:00 to 18:00)). Both male and female mice aged 8–10 weeks were used. All procedures were performed in strict accordance with the NIH Guide for the Care and Use of Laboratory Animals and approved by the Committee on the Ethics of Animal Experiments of The Ohio State University.

Cell lines

The B16-GP cell line was kindly provided by A. Wieland at The Ohio State University. B16-GP cells were cultured in RPMI-1640 medium (Gibco, cat. no. 11875-093) with 10% heat-inactivated fetal bovine serum (FBS) (Gibco, cat. no. 10082-147) and 1% penicillin/streptomycin (Pen-Strep; Gibco, cat. no. 15140-122). The MC38 cell line was purchased from Kerafast (cat. no. ENH204-FP). MC38 cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM; Gibco, cat. no. 11965-092) with 10% FBS and 1% Pen-Strep. The EL4 cell line was kindly provided by K. Oestreich at The Ohio State University. EL4 cells were cultured in RPMI-1640 with 10% FBS and 1% Pen-Strep. Cell lines were tested regularly for mycoplasma contamination.

Tumor challenge

MC38 cells (1.5 × 10⁶) were resuspended in 100 μl of ice-cold phosphate-buffered saline (PBS) for subcutaneous injection into the right flanks of shaved mice. For experiments involving PD-1 blockade, mice were treated with aPD-1 Ab (BioXCell, 200 μg, clone 29F.1A12X) on days 9, 12 and 15 after tumor implantation. Tumors were monitored and measured daily using an electronic caliper starting on day 8 after tumor injection until day 20. Tumor surface area (in square millimeters) was calculated as width × length. Maximal tumor size cutoff was 16 mm in diameter for non-endpoint studies. For survival analysis, tumor-bearing mice were monitored daily and euthanized as nonsurvivors when tumor size reached 16 mm in diameter.
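For illustration only, the surface-area formula and the 16-mm cutoff described above could be expressed as follows. The function names are hypothetical and not part of the original protocol. Taking 'diameter' as the larger of the two caliper measurements is an assumption.

```python
MAX_DIAMETER_MM = 16.0  # cutoff stated in the text


def tumor_surface_area_mm2(width_mm: float, length_mm: float) -> float:
    """Return tumor surface area as width x length, in square millimeters."""
    if width_mm < 0 or length_mm < 0:
        raise ValueError("caliper measurements must be non-negative")
    return width_mm * length_mm


def reached_size_cutoff(width_mm: float, length_mm: float,
                        cutoff_mm: float = MAX_DIAMETER_MM) -> bool:
    """Assumption: 'diameter' is the larger of the two caliper measurements."""
    return max(width_mm, length_mm) >= cutoff_mm


# Hypothetical measurements for one tumor on one day
area = tumor_surface_area_mm2(5.0, 8.0)
at_cutoff = reached_size_cutoff(5.0, 8.0)
```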

LCMV infection models

For acute LCMV infection, 8- to 10-week-old mice were injected intraperitoneally with 2 × 10⁵ plaque-forming units of LCMV Armstrong. For chronic LCMV infection, 8- to 10-week-old mice were injected intravenously with 2 × 10⁶ plaque-forming units of LCMV Cl13. Viruses were prepared by a single passage on BHK21 cells and viral titers were determined by plaque formation assay on Vero cells. Serum virus titers were determined by the plaque assay performed using Vero cells as described previously50.

Tissue processing and single-cell suspension preparation

Mouse spleens were disrupted mechanically, washed once with ice-cold PBS, subjected to red blood cell lysis (BioLegend, cat. no. 420302), passed through 70-μm cell strainers, and resuspended as single-cell suspensions.

For liver lymphocyte isolation, mice were perfused with ice-cold PBS via the hepatic portal vein. Livers were dissociated mechanically through 100-μm strainers, followed by lymphocyte isolation using Percoll gradient centrifugation (44% Percoll in RPMI over 67% Percoll in PBS; 450g for 20 min at room temperature) and red blood cell lysis.

For lung lymphocyte isolation, mice were perfused with ice-cold PBS. Lungs were minced and digested with Collagenase Type III (Worthington Biochemical Corporation, cat. no. LS004182) for 90 min at 37 °C, passed through 70-μm strainers, and lymphocytes were isolated by Percoll gradient centrifugation (44% Percoll in RPMI over 67% Percoll in PBS; 500g for 20 min at room temperature).

Tumors were dissociated mechanically and digested with Collagenase Type I and Collagenase Type IV (Worthington Biochemical Corporation, cat. nos. LS004196 and LS004188) for 30 min at 37 °C with shaking at 125 rpm. Digestion was quenched with ice-cold PBS containing 2% bovine serum albumin (BSA), and red blood cell lysis (BioLegend) was performed as needed before filtration through 70-μm strainers.

Human bladder tumor and peripheral blood sample processing

Muscle-invasive bladder tumor samples were obtained from consenting treatment-naive patients (the patient cohort had a median age of 68 years) enrolled prospectively in The Ohio State University Comprehensive Cancer Center bladder cancer tissue registry (Buck-IRB 2018C0181). Clinical specimens were processed on the day of radical cystectomy. Tumor tissue was dissociated manually, centrifuged (600g for 5 min at 4 °C), resuspended in human tumor dissociation enzyme solution (Miltenyi Biotec, cat. no. 130-095-929) and homogenized using a gentleMACS semi-automated dissociator (Miltenyi Biotec). Homogenized tissue was incubated at 37 °C for 20 min under continuous rotation on a MACSmix Tube Rotator (Miltenyi Biotec). Following addition of 2% BSA, single-cell suspensions were filtered twice through a 70-µm cell strainer, washed with PBS, subjected to red blood cell lysis (BioLegend, cat. no. 420302), and resuspended in RPMI medium.

Peripheral blood was collected in heparin–EDTA tubes and processed by Ficoll–Hypaque (Sigma) density gradient centrifugation to isolate PBMCs.

Flow cytometry

Single-cell suspensions were stained at 4 °C with LIVE/DEAD Fixable Blue viability dye (Invitrogen, cat. no. L23105) for 15 min, followed by simultaneous Fc receptor (FcR) blocking and surface marker staining for 30 min at 4 °C. Intracellular staining was performed using the Foxp3 Transcription Factor Staining Kit (Invitrogen, cat. no. 00-5523-00) according to the manufacturer’s instructions. Data were acquired on a five-laser Cytek Aurora high-dimensional flow cytometer.

For cytokine detection in CD8+ TILs and human CD8+ T cells, cells were restimulated with Cell Stimulation Cocktail (Invitrogen, cat. no. 00-4970-03) containing PMA and ionomycin in the presence of brefeldin A (BioLegend, cat. no. 420601) for 2 h at 37 °C. For detection of cytokine production by GP33–41-specific CD8+ T cells from spleens of LCMV-infected mice, splenocytes were restimulated with GP33–41 peptide in the presence of brefeldin A for 5 h at 37 °C.

Apoptosis was assessed by Annexin V and PI staining using the fluorescein isothiocyanate Annexin V Apoptosis Detection Kit with PI (BioLegend, cat. no. 640914), followed by FcR blocking and surface marker staining. Data were analyzed using FlowJo software (v.10.7.1, Tree Star) or OMIQ Flow Cytometry software (Dotmatics). Antibodies used for multispectral flow cytometry are listed in Supplementary Table 3.

CD8+ T cell isolation and stimulation in vitro

Spleens from C57BL/6J WT mice were processed into single-cell suspensions, and untouched CD8+ T cells were purified by negative selection (STEMCELL, cat. no. 19853). For time-course TCR stimulation assays, CD8+ T cells were cultured in complete T cell medium (cTCM) (RPMI-1640 (Gibco, cat. no. 11875-093) with 10% FBS, 1% Pen-Strep, 1 mM sodium pyruvate (Gibco, cat. no. 11360-070), 1× MEM NEAA (Gibco, cat. no. 11140-050), 10 mM HEPES (Gibco, cat. no. 15630-080) and 50 μM 2-mercaptoethanol (Gibco, cat. no. 21985-023)) supplemented with 100 U ml⁻¹ recombinant human IL-2 (rhIL-2, acquired from the Biological Resources Branch at the NIH) and stimulated with 5 μg ml⁻¹ plate-bound aCD3 Ab (BioLegend, cat. no. 100359), with or without 2 μg ml⁻¹ soluble aCD28 Ab (BioLegend, cat. no. 102121), at 1 × 10⁶ cells per well in 24-well plates for 0, 10, 24 or 48 h at 37 °C and 5% CO2. For dose–response assays, cells were stimulated with 0.1–5 μg ml⁻¹ plate-bound aCD3 Ab plus 2 μg ml⁻¹ soluble aCD28 Ab for 48 h under identical culture conditions. For cyclosporine A (CsA) inhibition experiments, cells were stimulated with 5 μg ml⁻¹ plate-bound aCD3 Ab and 2 μg ml⁻¹ soluble aCD28 Ab in the presence of 0, 1, 10, 100, 1,000 or 10,000 nM CsA (Thermo Scientific Chemicals, cat. no. 457970010) for 48 h.

CRISPR–Cas9 RNP KO in activated P14 CD8+ T cells and adoptive transferring

sgRNAs targeting Klf2 were adapted from a published study51, whereas sgRNAs for other candidates were designed by Integrated DNA Technologies (IDT); all were purchased from IDT. Sequences are listed in Supplementary Table 2. For experiments using P14 or P14 KLF2–EGFP mice, on day 0, fresh splenocytes were isolated and stimulated with 1 μg ml⁻¹ LCMV GP33–41 peptide (GenScript, cat. no. RP20257) in the presence of 100 U ml⁻¹ rhIL-2 (Biological Resources Branch at NIH) in cTCM at 1 × 10⁶ cells ml⁻¹ in 24-well plates. After 3 days, cells were collected and counted using a Bio-Rad TC20 automated cell counter. Cas9 (IDT, cat. no. 1081059) and sgRNAs were combined and incubated at room temperature for 20 min. Two sgRNAs were used per target to increase KO efficiency. Electroporation was performed using the 4D-Nucleofector 4 Core Unit (Lonza) and P4 primary cell 4D-Nucleofector kit S (Lonza, cat. no. V4XP-4032) with program CM137. Following electroporation, cells were kept at room temperature for 10 min, after which 200 μl prewarmed cTCM was added to each well. Cells were then rested at 37 °C and 5% CO2 for 2 h, counted, resuspended in cTCM with 100 U ml⁻¹ rhIL-2 at 0.5 × 10⁶ cells ml⁻¹ in 24-well plates, and returned to the incubator. Two days postelectroporation, CRISPR-edited P14 or P14 KLF2–EGFP CD8+ T cells were either collected for flow cytometry analysis, or 5,000 cells from each condition were adoptively transferred into C57BL/6 WT recipient mice by intravenous injection, followed by LCMV Cl13 infection on the same day.

CRISPR–Cas9 RNP KO of genomic region in activated P14 KLF2–EGFP CD8+ T cells

sgRNAs targeting the distal regulatory element of Klf2 (Klf2+10.9kb/element) were designed and purchased from IDT (sequences in Supplementary Table 2). CRISPR–RNP transfection was performed as described in the ‘CRISPR–Cas9 RNP KO in activated P14 CD8+ T cells and adoptive transferring’ section.

Targeting efficiency on genomic DNA was assessed by PCR amplification of 50 ng genomic DNA (extracted using New England Biolabs, cat. no. T3010L) using Platinum SuperFi II PCR Master Mix (Thermo Fisher, cat. no. 12369010). PCR products were purified (Qiagen, cat. no. 28506) and Sanger sequenced (Azenta). Editing efficiency was assessed using the Inference of CRISPR edits tool (Synthego).

Two days postelectroporation, cells were subjected to RNA extraction and quantitative PCR analysis, flow cytometry, or adoptive transfer into C57BL/6J WT recipient mice as described above.

Human CD8+ T cell activation and CRISPR–Cas9 RNP KO

Naive human CD8+ T cells were isolated from cryopreserved healthy donor PBMCs using EasySep immunomagnetic negative selection kits (STEMCELL, cat. no. 17953). Cells were resuspended in cTCM with 100 U ml⁻¹ rhIL-2 at 1 × 10⁶ cells ml⁻¹ and stimulated with Dynabeads Human T-Activator CD3/CD28 (Gibco, cat. no. 11131D). On day 3 poststimulation, cells were collected and counted. sgRNAs were designed and purchased from IDT (sequences in Supplementary Table 2).

Electroporation was performed using the 4D-Nucleofector 4 Core Unit and P2 primary cell 4D-Nucleofector kit S (Lonza, cat. no. V4XP-2024) with program DN100. Cells were rested at room temperature for 10 min, recovered with 200 μl prewarmed cTCM per well, incubated at 37 °C and 5% CO2 for 2 h, then counted and replated in cTCM with 100 U ml⁻¹ rhIL-2 at 0.5 × 10⁶ cells ml⁻¹. Two days postelectroporation, CRISPR-edited human CD8+ T cells were collected for flow cytometry.

Quantitative PCR

Total RNA was extracted using Direct-zol RNA Microprep Kits (Zymo Research, cat. no. R2062) according to the manufacturer’s instructions. For each sample, 1 μg total RNA was reverse transcribed using the iScript cDNA Synthesis Kit (Bio-Rad, cat. no. 1708891) in a 20 μl reaction. qPCR was performed using SsoAdvanced Universal SYBR Green Supermix (Bio-Rad, cat. no. 1725272) on a StepOne Real-Time PCR System (Applied Biosystems). Primers were purchased from IDT: Klf2 (forward: 5′-ACCAACTGCGGCAAGACCTA-3′; reverse: 5′-CATCCTTCCCAGTTGCAATGA-3′)51; β-actin (forward: 5′-AGCTGAGAGGGAAATCGTGC-3′; reverse: 5′-TCCAGGGAGGAAGAGGATGC-3′)24.

Dual reporter luciferase assay

A 1,913-bp genomic region spanning Klf2+10.9kb/element was cloned into the pGL3 Promoter Vector (Addgene, cat. no. 212939) 15 bp upstream of the SV40 minimal promoter by GenScript to generate the pGL3-Klf2+10.9kb/element vector.

Two days before electroporation, EL4 cells were seeded at 3 × 10⁵ cells ml⁻¹ in T-75 flasks. On the day of electroporation, EL4 cells were collected and counted. CRISPR–RNPs were prepared with a nontargeting (NT) sgRNA or the same pair of Zfp148-targeting sgRNAs as in mouse CD8+ T cells (sequences in Supplementary Table 2). Electroporation was performed using the 4D-Nucleofector system and SF Cell Line 4D-Nucleofector X Kit S (Lonza, cat. no. V4XC-2032) with program CM120. Electroporated EL4 cells were rested at room temperature for 10 min and recovered in prewarmed electroporation medium (RPMI-1640 with 10% FBS).

For luciferase transfection, 2 days after CRISPR–RNP electroporation, 4 × 10⁵ EL4NT or EL4ZFP148-null cells were transfected with 3 μg pGL3-Klf2+10.9kb/element and 20 ng SV40-Renilla vectors using program CM120. After 24 h, cells were collected and luciferase activity was measured using the Dual-Luciferase Reporter Assay System (Promega, cat. no. E1910) on a SpectraMax iD3 reader in technical duplicates.
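The text does not state how raw luminescence readings were normalized. The sketch below shows one common convention for dual-reporter assays, assuming that firefly signal is divided by the co-transfected Renilla signal in each technical replicate and that replicates are then averaged. The function names and readings are hypothetical.

```python
from statistics import mean


def normalized_activity(firefly: list[float], renilla: list[float]) -> float:
    """Mean firefly/Renilla ratio across technical replicates (assumed convention)."""
    if not firefly or len(firefly) != len(renilla):
        raise ValueError("need matched, non-empty firefly and Renilla readings")
    if any(r <= 0 for r in renilla):
        raise ValueError("Renilla readings must be positive")
    return mean(f / r for f, r in zip(firefly, renilla))


def fold_change(sample: float, control: float) -> float:
    """Reporter activity of a sample relative to a control condition."""
    return sample / control


# Hypothetical technical duplicates (arbitrary luminescence units)
nt_activity = normalized_activity([1000.0, 1200.0], [100.0, 100.0])
ko_activity = normalized_activity([2000.0, 2400.0], [100.0, 100.0])
relative_activity = fold_change(ko_activity, nt_activity)
```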

Incucyte cytotoxicity assay

B16-GP cells were seeded at 2 × 10³ cells per well in flat-bottom 96-well plates and incubated at 37 °C for 30 min in the presence of Incucyte Annexin V Green Reagent (Sartorius, cat. no. 4642). Following initial imaging on the IncuCyte S3 Live-Cell Analysis System, CD8+ T cells were added: (1) mixed CD44hiGP33–41 Tet+ and CD44hiGP276–286 Tet+CD8+ T cells sorted from spleens of Zfp148fl/fl or ZFP148 cKO mice at day 22 post-LCMV Cl13 infection (E:T = 25:1), or (2) P14 CD8+ T cells CRISPR-edited with NT or Zfp148-targeting sgRNAs (E:T = 5:1). Images were acquired every 2 h and analyzed using IncuCyte S3 software.
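As a worked example of the effector-to-target (E:T) ratios above, the number of CD8+ T cells added per well can be derived from the seeding density. This sketch assumes that the target cell number at the time of T cell addition equals the number seeded; the function name is hypothetical.

```python
SEEDED_TARGETS_PER_WELL = 2_000  # B16-GP cells per well, from the text


def effector_cells_per_well(target_cells: int, effector_to_target: float) -> int:
    """Number of effector cells needed for a given E:T ratio."""
    if target_cells < 0 or effector_to_target < 0:
        raise ValueError("inputs must be non-negative")
    return round(target_cells * effector_to_target)


# E:T ratios stated in the text
sorted_tetramer_cells = effector_cells_per_well(SEEDED_TARGETS_PER_WELL, 25)
edited_p14_cells = effector_cells_per_well(SEEDED_TARGETS_PER_WELL, 5)
```

Under this assumption, the two conditions correspond to 50,000 and 10,000 effector cells per well, respectively.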

Comparison of KLF2 and PDCD1 mRNA expression in ZNF148hi versus ZNF148lo patient cohorts

mRNA expression of PDCD1 and KLF2 was compared between ZNF148hi and ZNF148lo patients from the TCGA colorectal adenocarcinoma dataset. ZNF148hi was defined as ZNF148 mRNA expression z scores greater than 2 (n = 51) relative to normal and ZNF148lo as z scores less than −2 (n = 123) relative to normal.
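The z score thresholds above amount to a simple three-way stratification. The following minimal sketch uses hypothetical patient identifiers and z scores; samples with z scores between −2 and 2 (inclusive) are treated as an unassigned intermediate group, which is an assumption about how such samples were handled.

```python
def stratify_by_z_score(z_scores: dict[str, float],
                        hi_cutoff: float = 2.0,
                        lo_cutoff: float = -2.0) -> dict[str, list[str]]:
    """Assign each patient to 'hi' (z > 2), 'lo' (z < -2) or 'intermediate'."""
    groups: dict[str, list[str]] = {"hi": [], "lo": [], "intermediate": []}
    for patient, z in z_scores.items():
        if z > hi_cutoff:
            groups["hi"].append(patient)
        elif z < lo_cutoff:
            groups["lo"].append(patient)
        else:
            groups["intermediate"].append(patient)
    return groups


# Hypothetical ZNF148 mRNA z scores relative to normal samples
example = {"patient_1": 2.7, "patient_2": -3.1, "patient_3": 0.4, "patient_4": 2.0}
groups = stratify_by_z_score(example)
```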

Single-cell multiome library preparation

CD44hiGP33–41 Tet+CD8+ T cells were FACS-sorted from the spleens of Zfp148fl/fl or ZFP148 cKO mice at day 21 post-LCMV Cl13 infection. After sorting, cells were washed with PBS containing 0.04% BSA, then approximately 10,000 nuclei from each Zfp148fl/fl or ZFP148 cKO sample were isolated and processed with the Chromium Single Cell Multiome ATAC+ Gene Expression Reagent Kit (10x Genomics, cat. no. 1000283) following the manufacturer’s manual. GEX and ATAC libraries were generated according to the manufacturer’s instructions, quality controlled by TapeStation and sequenced on an Illumina NovaSeq X Plus platform (Azenta Life Sciences).

Single-cell multiomic sequencing alignment and downstream analysis

scRNA-seq and scATAC-seq data were processed using the 10x Genomics Cell Ranger ARC pipeline (v.2.0.2) and aligned to the mm10 reference genome. Downstream analyses were performed in R (v.4.4.0) using Seurat (v.5.1.0) and Signac (v.1.14.0) with default parameters unless otherwise specified. For quality control, high-quality cells were defined as those with ATAC counts between 5 × 10³ and 7 × 10⁵, RNA counts between 1,000 and 25,000, and mitochondrial gene expression <20%. For the RNA modality, standard Seurat preprocessing was performed, including SCTransform normalization, principal component analysis (RunPCA), and dimensionality reduction using UMAP (RunUMAP). For the ATAC modality, preprocessing included term frequency–inverse document frequency normalization (RunTFIDF), feature selection (FindTopFeatures), singular value decomposition (RunSVD) and UMAP projection. To integrate modalities, we applied the weighted nearest neighbor (WNN) algorithm using the FindMultiModalNeighbors function in Seurat to construct a joint neighbor graph representing weighted contributions of RNA and ATAC modalities. WNN clusters were identified at a resolution of 0.1. Cell types were annotated based on mRNA expression of canonical marker genes and signature scores derived from previously published gene sets calculated using AUCell (v.1.26.0). Gene activity scores were computed from chromatin accessibility data using the GeneActivity function in Signac. Differential gene expression analyses were performed on RNA or gene activity matrices using FindAllMarkers or FindMarkers with min.pct = 0.25, filtered at log2 fold change ≥ 0.25 and adjusted P < 0.05. Differentially accessible chromatin regions (DACRs) were identified using FindAllMarkers or FindMarkers with logistic regression testing, min.pct = 0.05, log2 fold change ≥ 0.25 and adjusted P < 0.05, including total counts as a latent variable. Genes linked to DACRs were identified using the Links() function in Signac.
GO enrichment analysis of DEGs was performed using clusterProfiler (v.4.12.6). Developmental trajectories and pseudotime relationships were inferred using Slingshot (v.2.12.0). Motif deviation (accessibility) analysis was conducted using chromVAR (v.1.26.0) to quantify variability in transcription factor motif accessibility across single cells. Differential transcription factor motif enrichment was calculated using FindAllMarkers or FindMarkers for pairwise comparisons with logistic regression testing, min.pct = 0.05, log2 fold change threshold = 2 and adjusted P < 0.05, with total counts included as a latent variable. All heatmaps were generated using ComplexHeatmap (v.2.20.0).
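The per-cell quality-control thresholds described in this section can be summarized as a single predicate. The original analysis was performed in R with Seurat and Signac; the Python sketch below only illustrates the thresholds, with hypothetical field names, and assumes that the 'between' bounds are inclusive.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str
    atac_counts: int
    rna_counts: int
    percent_mito: float  # mitochondrial gene expression, in percent


def is_high_quality(cell: CellQC) -> bool:
    """Apply the ATAC, RNA and mitochondrial thresholds stated in the text."""
    return (
        5_000 <= cell.atac_counts <= 700_000   # assumed inclusive bounds
        and 1_000 <= cell.rna_counts <= 25_000  # assumed inclusive bounds
        and cell.percent_mito < 20.0
    )


# Hypothetical cells
cells = [
    CellQC("cell_a", 40_000, 6_000, 4.2),
    CellQC("cell_b", 3_000, 6_000, 4.2),     # too few ATAC counts
    CellQC("cell_c", 40_000, 30_000, 4.2),   # too many RNA counts
    CellQC("cell_d", 40_000, 6_000, 25.0),   # high mitochondrial fraction
]
kept = [c.barcode for c in cells if is_high_quality(c)]
```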

CUT&Tag-seq

Naive CD8+ T cells were isolated from spleens of C57BL/6 WT mice and activated with 5 μg ml⁻¹ plate-bound aCD3 Ab and 2 μg ml⁻¹ soluble aCD28 Ab for 24 h in 24-well plates. A total of 1 × 10⁶ cells were used per target (ZFP148 or IgG control) for library construction using CUT&Tag (Epicypher, cat. no. 14-1102-48s1). Libraries were pooled at equimolar ratios and sequenced on an Illumina NovaSeq X Plus platform (Azenta Life Sciences), generating 5–10 million reads per sample.

CUT&Tag-seq analysis

CUT&Tag sequencing data were processed using the nf-core/cutandrun pipeline (v.3.0.0) (https://nf-co.re/cutandrun/3.2.2/)—a community-curated Nextflow pipeline. Raw sequencing reads were first subjected to adapter trimming using fastp (v.0.23.2), followed by alignment to the mouse reference genome (mm10) using Bowtie2 (v.2.4.4). Aligned reads were filtered to remove low-quality mappings, PCR duplicates (using Picard MarkDuplicates v.2.27.5) and mitochondrial reads. Peaks were called using SEACR (v.1.3) in ‘relaxed’ mode with appropriate IgG or input control normalization. Genome-wide signal tracks were generated using deepTools (v.3.5.1) and IGV (v.2.18.4) for visualization. Quality control metrics, including fragment size distribution, duplication rates and library complexity, were assessed and summarized using MultiQC (v.1.13). All steps were run with default settings unless otherwise specified.

Secondary analysis of scRNA-seq datasets of human cancer patients

Pan-cancer scRNA-seq data assembly

An extensive compendium of single-cell transcriptomes was constructed by aggregating profiles from 346 tumor specimens, representing 251 patients, across 20 publicly available scRNA-seq datasets52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71. To reduce platform-specific biases and ensure consistency, only datasets generated with the 10x Genomics droplet-based system were included.

Quality assessment and preprocessing

Raw single-cell data were filtered stringently using Scanpy (v.1.9.5). Cells were retained only if they expressed at least 200 genes and exhibited mitochondrial gene fractions below 20% of total counts. Additional filters eliminated barcodes suggestive of debris (fewer than 400 genes or 500 unique molecular identifiers) and excluded potential doublets (cells with more than 5,500 genes or 30,000 unique molecular identifiers). After these quality control steps, the raw count matrices and corresponding AnnData objects were merged. Data normalization to transcripts per million was performed using sc.pp.normalize_total, followed by a logarithmic transformation with sc.pp.log1p. Only tumor-derived cells were retained, resulting in a final dataset comprising 1,030,968 high-quality cells and 14,090 genes for further analyses.
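The filtering and normalization steps above can be illustrated without Scanpy. The plain-Python sketch below mirrors the stated thresholds and approximates what sc.pp.normalize_total followed by sc.pp.log1p computes for one cell. The target sum of 10⁶ is inferred from the phrase 'transcripts per million', and the function names are hypothetical.

```python
import math


def keep_cell(n_genes: int, n_umis: int, mito_fraction: float) -> bool:
    """Combine the debris, doublet and mitochondrial filters from the text."""
    not_debris = n_genes >= 400 and n_umis >= 500
    not_doublet = n_genes <= 5_500 and n_umis <= 30_000
    return not_debris and not_doublet and mito_fraction < 0.20


def normalize_log1p(counts: list[float], target_sum: float = 1e6) -> list[float]:
    """Scale one cell's counts to target_sum, then apply the natural log of (1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [math.log1p(c * target_sum / total) for c in counts]


# Hypothetical cell with three genes
normalized = normalize_log1p([1, 1, 2])
```

Because the 400-gene debris filter is stricter than the 200-gene minimum, the latter does not need a separate check in this sketch.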

Batch correction and integration

To reconcile differences between studies while preserving genuine biological variation, batch effects were mitigated using the scVI Python package (scvi-tools v.1.0.4). By incorporating sample identity as a covariate, the scVI model effectively reduced technical variability across samples. The performance of the batch correction was evaluated by measuring the reduction of study-specific biases while retaining critical biological signals. Subsequent analyses—including clustering, differential expression and trajectory inference—were conducted on the integrated dataset. Cellular heterogeneity was visualized using UMAP, which delineated variations across batches, datasets, sex, tissue origin and cancer type.

Cell type identification

Cell populations were classified using the scANVI algorithm (scvi-tools v.1.0.4), which leverages pre-annotated reference data for main cell lineages such as epithelial, endothelial, fibroblast, lymphoid, myeloid and plasma cells as well as subsets of CD8+ TILs. Initial clustering within the scANVI latent space was refined with Leiden clustering to assign discrete cell identities. The model was trained for 20 epochs with cell-type labels transferred using 100 samples per label. For a more detailed resolution of T cell subpopulations, corresponding AnnData objects were further integrated and subjected to scVI-based batch correction.

Functional signature score analysis

Functional states across individual cells were quantified by computing gene signature scores using the scanpy.tl.score_genes function (Scanpy v.1.9.5). This approach enabled the assessment of cellular functional signatures within the dataset.
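Conceptually, scanpy.tl.score_genes scores a cell as the average expression of the signature genes minus the average expression of a reference gene set. The simplified sketch below takes the reference genes as an explicit argument, whereas Scanpy samples them at random from expression-matched bins. It is therefore an approximation of the idea and not a reimplementation; the gene names are hypothetical.

```python
from statistics import mean


def signature_score(expression: dict[str, float],
                    signature_genes: list[str],
                    reference_genes: list[str]) -> float:
    """Mean signature expression minus mean reference expression for one cell."""
    signature = [expression[g] for g in signature_genes if g in expression]
    reference = [expression[g] for g in reference_genes if g in expression]
    if not signature or not reference:
        raise ValueError("no signature or reference genes found in this cell")
    return mean(signature) - mean(reference)


# Hypothetical log-normalized expression values for one cell
cell = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0, "GENE_D": 1.0}
score = signature_score(cell, ["GENE_A", "GENE_B"], ["GENE_C", "GENE_D"])
```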

Validation and prognostic evaluation of the ZFP148 KO gene signature in CD8+ TILs

The clinical relevance of the ZFP148 KO gene signature in CD8+ T cells was examined using processed scRNA-seq data from 116 liver cancer samples obtained from 94 male patients34. Analysis was confined to primary tumors and metastatic lesions. After applying the same rigorous quality control, batch correction, and cell-type annotation pipeline, CD8+ T cells were isolated and the ZFP148 KO signature score was computed using scanpy.tl.score_genes.

Survival analysis using expression of ZFP148 KO gene signature in CD8+ TILs

To determine the prognostic impact of ZFP148 KO gene signature expression in CD8+ TILs, Kaplan–Meier survival curves were generated and differences assessed via the log-rank test and univariate Cox proportional hazards (Cox PH) models. Two additional multivariable Cox PH models were also employed to adjust for potential confounders, with hazard ratios and 95% confidence intervals reported accordingly. The optimal cutoff for stratifying ZFP148 KO gene signature expression levels was established using the surv_cutpoint function from the survminer R package, which applies maximally selected rank statistics from the maxstat72 package. Continuous covariates in the Cox PH models were examined for linearity to validate the model assumptions.
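The survival curves in this study were generated in R. For readers unfamiliar with the estimator, the following self-contained sketch shows how a Kaplan–Meier survival curve is computed from follow-up times and event indicators. The data are hypothetical, and the cutpoint selection and Cox models described above are not reproduced here.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each time with at least one event.

    events[i] is 1 if the event (death) was observed at times[i], 0 if censored.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0  # everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
```

For these hypothetical data, the estimated survival probability falls to 0.8, 0.6 and 0.3 at months 5, 8 and 12.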

Expression of ZNF148 mRNA, ZFP148 KO gene signature, KLF2 mRNA and KLF2 gene signature in immunotherapy-treated cohorts

Expression of ZNF148 mRNA, ZFP148 KO gene signature, KLF2 mRNA and KLF2 gene signature was investigated further in independent scRNA-seq datasets derived from patients undergoing various immunotherapeutic regimens. These cohorts included patients receiving anti-CD19 CAR-T cells for DLBCL38 and aPD-1 + aCTLA-4 Abs for ccRCC37. For each dataset, the identical preprocessing pipeline—comprising quality filtering, batch correction and cell-type annotation—was applied.

OS analysis of human cancer patients using published datasets

OS analyses of human cancer patients were performed using webservers that have access to published datasets. Treatment-naive gastric cancer, colon cancer, ovarian cancer, pancreatic cancer, liver hepatocellular carcinoma, pancreatic ductal adenocarcinoma, sarcoma, thyroid carcinoma and uterine corpus endometrial carcinoma patients were stratified into ZNF148hi and ZNF148lo groups using the expression-based ‘best cutoff’ option of the Kaplan–Meier Plotter webserver33, and OS was visualized using Kaplan–Meier curves. Melanoma patients treated with aPD-1 or aCTLA-4 Ab were stratified into high- and low-expression groups based on pretreatment ZNF148 or KLF2 mRNA levels using the expression-based ‘best cutoff’ option of the Kaplan–Meier Plotter webserver35 or expression of the ZFP148 KO gene signature or KLF2 gene signature30 using the Tumor Immune Dysfunction and Exclusion (TIDE) webserver36; OS was visualized by Kaplan–Meier analysis.

Statistics and reproducibility

Statistical analyses for flow cytometry, tumor growth curves and mouse survival were performed using GraphPad Prism (v.10). Unpaired or paired two-sided t-tests were used for comparisons between two unpaired or paired groups, respectively. One-way analysis of variance (ANOVA) followed by Tukey’s multiple comparisons test was used for comparisons among three or more groups. One-way ANOVA followed by Holm–Šídák’s multiple comparisons test was used for comparisons between preselected pairs among three or more groups. Two-way ANOVA was used to compare time-course curves, with Bonferroni correction for multiple comparisons. The log-rank test was used to compare OS of mice across several treatment groups, with Bonferroni correction for multiple comparisons.

Analyses of mouse scRNA-seq and scATAC-seq data were performed using R (v.4.4.0) with the packages Seurat (v.5.1.0), Signac (v.1.14.0), AUCell (v.1.26.0), slingshot (v.2.12.0), chromVAR (v.1.26.0), clusterProfiler (v.4.12.6), ComplexHeatmap (v.2.20.0) and EnhancedVolcano (v.1.22.0). Differentially expressed genes (DEGs) were identified using the FindMarkers() or FindAllMarkers() functions in Seurat, with statistical significance assessed by a two-sided Wilcoxon rank-sum test, with Benjamini–Hochberg correction for multiple comparisons. DACRs and differential transcription factor motif accessibility were identified using FindMarkers() or FindAllMarkers() with statistical significance assessed by a two-sided logistic regression likelihood-ratio test, with Benjamini–Hochberg correction for multiple comparisons. GO enrichment analysis was performed using a one-sided hypergeometric test, with Benjamini–Hochberg correction for multiple comparisons. A two-sided Wilcoxon rank-sum test was used to compare mRNA expression, promoter chromatin accessibility, motif accessibility and gene signature scores between Zfp148fl/fl and ZFP148 cKO CD8+ T cells. Analyses of the integrated human scRNA-seq dataset were performed using Python (v.3.10.9) packages Scanpy (v.1.9.5), Pandas (v.2.0.0), Statsmodels (v.0.14.0), NumPy (v.1.24.2), SciPy (v.1.10.1), Matplotlib (v.3.8.0), Seaborn (v.0.11.2) and scikit-learn (v.1.3.2), as well as R (v.4.3.1) packages Circlize (v.0.4.16), GseaVis (v.0.0.5), Enrichplot (v.1.22.0), GridExtra (v.2.3.0), pheatmap (v.1.0.12) and DEGreport (v.1.38.5). A two-sided Wilcoxon rank-sum test was used for comparisons between two groups. OS between two groups of patients was compared using the log-rank test. A P value ≤ 0.05 (or adjusted P value ≤ 0.05 after multiple-testing correction) was considered statistically significant.
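The Benjamini–Hochberg correction referred to above can be written in a few lines. The sketch below is a generic implementation with hypothetical P values and is not the code used by the analysis packages listed.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from four tests
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p <= 0.05 for p in adjusted]
```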

No statistical methods were used to predetermine sample size, but sample sizes were similar to those reported in previous publications24,73. Data distribution was assumed to be normal but was not formally tested. Age- and sex-matched animals were assigned randomly to experimental conditions. Investigators were not blinded to group allocation during data collection and analysis. No data were excluded. All data were generated from at least two independent experiments with a minimum of three biological replicates, yielding consistent phenotypes to ensure reproducibility, with the exceptions of the paired scRNA-seq and scATAC-seq using CD44hiGP33–41 Tet+CD8+ T cells in spleens of Zfp148fl/fl and ZFP148 cKO mice at day 21 post-LCMV Cl13 infection and the CUT&Tag-seq using activated splenic CD8+ T cells from C57BL/6 WT mice. To minimize intermouse variability and enhance reproducibility, cells used for the single-cell multiome experiment were pooled from 10 mice for Zfp148fl/fl and 12 mice for ZFP148 cKO; cells used for the CUT&Tag-seq experiment were pooled from 5 mice.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.