Part 6: Execution of overrepresentation analysis (ORA) in Python

The AOP project ► Key objective 2

Author: Shakira Agata

This Jupyter Notebook describes the steps for executing overrepresentation analysis (ORA) with the GSEApy package on the datasets GSE109565, E-MEXP-3583, E-MEXP-2599, GSE44729 and E-GEOD-69851.

This notebook is subdivided into the following seven sections:

Section 1: System preparation
Section 2: Overrepresentation analysis (ORA) for dataset:GSE109565
- Section 2.1: Generation of background genelist
- Section 2.2: Generation of the genelists
- Section 2.3: Execution of ORA
- Section 2.4: Saving plots of ORA
Section 3: Overrepresentation analysis (ORA) for dataset:E-MEXP-3583
- Section 3.1: Generation of background genelist
- Section 3.2: Generation of the genelists
- Section 3.3: Execution of ORA
- Section 3.4: Saving plots of ORA
Section 4: Overrepresentation analysis (ORA) for dataset:E-MEXP-2599
- Section 4.1: Generation of background genelist
- Section 4.2: Generation of the genelists
- Section 4.3: Execution of ORA
- Section 4.4: Saving plots of ORA
Section 5: Overrepresentation analysis (ORA) for dataset:GSE44729
- Section 5.1: Generation of background genelist
- Section 5.2: Generation of the genelists
- Section 5.3: Execution of ORA
- Section 5.4: Saving plots of ORA
Section 6: Overrepresentation analysis (ORA) for dataset:E-GEOD-69851
- Section 6.1: Generation of background genelist
- Section 6.2: Generation of the genelists
- Section 6.3: Execution of ORA
- Section 6.4: Saving plots of ORA
Section 7: Metadata

Section 1: System preparation

In this section, the necessary packages are imported.

Step 1: The packages needed for this pipeline are imported first.

import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import gseapy as gp
from gseapy import barplot, dotplot
from gseapy.plot import gseaplot

Section 2: Execution of Overrepresentation analysis for dataset: GSE109565

In this section, you will execute overrepresentation analysis for dataset GSE109565 using the Enrichr function of GSEApy and a background gene list that you create first.

Section 2.1: Generation of the background genelist

In this section, you will create the background gene list, which contains all expressed genes of dataset GSE109565. This requires a folder containing the expressed genes for each condition of the dataset:

PCB concentration 1
PCB concentration 2
PCB concentration 3
Roundup concentration 1
Glyphosate concentration 1
Glyphosate concentration 2
Glyphosate concentration 3

Step 2: First, the contents of the folder containing the required files are verified.

path_ORA = "C:\\Users\\shaki\\Downloads\\BackgroundORA-GSE109565"
dir_list_ORA = os.listdir(path_ORA)
print("Files and directories in '", path_ORA, "' :")
print(dir_list_ORA)

Files and directories in ' C:\Users\shaki\Downloads\BackgroundORA-GSE109565 ' :
['Glyphosate concentration 1-GSE109565.top.table.tsv', 'Glyphosate concentration 2-GSE109565.top.table.tsv', 'Glyphosate concentration 3-GSE109565.top.table.tsv', 'PCB concentration 1-GSE109565.top.table.tsv', 'PCB concentration 2-GSE109565.top.table.tsv', 'PCB concentration 3-GSE109565.top.table.tsv', 'Roundup concentration-GSE109565.top.table.tsv']
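Printing the folder contents leaves the comparison with the expected files to the reader. A minimal sketch of an automated check is given here; the helper name `missing_files` is hypothetical and not part of the original notebook, and the demonstration runs on a temporary folder rather than on the real data folder.

```python
import tempfile
from pathlib import Path


def missing_files(folder, expected_names):
    """Return the expected file names that are not present in `folder`."""
    present = {p.name for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in expected_names if name not in present]


# Demonstration on a throwaway folder that holds only one of two expected files
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "PCB concentration 1-GSE109565.top.table.tsv").touch()
    demo_missing = missing_files(
        tmp,
        [
            "PCB concentration 1-GSE109565.top.table.tsv",
            "PCB concentration 2-GSE109565.top.table.tsv",
        ],
    )

print(demo_missing)
```

An empty list would indicate that every expected file is present.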

Step 3: Next, individual dataframes are created for each of the files.

list_of_names_ORA = ['Glyphosate concentration 1-GSE109565.top.table','Glyphosate concentration 2-GSE109565.top.table','Glyphosate concentration 3-GSE109565.top.table','Roundup concentration-GSE109565.top.table','PCB concentration 1-GSE109565.top.table', 'PCB concentration 2-GSE109565.top.table','PCB concentration 3-GSE109565.top.table']
dataframes_list_ORA = []

for name in list_of_names_ORA:
    # Each top table is tab-separated; the file name is the condition name plus ".tsv"
    temp_df_ORA = pd.read_csv("./BackgroundORA-GSE109565/" + name + ".tsv", sep='\t')
    dataframes_list_ORA.append(temp_df_ORA)
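The later steps pick dataframes out of the list by position (for example index 4 for PCB concentration 1), which depends on the order of the names. As an alternative sketch, the tables could be kept in a dictionary keyed by condition label. The function `load_tables` and the in-memory demo files with made-up values are assumptions for illustration only.

```python
import io

import pandas as pd


def load_tables(sources, sep="\t"):
    """Read each source into a dataframe, keyed by its condition label."""
    return {label: pd.read_csv(src, sep=sep) for label, src in sources.items()}


# In-memory stand-ins for the .tsv files (values are made up)
demo_sources = {
    "PCB concentration 1": io.StringIO("Symbol\tpadj\nCYP1A1\t0.001\nKEAP1\t0.2\n"),
    "Roundup concentration": io.StringIO("Symbol\tpadj\nTXN\t0.6\n"),
}
tables = load_tables(demo_sources)

print(list(tables))
print(tables["PCB concentration 1"].shape)
```

With real data, the values of the dictionary passed to `load_tables` would be file paths instead of `io.StringIO` objects.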

Step 4: The dataframes are merged by vertical stacking. The gene symbols are then extracted, stripped of surrounding whitespace and filtered for missing values to obtain the background gene list.

# Stack all condition tables vertically into one dataframe
combined_dataframes_list_ORA = pd.concat(dataframes_list_ORA, ignore_index=True, axis=0)
# Take the gene symbols and strip surrounding whitespace
genelist = combined_dataframes_list_ORA['Symbol'].copy()
gene_list = genelist.squeeze().str.strip().to_list()
# Drop missing symbols: NaN is the only value for which x == x is False
Backgroundgenelist = [x for x in gene_list if x == x]
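Because the tables of all conditions are stacked, each gene symbol can appear in the background list once per condition. The sketch below shows one way to build a background without repeats. Dropping duplicates is an assumption about what is wanted, not something the original step does, and `background_from_tables` is a hypothetical helper.

```python
import pandas as pd


def background_from_tables(tables, symbol_col="Symbol"):
    """Collect unique, non-missing, whitespace-stripped symbols from all tables."""
    symbols = pd.concat([df[symbol_col] for df in tables], ignore_index=True)
    symbols = symbols.dropna().astype(str).str.strip()
    symbols = symbols[symbols != ""]
    # drop_duplicates keeps the first occurrence, so the original order is preserved
    return symbols.drop_duplicates().tolist()


# Two small made-up tables with a missing symbol, stray whitespace and a repeat
demo_tables = [
    pd.DataFrame({"Symbol": ["CYP1A1", " KEAP1 ", None]}),
    pd.DataFrame({"Symbol": ["KEAP1", "TXN"]}),
]
demo_background = background_from_tables(demo_tables)

print(demo_background)
```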

Section 2.2: Generation of the genelists

In this section, the genelists for PCB concentration 1-3 and Roundup are created.

Section 2.2.1 Genelist for PCB concentration 1

Step 5: The significant results are extracted and the dataframe is next converted into a list for comparison: PCB concentration 1.

genelist_df1 = dataframes_list_ORA[4]
genelist_PCB1= genelist_df1[genelist_df1['padj'] < 0.05]
genelist_PCB1=genelist_PCB1['Symbol'].copy()
gene_listPCB1 = genelist_PCB1.squeeze().str.strip().to_list()
gene_list_PCB1= [x for x in gene_listPCB1 if x==x]

Section 2.2.2 Genelist for PCB concentration 2

Step 6: The significant results are extracted and the dataframe is next converted into a list for comparison: PCB concentration 2.

genelist_df2 = dataframes_list_ORA[5]
genelist_PCB2= genelist_df2[genelist_df2['padj'] < 0.05]
genelist_PCB2=genelist_PCB2['Symbol'].copy()
gene_listPCB2 = genelist_PCB2.squeeze().str.strip().to_list()
gene_list_PCB2= [x for x in gene_listPCB2 if x==x]

Section 2.2.3 Genelist for PCB concentration 3

Step 7: The significant results are extracted and the dataframe is next converted into a list for comparison: PCB concentration 3.

genelist_df3 = dataframes_list_ORA[6]
genelist_PCB3= genelist_df3[genelist_df3['padj'] < 0.05]
genelist_PCB3=genelist_PCB3['Symbol'].copy()
gene_listPCB3 = genelist_PCB3.squeeze().str.strip().to_list()
gene_list_PCB3= [x for x in gene_listPCB3 if x==x]

Section 2.2.4 Genelist for Roundup concentration 1

Step 8: The significant results are extracted and the dataframe is next converted into a list for comparison: Roundup.

genelist_df4 = dataframes_list_ORA[3]
genelist_RU= genelist_df4[genelist_df4['padj'] < 0.05]
genelist_RU=genelist_RU['Symbol'].copy()
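# Strip whitespace and drop missing symbols (NaN != NaN), as for the PCB lists
gene_list_RU= [x.strip() for x in genelist_RU.to_list() if x==x]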
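Steps 5 to 8 repeat the same filter for each condition. A minimal sketch of a reusable helper is shown here; `significant_genes` is a hypothetical name, and the column names and the 0.05 threshold are taken from the steps above. The helper does not call `.squeeze()`, because pandas returns a scalar instead of a Series when a Series with exactly one element is squeezed, and `.str` would then fail.

```python
import pandas as pd


def significant_genes(df, symbol_col="Symbol", padj_col="padj", alpha=0.05):
    """Return stripped, non-missing symbols of rows with adjusted p-value below alpha."""
    hits = df.loc[df[padj_col] < alpha, symbol_col]
    return hits.dropna().astype(str).str.strip().tolist()


# Made-up table: two significant genes, one significant row without a symbol
demo = pd.DataFrame(
    {
        "Symbol": ["CYP1A1", " KEAP1", None, "TXN"],
        "padj": [0.001, 0.03, 0.01, 0.2],
    }
)
demo_genes = significant_genes(demo)

print(demo_genes)
```

For the datasets in Sections 3 and 4, the same helper would be called with `symbol_col="gene_name"`.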

Section 2.3: Execution of ORA

In this section, ORA will be executed per comparison.

Step 9: The Enrichr function is executed for each comparison and the top results are displayed.

enr_bg_PCB1 = gp.enrichr(gene_list=gene_list_PCB1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GSE109565_ORApathwaytable', 
                 verbose=True)

2025-04-14 14:42:38,015 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 14:42:39,730 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 14:42:40,914 [INFO] Done.

enr_bg_PCB1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	9.207888e-18	7.218984e-15	3.146632	123.431285	KEAP1;IRS2;NR3C1;RGS2;FTH1;CYP1B1;IGFBP1;SLC6A...
1	WikiPathways_2024_Human	Complement System WP2806	1.234761e-10	4.840264e-08	5.028144	114.716974	CRP;SELPLG;CFH;PROS1;ITGB3;CFI;F13A1;C4BPA;ADM...
2	WikiPathways_2024_Human	Metapathway Biotransformation Phase I And II W...	1.665226e-09	4.351790e-07	3.025687	61.159134	UGT1A10;NDST2;NDST1;CYP26B1;NDST4;NDST3;CYP1B1...
3	WikiPathways_2024_Human	Glucocorticoid Receptor Pathway WP2880	3.251000e-09	6.371959e-07	4.920421	96.166200	SLC26A2;AMIGO2;TNFAIP3;NR3C1;PTGS2;PRRG4;FGD4;...
4	WikiPathways_2024_Human	NRF2 Pathway WP2884	4.458371e-09	6.990726e-07	3.316338	63.768141	SRXN1;SLC2A1;KEAP1;TGFA;SLC2A2;TXN;SLC2A4;SLC7...

enr_bg_PCB2 = gp.enrichr(gene_list=gene_list_PCB2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GSE109565_ORApathwaytable', 
                 verbose=True)

2025-04-14 14:46:49,822 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 14:46:51,457 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 14:46:51,963 [INFO] Done.

enr_bg_PCB2.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	1.274958e-29	9.090451e-27	5.897351	392.362853	CDKN1B;SRXN1;SLC2A1;IRS2;SLC7A11;SLC2A4;NR3C1;...
1	WikiPathways_2024_Human	NRF2 Pathway WP2884	1.026815e-13	3.660597e-11	5.941496	177.693186	SRXN1;SLC2A1;TGFA;TXN;SLC2A4;SLC7A11;SLC2A6;UG...
2	WikiPathways_2024_Human	Pleural Mesothelioma WP5087	2.923317e-11	6.947750e-09	2.867939	69.563908	CDKN1A;ITGB4;ITGB3;WWC1;SLC2A1;AREG;ACTB;ACTG1...
3	WikiPathways_2024_Human	Glucocorticoid Receptor Pathway WP2880	4.004986e-10	7.138888e-08	7.091578	153.449771	SRGN;SLC26A2;MGAM;CAVIN2;AMIGO2;CUL1;NR3C1;TGF...
4	WikiPathways_2024_Human	Metapathway Biotransformation Phase I And II W...	6.115627e-10	8.720884e-08	4.123522	87.480530	UGT1A10;GLYAT;CYP2C19;CYP4F22;CYP19A1;CYP7A1;N...

enr_bg_PCB3 = gp.enrichr(gene_list=gene_list_PCB3,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GSE109565_ORApathwaytable', 
                 verbose=True)

2025-04-14 14:47:15,835 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 14:47:17,078 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 14:47:17,523 [INFO] Done.

enr_bg_PCB3.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Benzo A Pyrene Metabolism WP696	0.002069	0.013766	724.083333	4475.443288	CYP1A1
1	WikiPathways_2024_Human	Estrogen Receptor Pathway WP2881	0.002987	0.013766	482.611111	2805.639574	CYP1A1
2	WikiPathways_2024_Human	Fatty Acid Omega Oxidation WP206	0.003217	0.013766	445.461538	2556.698083	CYP1A1
3	WikiPathways_2024_Human	Estrogen Metabolism WP697	0.003905	0.013766	361.875000	2006.791561	CYP1A1
4	WikiPathways_2024_Human	Tamoxifen Metabolism WP691	0.004593	0.013766	304.684211	1640.199523	CYP1A1
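To make the P-value column easier to interpret, the sketch below computes an overrepresentation p-value with the hypergeometric upper tail, a common formulation of ORA. Whether GSEApy uses exactly this test internally is not stated in this notebook, so treat the function `ora_p_value` and its parameter names as illustrative assumptions.

```python
from math import comb


def ora_p_value(n_background, n_pathway, n_selected, n_overlap):
    """Probability of drawing at least `n_overlap` pathway genes.

    n_background: number of genes in the background list
    n_pathway:    number of background genes that belong to the pathway
    n_selected:   number of genes in the significant gene list
    n_overlap:    number of significant genes that belong to the pathway
    """
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 3 in the pathway, 4 selected, 2 overlapping
print(ora_p_value(10, 3, 4, 2))
```

With a very small gene list, a single overlapping gene can already give a small p-value and a very large odds ratio, which fits the PCB concentration 3 table, where every top term is driven by CYP1A1 alone.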

Section 2.4: Saving plots of ORA

In this section, the ORA plots for each comparison are saved.

Section 2.4.1 PCB concentration 1

Step 10: The ORA barplots are created with the commands below, one per condition. The argument 'ofname' sets the file name under which each figure is saved on your computer.

ax_PCB1 = barplot(enr_bg_PCB1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'PCB concentration 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='PCB concentration 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_PCB2 = barplot(enr_bg_PCB2.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'PCB concentration 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='PCB concentration 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_PCB3 = barplot(enr_bg_PCB3.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'PCB concentration 3 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='PCB concentration 3 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )
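The barplots rank pathways by adjusted p-value. As a sketch of the arithmetic behind such a plot, the code below sorts a few terms from the PCB concentration 1 table and converts their adjusted p-values to -log10 values, so that smaller p-values give longer bars. That GSEApy applies this same transformation is an assumption here, and `bar_lengths` is a hypothetical helper.

```python
import math

# (term, adjusted p-value) pairs copied from the PCB concentration 1 output table
pcb1_rows = [
    ("Complement System WP2806", 4.840264e-08),
    ("Nuclear Receptors Meta Pathway WP2882", 7.218984e-15),
    ("Glucocorticoid Receptor Pathway WP2880", 6.371959e-07),
    ("NRF2 Pathway WP2884", 6.990726e-07),
]


def bar_lengths(rows, top_term=10):
    """Return the `top_term` most significant terms with their -log10 values."""
    ranked = sorted(rows, key=lambda row: row[1])[:top_term]
    return [(term, -math.log10(p_adj)) for term, p_adj in ranked]


for term, length in bar_lengths(pcb1_rows, top_term=3):
    print(f"{length:6.2f}  {term}")
```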

Section 3: Execution of Overrepresentation analysis for dataset: E-MEXP-3583

In this section, you will execute overrepresentation analysis for dataset E-MEXP-3583 using the Enrichr function of GSEApy and a background gene list that you create first.

Section 3.1: Generation of the background genelist

Step 11: First, you verify the contents of the folder.

path_ORA = "C:\\Users\\shaki\\Downloads\\BackgroundORA-EMEXP3583"
dir_list_ORA = os.listdir(path_ORA)
print("Files and directories in '", path_ORA, "' :")
print(dir_list_ORA)

Files and directories in ' C:\Users\shaki\Downloads\BackgroundORA-EMEXP3583 ' :
['topTable_Ag._.1.3_24 - H2O.control_.0.0_24.csv', 'topTable_Ag._.1.3_48 - H2O.control_.0.0_48.csv', 'topTable_AgNP._.12.1_24- H2O.control_.0.0_24.csv', 'topTable_AgNP._.12.1_48 - H2O.control_.0.0_48.csv', 'topTable_H2O_0.0_48 - H2O.control_.0.0_24.csv']

Step 12: Next, we create individual dataframes for each of the files.

list_of_names_ORA = ['topTable_Ag._.1.3_24 - H2O.control_.0.0_24', 'topTable_Ag._.1.3_48 - H2O.control_.0.0_48','topTable_AgNP._.12.1_24- H2O.control_.0.0_24','topTable_AgNP._.12.1_48 - H2O.control_.0.0_48','topTable_H2O_0.0_48 - H2O.control_.0.0_24']
dataframes_list_ORA = []

for name in list_of_names_ORA:
    # The file name is the comparison name plus ".csv"
    temp_df_ORA = pd.read_csv("./BackgroundORA-EMEXP3583/" + name + ".csv", comment=';')
    dataframes_list_ORA.append(temp_df_ORA)

Step 13: You will now merge the dataframes by vertical stacking. The gene names are then extracted, stripped of surrounding whitespace and filtered for missing values to obtain the background gene list.

# Stack all comparison tables vertically into one dataframe
combined_dataframes_list_ORA = pd.concat(dataframes_list_ORA, ignore_index=True, axis=0)
genelist = combined_dataframes_list_ORA['gene_name'].copy()
gene_list = genelist.squeeze().str.strip().to_list()
# Drop missing gene names: NaN is the only value for which x == x is False
Backgroundgenelist = [x for x in gene_list if x == x]

Section 3.2: Generation of the genelists

In this section, you will create the genelists.

Section 3.2.1 Genelist for Ag+ timepoint 1

Step 14: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df1 = dataframes_list_ORA[0]
genelist_Ag1= genelist_df1[genelist_df1['padj'] < 0.05]
genelist_Ag1=genelist_Ag1['gene_name'].copy()
gene_listAg1 = genelist_Ag1.squeeze().str.strip().to_list()
gene_list_Ag1= [x for x in gene_listAg1 if x==x]

Section 3.2.2 Genelist for Ag+ timepoint 2

Step 15: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df2 = dataframes_list_ORA[1]
genelist_Ag2= genelist_df2[genelist_df2['padj'] < 0.05]
genelist_Ag2=genelist_Ag2['gene_name'].copy()
gene_listAg2 = genelist_Ag2.squeeze().str.strip().to_list()
gene_list_Ag2= [x for x in gene_listAg2 if x==x]

Section 3.2.3 Genelist for AgNP timepoint 1

Step 16: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df3 = dataframes_list_ORA[2]
genelist_AgNP1= genelist_df3[genelist_df3['padj'] < 0.05]
genelist_AgNP1=genelist_AgNP1['gene_name'].copy()
gene_listAgNP1 = genelist_AgNP1.squeeze().str.strip().to_list()
gene_list_AgNP1= [x for x in gene_listAgNP1 if x==x]

Section 3.2.4 Genelist for AgNP timepoint 2

Step 17: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df4 = dataframes_list_ORA[3]
genelist_AgNP2= genelist_df4[genelist_df4['padj'] < 0.05]
genelist_AgNP2=genelist_AgNP2['gene_name'].copy()
gene_listAgNP2 = genelist_AgNP2.squeeze().str.strip().to_list()
gene_list_AgNP2= [x for x in gene_listAgNP2 if x==x]
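ORA with a custom background only makes sense when every gene in a gene list is also part of the background. The gene lists here are derived from the same tables as the background, so this should hold, but a quick guard can confirm it. The sketch below uses a hypothetical helper `genes_outside_background` and made-up lists.

```python
def genes_outside_background(gene_list, background):
    """Return the sorted gene names that are in `gene_list` but not in `background`."""
    return sorted(set(gene_list) - set(background))


# Made-up example: one gene is missing from the background
demo_background = ["CCL2", "TLR4", "PTGS2", "CXCL5"]
demo_gene_list = ["CCL2", "TLR4", "ULBP1"]

print(genes_outside_background(demo_gene_list, demo_background))
```

An empty result means the gene list is fully contained in the background.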

Section 3.3: Execution of ORA

In this section, ORA will be executed per condition.

enr_bg_Ag1 = gp.enrichr(gene_list=gene_list_Ag1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='ORA_tables_for_comparison', 
                 verbose=True)

2025-04-13 12:19:15,194 [INFO] Run: WikiPathways_2024_Human 
2025-04-13 12:19:16,786 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-13 12:19:18,724 [INFO] Done.

enr_bg_Ag1.results.head()

enr_bg_Ag2 = gp.enrichr(gene_list=gene_list_Ag2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='ORA_tables_for_comparison', 
                 verbose=True)

2025-04-13 12:24:30,831 [INFO] Run: WikiPathways_2024_Human 
2025-04-13 12:24:32,402 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-13 12:24:32,816 [INFO] Done.

enr_bg_Ag2.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Platelet Mediated Interactions W Vascular And ...	0.000259	0.013651	101.071605	834.692381	CCL2;TLR4
1	WikiPathways_2024_Human	P53 Transcriptional Gene Network WP4963	0.000266	0.013651	27.364937	225.276445	CCL2;ULBP1;SERPINB5
2	WikiPathways_2024_Human	Network Map Of SARS CoV 2 Signaling WP5115	0.000428	0.013651	12.948480	100.448100	IFITM3;CCL2;PTGS2;CXCL5
3	WikiPathways_2024_Human	LDL Influence On CD14 And TLR4 WP5272	0.000479	0.013651	72.172840	551.613291	CCL2;TLR4
4	WikiPathways_2024_Human	Spinal Cord Injury WP2431	0.000564	0.013651	20.985577	156.979073	CCL2;PTGS2;TLR4
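The numbers in the output tables are consistent with a Combined Score equal to the negative natural logarithm of the P-value multiplied by the Odds Ratio. This relation is inferred from the printed values and is not stated in the notebook. The sketch below recomputes the score for three rows copied from the tables above; small differences are expected because the printed P-values are rounded.

```python
import math


def combined_score(p_value, odds_ratio):
    """Combined score as -ln(p) * odds ratio (relation inferred from the tables)."""
    return -math.log(p_value) * odds_ratio


# (P-value, Odds Ratio, reported Combined Score) copied from the output tables
reported_rows = [
    (9.207888e-18, 3.146632, 123.431285),   # PCB concentration 1, row 0
    (0.002069, 724.083333, 4475.443288),    # PCB concentration 3, row 0
    (0.000259, 101.071605, 834.692381),     # Ag+ timepoint 2, row 0
]

for p_value, odds_ratio, reported in reported_rows:
    print(round(combined_score(p_value, odds_ratio), 3), reported)
```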

enr_bg_AgNP1 = gp.enrichr(gene_list=gene_list_AgNP1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='ORA_tables_for_comparison', 
                 verbose=True)

2025-04-13 12:25:06,014 [INFO] Run: WikiPathways_2024_Human 
2025-04-13 12:25:07,904 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-13 12:25:08,414 [INFO] Done.

enr_bg_AgNP1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Ciliopathies WP4803	0.000002	0.001669	2.050893	26.876247	INVS;GALNT11;DYNC2I1;ODAD4;TRAF3IP1;IFT172;CEP...
1	WikiPathways_2024_Human	Genes Related To Primary Cilium Development Ba...	0.000006	0.002491	2.477324	29.756033	DYNC2I1;TTC23;TRAF3IP1;IFT172;CEP19;CEP120;CBY...
2	WikiPathways_2024_Human	Pluripotent Stem Cell Differentiation Pathway ...	0.000100	0.020127	3.121977	28.753550	ALK;CSF1R;EPO;PDGFA;FGF1;FGF4;INS;NT5E;FGF8;CX...
3	WikiPathways_2024_Human	Bardet Biedl Syndrome WP5234	0.000119	0.020127	2.299628	20.771291	INVS;DYNC2I1;CEP104;TRAF3IP1;IFT172;PKD1L1;PKD...
4	WikiPathways_2024_Human	NRF2 Pathway WP2884	0.000143	0.020127	1.918783	16.992907	SERPINA1;HSP90AB1;SRXN1;SLC2A1;KEAP1;SLC2A2;SL...

enr_bg_AgNP2 = gp.enrichr(gene_list=gene_list_AgNP2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='ORA_tables_for_comparison', 
                 verbose=True)

2025-04-13 12:25:48,930 [INFO] Run: WikiPathways_2024_Human 
2025-04-13 12:25:50,729 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-13 12:25:51,136 [INFO] Done.

enr_bg_AgNP2.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Copper Homeostasis WP3286	0.000004	0.001917	4.418958	54.597864	GSK3B;JUN;SLC31A2;XIAP;MT1X;MT4;MT2A;MT1A;APC;...
1	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	0.000005	0.001917	1.999392	24.344522	SLC2A1;TNFAIP3;SLC2A3;GCC1;SLC2A6;SLC2A8;RGS2;...
2	WikiPathways_2024_Human	Selenium Metabolism And Selenoproteins WP28	0.000102	0.025295	3.792689	34.857059	EEFSEC;GPX2;JUN;GPX4;TXNRD3;GPX3;CREM;SELENOK;...
3	WikiPathways_2024_Human	Novel Intracellular Components Of RIG I Like R...	0.000188	0.034957	3.199706	27.451509	DDX3Y;DDX3X;CXCL8;CHUK;TRADD;ATG12;IFIH1;CYLD;...
4	WikiPathways_2024_Human	Small Cell Lung Cancer WP4658	0.000560	0.080545	2.440505	18.271498	LAMA5;CHUK;LAMB3;GADD45B;ITGA3;GADD45A;LAMB2;L...
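Several rows in the tables above share exactly the same Adjusted P-value although their P-values differ. This is typical of a step-up correction such as Benjamini-Hochberg. The sketch below implements that procedure in plain Python; that GSEApy uses Benjamini-Hochberg for this column is an assumption, and the input values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


demo_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(demo_adjusted)
```

In this example the two smallest p-values receive the same adjusted value, which mirrors the repeated values in the tables.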

Section 3.4: Saving plots of ORA

In this section, we create and save the ORA plots for each comparison (Ag+ timepoint 1, Ag+ timepoint 2, AgNP timepoint 1 and AgNP timepoint 2).

Step 18: The ORA barplots are created with the commands below, one per condition.

ax_Ag1 = barplot(enr_bg_Ag1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Ag+ timepoint 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Ag+ timepoint 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Ag2 = barplot(enr_bg_Ag2.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Ag+ timepoint 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Ag+ timepoint 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_AgNP1 = barplot(enr_bg_AgNP1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'AgNP timepoint 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='AgNP timepoint 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_AgNP2 = barplot(enr_bg_AgNP2.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'AgNP timepoint 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='AgNP timepoint 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

Section 4: Execution of Overrepresentation analysis for dataset: E-MEXP-2599

In this section, you will execute overrepresentation analysis for dataset E-MEXP-2599 using the Enrichr function of GSEApy and a background gene list that you create first.

Section 4.1: Generation of the background genelist

Step 19: First, you verify the contents of the folder.

path_ORA = "C:\\Users\\shaki\\Downloads\\BackgroundORA-EMEXP2599"
dir_list_ORA = os.listdir(path_ORA)
print("Files and directories in '", path_ORA, "' :")
print(dir_list_ORA)

Files and directories in ' C:\Users\shaki\Downloads\BackgroundORA-EMEXP2599 ' :
['X12_CdCI2_.5- X12_vehicle...control_.0.csv', 'X12_CsA_.5 - X12_vehicle...control_.0.csv', 'X12_Diquat_30 - X12_vehicle...control_.0.csv', 'X48_CdCI2_.5 - X48_vehicle...control_.0.csv', 'X48_CsA_.5 - X48_vehicle...control_.0.csv', 'X48_Diquat_30 - X48_vehicle...control_.0.csv']

Step 20: Next, we create individual dataframes for each of the files.

list_of_names_ORA = ['X12_CdCI2_.5- X12_vehicle...control_.0', 'X12_CsA_.5 - X12_vehicle...control_.0','X12_Diquat_30 - X12_vehicle...control_.0','X48_CdCI2_.5 - X48_vehicle...control_.0','X48_CsA_.5 - X48_vehicle...control_.0','X48_Diquat_30 - X48_vehicle...control_.0']
dataframes_list_ORA = []

for name in list_of_names_ORA:
    # The file name is the comparison name plus ".csv"
    temp_df_ORA = pd.read_csv("./BackgroundORA-EMEXP2599/" + name + ".csv", comment=';')
    dataframes_list_ORA.append(temp_df_ORA)
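With six comparisons, selecting dataframes by list position becomes easy to mix up. The sketch below parses the compound and the numeric prefix out of each file name and builds a lookup from that pair to the list index. The regular expression and the helper `parse_comparison` are assumptions based only on the file names shown above.

```python
import re

# Matches names such as "X12_CdCI2_.5- X12_vehicle...control_.0"
_NAME_PATTERN = re.compile(r"^X(?P<time>\d+)_(?P<compound>[^_]+)_")


def parse_comparison(name):
    """Return (compound, numeric prefix) parsed from a comparison file name."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected file name: {name!r}")
    return match.group("compound"), int(match.group("time"))


names = [
    "X12_CdCI2_.5- X12_vehicle...control_.0",
    "X12_CsA_.5 - X12_vehicle...control_.0",
    "X12_Diquat_30 - X12_vehicle...control_.0",
    "X48_CdCI2_.5 - X48_vehicle...control_.0",
    "X48_CsA_.5 - X48_vehicle...control_.0",
    "X48_Diquat_30 - X48_vehicle...control_.0",
]
index_by_comparison = {parse_comparison(n): i for i, n in enumerate(names)}

print(index_by_comparison[("CsA", 48)])
```

The lookup could then replace hard-coded positions such as `dataframes_list_ORA[4]`.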

Step 21: You will now merge the dataframes by vertical stacking. The gene names are then extracted, stripped of surrounding whitespace and filtered for missing values to obtain the background gene list.

# Stack all comparison tables vertically into one dataframe
combined_dataframes_list_ORA = pd.concat(dataframes_list_ORA, ignore_index=True, axis=0)
genelist = combined_dataframes_list_ORA['gene_name'].copy()
gene_list = genelist.squeeze().str.strip().to_list()
# Drop missing gene names: NaN is the only value for which x == x is False
Backgroundgenelist = [x for x in gene_list if x == x]

Section 4.2: Generation of the genelists

Section 4.2.1 Genelist for cadmium chloride timepoint 1

Step 22: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df1 = dataframes_list_ORA[0]
genelist_CdCI1= genelist_df1[genelist_df1['padj'] < 0.05]
genelist_CDCI1=genelist_CdCI1['gene_name'].copy()
gene_listCDCL1 = genelist_CDCI1.squeeze().str.strip().to_list()
gene_list_CDCI1= [x for x in gene_listCDCL1 if x==x]

Section 4.2.2 Genelist for cyclosporin A timepoint 1

Step 23: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df2 = dataframes_list_ORA[1]
genelist_CSA2= genelist_df2[genelist_df2['padj'] < 0.05]
genelist_CsA2=genelist_CSA2['gene_name'].copy()
gene_listCSA = genelist_CsA2.squeeze().str.strip().to_list()
gene_list_CSA1= [x for x in gene_listCSA if x==x]

Section 4.2.3 Genelist for diquat dibromide timepoint 1

Step 24: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df3 = dataframes_list_ORA[2]
genelist_diquat= genelist_df3[genelist_df3['padj'] < 0.05]
genelist_Diquat=genelist_diquat['gene_name'].copy()
gene_listDiquat = genelist_Diquat.squeeze().str.strip().to_list()
gene_list_Diquat1= [x for x in gene_listDiquat if x==x]

Section 4.2.4 Genelist for cadmium chloride timepoint 2

Step 25: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df4 = dataframes_list_ORA[3]
genelist_CDCL22= genelist_df4[genelist_df4['padj'] < 0.05]
genelist_CDCI2=genelist_CDCL22['gene_name'].copy()
gene_listCDCI22 = genelist_CDCI2.squeeze().str.strip().to_list()
gene_list_CDCI22= [x for x in gene_listCDCI22 if x==x]

Section 4.2.5 Genelist for cyclosporin A timepoint 2

Step 26: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df5 = dataframes_list_ORA[4]
genelist_CsA_2= genelist_df5[genelist_df5['padj'] < 0.05]
genelist_CSA2=genelist_CsA_2['gene_name'].copy()
gene_listCsA22 = genelist_CSA2.squeeze().str.strip().to_list()
gene_list_CsA2= [x for x in gene_listCsA22 if x==x]

Section 4.2.6 Genelist for diquat dibromide timepoint 2

Step 27: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df6 = dataframes_list_ORA[5]
genelist_Diquat_2= genelist_df6[genelist_df6['padj'] < 0.05]
genelist_Diquat2=genelist_Diquat_2['gene_name'].copy()
gene_listDiquat = genelist_Diquat2.squeeze().str.strip().to_list()
gene_list_Diquat2= [x for x in gene_listDiquat if x==x]

Section 4.3: Execution of ORA

In this section, ORA will be executed per condition.

Step 28: The Enrichr function will be executed per comparison and saved.

enr_bg_CdCI1 = gp.enrichr(gene_list=gene_list_CDCI1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='EMEXP-2599_ORApathwaytable', 
                 verbose=True)

2025-04-14 09:58:56,222 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 09:58:59,621 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 09:59:01,252 [INFO] Done.

enr_bg_CdCI1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Retinoblastoma Gene In Cancer WP2446	5.091424e-21	4.154602e-18	8.952987	418.343788	TOP2A;CDKN1A;CDKN1B;MCM7;SUV39H1;HMGB2;SMC2;CC...
1	WikiPathways_2024_Human	DNA Repair Pathways Full Network WP4946	2.079615e-17	8.484830e-15	5.137071	197.323961	H2AX;FEN1;DCLRE1C;MRE11;WDR48;CCNH;MPG;CETN2;B...
2	WikiPathways_2024_Human	DNA IR Damage And Cellular Response Via ATR WP...	4.199721e-15	1.142324e-12	6.418160	212.465215	H2AX;DCLRE1A;MDC1;FEN1;MRE11;CEP164;UBE2D3;BRC...
3	WikiPathways_2024_Human	Cell Cycle WP179	4.142647e-14	8.450999e-12	4.172387	128.571514	CDKN1C;GSK3B;CDKN1A;CDKN1B;MCM7;CCNH;CDC14B;CD...
4	WikiPathways_2024_Human	Ciliary Landscape WP4352	1.388601e-10	2.198580e-08	2.464009	55.926974	DYNC2I2;DGKE;IFT172;UBE2D2;SMC4;DCAF7;PSMD7;MA...

enr_bg_CSA1 = gp.enrichr(gene_list=gene_list_CSA1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='EMEXP-2599_ORApathwaytable', 
                 verbose=True)

2025-04-14 10:00:23,265 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 10:00:24,820 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 10:00:25,292 [INFO] Done.

enr_bg_CSA1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Photodynamic Therapy Induced Unfolded Protein ...	1.069049e-11	6.360841e-09	25.863598	653.357585	DNAJC3;XBP1;HSPA5;EDEM1;DDIT3;DNAJB11;DNAJB9;T...
1	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	4.808685e-09	1.430584e-06	3.122448	59.803759	FHOD1;FHL2;HTRA1;PLOD3;NDRG1;PPM1G;HERPUD1;LMA...
2	WikiPathways_2024_Human	Prolactin Signaling WP2037	8.095352e-06	1.320876e-03	5.497044	64.448556	STAT5A;NFKBIA;MAP2K2;STAT3;EIF4EBP1;SIRPA;AKT1...
3	WikiPathways_2024_Human	Retinoblastoma Gene In Cancer WP2446	8.879840e-06	1.320876e-03	4.980684	57.933952	RRM2;MCM7;SUV39H1;SMC3;TYMS;CDC25A;CDC25B;POLA...
4	WikiPathways_2024_Human	Pleural Mesothelioma WP5087	1.436712e-05	1.709687e-03	2.418983	26.973040	LAMA5;LAMC3;WWC1;CHD8;ITPR3;FOXM1;NDRG1;CCND3;...

enr_bg_Diquat1 = gp.enrichr(gene_list=gene_list_Diquat1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='EMEXP-2599_ORApathwaytable', 
                 verbose=True)

2025-04-14 10:01:07,378 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 10:01:08,961 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 10:01:09,394 [INFO] Done.

enr_bg_Diquat1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Pleural Mesothelioma WP5087	1.884671e-08	0.000012	2.612054	46.460414	CDKN1A;WWC1;CHD8;FOXM1;DEPTOR;PLAU;RPS6KA1;AKT...
1	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	2.077285e-06	0.000655	2.321610	30.376988	LRRC59;PSMD11;FLII;FHOD1;CLTC;HTRA1;IQGAP1;NDR...
2	WikiPathways_2024_Human	Hypothesized Pathways In Pathogenesis Of Cardi...	2.141000e-05	0.004503	8.863674	95.299142	FBN2;TGFBR3;SERPINE1;FLNA;LTBP2;MAPK14;TGFBR2;...
3	WikiPathways_2024_Human	P53 Transcriptional Gene Network WP4963	2.931388e-05	0.004624	3.941386	41.138023	CDKN1A;GADD45A;XRCC5;SERPINE1;LIF;TSC2;SLC7A11...
4	WikiPathways_2024_Human	Focal Adhesion WP306	3.905396e-05	0.004929	2.730696	27.718114	LAMA5;JUN;LAMB3;PRKCB;ITGA3;LAMA1;LAMB2;ACTN1;...

enr_bg_CDCI22 = gp.enrichr(gene_list=gene_list_CDCI22,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='EMEXP-2599_ORApathwaytable', 
                 verbose=True)

2025-04-14 10:01:51,520 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 10:01:53,315 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 10:01:54,607 [INFO] Done.

enr_bg_CDCI22.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Retinoblastoma Gene In Cancer WP2446	2.207376e-20	1.803427e-17	8.362074	378.466622	TOP2A;RB1;CDKN1A;CDKN1B;MCM7;HMGB2;SMC3;SMC2;C...
1	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	4.634443e-13	1.893170e-10	2.096834	59.550284	RPL5;TRAF3IP2;NCF2;PLOD3;ETS1;ICAM1;AMOT;ACTG1...
2	WikiPathways_2024_Human	Cell Cycle WP179	1.158770e-11	3.155717e-09	3.583526	90.237040	RB1;CDKN1C;CDKN1A;CDKN1B;MCM7;CCNH;SMC3;CDC14B...
3	WikiPathways_2024_Human	G1 To S Cell Cycle Control WP45	7.583665e-10	1.548964e-07	4.814595	101.105792	RB1;CDKN1C;CDKN1A;CDKN1B;PCNA;MCM7;CCNH;PRIM1;...
4	WikiPathways_2024_Human	DNA Replication WP466	3.955954e-09	6.464029e-07	6.867953	132.881465	PCNA;MCM7;PRIM1;GMNN;MCM10;POLD3;ORC4;CDC45;OR...

enr_bg_CsA2 = gp.enrichr(gene_list=gene_list_CsA2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='EMEXP-2599_ORApathwaytable', 
                 verbose=True)

2025-04-14 10:02:25,303 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 10:02:26,877 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 10:02:27,470 [INFO] Done.

enr_bg_CsA2.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Retinoblastoma Gene In Cancer WP2446	5.777480e-26	4.188673e-23	12.199144	708.931848	TOP2A;RB1;CDKN1A;PCNA;MCM7;PRIM1;HMGB2;TTK;TYM...
1	WikiPathways_2024_Human	DNA Replication WP466	3.466155e-17	1.256481e-14	17.166138	650.612078	PCNA;MCM7;PRIM1;GMNN;ORC4;CDC45;ORC1;RFC5;CDT1...
2	WikiPathways_2024_Human	Cell Cycle WP179	2.912721e-15	7.039075e-13	5.818540	194.744738	RB1;CDKN1A;HDAC2;PCNA;MCM7;TTK;CDC14B;CDC20;OR...
3	WikiPathways_2024_Human	G1 To S Cell Cycle Control WP45	2.686793e-12	4.869812e-10	7.512052	200.141147	RB1;CDKN1A;PCNA;MCM7;PRIM1;ORC4;CCNB1;CDC45;OR...
4	WikiPathways_2024_Human	DNA Mismatch Repair WP531	2.604210e-09	3.776105e-07	15.752656	311.369143	RFC5;RFC3;PCNA;RFC4;RFC2;RPA1;RPA2;MSH6;MSH2;E...

enr_bg_Diquat2 = gp.enrichr(gene_list=gene_list_Diquat2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='EMEXP-2599_ORApathwaytable', 
                 verbose=True)

2025-04-14 10:03:28,790 [INFO] Run: WikiPathways_2024_Human 
2025-04-14 10:03:30,575 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-14 10:03:31,122 [INFO] Done.

enr_bg_Diquat2.results.head()

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Retinoblastoma Gene In Cancer WP2446	3.920321e-23	3.198982e-20	10.221704	527.371265	TOP2A;CDKN1A;CDKN1B;MCM7;HMGB2;SMC2;CCND1;SIN3...
1	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	5.853628e-19	2.388280e-16	2.472575	103.803798	TRAF3IP2;ARPC5L;ICAM1;GJA1;PNP;RPS6KA5;BSG;LUC...
2	WikiPathways_2024_Human	G1 To S Cell Cycle Control WP45	8.953469e-10	2.435344e-07	4.781278	99.612229	CDKN1C;CDKN1A;CDKN1B;PCNA;MCM7;CCNH;PRIM1;ORC4...
3	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	1.227298e-09	2.503688e-07	2.060356	42.275309	KEAP1;IRS2;GCC1;NR3C1;SCP2;CCND1;SLC39A9;FTH1;...
4	WikiPathways_2024_Human	Cell Cycle WP179	1.873745e-09	3.057952e-07	3.088347	62.061351	CDKN1C;CDKN1A;CDKN1B;MCM7;CCNH;CDC20;CCND1;PTT...

Section 4.4: Saving plots of ORA

In this section, we create and save the ORA plots for each comparison (CDCI2, CsA and Diquat, each at timepoint 1 and timepoint 2).

Step 29: The ORA barplots are created per condition with the commands below. The parameter 'ofname' sets the file name under which each figure is saved.

ax_CDCI1 = barplot(enr_bg_CdCI1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'CDCI2 timepoint 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='CDCI2 timepoint 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_CsA1 = barplot(enr_bg_CSA1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'CsA timepoint 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='CsA timepoint 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Diquat1 = barplot(enr_bg_Diquat1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Diquat timepoint 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Diquat timepoint 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_CDCI22 = barplot(enr_bg_CDCI22.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'CDCI2 timepoint 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='CDCI2 timepoint 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_CsA2 = barplot(enr_bg_CsA2.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'CsA timepoint 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='CsA timepoint 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Diquat2 = barplot(enr_bg_Diquat2.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Diquat timepoint 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Diquat timepoint 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )
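
The six barplot calls above differ only in the results object, the title and the output file name. A minimal sketch of how the shared arguments could be generated once per condition is given below. The helper barplot_kwargs and the dictionary results_by_label are hypothetical names introduced for illustration; the argument values are copied from the calls above.

```python
def barplot_kwargs(label):
    """Build the keyword arguments shared by every barplot call in this section."""
    name = f"{label} ORA"
    return {
        "column": "Adjusted P-value",
        "group": "Gene_set",
        "size": 10,
        "title": name,
        "top_term": 10,
        "figsize": (3, 5),
        "ofname": name,
        "color": {"WikiPathways_2024_Human": "darkred"},
    }

# One label per compound and timepoint, matching the titles used above
labels = [f"{compound} timepoint {t}"
          for t in (1, 2)
          for compound in ("CDCI2", "CsA", "Diquat")]

for label in labels:
    kwargs = barplot_kwargs(label)
    print(kwargs["ofname"])
    # With GSEApy loaded, each figure could then be drawn as:
    # barplot(results_by_label[label], **kwargs)   # results_by_label is hypothetical
```

Generating the title and file name from one label keeps the two in sync, which avoids copy-paste slips between conditions.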

Section 5: Execution of Overrepresentation analysis for dataset: GSE44729

In this section, you will execute overrepresentation analysis for dataset GSE44729 using the Enrichr function of GSEApy and a background list of genes.

Section 5.1: Generation of the background genelist

In this section, you will create the background list of genes, which contains all expressed genes of dataset GSE44729. This requires a folder containing one file of expressed genes per condition of the dataset.

Step 30: First, you verify the contents of the folder.

path_ORA = "C:\\Users\\shaki\\Downloads\\BackgroundORA-EGEOD44729"
dir_list_ORA = os.listdir(path_ORA)
print("Files and directories in '", path_ORA, "' :")
print(dir_list_ORA)

Files and directories in ' C:\Users\shaki\Downloads\BackgroundORA-EGEOD44729 ' :
['adaptedACR10h.tsv', 'adaptedACR24h.tsv', 'adaptedCP10h.tsv', 'adaptedCP24h.tsv', 'adaptedMA10h.tsv', 'adaptedMA24h.tsv']
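
Instead of comparing the printed listing by eye, the check can be automated. The sketch below reports which expected files are absent from a folder. The function missing_files is a hypothetical helper, and a temporary folder stands in for the real data folder so that the example runs anywhere.

```python
import os
import tempfile

def missing_files(folder, expected_names, extension=".tsv"):
    """Return the expected file names that are not present in the folder."""
    present = set(os.listdir(folder))
    return [name + extension for name in expected_names
            if name + extension not in present]

# Demonstration on a temporary folder that stands in for the real data folder
expected = ["adaptedACR10h", "adaptedACR24h", "adaptedCP10h"]
with tempfile.TemporaryDirectory() as folder:
    for name in expected[:2]:                      # create only two of the three files
        open(os.path.join(folder, name + ".tsv"), "w").close()
    missing = missing_files(folder, expected)

print(missing)
```

An empty list means every expected file is available before the dataframes are read.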

Step 31: Next, we create individual dataframes for each of the files.

list_of_names_ORA = ['adaptedACR10h', 'adaptedACR24h','adaptedCP10h','adaptedCP24h','adaptedMA10h','adaptedMA24h']
dataframes_list_ORA = []

for name in list_of_names_ORA:
    # Read each tab-separated file into its own dataframe
    temp_df_ORA = pd.read_csv("./BackgroundORA-EGEOD44729/" + name + ".tsv", sep='\t')
    dataframes_list_ORA.append(temp_df_ORA)

Step 32: The dataframes are merged by vertical stacking. The gene symbols are then de-duplicated, stripped of surrounding whitespace and filtered for missing values to obtain the background gene list.

# Stack all condition tables vertically
combined_dataframes_list_ORA = pd.concat(dataframes_list_ORA, ignore_index=True, axis=0)
# Keep each gene symbol once and strip surrounding whitespace
genelist = combined_dataframes_list_ORA['GENE_SYMBOL'].copy().drop_duplicates()
gene_list = genelist.squeeze().str.strip().to_list()
# NaN is the only value that is not equal to itself, so x == x drops missing symbols
Backgroundgenelist = [x for x in gene_list if x == x]
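
The comparison x==x in the last line is a compact way to drop missing values: NaN is the only value that is not equal to itself. The sketch below shows the same clean-up on a small invented list in plain Python. Note that it strips whitespace before removing duplicates, which is a deliberate deviation from the cell above: there, duplicates are dropped first, so a symbol with stray whitespace could survive next to its clean copy.

```python
nan = float("nan")
symbols = ["HMOX1", " SLC7A11 ", nan, "HMOX1", "CYP1A1", nan]   # invented example values

# Strip whitespace from strings, leave missing values untouched
stripped = [s.strip() if isinstance(s, str) else s for s in symbols]

# NaN != NaN, so the comparison x == x is False only for missing values
without_missing = [x for x in stripped if x == x]

# Remove duplicates while keeping the first occurrence of each symbol
background = list(dict.fromkeys(without_missing))
print(background)   # ['HMOX1', 'SLC7A11', 'CYP1A1']
```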

Section 5.2: Generation of the genelists

In this section, you will create the genelists.

Section 5.2.1 Genelist for ACR timepoint 1

Step 33: You extract the significant results and convert the dataframe into a list per comparison.

GSEA and ORA could not be executed for ACR timepoint 1 because its gene list contained only one gene, so the code below is left commented out.

#genelist_df1 = dataframes_list_ORA[0]
#genelist_ACR1= genelist_df1[genelist_df1['padj'] < 0.05]
#genelist_ACR_1=genelist_ACR1['GENE_SYMBOL'].copy()
#gene_listACR1 = genelist_ACR_1.squeeze().str.strip().to_list()
#gene_list_ACR10_t1= [x for x in gene_listACR1 if x==x]

Section 5.2.2 Genelist for ACR timepoint 2

Step 34: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df2 = dataframes_list_ORA[1]
genelist_ACR2= genelist_df2[genelist_df2['padj'] < 0.05]
genelist_acr2=genelist_ACR2['GENE_SYMBOL'].copy()
gene_listACR2 = genelist_acr2.squeeze().str.strip().to_list()
gene_list_ACR24_t2= [x for x in gene_listACR2 if x==x]

Section 5.2.3 Genelist for CP timepoint 1

Step 35: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df3 = dataframes_list_ORA[2]
genelist_CP10= genelist_df3[genelist_df3['padj'] < 0.05]
genelist_cp10=genelist_CP10['GENE_SYMBOL'].copy()
gene_listCP10 = genelist_cp10.squeeze().str.strip().to_list()
gene_list_CP10_t1= [x for x in gene_listCP10 if x==x]

Section 5.2.4 Genelist for CP timepoint 2

Step 36: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df4 = dataframes_list_ORA[3]
genelist_CP24= genelist_df4[genelist_df4['padj'] < 0.05]
genelist_cp24=genelist_CP24['GENE_SYMBOL'].copy()
gene_listCP24 = genelist_cp24.squeeze().str.strip().to_list()
gene_list_CP24_t2= [x for x in gene_listCP24 if x==x]

Section 5.2.5 Genelist for MA timepoint 1

Step 37: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df5 = dataframes_list_ORA[4]
genelist_MA10= genelist_df5[genelist_df5['padj'] < 0.05]
genelist_ma10=genelist_MA10['GENE_SYMBOL'].copy()
gene_listMA10 = genelist_ma10.squeeze().str.strip().to_list()
gene_list_MA10_t1= [x for x in gene_listMA10 if x==x]

Section 5.2.6 Genelist for MA timepoint 2

Step 38: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df6 = dataframes_list_ORA[5]
genelist_MA24= genelist_df6[genelist_df6['padj'] < 0.05]
genelist_ma24=genelist_MA24['GENE_SYMBOL'].copy()
gene_listMA24 = genelist_ma24.squeeze().str.strip().to_list()
gene_list_MA24_t2= [x for x in gene_listMA24 if x==x]
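
Steps 33 to 38 repeat the same three operations: filter on padj < 0.05, take the gene symbols, and drop missing entries. A minimal sketch of that logic as one reusable function is shown below. It uses only the standard library and invented toy tables rather than the pandas dataframes of the notebook; the function name significant_genes and the minimum list size of two genes are assumptions made for illustration.

```python
import csv
import io

def significant_genes(tsv_text, symbol_column="GENE_SYMBOL", alpha=0.05):
    """Return stripped gene symbols of rows with padj below alpha."""
    genes = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        symbol = (row.get(symbol_column) or "").strip()
        padj = row.get("padj") or ""
        if not symbol or padj in ("", "NA", "NaN"):
            continue                      # skip rows with a missing symbol or padj
        if float(padj) < alpha:
            genes.append(symbol)
    return genes

# Toy tables (invented values, for illustration only)
tables = {
    "ACR10h": "GENE_SYMBOL\tpadj\nHMOX1\t0.01\nFOS\t0.40\n",
    "CP24h": "GENE_SYMBOL\tpadj\nNQO1\t0.001\n XBP1 \t0.03\n\t0.02\nATF4\tNA\n",
}
gene_lists = {name: significant_genes(text) for name, text in tables.items()}

# Flag comparisons whose gene list is too short to analyse (threshold is hypothetical)
too_small = [name for name, genes in gene_lists.items() if len(genes) < 2]
print(gene_lists)
print(too_small)
```

Flagging short gene lists up front makes cases such as ACR timepoint 1 visible before the enrichment step is attempted.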

Section 5.3: Execution of ORA

In this section, ORA will be executed per condition.

Step 39: The Enrichr function is executed per comparison, with the exception of ACR (acrolein) timepoint 1, which did not give ORA results.

enr_bg_ACR24_t2 = gp.enrichr(gene_list=gene_list_ACR24_t2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir=None, 
                 verbose=True)

2025-03-18 14:35:19,806 [INFO] Run: WikiPathways_2024_Human 
2025-03-18 14:35:20,846 [INFO] Done.

enr_bg_ACR24_t2.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Transcriptional Activation By NRF2 In Response...	0.000032	0.000489	594.750000	6164.285398	HMOX1;SLC7A11
1	WikiPathways_2024_Human	NRF2 ARE Regulation WP4357	0.000032	0.000489	594.750000	6164.285398	HMOX1;SLC7A11
2	WikiPathways_2024_Human	Oxidative Stress Response WP408	0.000110	0.001135	237.600000	2165.743440	CYP1A1;HMOX1
3	WikiPathways_2024_Human	Antiviral And Anti-Inflam Effects Of Nrf2 On S...	0.000146	0.001135	197.916667	1747.310661	HMOX1;SLC7A11
4	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	0.000301	0.001865	40.051724	324.773808	CYP1A1;HMOX1;SLC7A11
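
For orientation, ORA is commonly formulated as a one-sided hypergeometric test: given a background of expressed genes, how likely is it to find at least the observed number of pathway genes in the gene list by chance? The sketch below implements that textbook formulation with invented counts. Whether GSEApy computes its P-values in exactly this way is an assumption here, and the function name ora_p_value is hypothetical.

```python
from math import comb

def ora_p_value(overlap, list_size, pathway_size, background_size):
    """P(X >= overlap) for a hypergeometric draw of list_size genes."""
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    hits = sum(comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
               for k in range(overlap, upper + 1))
    return hits / total

# Invented numbers: 20 background genes, 5 of them in the pathway,
# 4 significant genes, 2 of which fall in the pathway
p = ora_p_value(overlap=2, list_size=4, pathway_size=5, background_size=20)
print(round(p, 4))   # 0.2487
```

The background size enters the calculation directly, which is why the background gene list from Section 5.1 is passed to the Enrichr function.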

enr_bg_CP10_t1 = gp.enrichr(gene_list=gene_list_CP10_t1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir=None, 
                 verbose=True)

2025-03-18 14:53:58,480 [INFO] Run: WikiPathways_2024_Human 
2025-03-18 14:53:59,585 [INFO] Done.

enr_bg_CP10_t1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Epithelial To Mesenchymal Transition In Colore...	0.000965	0.041872	61.480519	426.881789	ID2;SNAI1
1	WikiPathways_2024_Human	Hepatitis B Infection WP4666	0.001669	0.041872	45.009524	287.847526	EGR3;FOS
2	WikiPathways_2024_Human	Serotonin And Anxiety WP3947	0.003770	0.041872	inf	inf	FOS
3	WikiPathways_2024_Human	MAPK Pathway In Congenital Thyroid Cancer WP4928	0.003770	0.041872	inf	inf	FOS
4	WikiPathways_2024_Human	Estrogen Signaling WP712	0.003770	0.041872	inf	inf	FOS
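
Several rows above report an Odds Ratio and Combined Score of inf. One plausible explanation, stated here as an assumption and not as documented GSEApy behaviour, is that the odds ratio comes from a 2x2 table in which one off-diagonal cell is empty, for example when every background gene of the pathway is also in the gene list. The sketch below uses invented counts and a hypothetical function name.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table; infinite when an off-diagonal cell is empty.

    a: significant genes in the pathway       b: significant genes outside it
    c: other background genes in the pathway  d: other background genes outside it
    """
    if b * c == 0:
        return math.inf
    return (a * d) / (b * c)

# Invented counts: the only background gene of the pathway is also significant
print(odds_ratio(a=1, b=9, c=0, d=990))
# Invented counts: two further pathway genes are present in the background
print(odds_ratio(a=1, b=9, c=2, d=988))
```

Under this reading, an inf value reflects a very small pathway within the background and should be interpreted with care, not as an exceptionally strong enrichment.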

enr_bg_CP24_t2 = gp.enrichr(gene_list=gene_list_CP24_t2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir=None, 
                 verbose=True)

2025-03-18 14:29:05,143 [INFO] Run: WikiPathways_2024_Human 
2025-03-18 14:29:06,377 [INFO] Done.

enr_bg_CP24_t2.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	0.000005	0.001376	3.395663	41.215212	TNFAIP3;GCC1;SLC7A11;CYP3A5;DNAJB1;RGS2;CCND1;...
1	WikiPathways_2024_Human	NRF2 Pathway WP2884	0.000006	0.001376	9.572727	114.873390	NQO1;SLC7A11;PRDX6;TGFBR2;DNAJB1;MAFF;ME1;HMOX...
2	WikiPathways_2024_Human	TGF Beta Signaling Pathway WP366	0.000008	0.001376	6.919643	81.087304	CDKN2B;MMP1;LIMK2;TNC;FOS;THBS1;RUNX2;PML;SMAD...
3	WikiPathways_2024_Human	Endoplasmic Reticulum Stress Response In Coron...	0.000035	0.004418	29.228963	300.014140	PPP1R15A;PPP1R15B;PPP1R14B;XBP1;PPP1R10;PPP1R3...
4	WikiPathways_2024_Human	Photodynamic Therapy Induced Unfolded Protein ...	0.000141	0.014307	14.606654	129.503854	PPP1R15A;XBP1;DDIT3;DNAJB9;ASNS;TRIB3;CALR;ATF4

enr_bg_MA10_t1 = gp.enrichr(gene_list=gene_list_MA10_t1,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir=None, 
                 verbose=True)

2025-03-18 14:29:07,128 [INFO] Run: WikiPathways_2024_Human 
2025-03-18 14:29:08,141 [INFO] Done.

enr_bg_MA10_t1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	FBXL10 Enhancement Of MAP ERK Signaling In DLB...	0.003349	0.015063	794.000000	4525.002412	H3C14
1	WikiPathways_2024_Human	Effect Of Progerin On Genes Involved In Proger...	0.005021	0.015063	396.833333	2100.898045	H3C14
2	WikiPathways_2024_Human	Histone Modifications WP2369	0.008358	0.016715	198.250000	948.544395	H3C14
3	WikiPathways_2024_Human	NF1 Copy Number Variation Syndrome WP5366	0.018317	0.027476	79.100000	316.392205	H3C14
4	WikiPathways_2024_Human	Ebola Virus Infection In Host WP4217	0.024916	0.029899	56.404762	208.261211	CD300A
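
The Adjusted P-value column above is consistent with a Benjamini-Hochberg correction over six tested terms: for example, 0.005021 × 6 / 2 = 0.015063 for the second row. Both the correction method and the number of terms are inferred from the printed values, so they are assumptions. The sketch below reproduces the calculation; the sixth P-value is not shown in the table, so a hypothetical placeholder of 0.05 is used, which does not affect the first five results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each value is capped by its successors
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Five P-values from the MA timepoint 1 table plus a placeholder sixth value
# (0.05 is hypothetical; the sixth term is not shown in the table)
p_values = [0.003349, 0.005021, 0.008358, 0.018317, 0.024916, 0.05]
adjusted = benjamini_hochberg(p_values)
print([round(q, 6) for q in adjusted[:5]])
```

Small differences in the last digit are expected because the table shows rounded P-values.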

enr_bg_MA24_t2 = gp.enrichr(gene_list=gene_list_MA24_t2,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir=None, 
                 verbose=True)

2025-03-18 14:29:09,485 [INFO] Run: WikiPathways_2024_Human 
2025-03-18 14:29:10,994 [INFO] Done.

enr_bg_MA24_t2.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Ciliary Landscape WP4352	0.020091	0.999997	6.306452	24.642204	DYNC2I2;DYNC2I1;IFT70A;PSMD12;CTBP2;DYNLT2B;HT...
1	WikiPathways_2024_Human	Sudden Infant Death Syndrome SIDS Susceptibili...	0.034278	0.999997	inf	inf	CREBBP;NFYA;MAOA;SLC1A3;YBX1;RUNX3;MECP2;C4A;G...
2	WikiPathways_2024_Human	Cholesterol Metab With Bloch And Kandutsch Rus...	0.048097	0.999997	inf	inf	MVK;HMGCS1;ELOVL3;CYP51A1;HMGCR;LSS;ACAT2;TM7S...
3	WikiPathways_2024_Human	Selenium Micronutrient Network WP15	0.067466	0.999997	inf	inf	FGB;SERPINE1;SAA4;APOA1;MTHFR;PTGS1;PRDX4;IFNG...
4	WikiPathways_2024_Human	Cholesterol Metabolism WP5304	0.067466	0.999997	inf	inf	MVK;HMGCS1;CYP51A1;PCSK9;APOA1;HMGCR;LSS;ACAT2...

Section 5.4: Saving plots of ORA

In this section, we create and save the ORA plots for each comparison.

Step 40: The ORA barplots are created per condition with the commands below. The parameter 'ofname' sets the file name under which each figure is saved. A dotplot could not be created here because the resulting figure is too big.

ax_ACR24h_t2 = barplot(enr_bg_ACR24_t2.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'ACR timepoint 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='ACR timepoint 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_CP10_t1 = barplot(enr_bg_CP10_t1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'CP timepoint 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='CP timepoint 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_CP24_t2 = barplot(enr_bg_CP24_t2.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'CP timepoint 2 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='CP timepoint 2 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_MA_t1 = barplot(enr_bg_MA10_t1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'MA timepoint 1 ORA',
              top_term=10,
              figsize=(3,5),
              ofname='MA timepoint 1 ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

Section 6: Execution of Overrepresentation analysis for dataset: E-GEOD-69851

In this section, you will execute overrepresentation analysis for dataset E-GEOD-69851 using the Enrichr function of GSEApy and a background list of genes.

Section 6.1: Generation of the background genelist

In this section, you will create the background list of genes, which contains all expressed genes of dataset E-GEOD-69851. These are the dataframes you compared in your DEG analysis. This requires a folder containing one file of expressed genes per condition of the dataset:

df1: 0= Bisphenol A 100uM
df2: 1= Bisphenol A 10uM
df3: 2= Bisphenol A 1uM
df4: 3= Farnesol 100uM
df5: 4= Farnesol 10uM
df6: 5= Farnesol 1uM
df7: 6= Tetrachlorodibenzopdioxin 100nM
df8: 7= Tetrachlorodibenzopdioxin 10nM
df9: 8= Tetrachlorodibenzopdioxin 1nM
df10: 9= Troglitazone 100uM
df11: 10= Troglitazone 10uM
df12: 11= Troglitazone 1uM
df13: 12= ValproicAcid-100uM
df14: 13= ValproicAcid-10uM
df15: 14= ValproicAcid-1mM

Step 41: First, you verify the contents of the folder.

path_ORA = "C:\\Users\\shaki\\Downloads\\BackgroundORA_EGEOD69851"
dir_list_ORA = os.listdir(path_ORA)
print("Files and directories in '", path_ORA, "' :")
print(dir_list_ORA)

Files and directories in ' C:\Users\shaki\Downloads\BackgroundORA_EGEOD69851 ' :
['GSE69844.BisphenolA-100uM.tsv', 'GSE69844.BisphenolA-10uM.tsv', 'GSE69844.BisphenolA-1uM.tsv', 'GSE69844.Farnesol-100uM.tsv', 'GSE69844.Farnesol-10uM.tsv', 'GSE69844.Farnesol-1uM.tsv', 'GSE69844.Tetrachlorodibenzopdioxin-100nM.tsv', 'GSE69844.Tetrachlorodibenzopdioxin-10nM.tsv', 'GSE69844.Tetrachlorodibenzopdioxin-1nM.tsv', 'GSE69844.Troglitazone-100uM.tsv', 'GSE69844.Troglitazone-10uM.tsv', 'GSE69844.Troglitazone-1uM.tsv', 'GSE69844.ValproicAcid-100uM.tsv', 'GSE69844.ValproicAcid-10uM.tsv', 'GSE69844.ValproicAcid-1mM.tsv']
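
The folder listing above is in lexicographic order, so for each compound the 100uM file precedes the 10uM file, which precedes the 1uM file. Because the later steps address the dataframes by position, it helps to derive the index-to-condition mapping from the file names themselves. The sketch below does this for a subset of the file names shown above; the helper condition_label is a hypothetical name.

```python
# Subset of the file names from the listing above
file_names = [
    "GSE69844.BisphenolA-100uM.tsv",
    "GSE69844.BisphenolA-10uM.tsv",
    "GSE69844.BisphenolA-1uM.tsv",
    "GSE69844.ValproicAcid-100uM.tsv",
    "GSE69844.ValproicAcid-10uM.tsv",
    "GSE69844.ValproicAcid-1mM.tsv",
]

def condition_label(file_name, prefix="GSE69844.", suffix=".tsv"):
    """Strip the dataset prefix and file extension to get the condition label."""
    return file_name[len(prefix):-len(suffix)]

# Index -> condition, in the same order in which the files would be loaded
index_to_condition = {i: condition_label(name) for i, name in enumerate(file_names)}
for i, label in index_to_condition.items():
    print(i, label)

# Lexicographic sorting puts "100uM" before "10uM" before "1uM"
print(sorted(file_names) == file_names)
```

Printing the mapping before extracting the gene lists makes it easy to confirm that each index is paired with the intended concentration and unit.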

Step 42: Next, we create individual dataframes for each of the files and merge the dataframes by vertical stacking followed by manipulation to retrieve the background gene list.

list_of_names_ORA = ['GSE69844.BisphenolA-100uM', 'GSE69844.BisphenolA-10uM', 'GSE69844.BisphenolA-1uM', 'GSE69844.Farnesol-100uM', 'GSE69844.Farnesol-10uM', 'GSE69844.Farnesol-1uM', 'GSE69844.Tetrachlorodibenzopdioxin-100nM', 'GSE69844.Tetrachlorodibenzopdioxin-10nM', 'GSE69844.Tetrachlorodibenzopdioxin-1nM', 'GSE69844.Troglitazone-100uM', 'GSE69844.Troglitazone-10uM', 'GSE69844.Troglitazone-1uM', 'GSE69844.ValproicAcid-100uM', 'GSE69844.ValproicAcid-10uM', 'GSE69844.ValproicAcid-1mM']
dataframes_list_ORA = []

for name in list_of_names_ORA:
    # Read each tab-separated file into its own dataframe
    temp_df_ORA = pd.read_csv("./BackgroundORA_EGEOD69851/" + name + ".tsv", sep='\t')
    dataframes_list_ORA.append(temp_df_ORA)

# Stack all 15 condition tables vertically
combined_dataframes_list_ORA = pd.concat(dataframes_list_ORA, ignore_index=True, axis=0)
# Keep each gene symbol once and strip surrounding whitespace
genelist = combined_dataframes_list_ORA['Gene.Symbol'].copy().drop_duplicates()
gene_list = genelist.squeeze().str.strip().to_list()
# NaN is the only value that is not equal to itself, so x == x drops missing symbols
Backgroundgenelist = [x for x in gene_list if x == x]

Section 6.2: Generation of the genelists

In this section, you will create the genelists.

Section 6.2.1 Genelist for Bisphenol A 100 uM

Step 43: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df1 = dataframes_list_ORA[0]
genelist_BisphenolA100uM= genelist_df1[genelist_df1['padj'] < 0.05]
genelist_BisphenolA_100uM=genelist_BisphenolA100uM['Gene.Symbol'].copy()
gene_listBisphenolA_ = genelist_BisphenolA_100uM.squeeze().str.strip().to_list()
gene_list_BisphenolA_100uM= [x for x in gene_listBisphenolA_ if x==x]

Section 6.2.2 Genelist for Bisphenol A 10uM

Step 44: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df2 = dataframes_list_ORA[1]
genelist_BisphenolA10uM= genelist_df2[genelist_df2['padj'] < 0.05]
genelist_BisphenolA_10uM=genelist_BisphenolA10uM['Gene.Symbol'].copy()
gene_listBisphenolA_10 = genelist_BisphenolA_10uM.squeeze().str.strip().to_list()
gene_list_BisphenolA_10uM= [x for x in gene_listBisphenolA_10 if x==x]

Section 6.2.3 Genelist for Bisphenol A 1uM

Step 45: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df3 = dataframes_list_ORA[2]
genelist_BisphenolA1uM= genelist_df3[genelist_df3['padj'] < 0.05]
genelist_BisphenolA_1uM=genelist_BisphenolA1uM['Gene.Symbol'].copy()
gene_listBisphenolA_1 = genelist_BisphenolA_1uM.squeeze().str.strip().to_list()
gene_list_BisphenolA_1uM= [x for x in gene_listBisphenolA_1 if x==x]

Section 6.2.4 Genelist for Farnesol 100uM

Step 46: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df4 = dataframes_list_ORA[3]
genelist_Farnesol100_uM= genelist_df4[genelist_df4['padj'] < 0.05]
genelist_Farnesol100=genelist_Farnesol100_uM['Gene.Symbol'].copy()
gene_listFarnesol100uM = genelist_Farnesol100.squeeze().str.strip().to_list()
gene_list_Farnesol_100uM= [x for x in gene_listFarnesol100uM if x==x]

Section 6.2.5 Genelist for Farnesol 10uM

Step 47: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df5 = dataframes_list_ORA[4]
genelist_Farnesol10_uM= genelist_df5[genelist_df5['padj'] < 0.05]
genelist_Farnesol10=genelist_Farnesol10_uM['Gene.Symbol'].copy()
gene_listFarnesol10uM = genelist_Farnesol10.squeeze().str.strip().to_list()
gene_list_Farnesol_10uM= [x for x in gene_listFarnesol10uM if x==x]

Section 6.2.6 Genelist for Farnesol 1uM

Step 48: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df6 = dataframes_list_ORA[5]
genelist_Farnesol1_uM= genelist_df6[genelist_df6['padj'] < 0.05]
genelist_Farnesol1=genelist_Farnesol1_uM['Gene.Symbol'].copy()
gene_listFarnesol1uM = genelist_Farnesol1.squeeze().str.strip().to_list()
gene_list_Farnesol_1uM= [x for x in gene_listFarnesol1uM if x==x]

Section 6.2.7 Genelist for TP dioxin 100nM

Step 49: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df7 = dataframes_list_ORA[6]
genelist_Tpdioxin100_uM= genelist_df7[genelist_df7['padj'] < 0.05]
genelist_Tpdioxin100=genelist_Tpdioxin100_uM['Gene.Symbol'].copy()
gene_listTpdioxin100uM = genelist_Tpdioxin100.squeeze().str.strip().to_list()
gene_list_Tpdioxin_100uM= [x for x in gene_listTpdioxin100uM if x==x]

Section 6.2.8 Genelist for TP dioxin 10nM

Step 50: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df8 = dataframes_list_ORA[7]
genelist_Tpdioxin10_uM= genelist_df8[genelist_df8['padj'] < 0.05]
genelist_Tpdioxin10=genelist_Tpdioxin10_uM['Gene.Symbol'].copy()
gene_listTpdioxin10uM = genelist_Tpdioxin10.squeeze().str.strip().to_list()
gene_list_Tpdioxin_10uM= [x for x in gene_listTpdioxin10uM if x==x]

Section 6.2.9 Genelist for TP dioxin 1nM

Step 51: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df9 = dataframes_list_ORA[8]
genelist_Tpdioxin1_uM= genelist_df9[genelist_df9['padj'] < 0.05]
genelist_Tpdioxin1=genelist_Tpdioxin1_uM['Gene.Symbol'].copy()
gene_listTpdioxin1uM = genelist_Tpdioxin1.squeeze().str.strip().to_list()
gene_list_Tpdioxin_1uM= [x for x in gene_listTpdioxin1uM if x==x]

Section 6.2.10 Genelist for Troglitazone 100 uM

Step 52: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df10 = dataframes_list_ORA[9]
genelist_Troglitazone100_uM= genelist_df10[genelist_df10['padj'] < 0.05]
genelist_Troglitazone100=genelist_Troglitazone100_uM['Gene.Symbol'].copy()
gene_listTroglitazone100 = genelist_Troglitazone100.squeeze().str.strip().to_list()
gene_list_Troglitazone_100uM= [x for x in gene_listTroglitazone100 if x==x]

Section 6.2.11 Genelist for Troglitazone 10 uM

Step 53: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df11 = dataframes_list_ORA[10]
genelist_Troglitazone10_uM= genelist_df11[genelist_df11['padj'] < 0.05]
genelist_Troglitazone10=genelist_Troglitazone10_uM['Gene.Symbol'].copy()
gene_listTroglitazone10 = genelist_Troglitazone10.squeeze().str.strip().to_list()
gene_list_Troglitazone_10uM= [x for x in gene_listTroglitazone10 if x==x]

Section 6.2.12 Genelist for Troglitazone 1 uM

Step 54: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df12 = dataframes_list_ORA[11]
genelist_Troglitazone1_uM= genelist_df12[genelist_df12['padj'] < 0.05]
genelist_Troglitazone1=genelist_Troglitazone1_uM['Gene.Symbol'].copy()
gene_listTroglitazone1 = genelist_Troglitazone1.squeeze().str.strip().to_list()
gene_list_Troglitazone_1uM= [x for x in gene_listTroglitazone1 if x==x]

Section 6.2.13 Genelist for Valproic acid 100uM

Step 55: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df13 = dataframes_list_ORA[12]
genelist_Valproicacid100uM= genelist_df13[genelist_df13['padj'] < 0.05]
genelist_Valproicacid100=genelist_Valproicacid100uM['Gene.Symbol'].copy()
gene_listValproicacid100uM = genelist_Valproicacid100.squeeze().str.strip().to_list()
gene_list_Valproicacid_100uM= [x for x in gene_listValproicacid100uM if x==x]

Section 6.2.14 Genelist for Valproic acid 10uM

Step 56: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df14 = dataframes_list_ORA[13]
genelist_Valproicacid10uM= genelist_df14[genelist_df14['padj'] < 0.05]
genelist_Valproicacid10=genelist_Valproicacid10uM['Gene.Symbol'].copy()
gene_listValproicacid10uM = genelist_Valproicacid10.squeeze().str.strip().to_list()
gene_list_Valproicacid_10uM= [x for x in gene_listValproicacid10uM if x==x]

Section 6.2.15 Genelist for Valproic acid 1mM

Step 57: You extract the significant results and convert the dataframe into a list per comparison.

genelist_df15 = dataframes_list_ORA[14]
genelist_Valproicacid1uM= genelist_df15[genelist_df15['padj'] < 0.05]
genelist_Valproicacid1=genelist_Valproicacid1uM['Gene.Symbol'].copy()
gene_listValproicacid1uM = genelist_Valproicacid1.squeeze().str.strip().to_list()
gene_list_Valproicacid_1uM= [x for x in gene_listValproicacid1uM if x==x]

Section 6.3: Execution of ORA

In this section, ORA will be executed per condition.

Step 58: The Enrichr function is executed per comparison and the results are saved to the folder given by 'outdir'.

enr_bg_BisphenolA100 = gp.enrichr(gene_list=gene_list_BisphenolA_100uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:08:35,632 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:08:37,759 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:08:38,878 [INFO] Done.

enr_bg_BisphenolA100.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	2.078731e-13	1.667142e-10	2.264959	66.140983	ATF2;NCF2;CTNND1;PLOD3;ELK1;ETS1;GJA1;PNP;CCND...
1	WikiPathways_2024_Human	Sterol Regulatory Element Binding Proteins SRE...	4.376168e-10	1.754843e-07	4.844598	104.399518	SCARB1;IDI1;SEC23A;SAR1A;SAR1B;INSIG2;INSIG1;L...
2	WikiPathways_2024_Human	Cholesterol Metabolism WP5304	4.001475e-08	1.069728e-05	4.056097	69.091633	SCARB1;IDI1;LRP1;SAR1B;LPL;LCAT;HMGCR;LIPA;CYP...
3	WikiPathways_2024_Human	Pathways Affected In Adenoid Cystic Carcinoma ...	1.418930e-07	2.844955e-05	4.255719	67.105003	CEBPA;MYCBP;SRCAP;PRKDC;CTBP1;PTEN;JMJD1C;DTX4...
4	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	2.389575e-07	3.753421e-05	1.976154	30.130378	SERPINA1;ALAS1;SRXN1;SLC2A2;IRS2;SLC7A11;NR3C1...

enr_bg_BisphenolA10 = gp.enrichr(gene_list=gene_list_BisphenolA_10uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:09:24,475 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:09:26,039 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:09:26,347 [INFO] Done.

enr_bg_BisphenolA10.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Old P-value	Old adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Pancreatic Cancer Subtypes WP5390	0.004687	0.009373	0	0	434.478261	2330.130322	S100A2
1	WikiPathways_2024_Human	Vitamin D Receptor Pathway WP2877	0.017592	0.017592	0	0	112.818182	455.819253	S100A2

enr_bg_BisphenolA1 = gp.enrichr(gene_list=gene_list_BisphenolA_1uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:09:56,544 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:09:58,628 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:09:59,107 [INFO] Done.

enr_bg_BisphenolA1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	7.703601e-13	5.870144e-10	2.452877	68.415449	ATF2;CTNND1;PLOD3;ICAM1;ACTG1;CCND1;CFL1;BSG;C...
1	WikiPathways_2024_Human	Cholesterol Metabolism WP5304	4.644690e-09	1.769627e-06	4.924623	94.491412	SCARB1;IDI1;LRP1;SAR1B;LPL;LCAT;HMGCR;LIPA;ACA...
2	WikiPathways_2024_Human	Sterol Regulatory Element Binding Proteins SRE...	2.265386e-08	5.754080e-06	4.627092	81.450401	SCARB1;IDI1;SEC23A;PRKAA1;SAR1B;LPL;HMGCR;MED1...
3	WikiPathways_2024_Human	Enterocyte Cholesterol Metabolism WP5333	5.665060e-08	9.998716e-06	7.456660	124.424531	IDI1;FDPS;ABCG8;DGAT1;HMGCS1;SAR1B;CYP51A1;DHC...
4	WikiPathways_2024_Human	Pathways Affected In Adenoid Cystic Carcinoma ...	6.560837e-08	9.998716e-06	4.907540	81.168556	SMARCE1;CEBPA;MAP2K2;MYCBP;SRCAP;PRKDC;CTBP1;M...

enr_bg_Farnesol100 = gp.enrichr(gene_list=gene_list_Farnesol_100uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:10:49,806 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:10:51,345 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:10:51,762 [INFO] Done.

enr_bg_Farnesol100.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	0.002155	0.018264	44.291807	271.950920	IGFBP1;ANGPTL4
1	WikiPathways_2024_Human	Familial Hyperlipidemia Type 1 WP5108	0.004236	0.018264	312.703125	1708.655726	ANGPTL4
2	WikiPathways_2024_Human	Effect Of Progerin On Genes Involved In Proger...	0.005479	0.018264	238.190476	1240.213846	CBX5
3	WikiPathways_2024_Human	Photodynamic Therapy Induced HIF 1 Survival Si...	0.008705	0.020383	147.022059	697.444064	IGFBP1
4	WikiPathways_2024_Human	Aryl Hydrocarbon Receptor Pathway WP2873	0.010192	0.020383	124.931250	572.957193	IGFBP1

enr_bg_Tpdioxin100 = gp.enrichr(gene_list=gene_list_Tpdioxin_100uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:11:14,809 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:11:16,703 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:11:17,357 [INFO] Done.

enr_bg_Tpdioxin100.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	7.261196e-09	0.000003	9.497582	177.991541	JUN;SERPINB2;CCL20;VDR;SLC7A11;CYP2C9;SLC6A6;S...
1	WikiPathways_2024_Human	Estrogen Receptor Pathway WP2881	5.298219e-07	0.000096	85.946063	1241.982912	JUN;GPAM;CYP1A1;CYP1B1
2	WikiPathways_2024_Human	Aryl Hydrocarbon Receptor Pathway WP2873	2.543356e-06	0.000308	27.084695	348.905753	SLC7A5;JUN;SERPINB2;CYP1A1;CYP1B1
3	WikiPathways_2024_Human	Non Genomic Actions Of 1 25 Dihydroxyvitamin D...	4.451791e-05	0.003314	14.315888	143.439745	JUN;IFNGR1;VDR;CCL2;JAK1
4	WikiPathways_2024_Human	Vitamin D Receptor Pathway WP2877	4.628739e-05	0.003314	8.135235	81.194863	CYP2C9;SULT1C2;SLC37A2;GADD45A;VDR;HSD17B2;CYP1A1

enr_bg_Tpdioxin10 = gp.enrichr(gene_list=gene_list_Tpdioxin_10uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:11:41,764 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:11:43,291 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:11:43,886 [INFO] Done.

enr_bg_Tpdioxin10.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Sulindac Metabolic Pathway WP2542	0.000250	0.001947	inf	inf	CYP1B1
1	WikiPathways_2024_Human	Benzo A Pyrene Metabolism WP696	0.000449	0.001947	inf	inf	CYP1B1
2	WikiPathways_2024_Human	Estrogen Metabolism WP697	0.000599	0.001947	inf	inf	CYP1B1
3	WikiPathways_2024_Human	Estrogen Receptor Pathway WP2881	0.000649	0.001947	inf	inf	CYP1B1
4	WikiPathways_2024_Human	Tamoxifen Metabolism WP691	0.000849	0.002036	inf	inf	CYP1B1
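
The inf values in the Odds Ratio column above are consistent with a division by zero in the odds ratio of a 2x2 overlap table, which happens, for example, when every gene in the input list lies inside the pathway. The sketch below illustrates this with the textbook formula. It is an assumption that GSEApy computes the value in the same way, the function name odds_ratio is hypothetical, and the counts are invented purely for illustration.

```python
import math


def odds_ratio(hits, list_size, set_size, background_size):
    """Textbook odds ratio for a 2x2 overlap table (not GSEApy's own code)."""
    a = hits                                            # in list, in pathway
    b = list_size - hits                                # in list, not in pathway
    c = set_size - hits                                 # not in list, in pathway
    d = background_size - list_size - set_size + hits   # in neither
    if b * c == 0:
        # A zero cell in the denominator makes the ratio infinite (or undefined).
        return math.inf if a * d > 0 else math.nan
    return (a * d) / (b * c)


# Hypothetical counts: a one-gene list whose only gene lies in a 10-gene pathway.
print(odds_ratio(hits=1, list_size=1, set_size=10, background_size=20000))

# Hypothetical counts: 2 of 20 listed genes overlap a 10-gene pathway.
print(odds_ratio(hits=2, list_size=20, set_size=10, background_size=20000))
```

Rows with an infinite odds ratio rest on very few genes and are best interpreted with caution.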

enr_bg_Tpdioxin1 = gp.enrichr(gene_list=gene_list_Tpdioxin_1uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:12:13,318 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:12:14,897 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:12:15,287 [INFO] Done.

enr_bg_Tpdioxin1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Benzo A Pyrene Metabolism WP696	0.000004	0.000084	1144.000000	14288.986904	CYP1A1;CYP1B1
1	WikiPathways_2024_Human	Estrogen Metabolism WP697	0.000007	0.000084	800.680000	9515.865341	CYP1A1;CYP1B1
2	WikiPathways_2024_Human	Estrogen Receptor Pathway WP2881	0.000008	0.000084	727.854545	8528.883211	CYP1A1;CYP1B1
3	WikiPathways_2024_Human	Tamoxifen Metabolism WP691	0.000014	0.000110	533.653333	5956.934069	CYP1A1;CYP1B1
4	WikiPathways_2024_Human	Estrogen Metabolism WP5276	0.000022	0.000136	421.221053	4519.180649	HSD17B2;CYP1B1

enr_bg_Troglitazone100 = gp.enrichr(gene_list=gene_list_Troglitazone_100uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:12:43,723 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:12:45,964 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:12:46,585 [INFO] Done.

enr_bg_Troglitazone100.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	7.734291e-19	6.334385e-16	2.459218	102.557878	ATF2;TRAF3IP2;ARPC5L;ICAM1;BSG;AKT1;LUC7L;UBAP...
1	WikiPathways_2024_Human	Sterol Regulatory Element Binding Proteins SRE...	5.200839e-12	2.129744e-09	5.581778	145.026881	SCARB1;IDI1;SEC23A;PRKAA1;SAR1A;SAR1B;INSIG2;I...
2	WikiPathways_2024_Human	EGF EGFR Signaling WP437	5.388344e-11	1.471018e-08	2.888704	68.301099	USP6NL;ATF1;SH3KBP1;INPPL1;PTEN;PIK3C2B;EPS8;R...
3	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	1.127974e-08	2.309527e-06	1.980242	36.238945	KEAP1;IRS2;AHR;NR3C1;RGS2;SCP2;CCND1;FTH1;PDK4...
4	WikiPathways_2024_Human	mRNA Processing WP411	3.119745e-08	5.110143e-06	2.926105	50.571666	CELF2;HNRNPU;EFTUD2;PTBP1;SNRNP70;RBM17;SNRPN;...

enr_bg_Troglitazone10 = gp.enrichr(gene_list=gene_list_Troglitazone_10uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:13:03,670 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:13:05,315 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:13:05,773 [INFO] Done.

enr_bg_Troglitazone10.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Nuclear Receptors Meta Pathway WP2882	0.000002	0.000284	11.275487	149.727096	ABCB4;PDK4;CYP1A1;ANKRD1;HMOX1;PCK1;CYP3A5;SQSTM1
1	WikiPathways_2024_Human	Estrogen Receptor Pathway WP2881	0.000006	0.000481	113.026415	1362.835594	PDK4;CYP1A1;PCK1
2	WikiPathways_2024_Human	PPAR Signaling WP3942	0.000035	0.001919	24.709677	253.749828	FABP4;ACSL5;LPL;PCK1
3	WikiPathways_2024_Human	Familial Partial Lipodystrophy WP5102	0.000072	0.002979	43.436865	414.465543	FABP4;LPL;CIDEC
4	WikiPathways_2024_Human	Novel Intracellular Components Of RIG I Like R...	0.000490	0.015412	21.690131	165.321389	CXCL10;IRF7;TRIM25

enr_bg_Troglitazone1 = gp.enrichr(gene_list=gene_list_Troglitazone_1uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:13:29,699 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:13:31,704 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:13:32,485 [INFO] Done.

enr_bg_Troglitazone1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	Sterol Regulatory Element Binding Proteins SRE...	6.666081e-10	4.932900e-07	5.442782	114.999553	SCARB1;SEC23A;SAR1A;SAR1B;LPL;DBI;HMGCR;MED15;...
1	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	2.501338e-08	9.254949e-06	2.093132	36.637874	ITGB1;CTNND1;SRP54;FHL2;PLOD3;HMGB1;CSRP1;CAPZ...
2	WikiPathways_2024_Human	Retinoblastoma Gene In Cancer WP2446	4.175683e-08	1.030002e-05	4.058219	68.954832	RB1;CDKN1A;DNMT1;PCNA;PRKDC;PRIM1;HMGB2;TTK;HM...
3	WikiPathways_2024_Human	Proteasome Degradation WP183	3.348264e-07	6.194289e-05	4.678952	69.761560	PSMD10;PSMD12;PSMD11;RPN2;UBA7;HLA-B;HLA-C;HLA...
4	WikiPathways_2024_Human	EGFR Tyrosine Kinase Inhibitor Resistance WP4806	1.459952e-06	2.160729e-04	3.555785	47.779459	SHC1;ARAF;TGFA;PDGFA;PIK3R1;FOXO3;EGFR;IGF1R;C...

enr_bg_ValproicAcid1 = gp.enrichr(gene_list=gene_list_Valproicacid_1uM,
                 gene_sets=['WikiPathways_2024_Human'],
                 organism='human',
                 background= Backgroundgenelist,
                 outdir='GEOD69851_ORApathwaytable', 
                 verbose=True)

2025-04-16 10:14:04,617 [INFO] Run: WikiPathways_2024_Human 
2025-04-16 10:14:06,619 [INFO] Save enrichment results for WikiPathways_2024_Human 
2025-04-16 10:14:07,063 [INFO] Done.

enr_bg_ValproicAcid1.results.head(5)

	Gene_set	Term	P-value	Adjusted P-value	Odds Ratio	Combined Score	Genes
0	WikiPathways_2024_Human	VEGFA VEGFR2 Signaling WP3888	1.815291e-08	0.000014	1.996050	35.578462	TRAF3IP2;NCF2;ARPC5L;ACTG1;KDR;LUC7L;PRKCI;MEF...
1	WikiPathways_2024_Human	Glioblastoma Signaling WP2261	7.281428e-07	0.000208	3.438640	48.597498	RB1;CDKN1A;IRS1;ARAF;PIK3CD;PIK3R1;FOXO4;PIK3C...
2	WikiPathways_2024_Human	Ciliopathies WP4803	8.319336e-07	0.000208	2.603973	36.454357	GALNT11;TRAF3IP1;ARL6;ARL3;PIK3R4;CCDC28B;SPAT...
3	WikiPathways_2024_Human	MAPK And NFkB Signaling Inhibited By Yersinia ...	2.684536e-06	0.000504	18.770931	240.793562	NFKBIA;IKBKB;MAP3K1;CHUK;MAPK1;IKBKG;RAF1;NFKB...
4	WikiPathways_2024_Human	TNF Alpha Signaling WP231	4.860513e-06	0.000730	3.036230	37.146348	DIABLO;PYGL;IKBKB;MAPK9;MAPK8;NRAS;MAPK1;RIPK1...

Section 6.4: Saving plots of ORA

In this section, we create and save the ORA plots for each comparison.

Step 59: We create the ORA barplot for each condition using the commands below. The 'ofname' argument sets the file name under which each figure is saved on your computer. A dotplot cannot be created for this comparison because it would be too large.

ax_BisphenolA_c100 = barplot(enr_bg_BisphenolA100.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Bisphenol A 100uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Bisphenol A 100uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
              )

ax_BisphenolA_c10 = barplot(enr_bg_BisphenolA10.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Bisphenol A 10uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Bisphenol A 10uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_BisphenolA_c1= barplot(enr_bg_BisphenolA1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Bisphenol A 1uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Bisphenol A 1uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Farnesol_c100 = barplot(enr_bg_Farnesol100.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Farnesol 100uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Farnesol 100uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Tpdioxin_c100 = barplot(enr_bg_Tpdioxin100.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'tetrachlorodibenzopdioxin 100uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='tetrachlorodibenzopdioxin 100uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Tpdioxin_c10 = barplot(enr_bg_Tpdioxin10.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'tetrachlorodibenzopdioxin 10uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='tetrachlorodibenzopdioxin 10uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Tpdioxin_c1 = barplot(enr_bg_Tpdioxin1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'tetrachlorodibenzopdioxin 1uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='tetrachlorodibenzopdioxin 1uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Troglitazone_c100 = barplot(enr_bg_Troglitazone100.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Troglitazone 100uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Troglitazone 100uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Troglitazone_c10 = barplot(enr_bg_Troglitazone10.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Troglitazone 10uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Troglitazone 10uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Troglitazone_c1 = barplot(enr_bg_Troglitazone1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Troglitazone 1uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Troglitazone 1uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )

ax_Valproicacid_c1 = barplot(enr_bg_ValproicAcid1.results,
              column="Adjusted P-value",
              group='Gene_set', 
              size=10,
              title= 'Valproic acid 1uM ORA',
              top_term=10,
              figsize=(3,5),
              ofname='Valproic acid 1uM ORA',
              color = {'WikiPathways_2024_Human': 'darkred'}
             )
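
The barplot calls above differ only in the results table, the title and the output file name, so they could also be generated in a loop. The sketch below is one possible way to do this and is not the notebook's own method. The helper names shared_barplot_kwargs and plot_all are hypothetical, and the plotting function is passed in as an argument so that the sketch can be run without GSEApy installed.

```python
def shared_barplot_kwargs(title):
    """Keyword arguments shared by all barplot calls in this section."""
    return dict(
        column="Adjusted P-value",
        group="Gene_set",
        size=10,
        title=title,
        top_term=10,
        figsize=(3, 5),
        ofname=title,  # the figure is saved under the same name as its title
        color={"WikiPathways_2024_Human": "darkred"},
    )


def plot_all(results_by_condition, plot_func):
    """Call plot_func once per condition and return the titles that were plotted."""
    plotted = []
    for condition, results in results_by_condition.items():
        title = f"{condition} ORA"
        plot_func(results, **shared_barplot_kwargs(title))
        plotted.append(title)
    return plotted


# Stand-in plotting function so the sketch runs on its own. In the notebook,
# pass gseapy's barplot and the real .results tables instead, for example:
# plot_all({"Farnesol 100uM": enr_bg_Farnesol100.results}, barplot)
saved_names = []


def fake_plot(results, **kwargs):
    saved_names.append(kwargs["ofname"])


titles = plot_all({"Farnesol 100uM": None, "Troglitazone 1uM": None}, fake_plot)
print(titles)
```

Deriving the title and the file name from a single string keeps the two in step for every condition.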

Section 7: Metadata

Step 60: Finally, the metadata for this Jupyter Notebook is displayed. It lists the package versions and the system set-up for interested users. This requires the packages watermark and print_versions.

%load_ext watermark
!pip install print-versions

Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)

%watermark

    Last updated: 2025-06-02T13:18:59.342286+02:00
    
    Python implementation: CPython
    Python version       : 3.12.3
    IPython version      : 8.25.0
    
    Compiler    : MSC v.1938 64 bit (AMD64)
    OS          : Windows
    Release     : 11
    Machine     : AMD64
    Processor   : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
    CPU cores   : 8
    Architecture: 64bit

from print_versions import print_versions
print_versions(globals())

    pandas==2.2.3
    json==2.0.9
    ipykernel==6.28.0
    gseapy==1.1.4
    numpy==1.26.4
    matplotlib==3.8.4
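
As an alternative that needs no extra installation, package versions can also be read with the Python standard library. The sketch below uses importlib.metadata. The helper name report_versions is hypothetical, and packages that are not installed are reported as such instead of raising an error.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return one 'name==version' line per installed distribution name."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(f"{name}: not installed")
    return lines


for line in report_versions(["pandas", "gseapy", "numpy", "matplotlib"]):
    print(line)
```

The printed versions depend on the local environment and may differ from the versions listed above.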