Part 7: KE enrichment score analysis and benchmarking for dataset: E-MEXP-3583

The AOP project ► Key objective 2

Author: Shakira Agata

This Jupyter Notebook shows the steps for running the KE enrichment analysis and benchmarking it against Overrepresentation Analysis (ORA) for dataset E-MEXP-3583. This notebook is subdivided into nine sections:

  • Section 1: Creation of dictKE dictionary
  • Section 2: Creation of dictWP dictionary
  • Section 3: Creation of KEgenes dictionary
  • Section 4: Calculation of N variable
  • Section 5: Comparison 1: Ag+ 24H
    • Section 5.1: Calculation of n variable
    • Section 5.2: Calculation of variable B and variable b
    • Section 5.3: Calculation of enrichment score and hypergeometric p-value
    • Section 5.4: Filtering results
    • Section 5.5: Calculation of percent gene overlap
      • Section 5.5.1 Creation of significant KE table
      • Section 5.5.2 Significant ORA pathway table
      • Section 5.5.3 Creation of for loop
      • Section 5.5.4 Tabulation
      • Section 5.5.5 Percent overlap calculation
  • Section 6: Comparison 2: Ag+ 48H
    • Section 6.1: Calculation of n variable
    • Section 6.2: Calculation of variable B and variable b
    • Section 6.3: Calculation of enrichment score and hypergeometric p-value
    • Section 6.4: Filtering results
    • Section 6.5: Calculation of percent gene overlap
      • Section 6.5.1 Creation of significant KE table
      • Section 6.5.2 Significant ORA pathway table
      • Section 6.5.3 Creation of for loop
      • Section 6.5.4 Tabulation
      • Section 6.5.5 Percent overlap calculation
  • Section 7: Comparison 3: AgNP 24H
    • Section 7.1: Calculation of n variable
    • Section 7.2: Calculation of variable B and variable b
    • Section 7.3: Calculation of enrichment score and hypergeometric p-value
    • Section 7.4: Filtering results
    • Section 7.5: Calculation of percent gene overlap
      • Section 7.5.1 Creation of significant KE table
      • Section 7.5.2 Significant ORA pathway table
      • Section 7.5.3 Creation of for loop
      • Section 7.5.4 Tabulation
      • Section 7.5.5 Percent overlap calculation
  • Section 8: Comparison 4: AgNP 48H
    • Section 8.1: Calculation of n variable
    • Section 8.2: Calculation of variable B and variable b
    • Section 8.3: Calculation of enrichment score and hypergeometric p-value
    • Section 8.4: Filtering results
    • Section 8.5: Calculation of percent gene overlap
      • Section 8.5.1 Creation of significant KE table
      • Section 8.5.2 Significant ORA pathway table
      • Section 8.5.3 Creation of for loop
      • Section 8.5.4 Tabulation
      • Section 8.5.5 Percent overlap calculation
  • Section 9: Metadata

Section 1: Creation of dictKE dictionary

In this section, the dictKE dictionary is created. It is used to retrieve the first neighbors of the key events present in the inflammatory stress response pathway AOP network.

Step 1. First, the necessary packages and the inflammatory stress response pathway AOP network are loaded.

import pandas as pd
import numpy as np
from scipy.stats import hypergeom
import matplotlib.pyplot as plt
import scipy.stats as ss
import py4cytoscape as p4c
p4c.cytoscape_ping()
p4c.cytoscape_version_info()
You are connected to Cytoscape!





{'apiVersion': 'v1',
 'cytoscapeVersion': '3.10.1',
 'automationAPIVersion': '1.9.0',
 'py4cytoscapeVersion': '1.9.0'}
network=p4c.open_session('Agata,S.-Part4 Complete Molecular inflammation-process related AOP network.cys')
Opening C:\Users\shaki\Downloads\Agata,S.-Part4 Complete Molecular inflammation-process related AOP network.cys...

Step 2. Next, the nodetables are loaded in preparation for the dictionary creation.

nodetable=p4c.get_table_columns()
dataframe_for_dictKE=pd.read_excel('Nodetable-dictKE.xlsx')
df_corrected=pd.read_excel('nodetable-dictWP.xlsx').reset_index(drop=True)

Step 3. The KE ID and WP title columns are selected from the dataframe and the ID column is renamed to KEID, so that the key event IDs can serve as keys and the titles of the molecular pathways as values.

completedataframe_for_dictKE=dataframe_for_dictKE[['ID (KEID)','WPtitle']].copy()
complete_dataframe_for_dictKE=completedataframe_for_dictKE.rename(columns={"ID (KEID)":"KEID"})

Step 4. The dataframe is now converted into a list of records, i.e. one dictionary per row with the keys KEID and WPtitle.

dictKE= complete_dataframe_for_dictKE.to_dict('records')
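
The shape of the result of `to_dict('records')` can be illustrated with a minimal sketch. The two-row table, the KE identifiers and the pathway titles below are hypothetical stand-ins and are not taken from the node table.

```python
import pandas as pd

# Hypothetical stand-in for complete_dataframe_for_dictKE (values are made up).
toy_KE_table = pd.DataFrame({
    "KEID": ["KE:1", "KE:2"],
    "WPtitle": ["Toy pathway A", "Toy pathway B"],
})

# 'records' yields a list with one dictionary per row, not a single dictionary.
toy_dictKE = toy_KE_table.to_dict("records")

print(type(toy_dictKE).__name__)
print(toy_dictKE[0])
```

Each element can then be read with `record['KEID']` and `record['WPtitle']`, which is how the loop in Section 3 accesses the entries.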

Section 2: Creation of dictWP dictionary

In this section, the dictWP dictionary is created. For each molecular pathway mapped to the inflammatory stress response pathway AOP network, it contains the first neighbours of that pathway, i.e. its genes.

Step 5. First, the dataframe is created in which the molecular pathways mapped to the network are filtered.

df4= nodetable[nodetable['type'] == 'Molecular pathway']

Step 6. A duplicate network will be created for which we will create filters to only contain gene and molecular pathway nodes in the network in preparation for the dictWP creation. This requires a composite filter to exclude Molecular Initiating Event (MIE) nodes, Key Event (KE) nodes and Adverse Outcome (AO) nodes.

Clonednetwork_fordictWP= p4c.clone_network() 
p4c.rename_network('Cloned molecular inflammatory stress response pathway AOP network for dict WP')
{'network': 832855,
 'title': 'Cloned molecular inflammatory stress response pathway AOP network for dict WP'}
MIEfilter= p4c.create_column_filter('MIE filter', 'type', 'MIE', 'CONTAINS', network='Cloned molecular inflammatory stress response pathway AOP network for dict WP')
No edges selected.
KEfilter= p4c.create_column_filter('KE filter', 'type', 'KE', 'CONTAINS',network='Cloned molecular inflammatory stress response pathway AOP network for dict WP')
No edges selected.
AOfilter= p4c.create_column_filter('AO filter', 'type', 'AO', 'CONTAINS',network='Cloned molecular inflammatory stress response pathway AOP network for dict WP')
No edges selected.
combined_MIEKEAOfilter= p4c.create_composite_filter('MIE KE AO filter', ['MIE filter','KE filter','AO filter'],type='ANY',network='Cloned molecular inflammatory stress response pathway AOP network for dict WP')
No edges selected.

Step 7. The nodes selected by the composite filter are deleted so that only the molecular pathway nodes and gene nodes remain.

Deletednodes= p4c.delete_selected_nodes(network='Cloned molecular inflammatory stress response pathway AOP network for dict WP')

Step 8. A for loop is used to build the WP dictionary, which contains the WP titles as keys and the names of the genes as values. Due to the settings of the get_first_neighbors function, it is not possible to retrieve the gene IDs with this function.

name_list_WP=df_corrected['name'].tolist()
dictWP = {}

for name in name_list_WP:
    gene_neighbors_per_WP = p4c.get_first_neighbors(node_names=name, network='Cloned molecular inflammatory stress response pathway AOP network for dict WP', as_nested_list=False)
    dictWP[name] = gene_neighbors_per_WP

Section 3: Creation of KEgenes dictionary

In this section, dictKE and dictWP are matched so that each KE ID (the key from dictKE) is linked to its genes (the values from dictWP).

Step 9. KE_genes_dictionary holds the match between dictKE and dictWP: for each dictKE record, the genes from dictWP are added if the WPtitle of the record is present as a key in dictWP.

KE_genes_dictionary=[]

for KEID in dictKE:
    WPtitle= KEID['WPtitle']
    
    if WPtitle in dictWP:
        KEID['gene'] = dictWP[WPtitle]
    KE_genes_dictionary.append(KEID)
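
The matching logic of Step 9 can be sketched on toy data. All identifiers, pathway titles and gene names below are hypothetical. Unlike the loop above, this sketch copies each record before adding the genes, so the input list is left unchanged.

```python
# Hypothetical records and pathway-to-gene mapping (not taken from the network).
toy_dictKE = [
    {"KEID": "KE:1", "WPtitle": "Toy pathway A"},
    {"KEID": "KE:2", "WPtitle": "Toy pathway without genes"},
]
toy_dictWP = {"Toy pathway A": ["GENE1", "GENE2"]}

toy_KE_genes = []
for record in toy_dictKE:
    matched = dict(record)  # shallow copy so the input records stay unchanged
    if matched["WPtitle"] in toy_dictWP:
        matched["gene"] = toy_dictWP[matched["WPtitle"]]
    toy_KE_genes.append(matched)

print(toy_KE_genes)
```

Records whose pathway title is not a key of the mapping are kept without a `gene` entry, which later shows up as a missing value once the list is turned into a dataframe.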

Section 4: Calculation of N variable

In this section, variable N will be calculated per individual key event.

Step 10. First, the KEgenes dictionary is manipulated so that each gene is placed on an individual row. This requires the creation of a dataframe, adjustment of the column titles and explosion of the gene column.

first_dataframe=pd.DataFrame.from_dict(KE_genes_dictionary)
df5=df4.rename(columns={'name':'WPtitle'})
first_dataframe1=pd.merge(first_dataframe, df5, on='WPtitle')
second_dataframe=first_dataframe1.explode('gene')
second_dataframe1 = second_dataframe.drop(columns=['selected','AverageShortestPathLength','BetweennessCentrality','ClosenessCentrality','ClusteringCoefficient','group','type','Association type','CTL.Ext','CTL.Type','CTL.PathwayName','CTL.label','CTL.PathwayID','CTL.GeneName','CTL.GeneID','Eccentricity','EdgeCount','Indegree','IsSingleNode','NeighborhoodConnectivity','Outdegree','PartnerOfMultiEdgedNodePairs','SelfLoops','Stress','id','SUID'], axis=1)
second_dataframe1_reordered = second_dataframe.loc[:, ['KEID', 'WPtitle', 'shared name','gene']]
third_dataframe=second_dataframe1_reordered.rename(columns={'shared name':'ID'})

Step 11. The gene IDs belonging to the gene symbols are retrieved from the node table and merged with the previous dataframe, third_dataframe. The resulting dataframe contains all needed columns: KEID, WPtitle, WPID, gene symbol and gene ID.

df6= nodetable[nodetable['CTL.Type'] == 'gene']
df7=df6.rename(columns={'shared name':'gene'})
df8=df7.drop(columns=['name','selected','AverageShortestPathLength','BetweennessCentrality','ClosenessCentrality','ClusteringCoefficient','group','type','Association type','CTL.Ext','CTL.Type','CTL.PathwayName','CTL.label','CTL.PathwayID','CTL.GeneName','Eccentricity','EdgeCount','Indegree','IsSingleNode','NeighborhoodConnectivity','Outdegree','PartnerOfMultiEdgedNodePairs','SelfLoops','Stress','id','SUID'], axis=1)
mergeddataframe_gene=pd.merge(third_dataframe, df8, on='gene')
mergeddataframe_final=mergeddataframe_gene.rename(columns={'CTL.GeneID':'Entrez.Gene'})

Step 12. The following for loop calculates the N variable. It iterates over each row of the dataframe and counts the number of gene rows belonging to each individual Key Event. A gene that is linked to the same Key Event through more than one molecular pathway is counted once per pathway.

variable_N_dictionary_count= {}

for index, row in mergeddataframe_final.iterrows():
    unique_KE = row['KEID']
    gene = row['Entrez.Gene']
    
    if unique_KE not in variable_N_dictionary_count:
        variable_N_dictionary_count[unique_KE] = 1
    else:
        variable_N_dictionary_count[unique_KE] += 1

print("The total is: ")
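
The row count per Key Event can also be written with `groupby`. The sketch below uses a hypothetical toy table and contrasts the row count with a count of unique genes per Key Event. Which of the two is the intended definition of N is an assumption left to the reader; the toy table only shows where they differ.

```python
import pandas as pd

# Hypothetical table: gene "10" reaches KE:1 through two different pathways.
toy_table = pd.DataFrame({
    "KEID": ["KE:1", "KE:1", "KE:1", "KE:2"],
    "WPtitle": ["Toy pathway A", "Toy pathway A", "Toy pathway B", "Toy pathway B"],
    "Entrez.Gene": ["10", "20", "10", "30"],
})

# Same result as the row-by-row counting loop: one count per row.
rows_per_KE = toy_table.groupby("KEID").size()

# Alternative: each gene counted once per Key Event.
unique_genes_per_KE = toy_table.groupby("KEID")["Entrez.Gene"].nunique()

print(rows_per_KE.to_dict())
print(unique_genes_per_KE.to_dict())
```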

Step 13. The output of the dictionary will be converted into a dataframe and merged to the mergeddataframe_final dataframe to add the results into a separate column.

fourth_dataframe=pd.DataFrame.from_dict(variable_N_dictionary_count,orient='index')
df_reset = fourth_dataframe.reset_index()
df_reset.columns = ['KEID', 'N']
merged_dataframe= pd.merge(mergeddataframe_final, df_reset, on='KEID')
mergeddataframe=merged_dataframe.rename(columns={'ID':'WPID','gene':'Gene.Symbol'})

Section 5. Comparison 1: Ag+ 24H

Section 5.1 Calculation of n variable

In this section, variable n will be calculated for comparison 1.

Step 14. The table containing the differentially expressed genes (DEGs) relative to control is loaded and filtered for significance (adjusted p-value < 0.05).

Ag24H_DEG= pd.read_csv('topTable_Ag._.1.3_24 - H2O.control_.0.0_24.tsv',sep='\t')
Ag_24H_DEG= Ag24H_DEG[Ag24H_DEG['adj. p-value'] < 0.05]
Ag_24H_DEG = Ag_24H_DEG.copy()  
Ag_24H_DEG.rename(columns={Ag_24H_DEG.columns[0]: 'Entrez.Gene'}, inplace=True)
Ag_24H_DEG['Entrez.Gene'] = Ag_24H_DEG['Entrez.Gene'].astype(str)

Step 15. Here, the results of the DEG table are integrated into the mergeddataframe dataframe. This is followed by adjustment of the dataframe columns to remove non-relevant columns.

merged_dataframe_DEG= pd.merge(mergeddataframe,Ag_24H_DEG, on='Entrez.Gene')
mergeddataframeDEG= merged_dataframe_DEG.drop(['meanExpr'], axis=1)
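
The cast to `str` in Step 14 matters for this merge, because both key columns must have the same type. The following is a minimal sketch on hypothetical toy tables, assuming that the gene IDs in the DEG file are read as integers while the IDs from the node table are text.

```python
import pandas as pd

# Hypothetical KE-gene table with text IDs, as they would come from a node table.
toy_KE_genes = pd.DataFrame({
    "KEID": ["KE:1", "KE:1", "KE:2"],
    "Entrez.Gene": ["10", "20", "30"],
})

# Hypothetical DEG table with integer IDs, as read from a numeric column.
toy_DEG = pd.DataFrame({
    "Entrez.Gene": [10, 30, 99],
    "adj. p-value": [0.01, 0.20, 0.03],
})

# Keep significant genes, then align the key type before merging.
toy_DEG_sig = toy_DEG[toy_DEG["adj. p-value"] < 0.05].copy()
toy_DEG_sig["Entrez.Gene"] = toy_DEG_sig["Entrez.Gene"].astype(str)

# The default inner join keeps only genes present in both tables.
toy_merged = pd.merge(toy_KE_genes, toy_DEG_sig, on="Entrez.Gene")
print(toy_merged)
```

Because the merge is an inner join, a Key Event without any significant gene does not appear in the merged table at all.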

Step 16. The following for loop over the key events is run to retrieve the n variable. It is comparable to the for loop for N, but adds a condition so that a gene is counted only when its adjusted p-value is smaller than 0.05.

variable_n_dictionary_count= {}

for index, row in mergeddataframeDEG.iterrows():
    unique_KE = row['KEID']
    gene_expression_value = row['adj. p-value']

    if gene_expression_value < 0.05:
    
        if unique_KE not in variable_n_dictionary_count:
            variable_n_dictionary_count[unique_KE] = 1
        else:
            variable_n_dictionary_count[unique_KE] += 1

print("The total number of significant genes: ")

Step 17. The output of the n variable dictionary is saved as a dataframe and integrated as a separate column into a dataframe.

n_variable_dataframe=pd.DataFrame.from_dict(variable_n_dictionary_count,orient='index')
n_variable_dataframe_reset = n_variable_dataframe.reset_index()
n_variable_dataframe_reset.columns = ['KEID', 'n']
merged_dataframe2= pd.merge(mergeddataframeDEG, n_variable_dataframe_reset, on='KEID')

Section 5.2. Calculation of variable B and variable b.

In this section, variable B and variable b are calculated.

Step 18. Variable B is calculated by taking the length of the unfiltered DEG table, which includes all genes measured in this comparison.

B=len(Ag24H_DEG.index)
B
20518

Step 19. Variable b is calculated by taking the length of the same DEG table after applying the condition for significance (adjusted p-value < 0.05).

Ag_24H_DEG_filtered=Ag_24H_DEG[Ag_24H_DEG['adj. p-value'] < 0.05]
b=len(Ag_24H_DEG_filtered)
b
127

Section 5.3. Calculation of enrichment score and hypergeometric p-value

In this section, the enrichment score and hypergeometric p-value will be calculated. The four variables are first collected per KE in one dataframe, after which the enrichment score, (n/N)/(b/B), is calculated per KE and stored in an additional column.

Step 20. The final dataframe will be created that contains the KEID and the four variables: variable N, variable n, variable B and variable b.

Final_dataframe_ES= merged_dataframe2.loc[:, ['KEID','N','n']]
Final_dataframe_ES['B']=B
Final_dataframe_ES['b']=b
Final_Dataframe_ES=Final_dataframe_ES.drop_duplicates(subset=['KEID'],keep='first')
Final_Dataframe_ES.reset_index(drop=True,inplace=True)
Copy_Final_DataFrame_ES=Final_Dataframe_ES.copy()

Step 21. The following function calculates the enrichment score for each individual key event. It is applied row by row and the results are saved as a separate column in the dataframe.

def calculate_Enrichment_Score(row):
        return f"{(row['n']/row['N'])/(row['b']/row['B'])}"

Copy_Final_DataFrame_ES.loc[:,'Enrichmentscore']= Copy_Final_DataFrame_ES.apply(calculate_Enrichment_Score,axis=1)
Copy_Final_DataFrame_ES

    KEID                                     N     n   B      b    Enrichmentscore
0   https://identifiers.org/aop.events/1495  253   1   20518  127  0.6385733403877875
1   https://identifiers.org/aop.events/1668  156   1   20518  127  1.0356349687058348
2   https://identifiers.org/aop.events/244   417   1   20518  127  0.38743178685398133
3   https://identifiers.org/aop.events/41    275   1   20518  127  0.5874874731567645
4   https://identifiers.org/aop.events/1539  170   1   20518  127  0.9503473830477073
5   https://identifiers.org/aop.events/618   240   1   20518  127  0.6731627296587926
6   https://identifiers.org/aop.events/1497  528   6   20518  127  1.835898353614889
7   https://identifiers.org/aop.events/1115  34    3   20518  127  14.25521074571561
8   https://identifiers.org/aop.events/1917  166   1   20518  127  0.9732473199886159
9   https://identifiers.org/aop.events/1633  1056  12  20518  127  1.835898353614889
10  https://identifiers.org/aop.events/1392  102   9   20518  127  14.25521074571561
11  https://identifiers.org/aop.events/1582  51    2   20518  127  6.335649220318048
12  https://identifiers.org/aop.events/1896  205   1   20518  127  0.7880929517956596
13  https://identifiers.org/aop.events/265   268   3   20518  127  1.8084968856504875
14  https://identifiers.org/aop.events/1750  528   6   20518  127  1.835898353614889
15  https://identifiers.org/aop.events/1848  195   1   20518  127  0.8285079749646679
16  https://identifiers.org/aop.events/890   34    3   20518  127  14.25521074571561
17  https://identifiers.org/aop.events/149   1056  12  20518  127  1.835898353614889
18  https://identifiers.org/aop.events/1579  353   2   20518  127  0.9153487542102563
19  https://identifiers.org/aop.events/249   34    3   20518  127  14.25521074571561
20  https://identifiers.org/aop.events/288   51    1   20518  127  3.167824610159024
21  https://identifiers.org/aop.events/209   617   5   20518  127  1.3092305925292562
22  https://identifiers.org/aop.events/1945  1218  4   20518  127  0.5305716095832849
23  https://identifiers.org/aop.events/1087  528   6   20518  127  1.835898353614889
24  https://identifiers.org/aop.events/1538  34    3   20518  127  14.25521074571561
25  https://identifiers.org/aop.events/341   10    1   20518  127  16.155905511811024
26  https://identifiers.org/aop.events/1090  459   2   20518  127  0.7039610244797833
27  https://identifiers.org/aop.events/352   398   3   20518  127  1.2177818224983183
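
Because calculate_Enrichment_Score returns an f-string, the Enrichmentscore column holds text rather than numbers. A minimal sketch of a vectorised alternative that keeps the column numeric is shown below. It applies the same formula, (n/N)/(b/B), to two rows copied from the table above; the name `toy_ES` is introduced only for this illustration.

```python
import pandas as pd

# Two rows copied from the table above (KE 1115 and KE 341).
toy_ES = pd.DataFrame({
    "KEID": ["https://identifiers.org/aop.events/1115",
             "https://identifiers.org/aop.events/341"],
    "N": [34, 10],
    "n": [3, 1],
    "B": [20518, 20518],
    "b": [127, 127],
})

# Column-wise arithmetic returns floats, so later comparisons are numeric.
toy_ES["Enrichmentscore"] = (toy_ES["n"] / toy_ES["N"]) / (toy_ES["b"] / toy_ES["B"])

print(toy_ES[["KEID", "Enrichmentscore"]])
print(toy_ES["Enrichmentscore"].dtype)
```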

Step 22. The following for loop calculates the hypergeometric p-value for each individual Key Event and saves the result as a separate column in the dataframe. The value is obtained with pmf(k), the probability of observing exactly k significant genes among the N genes of a Key Event. This requires some intermediate steps to manipulate the dataframe.

p_value_dataframe=[]

for index, row in Copy_Final_DataFrame_ES.iterrows():

        M = row['B'] 
        n = row['b']
        N = row['N'] 
        k = row['n'] 

        hpd = ss.hypergeom(M, n, N)
        p = hpd.pmf(k)
        p_value_dataframe.append(p)
             
Hypergeometricpvalue_dataframe=pd.DataFrame(p_value_dataframe)
Hypergeometricpvalue_dataframe.columns= ['Hypergeometric p-value']
merged_finaltable=pd.concat([Copy_Final_DataFrame_ES,Hypergeometricpvalue_dataframe],axis=1)
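
For comparison, over-representation tests commonly report the upper tail of the hypergeometric distribution, i.e. the probability of observing k or more significant genes, rather than exactly k. The sketch below computes both quantities for one Key Event, using the parameter values of KE 1115 from the table in Step 21. It is an illustration of the difference and not a replacement of the calculation used in this notebook.

```python
from scipy.stats import hypergeom

# Parameters of KE 1115: B = 20518 genes in total, b = 127 significant genes,
# N = 34 genes linked to the KE, n = 3 of them significant.
M, n_success, N_draws, k = 20518, 127, 34, 3

# Probability of exactly k significant genes (what pmf returns).
exact_k = hypergeom.pmf(k, M, n_success, N_draws)

# Probability of k or more significant genes: sf(k - 1) = P(X > k - 1).
upper_tail = hypergeom.sf(k - 1, M, n_success, N_draws)

print(f"P(X = {k})  = {exact_k:.6e}")
print(f"P(X >= {k}) = {upper_tail:.6e}")
```

The upper tail is always at least as large as the single-point probability, so a Key Event that passes a 0.05 threshold on `pmf` may or may not pass it on `sf(k - 1)`.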

Section 5.4. Filtering the results for significant KEs

In this section, the results will be filtered to only include significant KEs. Significant KEs have an enrichment score above 1 and a hypergeometric p-value below 0.05.

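Step 23. Lastly, the results are filtered to show the significant KEs for comparison 1.

# The enrichment score is stored as text, so it is converted to a number before the comparison;
# .copy() makes the filtered table independent of merged_finaltable.
filteredversion= merged_finaltable[(pd.to_numeric(merged_finaltable['Enrichmentscore']) > 1) & (merged_finaltable['Hypergeometric p-value'] < 0.05)].copy()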
filteredversion

    KEID                                     N     n   B      b    Enrichmentscore    Hypergeometric p-value
7   https://identifiers.org/aop.events/1115  34    3   20518  127  14.25521074571561  1.148290e-03
9   https://identifiers.org/aop.events/1633  1056  12  20518  127  1.835898353614889  1.689462e-02
10  https://identifiers.org/aop.events/1392  102   9   20518  127  14.25521074571561  1.337105e-08
11  https://identifiers.org/aop.events/1582  51    2   20518  127  6.335649220318048  3.591109e-02
16  https://identifiers.org/aop.events/890   34    3   20518  127  14.25521074571561  1.148290e-03
17  https://identifiers.org/aop.events/149   1056  12  20518  127  1.835898353614889  1.689462e-02
19  https://identifiers.org/aop.events/249   34    3   20518  127  14.25521074571561  1.148290e-03
24  https://identifiers.org/aop.events/1538  34    3   20518  127  14.25521074571561  1.148290e-03
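
Since the Enrichmentscore column holds text (the function in Step 21 returns an f-string), the threshold has to be applied to numeric values. Comparing strings is lexicographic and can misclassify values written in scientific notation. The sketch below shows the difference on three hypothetical score strings.

```python
import pandas as pd

# Hypothetical scores stored as text; "1e-05" is how Python prints 0.00001.
toy_scores = pd.Series(["0.64", "14.26", "1e-05"])

# Lexicographic comparison: "1e-05" sorts after "1" and is wrongly kept.
as_text = toy_scores > "1"

# Numeric comparison after conversion: only 14.26 is above 1.
as_number = pd.to_numeric(toy_scores) > 1

print(as_text.tolist())
print(as_number.tolist())
```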
# Ensure numeric types
filteredversion['Hypergeometric p-value'] = pd.to_numeric(filteredversion['Hypergeometric p-value'], errors='coerce')
filteredversion['Enrichmentscore'] = pd.to_numeric(filteredversion['Enrichmentscore'], errors='coerce')
filteredversion['combined_score'] = -np.log10(filteredversion['Hypergeometric p-value']) * filteredversion['Enrichmentscore']

# Sort by combined score (highest first)
C1_sorted = filteredversion.sort_values(by='combined_score', ascending=False)

# Save the ranked table to Excel
C1_sorted.to_excel('ConsistentKE-Ag+-24h.xlsx')

Section 5.5. Calculation of percent gene overlap to ORA

Section 5.5.1 Creation of the significant KEs table

In this section, you merge the dataframes to retrieve the genes connected to only the significant KEs.

Step 24. The significant KE table is created by selecting the significant KEs from the previous dataframe, mergeddataframe_final.

# KEIDs of the significant KEs from Step 23
significant_KEIDs=['https://identifiers.org/aop.events/1115',
                   'https://identifiers.org/aop.events/1633',
                   'https://identifiers.org/aop.events/1392',
                   'https://identifiers.org/aop.events/1582',
                   'https://identifiers.org/aop.events/890',
                   'https://identifiers.org/aop.events/149',
                   'https://identifiers.org/aop.events/249',
                   'https://identifiers.org/aop.events/1538']
significantKEID_genetable=mergeddataframe_final[mergeddataframe_final['KEID'].isin(significant_KEIDs)]
significantKEIDgenetable=significantKEID_genetable.drop(columns=['WPtitle','ID'])

Section 5.5.2 Significant ORA pathway table plus splitting

In this section, the significant ORA pathway table is created.

Step 25. The significant ORA pathway table is created using the significantly enriched pathways identified in the ORA analysis. This requires data manipulation to restructure the table so that the individual genes of the enriched pathways are placed on individual rows.

# Load the Enrichr ORA report and keep the pathways with an adjusted p-value below 0.05
datafile_ORA = pd.read_csv('WikiPathways_2024_Human.human.enrichr.reports.txt', sep='\t')
datafileORA=pd.DataFrame(datafile_ORA)
filtereddatafileORA=datafileORA[datafileORA['Adjusted P-value'] < 0.05]
# Make sure 'Combined Score' is numeric
datafileORA['Combined Score'] = pd.to_numeric(datafileORA['Combined Score'], errors='coerce')

# Sort by 'Combined Score' in descending order
ranked_df = datafileORA.sort_values(by='Combined Score', ascending=False)

# (Optional) Save to Excel
ranked_df.to_excel('Ag24H-ORAtable-thesis(EMEXP3583).xlsx', index=False)
dropped_datafileORA_df=filtereddatafileORA.drop(['Adjusted P-value','Odds Ratio','Old P-value','Gene_set','P-value','Old adjusted P-value','Combined Score'],axis=1)
droppeddatafileORAdf=dropped_datafileORA_df.copy()
droppeddatafileORAdf['Genes']= droppeddatafileORAdf['Genes'].replace({';':','},regex=True)
df_ORApathwaytable=droppeddatafileORAdf.copy()
df_ORApathwaytable['Genes'] = df_ORApathwaytable['Genes'].astype(str)
df_ORApathwaytable['Genes'] = df_ORApathwaytable['Genes'].str.split(',')
exploded_df_ORApathwaytable = df_ORApathwaytable.explode('Genes', ignore_index=True)
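
The restructuring in Step 25 can be sketched on a hypothetical two-pathway table. The pathway titles and gene names are made up; the column names `Term` and `Genes` follow the ORA table used above, and the semicolon is the separator that the code above replaces.

```python
import pandas as pd

# Hypothetical ORA table: one row per pathway, genes joined by semicolons.
toy_ORA = pd.DataFrame({
    "Term": ["Toy pathway A", "Toy pathway B"],
    "Genes": ["GENE1;GENE2;GENE3", "GENE2"],
})

# Split the gene string into a list, then place every gene on its own row.
toy_ORA["Genes"] = toy_ORA["Genes"].str.split(";")
toy_exploded = toy_ORA.explode("Genes", ignore_index=True)

# Group the rows back into one gene set per pathway.
toy_sets = toy_exploded.groupby("Term")["Genes"].apply(set).to_dict()

print(toy_exploded)
print(toy_sets)
```

Splitting directly on the semicolon gives the same rows as first replacing it with a comma and then splitting on the comma.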

Section 5.5.3 For loop to get overlapping genes

In this section, the number of overlapping genes between the significant enrichment score-based Key Events and the enriched pathways from ORA is calculated.

Step 26. Next, the significant KE table and the ORA pathway table are each converted into a dictionary in which the genes are grouped into one set per key (KEID or pathway term). A nested for loop then intersects every ORA gene set with every KE gene set.

ORA_gene_sets = exploded_df_ORApathwaytable.groupby('Term')['Genes'].apply(set).to_dict() 
SignificantKE_gene_sets = significantKEIDgenetable.groupby('KEID')['gene'].apply(set).to_dict()  
overlapping_genes_betweenORA_and_significantKEs = {}

for term, ORA_genes in ORA_gene_sets.items():
    for KEID, KEID_genes in SignificantKE_gene_sets.items():
        overlap = ORA_genes.intersection(KEID_genes)
        print(f"{term} x {KEID}: {len(overlap)} overlaps")
        overlapping_genes_betweenORA_and_significantKEs[(term, KEID)] = {
                'overlapping genes': overlap,
                'number of genes that overlap': len(overlap)
            }
if overlapping_genes_betweenORA_and_significantKEs:
    print("\ntitle of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:")
    for (term, KEID), result in overlapping_genes_betweenORA_and_significantKEs.items():
        print(f"Term: {term}, KEID: {KEID}, Title of overlapping gene(s): {result['overlapping genes']}, number: {result['number of genes that overlap']}")
else:
    print("No overlapping genes")
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1115: 1 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1392: 1 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/149: 0 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1538: 1 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1582: 0 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1633: 0 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/249: 1 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/890: 1 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/1115: 0 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/1392: 0 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/149: 0 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/1538: 0 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/1582: 0 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/1633: 0 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/249: 0 overlaps
Gastric Cancer Network 1 WP2361 x https://identifiers.org/aop.events/890: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/1115: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/1392: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/149: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/1538: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/1582: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/1633: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/249: 0 overlaps
Gastric Cancer Network 2 WP2363 x https://identifiers.org/aop.events/890: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1115: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1392: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/149: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1538: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1582: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1633: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/249: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/890: 0 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/1115: 2 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/1392: 2 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/149: 0 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/1538: 2 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/1582: 0 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/1633: 0 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/249: 2 overlaps
Melatonin Metabolism And Effects WP3298 x https://identifiers.org/aop.events/890: 2 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/1115: 3 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/1392: 3 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/149: 0 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/1538: 3 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/1582: 0 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/1633: 0 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/249: 3 overlaps
Oxidative Stress Response WP408 x https://identifiers.org/aop.events/890: 3 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/1115: 0 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/1392: 0 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/149: 0 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/1538: 0 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/1582: 0 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/1633: 0 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/249: 0 overlaps
Retinoblastoma Gene In Cancer WP2446 x https://identifiers.org/aop.events/890: 0 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/1115: 1 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/1392: 1 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/149: 0 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/1538: 1 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/1582: 0 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/1633: 0 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/249: 1 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/890: 1 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/1115: 1 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/1392: 1 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/149: 0 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/1538: 1 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/1582: 0 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/1633: 0 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/249: 1 overlaps
Zinc Homeostasis WP3529 x https://identifiers.org/aop.events/890: 1 overlaps

title of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 1 WP2361, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): set(), number: 0
Term: Gastric Cancer Network 2 WP2363, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): set(), number: 0
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'CYP1A1', 'MAOA'}, number: 2
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'CYP1A1', 'MAOA'}, number: 2
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'CYP1A1', 'MAOA'}, number: 2
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'CYP1A1', 'MAOA'}, number: 2
Term: Melatonin Metabolism And Effects WP3298, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'CYP1A1', 'MAOA'}, number: 2
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'CYP1A1', 'MAOA', 'MT1X'}, number: 3
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'CYP1A1', 'MAOA', 'MT1X'}, number: 3
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'CYP1A1', 'MAOA', 'MT1X'}, number: 3
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'CYP1A1', 'MAOA', 'MT1X'}, number: 3
Term: Oxidative Stress Response WP408, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'CYP1A1', 'MAOA', 'MT1X'}, number: 3
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): set(), number: 0
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): set(), number: 0
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): set(), number: 0
Term: Retinoblastoma Gene In Cancer WP2446, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): set(), number: 0
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'CYP1A1'}, number: 1
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'CYP1A1'}, number: 1
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'CYP1A1'}, number: 1
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'CYP1A1'}, number: 1
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'CYP1A1'}, number: 1
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): set(), number: 0
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): set(), number: 0
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'MT1X'}, number: 1
Term: Zinc Homeostasis WP3529, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'MT1X'}, number: 1
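
The pairwise overlap printed above comes down to a set intersection per (pathway, Key Event) pair. The following is a minimal sketch with plain Python sets. The ORA gene set reuses the symbols reported in the output; the Key Event gene set and the symbol GENE_X are hypothetical and only serve the illustration.

```python
# ORA pathway gene set, taken from the symbols printed in the output above
ora_gene_sets = {
    "Oxidative Stress Response WP408": {"CYP1A1", "MAOA", "MT1X"},
}

# Key Event gene set: membership is hypothetical, GENE_X is a placeholder symbol
ke_gene_sets = {
    "https://identifiers.org/aop.events/1115": {"CYP1A1", "MAOA", "MT1X", "GENE_X"},
}

overlaps = {}
for term, ora_genes in ora_gene_sets.items():
    for keid, ke_genes in ke_gene_sets.items():
        shared = ora_genes & ke_genes  # set intersection
        overlaps[(term, keid)] = {
            "overlapping genes": shared,
            "number of genes that overlap": len(shared),
        }

for (term, keid), result in overlaps.items():
    print(term, keid, sorted(result["overlapping genes"]))
```

Keying the result by the (term, KEID) tuple keeps every pair addressable, including pairs with an empty intersection.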
Section 5.5.4 Tabulation gene overlap

In this section, a table is created that contains the number of overlapping genes and number of total genes in preparation for section 5.5.5.

final_geneoverlaptable_AG_24H=pd.DataFrame.from_dict(overlapping_genes_betweenORA_and_significantKEs,orient='index')
Section 5.5.5 Percent overlap calculation

In this section, the percent overlap for the gene sets is calculated.

Step 27. Lastly, we calculate the percent overlap and add the result as a column to the dataframe. This is first done by running a for loop to calculate the total number of genes belonging to the enriched pathways of ORA.

variable_count= {}

# Each row of the exploded ORA table holds one gene, so counting rows per term gives the genes per pathway
for index, row in exploded_df_ORApathwaytable.iterrows():
    term = row['Term']

    if term not in variable_count:
        variable_count[term] = 1
    else:
        variable_count[term] += 1

print("The total number of genes: ")
print(variable_count)
The total number of genes: 
{'Zinc Homeostasis WP3529': 10, 'Copper Homeostasis WP3286': 8, 'Gastric Cancer Network 1 WP2361': 4, 'Vitamin D Receptor Pathway WP2877': 7, 'Gastric Cancer Network 2 WP2363': 3, 'Glucocorticoid Receptor Pathway WP2880': 4, 'Oxidative Stress Response WP408': 3, 'Melatonin Metabolism And Effects WP3298': 3, 'Retinoblastoma Gene In Cancer WP2446': 4}
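
The counting loop above can also be expressed with collections.Counter. This is a minimal sketch on made-up (term, gene) rows; the pathway names and gene symbols are placeholders, not values from the dataset.

```python
from collections import Counter

# One (term, gene) pair per row, as in the exploded ORA table; illustrative data only
rows = [
    ("Pathway A", "G1"),
    ("Pathway A", "G2"),
    ("Pathway B", "G3"),
]

# Count how many gene rows belong to each term
genes_per_term = Counter(term for term, _gene in rows)
print(dict(genes_per_term))
```

The result has the same shape as variable_count: one entry per term with the number of genes as its value.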

Step 28. The result is converted into a dataframe and added to the final dataframe.

variable_count_df=pd.DataFrame.from_dict(variable_count,orient='index')
reset_variable_count_df = variable_count_df.reset_index()
Reset_variable_count_df=reset_variable_count_df.copy()
Reset_variable_count_df.columns = ['Term', 'Total number of genes']
Genesetoverlaptable_AG24H=final_geneoverlaptable_AG_24H.reset_index(level=[1])
Genesetoverlaptable_AG24h=Genesetoverlaptable_AG24H.copy()
# Look up each term's total in variable_count (Step 27) so every row receives the total of its own pathway
Genesetoverlaptable_AG24h.insert(0, "Total number of genes", Genesetoverlaptable_AG24h.index.map(variable_count))
table1=Genesetoverlaptable_AG24h.copy()
def calculate_Genesetoverlap_Score(row):
        # Return a number rather than a formatted string so the column stays numeric in the export
        return (row['number of genes that overlap']/row['Total number of genes'])*100

table1.loc[:,'Percent geneset overlap']= table1.apply(calculate_Genesetoverlap_Score, axis=1)
table1.to_excel('geneoverlap-calculation-Ag24h.xlsx')
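
As a worked illustration of the percent overlap, the sketch below looks up each pathway total by term name instead of by row position. The totals and overlap counts are the ones printed earlier in this section; the helper name percent_overlap is an assumption made for this example.

```python
# Total genes per enriched pathway (from the output of Step 27)
total_genes = {
    "Copper Homeostasis WP3286": 8,
    "Oxidative Stress Response WP408": 3,
}

# Overlap counts per (term, KEID) pair (from the output of Section 5.5.3)
overlap_counts = {
    ("Copper Homeostasis WP3286", "https://identifiers.org/aop.events/1115"): 1,
    ("Oxidative Stress Response WP408", "https://identifiers.org/aop.events/1115"): 3,
}

def percent_overlap(n_overlap, n_total):
    """Share of pathway genes that are also linked to the Key Event, in percent."""
    return 100.0 * n_overlap / n_total

percent = {
    key: percent_overlap(count, total_genes[key[0]])  # key[0] is the pathway term
    for key, count in overlap_counts.items()
}
print(percent)
```

Looking up the total by term keeps the calculation correct even when the table rows are sorted differently from the dictionary of totals.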

Section 6. Comparison 2: Ag+ 48H

In this section, steps 14 to 28 are repeated for comparison 2.

Section 6.1 Calculation of n variable

Step 29. The table containing the differentially expressed genes for comparison 2 is loaded and filtered for significance.

Ag48HDEG= pd.read_csv('topTable_Ag._.1.3_48 - H2O.control_.0.0_48.tsv',sep='\t')
Ag_48H_DEG= Ag48HDEG[Ag48HDEG['adj. p-value'] < 0.05]
Ag48H_DEG = Ag_48H_DEG.copy()  
Ag48H_DEG.rename(columns={Ag48H_DEG.columns[0]: 'Entrez.Gene'}, inplace=True)
Ag48H_DEG['Entrez.Gene'] = Ag48H_DEG['Entrez.Gene'].astype(str)

Step 30. The results of the DEG table are next integrated into the mergeddataframe dataframe. This is followed by adjustment of the dataframe columns to remove non-relevant columns.

merged_dataframe_DEG_48h= pd.merge(mergeddataframe, Ag48H_DEG, on='Entrez.Gene')

Step 31. The following for loop over the key events is run to retrieve the n variable. It is comparable to the for loop for N, but adds a condition that only counts genes with an adjusted p-value smaller than 0.05.

variable_n_dictionary_count2= {}

for index, row in merged_dataframe_DEG_48h.iterrows():
    unique_KE = row['KEID']
    gene_expression_value = row['adj. p-value']

    if gene_expression_value < 0.05:
    
        if unique_KE not in variable_n_dictionary_count2:
            variable_n_dictionary_count2[unique_KE] = 1
        else:
            variable_n_dictionary_count2[unique_KE] += 1

print("The total number of significant genes: ")
print(variable_n_dictionary_count2)
The total number of significant genes: 
{'https://identifiers.org/aop.events/486': 2, 'https://identifiers.org/aop.events/875': 2, 'https://identifiers.org/aop.events/1495': 2, 'https://identifiers.org/aop.events/1668 ': 2, 'https://identifiers.org/aop.events/244 ': 2, 'https://identifiers.org/aop.events/1814 ': 2, 'https://identifiers.org/aop.events/1270': 1, 'https://identifiers.org/aop.events/457': 2, 'https://identifiers.org/aop.events/188': 1, 'https://identifiers.org/aop.events/618': 2, 'https://identifiers.org/aop.events/2006': 1, 'https://identifiers.org/aop.events/1497': 3, 'https://identifiers.org/aop.events/1669': 1, 'https://identifiers.org/aop.events/202': 2, 'https://identifiers.org/aop.events/1633': 6, 'https://identifiers.org/aop.events/1815': 2, 'https://identifiers.org/aop.events/386': 2, 'https://identifiers.org/aop.events/1496': 4, 'https://identifiers.org/aop.events/68': 2, 'https://identifiers.org/aop.events/1493': 8, 'https://identifiers.org/aop.events/265': 2, 'https://identifiers.org/aop.events/1750': 3, 'https://identifiers.org/aop.events/1848': 3, 'https://identifiers.org/aop.events/149': 6, 'https://identifiers.org/aop.events/1579': 3, 'https://identifiers.org/aop.events/209': 4, 'https://identifiers.org/aop.events/1498': 2, 'https://identifiers.org/aop.events/1500': 2, 'https://identifiers.org/aop.events/1488': 2, 'https://identifiers.org/aop.events/52': 2, 'https://identifiers.org/aop.events/484': 2, 'https://identifiers.org/aop.events/388': 2, 'https://identifiers.org/aop.events/1945': 4, 'https://identifiers.org/aop.events/2012': 2, 'https://identifiers.org/aop.events/1818': 2, 'https://identifiers.org/aop.events/1585': 1, 'https://identifiers.org/aop.events/1087': 3, 'https://identifiers.org/aop.events/195': 2, 'https://identifiers.org/aop.events/1090': 2, 'https://identifiers.org/aop.events/1841': 1}
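
Some KEIDs in the printed dictionary carry a trailing space (for example 'https://identifiers.org/aop.events/1668 '), which would make otherwise identical identifiers count as different keys. The sketch below shows one possible way to normalise the identifier while counting; the rows and p-values are illustrative, not taken from the dataset.

```python
from collections import Counter

# (KEID, adjusted p-value) pairs; values are made up for the illustration
rows = [
    ("https://identifiers.org/aop.events/1668 ", 0.01),  # note the trailing space
    ("https://identifiers.org/aop.events/1668", 0.03),
    ("https://identifiers.org/aop.events/244 ", 0.20),   # not significant, skipped
]

# Strip whitespace from the KEID and count only significant genes
n_per_ke = Counter(keid.strip() for keid, adj_p in rows if adj_p < 0.05)
print(dict(n_per_ke))
```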

Step 32. The output of the n variable dictionary is saved as a dataframe and integrated as a separate column into a dataframe.

n_variable_dataframe2=pd.DataFrame.from_dict(variable_n_dictionary_count2,orient='index')
n_variable_dataframe2

0
https://identifiers.org/aop.events/486 2
https://identifiers.org/aop.events/875 2
https://identifiers.org/aop.events/1495 2
https://identifiers.org/aop.events/1668 2
https://identifiers.org/aop.events/244 2
https://identifiers.org/aop.events/1814 2
https://identifiers.org/aop.events/1270 1
https://identifiers.org/aop.events/457 2
https://identifiers.org/aop.events/188 1
https://identifiers.org/aop.events/618 2
https://identifiers.org/aop.events/2006 1
https://identifiers.org/aop.events/1497 3
https://identifiers.org/aop.events/1669 1
https://identifiers.org/aop.events/202 2
https://identifiers.org/aop.events/1633 6
https://identifiers.org/aop.events/1815 2
https://identifiers.org/aop.events/386 2
https://identifiers.org/aop.events/1496 4
https://identifiers.org/aop.events/68 2
https://identifiers.org/aop.events/1493 8
https://identifiers.org/aop.events/265 2
https://identifiers.org/aop.events/1750 3
https://identifiers.org/aop.events/1848 3
https://identifiers.org/aop.events/149 6
https://identifiers.org/aop.events/1579 3
https://identifiers.org/aop.events/209 4
https://identifiers.org/aop.events/1498 2
https://identifiers.org/aop.events/1500 2
https://identifiers.org/aop.events/1488 2
https://identifiers.org/aop.events/52 2
https://identifiers.org/aop.events/484 2
https://identifiers.org/aop.events/388 2
https://identifiers.org/aop.events/1945 4
https://identifiers.org/aop.events/2012 2
https://identifiers.org/aop.events/1818 2
https://identifiers.org/aop.events/1585 1
https://identifiers.org/aop.events/1087 3
https://identifiers.org/aop.events/195 2
https://identifiers.org/aop.events/1090 2
https://identifiers.org/aop.events/1841 1
n_variable_dataframe_reset2 = n_variable_dataframe2.reset_index()
n_variable_dataframe_reset2.columns = ['KEID', 'n']
n_variable_dataframe_reset2

KEID n
0 https://identifiers.org/aop.events/486 2
1 https://identifiers.org/aop.events/875 2
2 https://identifiers.org/aop.events/1495 2
3 https://identifiers.org/aop.events/1668 2
4 https://identifiers.org/aop.events/244 2
5 https://identifiers.org/aop.events/1814 2
6 https://identifiers.org/aop.events/1270 1
7 https://identifiers.org/aop.events/457 2
8 https://identifiers.org/aop.events/188 1
9 https://identifiers.org/aop.events/618 2
10 https://identifiers.org/aop.events/2006 1
11 https://identifiers.org/aop.events/1497 3
12 https://identifiers.org/aop.events/1669 1
13 https://identifiers.org/aop.events/202 2
14 https://identifiers.org/aop.events/1633 6
15 https://identifiers.org/aop.events/1815 2
16 https://identifiers.org/aop.events/386 2
17 https://identifiers.org/aop.events/1496 4
18 https://identifiers.org/aop.events/68 2
19 https://identifiers.org/aop.events/1493 8
20 https://identifiers.org/aop.events/265 2
21 https://identifiers.org/aop.events/1750 3
22 https://identifiers.org/aop.events/1848 3
23 https://identifiers.org/aop.events/149 6
24 https://identifiers.org/aop.events/1579 3
25 https://identifiers.org/aop.events/209 4
26 https://identifiers.org/aop.events/1498 2
27 https://identifiers.org/aop.events/1500 2
28 https://identifiers.org/aop.events/1488 2
29 https://identifiers.org/aop.events/52 2
30 https://identifiers.org/aop.events/484 2
31 https://identifiers.org/aop.events/388 2
32 https://identifiers.org/aop.events/1945 4
33 https://identifiers.org/aop.events/2012 2
34 https://identifiers.org/aop.events/1818 2
35 https://identifiers.org/aop.events/1585 1
36 https://identifiers.org/aop.events/1087 3
37 https://identifiers.org/aop.events/195 2
38 https://identifiers.org/aop.events/1090 2
39 https://identifiers.org/aop.events/1841 1
merged_dataframe2= pd.merge(mergeddataframeDEG, n_variable_dataframe_reset2, on='KEID')

Section 6.2. Calculation of variable B and variable b.

In this section, variable B and variable b are calculated.

Step 33. Variable B is calculated by taking the length of the dataframe which includes all genes in 1 DEG table.

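# Ag48H_DEG was already filtered on adj. p-value < 0.05 in Step 29, so this length counts significant genes only.
# The unfiltered table is Ag48HDEG; Step 35 sets B to 20518.
B=len(Ag48H_DEG.index)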
B
30

Step 34. Variable b is calculated by taking the length of the dataframe which includes all genes in 1 DEG table with the condition for significance.

Ag48H_DEG_filtered=Ag48H_DEG[Ag48H_DEG['adj. p-value'] < 0.05]
b=len(Ag48H_DEG_filtered)
b
30

Section 6.3. Calculation of enrichment score and hypergeometric p-value

In this section, the enrichment score and hypergeometric p-value will be calculated.

Step 35. The final dataframe will be created that contains the KEID and the four variables: variable N, variable n, variable B and variable b.

Final_dataframe_ES= merged_dataframe2.loc[:, ['KEID','N','n']].copy()
# B and b are constants for this comparison, so a scalar is broadcast to every row
Final_dataframe_ES['B']=20518
Final_dataframe_ES['b']=30
Final_Dataframe_ES=Final_dataframe_ES.drop_duplicates(subset=['KEID'],keep='first')
Final_Dataframe_ES.reset_index(drop=True,inplace=True)
Copy_Final_DataFrame_ES=Final_Dataframe_ES.copy()

Step 36. The following function is applied row by row to calculate the enrichment score for the individual key events, and the results are saved as a separate column in the dataframe.

def calculate_Enrichment_Score(row):
        return f"{(row['n']/row['N'])/(row['b']/row['B'])}"

Copy_Final_DataFrame_ES.loc[:,'Enrichmentscore']= Copy_Final_DataFrame_ES.apply(calculate_Enrichment_Score,axis=1)
Copy_Final_DataFrame_ES

KEID N n B b Enrichmentscore
0 https://identifiers.org/aop.events/1495 253 2 20518 30 5.406587615283267
1 https://identifiers.org/aop.events/1668 156 2 20518 30 8.768376068376067
2 https://identifiers.org/aop.events/244 417 2 20518 30 3.2802557953637086
3 https://identifiers.org/aop.events/618 240 2 20518 30 5.699444444444444
4 https://identifiers.org/aop.events/1497 528 3 20518 30 3.8859848484848483
5 https://identifiers.org/aop.events/1633 1056 6 20518 30 3.8859848484848483
6 https://identifiers.org/aop.events/265 268 2 20518 30 5.103980099502487
7 https://identifiers.org/aop.events/1750 528 3 20518 30 3.8859848484848483
8 https://identifiers.org/aop.events/1848 195 3 20518 30 10.522051282051281
9 https://identifiers.org/aop.events/149 1056 6 20518 30 3.8859848484848483
10 https://identifiers.org/aop.events/1579 353 3 20518 30 5.812464589235128
11 https://identifiers.org/aop.events/209 617 4 20518 30 4.433927606699081
12 https://identifiers.org/aop.events/1945 1218 4 20518 30 2.246086480569239
13 https://identifiers.org/aop.events/1087 528 3 20518 30 3.8859848484848483
14 https://identifiers.org/aop.events/1090 459 2 20518 30 2.9801016702977488
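
As a worked check of the formula, the sketch below recomputes the enrichment score for the first row of the table (KE 1495: N = 253, n = 2, B = 20518, b = 30). The function name enrichment_score is an assumption made for this example.

```python
def enrichment_score(n, N, b, B):
    """Fraction of significant genes in the Key Event divided by the fraction in the whole dataset."""
    return (n / N) / (b / B)

# Values taken from row 0 of the table above
score_1495 = enrichment_score(n=2, N=253, b=30, B=20518)
print(score_1495)
```

A score above 1 means that the Key Event contains proportionally more significant genes than the dataset as a whole.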

Step 37. The following for loop will be used to calculate the hypergeometric p-value for the individual Key Events and save the result as a separate column in the dataframe. This requires some intermediate steps to manipulate the dataframe.

p_value_dataframe2=[]

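for index, row in Copy_Final_DataFrame_ES.iterrows():

        M = row['B']  # population size: all genes in the DEG table
        n = row['b']  # number of significant genes in the DEG table
        N = row['N']  # number of draws: genes associated with the Key Event
        k = row['n']  # observed significant genes associated with the Key Event

        hpd = ss.hypergeom(M, n, N)
        p = hpd.pmf(k)  # probability of observing exactly k significant genes
        p_value_dataframe2.append(p)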
             
Hypergeometricpvalue_dataframe2=pd.DataFrame(p_value_dataframe2)
Hypergeometricpvalue_dataframe2.columns= ['Hypergeometric p-value']
Hypergeometricpvalue_dataframe2

Hypergeometric p-value
0 0.046663
1 0.020231
2 0.101112
3 0.042743
4 0.034153
5 0.003083
6 0.051296
7 0.034153
8 0.002662
9 0.003083
10 0.012880
11 0.010082
12 0.069281
13 0.034153
14 0.115558
merged_finaltable=pd.concat([Copy_Final_DataFrame_ES,Hypergeometricpvalue_dataframe2],axis=1)
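
The value returned by pmf(k) is the probability of observing exactly k significant genes. Over-representation tests are commonly reported as the upper tail P(X >= k) instead. The sketch below, written with the standard library only and with helper names that are assumptions of this example, computes both quantities for KE 1495 (B = 20518, b = 30, N = 253, n = 2) so they can be compared; it is not a replacement for the values in the table above.

```python
from math import comb

def hypergeom_pmf(k, M, n, N):
    """P(X = k) when drawing N items from a population of M that contains n successes."""
    return comb(n, k) * comb(M - n, N - k) / comb(M, N)

def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k): probability of observing k or more successes."""
    return sum(hypergeom_pmf(i, M, n, N) for i in range(k, min(n, N) + 1))

# KE 1495 from the table above
p_exact = hypergeom_pmf(2, M=20518, n=30, N=253)
p_tail = hypergeom_upper_tail(2, M=20518, n=30, N=253)
print(p_exact, p_tail)
```

The upper tail is never smaller than the point probability, so a Key Event that passes a 0.05 threshold on pmf(k) does not necessarily pass it on the tail probability.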

Section 6.4. Filtering the results for significant KEs

In this section, the results will be filtered to only include significant KEs. Significant KEs have an enrichment score above 1 and a hypergeometric p-value below 0.05.

Step 38. Lastly, we filter the results to showcase the significant KEs for comparison 2.

filteredversion_Ag48H= merged_finaltable[(pd.to_numeric(merged_finaltable['Enrichmentscore'])>1)& (merged_finaltable['Hypergeometric p-value'] < 0.05)]
filteredversion_Ag48H

KEID N n B b Enrichmentscore Hypergeometric p-value
0 https://identifiers.org/aop.events/1495 253 2 20518 30 5.406587615283267 0.046663
1 https://identifiers.org/aop.events/1668 156 2 20518 30 8.768376068376067 0.020231
3 https://identifiers.org/aop.events/618 240 2 20518 30 5.699444444444444 0.042743
4 https://identifiers.org/aop.events/1497 528 3 20518 30 3.8859848484848483 0.034153
5 https://identifiers.org/aop.events/1633 1056 6 20518 30 3.8859848484848483 0.003083
7 https://identifiers.org/aop.events/1750 528 3 20518 30 3.8859848484848483 0.034153
8 https://identifiers.org/aop.events/1848 195 3 20518 30 10.522051282051281 0.002662
9 https://identifiers.org/aop.events/149 1056 6 20518 30 3.8859848484848483 0.003083
10 https://identifiers.org/aop.events/1579 353 3 20518 30 5.812464589235128 0.012880
11 https://identifiers.org/aop.events/209 617 4 20518 30 4.433927606699081 0.010082
13 https://identifiers.org/aop.events/1087 528 3 20518 30 3.8859848484848483 0.034153
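
Because the enrichment score is stored as text, the threshold should be tested on numeric values. The sketch below illustrates the difference between a string comparison and a numeric comparison; the second and third score strings are made up for the illustration.

```python
# Enrichment scores stored as text; only the first value appears in the table above
scores_as_text = ["10.522051282051281", "0.5", "9e-05"]

# String comparison is lexicographic: characters are compared one at a time
text_filter = [s > str(1) for s in scores_as_text]

# Numeric comparison after converting the text to float
numeric_filter = [float(s) > 1 for s in scores_as_text]

print(text_filter)     # [True, False, True]
print(numeric_filter)  # [True, False, False]
```

The value "9e-05" passes the string comparison because the character '9' sorts after '1', although the number itself is far below 1.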
# Work on an explicit copy so the column assignments below do not raise SettingWithCopyWarning
filteredversion_Ag48H = filteredversion_Ag48H.copy()

# Ensure numeric types
filteredversion_Ag48H['Hypergeometric p-value'] = pd.to_numeric(filteredversion_Ag48H['Hypergeometric p-value'], errors='coerce')
filteredversion_Ag48H['Enrichmentscore'] = pd.to_numeric(filteredversion_Ag48H['Enrichmentscore'], errors='coerce')
filteredversion_Ag48H['combined_score'] = -np.log10(filteredversion_Ag48H['Hypergeometric p-value']) * filteredversion_Ag48H['Enrichmentscore']

# Sort by combined score (highest first)
C2_sorted = filteredversion_Ag48H.sort_values(by='combined_score', ascending=False)

# Save the ranked table to Excel
C2_sorted.to_excel('ConsistentKE-Ag-48h.xlsx')
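
The combined score used for the ranking multiplies -log10 of the hypergeometric p-value by the enrichment score. The sketch below applies it to two rows of the filtered table (KE 1848 and KE 1495); the function name combined_score is an assumption made for this example.

```python
import math

def combined_score(p_value, enrichment_score):
    """-log10(p) * enrichment score: larger values indicate stronger, more significant enrichment."""
    return -math.log10(p_value) * enrichment_score

# (hypergeometric p-value, enrichment score) taken from the filtered table
rows = {
    "https://identifiers.org/aop.events/1848": (0.002662, 10.522051282051281),
    "https://identifiers.org/aop.events/1495": (0.046663, 5.406587615283267),
}

# Rank the Key Events from highest to lowest combined score
ranked = sorted(rows, key=lambda keid: combined_score(*rows[keid]), reverse=True)
for keid in ranked:
    print(keid, round(combined_score(*rows[keid]), 3))
```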

Section 6.5. Calculation of percent gene overlap to ORA

Section 6.5.1 Creation of the significant KEs table

In this section, you merge the dataframes to retrieve the genes connected to only the significant KEs.

Step 39. The significant KE table is created using the significant KEs and the previously created mergeddataframe_final.

mergeddataframe_final2=mergeddataframe_final.copy()
mergeddataframe_final2['KEID'] = mergeddataframe_final2['KEID'].str.strip()
# KEIDs of the significant KEs from Step 38
significant_KEIDs_Ag48H = ['https://identifiers.org/aop.events/' + ke_number for ke_number in
                           ['1668', '1495', '618', '1497', '1633', '1750', '1848', '149', '1579', '209', '1087']]
significantKEID_genetable2=mergeddataframe_final2[mergeddataframe_final2['KEID'].isin(significant_KEIDs_Ag48H)]
significantKEIDgenetable2=significantKEID_genetable2.drop(columns=['WPtitle','ID'])
Section 6.5.2 Significant ORA pathway table plus splitting

In this section, the significant ORA pathway table is created.

Step 40. The significant ORA pathway table is created using the significantly enriched pathways identified by the ORA. This requires restructuring the table so that the individual genes of each enriched pathway are placed on separate rows.

datafile_ORA2 = pd.read_csv("C:/Users/shaki/Downloads/ORA_tables_for_comparison/Comparison 2-Ag 48H.txt", sep='\t')
datafileORA2=pd.DataFrame(datafile_ORA2)
filtereddatafileORA_2=datafileORA2[datafileORA2['Adjusted P-value'] < 0.05]
filtereddatafileORA_2

Gene_set Term P-value Adjusted P-value Old P-value Old adjusted P-value Odds Ratio Combined Score Genes
0 WikiPathways_2024_Human Platelet Mediated Interactions W Vascular And ... 0.000259 0.013650 0 0 101.071600 834.69240 CCL2;TLR4
1 WikiPathways_2024_Human P53 Transcriptional Gene Network WP4963 0.000266 0.013650 0 0 27.364940 225.27640 CCL2;ULBP1;SERPINB5
2 WikiPathways_2024_Human Network Map Of SARS CoV 2 Signaling WP5115 0.000428 0.013650 0 0 12.948480 100.44810 IFITM3;CCL2;PTGS2;CXCL5
3 WikiPathways_2024_Human LDL Influence On CD14 And TLR4 WP5272 0.000479 0.013650 0 0 72.172840 551.61330 CCL2;TLR4
4 WikiPathways_2024_Human Spinal Cord Injury WP2431 0.000564 0.013650 0 0 20.985580 156.97910 CCL2;PTGS2;TLR4
5 WikiPathways_2024_Human Interactions Immune Cells And miRNAs In Tumor ... 0.000713 0.014382 0 0 58.279200 422.28080 CCL2;TLR4
6 WikiPathways_2024_Human Immune Infiltration In Pancreatic Cancer WP5285 0.001385 0.023934 0 0 40.930930 269.42170 CCL2;CXCL5
7 WikiPathways_2024_Human Fibrin Complement Receptor 3 Signaling WP4136 0.001605 0.024270 0 0 37.855560 243.59600 CCL2;TLR4
8 WikiPathways_2024_Human SARS CoV 2 Innate Immunity Evasion And Cell Im... 0.003914 0.048148 0 0 23.631940 130.99350 CCL2;CXCL5
9 WikiPathways_2024_Human Glucocorticoid Receptor Pathway WP2880 0.004270 0.048148 0 0 22.570480 123.14720 CCL2;PTGS2
10 WikiPathways_2024_Human Non Genomic Actions Of 1 25 Dihydroxyvitamin D... 0.004767 0.048148 0 0 21.294730 113.84380 CCL2;TLR4
11 WikiPathways_2024_Human Burn Wound Healing WP5055 0.004895 0.048148 0 0 20.997940 111.70010 CCL2;TLR4
12 WikiPathways_2024_Human Cytokine Cytokine Receptor Interaction WP5473 0.005173 0.048148 0 0 9.452663 49.76173 TNFSF15;CCL2;CXCL5
# Make sure 'Combined Score' is numeric
datafileORA2['Combined Score'] = pd.to_numeric(datafileORA2['Combined Score'], errors='coerce')

# Sort by 'Combined Score' in descending order
ranked_df2 = datafileORA2.sort_values(by='Combined Score', ascending=False)

# (Optional) Save to Excel
ranked_df2.to_excel('Ag48H-ORAtable-thesis(EMEXP3583).xlsx', index=False)
dropped_datafileORA_df2=filtereddatafileORA_2.drop(['Adjusted P-value','Odds Ratio','Old P-value','Gene_set','P-value','Old adjusted P-value','Combined Score'],axis=1)
droppeddatafileORAdf2=dropped_datafileORA_df2.copy()
droppeddatafileORAdf2['Genes']= droppeddatafileORAdf2['Genes'].replace({';':','},regex=True)
df2_ORApathwaytable=droppeddatafileORAdf2.copy()
df2_ORApathwaytable['Genes'] = df2_ORApathwaytable['Genes'].astype(str)
df2_ORApathwaytable['Genes'] = df2_ORApathwaytable['Genes'].str.split(',')
exploded_df2_ORApathwaytable = df2_ORApathwaytable.explode('Genes', ignore_index=True)
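
The restructuring above turns one row per pathway into one row per (pathway, gene) pair. A minimal sketch of the same idea in plain Python, using two rows from the filtered ORA table shown above:

```python
# (Term, Genes) pairs as they appear in the ORA table, genes separated by ';'
ora_rows = [
    ("Burn Wound Healing WP5055", "CCL2;TLR4"),
    ("Glucocorticoid Receptor Pathway WP2880", "CCL2;PTGS2"),
]

# Split the gene string and emit one (term, gene) pair per gene
exploded = [(term, gene) for term, genes in ora_rows for gene in genes.split(";")]

for term, gene in exploded:
    print(term, gene)
```

Each pathway now contributes as many rows as it has genes, which is the shape needed to group the genes into one set per term.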
Section 6.5.3 For loop to get overlapping genes

In this section, the number of overlapping genes between the significant enrichment score-based Key Events and the enriched pathways from ORA is calculated.

Step 41. Next, two sets are created by converting the significant KE table and ORA pathway table into dictionaries where the values of the genes are grouped together per key. This is followed by running a for loop to calculate the number of overlapping genes along with the symbols.

# Group the genes into one set per ORA pathway term and per significant KE
ORA_gene_sets2 = exploded_df2_ORApathwaytable.groupby('Term')['Genes'].apply(set).to_dict()
SignificantKE_gene_sets2 = significantKEIDgenetable2.groupby('KEID')['gene'].apply(set).to_dict()
overlapping_genes_betweenORA_and_significantKEs2 = {}

# Compare every ORA pathway with every significant KE
for term, ORA_genes in ORA_gene_sets2.items():
    for KEID, KEID_genes in SignificantKE_gene_sets2.items():
        overlap = ORA_genes.intersection(KEID_genes)
        print(f"{term} x {KEID}: {len(overlap)} overlaps")
        # Store every pair, including pairs without any shared gene
        overlapping_genes_betweenORA_and_significantKEs2[(term, KEID)] = {
            'overlapping genes': overlap,
            'number of genes that overlap': len(overlap)
        }

# Every pair is stored above, so the else branch is reached only if one of the two tables is empty
if overlapping_genes_betweenORA_and_significantKEs2:
    print("\ntitle of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:")
    for (term, KEID), result in overlapping_genes_betweenORA_and_significantKEs2.items():
        print(f"Term: {term}, KEID: {KEID}, Title of overlapping gene(s): {result['overlapping genes']}, number: {result['number of genes that overlap']}")
else:
    print("No overlapping genes")
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1087: 2 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/149: 2 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1495: 1 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1497: 2 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1579: 1 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1633: 2 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1668: 1 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1750: 2 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/1848: 1 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/209: 1 overlaps
Burn Wound Healing WP5055 x https://identifiers.org/aop.events/618: 0 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1087: 1 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/149: 1 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1495: 0 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1497: 1 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1579: 2 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1633: 1 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1668: 0 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1750: 1 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/1848: 0 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/209: 1 overlaps
Cytokine Cytokine Receptor Interaction WP5473 x https://identifiers.org/aop.events/618: 0 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1087: 2 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/149: 2 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1495: 1 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1497: 2 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1579: 1 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1633: 2 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1668: 1 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1750: 2 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/1848: 1 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/209: 1 overlaps
Fibrin Complement Receptor 3 Signaling WP4136 x https://identifiers.org/aop.events/618: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1087: 1 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/149: 1 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1495: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1497: 1 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1579: 1 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1633: 1 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1668: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1750: 1 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/1848: 0 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/209: 1 overlaps
Glucocorticoid Receptor Pathway WP2880 x https://identifiers.org/aop.events/618: 0 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1087: 1 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/149: 1 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1495: 0 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1497: 1 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1579: 2 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1633: 1 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1668: 0 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1750: 1 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/1848: 0 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/209: 1 overlaps
Immune Infiltration In Pancreatic Cancer WP5285 x https://identifiers.org/aop.events/618: 0 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1087: 2 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/149: 2 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1495: 1 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1497: 2 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1579: 1 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1633: 2 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1668: 1 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1750: 2 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/1848: 1 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/209: 1 overlaps
Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559 x https://identifiers.org/aop.events/618: 0 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1087: 2 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/149: 2 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1495: 1 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1497: 2 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1579: 1 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1633: 2 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1668: 1 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1750: 2 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/1848: 1 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/209: 1 overlaps
LDL Influence On CD14 And TLR4 WP5272 x https://identifiers.org/aop.events/618: 0 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1087: 1 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/149: 1 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1495: 0 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1497: 1 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1579: 2 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1633: 1 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1668: 0 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1750: 1 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/1848: 0 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/209: 1 overlaps
Network Map Of SARS CoV 2 Signaling WP5115 x https://identifiers.org/aop.events/618: 0 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1087: 2 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/149: 2 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1495: 1 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1497: 2 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1579: 1 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1633: 2 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1668: 1 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1750: 2 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/1848: 1 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/209: 1 overlaps
Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341 x https://identifiers.org/aop.events/618: 0 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1087: 1 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/149: 1 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1495: 0 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1497: 1 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1579: 1 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1633: 1 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1668: 0 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1750: 1 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/1848: 0 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/209: 3 overlaps
P53 Transcriptional Gene Network WP4963 x https://identifiers.org/aop.events/618: 0 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1087: 2 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/149: 2 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1495: 1 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1497: 2 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1579: 1 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1633: 2 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1668: 1 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1750: 2 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/1848: 1 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/209: 1 overlaps
Platelet Mediated Interactions W Vascular And Circulating Cells WP4462 x https://identifiers.org/aop.events/618: 0 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1087: 1 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/149: 1 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1495: 0 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1497: 1 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1579: 2 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1633: 1 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1668: 0 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1750: 1 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/1848: 0 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/209: 1 overlaps
SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039 x https://identifiers.org/aop.events/618: 0 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1087: 2 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/149: 2 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1495: 1 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1497: 2 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1579: 1 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1633: 2 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1668: 1 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1750: 2 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/1848: 1 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/209: 1 overlaps
Spinal Cord Injury WP2431 x https://identifiers.org/aop.events/618: 0 overlaps

title of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Burn Wound Healing WP5055, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): set(), number: 0
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CXCL5', 'CCL2'}, number: 2
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): set(), number: 0
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): set(), number: 0
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Cytokine Cytokine Receptor Interaction WP5473, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Fibrin Complement Receptor 3 Signaling WP4136, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): set(), number: 0
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Glucocorticoid Receptor Pathway WP2880, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): set(), number: 0
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CXCL5', 'CCL2'}, number: 2
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): set(), number: 0
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): set(), number: 0
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Immune Infiltration In Pancreatic Cancer WP5285, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: LDL Influence On CD14 And TLR4 WP5272, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): set(), number: 0
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CXCL5', 'CCL2'}, number: 2
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): set(), number: 0
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): set(), number: 0
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Network Map Of SARS CoV 2 Signaling WP5115, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): set(), number: 0
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): set(), number: 0
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): set(), number: 0
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2', 'SERPINB5', 'ULBP1'}, number: 3
Term: P53 Transcriptional Gene Network WP4963, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Platelet Mediated Interactions W Vascular And Circulating Cells WP4462, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): set(), number: 0
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CXCL5', 'CCL2'}, number: 2
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): set(), number: 0
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): set(), number: 0
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1579, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1668, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TLR4', 'CCL2'}, number: 2
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/1848, Title of overlapping gene(s): {'TLR4'}, number: 1
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CCL2'}, number: 1
Term: Spinal Cord Injury WP2431, KEID: https://identifiers.org/aop.events/618, Title of overlapping gene(s): set(), number: 0
Section 6.5.4 Tabulation gene overlap

In this section, a table is created that contains the number of overlapping genes and number of total genes in preparation for section 6.5.5.

final_geneoverlaptable_AG_48H=pd.DataFrame.from_dict(overlapping_genes_betweenORA_and_significantKEs2,orient='index')
Section 6.5.5 Percent overlap calculation

In this section, the percent overlap for the gene sets is calculated.

Step 42. Lastly, the percent overlap is calculated and the result is added as a column to the dataframe. This is first done by running a for loop to count the total number of genes belonging to each enriched ORA pathway.

variable_count2 = {}

# Count the number of gene rows per enriched ORA pathway (Term)
for index, row in exploded_df2_ORApathwaytable.iterrows():
    term = row['Term']

    if term not in variable_count2:
        variable_count2[term] = 1
    else:
        variable_count2[term] += 1

print("The total number of genes: ")
print(variable_count2)
The total number of genes: 
{'Platelet Mediated Interactions W Vascular And Circulating Cells WP4462': 2, 'P53 Transcriptional Gene Network WP4963': 3, 'Network Map Of SARS CoV 2 Signaling WP5115': 4, 'LDL Influence On CD14 And TLR4 WP5272': 2, 'Spinal Cord Injury WP2431': 3, 'Interactions Immune Cells And miRNAs In Tumor Microenvironment WP4559': 2, 'Immune Infiltration In Pancreatic Cancer WP5285': 2, 'Fibrin Complement Receptor 3 Signaling WP4136': 2, 'SARS CoV 2 Innate Immunity Evasion And Cell Immune Response WP5039': 2, 'Glucocorticoid Receptor Pathway WP2880': 2, 'Non Genomic Actions Of 1 25 Dihydroxyvitamin D3 WP4341': 2, 'Burn Wound Healing WP5055': 2, 'Cytokine Cytokine Receptor Interaction WP5473': 3}
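
As a cross-check on the loop above, the same per-pathway tally can be sketched with `collections.Counter`. The rows below are a small hypothetical stand-in for `exploded_df2_ORApathwaytable` (one row per pathway-gene pair). The two pathways and their genes are read from the Ag+ 48H output above, and the names `rows` and `total_genes_per_term` are illustrative only.

```python
from collections import Counter

# Hypothetical stand-in for exploded_df2_ORApathwaytable: one (Term, gene) pair per row.
rows = [
    ("P53 Transcriptional Gene Network WP4963", "CCL2"),
    ("P53 Transcriptional Gene Network WP4963", "SERPINB5"),
    ("P53 Transcriptional Gene Network WP4963", "ULBP1"),
    ("Platelet Mediated Interactions W Vascular And Circulating Cells WP4462", "TLR4"),
    ("Platelet Mediated Interactions W Vascular And Circulating Cells WP4462", "CCL2"),
]

# Count how many gene rows belong to each pathway.
total_genes_per_term = Counter(term for term, _gene in rows)

print(dict(total_genes_per_term))
```

The counts for these two pathways (3 and 2) agree with the dictionary printed above.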

Step 43. The result is converted into a dataframe and added to the final dataframe. This is followed by some data manipulation prior to calculation of gene set overlap.

variable_count_df2=pd.DataFrame.from_dict(variable_count2,orient='index')
reset_variable_count_df2 = variable_count_df2.reset_index()
reset_variable_count_df2.columns = ['Term', 'Total number of genes']
Genesetoverlaptable_AG48H=final_geneoverlaptable_AG_48H.reset_index(level=[1])
Genesetoverlaptable_AG48H.reset_index(inplace=True)
Genesetoverlaptable_AG48H.columns= ['Term','KEID','overlapping genes','number of genes that overlap']
tabulation_Ag48h=pd.merge(reset_variable_count_df2,Genesetoverlaptable_AG48H, on='Term')
def calculate_Genesetoverlap_Score(row):
    # Return a float (not a formatted string) so the column stays numeric in the dataframe and in Excel
    return (row['number of genes that overlap']/row['Total number of genes'])*100

tabulation_Ag48h.loc[:,'Percent geneset overlap']= tabulation_Ag48h.apply(calculate_Genesetoverlap_Score, axis=1)
tabulation_Ag48h.to_excel('geneoverlap-calculation-Ag48h.xlsx')
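
To make the percent overlap concrete, the following minimal sketch applies the same ratio (overlapping genes divided by total pathway genes, times 100) to two pathway-KE pairs read from the Ag+ 48H output above. The helper name `percent_geneset_overlap` is hypothetical and not part of the notebook.

```python
def percent_geneset_overlap(n_overlap, n_total):
    """Share of a pathway's genes that also occur in a KE gene set, in percent."""
    if n_total == 0:
        raise ValueError("pathway has no genes")
    return n_overlap / n_total * 100


# Spinal Cord Injury WP2431 has 3 genes in total, 2 of which overlap with KE 1087.
spinal_cord_vs_ke1087 = percent_geneset_overlap(2, 3)

# P53 Transcriptional Gene Network WP4963 has 3 genes, all 3 of which overlap with KE 209.
p53_vs_ke209 = percent_geneset_overlap(3, 3)

print(round(spinal_cord_vs_ke1087, 2), p53_vs_ke209)
```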

Section 7. Comparison 3: AgNP 24H

Section 7.1 Calculation of n variable

In this section, variable n will be calculated for comparison 3.

Step 44. The table containing the differentially expressed genes for AgNP 24H versus control is loaded and filtered for significance (adjusted p-value below 0.05).

AgNP_24H_DEG= pd.read_csv('topTable_AgNP_12.1_24 - H2O.control_.0.0_24.tsv',sep='\t')
AgNP24H_DEG= AgNP_24H_DEG[AgNP_24H_DEG['adj. p-value'] < 0.05]
AgNP24h_DEG= AgNP24H_DEG.copy()  
AgNP24h_DEG.rename(columns={AgNP24h_DEG.columns[0]: 'Entrez.Gene'}, inplace=True)
AgNP24h_DEG['Entrez.Gene'] = AgNP24h_DEG['Entrez.Gene'].astype(str)

Step 45. Here, the results of the DEG table are integrated into the mergeddataframe dataframe. This is followed by adjustment of the dataframe columns to remove non-relevant columns.

merged_dataframe_DEG_AgNP_24h= pd.merge(mergeddataframe,AgNP24h_DEG, on='Entrez.Gene')

Step 46. Lastly, the following for loop over the key events is run to retrieve the n variable. It is comparable to the for loop for N, but adds a condition that only counts genes with an adjusted p-value below 0.05.

variable_n_dictionary_count3= {}

for index, row in merged_dataframe_DEG_AgNP_24h.iterrows():
    unique_KE = row['KEID']
    gene_expression_value = row['adj. p-value']

    if gene_expression_value < 0.05:
    
        if unique_KE not in variable_n_dictionary_count3:
            variable_n_dictionary_count3[unique_KE] = 1
        else:
            variable_n_dictionary_count3[unique_KE] += 1

print("The total number of significant genes: ")

Step 47. The output of the n variable dictionary is saved as a dataframe and integrated as a separate column into a dataframe.

n_variable_dataframe3=pd.DataFrame.from_dict(variable_n_dictionary_count3,orient='index')
n_variable_dataframe3_reset = n_variable_dataframe3.reset_index()
n_variable_dataframe3_reset.columns = ['KEID', 'n']
n_variable_dataframe3_reset

KEID n
0 https://identifiers.org/aop.events/486 40
1 https://identifiers.org/aop.events/875 51
2 https://identifiers.org/aop.events/2007 33
3 https://identifiers.org/aop.events/1495 49
4 https://identifiers.org/aop.events/105 8
... ... ...
96 https://identifiers.org/aop.events/1820 22
97 https://identifiers.org/aop.events/896 20
98 https://identifiers.org/aop.events/1549 35
99 https://identifiers.org/aop.events/357 9
100 https://identifiers.org/aop.events/352 76

101 rows × 2 columns

merged_dataframe3= pd.merge(mergeddataframeDEG, n_variable_dataframe3_reset, on='KEID')

Section 7.2. Calculation of variable B and variable b.

In this section, variable B and variable b are calculated.

Step 48. Variable B is the total number of genes in the DEG table, calculated as the length of the unfiltered dataframe.

B=len(AgNP_24H_DEG.index)
B
20518

Step 49. Variable b is the number of significant genes in the DEG table, calculated as the length of the dataframe after filtering on an adjusted p-value below 0.05.

AgNP_24H_DEG_filtered=AgNP_24H_DEG[AgNP_24H_DEG['adj. p-value'] < 0.05]
b=len(AgNP_24H_DEG_filtered)
b
6213

Section 7.3. Calculation of enrichment score and hypergeometric p-value

In this section, the enrichment score and hypergeometric p-value will be calculated. Both require four variables per KE (N, n, B and b), which are collected in an additional dataframe before the formula is applied.

Step 50. The final dataframe will be created that contains the KEID and the four variables: variable N, variable n, variable B and variable b.

Final_dataframe_ES= merged_dataframe3.loc[:, ['KEID','N','n']]
# Reuse B and b from Section 7.2 instead of hard-coding their values
Final_dataframe_ES['B']=pd.Series([B for x in range(len(Final_dataframe_ES.index))])
Final_dataframe_ES['b']=pd.Series([b for x in range(len(Final_dataframe_ES.index))])
Final_Dataframe_ES=Final_dataframe_ES.drop_duplicates(subset=['KEID'],keep='first')
Final_Dataframe_ES.reset_index(drop=True,inplace=True)
Copy_Final_DataFrame_ES=Final_Dataframe_ES.copy()

Step 51. The following function is applied row-wise to calculate the enrichment score for each key event, and the result is saved as a separate column in the dataframe.

def calculate_Enrichment_Score(row):
        return f"{(row['n']/row['N'])/(row['b']/row['B'])}"

Copy_Final_DataFrame_ES.loc[:,'Enrichmentscore']= Copy_Final_DataFrame_ES.apply(calculate_Enrichment_Score,axis=1)
Copy_Final_DataFrame_ES

KEID N n B b Enrichmentscore
0 https://identifiers.org/aop.events/1495 253 49 20518 6213 0.6396011423198457
1 https://identifiers.org/aop.events/1668 156 33 20518 6213 0.6985910435934579
2 https://identifiers.org/aop.events/244 417 135 20518 6213 1.069132139966443
3 https://identifiers.org/aop.events/41 275 103 20518 6213 1.236910290739359
4 https://identifiers.org/aop.events/1539 170 48 20518 6213 0.932450933053086
5 https://identifiers.org/aop.events/618 240 60 20518 6213 0.8256075969740866
6 https://identifiers.org/aop.events/1497 528 153 20518 6213 0.9569542601290549
7 https://identifiers.org/aop.events/1115 34 13 20518 6213 1.2626939718427206
8 https://identifiers.org/aop.events/1917 166 65 20518 6213 1.2931203326100151
9 https://identifiers.org/aop.events/1633 1056 306 20518 6213 0.9569542601290549
10 https://identifiers.org/aop.events/1392 102 39 20518 6213 1.2626939718427206
11 https://identifiers.org/aop.events/1582 51 16 20518 6213 1.0360565922812066
12 https://identifiers.org/aop.events/1896 205 66 20518 6213 1.0632214907373603
13 https://identifiers.org/aop.events/265 268 76 20518 6213 0.9365101100004564
14 https://identifiers.org/aop.events/1750 528 153 20518 6213 0.9569542601290549
15 https://identifiers.org/aop.events/1848 195 42 20518 6213 0.7112926989315208
16 https://identifiers.org/aop.events/890 34 13 20518 6213 1.2626939718427206
17 https://identifiers.org/aop.events/149 1056 306 20518 6213 0.9569542601290549
18 https://identifiers.org/aop.events/1579 353 91 20518 6213 0.8513347458882933
19 https://identifiers.org/aop.events/249 34 13 20518 6213 1.2626939718427206
20 https://identifiers.org/aop.events/288 51 19 20518 6213 1.230317203333933
21 https://identifiers.org/aop.events/209 617 216 20518 6213 1.1561182557303256
22 https://identifiers.org/aop.events/1945 1218 332 20518 6213 0.9001698594265903
23 https://identifiers.org/aop.events/1087 528 153 20518 6213 0.9569542601290549
24 https://identifiers.org/aop.events/1538 34 13 20518 6213 1.2626939718427206
25 https://identifiers.org/aop.events/341 10 5 20518 6213 1.6512151939481732
26 https://identifiers.org/aop.events/1090 459 129 20518 6213 0.9281340305852477
27 https://identifiers.org/aop.events/352 398 76 20518 6213 0.6306148479400058
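
As a worked example of the formula used in Step 51, (n/N)/(b/B), the sketch below recomputes the enrichment score for two rows of the table above. The function name `enrichment_score` is hypothetical; it returns a float rather than the formatted string used in the notebook.

```python
def enrichment_score(n, N, b, B):
    """(n / N) / (b / B): share of significant genes in the KE relative to the share in the whole dataset."""
    return (n / N) / (b / B)


# Row 0 of the table above: KE 1495 (N = 253, n = 49).
es_ke1495 = enrichment_score(n=49, N=253, b=6213, B=20518)

# Row 25 of the table above: KE 341 (N = 10, n = 5).
es_ke341 = enrichment_score(n=5, N=10, b=6213, B=20518)

print(es_ke1495, es_ke341)
```

A score above 1 means the KE contains a larger share of significant genes than the dataset as a whole (b/B = 6213/20518, roughly 0.30), as for KE 341; a score below 1 means a smaller share, as for KE 1495.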

Step 52. The following for loop will be used to calculate the hypergeometric p-value for each Key Event and save the result as a separate column in the dataframe. This requires some intermediate steps to manipulate the dataframe.

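p_value_dataframe3=[]

for index, row in Copy_Final_DataFrame_ES.iterrows():

    M = row['B']  # population size: all genes in the DEG table
    n = row['b']  # successes in the population: all significant genes
    N = row['N']  # sample size: genes linked to the KE
    k = row['n']  # successes in the sample: significant genes linked to the KE

    hpd = ss.hypergeom(M, n, N)
    p = hpd.pmf(k)  # probability of observing exactly k significant genes
    p_value_dataframe3.append(p)
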
Hypergeometricpvalue_dataframe3=pd.DataFrame(p_value_dataframe3)
Hypergeometricpvalue_dataframe3.columns= ['Hypergeometric p-value']
merged_finaltable_AgNp_24h=pd.concat([Copy_Final_DataFrame_ES,Hypergeometricpvalue_dataframe3],axis=1)
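
For reference, `pmf(k)` is the probability of drawing exactly k significant genes, whereas an over-representation p-value is usually taken as the upper tail P(X >= k), which `scipy.stats.hypergeom` exposes as `sf(k - 1)`. The sketch below uses only the standard library to compute both quantities for one KE so that they can be compared. The helper names `hypergeom_pmf` and `hypergeom_upper_tail` are hypothetical and not part of the notebook.

```python
from math import comb


def hypergeom_pmf(k, M, n, N):
    """P(X = k): exactly k significant genes among N drawn, given n significant genes in a population of M."""
    return comb(n, k) * comb(M - n, N - k) / comb(M, N)


def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k): k or more significant genes among N drawn."""
    return sum(hypergeom_pmf(i, M, n, N) for i in range(k, min(n, N) + 1))


# KE 244 in the AgNP 24H comparison: N = 417 genes, n = 135 significant, B = 20518, b = 6213.
point_probability = hypergeom_pmf(135, M=20518, n=6213, N=417)
upper_tail = hypergeom_upper_tail(135, M=20518, n=6213, N=417)

print(f"P(X = 135)  = {point_probability:.6f}")
print(f"P(X >= 135) = {upper_tail:.6f}")
```

The first value corresponds to what the loop above stores for this KE. The second is larger, because it also includes every outcome more extreme than the observed one, so the choice between the two can change which KEs pass the 0.05 threshold.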

Section 7.4. Filtering the results for significant KEs

In this section, the results will be filtered to only include significant KEs. Significant KEs have an enrichment score above 1 and a hypergeometric p-value below 0.05.

Step 53. Lastly, we filter the results to showcase the significant KEs for the comparison: AgNP 24H.

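# Enrichmentscore is stored as a string, so convert it to a number before comparing (string comparison is lexicographic)
filteredversion_AgNP_24h= merged_finaltable_AgNp_24h[(pd.to_numeric(merged_finaltable_AgNp_24h['Enrichmentscore']) > 1) & (merged_finaltable_AgNp_24h['Hypergeometric p-value'] < 0.05)]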
filteredversion_AgNP_24h

KEID N n B b Enrichmentscore Hypergeometric p-value
2 https://identifiers.org/aop.events/244 417 135 20518 6213 1.069132139966443 0.027258
3 https://identifiers.org/aop.events/41 275 103 20518 6213 1.236910290739359 0.001904
8 https://identifiers.org/aop.events/1917 166 65 20518 6213 1.2931203326100151 0.003231
10 https://identifiers.org/aop.events/1392 102 39 20518 6213 1.2626939718427206 0.018653
21 https://identifiers.org/aop.events/209 617 216 20518 6213 1.1561182557303256 0.001287
# Work on an explicit copy so the assignments below do not trigger a SettingWithCopyWarning
filteredversion_AgNP_24h = filteredversion_AgNP_24h.copy()

# Ensure numeric types
filteredversion_AgNP_24h['Hypergeometric p-value'] = pd.to_numeric(filteredversion_AgNP_24h['Hypergeometric p-value'], errors='coerce')
filteredversion_AgNP_24h['Enrichmentscore'] = pd.to_numeric(filteredversion_AgNP_24h['Enrichmentscore'], errors='coerce')
filteredversion_AgNP_24h['combined_score'] = -np.log10(filteredversion_AgNP_24h['Hypergeometric p-value']) * filteredversion_AgNP_24h['Enrichmentscore']

# Sort by combined score (highest first)
C3_sorted = filteredversion_AgNP_24h.sort_values(by='combined_score', ascending=False)

# Save the ranked table to Excel
C3_sorted.to_excel('ConsistentKE-AgNP24h.xlsx')
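
The ranking by combined score, defined in the code above as -log10(p) multiplied by the enrichment score, can be illustrated with the five significant KEs from the filtered table. This is a minimal sketch using the p-values as displayed (rounded to six decimals), so the scores are approximate. The names `combined_score`, `significant_KEs` and `ranked` are hypothetical.

```python
import math


def combined_score(p_value, enrichment_score):
    """-log10(p) * ES: grows with a smaller p-value and with a stronger enrichment."""
    return -math.log10(p_value) * enrichment_score


# (Hypergeometric p-value, Enrichmentscore) per KE, taken from the filtered AgNP 24H table.
significant_KEs = {
    "https://identifiers.org/aop.events/244": (0.027258, 1.069132139966443),
    "https://identifiers.org/aop.events/41": (0.001904, 1.236910290739359),
    "https://identifiers.org/aop.events/1917": (0.003231, 1.2931203326100151),
    "https://identifiers.org/aop.events/1392": (0.018653, 1.2626939718427206),
    "https://identifiers.org/aop.events/209": (0.001287, 1.1561182557303256),
}

# Rank the KEs from highest to lowest combined score.
ranked = sorted(
    significant_KEs,
    key=lambda keid: combined_score(*significant_KEs[keid]),
    reverse=True,
)

for keid in ranked:
    print(keid, round(combined_score(*significant_KEs[keid]), 3))
```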

Section 7.5. Calculation of percent gene overlap to ORA

Section 7.5.1 Creation of the significant KEs table

In this section, the merged dataframe is filtered to retrieve the genes connected to only the significant KEs.

Step 54. The significant KE table is created by selecting the significant KEs from the previous mergeddataframe_final2 dataframe.

# KEs that passed the filter in Section 7.4
significant_KEIDs3=['https://identifiers.org/aop.events/244','https://identifiers.org/aop.events/41','https://identifiers.org/aop.events/1917','https://identifiers.org/aop.events/1392','https://identifiers.org/aop.events/209']
significantKEID_genetable3=mergeddataframe_final2[mergeddataframe_final2['KEID'].isin(significant_KEIDs3)]
significantKEIDgenetable3=significantKEID_genetable3.drop(columns=['WPtitle','ID'])
significantKEIDgenetable3

KEID gene Entrez.Gene
1221 https://identifiers.org/aop.events/244 CASP2 835
1222 https://identifiers.org/aop.events/244 RTCB 51493
1223 https://identifiers.org/aop.events/244 BCL2 596
1224 https://identifiers.org/aop.events/244 BCL2 100049703
1225 https://identifiers.org/aop.events/244 BCL2L11 10018
... ... ... ...
13650 https://identifiers.org/aop.events/209 FANCD2 2177
13651 https://identifiers.org/aop.events/209 RPA1 6117
13652 https://identifiers.org/aop.events/209 PCNA 5111
13653 https://identifiers.org/aop.events/209 RFC3 5983
13654 https://identifiers.org/aop.events/209 FAAP24 91442

1577 rows × 3 columns

Section 7.5.2 Significant ORA pathway table plus splitting

In this section, the significant ORA pathway table is created.

Step 55. The significant ORA pathway table is created using the significantly enriched pathways identified from the ORA analysis. This requires data manipulation to restructure the table so that each gene of an enriched pathway is placed on its own row.

datafile_ORA3 = pd.read_csv("C:/Users/shaki/Downloads/ORA_tables_for_comparison/Comparison 3-AgNP-24H.txt", sep='\t')
datafileORA3=pd.DataFrame(datafile_ORA3)
filtereddatafileORA_3=datafileORA3[datafileORA3['Adjusted P-value'] < 0.05]
filtereddatafileORA_3

Gene_set Term P-value Adjusted P-value Old P-value Old adjusted P-value Odds Ratio Combined Score Genes
0 WikiPathways_2024_Human Ciliopathies WP4803 0.000002 0.001669 0 0 2.050893 26.87625 INVS;GALNT11;DYNC2I1;ODAD4;TRAF3IP1;IFT172;CEP...
1 WikiPathways_2024_Human Genes Related To Primary Cilium Development Ba... 0.000006 0.002491 0 0 2.477324 29.75603 DYNC2I1;TTC23;TRAF3IP1;IFT172;CEP19;CEP120;CBY...
2 WikiPathways_2024_Human Pluripotent Stem Cell Differentiation Pathway ... 0.000100 0.020127 0 0 3.121977 28.75355 ALK;CSF1R;EPO;PDGFA;FGF1;FGF4;INS;NT5E;FGF8;CX...
3 WikiPathways_2024_Human Bardet Biedl Syndrome WP5234 0.000119 0.020127 0 0 2.299628 20.77129 INVS;DYNC2I1;CEP104;TRAF3IP1;IFT172;PKD1L1;PKD...
4 WikiPathways_2024_Human NRF2 Pathway WP2884 0.000143 0.020127 0 0 1.918783 16.99291 SERPINA1;HSP90AB1;SRXN1;SLC2A1;KEAP1;SLC2A2;SL...
5 WikiPathways_2024_Human Nuclear Receptors Meta Pathway WP2882 0.000147 0.020127 0 0 1.556285 13.73149 KEAP1;IRS2;AHR;RGS2;SCP2;FTH1;PDK4;CYP1B1;ACAA...
6 WikiPathways_2024_Human Photodynamic Therapy Induced NFE2L2 NRF2 Survi... 0.000268 0.029633 0 0 4.723248 38.84128 ABCC3;ABCC4;JUN;ABCC2;SRXN1;EPHX1;ABCC6;KEAP1;...
7 WikiPathways_2024_Human Proximal Tubule Transport WP4917 0.000289 0.029633 0 0 2.611755 21.28253 ATP6V1A;SLC47A1;SLC1A1;SLC2A1;SLC2A2;SLC5A1;SL...
8 WikiPathways_2024_Human Osteoblast Differentiation And Related Disease... 0.000326 0.029714 0 0 1.950916 15.66235 IHH;FZD10;FGF1;FGF3;GLI3;FGF4;PIK3C2B;GLI2;FGF...
9 WikiPathways_2024_Human Vitamin D Receptor Pathway WP2877 0.000375 0.030776 0 0 1.708108 13.47314 ITGAM;IL25;HILPDA;TNFAIP3;GXYLT2;SLC2A4;TREM1;...
10 WikiPathways_2024_Human G1 To S Cell Cycle Control WP45 0.000550 0.041030 0 0 2.368871 17.77806 CDKN1C;PCNA;MCM7;ATF6B;PRIM1;CCND3;CCNB1;CCND2...
# Make sure 'Combined Score' is numeric
datafileORA3['Combined Score'] = pd.to_numeric(datafileORA3['Combined Score'], errors='coerce')

# Sort by 'Combined Score' in descending order
ranked_df3 = datafileORA3.sort_values(by='Combined Score', ascending=False)

# (Optional) Save to Excel
ranked_df3.to_excel('AgNP24H-ORAtable-thesis(EMEXP3583).xlsx', index=False)
dropped_datafileORA_df3=filtereddatafileORA_3.drop(['Adjusted P-value','Odds Ratio','Old P-value','Gene_set','P-value','Old adjusted P-value','Combined Score'],axis=1)
droppeddatafileORAdf3=dropped_datafileORA_df3.copy()
droppeddatafileORAdf3['Genes']= droppeddatafileORAdf3['Genes'].replace({';':','},regex=True)
df3_ORApathwaytable=droppeddatafileORAdf3.copy()
df3_ORApathwaytable['Genes'] = df3_ORApathwaytable['Genes'].astype(str)
df3_ORApathwaytable['Genes'] = df3_ORApathwaytable['Genes'].str.split(',')
exploded_df3_ORApathwaytable = df3_ORApathwaytable.explode('Genes', ignore_index=True)
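
The restructuring performed above (one row per pathway-gene pair) can be sketched without pandas as follows. The two rows are a hypothetical, shortened excerpt of the filtered ORA table, with only the first three genes of each pathway kept; the names `ora_rows` and `exploded_rows` are illustrative.

```python
# Hypothetical two-row excerpt of the filtered ORA table (gene lists shortened).
ora_rows = [
    {"Term": "Ciliopathies WP4803", "Genes": "INVS;GALNT11;DYNC2I1"},
    {"Term": "NRF2 Pathway WP2884", "Genes": "SERPINA1;HSP90AB1;SRXN1"},
]

# Split each semicolon-separated gene string and emit one row per gene.
exploded_rows = [
    {"Term": row["Term"], "Genes": gene}
    for row in ora_rows
    for gene in row["Genes"].split(";")
]

for row in exploded_rows:
    print(row["Term"], row["Genes"])
```
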
Section 7.5.3 For loop to get overlapping genes

In this section, the number of overlapping genes between the significant enrichment score-based Key Events and the enriched pathways from ORA is calculated.

Step 56. Next, the significant KE table and the ORA pathway table are each converted into a dictionary in which the genes are grouped into one set per key (KEID or pathway term). This is followed by running a for loop to calculate the number of overlapping genes along with their symbols.
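
The idea behind this step is a plain set intersection for every pathway-KE pair. The sketch below shows it on deliberately shortened, hypothetical gene sets: the memberships are taken from this comparison's ORA table and overlap output, but the real sets are much larger. The names `ORA_gene_sets`, `KE_gene_sets` and `overlaps` are illustrative.

```python
# Shortened, hypothetical gene sets; the real sets come from the ORA table and the KE table.
ORA_gene_sets = {
    "G1 To S Cell Cycle Control WP45": {"CCNA1", "CCNB1", "TP53", "PCNA"},
}
KE_gene_sets = {
    "https://identifiers.org/aop.events/41": {"CCNA1", "CCNB1", "TP53"},
    "https://identifiers.org/aop.events/1392": {"HMOX1", "GSR"},
}

overlaps = {}
for term, ora_genes in ORA_gene_sets.items():
    for keid, ke_genes in KE_gene_sets.items():
        shared = ora_genes & ke_genes  # genes present in both sets
        overlaps[(term, keid)] = {
            "overlapping genes": shared,
            "number of genes that overlap": len(shared),
        }

for (term, keid), result in overlaps.items():
    print(term, keid, result["number of genes that overlap"])
```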

ORA_gene_sets3 = exploded_df3_ORApathwaytable.groupby('Term')['Genes'].apply(set).to_dict() 
SignificantKE_gene_sets3 = significantKEIDgenetable3.groupby('KEID')['gene'].apply(set).to_dict()  
overlapping_genes_betweenORA_and_significantKEs3 = {}

for term, ORA_genes in ORA_gene_sets3.items():
    for KEID, KEID_genes in SignificantKE_gene_sets3.items():
        overlap = ORA_genes.intersection(KEID_genes)
        print(f"{term} x {KEID}: {len(overlap)} overlaps")
        overlapping_genes_betweenORA_and_significantKEs3[(term, KEID)] = {
                'overlapping genes': overlap,
                'number of genes that overlap': len(overlap)
            }
if overlapping_genes_betweenORA_and_significantKEs3:
    print("\ntitle of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:")
    for (term, KEID), result in overlapping_genes_betweenORA_and_significantKEs3.items():
        print(f"Term: {term}, KEID: {KEID}, Title of overlapping gene(s): {result['overlapping genes']}, number: {result['number of genes that overlap']}")
else:
    print("No overlapping genes")
Bardet Biedl Syndrome WP5234 x https://identifiers.org/aop.events/1392: 0 overlaps
Bardet Biedl Syndrome WP5234 x https://identifiers.org/aop.events/1917: 0 overlaps
Bardet Biedl Syndrome WP5234 x https://identifiers.org/aop.events/209: 1 overlaps
Bardet Biedl Syndrome WP5234 x https://identifiers.org/aop.events/244: 0 overlaps
Bardet Biedl Syndrome WP5234 x https://identifiers.org/aop.events/41: 0 overlaps
Ciliopathies WP4803 x https://identifiers.org/aop.events/1392: 0 overlaps
Ciliopathies WP4803 x https://identifiers.org/aop.events/1917: 0 overlaps
Ciliopathies WP4803 x https://identifiers.org/aop.events/209: 1 overlaps
Ciliopathies WP4803 x https://identifiers.org/aop.events/244: 0 overlaps
Ciliopathies WP4803 x https://identifiers.org/aop.events/41: 0 overlaps
G1 To S Cell Cycle Control WP45 x https://identifiers.org/aop.events/1392: 0 overlaps
G1 To S Cell Cycle Control WP45 x https://identifiers.org/aop.events/1917: 0 overlaps
G1 To S Cell Cycle Control WP45 x https://identifiers.org/aop.events/209: 6 overlaps
G1 To S Cell Cycle Control WP45 x https://identifiers.org/aop.events/244: 10 overlaps
G1 To S Cell Cycle Control WP45 x https://identifiers.org/aop.events/41: 3 overlaps
Genes Related To Primary Cilium Development Based On CRISPR WP4536 x https://identifiers.org/aop.events/1392: 0 overlaps
Genes Related To Primary Cilium Development Based On CRISPR WP4536 x https://identifiers.org/aop.events/1917: 0 overlaps
Genes Related To Primary Cilium Development Based On CRISPR WP4536 x https://identifiers.org/aop.events/209: 0 overlaps
Genes Related To Primary Cilium Development Based On CRISPR WP4536 x https://identifiers.org/aop.events/244: 0 overlaps
Genes Related To Primary Cilium Development Based On CRISPR WP4536 x https://identifiers.org/aop.events/41: 0 overlaps
NRF2 Pathway WP2884 x https://identifiers.org/aop.events/1392: 5 overlaps
NRF2 Pathway WP2884 x https://identifiers.org/aop.events/1917: 60 overlaps
NRF2 Pathway WP2884 x https://identifiers.org/aop.events/209: 60 overlaps
NRF2 Pathway WP2884 x https://identifiers.org/aop.events/244: 60 overlaps
NRF2 Pathway WP2884 x https://identifiers.org/aop.events/41: 60 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1392: 6 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1917: 61 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/209: 69 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/244: 66 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/41: 70 overlaps
Osteoblast Differentiation And Related Diseases WP4787 x https://identifiers.org/aop.events/1392: 1 overlaps
Osteoblast Differentiation And Related Diseases WP4787 x https://identifiers.org/aop.events/1917: 0 overlaps
Osteoblast Differentiation And Related Diseases WP4787 x https://identifiers.org/aop.events/209: 14 overlaps
Osteoblast Differentiation And Related Diseases WP4787 x https://identifiers.org/aop.events/244: 11 overlaps
Osteoblast Differentiation And Related Diseases WP4787 x https://identifiers.org/aop.events/41: 2 overlaps
Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612 x https://identifiers.org/aop.events/1392: 4 overlaps
Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612 x https://identifiers.org/aop.events/1917: 7 overlaps
Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612 x https://identifiers.org/aop.events/209: 10 overlaps
Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612 x https://identifiers.org/aop.events/244: 8 overlaps
Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612 x https://identifiers.org/aop.events/41: 8 overlaps
Pluripotent Stem Cell Differentiation Pathway WP2848 x https://identifiers.org/aop.events/1392: 0 overlaps
Pluripotent Stem Cell Differentiation Pathway WP2848 x https://identifiers.org/aop.events/1917: 1 overlaps
Pluripotent Stem Cell Differentiation Pathway WP2848 x https://identifiers.org/aop.events/209: 4 overlaps
Pluripotent Stem Cell Differentiation Pathway WP2848 x https://identifiers.org/aop.events/244: 4 overlaps
Pluripotent Stem Cell Differentiation Pathway WP2848 x https://identifiers.org/aop.events/41: 1 overlaps
Proximal Tubule Transport WP4917 x https://identifiers.org/aop.events/1392: 0 overlaps
Proximal Tubule Transport WP4917 x https://identifiers.org/aop.events/1917: 9 overlaps
Proximal Tubule Transport WP4917 x https://identifiers.org/aop.events/209: 9 overlaps
Proximal Tubule Transport WP4917 x https://identifiers.org/aop.events/244: 9 overlaps
Proximal Tubule Transport WP4917 x https://identifiers.org/aop.events/41: 9 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/1392: 2 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/1917: 4 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/209: 10 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/244: 5 overlaps
Vitamin D Receptor Pathway WP2877 x https://identifiers.org/aop.events/41: 4 overlaps

title of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:
Term: Bardet Biedl Syndrome WP5234, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Bardet Biedl Syndrome WP5234, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): set(), number: 0
Term: Bardet Biedl Syndrome WP5234, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'INVS'}, number: 1
Term: Bardet Biedl Syndrome WP5234, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): set(), number: 0
Term: Bardet Biedl Syndrome WP5234, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): set(), number: 0
Term: Ciliopathies WP4803, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Ciliopathies WP4803, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): set(), number: 0
Term: Ciliopathies WP4803, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'INVS'}, number: 1
Term: Ciliopathies WP4803, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): set(), number: 0
Term: Ciliopathies WP4803, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): set(), number: 0
Term: G1 To S Cell Cycle Control WP45, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: G1 To S Cell Cycle Control WP45, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): set(), number: 0
Term: G1 To S Cell Cycle Control WP45, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'GADD45A', 'POLE2', 'PCNA', 'CCND3', 'CCND2', 'CDC25A'}, number: 6
Term: G1 To S Cell Cycle Control WP45, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'GADD45A', 'CCNG2', 'TP53', 'CDC25A', 'CCNB1', 'CDK1', 'MDM2', 'CCND3', 'CCND2', 'CDK6'}, number: 10
Term: G1 To S Cell Cycle Control WP45, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'CCNA1', 'CCNB1', 'TP53'}, number: 3
Term: Genes Related To Primary Cilium Development Based On CRISPR WP4536, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Genes Related To Primary Cilium Development Based On CRISPR WP4536, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): set(), number: 0
Term: Genes Related To Primary Cilium Development Based On CRISPR WP4536, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): set(), number: 0
Term: Genes Related To Primary Cilium Development Based On CRISPR WP4536, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): set(), number: 0
Term: Genes Related To Primary Cilium Development Based On CRISPR WP4536, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): set(), number: 0
Term: NRF2 Pathway WP2884, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'SOD3', 'HMOX1', 'GSR', 'GPX3', 'GCLC'}, number: 5
Term: NRF2 Pathway WP2884, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'FTH1', 'GPX3', 'G6PD', 'PGD', 'GSTA5', 'SLC39A12', 'SLC6A20', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'HSP90AB1', 'SLC6A2', 'GPX2', 'DNAJB1', 'HMOX1', 'TGFB1', 'SLC6A8', 'ABCC2', 'SLC5A5', 'CBR1', 'ABCC3', 'MAFG', 'CES5A', 'SLC39A11', 'SOD3', 'SLC5A10', 'KEAP1', 'RXRA', 'UGT1A4', 'TXNRD3', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'GCLC', 'SLC6A1', 'SLC2A4', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 60
Term: NRF2 Pathway WP2884, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'FTH1', 'GPX3', 'G6PD', 'PGD', 'GSTA5', 'SLC39A12', 'SLC6A20', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'HSP90AB1', 'SLC6A2', 'GPX2', 'DNAJB1', 'HMOX1', 'TGFB1', 'SLC6A8', 'ABCC2', 'SLC5A5', 'CBR1', 'ABCC3', 'MAFG', 'CES5A', 'SLC39A11', 'SOD3', 'SLC5A10', 'KEAP1', 'RXRA', 'UGT1A4', 'TXNRD3', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'GCLC', 'SLC6A1', 'SLC2A4', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 60
Term: NRF2 Pathway WP2884, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'FTH1', 'GPX3', 'G6PD', 'PGD', 'GSTA5', 'SLC39A12', 'SLC6A20', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'HSP90AB1', 'SLC6A2', 'GPX2', 'DNAJB1', 'HMOX1', 'TGFB1', 'SLC6A8', 'ABCC2', 'SLC5A5', 'CBR1', 'ABCC3', 'MAFG', 'CES5A', 'SLC39A11', 'SOD3', 'SLC5A10', 'KEAP1', 'RXRA', 'UGT1A4', 'TXNRD3', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'GCLC', 'SLC6A1', 'SLC2A4', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 60
Term: NRF2 Pathway WP2884, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'FTH1', 'GPX3', 'G6PD', 'PGD', 'GSTA5', 'SLC39A12', 'SLC6A20', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'HSP90AB1', 'SLC6A2', 'GPX2', 'DNAJB1', 'HMOX1', 'TGFB1', 'SLC6A8', 'ABCC2', 'SLC5A5', 'CBR1', 'ABCC3', 'MAFG', 'CES5A', 'SLC39A11', 'SOD3', 'SLC5A10', 'KEAP1', 'RXRA', 'UGT1A4', 'TXNRD3', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'GCLC', 'SLC6A1', 'SLC2A4', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 60
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'SOD3', 'HMOX1', 'GSR', 'GPX3', 'CYP1A1', 'GCLC'}, number: 6
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'FTH1', 'G6PD', 'GPX3', 'GSTA5', 'PGD', 'SLC39A12', 'SLC6A20', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SRC', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'GPX2', 'HSP90AB1', 'SLC6A2', 'DNAJB1', 'HMOX1', 'TGFB1', 'SLC6A8', 'ABCC2', 'SLC5A5', 'ABCC3', 'MAFG', 'CBR1', 'CES5A', 'SLC39A11', 'SOD3', 'SLC5A10', 'UGT1A4', 'KEAP1', 'TXNRD3', 'RXRA', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'GCLC', 'SLC6A1', 'SLC2A4', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 61
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'CPT2', 'ACAA1', 'CPT1A', 'FTH1', 'G6PD', 'GPX3', 'EHHADH', 'GSTA5', 'PGD', 'SLC39A12', 'SLC6A20', 'APOA1', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'GPX2', 'HSP90AB1', 'SLC6A2', 'DNAJB1', 'CYP1A1', 'HMOX1', 'SCP2', 'TGFB1', 'SLC6A8', 'ABCC2', 'SLC5A5', 'ABCC3', 'JUN', 'MAFG', 'CBR1', 'CES5A', 'SLC39A11', 'SOD3', 'PCK1', 'SLC5A10', 'UGT1A4', 'KEAP1', 'TXNRD3', 'RXRA', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'GCLC', 'SLC6A1', 'SLC2A4', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 69
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'GADD45B', 'FTH1', 'G6PD', 'GPX3', 'GSTA5', 'PGD', 'NFKB2', 'SLC39A12', 'SLC6A20', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SRC', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'GPX2', 'HSP90AB1', 'SLC6A2', 'DNAJB1', 'HMOX1', 'CDK1', 'SCP2', 'TGFB1', 'SLC6A8', 'ABCC2', 'SLC5A5', 'ABCC3', 'JUN', 'MAFG', 'CBR1', 'CES5A', 'SLC39A11', 'SOD3', 'SLC5A10', 'UGT1A4', 'KEAP1', 'TXNRD3', 'RXRA', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'GCLC', 'SLC6A1', 'SLC2A4', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 66
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'IRS2', 'SLC2A4', 'CPT1A', 'FTH1', 'NR0B2', 'G6PD', 'GPX3', 'ABCG8', 'GSTA5', 'PGD', 'SLC39A12', 'SLC6A20', 'ABCC4', 'SLC5A12', 'SLC5A2', 'TGFBR2', 'SRC', 'SLC6A6', 'SLC5A4', 'SERPINA1', 'GPX2', 'HSP90AB1', 'SLC6A2', 'DNAJB1', 'HMOX1', 'TGFB1', 'SLC6A8', 'ABCC2', 'IP6K3', 'SLC5A5', 'ABCC3', 'MAFG', 'CBR1', 'CES5A', 'SLC39A11', 'SOD3', 'SLC5A10', 'UGT1A4', 'KEAP1', 'TXNRD3', 'RXRA', 'SLC2A1', 'NRG1', 'EGR1', 'MAFF', 'SLC39A10', 'HSPA1A', 'SLC5A9', 'SLC6A9', 'ALDH3A1', 'SRXN1', 'SLC6A5', 'BAAT', 'GSTA4', 'SLC6A19', 'SLC6A17', 'GSR', 'HSP90AA1', 'CES2', 'FASN', 'SLC6A1', 'ABCG5', 'GCLC', 'FKBP5', 'SLC6A7', 'SLC2A14', 'HBEGF', 'UGT1A9'}, number: 70
Term: Osteoblast Differentiation And Related Diseases WP4787, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'MAPK14'}, number: 1
Term: Osteoblast Differentiation And Related Diseases WP4787, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): set(), number: 0
Term: Osteoblast Differentiation And Related Diseases WP4787, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'WNT3A', 'MAPK14', 'FZD7', 'WNT1', 'WNT11', 'WNT7B', 'WNT7A', 'WNT6', 'LRP5', 'FZD10', 'FZD2', 'FZD9', 'FZD3', 'WNT10A'}, number: 14
Term: Osteoblast Differentiation And Related Diseases WP4787, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'WNT3A', 'PIK3R3', 'PIK3C2B', 'WNT1', 'WNT11', 'PIK3R1', 'WNT7B', 'WNT6', 'PIK3R5', 'WNT7A', 'WNT10A'}, number: 11
Term: Osteoblast Differentiation And Related Diseases WP4787, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'PIK3R3', 'PIK3R1'}, number: 2
Term: Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'GCLC', 'FOS', 'MAPK14', 'HMOX1'}, number: 4
Term: Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'ABCC4', 'SRXN1', 'KEAP1', 'HMOX1', 'GCLC', 'ABCC2', 'ABCC3'}, number: 7
Term: Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'ABCC4', 'SRXN1', 'MAPK14', 'KEAP1', 'HMOX1', 'GCLC', 'FOS', 'ABCC2', 'JUN', 'ABCC3'}, number: 10
Term: Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'ABCC4', 'SRXN1', 'KEAP1', 'HMOX1', 'GCLC', 'JUN', 'ABCC2', 'ABCC3'}, number: 8
Term: Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'ABCC4', 'SRXN1', 'KEAP1', 'HMOX1', 'GCLC', 'ABCC2', 'ABCC3', 'EPHX1'}, number: 8
Term: Pluripotent Stem Cell Differentiation Pathway WP2848, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Pluripotent Stem Cell Differentiation Pathway WP2848, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'TGFB1'}, number: 1
Term: Pluripotent Stem Cell Differentiation Pathway WP2848, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'WNT3A', 'TGFB1', 'WNT7B', 'WNT1'}, number: 4
Term: Pluripotent Stem Cell Differentiation Pathway WP2848, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'WNT3A', 'TGFB1', 'WNT7B', 'WNT1'}, number: 4
Term: Pluripotent Stem Cell Differentiation Pathway WP2848, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'TGFB1'}, number: 1
Term: Proximal Tubule Transport WP4917, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Proximal Tubule Transport WP4917, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'ABCC4', 'SLC5A2', 'SLC2A1', 'SLC6A19', 'ABCC2', 'SLC5A5', 'SLC6A20'}, number: 9
Term: Proximal Tubule Transport WP4917, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'ABCC4', 'SLC5A2', 'SLC2A1', 'SLC6A19', 'ABCC2', 'SLC5A5', 'SLC6A20'}, number: 9
Term: Proximal Tubule Transport WP4917, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'ABCC4', 'SLC5A2', 'SLC2A1', 'SLC6A19', 'ABCC2', 'SLC5A5', 'SLC6A20'}, number: 9
Term: Proximal Tubule Transport WP4917, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'SLC2A2', 'SLC5A1', 'ABCC4', 'SLC5A2', 'SLC2A1', 'SLC6A19', 'ABCC2', 'SLC5A5', 'SLC6A20'}, number: 9
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'CYP1A1', 'NOX1'}, number: 2
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'G6PD', 'RXRA', 'TGFB1', 'SLC2A4'}, number: 4
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'GADD45A', 'SFRP1', 'RXRA', 'TGFB1', 'NOX1', 'LRP5', 'G6PD', 'CYP1A1', 'SLC2A4', 'IRF5'}, number: 10
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/244, Title of overlapping gene(s): {'GADD45A', 'RXRA', 'TGFB1', 'G6PD', 'SLC2A4'}, number: 5
Term: Vitamin D Receptor Pathway WP2877, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'G6PD', 'RXRA', 'TGFB1', 'SLC2A4'}, number: 4
Section 7.5.4 Tabulation gene overlap

In this section, a table is created that contains the number of overlapping genes and number of total genes in preparation for section 7.5.5.

final_geneoverlaptable_AGNP_24H=pd.DataFrame.from_dict(overlapping_genes_betweenORA_and_significantKEs3,orient='index')
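
As a rough illustration of what this conversion produces, the sketch below builds a small dictionary of the same shape (tuple keys, dict values) and converts it in the same way. The pathway names, KE labels and gene symbols are made up for the example.

```python
import pandas as pd

# Toy stand-in for the overlap dictionary: keys are (Term, KEID) tuples,
# values hold the overlap set and its size. All entries are hypothetical.
toy_overlaps = {
    ("Pathway A", "KE:1"): {"overlapping genes": {"G1", "G2"}, "number of genes that overlap": 2},
    ("Pathway A", "KE:2"): {"overlapping genes": set(), "number of genes that overlap": 0},
    ("Pathway B", "KE:1"): {"overlapping genes": {"G3"}, "number of genes that overlap": 1},
}

toy_table = pd.DataFrame.from_dict(toy_overlaps, orient="index")

# The tuple keys become a two-level index (Term, KEID);
# the keys of the inner dicts become the columns.
print(toy_table.index.nlevels)
print(list(toy_table.columns))
```

Because the index has two levels, it has to be reset before the table can be merged on 'Term', which is what happens in Step 58.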
Section 7.5.5 Percent overlap calculation

In this section, the percent overlap for the gene sets is calculated.

Step 57. Lastly, the percent overlap is calculated and added to the dataframe as a column. First, a for loop counts the total number of genes belonging to each enriched ORA pathway.

variable_count3 = {}

# Each row of the exploded table holds one gene of one ORA term, so counting rows per term gives the gene total.
for index, row in exploded_df3_ORApathwaytable.iterrows():
    term = row['Term']

    if term not in variable_count3:
        variable_count3[term] = 1
    else:
        variable_count3[term] += 1

print("The total number of genes: ")
print(variable_count3)
The total number of genes: 
{'Ciliopathies WP4803': 81, 'Genes Related To Primary Cilium Development Based On CRISPR WP4536': 50, 'Pluripotent Stem Cell Differentiation Pathway WP2848': 26, 'Bardet Biedl Syndrome WP5234': 41, 'NRF2 Pathway WP2884': 60, 'Nuclear Receptors Meta Pathway WP2882': 118, 'Photodynamic Therapy Induced NFE2L2 NRF2 Survival Signaling WP3612': 15, 'Proximal Tubule Transport WP4917': 29, 'Osteoblast Differentiation And Related Diseases WP4787': 51, 'Vitamin D Receptor Pathway WP2877': 73, 'G1 To S Cell Cycle Control WP45': 31}
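
The same tally can be sketched with the standard library. This is only an illustration on made-up rows, not the notebook's method; the pathway names and gene symbols are hypothetical.

```python
from collections import Counter

# Toy rows standing in for the exploded ORA table: one (Term, gene) pair per row.
toy_rows = [
    ("Pathway A", "G1"),
    ("Pathway A", "G2"),
    ("Pathway B", "G3"),
    ("Pathway A", "G4"),
]

# Count how many rows (and therefore genes) each term has.
genes_per_term = Counter(term for term, _gene in toy_rows)
print(dict(genes_per_term))
```

Note that this, like the loop above, counts rows, so it equals the number of genes only if each gene appears once per term.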

Step 58. The result is converted into a dataframe and added to the final dataframe. This is followed by some data manipulation prior to calculation of gene set overlap.

variable_count_df3=pd.DataFrame.from_dict(variable_count3,orient='index')
reset_variable_count_df3 = variable_count_df3.reset_index()
reset_variable_count_df3.columns = ['Term', 'Total number of genes']
Genesetoverlaptable_AGNP24H=final_geneoverlaptable_AGNP_24H.reset_index(level=[1])
Genesetoverlaptable_AGNP24H.reset_index(inplace=True)
Genesetoverlaptable_AGNP24H.columns= ['Term','KEID','overlapping genes','number of genes that overlap']
tabulation_AGNP24h=pd.merge(reset_variable_count_df3,Genesetoverlaptable_AGNP24H, on='Term')
def calculate_Genesetoverlap_Score(row):
        return f"{(row['number of genes that overlap']/row['Total number of genes'])*100}"

tabulation_AGNP24h.loc[:,'Percent geneset overlap']= tabulation_AGNP24h.apply(calculate_Genesetoverlap_Score, axis=1)
tabulation_AGNP24h.to_excel('geneoverlap-calculation-AgNP24h.xlsx')
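
Because calculate_Genesetoverlap_Score returns an f-string, the 'Percent geneset overlap' column holds text rather than numbers. If numeric values are preferred, for example for sorting, one possible variant is a plain column division, sketched below. The overlap counts and totals are taken from the output above; the shortened KEID labels are an assumption made for readability.

```python
import pandas as pd

# Three rows with counts taken from the AgNP 24H output above.
toy = pd.DataFrame({
    "Term": ["NRF2 Pathway WP2884", "NRF2 Pathway WP2884", "G1 To S Cell Cycle Control WP45"],
    "KEID": ["aop.events/1392", "aop.events/1917", "aop.events/209"],  # shortened labels
    "number of genes that overlap": [5, 60, 6],
    "Total number of genes": [60, 60, 31],
})

# Vectorised division keeps the result as floating-point numbers.
toy["Percent geneset overlap"] = (
    toy["number of genes that overlap"] / toy["Total number of genes"] * 100
)
print(toy[["Term", "KEID", "Percent geneset overlap"]].round(2))
```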

Section 8. Comparison 4: AgNP 48H

Section 8.1 Calculation of n variable

In this section, variable n will be calculated for comparison 4: AgNP 48H versus control.

Step 59. The table containing the differential expressed genes for comparison 4 is loaded with the filter for significance.

AgNP_48H_DEG= pd.read_csv('topTable_AgNP_12.1_48 - H2O.control_.0.0_48.tsv',sep='\t')
AgNP48H_DEG= AgNP_48H_DEG[AgNP_48H_DEG['adj. p-value'] < 0.05]
AgNP48h_DEG= AgNP48H_DEG.copy()  
AgNP48h_DEG.rename(columns={AgNP48h_DEG.columns[0]: 'Entrez.Gene'}, inplace=True)
AgNP48h_DEG['Entrez.Gene'] = AgNP48h_DEG['Entrez.Gene'].astype(str)

Step 60. Here, the results of the DEG table are integrated into the mergeddataframe dataframe. This is followed by adjustment of the dataframe columns to remove non-relevant columns.

merged_dataframe_DEG_AgNP_48h= pd.merge(mergeddataframe,AgNP48h_DEG, on='Entrez.Gene')
merged_dataframe_DEG_AgNP_48h

Step 61. The following for loop iterates over the key events to retrieve the n variable. It is comparable to the for loop for N, but adds a condition that only counts genes with an adjusted p-value smaller than 0.05.

variable_n_dictionary_count4= {}

for index, row in merged_dataframe_DEG_AgNP_48h.iterrows():
    unique_KE = row['KEID']
    gene_expression_value = row['adj. p-value']

    if gene_expression_value < 0.05:
    
        if unique_KE not in variable_n_dictionary_count4:
            variable_n_dictionary_count4[unique_KE] = 1
        else:
            variable_n_dictionary_count4[unique_KE] += 1

print("The total number of significant genes: ")
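
For comparison, the count of significant genes per KE can also be sketched with a filter followed by value_counts. The toy KEIDs and p-values below are made up; only the column names follow the notebook.

```python
import pandas as pd

# Toy stand-in for the merged KE/DEG table.
toy_merged = pd.DataFrame({
    "KEID": ["KE:1", "KE:1", "KE:2", "KE:2", "KE:2"],
    "adj. p-value": [0.01, 0.20, 0.03, 0.04, 0.50],
})

# Keep only the significant rows, then count how many remain per KE.
significant = toy_merged[toy_merged["adj. p-value"] < 0.05]
n_per_ke = significant["KEID"].value_counts().to_dict()
print(n_per_ke)
```

A KE without any significant gene does not appear in the result at all, which is also the case for the dictionary built by the loop above.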

Step 62. The output of the n variable dictionary is saved as a dataframe and integrated as a separate column into a dataframe.

n_variable_dataframe4=pd.DataFrame.from_dict(variable_n_dictionary_count4,orient='index')
n_variable_dataframe4_reset = n_variable_dataframe4.reset_index()
n_variable_dataframe4_reset.columns = ['KEID', 'n']
merged_dataframe3= pd.merge(mergeddataframeDEG, n_variable_dataframe4_reset, on='KEID')

Section 8.2. Calculation of variable B and variable b.

In this section, variable B and variable b are calculated.

Step 63. Variable B is the number of rows in the full DEG table, i.e. all genes in the table regardless of significance.

B=len(AgNP_48H_DEG.index)
B
20518

Step 64. Variable b is the number of rows in the same DEG table after filtering for significance (adjusted p-value below 0.05).

AgNP_48H_DEG_filtered=AgNP_48H_DEG[AgNP_48H_DEG['adj. p-value'] < 0.05]
b=len(AgNP_48H_DEG_filtered)
b
2319

Section 8.3. Calculation of enrichment score and hypergeometric p-value

In this section, the enrichment score and hypergeometric p-value will be calculated. The four variables of the enrichment score (N, n, B and b) are first collected per KE in one dataframe, after which the formula is applied and the result is stored as an additional column.

Step 65. The final dataframe will be created that contains the KEID and the four variables: variable N, variable n, variable B and variable b.

Final_dataframe_ES= merged_dataframe3.loc[:, ['KEID','N','n']]
Final_dataframe_ES['B']=pd.Series([20518 for x in range(len(Final_dataframe_ES.index))])
Final_dataframe_ES['b']=pd.Series([2319 for x in range(len(Final_dataframe_ES.index))])
Final_Dataframe_ES=Final_dataframe_ES.drop_duplicates(subset=['KEID'],keep='first')
Final_Dataframe_ES.reset_index(drop=True,inplace=True)
Copy_Final_DataFrame_ES=Final_Dataframe_ES.copy()

Step 66. The following function will be used to calculate the enrichment score for individual key events, and the results will be saved as a separate column in the dataframe.

def calculate_Enrichment_Score(row):
        return f"{(row['n']/row['N'])/(row['b']/row['B'])}"

Copy_Final_DataFrame_ES.loc[:,'Enrichmentscore']= Copy_Final_DataFrame_ES.apply(calculate_Enrichment_Score,axis=1)
Copy_Final_DataFrame_ES

KEID N n B b Enrichmentscore
0 https://identifiers.org/aop.events/1495 253 36 20518 2319 1.2589725365472033
1 https://identifiers.org/aop.events/1668 156 19 20518 2319 1.0776141351820523
2 https://identifiers.org/aop.events/244 417 70 20518 2319 1.4852387171763237
3 https://identifiers.org/aop.events/41 275 53 20518 2319 1.7052083578344897
4 https://identifiers.org/aop.events/1539 170 20 20518 2319 1.0409152017857595
5 https://identifiers.org/aop.events/618 240 19 20518 2319 0.700449187868334
6 https://identifiers.org/aop.events/1497 528 70 20518 2319 1.1730010323153919
7 https://identifiers.org/aop.events/1115 34 8 20518 2319 2.081830403571519
8 https://identifiers.org/aop.events/1917 166 32 20518 2319 1.7055959932875098
9 https://identifiers.org/aop.events/1633 1056 140 20518 2319 1.1730010323153919
10 https://identifiers.org/aop.events/1392 102 24 20518 2319 2.081830403571519
11 https://identifiers.org/aop.events/1582 51 10 20518 2319 1.7348586696429327
12 https://identifiers.org/aop.events/1896 205 37 20518 2319 1.596916248593275
13 https://identifiers.org/aop.events/265 268 30 20518 2319 0.9904230464752564
14 https://identifiers.org/aop.events/1750 528 70 20518 2319 1.1730010323153919
15 https://identifiers.org/aop.events/1848 195 26 20518 2319 1.179703895357194
16 https://identifiers.org/aop.events/890 34 8 20518 2319 2.081830403571519
17 https://identifiers.org/aop.events/149 1056 140 20518 2319 1.1730010323153919
18 https://identifiers.org/aop.events/1579 353 34 20518 2319 0.8521940320568966
19 https://identifiers.org/aop.events/249 34 8 20518 2319 2.081830403571519
20 https://identifiers.org/aop.events/288 51 10 20518 2319 1.7348586696429327
21 https://identifiers.org/aop.events/209 617 93 20518 2319 1.3336198817044458
22 https://identifiers.org/aop.events/1945 1218 134 20518 2319 0.9734009974006406
23 https://identifiers.org/aop.events/1087 528 70 20518 2319 1.1730010323153919
24 https://identifiers.org/aop.events/1538 34 8 20518 2319 2.081830403571519
25 https://identifiers.org/aop.events/341 10 1 20518 2319 0.8847779215178957
26 https://identifiers.org/aop.events/1090 459 49 20518 2319 0.9445341645833745
27 https://identifiers.org/aop.events/352 398 47 20518 2319 1.0448382490286707
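
As a quick check of the formula, the enrichment score of a single KE can be recomputed by hand. The sketch below uses the values of row 0 (KE 1495) from the table above; the function name is chosen for this example only.

```python
def enrichment_score(n, N, b, B):
    """Fraction of significant genes in the KE gene set divided by
    the fraction of significant genes in the whole DEG table."""
    return (n / N) / (b / B)

# Values of row 0 in the table above (KE 1495).
es = enrichment_score(n=36, N=253, b=2319, B=20518)
print(es)
```

A score above 1 means that the KE gene set contains proportionally more significant genes than the DEG table as a whole.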

Step 67. The following for loop will be used to calculate the hypergeometric p-value for individual Key Events and save the result as a separate column in the dataframe. This requires some intermediate steps for manipulation of the dataframe.

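p_value_dataframe4=[]

for index, row in Copy_Final_DataFrame_ES.iterrows():

    M = row['B']  # population size: all genes in the DEG table
    n = row['b']  # significant genes in the DEG table
    N = row['N']  # genes linked to this KE (number of draws)
    k = row['n']  # significant genes linked to this KE

    # Hypergeometric distribution for this KE; pmf(k) is the probability of exactly k significant genes
    hpd = ss.hypergeom(M, n, N)
    p = hpd.pmf(k)
    p_value_dataframe4.append(p)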
             
Hypergeometricpvalue_dataframe4=pd.DataFrame(p_value_dataframe4)
Hypergeometricpvalue_dataframe4.columns= ['Hypergeometric p-value']
merged_finaltable4=pd.concat([Copy_Final_DataFrame_ES,Hypergeometricpvalue_dataframe4],axis=1)
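
The loop above stores pmf(k), which is the probability of observing exactly k significant genes. Over-representation tests often report the upper tail instead, i.e. the probability of observing k or more (in SciPy this would be sf(k - 1)). The notebook does not state which of the two is intended, so the following is only a minimal standard-library sketch of the difference, using made-up numbers and the same argument order as ss.hypergeom (M, n, N).

```python
from math import comb

def hypergeom_pmf(k, M, n, N):
    """P(X == k) when drawing N items from M, of which n are 'successes'."""
    return comb(n, k) * comb(M - n, N - k) / comb(M, N)

def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k): sum of the point probabilities from k up to the maximum."""
    return sum(hypergeom_pmf(i, M, n, N) for i in range(k, min(n, N) + 1))

# Made-up example: 10 genes in total, 4 significant, a gene set of 3,
# of which 2 are significant.
point = hypergeom_pmf(2, M=10, n=4, N=3)
tail = hypergeom_upper_tail(2, M=10, n=4, N=3)
print(point, tail)
```

The tail probability is never smaller than the point probability, so a filter at 0.05 can keep fewer KEs when the tail is used.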

Section 8.4. Filtering the results for significant KEs and calculation of percent gene overlap to ORA

In this section, the results will be filtered to only include significant KEs. Significant KEs have an enrichment score above 1 and a hypergeometric p-value below 0.05.

Section 8.4.1 Creation of the significant KEs table

In this section, the gene table is filtered to retrieve the genes connected to only the significant KEs.

Step 85. The significant KE table is created using the significant KEs from the previous mergeddataframe_final.

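# Enrichmentscore is stored as text, so it is converted to numbers before comparing with 1.
filteredversion_C5= merged_finaltable4[(pd.to_numeric(merged_finaltable4['Enrichmentscore']) > 1) & (merged_finaltable4['Hypergeometric p-value'] < 0.05)]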
SignificantKE_list5=filteredversion_C5['KEID'].tolist()
significantKEID_genetable5= mergeddataframe_final[mergeddataframe_final['KEID'].isin(SignificantKE_list5)]
significantKEID_genetable5

KEID WPtitle ID gene Entrez.Gene
482 https://identifiers.org/aop.events/1495 Cytosolic DNA-sensing pathway WP4655 TREX1 11277
483 https://identifiers.org/aop.events/1495 Cytosolic DNA-sensing pathway WP4655 IFNA5 3442
484 https://identifiers.org/aop.events/1495 Cytosolic DNA-sensing pathway WP4655 IFNA1 3439
485 https://identifiers.org/aop.events/1495 Cytosolic DNA-sensing pathway WP4655 IFNA2 3440
486 https://identifiers.org/aop.events/1495 Cytosolic DNA-sensing pathway WP4655 IFNA4 3441
... ... ... ... ... ...
18889 https://identifiers.org/aop.events/1538 Oxidative stress response WP408 TXNRD2 10587
18890 https://identifiers.org/aop.events/1538 Oxidative stress response WP408 MT1X 4501
18891 https://identifiers.org/aop.events/1538 Oxidative stress response WP408 NOX1 27035
18892 https://identifiers.org/aop.events/1538 Oxidative stress response WP408 NFIX 4784
18893 https://identifiers.org/aop.events/1538 Oxidative stress response WP408 NOX3 50508

5969 rows × 5 columns
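
The enrichment scores are stored as text (the scoring function returns an f-string), so it matters whether the filter in Step 85 compares them as text or as numbers. The sketch below shows where the two can differ. The first two scores are taken from the table in Step 66; "1.0" and "10.5" are made-up values added for illustration.

```python
scores_as_text = ["1.2589725365472033", "0.700449187868334", "1.0", "10.5"]

# Text comparison works character by character.
text_result = [s > "1" for s in scores_as_text]

# Numeric comparison after converting each score to a float.
numeric_result = [float(s) > 1 for s in scores_as_text]

print(text_result)
print(numeric_result)
```

For a threshold of 1 the two agree on every score in the table above; they differ only for a score of exactly 1.0. With other thresholds the text comparison can go wrong more visibly, for example "10.5" > "2" is False.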

Section 8.4.2 Significant ORA pathway table plus splitting

In this section, the significant ORA pathway table is created.

Step 86. The significant ORA pathway table is created using the significantly enriched pathways identified from the ORA analysis. This requires data manipulation to restructure the table so that the individual genes of each enriched pathway are placed on individual rows.

datafile_ORA5 = pd.read_csv("C:/Users/shaki/Downloads/ORA_tables_for_comparison/Comparison 4-AgNP-48H.txt", sep='\t')
datafileORA5=pd.DataFrame(datafile_ORA5)
filtereddatafileORA_5=datafileORA5[datafileORA5['Adjusted P-value'] < 0.05]
dropped_datafileORA_df5=filtereddatafileORA_5.drop(['Adjusted P-value','Odds Ratio','Old P-value','Gene_set','P-value','Old adjusted P-value','Combined Score'],axis=1)
droppeddatafileORAdf5=dropped_datafileORA_df5.copy()
droppeddatafileORAdf5['Genes']= droppeddatafileORAdf5['Genes'].replace({';':','},regex=True)
df5_ORApathwaytable=droppeddatafileORAdf5.copy()
df5_ORApathwaytable['Genes'] = df5_ORApathwaytable['Genes'].astype(str)
df5_ORApathwaytable['Genes'] = df5_ORApathwaytable['Genes'].str.split(',')
exploded_df5_ORApathwaytable = df5_ORApathwaytable.explode('Genes', ignore_index=True)
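
The restructuring can be illustrated on a small made-up table. The sketch assumes the gene field is a single string with ';' as separator, as the replace step above suggests, and splits on ';' directly instead of first replacing it with ','.

```python
import pandas as pd

# Toy ORA table: one row per pathway, genes packed into one string.
toy_ora = pd.DataFrame({
    "Term": ["Pathway A", "Pathway B"],
    "Genes": ["G1;G2;G3", "G4"],
})

# Turn each string into a list of gene symbols ...
toy_ora["Genes"] = toy_ora["Genes"].str.split(";")

# ... and give every gene its own row, repeating the pathway name.
toy_exploded = toy_ora.explode("Genes", ignore_index=True)
print(toy_exploded)
```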
Section 8.4.3 For loop to get overlapping genes

In this section, the number of overlapping genes between the significant enrichment score-based Key Events and the enriched pathways from ORA is calculated.

Step 87. Next, the significant KE table and the ORA pathway table are converted into dictionaries that map each key (KEID or Term) to the set of its genes. A for loop then determines the overlapping gene symbols and their number for every combination of Term and KEID.
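
Before the full loop, the idea can be shown on two tiny dictionaries of gene sets. The pathway names, KE labels and gene symbols are made up; only the structure mirrors the dictionaries used in this step.

```python
# Toy gene sets per ORA term and per significant KE (hypothetical symbols).
toy_ora_sets = {"Pathway A": {"G1", "G2", "G3"}, "Pathway B": {"G4"}}
toy_ke_sets = {"KE:1": {"G2", "G3", "G5"}, "KE:2": {"G6"}}

# Intersect every term with every KE and keep the shared genes.
toy_overlap = {
    (term, ke): ora_genes & ke_genes
    for term, ora_genes in toy_ora_sets.items()
    for ke, ke_genes in toy_ke_sets.items()
}

for (term, ke), genes in toy_overlap.items():
    print(term, ke, sorted(genes), len(genes))
```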

ORA_gene_sets5 = exploded_df5_ORApathwaytable.groupby('Term')['Genes'].apply(set).to_dict() 
SignificantKE_gene_sets5 = significantKEID_genetable5.groupby('KEID')['gene'].apply(set).to_dict()  
overlapping_genes_betweenORA_and_significantKEs5 = {}

for term, ORA_genes in ORA_gene_sets5.items():
    for KEID, KEID_genes in SignificantKE_gene_sets5.items():
        overlap = ORA_genes.intersection(KEID_genes)
        print(f"{term} x {KEID}: {len(overlap)} overlaps")
        overlapping_genes_betweenORA_and_significantKEs5[(term, KEID)] = {
                'overlapping genes': overlap,
                'number of genes that overlap': len(overlap)
            }
if overlapping_genes_betweenORA_and_significantKEs5:
    print("\ntitle of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:")
    for (term, KEID), result in overlapping_genes_betweenORA_and_significantKEs5.items():
        print(f"Term: {term}, KEID: {KEID}, Title of overlapping gene(s): {result['overlapping genes']}, number: {result['number of genes that overlap']}")
else:
    print("No overlapping genes")
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1087: 3 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1115: 2 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1392: 2 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/149: 3 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1495: 2 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1497: 3 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1538: 2 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1582: 3 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1633: 3 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1750: 3 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1896: 2 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/1917: 1 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/209: 5 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/244 : 5 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/249: 2 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/288: 0 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/41: 3 overlaps
Copper Homeostasis WP3286 x https://identifiers.org/aop.events/890: 2 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1087: 5 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1115: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1392: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/149: 5 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1495: 11 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1497: 5 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1538: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1582: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1633: 5 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1750: 5 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1896: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/1917: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/209: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/244 : 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/249: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/288: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/41: 0 overlaps
Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865 x https://identifiers.org/aop.events/890: 0 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1087: 4 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1115: 5 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1392: 5 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/149: 4 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1495: 1 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1497: 4 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1538: 5 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1582: 1 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1633: 4 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1750: 4 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1896: 2 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/1917: 29 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/209: 34 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/244 : 31 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/249: 5 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/288: 8 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/41: 34 overlaps
Nuclear Receptors Meta Pathway WP2882 x https://identifiers.org/aop.events/890: 5 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1087: 1 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1115: 3 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1392: 3 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/149: 1 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1495: 1 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1497: 1 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1538: 3 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1582: 0 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1633: 1 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1750: 1 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1896: 0 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/1917: 4 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/209: 7 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/244 : 5 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/249: 3 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/288: 0 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/41: 4 overlaps
Selenium Metabolism And Selenoproteins WP28 x https://identifiers.org/aop.events/890: 3 overlaps

title of Overlapping Gene(s) and the number between enriched pathways from ORA and significant KEs:
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TP53', 'JUN', 'AKT1'}, number: 3
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'SP1', 'MT1X'}, number: 2
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'SP1', 'MT1X'}, number: 2
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TP53', 'JUN', 'AKT1'}, number: 3
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'JUN', 'AKT1'}, number: 2
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TP53', 'JUN', 'AKT1'}, number: 3
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'SP1', 'MT1X'}, number: 2
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): {'GSK3B', 'AKT1', 'APC'}, number: 3
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TP53', 'JUN', 'AKT1'}, number: 3
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TP53', 'JUN', 'AKT1'}, number: 3
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1896, Title of overlapping gene(s): {'TP53', 'AKT1'}, number: 2
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'GSK3B'}, number: 1
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'APC', 'SP1', 'GSK3B', 'JUN', 'MT1X'}, number: 5
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/244 , Title of overlapping gene(s): {'TP53', 'AKT1', 'APC', 'GSK3B', 'JUN'}, number: 5
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'SP1', 'MT1X'}, number: 2
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/288, Title of overlapping gene(s): set(), number: 0
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'TP53', 'GSK3B', 'AKT1'}, number: 3
Term: Copper Homeostasis WP3286, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'SP1', 'MT1X'}, number: 2
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'FADD', 'TRAF6', 'CXCL8', 'CHUK', 'IKBKG'}, number: 5
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'FADD', 'TRAF6', 'CXCL8', 'CHUK', 'IKBKG'}, number: 5
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'FADD', 'RNF125', 'ATG12', 'TRADD', 'MAVS', 'TRAF6', 'ATG5', 'CXCL8', 'CHUK', 'IKBKG', 'CYLD'}, number: 11
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'FADD', 'TRAF6', 'CXCL8', 'CHUK', 'IKBKG'}, number: 5
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'FADD', 'TRAF6', 'CXCL8', 'CHUK', 'IKBKG'}, number: 5
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'FADD', 'TRAF6', 'CXCL8', 'CHUK', 'IKBKG'}, number: 5
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1896, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/244 , Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/288, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): set(), number: 0
Term: Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): set(), number: 0
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'TGFBR2', 'TGFB2', 'HSPA1A', 'JUN'}, number: 4
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'NFE2L2', 'HMOX1', 'GPX3', 'CYP1A1', 'SP1'}, number: 5
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'NFE2L2', 'HMOX1', 'GPX3', 'CYP1A1', 'SP1'}, number: 5
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'TGFBR2', 'TGFB2', 'HSPA1A', 'JUN'}, number: 4
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'JUN'}, number: 1
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'TGFBR2', 'TGFB2', 'HSPA1A', 'JUN'}, number: 4
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'NFE2L2', 'HMOX1', 'GPX3', 'CYP1A1', 'SP1'}, number: 5
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): {'SRC'}, number: 1
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'TGFBR2', 'TGFB2', 'HSPA1A', 'JUN'}, number: 4
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'TGFBR2', 'TGFB2', 'HSPA1A', 'JUN'}, number: 4
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1896, Title of overlapping gene(s): {'POLK', 'GADD45B'}, number: 2
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'GPX3', 'SLC5A3', 'ABCC4', 'TGFBR2', 'SRC', 'SLC6A15', 'HMOX1', 'GPX2', 'TGFA', 'DNAJB1', 'ABCC3', 'RXRA', 'GSTM3', 'TXNRD3', 'SLC2A1', 'SLC2A8', 'MAFF', 'SLC39A10', 'HSPA1A', 'GGT1', 'SLC6A9', 'ALDH3A1', 'SLC2A6', 'NFE2L2', 'SLC2A3', 'CES2', 'TGFB2', 'SLC2A14', 'HBEGF'}, number: 29
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'CPT1A', 'GPX3', 'SP1', 'SLC5A3', 'ABCC4', 'TGFBR2', 'SLC6A15', 'HMOX1', 'GPX2', 'TGFA', 'DNAJB1', 'SCD', 'CYP1A1', 'JUN', 'ABCC3', 'RXRA', 'GSTM3', 'TXNRD3', 'SLC2A1', 'SLC2A8', 'MAFF', 'SLC39A10', 'HSPA1A', 'GGT1', 'SLC6A9', 'ALDH3A1', 'SLC2A6', 'NFE2L2', 'POLK', 'SLC2A3', 'CES2', 'TGFB2', 'SLC2A14', 'HBEGF'}, number: 34
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/244 , Title of overlapping gene(s): {'GADD45B', 'GPX3', 'SLC5A3', 'ABCC4', 'TGFBR2', 'SRC', 'SLC6A15', 'HMOX1', 'GPX2', 'TGFA', 'DNAJB1', 'JUN', 'ABCC3', 'RXRA', 'GSTM3', 'TXNRD3', 'SLC2A1', 'SLC2A8', 'MAFF', 'SLC39A10', 'HSPA1A', 'GGT1', 'SLC6A9', 'ALDH3A1', 'SLC2A6', 'NFE2L2', 'SLC2A3', 'CES2', 'TGFB2', 'SLC2A14', 'HBEGF'}, number: 31
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'NFE2L2', 'HMOX1', 'GPX3', 'CYP1A1', 'SP1'}, number: 5
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/288, Title of overlapping gene(s): {'ABCB1', 'ABCC4', 'BAAT', 'SRC', 'RXRA', 'PPARGC1A', 'CYP4F12', 'ABCC3'}, number: 8
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'CPT1A', 'GPX3', 'SLC5A3', 'ABCC4', 'TGFBR2', 'SRC', 'SLC6A15', 'HMOX1', 'GPX2', 'TGFA', 'DNAJB1', 'ABCC3', 'RXRA', 'GSTM3', 'TXNRD3', 'SLC2A1', 'SREBF1', 'PPARGC1A', 'SLC2A8', 'MAFF', 'SLC39A10', 'HSPA1A', 'GGT1', 'SLC6A9', 'ALDH3A1', 'SLC2A6', 'NFE2L2', 'BAAT', 'SLC2A3', 'CES2', 'FASN', 'TGFB2', 'SLC2A14', 'HBEGF'}, number: 34
Term: Nuclear Receptors Meta Pathway WP2882, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'NFE2L2', 'HMOX1', 'GPX3', 'CYP1A1', 'SP1'}, number: 5
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1087, Title of overlapping gene(s): {'JUN'}, number: 1
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1115, Title of overlapping gene(s): {'SP1', 'NFE2L2', 'GPX3'}, number: 3
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1392, Title of overlapping gene(s): {'SP1', 'NFE2L2', 'GPX3'}, number: 3
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/149, Title of overlapping gene(s): {'JUN'}, number: 1
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1495, Title of overlapping gene(s): {'JUN'}, number: 1
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1497, Title of overlapping gene(s): {'JUN'}, number: 1
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1538, Title of overlapping gene(s): {'SP1', 'NFE2L2', 'GPX3'}, number: 3
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1582, Title of overlapping gene(s): set(), number: 0
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1633, Title of overlapping gene(s): {'JUN'}, number: 1
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1750, Title of overlapping gene(s): {'JUN'}, number: 1
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1896, Title of overlapping gene(s): set(), number: 0
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/1917, Title of overlapping gene(s): {'NFE2L2', 'TXNRD3', 'GPX2', 'GPX3'}, number: 4
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/209, Title of overlapping gene(s): {'NFE2L2', 'GPX4', 'TXNRD3', 'GPX2', 'GPX3', 'SP1', 'JUN'}, number: 7
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/244 , Title of overlapping gene(s): {'NFE2L2', 'TXNRD3', 'GPX2', 'GPX3', 'JUN'}, number: 5
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/249, Title of overlapping gene(s): {'SP1', 'NFE2L2', 'GPX3'}, number: 3
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/288, Title of overlapping gene(s): set(), number: 0
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/41, Title of overlapping gene(s): {'NFE2L2', 'TXNRD3', 'GPX2', 'GPX3'}, number: 4
Term: Selenium Metabolism And Selenoproteins WP28, KEID: https://identifiers.org/aop.events/890, Title of overlapping gene(s): {'SP1', 'NFE2L2', 'GPX3'}, number: 3
Section 8.4.4 Tabulation gene overlap

In this section, the overlapping genes and their counts are collected in a table, in preparation for the percent overlap calculation in Section 8.4.5.

final_geneoverlaptable_C5=pd.DataFrame.from_dict(overlapping_genes_betweenORA_and_significantKEs5,orient='index')
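
As a rough illustration of this step, the sketch below builds a small table from a nested dictionary. It assumes the dictionary is keyed by (Term, KEID) tuples with one inner dictionary per pair, which is a guess based on how the table is reshaped later. The name `overlaps_example` and the inner keys are hypothetical; the three entries are copied from the output above.

```python
import pandas as pd

# Hypothetical miniature of the overlap dictionary: keys are (Term, KEID)
# tuples, values hold the overlapping genes and their count (assumed layout).
overlaps_example = {
    ("Copper Homeostasis WP3286", "https://identifiers.org/aop.events/1115"): {
        "overlapping genes": {"SP1", "MT1X"},
        "number": 2,
    },
    ("Copper Homeostasis WP3286", "https://identifiers.org/aop.events/1917"): {
        "overlapping genes": {"GSK3B"},
        "number": 1,
    },
    ("Selenium Metabolism And Selenoproteins WP28", "https://identifiers.org/aop.events/1087"): {
        "overlapping genes": {"JUN"},
        "number": 1,
    },
}

# orient='index' turns each dictionary key into a row label; tuple keys
# become a two-level index (Term, KEID), inner keys become the columns.
table_example = pd.DataFrame.from_dict(overlaps_example, orient="index")
print(table_example)
```

Resetting both index levels afterwards, as done in Section 8.4.5, moves Term and KEID into ordinary columns so the table can be merged on 'Term'.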
Section 8.4.5 Percent overlap calculation

In this section, the percent overlap of the gene sets is calculated.

Step 87. Lastly, the percent overlap is calculated and added as a column to the dataframe. First, a for loop counts the total number of genes belonging to each enriched pathway from ORA.

variable_count5= {}

# Each row of the exploded table pairs one pathway (Term) with one gene,
# so counting rows per Term gives the number of genes per pathway.
for index, row in exploded_df5_ORApathwaytable.iterrows():
    term = row['Term']

    if term not in variable_count5:
        variable_count5[term] = 1
    else:
        variable_count5[term] += 1

print("The total number of genes: ")
print(variable_count5)
The total number of genes: 
{'Copper Homeostasis WP3286': 18, 'Nuclear Receptors Meta Pathway WP2882': 61, 'Selenium Metabolism And Selenoproteins WP28': 15, 'Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865': 17}
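
The loop above amounts to counting rows per term. A minimal sketch of the same idea with `collections.Counter` follows. The list `rows_example` is a hypothetical, heavily truncated stand-in for the exploded ORA table (gene names are borrowed from the overlap output above), so its counts do not match the real totals.

```python
from collections import Counter

# Illustrative subset of (Term, gene) rows; the real exploded table has one
# row for every gene of every enriched pathway.
rows_example = [
    ("Copper Homeostasis WP3286", "TP53"),
    ("Copper Homeostasis WP3286", "JUN"),
    ("Copper Homeostasis WP3286", "AKT1"),
    ("Selenium Metabolism And Selenoproteins WP28", "GPX3"),
    ("Selenium Metabolism And Selenoproteins WP28", "NFE2L2"),
]

# Counting how often each term appears gives the number of genes per pathway.
genes_per_term = Counter(term for term, _gene in rows_example)
print(dict(genes_per_term))
```

On the dataframe itself, `exploded_df5_ORApathwaytable['Term'].value_counts().to_dict()` should give the same counts without an explicit loop, provided no Term value is missing.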

Step 88. The result is converted into a dataframe and added to the final dataframe. This is followed by some data manipulation prior to calculation of gene set overlap.

variable_count_df5=pd.DataFrame.from_dict(variable_count5,orient='index')
reset_variable_count_df5 = variable_count_df5.reset_index()
reset_variable_count_df5.columns = ['Term', 'Total number of genes']
Genesetoverlaptable_C5=final_geneoverlaptable_C5.reset_index(level=[1])
Genesetoverlaptable_C5.reset_index(inplace=True)
Genesetoverlaptable_C5.columns= ['Term','KEID','overlapping genes','number of genes that overlap']
tabulation_C5=pd.merge(reset_variable_count_df5,Genesetoverlaptable_C5, on='Term')
def calculate_Genesetoverlap_Score(row):
    # Return a number rather than a formatted string so the column stays numeric in the Excel file.
    return (row['number of genes that overlap'] / row['Total number of genes']) * 100

tabulation_C5.loc[:, 'Percent geneset overlap'] = tabulation_C5.apply(calculate_Genesetoverlap_Score, axis=1)
tabulation_C5.to_excel('genesetoverlap-AgNP48h.xlsx')
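
As a worked check of the formula, the sketch below recomputes the percent overlap for three pathway and KE pairs, using the pathway totals and overlap counts printed earlier in this section. The helper `percent_overlap` and the rounding to two decimals are illustrative choices and not part of the notebook.

```python
def percent_overlap(n_overlap, n_total):
    """Share of a pathway's genes that also belong to the KE gene set, in percent."""
    return n_overlap / n_total * 100

# (pathway, KE id, overlapping genes, total genes in pathway), taken from the printed output
examples = [
    ("Copper Homeostasis WP3286", "1087", 3, 18),
    ("Nuclear Receptors Meta Pathway WP2882", "209", 34, 61),
    ("Novel Intracellular Components Of RIG I Like Receptor Pathway WP3865", "1495", 11, 17),
]

results = {(term, ke): round(percent_overlap(k, n), 2) for term, ke, k, n in examples}
for (term, ke), pct in results.items():
    print(f"{term} x KE {ke}: {pct}%")
```

A pair with no shared genes, such as Copper Homeostasis WP3286 with KE 288, yields 0 percent.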

Section 9: Metadata

Step 89. Finally, the metadata for this Jupyter Notebook is displayed, listing the package version numbers and the system set-up for interested users. This requires the packages watermark and print_versions.

%load_ext watermark
!pip install print-versions
Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)
%watermark
Last updated: 2025-06-02T19:48:45.303185+02:00

Python implementation: CPython
Python version       : 3.12.3
IPython version      : 8.25.0

Compiler    : MSC v.1938 64 bit (AMD64)
OS          : Windows
Release     : 11
Machine     : AMD64
Processor   : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
CPU cores   : 8
Architecture: 64bit
from print_versions import print_versions
print_versions(globals())
pandas==2.2.3
json==2.0.9
ipykernel==6.28.0
numpy==1.26.4
scipy==1.13.1
py4cytoscape==1.9.0

References:

  1. Martens M, Meuleman AB, Kearns J, de Windt C, Evelo CT, Willighagen EL. Molecular Adverse Outcome Pathways: towards the implementation of transcriptomics data in risk assessments. bioRxiv. 2023:2023.03.02.530766.
  2. How can I iterate over rows in a Pandas DataFrame? [Internet]. Stack Overflow. Available from: https://stackoverflow.com/questions/16476924/how-can-i-iterate-over-rows-in-a-pandas-dataframe
  3. Python - Loop Dictionaries [Internet]. www.w3schools.com. Available from: https://www.w3schools.com/python/python_dictionaries_loop.asp
  4. Priya. apply(set) to two columns in a pandas dataframe [Internet]. Stack Overflow. 2018. Available from: https://stackoverflow.com/questions/52367388/applyset-to-two-columns-in-a-pandas-dataframe
  5. amnesic. Converting pandas dataframe to dictionary with same keys over multiple rows [Internet]. Stack Overflow. 2022. Available from: https://stackoverflow.com/questions/71006325/converting-pandas-dataframe-to-dictionary-with-same-keys-over-multiple-rows/71006478#71006478
  6. SuperDougDougy. GroupBy results to dictionary of lists [Internet]. Stack Overflow. 2015. Available from: https://stackoverflow.com/questions/29876184/groupby-results-to-dictionary-of-lists%E2%80%8C