Part 1: Construction of the preliminary AOP table for AOP networks
The AOP project ► Key objective 1
Author: Shakira Agata
This Jupyter Notebook describes the steps needed to create the preliminary table required for the construction of an AOP network that focuses on inflammatory processes in human organ systems. The notebook is limited to four organ systems (brain, liver, kidney and lung) because of the research interests of the author. The preliminary table will contain the following information: AOP (adverse outcome pathway), AOP title, KE name (key event name), AO (adverse outcome), AO title, KER (key event relationship), KER ID and the name of the organ system. To achieve this result, three SPARQL queries will be executed against AOP-Wiki RDF to extract the AOPs that are related to inflammatory processes along with their respective upstreamKEs, downstreamKEs and KERs. The detailed steps are outlined in the following eight sections:
- Section 1: System preparation for generation of AOP-Wiki RDF SPARQL queries
- Section 2: Execution of the first SPARQL query
- Section 3: Merging of two datasets
- Section 4: Filtering of results
- Section 5: Execution of second SPARQL query
- Section 6: Execution of third SPARQL query
- Section 7: Merging results from section 4-6
- Section 8: Metadata
Section 1: System preparation for generation of AOP-Wiki RDF SPARQL queries
In section 1 and section 2, the system requirements for this notebook will be fulfilled followed by the generation of the first SPARQL query. This query is run against AOP-Wiki RDF.
Step 1: Install SPARQLWrapper, the Python wrapper you need to be able to run the queries.
pip install sparqlwrapper
Requirement already satisfied: sparqlwrapper in c:\users\shaki\anaconda3\lib\site-packages (2.0.0)
Requirement already satisfied: rdflib>=6.1.1 in c:\users\shaki\anaconda3\lib\site-packages (from sparqlwrapper) (7.0.0)
Requirement already satisfied: isodate<0.7.0,>=0.6.0 in c:\users\shaki\anaconda3\lib\site-packages (from rdflib>=6.1.1->sparqlwrapper) (0.6.1)
Requirement already satisfied: pyparsing<4,>=2.1.0 in c:\users\shaki\anaconda3\lib\site-packages (from rdflib>=6.1.1->sparqlwrapper) (3.0.9)
Requirement already satisfied: six in c:\users\shaki\anaconda3\lib\site-packages (from isodate<0.7.0,>=0.6.0->rdflib>=6.1.1->sparqlwrapper) (1.16.0)
Note: you may need to restart the kernel to use updated packages.
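Step 2: Import sys, SPARQLWrapper and pandas. sys gives access to the running Python interpreter (used here to install watermark into the same environment), SPARQLWrapper runs the queries, and pandas is used to manipulate the results. The pandas option display.max_colwidth is set to ‘None’ so that long titles and URIs are displayed in full; pandas version 2.2.2 accepts only None or a non-negative integer for this option, so the older value -1 can no longer be used.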
import sys
!{sys.executable} -m pip install watermark
from SPARQLWrapper import SPARQLWrapper, JSON
import pandas as pd
pd.set_option('display.max_colwidth', None)
Requirement already satisfied: watermark in c:\users\shaki\anaconda3\lib\site-packages (2.4.3)
Requirement already satisfied: ipython>=6.0 in c:\users\shaki\anaconda3\lib\site-packages (from watermark) (8.25.0)
Requirement already satisfied: importlib-metadata>=1.4 in c:\users\shaki\anaconda3\lib\site-packages (from watermark) (7.0.1)
Requirement already satisfied: setuptools in c:\users\shaki\anaconda3\lib\site-packages (from watermark) (69.5.1)
Requirement already satisfied: zipp>=0.5 in c:\users\shaki\anaconda3\lib\site-packages (from importlib-metadata>=1.4->watermark) (3.17.0)
Requirement already satisfied: decorator in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (5.1.1)
Requirement already satisfied: jedi>=0.16 in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (0.18.1)
Requirement already satisfied: matplotlib-inline in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (0.1.6)
Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.41 in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (3.0.43)
Requirement already satisfied: pygments>=2.4.0 in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (2.15.1)
Requirement already satisfied: stack-data in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (0.2.0)
Requirement already satisfied: traitlets>=5.13.0 in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (5.14.3)
Requirement already satisfied: colorama in c:\users\shaki\anaconda3\lib\site-packages (from ipython>=6.0->watermark) (0.4.6)
Requirement already satisfied: wcwidth in c:\users\shaki\anaconda3\lib\site-packages (from prompt-toolkit<3.1.0,>=3.0.41->ipython>=6.0->watermark) (0.2.5)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in c:\users\shaki\anaconda3\lib\site-packages (from jedi>=0.16->ipython>=6.0->watermark) (0.8.3)
Requirement already satisfied: executing in c:\users\shaki\anaconda3\lib\site-packages (from stack-data->ipython>=6.0->watermark) (0.8.3)
Requirement already satisfied: asttokens in c:\users\shaki\anaconda3\lib\site-packages (from stack-data->ipython>=6.0->watermark) (2.0.5)
Requirement already satisfied: pure-eval in c:\users\shaki\anaconda3\lib\site-packages (from stack-data->ipython>=6.0->watermark) (0.2.2)
Requirement already satisfied: six in c:\users\shaki\anaconda3\lib\site-packages (from asttokens->stack-data->ipython>=6.0->watermark) (1.16.0)
Step 3: Create the variable AOPWikiSPARQL for the SPARQL wrapper, pointing it at the AOP-Wiki RDF endpoint, and set the return format to JSON so that the results can be parsed in Python.
AOPWikiSPARQL = SPARQLWrapper("https://aopwiki.rdf.bigcat-bioinformatics.org/sparql/")
AOPWikiSPARQL.setReturnFormat(JSON)
Step 4: Define the triple, coretypes, ontologies and identifiers which are used to codify the semantic data in AOP-Wiki.
triple = ['subject','predicate','object']
coretypes = ['aopo:AdverseOutcomePathway','aopo:KeyEvent','aopo:KeyEventRelationship','ncbitaxon:131567','go:0008150','pato:0001241','pato:0000001','aopo:CellTypeContext','aopo:OrganContext','nci:C54571','cheminf:000000']
ontologies = ['http://aopkb.org/aop_ontology#','http://edamontology.org/','http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#','http://purl.bioontology.org/ontology/NCBITAXON/','http://purl.obolibrary.org/obo/MMO','http://purl.obolibrary.org/obo/CL_','http://purl.obolibrary.org/obo/UBERON_','http://purl.obolibrary.org/obo/MI_','http://purl.obolibrary.org/obo/MP_','http://purl.org/commons/record/mesh/','http://purl.obolibrary.org/obo/HP_','http://purl.obolibrary.org/obo/PCO_','http://purl.obolibrary.org/obo/NBO_','http://purl.obolibrary.org/obo/VT_','http://purl.obolibrary.org/obo/PR_','http://purl.obolibrary.org/obo/CHEBI_','http://purl.org/sig/ont/fma/fma','http://xmlns.com/foaf/0.1/','http://www.w3.org/2004/02/skos/core#','http://www.w3.org/2000/01/rdf-schema#','http://www.w3.org/1999/02/22-rdf-syntax-ns#','http://semanticscience.org/resource/CHEMINF_','http://purl.obolibrary.org/obo/GO_','http://purl.org/dc/terms/','http://purl.org/dc/elements/1.1/','http://purl.obolibrary.org/obo/PATO_']
identifiers = []
Section 2: Execution of the first SPARQL query
Step 5: Define and run the SPARQL query to extract the inflammatory-related AOPs. Subsequently, convert the results to a pandas dataframe (df1) to display the results.
first_sparqlquery= '''SELECT DISTINCT ?AOP ?AOPtitle ?KE ?KEname
WHERE {
?KE a aopo:KeyEvent ;
dc:identifier ?KElookup ;
dc:title ?KEname .
?AOP a aopo:AdverseOutcomePathway ;
aopo:has_key_event ?KElookup ;
dc:title ?AOPtitle .
FILTER regex(?KEname, "inflammation|inflammatory", "i")
}
ORDER BY DESC(?AOP)'''
AOPWikiSPARQL.setQuery(first_sparqlquery)
first_results = AOPWikiSPARQL.query().convert()
first_data= first_results["results"]["bindings"]
columns = [{
"AOP": item["AOP"]["value"],
"AOPtitle": item["AOPtitle"]["value"],
"KE": item["KE"]["value"],
"KEname": item["KEname"]["value"]
} for item in first_data]
df1= pd.DataFrame(columns)
display(df1)
| | AOP | AOPtitle | KE | KEname |
---|---|---|---|---|
0 | https://identifiers.org/aop/62 | AKT2 activation leading to hepatic steatosis | https://identifiers.org/aop.events/486 | systemic inflammation leading to hepatic steatosis |
1 | https://identifiers.org/aop/544 | Inhibition of neuropathy target esterase leading to delayed neuropathy via increased inflammation | https://identifiers.org/aop.events/149 | Increase, Inflammation |
2 | https://identifiers.org/aop/535 | Binding and activation of GPER leading to learning and memory impairments | https://identifiers.org/aop.events/188 | Neuroinflammation |
3 | https://identifiers.org/aop/511 | The AOP framework on ROS-mediated oxidative stress induced vascular disrupting effects | https://identifiers.org/aop.events/2009 | Activation of inflammation pathway |
4 | https://identifiers.org/aop/507 | Nrf2 inhibition leading to vascular disrupting effects via inflammation pathway | https://identifiers.org/aop.events/2009 | Activation of inflammation pathway |
... | ... | ... | ... | ... |
67 | https://identifiers.org/aop/144 | Endocytic lysosomal uptake leading to liver fibrosis | https://identifiers.org/aop.events/1493 | Increased Pro-inflammatory mediators |
68 | https://identifiers.org/aop/14 | Glucocorticoid Receptor Activation Leading to Increased Disease Susceptibility | https://identifiers.org/aop.events/152 | Suppression, Inflammatory cytokines |
69 | https://identifiers.org/aop/12 | Chronic binding of antagonist to N-methyl-D-aspartate receptors (NMDARs) during brain development leads to neurodegeneration with impairment in learning and memory in aging | https://identifiers.org/aop.events/188 | Neuroinflammation |
70 | https://identifiers.org/aop/115 | Epithelial cytotoxicity leading to forestomach tumors (in mouse and rat) | https://identifiers.org/aop.events/149 | Increase, Inflammation |
71 | https://identifiers.org/aop/114 | HPPD inhibition leading to corneal papillomas and carcinomas (in rat) | https://identifiers.org/aop.events/777 | Increase, Inflammation (corneal cells) |
72 rows × 4 columns
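The list comprehension in Step 5 can also be written as a small reusable helper, since the same pattern is repeated for every query in this notebook. The sketch below is only an illustration: the function name bindings_to_rows and the sample data are hypothetical and not part of the original notebook. It assumes the standard SPARQL JSON results layout, in which each row maps a variable name to a dictionary with a "value" entry.

```python
# Hypothetical helper (not in the original notebook): flatten SPARQL JSON bindings.
def bindings_to_rows(results, variables):
    """Return one plain dict per result row, keeping only the 'value' of each binding."""
    rows = []
    for item in results["results"]["bindings"]:
        # .get() guards against variables that are unbound in a given row
        rows.append({var: item.get(var, {}).get("value") for var in variables})
    return rows


# Made-up sample in the SPARQL JSON results layout
sample = {
    "head": {"vars": ["AOP", "KEname"]},
    "results": {"bindings": [
        {"AOP": {"type": "uri", "value": "https://identifiers.org/aop/12"},
         "KEname": {"type": "literal", "value": "Neuroinflammation"}},
    ]},
}

rows = bindings_to_rows(sample, ["AOP", "KEname"])
print(rows)
```

In the notebook, the returned list could be passed straight to pd.DataFrame, for example pd.DataFrame(bindings_to_rows(first_results, ["AOP", "AOPtitle", "KE", "KEname"])).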
Section 3: Merging of two datasets
In this section, you will merge two data files: df1, which contains the previous SPARQL query results for AOPs related to inflammatory processes, and table2, a snorql-csv file with additional AOP information that was created by Marvin Martens (supervisor of Shakira Agata). The snorql-csv file has five columns: AOP, AOPName, ao, aotitle and organ, which are merged with df1.
Step 6: You first read the snorql-csv file (saved as an Excel workbook) and convert it into JSON format.
table2= pd.read_excel('snorql-csv-1679052766.894.xlsx', sheet_name='snorql-csv-1679052766.894')
json_table2= table2.to_json()
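Step 7: This is followed by loading the JSON-formatted file into a pandas dataframe (df2), which you can then inspect to verify the result. The path below points to the author's machine; adjust it to the location of your own file.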
df2= pd.read_json("C:/Users/shaki/Downloads/csvjson.json")
Step 8: Next, you merge the two dataframes based on the shared ‘AOP’ column.
mergedJSONtable= pd.merge(df1,df2, on='AOP')
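Because pd.merge performs an inner join by default, AOPs that occur in only one of the two dataframes are dropped without any message. The following sketch uses invented toy data (not the real df1 and df2) to show this behaviour, and how a left join with indicator=True can be used to list the AOPs that found no match.

```python
import pandas as pd

# Toy stand-ins for df1 and df2 (values are illustrative only)
left = pd.DataFrame({"AOP": ["aop/1", "aop/1", "aop/2"],
                     "KEname": ["KE a", "KE b", "KE c"]})
right = pd.DataFrame({"AOP": ["aop/1", "aop/3"],
                      "organ": ["Liver", "Other"]})

# Default how='inner': only AOPs present in both frames survive
inner = pd.merge(left, right, on="AOP")

# how='left' with indicator=True adds a '_merge' column that flags unmatched rows
audit = pd.merge(left, right, on="AOP", how="left", indicator=True)
unmatched = audit.loc[audit["_merge"] == "left_only", "AOP"].tolist()

print(len(inner), unmatched)
```

Running such a check on the real dataframes is optional; it only makes visible which inflammation-related AOPs have no organ annotation in the second file.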
Section 4: Filtering of results
In this section, you filter the result of section 3 by removing AOPs that do not belong to one of the four organ systems, i.e. brain, kidney, liver or lung.
Step 9: Rows whose ‘organ’ column contains ‘Other’ are removed so that you only retain AOPs in the four defined organ systems (brain, liver, kidney and lung). This is done with the comparison operator ‘!=’, which builds a boolean mask that keeps only the rows where organ is NOT ‘Other’.
filtered_mergedJSONtable = mergedJSONtable[mergedJSONtable['organ'] != 'Other']
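The comparison in Step 9 is an exact, case-sensitive match. If the organ labels in your own file are not spelled consistently, a slightly more tolerant variant may help. The sketch below is an assumption-based alternative on toy data, not the filter used by the author: it normalises case and surrounding whitespace before comparing.

```python
import pandas as pd

# Toy data with inconsistent spellings of the 'Other' label (illustrative only)
toy = pd.DataFrame({"AOP": ["aop/1", "aop/2", "aop/3"],
                    "organ": ["Liver", "Other", "other "]})

# Normalise case and whitespace, then build the boolean mask
is_other = toy["organ"].str.strip().str.lower() == "other"

# '~' inverts the mask, so only rows that are NOT 'other' are kept
kept = toy[~is_other]
print(kept)
```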
Section 5: Execution of second SPARQL query
In this section, you will run the second SPARQL query to retrieve/define the upstreamKEs and downstreamKEs from the AOPs retrieved in section 2.
Step 10: In preparation for the second SPARQL query, you first collect the URIs of the AOPs so that they can be inserted into the query automatically, which also restricts the query to these AOPs and shortens its run time. To do so, select the ‘AOP’ column of filtered_mergedJSONtable, wrap each URI in angle brackets and join them (all 54) into a single space-separated string.
values_AOPs = " ".join(f"<{AOP}>" for AOP in filtered_mergedJSONtable['AOP'])
print(values_AOPs)
<https://identifiers.org/aop/62> <https://identifiers.org/aop/48> <https://identifiers.org/aop/452> <https://identifiers.org/aop/452> <https://identifiers.org/aop/451> <https://identifiers.org/aop/451> <https://identifiers.org/aop/447> <https://identifiers.org/aop/429> <https://identifiers.org/aop/409> <https://identifiers.org/aop/409> <https://identifiers.org/aop/382> <https://identifiers.org/aop/38> <https://identifiers.org/aop/374> <https://identifiers.org/aop/362> <https://identifiers.org/aop/320> <https://identifiers.org/aop/320> <https://identifiers.org/aop/319> <https://identifiers.org/aop/303> <https://identifiers.org/aop/3> <https://identifiers.org/aop/280> <https://identifiers.org/aop/278> <https://identifiers.org/aop/27> <https://identifiers.org/aop/206> <https://identifiers.org/aop/173> <https://identifiers.org/aop/173> <https://identifiers.org/aop/171> <https://identifiers.org/aop/17> <https://identifiers.org/aop/17> <https://identifiers.org/aop/144> <https://identifiers.org/aop/12>
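The printed string contains several URIs more than once, because an AOP appears in one row per matching key event. The queries below use SELECT DISTINCT, so the repeats do not change the results, but removing them keeps the query shorter. A minimal sketch with a hypothetical list of URIs (in the notebook this would be the ‘AOP’ column):

```python
# Hypothetical input list; in the notebook this would be filtered_mergedJSONtable['AOP']
aop_uris = [
    "https://identifiers.org/aop/452",
    "https://identifiers.org/aop/452",
    "https://identifiers.org/aop/451",
]

# dict.fromkeys removes repeats while keeping the original order
unique_uris = list(dict.fromkeys(aop_uris))

# Wrap every URI in angle brackets, as required inside a SPARQL VALUES clause
values_clause = " ".join(f"<{uri}>" for uri in unique_uris)
print(values_clause)
```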
Step 11: Now you run the second SPARQL query to retrieve the KE, KEID and KEtitle for the respective AOPs. Because the query is written as a Python f-string, single curly brackets mark a placeholder: {values_AOPs} is the position where the URIs are inserted. The curly brackets that belong to the SPARQL syntax itself are doubled ({{ and }}) so that Python treats them as literal brackets rather than as placeholders (escaping).
second_sparqlsquery = f'''
SELECT DISTINCT ?AOP ?KE ?KEID ?KEtitle
WHERE {{
VALUES ?AOP {{ {values_AOPs} }}
?AOP a aopo:AdverseOutcomePathway ;
dc:title ?AOPName ;
aopo:has_key_event ?KE .
?KE a aopo:KeyEvent ; rdfs:label ?KEID ; dc:title ?KEtitle .
}}
'''
AOPWikiSPARQL.setQuery(second_sparqlsquery)
second_results = AOPWikiSPARQL.query().convert()
second_data= second_results["results"]["bindings"]
columns = [{
"AOP": item["AOP"]["value"],
"KE": item["KE"]["value"],
"KEID": item["KEID"]["value"],
"KEtitle": item["KEtitle"]["value"]
} for item in second_data]
df3= pd.DataFrame(columns)
display(df3)
| | AOP | KE | KEID | KEtitle |
---|---|---|---|---|
0 | https://identifiers.org/aop/447 | https://identifiers.org/aop.events/105 | KE 105 | Inhibition, Mitochondrial Electron Transport Chain Complexes |
1 | https://identifiers.org/aop/27 | https://identifiers.org/aop.events/1115 | KE 1115 | Increase, Reactive oxygen species |
2 | https://identifiers.org/aop/303 | https://identifiers.org/aop.events/1115 | KE 1115 | Increase, Reactive oxygen species |
3 | https://identifiers.org/aop/319 | https://identifiers.org/aop.events/1115 | KE 1115 | Increase, Reactive oxygen species |
4 | https://identifiers.org/aop/382 | https://identifiers.org/aop.events/1115 | KE 1115 | Increase, Reactive oxygen species |
... | ... | ... | ... | ... |
185 | https://identifiers.org/aop/62 | https://identifiers.org/aop.events/459 | KE 459 | Increased, Liver Steatosis |
186 | https://identifiers.org/aop/62 | https://identifiers.org/aop.events/484 | KE 484 | Activation, AKT2 |
187 | https://identifiers.org/aop/62 | https://identifiers.org/aop.events/486 | KE 486 | systemic inflammation leading to hepatic steatosis |
188 | https://identifiers.org/aop/447 | https://identifiers.org/aop.events/759 | KE 759 | Increased, Kidney Failure |
189 | https://identifiers.org/aop/451 | https://identifiers.org/aop.events/780 | KE 780 | Increase, Cytotoxicity (epithelial cells) |
190 rows × 4 columns
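The bracket escaping used in Step 11 can be checked in isolation before sending anything to the endpoint. The following sketch builds a shortened, illustrative query (not the full query above) around a single URI and prints the text that Python actually produces.

```python
# One URI is enough to see how the f-string is expanded
values_AOPs = "<https://identifiers.org/aop/12>"

# {{ and }} become literal { and }, while {values_AOPs} is replaced by the variable
query = f"""
SELECT ?AOP WHERE {{
  VALUES ?AOP {{ {values_AOPs} }}
}}
"""
print(query)
```

The printed query contains single curly brackets only, which is what the SPARQL endpoint expects.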
Section 6: Execution of third SPARQL query
In this section, you will run the third SPARQL query to retrieve/define the KERs for the upstreamKEs and downstreamKEs of the AOPs. This approach allows for easier integration of the output into the nodetable and edgetable of the AOP network (1).
Step 12: You run the following query to retrieve the upstreamKEs, downstreamKEs and KERs.
third_sparqlsquery = f'''
SELECT DISTINCT ?AOP ?upstreamKE ?upstreamKEtitle ?downstreamKE ?downstreamKEtitle ?KER ?KERID
WHERE {{
VALUES ?AOP {{ {values_AOPs} }}
?AOP a aopo:AdverseOutcomePathway ;
dc:title ?AOPName ;
aopo:has_key_event_relationship ?KER .
?KER a aopo:KeyEventRelationship ;
rdfs:label ?KERID .
?KER aopo:has_upstream_key_event ?upstreamKE .
?upstreamKE dc:title ?upstreamKEtitle .
?KER aopo:has_downstream_key_event ?downstreamKE .
?downstreamKE dc:title ?downstreamKEtitle .
}}
'''
AOPWikiSPARQL.setQuery(third_sparqlsquery)
third_results = AOPWikiSPARQL.query().convert()
third_data= third_results["results"]["bindings"]
columns = [{
"AOP": item["AOP"]["value"],
"upstreamKE": item["upstreamKE"]["value"],
"downstreamKE": item["downstreamKE"]["value"],
"KER": item["KER"]["value"],
"KERID": item["KERID"]["value"],
"upstreamKEtitle": item["upstreamKEtitle"]["value"],
"downstreamKEtitle": item["downstreamKEtitle"]["value"]
} for item in third_data]
df4= pd.DataFrame(columns)
Section 7: Merging results from section 4-6
In this section, you will merge the dataframes: filtered_mergedJSONtable (result of section 4), df3 (result of section 5) and df4 (result of section 6).
Step 13: The three dataframes are merged to get the final table that will be used to define the nodes and edges of the network in Py4Cytoscape. The final table will be the merged version of:
- filtered_mergedJSONtable (AOP SPARQL query)
- df3 (KE SPARQL query)
- df4 (KER/KERID SPARQL query).
first_merge = pd.merge(df3, df4)
final_result= pd.merge(filtered_mergedJSONtable,first_merge)
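When pd.merge is called without the on argument, pandas joins on every column name that the two dataframes share. The sketch below uses invented toy frames that only mimic the column layout described in this notebook (an assumption) to make the join keys explicit and to show how the row counts develop.

```python
import pandas as pd

# Toy frames mimicking the column layout of this section (values invented)
aops = pd.DataFrame({"AOP": ["aop/1"], "KE": ["ke/10"], "organ": ["Liver"]})
kes = pd.DataFrame({"AOP": ["aop/1", "aop/1"], "KE": ["ke/10", "ke/11"]})
kers = pd.DataFrame({"AOP": ["aop/1", "aop/1"], "KER": ["ker/100", "ker/101"]})

# kes and kers share only 'AOP', so every KE is paired with every KER of that AOP
ke_ker = pd.merge(kes, kers, on="AOP")

# The last merge shares 'AOP' and 'KE', so only the KE found by the first query remains
final = pd.merge(aops, ke_ker, on=["AOP", "KE"])

print(len(ke_ker), len(final))
```

Writing the keys out explicitly does not change the result as long as the shared columns are the ones listed; it mainly protects against accidental joins if a column is later renamed or added.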
Step 14: Lastly, the final_result table is saved in JSON format to your working directory in preparation for the next Jupyter Notebook.
final_result.to_json('final_result.json')
Section 8: Metadata
Step 15: Finally, the metadata belonging to this Jupyter Notebook is displayed, which contains the version numbers of the packages and the system set-up for interested users. This requires the packages watermark and print_versions.
%load_ext watermark
!pip install print-versions
Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)
%watermark
Last updated: 2025-04-26T20:54:28.526670+02:00
Python implementation: CPython
Python version : 3.12.3
IPython version : 8.25.0
Compiler : MSC v.1938 64 bit (AMD64)
OS : Windows
Release : 11
Machine : AMD64
Processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
CPU cores : 8
Architecture: 64bit
from print_versions import print_versions
print_versions(globals())
json==2.0.9
ipykernel==6.28.0
pandas==2.2.2
SPARQLWrapper==2.0.0
References:
- Martens M, Evelo CT, Willighagen EL. Providing Adverse Outcome Pathways from the AOP-Wiki in a Semantic Web Format to Increase Usability and Accessibility of the Content. Appl In Vitro Toxicol. 2022 Mar 1;8(1):2-13. doi: 10.1089/aivt.2021.0010. Epub 2022 Mar 17. PMID: 35388368; PMCID: PMC8978481. Available from: https://github.com/marvinm2/AOPWikiRDF
- msx. Package for listing version of packages used in a Jupyter Notebook [Internet]. Stack Overflow. 2016. Available from: https://stackoverflow.com/questions/40428931/package-for-listing-version-of-packages-used-in-a-jupyter-notebook
Part 2: Construction of a human inflammatory stress response AOP network
Author: Shakira Agata
This Jupyter notebook describes the steps needed to construct an AOP network that focuses on inflammatory processes in human organ systems. Briefly, the node table and edge table are defined, followed by construction of the AOP network and adaptation of its visual style with py4cytoscape. Py4cytoscape is a Python package that communicates with Cytoscape for the visualization of molecular networks and biological pathways. The detailed steps are outlined in the following eight sections:
- Section 1: System preparation
- Section 2: Data cleaning and preparation
- Section 3: Data manipulation for node table
- Section 4: Definition of the node table
- Section 5: Definition of the edge table
- Section 6: Construction of the AOP network
- Section 7: Adaptation of stylistic aspects of the AOP network
- Section 8: Metadata
Section 1: System preparation
In this section, you import pandas and py4cytoscape which are required for execution of this notebook.
Step 1: You import pandas and py4cytoscape. For py4cytoscape, Cytoscape must already be running on your computer before you execute the code; otherwise an error message will be shown.
import pandas as pd
import py4cytoscape as p4c
p4c.cytoscape_ping()
p4c.cytoscape_version_info()
You are connected to Cytoscape!
{'apiVersion': 'v1',
'cytoscapeVersion': '3.10.1',
'automationAPIVersion': '1.9.0',
'py4cytoscapeVersion': '1.9.0'}
Section 2: Data cleaning and preparation
In this section, the result of Jupyter notebook: ‘Agata,Shakira-The AOP project-Part 1’ is imported and cleaned in preparation for definition of the nodetable and edgetable. The result of the first Jupyter Notebook was saved as ‘final_result.json’.
Step 2: You read the final_result table of the first Jupyter notebook into a new dataframe.
final_result= pd.read_json("C:/Users/shaki/Downloads/final_result.json")
Step 3: Now, you remove the redundant column: ‘AOPName’ as it contains the same output as the column:‘AOPtitle’.
final_result1= final_result.drop(columns='AOPName')
Step 4: This is followed by the removal of duplicates in the KER column to prevent the visualization of duplicate identical interactions in the AOP network.
final_result2= final_result1.drop_duplicates(subset='KER', keep='first')
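Note that keep='first' retains only the first row for each KER, so when the same KER occurs in several AOPs or organ systems, the other annotations are discarded. A small sketch on invented data illustrates this effect:

```python
import pandas as pd

# Toy table: 'ker/5' is shared by two AOPs in different organ systems (invented values)
toy = pd.DataFrame({
    "AOP": ["aop/1", "aop/2", "aop/2"],
    "KER": ["ker/5", "ker/5", "ker/6"],
    "organ": ["Liver", "Lung", "Lung"],
})

# Only the first row per KER survives, so 'ker/5' keeps its 'Liver' annotation
dedup = toy.drop_duplicates(subset="KER", keep="first")
print(dedup)
```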
Step 5: Create a weight column and set the value to 1. This is a prerequisite for both the construction of the edge table and the adaptation of the visual style of the future AOP network. The dataframe is first copied, so that the new column is added to an independent dataframe rather than to a slice of final_result1 (which avoids the pandas SettingWithCopyWarning), and the column is then added with the assign function.
final_result2 = final_result2.copy()
final_result2 = final_result2.assign(weight='1')
Step 6: You remove the columns KE, KEID, KEname and KEtitle from the final_result2 dataframe to clean up the table. These columns were generated by the first and second SPARQL queries, but they can be removed from the final table because the definition of the node table and edge table only requires the upstream key event, downstream key event and adverse outcome (AO) columns.
Adaptedfinal_result2= final_result2.drop(['KE', 'KEname', 'KEID', 'KEtitle'], axis=1)
Step 7: Lastly, you rename the columns ‘upstreamKE’ and ‘downstreamKE’ to ‘upstreamKEID’ and ‘downstreamKEID’ for consistency within the dataframe.
secondfinal_result2= Adaptedfinal_result2.rename(columns= {'upstreamKE': 'upstreamKEID', 'downstreamKE':'downstreamKEID'})
Section 3: Data manipulation for node table
In this section, functions will be constructed that allow for differentiation between molecular initiating events (MIE), Key events (KE) and Adverse outcomes (AO) in the AOP network. This is important for the correct establishment of directionality of the AOP network as MIEs precede KEs and KEs precede AOs.
Step 8: You will identify the most upstream KEs as MIEs using the code below. This is done manually in this notebook because the queries in Part 1 do not retrieve which KE is the MIE of each AOP. The strategy is to define the MIEs as the upstreamKEs that never appear in the downstreamKE column, since the most upstream KE of a pathway cannot be downstream of another KE. You first build the mostupstreamKE set and give every row of secondfinal_result2 the type ‘KE’. Rows whose upstreamKEID is present in mostupstreamKE are then given the type ‘MIE’. Finally, rows whose downstreamKEID appears in the ‘ao’ column are given the type ‘AO’; all remaining rows keep the type ‘KE’.
mostupstreamKE = set(secondfinal_result2['upstreamKEID']) - set(secondfinal_result2['downstreamKEID'])
secondfinal_result2['type'] = 'KE'
secondfinal_result2.loc[secondfinal_result2['upstreamKEID'].isin(mostupstreamKE), 'type'] = 'MIE'
secondfinal_result2.loc[secondfinal_result2['downstreamKEID'].isin(secondfinal_result2['ao']), 'type'] = 'AO'
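The effect of these four lines is easiest to see on a linear toy pathway ke/1 → ke/2 → ke/3 → ke/4, in which ke/4 is the adverse outcome. The data below are invented for illustration; the column names follow the notebook.

```python
import pandas as pd

# One row per KER of a toy pathway: ke/1 -> ke/2 -> ke/3 -> ke/4 (AO)
edges = pd.DataFrame({
    "upstreamKEID":   ["ke/1", "ke/2", "ke/3"],
    "downstreamKEID": ["ke/2", "ke/3", "ke/4"],
    "ao":             ["ke/4", "ke/4", "ke/4"],
})

# KEs that occur upstream but never downstream: here only 'ke/1'
most_upstream = set(edges["upstreamKEID"]) - set(edges["downstreamKEID"])

edges["type"] = "KE"                                                     # default
edges.loc[edges["upstreamKEID"].isin(most_upstream), "type"] = "MIE"     # row starts at an MIE
edges.loc[edges["downstreamKEID"].isin(edges["ao"]), "type"] = "AO"      # row ends in an AO

print(edges["type"].tolist())
```

The type is assigned per row, that is per KER, and the ‘AO’ assignment is applied last, so a row that both starts at an MIE and ends in an AO is labelled ‘AO’.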
Step 9: You can optionally save the finaltable to Excel and verify output. I included this line of code to verify my results with AOP-Wiki as an intermediate check.
secondfinal_result2.to_excel('checksecondfinal_result.xlsx')
Section 4: Definition of the node table
In this section, you will define your node table which will include: ‘id’, ‘name’, ‘group’ and ‘type’.
Step 10: You will construct the node table by first making three dataframes for the nodes and then joining them together. The three dataframes are:
- MIE_nodes: These include the most upstream KEs among the upstreamKEs. They are obtained from the results of section 3 by selecting the rows with the value ‘MIE’ in the type column of the secondfinal_result2 table.
- KE_nodes: These include the downstreamKEs and upstreamKEs that are not MIEs. They are obtained from the results of section 3 by selecting the rows with the value ‘KE’ in the type column of the secondfinal_result2 table.
- AO_nodes: These include your adverse outcomes. They are obtained from the results of section 3 by selecting the rows with the value ‘AO’ in the type column of the secondfinal_result2 table.
MIE_nodes = secondfinal_result2[secondfinal_result2['type'] == 'MIE'][['upstreamKEID', 'upstreamKEtitle', 'organ', 'type']].rename(columns={'upstreamKEID': 'id', 'upstreamKEtitle': 'name', 'organ': 'group'})
KE_nodes = secondfinal_result2[secondfinal_result2['type'] == 'KE'][['downstreamKEID', 'downstreamKEtitle', 'organ', 'type']].copy()
KE_nodes.rename(columns={'downstreamKEID': 'id', 'downstreamKEtitle': 'name', 'organ': 'group'}, inplace=True)
upstreamKE_nodes = secondfinal_result2[(secondfinal_result2['type'] != 'MIE') & (secondfinal_result2['type'] != 'AO')][['upstreamKEID', 'upstreamKEtitle', 'organ', 'type']]
upstreamKE_nodes.rename(columns={'upstreamKEID': 'id', 'upstreamKEtitle': 'name', 'organ': 'group'}, inplace=True)
KE_nodes = pd.concat([KE_nodes, upstreamKE_nodes], ignore_index=True)
AO_nodes = secondfinal_result2[secondfinal_result2['type'] == 'AO'][['ao', 'aotitle', 'organ', 'type']].rename(columns={'ao': 'id', 'aotitle': 'name', 'organ': 'group'})
nodetable = pd.concat([MIE_nodes, KE_nodes, AO_nodes], ignore_index=True)
nodetable
| | id | name | group | type |
---|---|---|---|---|
0 | https://identifiers.org/aop.events/486 | systemic inflammation leading to hepatic steat... | Liver | MIE |
1 | https://identifiers.org/aop.events/875 | Binding of agonist, Ionotropic glutamate recep... | Brain | MIE |
2 | https://identifiers.org/aop.events/2007 | Non-coding RNA expression profile alteration | Lung | MIE |
3 | https://identifiers.org/aop.events/2007 | Non-coding RNA expression profile alteration | Lung | MIE |
4 | https://identifiers.org/aop.events/2007 | Non-coding RNA expression profile alteration | Lung | MIE |
... | ... | ... | ... | ... |
275 | https://identifiers.org/aop.events/1276 | Lung fibrosis | Lung | AO |
276 | https://identifiers.org/aop.events/1458 | Pulmonary fibrosis | Lung | AO |
277 | https://identifiers.org/aop.events/1090 | Increased, mesotheliomas | Lung | AO |
278 | https://identifiers.org/aop.events/341 | Impairment, Learning and memory | Brain | AO |
279 | https://identifiers.org/aop.events/352 | N/A, Neurodegeneration | Brain | AO |
280 rows × 4 columns
Step 11: You remove the duplicates so that only the unique nodes remain in the list. With the following call to drop_duplicates, rows are treated as duplicates only when both id and group match, so nodes that have the same id but a different group are retained. These nodes are valuable as they represent the KEs which are present in multiple organ systems.
finalnodetable= nodetable.drop_duplicates(['id', 'group'], keep='first')
finalnodetable
| | id | name | group | type |
---|---|---|---|---|
0 | https://identifiers.org/aop.events/486 | systemic inflammation leading to hepatic steat... | Liver | MIE |
1 | https://identifiers.org/aop.events/875 | Binding of agonist, Ionotropic glutamate recep... | Brain | MIE |
2 | https://identifiers.org/aop.events/2007 | Non-coding RNA expression profile alteration | Lung | MIE |
8 | https://identifiers.org/aop.events/1495 | Substance interaction with the lung resident c... | Lung | MIE |
11 | https://identifiers.org/aop.events/105 | Inhibition, Mitochondrial Electron Transport C... | Kidney | MIE |
... | ... | ... | ... | ... |
269 | https://identifiers.org/aop.events/896 | Parkinsonian motor deficits | Brain | AO |
270 | https://identifiers.org/aop.events/1588 | Bronchiolitis obliterans | Lung | AO |
271 | https://identifiers.org/aop.events/1549 | Liver Injury | Liver | AO |
272 | https://identifiers.org/aop.events/357 | Cholestasis, Pathology | Liver | AO |
276 | https://identifiers.org/aop.events/1458 | Pulmonary fibrosis | Lung | AO |
137 rows × 4 columns
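The behaviour of the subset argument in Step 11 can be illustrated with a small invented node list in which one key event occurs three times, twice in the lung and once in the liver:

```python
import pandas as pd

# Toy node list: the same KE appears twice for 'Lung' and once for 'Liver' (invented values)
nodes = pd.DataFrame({
    "id":    ["ke/1", "ke/1", "ke/1"],
    "name":  ["A", "A", "A"],
    "group": ["Lung", "Lung", "Liver"],
    "type":  ["MIE", "KE", "KE"],
})

# Rows count as duplicates only if both 'id' and 'group' are equal
unique_nodes = nodes.drop_duplicates(["id", "group"], keep="first")
print(unique_nodes)
```

Because keep='first' is used, the type that is retained for a node depends on the row order of the node table: in this toy example the lung node keeps ‘MIE’ only because that row comes first.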
Step 12: This list can also be saved to Excel to use for assignment of molecular pathways to the AOP network nodes present in this table. More details can be found in Jupyter Notebook: ‘Agata,Shakira-The AOP project-Part 3’.
finalnodetable.to_excel('finalnodetable.xlsx')
Section 5: Definition of the edge table
In this section, the edge table will be defined which includes: ‘source’, ‘target’, ‘interaction’, ‘id’, ‘weight’ and ‘group’.
Step 13: You will define these columns by selecting the corresponding columns from the secondfinal_result2 table. There are different methods to do so, but you will use the transpose() function in this case. Afterwards, you rename the columns to source, target, interaction, id and group respectively (weight keeps its name) so that the table can be read by Cytoscape.
preliminaryedgetable= pd.DataFrame([secondfinal_result2.upstreamKEID, secondfinal_result2.downstreamKEID, secondfinal_result2.KER, secondfinal_result2.KERID, secondfinal_result2.weight, secondfinal_result2.organ]).transpose()
edgetable= preliminaryedgetable.rename(columns= {'upstreamKEID': 'source', 'downstreamKEID':'target', 'KER':'interaction', 'KERID':'id', 'organ':'group'})
edgetable.to_excel('edgetable.xlsx')
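One of the other methods mentioned in Step 13 is to select the columns directly and rename them in a single step. The sketch below shows this alternative on an invented one-row table; it is not the approach used in this notebook. A practical difference is that direct selection keeps the original data type of each column, whereas building a dataframe from a list of Series and transposing it typically yields columns of type object.

```python
import pandas as pd

# Toy row with the column names used in this notebook (values invented)
toy = pd.DataFrame({
    "upstreamKEID": ["ke/1"], "downstreamKEID": ["ke/2"],
    "KER": ["ker/9"], "KERID": ["KER 9"],
    "weight": [1], "organ": ["Liver"], "type": ["MIE"],
})

# Mapping from notebook column names to the names expected for the edge table
edge_columns = {
    "upstreamKEID": "source",
    "downstreamKEID": "target",
    "KER": "interaction",
    "KERID": "id",
    "weight": "weight",
    "organ": "group",
}

# Select the needed columns in order, then rename them
edges = toy[list(edge_columns)].rename(columns=edge_columns)
print(edges)
```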
Section 6: Construction of the AOP network
In this section, the AOP network will be constructed using predefined py4cytoscape functions.
Step 14: You will first define the nodes, edges and title of the network.
p4c.create_network_from_data_frames(nodes=finalnodetable, edges=edgetable, title='Agata,S-The inflammatory-process related AOP network')
Applying default style...
Applying preferred layout
17292
Section 7: Adaptation of stylistic aspects of the AOP network
In this section, the style of your AOP network will be adapted and displayed.
Step 15: Prior to adapting the style of the network, you first need to import the stylistic mappings in py4cytoscape.
from py4cytoscape import get_node_color
from py4cytoscape import set_node_color_mapping
from py4cytoscape import gen_node_color_map
from py4cytoscape import set_edge_color_default
from py4cytoscape import set_node_color_default
from py4cytoscape import set_edge_source_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_color_default
from py4cytoscape import get_arrow_shapes
from py4cytoscape import get_edge_target_arrow_shape
from py4cytoscape import set_edge_target_arrow_shape_mapping
from py4cytoscape import gen_edge_arrow_map
from py4cytoscape import select_nodes
from py4cytoscape import get_table_value
from py4cytoscape import get_network_suid
from py4cytoscape import clear_selection
from py4cytoscape import set_node_color_bypass
Step 16: The style of the network can be changed with the following commands. In this Jupyter Notebook, the color of the nodes is changed according to the ‘type’ assignment and the edges are given arrow-shaped targets. This requires retrieval of the Cytoscape table columns.
style_name = "default"
defaults = {'NODE_SHAPE': "ELLIPSE", 'NODE_SIZE': 50, 'EDGE_TRANSPARENCY': 140, 'NODE_LABEL_POSITION': "C,C,c,0.00,0.00"}
nodeLabels = p4c.map_visual_property('node label', 'name', 'p')
edgeWidth = p4c.map_visual_property('edge width', 'weight', 'p')
arrowShapes = p4c.map_visual_property('Edge Target Arrow Shape','interaction', 'd')
p4c.create_visual_style(style_name, defaults, [nodeLabels, edgeWidth])
p4c.set_visual_style(style_name)
{'message': 'Visual Style applied.'}
set_edge_target_arrow_shape_default('ARROW', style_name='default')
set_edge_target_arrow_color_default('#000000', style_name='default')
''
Newtable= p4c.get_table_columns()
Newtable
(index) | SUID | shared name | id | group | type | degree.layout | name | selected |
---|---|---|---|---|---|---|---|---|
17665 | 17665 | https://identifiers.org/aop.events/1276 | https://identifiers.org/aop.events/1276 | Lung | AO | NaN | Lung fibrosis | False |
17410 | 17410 | https://identifiers.org/aop.events/2010 | https://identifiers.org/aop.events/2010 | Lung | KE | NaN | Pulmonary inflammation | False |
17668 | 17668 | https://identifiers.org/aop.events/344 | https://identifiers.org/aop.events/344 | Liver | AO | NaN | N/A, Liver fibrosis | False |
17413 | 17413 | https://identifiers.org/aop.events/2013 | https://identifiers.org/aop.events/2013 | Lung | KE | NaN | Airway remodeling | False |
17671 | 17671 | https://identifiers.org/aop.events/1841 | https://identifiers.org/aop.events/1841 | Brain | AO | NaN | Encephalitis | False |
... | ... | ... | ... | ... | ... | ... | ... | ... |
17401 | 17401 | https://identifiers.org/aop.events/389 | https://identifiers.org/aop.events/389 | Brain | KE | NaN | Increased, Intracellular Calcium overload | False |
17659 | 17659 | https://identifiers.org/aop.events/1941 | https://identifiers.org/aop.events/1941 | Brain | AO | NaN | Memory Loss | False |
17404 | 17404 | https://identifiers.org/aop.events/177 | https://identifiers.org/aop.events/177 | Liver | KE | NaN | Mitochondrial dysfunction | False |
17662 | 17662 | https://identifiers.org/aop.events/1090 | https://identifiers.org/aop.events/1090 | Lung | AO | NaN | Increased, mesotheliomas | False |
17407 | 17407 | https://identifiers.org/aop.events/2011 | https://identifiers.org/aop.events/2011 | Lung | KE | NaN | Emphysema | False |
122 rows × 8 columns
Step 17: The color of the nodes is changed using the ‘select and color’ method I developed. You first select the nodes of one type, convert their names to a list, and then apply one color to every entry of that list with py4cytoscape’s bypass function. The input must be a list, because passing a dataframe directly leads to an ‘ambiguous’ truth-value error. Selecting the nodes also highlights them in Cytoscape, which lets you verify the selection while the code runs.
Step 17a: Change appearance of MIE nodes to green.
MIEnodes_df = Newtable[Newtable['type'] == 'MIE']
MIEnodeslist = MIEnodes_df['name'].tolist()
p4c.select_nodes(MIEnodeslist,by_col='SUID')
{}
color = '#09d63a'
new_colors = [color] * len(MIEnodeslist)
p4c.set_node_color_bypass(node_names=MIEnodeslist, new_colors=new_colors)
''
Step 17b: Change appearance of KE nodes to yellow
KEnodes_df = Newtable[Newtable['type'] == 'KE']
KEnodeslist = KEnodes_df['name'].tolist()
p4c.select_nodes(KEnodeslist, by_col='SUID')
{'nodes': [17596], 'edges': []}
color = '#f7ff00'
new_colors = [color] * len(KEnodeslist)
p4c.set_node_color_bypass(node_names=KEnodeslist, new_colors=new_colors)
''
Step 17c: Changing appearance of the AO nodes to pink.
AO_nodes_df = Newtable[Newtable['type'] == 'AO']
AOnodeslist = AO_nodes_df['name'].tolist()
p4c.select_nodes(AOnodeslist, by_col='SUID')
{'nodes': [17596], 'edges': []}
color = '#ff28d5'
new_colors = [color] * len(AOnodeslist)
p4c.set_node_color_bypass(node_names=AOnodeslist, new_colors=new_colors)
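Steps 17a to 17c repeat the same pattern for each node type, so the list preparation can be sketched as one small helper. This is a minimal illustration on toy data and not the notebook's own code: the helper name `bypass_arguments` and the toy table are assumptions, and the py4cytoscape call is left as a comment because it needs a running Cytoscape instance.

```python
import pandas as pd

# Colours used in Steps 17a-17c
TYPE_COLORS = {'MIE': '#09d63a', 'KE': '#f7ff00', 'AO': '#ff28d5'}

def bypass_arguments(node_table, type_colors):
    """For each node type, return (node names, equally long list of colours)."""
    arguments = {}
    for node_type, color in type_colors.items():
        names = node_table.loc[node_table['type'] == node_type, 'name'].tolist()
        arguments[node_type] = (names, [color] * len(names))
    return arguments

# Toy stand-in for the table returned by p4c.get_table_columns()
toy_table = pd.DataFrame({
    'name': ['Pulmonary inflammation', 'Lung fibrosis', 'Emphysema'],
    'type': ['KE', 'AO', 'KE'],
})

arguments = bypass_arguments(toy_table, TYPE_COLORS)
for node_type, (names, colors) in arguments.items():
    print(node_type, names, colors)
    # With Cytoscape running, each non-empty pair could be passed on as:
    # p4c.set_node_color_bypass(node_names=names, new_colors=colors)
```

A type without nodes, such as MIE in the toy table, yields two empty lists, which makes it easy to skip the bypass call for that type.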
Section 8: Metadata
Step 18: Finally, the metadata belonging to this Jupyter Notebook is displayed, which contains the version numbers of packages and the system set-up for interested users. This requires the packages watermark and print_versions.
%load_ext watermark
!pip install print-versions
Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)
%watermark
Last updated: 2025-06-02T13:18:59.342286+02:00
Python implementation: CPython
Python version : 3.12.3
IPython version : 8.25.0
Compiler : MSC v.1938 64 bit (AMD64)
OS : Windows
Release : 11
Machine : AMD64
Processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
CPU cores : 8
Architecture: 64bit
from print_versions import print_versions
print_versions(globals())
json==2.0.9
ipykernel==6.28.0
pandas==2.2.2
py4cytoscape==1.9.0
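If installing an extra package is not an option, a similar version report can be sketched with the standard library module `importlib.metadata`. This is an alternative to the approach shown here, not a description of how print_versions works; the helper name `installed_versions` and the deliberately non-existent package name are assumptions for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_versions(package_names):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = None
    return versions

# The last name is a made-up package used to show the 'not installed' branch
report = installed_versions(['pandas', 'py4cytoscape', 'a-package-that-does-not-exist-xyz'])
for name, found in report.items():
    print(f"{name}=={found}" if found else f"{name}: not installed")
```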
References:
- Kutmon M, Ehrhart F, Willighagen EL, Evelo CT, Coort SL. 2018. CyTargetLinker app update: A flexible solution for network extension in Cytoscape. F1000Research 7:743. doi: 10.12688/f1000research.14613.1
- Kutmon M. 2024. cytargetlinker-automation. Maastricht: GitHub; [accessed 2024 December 18]. https://github.com/CyTargetLinker/cytargetlinker-automation
Part 3: Enrichment of the human inflammatory stress response AOP network
Author: Shakira Agata
This Jupyter Notebook describes the steps needed to enrich your AOP network by adding molecular pathways from WikiPathways and genes through CyTargetLinker. This notebook depends on the file ‘Agata,completenodetable.xlsx’, which contains the KE-WP mapping results. This notebook is subdivided into the following ten sections:
- Section 1: System preparation
- Section 2: KE-WP mapping
- Section 3: Loading the node table and edge table
- Section 4: Defining the node table and edge table
- Section 5: Construction of the molecular AOP network
- Section 6: Adaptation of stylistic aspects of the molecular AOP network
- Section 7: Extension of the molecular AOP network using CyTargetLinker and WikiPathways linkset
- Section 8: Changing the visual style of CyTargetLinker-extended molecular AOP network
- Section 9: Saving results
- Section 10: Metadata
Section 1: System preparation
In this section, you will import the necessary packages for this notebook and check the connection to Cytoscape.
Step 1: You will import pandas and py4cytoscape by running the following code.
import pandas as pd
import py4cytoscape as p4c
p4c.cytoscape_ping()
p4c.cytoscape_version_info()
You are connected to Cytoscape!
{'apiVersion': 'v1',
'cytoscapeVersion': '3.10.1',
'automationAPIVersion': '1.9.0',
'py4cytoscapeVersion': '1.9.0'}
Section 2: KE-WP mapping
In this section, you will execute KE-WP mapping. This is a manual process in which molecular pathways are assigned to the MIEs, KEs and AOs that are present in the human inflammatory stress response AOP network. This requires the decision tree method available at: https://github.com/marvinm2/KE-WP-mapping. Afterwards, your results should be tabulated with the following columns, in this order, for use in the next sections of the Jupyter Notebook: ID (KEID), Name (KEtitle), Group, Type, KE level of organisation, Cell/organ term, WPID, WPtitle, Confidence score and Causative/responsive.
- An example is provided in section 3.
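Since the later steps select columns by name, a quick comparison of the table header against the expected column list can catch naming differences early. The sketch below uses only the column names listed above; the helper name `compare_columns` and the example header are hypothetical.

```python
EXPECTED_COLUMNS = [
    'ID (KEID)', 'Name (KEtitle)', 'Group', 'Type', 'KE level of organisation',
    'Cell/organ term', 'WPID', 'WPtitle', 'Confidence score', 'Causative/responsive',
]

def compare_columns(actual, expected=EXPECTED_COLUMNS):
    """Report missing columns, unexpected columns, and whether the order matches."""
    actual = list(actual)
    return {
        'missing': [c for c in expected if c not in actual],
        'unexpected': [c for c in actual if c not in expected],
        'same_order': actual == list(expected),
    }

# Hypothetical header in which two columns were named differently
header = ['ID (KEID)', 'Name (KEtitle)', 'Group', 'type', 'KE level of organisation',
          'Cell/organ term', 'WPID', 'WPtitle', 'Confidence score', 'Association']
result = compare_columns(header)
print(result)
```

For a table loaded with pandas, the header could be passed as `list(dataframe.columns)`.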
Section 3: Loading the node table and edge table
In this section, the node table and edge table will be loaded and reorganised in preparation for section 4.
Step 2: You will now import the node table which contains the molecular pathway information. You will rename the columns ‘Type’ to ‘type’ and ‘Causative/responsive’ to ‘Association’ for coherence of the table.
nodetable0 = pd.read_excel('Agata,completenodetable.xlsx')
nodetable0.rename(columns={'Type': 'type', 'Causative/responsive': 'Association'}, inplace=True)
Note
For KE-WP mapping, it is advised to use: https://github.com/marvinm2/KE-WP-mapping to ensure smoother documentation of the KE-WP matches.
Section 4: Defining the nodetable and edgetable
In this section, you will define your node table and edge table.
Step 3: First, the nodes of the network are defined as MIE, KE, AO and molecular pathway. You will do this by creating a new dataframe, ‘molecularpathwaytable’, for the molecular pathway information and adding a ‘type’ column with the value ‘Molecular pathway’ to it. This makes it easier to integrate the table into the final node table with the ‘pd.concat’ function and to manipulate the future molecular AOP network.
preliminarymolecularpathway_table= pd.DataFrame([nodetable0.WPID, nodetable0.WPtitle, nodetable0.Group, nodetable0.Association]).transpose()
molecularpathwaytable= preliminarymolecularpathway_table.rename(columns= {'WPID': 'id', 'WPtitle':'name', 'Group':'group', 'Association':'Association type'})
molecularpathwaytable
(index) | id | name | group | Association type |
---|---|---|---|---|
0 | WP5095 | Overview of proinflammatory and profibrotic me... | Liver | Responsive |
1 | WP5083 | Neuroinflammation and glutamatergic signaling | Brain | Other |
2 | WP1545 | miRNAs involved in DNA damage response | Lung | Causative |
3 | WP1530 | miRNA regulation of DNA damage response | Lung | Other |
4 | WP4655 | Cytosolic DNA-sensing pathway | Lung | Responsive |
... | ... | ... | ... | ... |
287 | WP408 | Oxidative stress response | Brain | Causative |
288 | WP1772 | Apoptosis modulation and signaling | Brain | Causative |
289 | WP254 | Apoptosis | Brain | Causative |
290 | WP3676 | BDNF-TrkB signaling | Brain | Causative |
291 | WP5477 | Molecular pathway for oxidative stress | Brain | Causative |
292 rows × 4 columns
molecularpathwaytable['type'] = 'Molecular pathway'
molecularpathwaytable.dropna(inplace=True)
molecularpathwaytable
(index) | id | name | group | Association type | type |
---|---|---|---|---|---|
0 | WP5095 | Overview of proinflammatory and profibrotic me... | Liver | Responsive | Molecular pathway |
1 | WP5083 | Neuroinflammation and glutamatergic signaling | Brain | Other | Molecular pathway |
2 | WP1545 | miRNAs involved in DNA damage response | Lung | Causative | Molecular pathway |
3 | WP1530 | miRNA regulation of DNA damage response | Lung | Other | Molecular pathway |
4 | WP4655 | Cytosolic DNA-sensing pathway | Lung | Responsive | Molecular pathway |
... | ... | ... | ... | ... | ... |
287 | WP408 | Oxidative stress response | Brain | Causative | Molecular pathway |
288 | WP1772 | Apoptosis modulation and signaling | Brain | Causative | Molecular pathway |
289 | WP254 | Apoptosis | Brain | Causative | Molecular pathway |
290 | WP3676 | BDNF-TrkB signaling | Brain | Causative | Molecular pathway |
291 | WP5477 | Molecular pathway for oxidative stress | Brain | Causative | Molecular pathway |
275 rows × 5 columns
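The row count falls from 292 to 275 because `dropna` removes every row that has a missing value in at least one column; in this table these are presumably key events to which no pathway was assigned. A minimal sketch on toy data (the rows are illustrative, not taken from the real table) shows how to inspect the rows that would be removed before dropping them:

```python
import pandas as pd

toy = pd.DataFrame({
    'id': ['WP408', None, 'WP254'],
    'name': ['Oxidative stress response', None, 'Apoptosis'],
    'group': ['Brain', 'Lung', 'Brain'],
})
toy['type'] = 'Molecular pathway'

# Rows with a missing value in any column: these are the ones dropna() removes
rows_with_gaps = toy[toy.isna().any(axis=1)]

# Non-mutating variant of dropna(inplace=True)
cleaned = toy.dropna()

print(len(toy), len(cleaned))
print(rows_with_gaps)
```

Note that the constant ‘type’ column never contains missing values, so only gaps in the other columns lead to a row being dropped.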
Step 4: The nodes derived from the node table will be joined with the formed molecularpathway table to form the final nodetable. This table organises the nodes in a way that you can easily select node type with organ-system location. For the molecular pathways, this table also allows you to see the association type for the respective molecular pathways.
MIE_nodes = nodetable0[nodetable0['type'] == 'MIE'][['ID (KEID)', 'Name (KEtitle)', 'Group', 'type']].rename(columns={'ID (KEID)': 'id', 'Name (KEtitle)': 'name', 'Group': 'group'})
KE_nodes = nodetable0[nodetable0['type'] == 'KE'][['ID (KEID)', 'Name (KEtitle)', 'Group', 'type']].copy()
KE_nodes.rename(columns={'ID (KEID)': 'id', 'Name (KEtitle)': 'name', 'Group': 'group'}, inplace=True)
AO_nodes = nodetable0[nodetable0['type'] == 'AO'][['ID (KEID)', 'Name (KEtitle)', 'Group', 'type']].rename(columns={'ID (KEID)': 'id', 'Name (KEtitle)': 'name', 'Group': 'group'})
nodetable = pd.concat([MIE_nodes, KE_nodes, AO_nodes, molecularpathwaytable], ignore_index=True)
nodetable
(index) | id | name | group | type | Association type |
---|---|---|---|---|---|
0 | https://identifiers.org/aop.events/486 | systemic inflammation leading to hepatic steat... | Liver | MIE | NaN |
1 | https://identifiers.org/aop.events/875 | Binding of agonist, Ionotropic glutamate recep... | Brain | MIE | NaN |
2 | https://identifiers.org/aop.events/2007 | Non-coding RNA expression profile alteration | Lung | MIE | NaN |
3 | https://identifiers.org/aop.events/2007 | Non-coding RNA expression profile alteration | Lung | MIE | NaN |
4 | https://identifiers.org/aop.events/1495 | Substance interaction with the lung resident c... | Lung | MIE | NaN |
... | ... | ... | ... | ... | ... |
562 | WP408 | Oxidative stress response | Brain | Molecular pathway | Causative |
563 | WP1772 | Apoptosis modulation and signaling | Brain | Molecular pathway | Causative |
564 | WP254 | Apoptosis | Brain | Molecular pathway | Causative |
565 | WP3676 | BDNF-TrkB signaling | Brain | Molecular pathway | Causative |
566 | WP5477 | Molecular pathway for oxidative stress | Brain | Molecular pathway | Causative |
567 rows × 5 columns
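The `NaN` entries in the ‘Association type’ column of the MIE, KE and AO rows follow from how `pd.concat` works: it aligns the frames by column name and fills any column that a frame lacks with missing values. A small sketch with one event node and one pathway node (values chosen from the tables above for illustration) shows the effect:

```python
import pandas as pd

event_nodes = pd.DataFrame({
    'id': ['https://identifiers.org/aop.events/1276'],
    'name': ['Lung fibrosis'],
    'group': ['Lung'],
    'type': ['AO'],
})
pathway_nodes = pd.DataFrame({
    'id': ['WP254'],
    'name': ['Apoptosis'],
    'group': ['Brain'],
    'Association type': ['Causative'],
    'type': ['Molecular pathway'],
})

# ignore_index=True renumbers the rows 0..n-1 instead of keeping the old indices
combined = pd.concat([event_nodes, pathway_nodes], ignore_index=True)
print(combined)
```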
Step 5: You will load the edge table in the dataframe: ’edgetable1’
edgetable1= pd.read_excel('edgetable.xlsx')
Step 6: Now edgetable2 will be created, which will contain the confidence score and association type for the molecular pathways. This is needed because edgetable1 was made in the previous notebook, before molecular pathways were assigned to the inflammatory-related AOP network. You also have to establish the interactions between the molecular pathways and the MIEs, KEs and AOs so that they are present in the network, which the following code does.
edgetable2 = pd.DataFrame({
'source': nodetable0['ID (KEID)'],
'target': nodetable0['WPID'],
'interaction': ['interacts'] * len(nodetable0),
'group' : nodetable0['Group'],
'confidence score': nodetable0['Confidence score'],
'Association type': nodetable0['Association']
})
edgetable2
(index) | source | target | interaction | group | confidence score | Association type |
---|---|---|---|---|---|---|
0 | https://identifiers.org/aop.events/486 | WP5095 | interacts | Liver | Medium | Responsive |
1 | https://identifiers.org/aop.events/875 | WP5083 | interacts | Brain | Medium | Other |
2 | https://identifiers.org/aop.events/2007 | WP1545 | interacts | Lung | Medium | Causative |
3 | https://identifiers.org/aop.events/2007 | WP1530 | interacts | Lung | Medium | Other |
4 | https://identifiers.org/aop.events/1495 | WP4655 | interacts | Lung | High | Responsive |
... | ... | ... | ... | ... | ... | ... |
287 | https://identifiers.org/aop.events/352 | WP408 | interacts | Brain | High | Causative |
288 | https://identifiers.org/aop.events/352 | WP1772 | interacts | Brain | High | Causative |
289 | https://identifiers.org/aop.events/352 | WP254 | interacts | Brain | High | Causative |
290 | https://identifiers.org/aop.events/352 | WP3676 | interacts | Brain | High | Causative |
291 | https://identifiers.org/aop.events/352 | WP5477 | interacts | Brain | High | Causative |
292 rows × 6 columns
Step 7: For edgetable2, you also need to identify the valid edges, that is, the edges whose source and target both occur in the ‘id’ column of the node table.
edgetable_2 = edgetable2[edgetable2['source'].isin(nodetable['id']) & edgetable2['target'].isin(nodetable['id'])]
edgetable_2
(index) | source | target | interaction | group | confidence score | Association type |
---|---|---|---|---|---|---|
0 | https://identifiers.org/aop.events/486 | WP5095 | interacts | Liver | Medium | Responsive |
1 | https://identifiers.org/aop.events/875 | WP5083 | interacts | Brain | Medium | Other |
2 | https://identifiers.org/aop.events/2007 | WP1545 | interacts | Lung | Medium | Causative |
3 | https://identifiers.org/aop.events/2007 | WP1530 | interacts | Lung | Medium | Other |
4 | https://identifiers.org/aop.events/1495 | WP4655 | interacts | Lung | High | Responsive |
... | ... | ... | ... | ... | ... | ... |
287 | https://identifiers.org/aop.events/352 | WP408 | interacts | Brain | High | Causative |
288 | https://identifiers.org/aop.events/352 | WP1772 | interacts | Brain | High | Causative |
289 | https://identifiers.org/aop.events/352 | WP254 | interacts | Brain | High | Causative |
290 | https://identifiers.org/aop.events/352 | WP3676 | interacts | Brain | High | Causative |
291 | https://identifiers.org/aop.events/352 | WP5477 | interacts | Brain | High | Causative |
275 rows × 6 columns
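The filter keeps an edge only when both endpoints are known node ids, so edges with a missing or unknown target drop out. The following sketch reproduces the same `isin` logic on toy data and also keeps the rejected edges for inspection; the identifiers `KE1`, `KE2` and `WP999` are made up for illustration.

```python
import pandas as pd

toy_nodes = pd.DataFrame({'id': ['KE1', 'KE2', 'WP254']})
toy_edges = pd.DataFrame({
    'source': ['KE1', 'KE2', 'KE2'],
    'target': ['WP254', None, 'WP999'],   # one missing and one unknown target
    'interaction': ['interacts'] * 3,
})

known_ids = set(toy_nodes['id'])
is_valid = toy_edges['source'].isin(known_ids) & toy_edges['target'].isin(known_ids)

valid_edges = toy_edges[is_valid]      # same result as the filter in Step 7
dropped_edges = toy_edges[~is_valid]   # edges that would be left out of the network

print(valid_edges)
print(dropped_edges)
```

Looking at the dropped edges is a quick way to tell whether rows were lost because no pathway was assigned or because of a mismatch in the identifiers.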
Step 8: You can now make the final edge table by concatenating edgetable1 with the valid edges of edgetable2, which are stored in edgetable_2.
edgetable = pd.concat([edgetable1, edgetable_2], ignore_index=True)
Step 9: Lastly the additional spaces present in the node table and edge table will be removed. This is done to prevent parsing errors when constructing the network.
Step 9a: You first remove the additional spaces in the nodetable.
nodetable['id'] = nodetable['id'].str.strip()
Step 9b: You then remove the additional spaces in the source and target columns of the edge table.
edgetable['source'] = edgetable['source'].str.strip()
edgetable['target'] = edgetable['target'].str.strip()
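A short example illustrates why stray spaces matter: identifiers that differ only in leading or trailing whitespace are treated as different values until they are stripped. This is a minimal sketch on a toy series, not data from the notebook.

```python
import pandas as pd

ids = pd.Series(['WP254', ' WP254', 'WP408 '])

distinct_before = ids.nunique()        # whitespace variants count as distinct values
stripped = ids.str.strip()             # removes leading and trailing whitespace
distinct_after = stripped.nunique()
matches = stripped.isin(['WP254']).tolist()

print(distinct_before, distinct_after)
print(matches)
```

Without the strip, ‘ WP254’ would not match the node id ‘WP254’, and an edge that uses it would point to a different node than intended.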
Section 5: Construction of the molecular AOP network
In this section, the molecular AOP network will be constructed.
Step 10: You can construct the molecular AOP network and display the output.
p4c.create_network_from_data_frames(nodes=nodetable, edges=edgetable, title='Agata,S.-Molecular inflammatory-process related AOP network')
Applying default style...
Applying preferred layout
51035
Section 6: Adaptation of stylistic aspects of the molecular AOP network
In this section you can change the stylistic aspects of your molecular AOP network.
Step 11: You first have to import the necessary functions.
from py4cytoscape import get_node_color
from py4cytoscape import set_node_color_mapping
from py4cytoscape import gen_node_color_map
from py4cytoscape import set_edge_color_default
from py4cytoscape import set_node_color_default
from py4cytoscape import set_edge_source_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_shape_default
from py4cytoscape import get_arrow_shapes
from py4cytoscape import get_edge_target_arrow_shape
from py4cytoscape import set_edge_target_arrow_shape_mapping
from py4cytoscape import gen_edge_arrow_map
from py4cytoscape import select_nodes
from py4cytoscape import get_table_value
from py4cytoscape import get_network_suid
from py4cytoscape import clear_selection
from py4cytoscape import set_node_color_bypass
from py4cytoscape import set_edge_color_bypass
from py4cytoscape import set_edge_target_arrow_color_default
Step 12: Next, you can define the style for your network and set the node labels and edge width.
style_name = "default"
defaults = {'NODE_SHAPE': "ELLIPSE", 'NODE_SIZE': 50, 'EDGE_TRANSPARENCY': 140, 'NODE_LABEL_POSITION': "C,C,c,0.00,0.00"}
nodeLabels = p4c.map_visual_property('node label', 'name', 'p')
edgeWidth = p4c.map_visual_property('edge width', 'weight', 'p')
arrowShapes = p4c.map_visual_property('Edge Target Arrow Shape','interaction', 'd')
p4c.create_visual_style(style_name, defaults, [nodeLabels, edgeWidth])
p4c.set_visual_style(style_name)
{'message': 'Visual Style applied.'}
Step 13: You can change the target arrow shape and target arrow color of the edges to show the directionality of your network.
set_edge_target_arrow_shape_default('ARROW', style_name='default')
set_edge_target_arrow_color_default('#ffffff', style_name='default')
''
Step 14: In preparation for step 15, you will retrieve the columns of the nodetable.
table_columns = p4c.get_table_columns()
table_columns
(index) | SUID | shared name | id | group | type | Association type | name | selected |
---|---|---|---|---|---|---|---|---|
51201 | 51201 | https://identifiers.org/aop.events/1943 | https://identifiers.org/aop.events/1943 | Brain | KE | None | Hyperphosphorylation of Tau | False |
51204 | 51204 | https://identifiers.org/aop.events/386 | https://identifiers.org/aop.events/386 | Brain | KE | None | Decrease of neuronal network function | False |
51207 | 51207 | https://identifiers.org/aop.events/1944 | https://identifiers.org/aop.events/1944 | Brain | KE | None | Synaptic dysfunction | False |
51210 | 51210 | https://identifiers.org/aop.events/1582 | https://identifiers.org/aop.events/1582 | Brain | KE | None | Impaired axonial transport | False |
51213 | 51213 | https://identifiers.org/aop.events/1942 | https://identifiers.org/aop.events/1942 | Brain | KE | None | Accumulation, Cytosolic toxic Tau oligomers | False |
... | ... | ... | ... | ... | ... | ... | ... | ... |
51705 | 51705 | WP5485 | WP5485 | Brain | Molecular pathway | Causative | Post-COVID neuroinflammation | False |
51195 | 51195 | https://identifiers.org/aop.events/1392 | https://identifiers.org/aop.events/1392 | Brain | KE | None | Oxidative Stress | False |
51708 | 51708 | WP2371 | WP2371 | Brain | Molecular pathway | Other | Parkinson's disease pathway | False |
51198 | 51198 | https://identifiers.org/aop.events/1815 | https://identifiers.org/aop.events/1815 | Liver | KE | None | Activation of ER stress | False |
51711 | 51711 | WP4197 | WP4197 | Liver | Molecular pathway | Causative | Immune response to tuberculosis | False |
213 rows × 8 columns
Step 15: You can change the color of the nodes by selection of the node type followed by list conversion and color change with the bypass function.
Step 15a: Changing appearance of the MIE nodes to green.
table_columns_df = table_columns[table_columns['type'] == 'MIE']
MIEnodeslist = table_columns_df['name'].tolist()
p4c.select_nodes(MIEnodeslist)
{}
color = '#09d63a'
new_colors = [color] * len(MIEnodeslist)
p4c.set_node_color_bypass(node_names=MIEnodeslist, new_colors=new_colors, network='current')
''
Step 15b: Changing appearance of the KE nodes to yellow.
table_columns_df2 = table_columns[table_columns['type'] == 'KE']
KEnodeslist = table_columns_df2['name'].tolist()
p4c.select_nodes(KEnodeslist, by_col='SUID')
{'nodes': [51519, 51342], 'edges': []}
color = '#f7ff00'
new_colors = [color] * len(KEnodeslist)
p4c.set_node_color_bypass(node_names=KEnodeslist, new_colors=new_colors, network='current')
''
Step 15c: Changing appearance of the AO nodes to pink.
table_columns_df3 = table_columns[table_columns['type'] == 'AO']
AOnodeslist = table_columns_df3['name'].tolist()
p4c.select_nodes(AOnodeslist)
{'nodes': [51519, 51342], 'edges': []}
color = '#ff28d5'
new_colors = [color] * len(AOnodeslist)
p4c.set_node_color_bypass(node_names=AOnodeslist, new_colors=new_colors, network='current')
''
Step 15d: Changing appearance of the molecular pathway nodes to grey.
table_columns_df4 = table_columns[table_columns['type'] == 'Molecular pathway']
molecularpathwaynodeslist = table_columns_df4['name'].tolist()
p4c.select_nodes(molecularpathwaynodeslist)
{'nodes': [51519, 51342], 'edges': []}
color = '#414a4c'
new_colors = [color] * len(molecularpathwaynodeslist)
p4c.set_node_color_bypass(node_names=molecularpathwaynodeslist, new_colors=new_colors, network='current')
''
set_node_color_default('#a7a5a5',style_name='default')
''
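An alternative to colouring nodes one type at a time with bypasses is a discrete style mapping on the ‘type’ column, which py4cytoscape offers through `set_node_color_mapping` (imported in Step 11). The sketch below is not the method used in this notebook; it only prepares the two parallel lists such a mapping takes, and the py4cytoscape call is shown as a comment because it requires a running Cytoscape instance and was not tested here. The helper name `discrete_mapping_lists` is an assumption.

```python
# Colours from Steps 15a-15d, keyed by the values of the 'type' column
TYPE_COLORS = {
    'MIE': '#09d63a',
    'KE': '#f7ff00',
    'AO': '#ff28d5',
    'Molecular pathway': '#414a4c',
}

def discrete_mapping_lists(type_colors):
    """Split a {value: colour} dict into two parallel lists of equal length."""
    values = list(type_colors.keys())
    colors = [type_colors[value] for value in values]
    return values, colors

values, colors = discrete_mapping_lists(TYPE_COLORS)
print(values)
print(colors)

# With Cytoscape running, the lists could be applied in a single call:
# p4c.set_node_color_mapping('type', table_column_values=values, colors=colors,
#                            mapping_type='d', style_name='default')
```

Bypass values take precedence over style mappings in Cytoscape, so a mapping like this only becomes visible on nodes that have no colour bypass set.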
Section 7: Extension of the molecular AOP network using Cytargetlinker and WikiPathways linkset
In this section, the AOP network will be enriched with genes associated with the molecular pathways you matched to your MIEs, KEs and AOs. This will be done using CyTargetLinker.
Step 16: You first install the CyTargetLinker app and check its functionality and status.
p4c.install_app('CyTargetLinker')
{}
{}
p4c.get_app_information('CyTargetLinker')
{'app': 'CyTargetLinker',
'descriptionName': 'Flexible network extension app',
'version': '4.1.0'}
p4c.get_app_status('CyTargetLinker')
{'appName': 'CyTargetLinker', 'status': 'Installed'}
p4c.commands_get('cytargetlinker extend')
["Available arguments for 'cytargetlinker extend':",
'direction',
'idAttribute',
'linkSetDirectory',
'linkSetFiles',
'network']
Step 17: You import the operating system (os) package in preparation for step 18. The os module lets Python interact with the operating system of your computer, for example to build file paths.
import os
Step 18: Next, you define the file path from which the WikiPathways linkset file is retrieved on your computer.
file_path = os.path.join(os.getcwd(), "wikipathways_hsa_20240410 (1).xgmml")
linkset = "linkSetFiles="+file_path
print(linkset)
linkSetFiles=C:\Users\shaki\Downloads\wikipathways_hsa_20240410 (1).xgmml
Step 19: You construct the command string for the network extension, which includes the arguments linkSetFiles, idAttribute and direction.
cmd_cytargetlinker = ['cytargetlinker','extend', linkset, 'idAttribute="id"', 'direction="BOTH"']
cmd_ctl = " ".join(cmd_cytargetlinker)
print(cmd_ctl)
cytargetlinker extend linkSetFiles=C:\Users\shaki\Downloads\wikipathways_hsa_20240410 (1).xgmml idAttribute="id" direction="BOTH"
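The string assembly of Steps 18 and 19 can be wrapped in a small function so that the same command can be rebuilt for another linkset file. This is a minimal sketch that only builds and prints the string; the function name `build_extend_command` and the file name `example_linkset.xgmml` are hypothetical, and the check with `os.path.isfile` is an added convenience that is not part of the original steps.

```python
import os

def build_extend_command(linkset_path, id_attribute='id', direction='BOTH'):
    """Assemble the 'cytargetlinker extend' command string from its three arguments."""
    parts = [
        'cytargetlinker', 'extend',
        'linkSetFiles=' + linkset_path,
        f'idAttribute="{id_attribute}"',
        f'direction="{direction}"',
    ]
    return ' '.join(parts)

# Hypothetical file name; replace it with the linkset file you downloaded
linkset_path = os.path.join(os.getcwd(), 'example_linkset.xgmml')
command = build_extend_command(linkset_path)

print(command)
print('linkset file found:', os.path.isfile(linkset_path))
```

Checking that the file exists before running the command makes a wrong working directory or a misspelled file name visible immediately.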
Step 20: Lastly, you run the command for the network extension and display the output. You use commands.commands_get because this function converts the command string of step 19 into a CyREST query.
p4c.commands.commands_get(cmd_ctl)
['Extension step: 1',
'Linkset: WikiPathways-20240410_Homo sapiens_20240410',
'Added edges: 5758',
'Added nodes: 2739']
Section 8: Changing the visual style of CytargetLinker-extended molecular AOP network
In this section, you will apply Cytoscape’s preferred layout and the visual style to the extended AOP network.
Step 21: You will apply the layout to your extended network, which uses Cytoscape’s preferred layout for organising the network.
cmd_cytargetlinker = ['cytargetlinker','applyLayout', 'network="current"']
cmd_ctl = " ".join(cmd_cytargetlinker)
p4c.commands.commands_get(cmd_ctl)
[]
Step 22: You can now apply the previous visual style to your extended network and view the network in Cytoscape.
cmd_cytargetlinker = ['cytargetlinker','applyVisualstyle', 'network="current"']
cmd_ctl = " ".join(cmd_cytargetlinker)
p4c.commands.commands_get(cmd_ctl)
[]
style_name = "default"
defaults = {'NODE_SHAPE': "ELLIPSE", 'NODE_SIZE': 50, 'EDGE_TRANSPARENCY': 140, 'NODE_LABEL_POSITION': "C,C,c,0.00,0.00"}
nodeLabels = p4c.map_visual_property('node label', 'name', 'p')
edgeWidth = p4c.map_visual_property('edge width', 'weight', 'p')
arrowShapes = p4c.map_visual_property('Edge Target Arrow Shape','interaction', 'd')
p4c.create_visual_style(style_name, defaults, [nodeLabels, edgeWidth])
p4c.set_visual_style(style_name)
{'message': 'Visual Style applied.'}
Section 9: Saving results
In this section, the Cytoscape network, along with the style settings and bypass settings, is saved for the next Jupyter Notebook: ‘Agata,Shakira-The AOP project-Part 4’.
Step 23: You will save your molecular AOP network as a Cytoscape session in preparation for the next part.
p4c.save_session('Agata,S.-Part4-Complete Molecular inflammation-process related AOP network')
This file has been overwritten.
{}
Section 10: Metadata
Step 24: Finally, the metadata belonging to this Jupyter Notebook is displayed, which contains the version numbers of packages and the system set-up for interested users. This requires the packages watermark and print_versions.
%load_ext watermark
!pip install print-versions
Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)
%watermark
Last updated: 2025-06-02T18:23:43.809510+02:00
Python implementation: CPython
Python version : 3.12.3
IPython version : 8.25.0
Compiler : MSC v.1938 64 bit (AMD64)
OS : Windows
Release : 11
Machine : AMD64
Processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
CPU cores : 8
Architecture: 64bit
from print_versions import print_versions
print_versions(globals())
json==2.0.9
ipykernel==6.28.0
numpy==1.26.4
pandas==2.2.2
ipywidgets==8.0.3
xarray==2023.6.0
py4cytoscape==1.9.0
References:
- Kutmon M, Ehrhart F, Willighagen EL, Evelo CT, Coort SL. 2018. CyTargetLinker app update: A flexible solution for network extension in Cytoscape. F1000Research 7:743. doi: 10.12688/f1000research.14613.1
- Kutmon M. 2024. cytargetlinker-automation. Maastricht: GitHub; [accessed 2024 December 18]. https://github.com/CyTargetLinker/cytargetlinker-automation