Part 2: Construction of a human inflammatory stress response AOP network

The AOP project ► Key objective 1

Author: Shakira Agata

This Jupyter notebook describes the steps needed to construct an AOP network that focuses on inflammatory processes in human organ systems. Briefly, the node table and edge table were defined, followed by construction of the AOP network and adaptation of its visual style with py4cytoscape. py4cytoscape is a Python package that communicates with Cytoscape to visualize molecular networks and biological pathways. The detailed steps are outlined in the following eight sections:

  • Section 1: System preparation
  • Section 2: Data cleaning and preparation
  • Section 3: Data manipulation for node table
  • Section 4: Definition of the node table
  • Section 5: Definition of the edge table
  • Section 6: Construction of the AOP network
  • Section 7: Adaptation of stylistic aspects of the AOP network
  • Section 8: Metadata

Section 1: System preparation

In this section, you import pandas and py4cytoscape, which are required to run this notebook.

Step 1: Import pandas and py4cytoscape. Cytoscape must already be running on your computer before you run the code; otherwise, an error message is shown.

import pandas as pd
import py4cytoscape as p4c
p4c.cytoscape_ping()
p4c.cytoscape_version_info()
You are connected to Cytoscape!

{'apiVersion': 'v1',
 'cytoscapeVersion': '3.10.1',
 'automationAPIVersion': '1.9.0',
 'py4cytoscapeVersion': '1.9.0'}
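
The connection check above can be wrapped so that a missing Cytoscape session gives a short message instead of a traceback. This is only a sketch: the helper name cytoscape_is_reachable is hypothetical and not part of py4cytoscape, and it relies solely on p4c.cytoscape_ping() as used in Step 1.

```python
def cytoscape_is_reachable():
    """Return True if py4cytoscape can reach a running Cytoscape instance."""
    try:
        import py4cytoscape as p4c
        p4c.cytoscape_ping()
        return True
    except Exception as err:  # missing package, closed Cytoscape, connection error
        print(f"Cytoscape is not reachable: {err}")
        return False


# Hypothetical flag that later cells could check before sending commands.
cytoscape_ready = cytoscape_is_reachable()
```

If cytoscape_ready is False, open Cytoscape and run the cell again before continuing.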

Section 2: Data cleaning and preparation

In this section, the result of the Jupyter notebook ‘Agata,Shakira-The AOP project-Part 1’ is imported and cleaned in preparation for defining the node table and edge table. The result of that first notebook was saved as ‘final_result.json’.

Step 2: You read the final_result table of the first Jupyter notebook into a new dataframe.

final_result= pd.read_json("C:/Users/shaki/Downloads/final_result.json")
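
Before cleaning, it may help to confirm that the loaded table contains the columns used later in this notebook. The sketch below uses a hypothetical two-row stand-in for final_result.json; the column names follow those mentioned in this notebook, while the values are invented for illustration.

```python
import io

import pandas as pd

# Invented stand-in for final_result.json (illustration only).
toy_json = io.StringIO(
    '[{"AOPName": "a", "AOPtitle": "a", "KER": "k1",'
    ' "upstreamKE": "e1", "downstreamKE": "e2"},'
    ' {"AOPName": "b", "AOPtitle": "b", "KER": "k2",'
    ' "upstreamKE": "e2", "downstreamKE": "e3"}]'
)
toy_result = pd.read_json(toy_json)

# Columns that the following steps rely on.
required_columns = {"AOPName", "AOPtitle", "KER", "upstreamKE", "downstreamKE"}
missing_columns = required_columns - set(toy_result.columns)
print(toy_result.shape, "missing:", sorted(missing_columns))
```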

Step 3: Now, you remove the redundant column ‘AOPName’, as it contains the same output as the column ‘AOPtitle’.

final_result1= final_result.drop(columns='AOPName', axis=1)
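
Whether two columns really hold the same values can be checked before one of them is dropped. A minimal sketch on invented data, assuming both columns have the same dtype:

```python
import pandas as pd

# Invented example rows; only the column names come from this notebook.
toy = pd.DataFrame({
    "AOPName": ["x", "y"],
    "AOPtitle": ["x", "y"],
    "KER": ["k1", "k2"],
})

# Series.equals compares the values element by element.
columns_identical = toy["AOPName"].equals(toy["AOPtitle"])
if columns_identical:
    toy = toy.drop(columns="AOPName")
```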

Step 4: This is followed by the removal of duplicates in the KER column, so that the same interaction is not drawn more than once in the AOP network.

final_result2= final_result1.drop_duplicates(subset='KER', keep='first')
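
The effect of keep='first' can be seen on a small invented table: when one KER occurs in two AOPs, only the first row is kept, so the AOP information of the later row is no longer in the table.

```python
import pandas as pd

# Invented rows: KER "k1" occurs in two different AOPs.
toy = pd.DataFrame({
    "KER": ["k1", "k1", "k2"],
    "AOPtitle": ["AOP A", "AOP B", "AOP B"],
})

# Rows that drop_duplicates would remove, for inspection.
duplicated_rows = toy[toy.duplicated(subset="KER", keep="first")]
deduplicated = toy.drop_duplicates(subset="KER", keep="first")
```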

Step 5: Create a weight column and set its value to 1 (stored as the string '1' in the code below). This is a prerequisite for both constructing the edge table and adapting the visual style of the AOP network. The dataframe is first copied, so that the new column is added to an independent dataframe rather than to a slice of final_result1, and the column is then added with the assign function.

final_result2 = final_result2.copy()
final_result2 = final_result2.assign(weight='1')
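
Note that assign(weight='1') stores text rather than a number. If a numeric weight is preferred, pd.to_numeric can convert the column. The sketch below uses an invented table and is an optional variation, not a required step of this notebook.

```python
import pandas as pd

toy = pd.DataFrame({"KER": ["k1", "k2"]})

# Same pattern as in Step 5: copy, then assign.
toy = toy.copy().assign(weight="1")
weight_as_text = toy["weight"].tolist()

# Optional: convert the text weights to integers.
toy["weight"] = pd.to_numeric(toy["weight"])
```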

Step 6: You remove the columns KE, KEID, KEname and KEtitle from the final_result2 dataframe to clean up the table. These columns were generated during the second SPARQL query, but can be removed from the final table because the definition of the node table and edge table only requires the upstream key event, downstream key event and adverse outcome (AO) columns.

Adaptedfinal_result2= final_result2.drop(['KE', 'KEname', 'KEID', 'KEtitle'], axis=1)

Step 7: Finally, you rename the columns ‘upstreamKE’ and ‘downstreamKE’ to ‘upstreamKEID’ and ‘downstreamKEID’ so that the column names of the dataframe are consistent.

secondfinal_result2= Adaptedfinal_result2.rename(columns= {'upstreamKE': 'upstreamKEID', 'downstreamKE':'downstreamKEID'})
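
Steps 3 to 7 could also be written as a single method chain, which avoids the intermediate dataframe names. This is an alternative sketch on invented data; the column names follow this notebook, and the result name cleaned is hypothetical.

```python
import pandas as pd

# Invented rows with the columns handled in Steps 3 to 7.
toy = pd.DataFrame({
    "AOPName": ["a", "a"], "AOPtitle": ["a", "a"],
    "KER": ["k1", "k1"], "KE": ["e1", "e1"], "KEID": [1, 1],
    "KEname": ["n", "n"], "KEtitle": ["t", "t"],
    "upstreamKE": ["e1", "e1"], "downstreamKE": ["e2", "e2"],
})

cleaned = (
    toy.drop(columns="AOPName")                              # Step 3
       .drop_duplicates(subset="KER", keep="first")          # Step 4
       .assign(weight="1")                                   # Step 5
       .drop(columns=["KE", "KEname", "KEID", "KEtitle"])    # Step 6
       .rename(columns={"upstreamKE": "upstreamKEID",        # Step 7
                        "downstreamKE": "downstreamKEID"})
)
```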

Section 3: Data manipulation for node table

In this section, the rows of the table are labelled so that molecular initiating events (MIE), key events (KE) and adverse outcomes (AO) can be distinguished in the AOP network. This is important for correctly establishing the directionality of the AOP network, as MIEs precede KEs and KEs precede AOs.

Step 8: You will identify the most upstream KEs as MIEs using the code below. This is done manually because the SPARQL queries do not assign MIEs within the respective AOPs. The strategy is to define the MIEs as the upstream KEs that never occur in the downstreamKEID column, since the most upstream KE of a pathway cannot be downstream of another event. You first collect these identifiers in the mostupstreamKE variable and give every row the default type ‘KE’. Rows of secondfinal_result2 whose upstreamKEID is in mostupstreamKE then get the type ‘MIE’. Finally, rows whose downstreamKEID occurs in the ao column get the type ‘AO’; because this assignment runs last, it takes precedence over ‘MIE’. All remaining rows keep the type ‘KE’.

mostupstreamKE = set(secondfinal_result2['upstreamKEID']) - set(secondfinal_result2['downstreamKEID'])
secondfinal_result2['type'] = 'KE'

secondfinal_result2.loc[secondfinal_result2['upstreamKEID'].isin(mostupstreamKE), 'type'] = 'MIE'

secondfinal_result2.loc[secondfinal_result2['downstreamKEID'].isin(secondfinal_result2['ao']), 'type'] = 'AO'
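
The labelling rules can be followed on a small invented pathway A → B → C → D with adverse outcome D, plus a direct relationship E → D. The event names are placeholders; the logic mirrors the three assignments above.

```python
import pandas as pd

# Invented key event relationships; D is the adverse outcome.
toy = pd.DataFrame({
    "upstreamKEID":   ["A", "B", "C", "E"],
    "downstreamKEID": ["B", "C", "D", "D"],
    "ao":             ["D", "D", "D", "D"],
})

# Upstream events that are never downstream of anything: A and E.
most_upstream = set(toy["upstreamKEID"]) - set(toy["downstreamKEID"])

toy["type"] = "KE"
toy.loc[toy["upstreamKEID"].isin(most_upstream), "type"] = "MIE"
toy.loc[toy["downstreamKEID"].isin(toy["ao"]), "type"] = "AO"
print(toy)
```

The row E → D is first labelled ‘MIE’ and then relabelled ‘AO’, which shows that the label belongs to the row as a whole and that the last assignment wins.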

Step 9: You can optionally save the table to Excel and verify the output. I included this line of code as an intermediate check of my results against AOP-Wiki.

secondfinal_result2.to_excel('checksecondfinal_result.xlsx')
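
A quick check inside the notebook, in addition to the Excel export, is to count how many rows received each type and to confirm that no unexpected label occurs. A minimal sketch on an invented type column:

```python
import pandas as pd

toy = pd.DataFrame({"type": ["MIE", "KE", "KE", "AO"]})

# Number of rows per type, and any label outside the three expected ones.
type_counts = toy["type"].value_counts().to_dict()
unexpected_types = set(toy["type"]) - {"MIE", "KE", "AO"}
```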

Section 4: Definition of the node table

In this section, you will define your node table, which will include the columns ‘id’, ‘name’, ‘group’ and ‘type’.

Step 10: You will construct the node table by first making three dataframes for the nodes and then joining them together. The three dataframes are:

  1. MIE_nodes: These include the most upstreamKE from the upstreamKEs. The results of section 3 will be employed for this by selecting the values ‘MIE’ from the type column of the secondfinal_result2 table.

  2. KE_nodes: These include the downstreamKEs and upstreamKEs that were not MIE. The results of section 3 will be employed for this by selecting the values ‘KE’ from the type column of the secondfinal_result2 table.

  3. AO_nodes: These include your adverse outcomes. The results of section 3 will be employed for this by selecting the values ‘AO’ from the type column of the secondfinal_result2 table.

MIE_nodes = secondfinal_result2[secondfinal_result2['type'] == 'MIE'][['upstreamKEID', 'upstreamKEtitle', 'organ', 'type']].rename(columns={'upstreamKEID': 'id', 'upstreamKEtitle': 'name', 'organ': 'group'})

KE_nodes = secondfinal_result2[secondfinal_result2['type'] == 'KE'][['downstreamKEID', 'downstreamKEtitle', 'organ', 'type']].copy()
KE_nodes.rename(columns={'downstreamKEID': 'id', 'downstreamKEtitle': 'name', 'organ': 'group'}, inplace=True)

upstreamKE_nodes = secondfinal_result2[(secondfinal_result2['type'] != 'MIE') & (secondfinal_result2['type'] != 'AO')][['upstreamKEID', 'upstreamKEtitle', 'organ', 'type']]
upstreamKE_nodes.rename(columns={'upstreamKEID': 'id', 'upstreamKEtitle': 'name', 'organ': 'group'}, inplace=True)
KE_nodes = pd.concat([KE_nodes, upstreamKE_nodes], ignore_index=True)

AO_nodes = secondfinal_result2[secondfinal_result2['type'] == 'AO'][['ao', 'aotitle', 'organ', 'type']].rename(columns={'ao': 'id', 'aotitle': 'name', 'organ': 'group'})

nodetable = pd.concat([MIE_nodes, KE_nodes, AO_nodes], ignore_index=True)

nodetable

     id                                       name                                               group  type
0    https://identifiers.org/aop.events/486   systemic inflammation leading to hepatic steat...  Liver  MIE
1    https://identifiers.org/aop.events/875   Binding of agonist, Ionotropic glutamate recep...  Brain  MIE
2    https://identifiers.org/aop.events/2007  Non-coding RNA expression profile alteration       Lung   MIE
3    https://identifiers.org/aop.events/2007  Non-coding RNA expression profile alteration       Lung   MIE
4    https://identifiers.org/aop.events/2007  Non-coding RNA expression profile alteration       Lung   MIE
...  ...                                      ...                                                ...    ...
275  https://identifiers.org/aop.events/1276  Lung fibrosis                                      Lung   AO
276  https://identifiers.org/aop.events/1458  Pulmonary fibrosis                                 Lung   AO
277  https://identifiers.org/aop.events/1090  Increased, mesotheliomas                           Lung   AO
278  https://identifiers.org/aop.events/341   Impairment, Learning and memory                    Brain  AO
279  https://identifiers.org/aop.events/352   N/A, Neurodegeneration                             Brain  AO

280 rows × 4 columns
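
The four selections in Step 10 follow the same pattern: filter on type, pick an id and a title column, and rename. A small helper can express this once. The function name select_nodes_of_type is hypothetical and the data are invented; the column names follow this notebook.

```python
import pandas as pd


def select_nodes_of_type(table, node_type, id_col, name_col):
    """Rows of one type, reduced to the id/name/group/type node columns."""
    subset = table.loc[table["type"] == node_type,
                       [id_col, name_col, "organ", "type"]]
    return subset.rename(columns={id_col: "id", name_col: "name",
                                  "organ": "group"})


# Invented pathway A -> B -> C -> D, already labelled as in Step 8.
toy = pd.DataFrame({
    "upstreamKEID": ["A", "B", "C"], "upstreamKEtitle": ["a", "b", "c"],
    "downstreamKEID": ["B", "C", "D"], "downstreamKEtitle": ["b", "c", "d"],
    "ao": ["D", "D", "D"], "aotitle": ["d", "d", "d"],
    "organ": ["Liver", "Liver", "Liver"],
    "type": ["MIE", "KE", "AO"],
})

toy_nodes = pd.concat([
    select_nodes_of_type(toy, "MIE", "upstreamKEID", "upstreamKEtitle"),
    select_nodes_of_type(toy, "KE", "downstreamKEID", "downstreamKEtitle"),
    select_nodes_of_type(toy, "KE", "upstreamKEID", "upstreamKEtitle"),
    select_nodes_of_type(toy, "AO", "ao", "aotitle"),
], ignore_index=True)
```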

Step 11: You remove the duplicates so that only the unique nodes remain in the list. The following call removes duplicate nodes but retains nodes that have the same id and a different group. These nodes are valuable, as they represent the KEs that are present in multiple organ systems.

finalnodetable= nodetable.drop_duplicates(['id', 'group'], keep='first')
finalnodetable

     id                                       name                                               group   type
0    https://identifiers.org/aop.events/486   systemic inflammation leading to hepatic steat...  Liver   MIE
1    https://identifiers.org/aop.events/875   Binding of agonist, Ionotropic glutamate recep...  Brain   MIE
2    https://identifiers.org/aop.events/2007  Non-coding RNA expression profile alteration       Lung    MIE
8    https://identifiers.org/aop.events/1495  Substance interaction with the lung resident c...  Lung    MIE
11   https://identifiers.org/aop.events/105   Inhibition, Mitochondrial Electron Transport C...  Kidney  MIE
...  ...                                      ...                                                ...     ...
269  https://identifiers.org/aop.events/896   Parkinsonian motor deficits                        Brain   AO
270  https://identifiers.org/aop.events/1588  Bronchiolitis obliterans                           Lung    AO
271  https://identifiers.org/aop.events/1549  Liver Injury                                       Liver   AO
272  https://identifiers.org/aop.events/357   Cholestasis, Pathology                             Liver   AO
276  https://identifiers.org/aop.events/1458  Pulmonary fibrosis                                 Lung    AO

137 rows × 4 columns
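
How deduplication on the pair (id, group) behaves can be seen on an invented node list in which one event occurs twice in the lung and once in the liver. The last line lists the ids that remain in more than one organ system.

```python
import pandas as pd

# Invented nodes: "e1" occurs twice in Lung and once in Liver.
toy_nodes = pd.DataFrame({
    "id":    ["e1", "e1", "e1", "e2"],
    "name":  ["Inflammation", "Inflammation", "Inflammation", "Fibrosis"],
    "group": ["Lung", "Lung", "Liver", "Lung"],
    "type":  ["KE", "KE", "KE", "AO"],
})

unique_nodes = toy_nodes.drop_duplicates(["id", "group"], keep="first")

# Ids that are still present in more than one group after deduplication.
multi_organ_ids = (
    unique_nodes.loc[unique_nodes["id"].duplicated(keep=False), "id"]
    .unique()
    .tolist()
)
```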

Step 12: This list can also be saved to Excel to use for assignment of molecular pathways to the AOP network nodes present in this table. More details can be found in Jupyter Notebook: ‘Agata,Shakira-The AOP project-Part 3’.

finalnodetable.to_excel('finalnodetable.xlsx')

Section 5: Definition of the edge table

In this section, the edge table will be defined, which includes: ‘source’, ‘target’, ‘interaction’, ‘id’, ‘weight’ and ‘group’.

Step 13: You will select the required columns from the secondfinal_result2 table. There are different methods to do so, but here the columns are collected in a new dataframe and turned back into columns with the transpose() function. Afterwards, you rename upstreamKEID, downstreamKEID, KER, KERID and organ to source, target, interaction, id and group respectively, so that the table can be read by Cytoscape; the weight column keeps its name.

preliminaryedgetable= pd.DataFrame([secondfinal_result2.upstreamKEID, secondfinal_result2.downstreamKEID, secondfinal_result2.KER, secondfinal_result2.KERID, secondfinal_result2.weight, secondfinal_result2.organ]).transpose()

edgetable= preliminaryedgetable.rename(columns= {'upstreamKEID': 'source', 'downstreamKEID':'target', 'KER':'interaction',  'KERID':'id', 'organ':'group'})

edgetable.to_excel('edgetable.xlsx')
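
One of the other methods mentioned in Step 13 is to select the columns directly with a list of names. The sketch below compares both routes on invented data. A difference worth knowing is that the transpose() route stores all values as generic Python objects, whereas direct selection keeps the original column dtypes (here, an integer weight, which is an assumption of this example).

```python
import pandas as pd

# Invented rows with the columns used for the edge table.
toy = pd.DataFrame({
    "upstreamKEID": ["A", "B"], "downstreamKEID": ["B", "C"],
    "KER": ["k1", "k2"], "KERID": [1, 2],
    "weight": [1, 1], "organ": ["Liver", "Liver"],
})
edge_columns = {
    "upstreamKEID": "source", "downstreamKEID": "target",
    "KER": "interaction", "KERID": "id", "organ": "group",
}

# Route used in Step 13: list of Series, then transpose.
via_transpose = pd.DataFrame(
    [toy.upstreamKEID, toy.downstreamKEID, toy.KER,
     toy.KERID, toy.weight, toy.organ]
).transpose().rename(columns=edge_columns)

# Alternative: direct column selection.
via_selection = toy[
    ["upstreamKEID", "downstreamKEID", "KER", "KERID", "weight", "organ"]
].rename(columns=edge_columns)

print(via_transpose.dtypes["weight"], via_selection.dtypes["weight"])
```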

Section 6: Construction of the AOP network

In this section, the AOP network will be constructed using predefined py4cytoscape functions.

Step 14: You will first define the nodes, edges and title of the network.

p4c.create_network_from_data_frames(nodes=finalnodetable, edges=edgetable, title='Agata,S-The inflammatory-process related AOP network')
Applying default style...
Applying preferred layout

17292
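
Before sending the tables to Cytoscape, it can be useful to check that every edge endpoint also occurs in the node table. The sketch below assumes that nodes are matched on the ‘id’ column and edges on ‘source’ and ‘target’, as in the tables built above; the data are invented.

```python
import pandas as pd

# Invented tables: node "C" is used by an edge but missing from the nodes.
toy_nodes = pd.DataFrame({"id": ["A", "B"], "name": ["a", "b"]})
toy_edges = pd.DataFrame({"source": ["A", "B"], "target": ["B", "C"]})

known_ids = set(toy_nodes["id"])
endpoints = set(toy_edges["source"]) | set(toy_edges["target"])
missing_nodes = sorted(endpoints - known_ids)
print("Edge endpoints without a node row:", missing_nodes)
```

An empty list means that all sources and targets have a row in the node table.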

Section 7: Adaptation of stylistic aspects of the AOP network

In this section, the style of your AOP network will be adapted and displayed.

Step 15: Prior to adapting the style of the network, you first need to import the style-mapping functions from py4cytoscape.

from py4cytoscape import get_node_color
from py4cytoscape import set_node_color_mapping
from py4cytoscape import gen_node_color_map
from py4cytoscape import set_edge_color_default
from py4cytoscape import set_node_color_default
from py4cytoscape import set_edge_source_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_color_default
from py4cytoscape import get_arrow_shapes
from py4cytoscape import get_edge_target_arrow_shape
from py4cytoscape import set_edge_target_arrow_shape_mapping
from py4cytoscape import gen_edge_arrow_map
from py4cytoscape import select_nodes
from py4cytoscape import get_table_value
from py4cytoscape import get_network_suid
from py4cytoscape import clear_selection
from py4cytoscape import set_node_color_bypass

Step 16: The style of the network can be changed with the following commands. In this Jupyter Notebook, the color of the nodes is changed according to the ‘type’ assignment and the edges are drawn as arrows. This requires retrieval of the node table columns from Cytoscape.

style_name = "default"
defaults = {'NODE_SHAPE': "ELLIPSE", 'NODE_SIZE': 50, 'EDGE_TRANSPARENCY': 140, 'NODE_LABEL_POSITION': "C,C,c,0.00,0.00"}
nodeLabels = p4c.map_visual_property('node label', 'name', 'p') 
edgeWidth = p4c.map_visual_property('edge width', 'weight', 'p') 
arrowShapes = p4c.map_visual_property('Edge Target Arrow Shape','interaction', 'd')
p4c.create_visual_style(style_name, defaults, [nodeLabels, edgeWidth])
p4c.set_visual_style(style_name)
{'message': 'Visual Style applied.'}
set_edge_target_arrow_shape_default('ARROW', style_name='default')
set_edge_target_arrow_color_default('#000000', style_name='default')
''
Newtable= p4c.get_table_columns()
Newtable

       SUID   shared name                              id                                       group  type  degree.layout  name                                       selected
17665  17665  https://identifiers.org/aop.events/1276  https://identifiers.org/aop.events/1276  Lung   AO    NaN            Lung fibrosis                              False
17410  17410  https://identifiers.org/aop.events/2010  https://identifiers.org/aop.events/2010  Lung   KE    NaN            Pulmonary inflammation                     False
17668  17668  https://identifiers.org/aop.events/344   https://identifiers.org/aop.events/344   Liver  AO    NaN            N/A, Liver fibrosis                        False
17413  17413  https://identifiers.org/aop.events/2013  https://identifiers.org/aop.events/2013  Lung   KE    NaN            Airway remodeling                          False
17671  17671  https://identifiers.org/aop.events/1841  https://identifiers.org/aop.events/1841  Brain  AO    NaN            Encephalitis                               False
...    ...    ...                                      ...                                      ...    ...   ...            ...                                        ...
17401  17401  https://identifiers.org/aop.events/389   https://identifiers.org/aop.events/389   Brain  KE    NaN            Increased, Intracellular Calcium overload  False
17659  17659  https://identifiers.org/aop.events/1941  https://identifiers.org/aop.events/1941  Brain  AO    NaN            Memory Loss                                False
17404  17404  https://identifiers.org/aop.events/177   https://identifiers.org/aop.events/177   Liver  KE    NaN            Mitochondrial dysfunction                  False
17662  17662  https://identifiers.org/aop.events/1090  https://identifiers.org/aop.events/1090  Lung   AO    NaN            Increased, mesotheliomas                   False
17407  17407  https://identifiers.org/aop.events/2011  https://identifiers.org/aop.events/2011  Lung   KE    NaN            Emphysema                                  False

122 rows × 8 columns

Step 17: The color of the nodes is changed with the ‘select and color’ method I developed. With this method, you first filter the node table by type, convert the result to a list, select those nodes in Cytoscape, and then set the color for every entry of the list with py4cytoscape’s bypass function. The nodes must be passed as a list, because passing a dataframe leads to an error stating that its truth value is ‘ambiguous’. This approach also lets you see the highlighted nodes in Cytoscape as verification while the code runs.

Step 17a: Change the appearance of the MIE nodes to green.

MIEnodes_df = Newtable[Newtable['type'] == 'MIE']
# Use the SUIDs so that select_nodes(by_col='SUID') matches the nodes
MIEnodeslist = MIEnodes_df['SUID'].tolist()
p4c.select_nodes(MIEnodeslist, by_col='SUID')
color = '#09d63a' 
new_colors = [color] * len(MIEnodeslist)
p4c.set_node_color_bypass(node_names=MIEnodeslist, new_colors=new_colors)
''

Step 17b: Change the appearance of the KE nodes to yellow.

KEnodes_df = Newtable[Newtable['type'] == 'KE']
# Use the SUIDs so that select_nodes(by_col='SUID') matches the nodes
KEnodeslist = KEnodes_df['SUID'].tolist()
p4c.select_nodes(KEnodeslist, by_col='SUID')
color = '#f7ff00' 
new_colors = [color] * len(KEnodeslist)
p4c.set_node_color_bypass(node_names=KEnodeslist, new_colors=new_colors)
''

Step 17c: Change the appearance of the AO nodes to pink.

AO_nodes_df = Newtable[Newtable['type'] == 'AO']
# Use the SUIDs so that select_nodes(by_col='SUID') matches the nodes
AOnodeslist = AO_nodes_df['SUID'].tolist()
p4c.select_nodes(AOnodeslist, by_col='SUID')
color = '#ff28d5' 
new_colors = [color] * len(AOnodeslist)
p4c.set_node_color_bypass(node_names=AOnodeslist, new_colors=new_colors)
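
Steps 17a to 17c repeat the same pattern with a different type and color. As a sketch, the type-to-color assignment can be kept in one dictionary and applied in a single call. The helper names names_and_colors and color_nodes_by_type are hypothetical; the colors are those used above, and the Cytoscape call is the same set_node_color_bypass function. Only the pure pandas part is run here, since the Cytoscape call needs a running session.

```python
import pandas as pd

# Colors taken from Steps 17a to 17c.
TYPE_COLORS = {"MIE": "#09d63a", "KE": "#f7ff00", "AO": "#ff28d5"}


def names_and_colors(node_table, type_colors=TYPE_COLORS):
    """Return parallel lists of node names and hex colors for typed nodes."""
    typed = node_table[node_table["type"].isin(list(type_colors))]
    return typed["name"].tolist(), typed["type"].map(type_colors).tolist()


def color_nodes_by_type(node_table):
    """Apply the colors in Cytoscape (requires a running Cytoscape session)."""
    import py4cytoscape as p4c
    names, colors = names_and_colors(node_table)
    return p4c.set_node_color_bypass(node_names=names, new_colors=colors)


# Invented node table for illustration.
toy_table = pd.DataFrame({"name": ["n1", "n2", "n3"],
                          "type": ["MIE", "KE", "AO"]})
toy_names, toy_colors = names_and_colors(toy_table)
```

With the table retrieved in Step 16, color_nodes_by_type(Newtable) would color all three node types at once.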

Section 8: Metadata

Step 18: Finally, the metadata belonging to this Jupyter Notebook is displayed, which contains the version numbers of the packages and the system set-up for interested users. This requires the packages watermark and print_versions.

%load_ext watermark
!pip install print-versions
Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)
%watermark
Last updated: 2025-06-02T13:18:59.342286+02:00

Python implementation: CPython
Python version       : 3.12.3
IPython version      : 8.25.0

Compiler    : MSC v.1938 64 bit (AMD64)
OS          : Windows
Release     : 11
Machine     : AMD64
Processor   : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
CPU cores   : 8
Architecture: 64bit
from print_versions import print_versions
print_versions(globals())
json==2.0.9
ipykernel==6.28.0
pandas==2.2.2
py4cytoscape==1.9.0
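
If installing an extra package is not desired, the standard library can report package versions as well. The following is a minimal alternative sketch using importlib.metadata; the helper name installed_versions is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for package in packages:
        try:
            versions[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            versions[package] = None
    return versions


package_versions = installed_versions(["pandas", "py4cytoscape"])
print(package_versions)
```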
