Part 2: Construction of a human inflammatory stress response AOP network

The AOP project ► Key objective 1

Author: Shakira Agata

This Jupyter notebook describes the steps needed to construct an AOP network that focuses on inflammatory processes in human organ systems. Briefly, the node table and edge table are defined, the AOP network is constructed, and its visual style is adapted with py4cytoscape. py4cytoscape is a Python package that communicates with Cytoscape for the visualization of molecular networks and biological pathways. The detailed steps are outlined in the following eight sections:

Section 1: System preparation
Section 2: Data cleaning and preparation
Section 3: Data manipulation for node table
Section 4: Definition of the node table
Section 5: Definition of the edge table
Section 6: Construction of the AOP network
Section 7: Adaptation of stylistic aspects of the AOP network
Section 8: Metadata

Section 1: System preparation

In this section, you import pandas and py4cytoscape, which are required to run this notebook.

Step 1: You import pandas and py4cytoscape. For py4cytoscape, you must open Cytoscape on your computer before running the code; otherwise, an error message is shown.

import pandas as pd

import py4cytoscape as p4c
p4c.cytoscape_ping()
p4c.cytoscape_version_info()

You are connected to Cytoscape!





{'apiVersion': 'v1',
 'cytoscapeVersion': '3.10.1',
 'automationAPIVersion': '1.9.0',
 'py4cytoscapeVersion': '1.9.0'}
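
Because the code above fails when Cytoscape is not running, a quick reachability check can be run beforehand. The sketch below is not part of the original notebook. It assumes that Cytoscape's CyREST service listens at its default address http://127.0.0.1:1234/v1, and the helper name cytoscape_reachable is hypothetical.

```python
import urllib.error
import urllib.request

# Assumed default CyREST address; adjust it if Cytoscape runs elsewhere.
CYREST_URL = "http://127.0.0.1:1234/v1"


def cytoscape_reachable(url=CYREST_URL, timeout=2.0):
    """Return True if an HTTP request to the CyREST endpoint succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout or similar: Cytoscape is not reachable.
        return False


if not cytoscape_reachable():
    print("Cytoscape does not appear to be running; start it before continuing.")
```

If the check passes, p4c.cytoscape_ping() should connect as shown in the output above.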

Section 2: Data cleaning and preparation

In this section, the result of the Jupyter notebook ‘Agata,Shakira-The AOP project-Part 1’ is imported and cleaned in preparation for the definition of the node table and edge table. The result of that notebook was saved as ‘final_result.json’.

Step 2: You read the final_result table from the first Jupyter notebook into a new dataframe.

final_result= pd.read_json("C:/Users/shaki/Downloads/final_result.json")

Step 3: Now, you remove the redundant column ‘AOPName’, as it contains the same values as the column ‘AOPtitle’.

final_result1 = final_result.drop(columns='AOPName')

Step 4: This is followed by the removal of duplicates in the KER column, so that identical interactions are not drawn more than once in the AOP network.

final_result2= final_result1.drop_duplicates(subset='KER', keep='first')

Step 5: Create a weight column and set its value to 1. This column is a prerequisite for both the construction of the edge table and the adaptation of the visual style of the AOP network. The dataframe is copied first, so that the new column is added to an independent dataframe rather than to a slice of final_result1, and the column is then added with the assign function.

final_result2 = final_result2.copy()
final_result2 = final_result2.assign(weight='1')

Step 6: You remove the columns KE, KEID, KEname and KEtitle from the final_result2 dataframe to clean up the table. These columns were generated during the second SPARQL query, but they can be removed from the final table because the definition of the node table and edge table only requires the upstream key event, downstream key event and adverse outcome (AO) columns.

Adaptedfinal_result2= final_result2.drop(['KE', 'KEname', 'KEID', 'KEtitle'], axis=1)

Step 7: Lastly, you rename the columns ‘upstreamKE’ and ‘downstreamKE’ to ‘upstreamKEID’ and ‘downstreamKEID’ for consistency within the dataframe.

secondfinal_result2= Adaptedfinal_result2.rename(columns= {'upstreamKE': 'upstreamKEID', 'downstreamKE':'downstreamKEID'})
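
Steps 3 to 7 can be tried out on a small table before running them on the real data. The sketch below chains the same pandas operations on a made-up dataframe; the values and the names toy and cleaned are illustrative assumptions, and only the column names are taken from this notebook.

```python
import pandas as pd

# Toy stand-in for final_result.json; all values are made up for illustration.
toy = pd.DataFrame({
    "AOPName": ["a", "a", "b"],
    "AOPtitle": ["a", "a", "b"],
    "KE": ["ke1", "ke2", "ke1"],
    "KEID": [1, 2, 1],
    "KEname": ["n1", "n2", "n1"],
    "KEtitle": ["t1", "t2", "t1"],
    "KER": ["ker/10", "ker/10", "ker/11"],
    "upstreamKE": ["ke1", "ke1", "ke2"],
    "downstreamKE": ["ke2", "ke2", "ke3"],
})

cleaned = (
    toy.drop(columns="AOPName")                       # Step 3: redundant column
       .drop_duplicates(subset="KER", keep="first")   # Step 4: one row per KER
       .assign(weight="1")                            # Step 5: constant weight
       .drop(columns=["KE", "KEID", "KEname", "KEtitle"])  # Step 6
       .rename(columns={"upstreamKE": "upstreamKEID",
                        "downstreamKE": "downstreamKEID"})  # Step 7
)
print(cleaned)
```

Each method returns a new dataframe, so the chained form does not modify toy.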

Section 3: Data manipulation for node table

In this section, the rows of the table are labeled so that molecular initiating events (MIE), key events (KE) and adverse outcomes (AO) can be told apart in the AOP network. This is important for correctly establishing the directionality of the AOP network, as MIEs precede KEs and KEs precede AOs.

Step 8: You identify the most upstream KEs as MIEs with the code below. This is done manually because the SPARQL queries cannot assign the MIEs of the respective AOPs. The strategy is to define the MIEs as the upstream KEs that never appear in the downstreamKEID column, since the most upstream KE of a pathway cannot be downstream of another KE. You first build the mostupstreamKE set and give every row of secondfinal_result2 the default type ‘KE’. Rows whose upstreamKEID is in mostupstreamKE are then assigned the type ‘MIE’. Finally, rows whose downstreamKEID appears in the ao column are assigned the type ‘AO’; all remaining rows keep the type ‘KE’.

mostupstreamKE = set(secondfinal_result2['upstreamKEID']) - set(secondfinal_result2['downstreamKEID'])
secondfinal_result2['type'] = 'KE'

secondfinal_result2.loc[secondfinal_result2['upstreamKEID'].isin(mostupstreamKE), 'type'] = 'MIE'

secondfinal_result2.loc[secondfinal_result2['downstreamKEID'].isin(secondfinal_result2['ao']), 'type'] = 'AO'
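
The labeling logic of Step 8 can be checked on a minimal example. The sketch below applies the same three assignments to a made-up two-edge chain (ke1 -> ke2 -> ke3); the identifiers and the names edges and most_upstream are illustrative assumptions.

```python
import pandas as pd

# Toy edge list: ke1 -> ke2 -> ke3, where ke3 is the adverse outcome.
edges = pd.DataFrame({
    "upstreamKEID": ["ke1", "ke2"],
    "downstreamKEID": ["ke2", "ke3"],
    "ao": ["ke3", "ke3"],
})

# Upstream KEs that are never downstream of anything are treated as MIEs.
most_upstream = set(edges["upstreamKEID"]) - set(edges["downstreamKEID"])

edges["type"] = "KE"  # default label for every row
edges.loc[edges["upstreamKEID"].isin(most_upstream), "type"] = "MIE"
edges.loc[edges["downstreamKEID"].isin(edges["ao"]), "type"] = "AO"
print(edges)
```

Note that the label is assigned per row, and the ‘AO’ assignment runs last, so a row that starts at an MIE and ends directly at an AO ends up with the type ‘AO’.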

Step 9: You can optionally save the table to Excel to verify the output. This line was included as an intermediate check of the results against AOP-Wiki.

secondfinal_result2.to_excel('checksecondfinal_result.xlsx')

Section 4: Definition of the node table

In this section, you define the node table, which includes: ‘id’, ‘name’, ‘group’ and ‘type’.

Step 10: You will construct the node table by first making three dataframes for the nodes and then joining them together. The three dataframes are:

MIE_nodes: These include the most upstreamKE from the upstreamKEs. The results of section 3 will be employed for this by selecting the values ‘MIE’ from the type column of the secondfinal_result2 table.
KE_nodes: These include the downstreamKEs and upstreamKEs that were not MIE. The results of section 3 will be employed for this by selecting the values ‘KE’ from the type column of the secondfinal_result2 table.
AO_nodes: These include your adverse outcomes. The results of section 3 will be employed for this by selecting the values ‘AO’ from the type column of the secondfinal_result2 table.

MIE_nodes = secondfinal_result2[secondfinal_result2['type'] == 'MIE'][['upstreamKEID', 'upstreamKEtitle', 'organ', 'type']].rename(columns={'upstreamKEID': 'id', 'upstreamKEtitle': 'name', 'organ': 'group'})

KE_nodes = secondfinal_result2[secondfinal_result2['type'] == 'KE'][['downstreamKEID', 'downstreamKEtitle', 'organ', 'type']].copy()
KE_nodes.rename(columns={'downstreamKEID': 'id', 'downstreamKEtitle': 'name', 'organ': 'group'}, inplace=True)

upstreamKE_nodes = secondfinal_result2[(secondfinal_result2['type'] != 'MIE') & (secondfinal_result2['type'] != 'AO')][['upstreamKEID', 'upstreamKEtitle', 'organ', 'type']]
upstreamKE_nodes.rename(columns={'upstreamKEID': 'id', 'upstreamKEtitle': 'name', 'organ': 'group'}, inplace=True)
KE_nodes = pd.concat([KE_nodes, upstreamKE_nodes], ignore_index=True)

AO_nodes = secondfinal_result2[secondfinal_result2['type'] == 'AO'][['ao', 'aotitle', 'organ', 'type']].rename(columns={'ao': 'id', 'aotitle': 'name', 'organ': 'group'})

nodetable = pd.concat([MIE_nodes, KE_nodes, AO_nodes], ignore_index=True)

nodetable

	id	name	group	type
0	https://identifiers.org/aop.events/486	systemic inflammation leading to hepatic steat...	Liver	MIE
1	https://identifiers.org/aop.events/875	Binding of agonist, Ionotropic glutamate recep...	Brain	MIE
2	https://identifiers.org/aop.events/2007	Non-coding RNA expression profile alteration	Lung	MIE
3	https://identifiers.org/aop.events/2007	Non-coding RNA expression profile alteration	Lung	MIE
4	https://identifiers.org/aop.events/2007	Non-coding RNA expression profile alteration	Lung	MIE
...	...	...	...	...
275	https://identifiers.org/aop.events/1276	Lung fibrosis	Lung	AO
276	https://identifiers.org/aop.events/1458	Pulmonary fibrosis	Lung	AO
277	https://identifiers.org/aop.events/1090	Increased, mesotheliomas	Lung	AO
278	https://identifiers.org/aop.events/341	Impairment, Learning and memory	Brain	AO
279	https://identifiers.org/aop.events/352	N/A, Neurodegeneration	Brain	AO

280 rows × 4 columns

Step 11: You remove the duplicates so that only unique nodes remain. The call below removes duplicate nodes but retains nodes that have the same id and a different group. These nodes are valuable because they represent KEs that are present in multiple organ systems.

finalnodetable= nodetable.drop_duplicates(['id', 'group'], keep='first')
finalnodetable

	id	name	group	type
0	https://identifiers.org/aop.events/486	systemic inflammation leading to hepatic steat...	Liver	MIE
1	https://identifiers.org/aop.events/875	Binding of agonist, Ionotropic glutamate recep...	Brain	MIE
2	https://identifiers.org/aop.events/2007	Non-coding RNA expression profile alteration	Lung	MIE
8	https://identifiers.org/aop.events/1495	Substance interaction with the lung resident c...	Lung	MIE
11	https://identifiers.org/aop.events/105	Inhibition, Mitochondrial Electron Transport C...	Kidney	MIE
...	...	...	...	...
269	https://identifiers.org/aop.events/896	Parkinsonian motor deficits	Brain	AO
270	https://identifiers.org/aop.events/1588	Bronchiolitis obliterans	Lung	AO
271	https://identifiers.org/aop.events/1549	Liver Injury	Liver	AO
272	https://identifiers.org/aop.events/357	Cholestasis, Pathology	Liver	AO
276	https://identifiers.org/aop.events/1458	Pulmonary fibrosis	Lung	AO

137 rows × 4 columns
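
The effect of deduplicating on the pair (id, group) rather than on id alone can be seen on a small example. The sketch below uses a made-up node table; the values and the names nodes and unique_nodes are illustrative assumptions.

```python
import pandas as pd

# Toy node table: ke1 occurs twice in Lung and once in Liver.
nodes = pd.DataFrame({
    "id": ["ke1", "ke1", "ke1", "ke2"],
    "name": ["Inflammation", "Inflammation", "Inflammation", "Fibrosis"],
    "group": ["Lung", "Lung", "Liver", "Lung"],
    "type": ["KE", "KE", "KE", "AO"],
})

# Rows count as duplicates only if both id and group are identical.
unique_nodes = nodes.drop_duplicates(["id", "group"], keep="first")
print(unique_nodes)
```

The repeated Lung row for ke1 is dropped, while the Liver row for the same id is kept. The original index labels are preserved, which is why the index of finalnodetable above has gaps.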

Step 12: This list can also be saved to Excel to use for assignment of molecular pathways to the AOP network nodes present in this table. More details can be found in Jupyter Notebook: ‘Agata,Shakira-The AOP project-Part 3’.

finalnodetable.to_excel('finalnodetable.xlsx')

Section 5: Definition of the edge table

In this section, the edge table is defined, which includes: ‘source’, ‘target’, ‘interaction’, ‘id’, ‘weight’ and ‘group’.

Step 13: You select the required columns from the secondfinal_result2 table. There are different methods to do so, but here you use the transpose() function. Afterwards, you rename the columns to source, target, interaction, id, weight and group, respectively, so that the table can be read by Cytoscape.

preliminaryedgetable= pd.DataFrame([secondfinal_result2.upstreamKEID, secondfinal_result2.downstreamKEID, secondfinal_result2.KER, secondfinal_result2.KERID, secondfinal_result2.weight, secondfinal_result2.organ]).transpose()

edgetable= preliminaryedgetable.rename(columns= {'upstreamKEID': 'source', 'downstreamKEID':'target', 'KER':'interaction',  'KERID':'id', 'organ':'group'})

edgetable.to_excel('edgetable.xlsx')
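
As Step 13 notes, there are different ways to select the columns. One alternative, sketched below on a made-up table, is to select the columns directly and rename them in one expression. The names toy, column_map, edges_direct and edges_transposed are illustrative assumptions; the comparison is done on string representations so that it does not depend on column dtypes.

```python
import pandas as pd

# Toy stand-in for secondfinal_result2; all values are made up.
toy = pd.DataFrame({
    "upstreamKEID": ["ke1", "ke2"],
    "downstreamKEID": ["ke2", "ke3"],
    "KER": ["ker/10", "ker/11"],
    "KERID": [10, 11],
    "weight": ["1", "1"],
    "organ": ["Lung", "Lung"],
    "type": ["MIE", "AO"],
})

# Original column name -> edge table column name.
column_map = {
    "upstreamKEID": "source",
    "downstreamKEID": "target",
    "KER": "interaction",
    "KERID": "id",
    "weight": "weight",
    "organ": "group",
}

# Alternative: direct column selection followed by a rename.
edges_direct = toy[list(column_map)].rename(columns=column_map)

# The transpose-based construction used in this notebook, for comparison.
edges_transposed = pd.DataFrame(
    [toy.upstreamKEID, toy.downstreamKEID, toy.KER, toy.KERID, toy.weight, toy.organ]
).transpose().rename(columns=column_map)

print(edges_direct.astype(str).equals(edges_transposed.astype(str)))
```

Both forms produce the same values. Direct selection keeps the dtype of each column, whereas the transposed construction typically yields object columns.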

Section 6: Construction of the AOP network

In this section, the AOP network is constructed using predefined py4cytoscape functions.

Step 14: You pass the nodes, edges and title of the network to py4cytoscape.

p4c.create_network_from_data_frames(nodes=finalnodetable, edges=edgetable, title='Agata,S-The inflammatory-process related AOP network')

Applying default style...
Applying preferred layout





17292
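
Before sending the tables to Cytoscape, it can be useful to confirm that every edge endpoint also has a row in the node table. The sketch below is an optional sanity check that is not part of the original notebook; the helper name missing_endpoints and the toy values are hypothetical.

```python
import pandas as pd


def missing_endpoints(nodes, edges):
    """Return the edge sources/targets that have no row in the node table."""
    node_ids = set(nodes["id"])
    endpoints = set(edges["source"]) | set(edges["target"])
    return endpoints - node_ids


# Toy tables; ke4 is referenced by an edge but is absent from the nodes.
toy_nodes = pd.DataFrame({"id": ["ke1", "ke2", "ke3"]})
toy_edges = pd.DataFrame({
    "source": ["ke1", "ke2", "ke3"],
    "target": ["ke2", "ke3", "ke4"],
})

print(missing_endpoints(toy_nodes, toy_edges))
```

In this notebook the same helper could be called as missing_endpoints(finalnodetable, edgetable); an empty set means the two tables are consistent.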

Section 7: Adaptation of stylistic aspects of the AOP network

In this section, the style of your AOP network will be adapted and displayed.

Step 15: Prior to adapting the style of the network, you first need to import the styling functions from py4cytoscape.

from py4cytoscape import get_node_color
from py4cytoscape import set_node_color_mapping
from py4cytoscape import gen_node_color_map
from py4cytoscape import set_edge_color_default
from py4cytoscape import set_node_color_default
from py4cytoscape import set_edge_source_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_color_default
from py4cytoscape import get_arrow_shapes
from py4cytoscape import get_edge_target_arrow_shape
from py4cytoscape import set_edge_target_arrow_shape_mapping
from py4cytoscape import gen_edge_arrow_map
from py4cytoscape import select_nodes
from py4cytoscape import get_table_value
from py4cytoscape import get_network_suid
from py4cytoscape import clear_selection
from py4cytoscape import set_node_color_bypass

Step 16: The style of the network can be changed with the following commands. In this Jupyter Notebook, the color of the nodes is changed according to the ‘type’ assignment and the edges are drawn as arrows. This requires retrieval of the Cytoscape node table columns.

style_name = "default"
defaults = {'NODE_SHAPE': "ELLIPSE", 'NODE_SIZE': 50, 'EDGE_TRANSPARENCY': 140, 'NODE_LABEL_POSITION': "C,C,c,0.00,0.00"}
nodeLabels = p4c.map_visual_property('node label', 'name', 'p') 
edgeWidth = p4c.map_visual_property('edge width', 'weight', 'p') 
arrowShapes = p4c.map_visual_property('Edge Target Arrow Shape','interaction', 'd')
p4c.create_visual_style(style_name, defaults, [nodeLabels, edgeWidth])
p4c.set_visual_style(style_name)

{'message': 'Visual Style applied.'}

set_edge_target_arrow_shape_default('ARROW', style_name='default')
set_edge_target_arrow_color_default('#000000', style_name='default')

''

Newtable= p4c.get_table_columns()
Newtable

	SUID	shared name	id	group	type	degree.layout	name	selected
17665	17665	https://identifiers.org/aop.events/1276	https://identifiers.org/aop.events/1276	Lung	AO	NaN	Lung fibrosis	False
17410	17410	https://identifiers.org/aop.events/2010	https://identifiers.org/aop.events/2010	Lung	KE	NaN	Pulmonary inflammation	False
17668	17668	https://identifiers.org/aop.events/344	https://identifiers.org/aop.events/344	Liver	AO	NaN	N/A, Liver fibrosis	False
17413	17413	https://identifiers.org/aop.events/2013	https://identifiers.org/aop.events/2013	Lung	KE	NaN	Airway remodeling	False
17671	17671	https://identifiers.org/aop.events/1841	https://identifiers.org/aop.events/1841	Brain	AO	NaN	Encephalitis	False
...	...	...	...	...	...	...	...	...
17401	17401	https://identifiers.org/aop.events/389	https://identifiers.org/aop.events/389	Brain	KE	NaN	Increased, Intracellular Calcium overload	False
17659	17659	https://identifiers.org/aop.events/1941	https://identifiers.org/aop.events/1941	Brain	AO	NaN	Memory Loss	False
17404	17404	https://identifiers.org/aop.events/177	https://identifiers.org/aop.events/177	Liver	KE	NaN	Mitochondrial dysfunction	False
17662	17662	https://identifiers.org/aop.events/1090	https://identifiers.org/aop.events/1090	Lung	AO	NaN	Increased, mesotheliomas	False
17407	17407	https://identifiers.org/aop.events/2011	https://identifiers.org/aop.events/2011	Lung	KE	NaN	Emphysema	False

122 rows × 8 columns

Step 17: The color of the nodes is changed with the ‘select and color’ method I developed. You first select the nodes of one type, convert them to a list and then change the color of every node in that list with py4cytoscape’s bypass function. The input must be a list, because dataframes are rejected as ‘ambiguous’. This approach also lets you see the highlighted nodes in Cytoscape as verification while running the code.

Step 17a: Change the appearance of the MIE nodes to green.

MIEnodes_df = Newtable[Newtable['type'] == 'MIE']
# Use the SUID column so that the values match by_col='SUID'
MIEnodeslist = MIEnodes_df['SUID'].tolist()
p4c.select_nodes(MIEnodeslist, by_col='SUID')

{}

color = '#09d63a' 
new_colors = [color] * len(MIEnodeslist)

p4c.set_node_color_bypass(node_names=MIEnodeslist, new_colors=new_colors)

''

Step 17b: Change the appearance of the KE nodes to yellow.

KEnodes_df = Newtable[Newtable['type'] == 'KE']
# Use the SUID column so that the values match by_col='SUID'
KEnodeslist = KEnodes_df['SUID'].tolist()
p4c.select_nodes(KEnodeslist, by_col='SUID')

{'nodes': [17596], 'edges': []}

color = '#f7ff00' 
new_colors = [color] * len(KEnodeslist)

p4c.set_node_color_bypass(node_names=KEnodeslist, new_colors=new_colors)

''

Step 17c: Change the appearance of the AO nodes to pink.

AO_nodes_df = Newtable[Newtable['type'] == 'AO']
# Use the SUID column so that the values match by_col='SUID'
AOnodeslist = AO_nodes_df['SUID'].tolist()
p4c.select_nodes(AOnodeslist, by_col='SUID')

{'nodes': [17596], 'edges': []}

color = '#ff28d5' 
new_colors = [color] * len(AOnodeslist)

p4c.set_node_color_bypass(node_names=AOnodeslist, new_colors=new_colors)
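
Steps 17a to 17c repeat the same pattern for each node type, which can be condensed into one helper. The sketch below only prepares the arguments and runs without Cytoscape; the helper name bypass_arguments, the PALETTE dictionary and the toy SUIDs are hypothetical, while the colors are the ones used above. It assumes that set_node_color_bypass accepts SUIDs as well as node names, as its documentation describes.

```python
import pandas as pd

# Colors used in Steps 17a to 17c.
PALETTE = {"MIE": "#09d63a", "KE": "#f7ff00", "AO": "#ff28d5"}


def bypass_arguments(node_table, palette=PALETTE):
    """Map each node type to (list of SUIDs, list of colors of equal length)."""
    arguments = {}
    for node_type, color in palette.items():
        suids = node_table.loc[node_table["type"] == node_type, "SUID"].tolist()
        arguments[node_type] = (suids, [color] * len(suids))
    return arguments


# Toy stand-in for the table returned by p4c.get_table_columns().
toy_table = pd.DataFrame({
    "SUID": [101, 102, 103, 104],
    "type": ["MIE", "KE", "KE", "AO"],
})
arguments = bypass_arguments(toy_table)
print(arguments)

# With Cytoscape running, each pair could be passed on, for example:
# for suids, colors in arguments.values():
#     p4c.set_node_color_bypass(node_names=suids, new_colors=colors)
```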

Section 8: Metadata

Step 18: Finally, the metadata belonging to this Jupyter Notebook is displayed, which contains the package version numbers and the system set-up for interested users. This requires the packages watermark and print_versions.

%load_ext watermark
!pip install print-versions

Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)

%watermark

Last updated: 2025-06-02T13:18:59.342286+02:00

Python implementation: CPython
Python version       : 3.12.3
IPython version      : 8.25.0

Compiler    : MSC v.1938 64 bit (AMD64)
OS          : Windows
Release     : 11
Machine     : AMD64
Processor   : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
CPU cores   : 8
Architecture: 64bit

from print_versions import print_versions
print_versions(globals())

json==2.0.9
ipykernel==6.28.0
pandas==2.2.2
py4cytoscape==1.9.0
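
If print_versions is not available, the versions of specific packages can also be looked up with the Python standard library. The sketch below is an alternative that is not part of the original notebook; the helper name installed_versions is hypothetical.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict of package name -> installed version (or a placeholder)."""
    versions = {}
    for package in packages:
        try:
            versions[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            versions[package] = "not installed"
    return versions


print(installed_versions(["pandas", "py4cytoscape"]))
```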
