Part 4: Visualization of transcriptomics expression datasets in the enriched AOP network part 1
The AOP project ► Key objective 2
Author: Shakira Agata
This Jupyter Notebook describes the steps needed to map the transcriptomics datasets GSE109565, E-MEXP-2599 and E-MEXP-3583 onto the enriched AOP network. This notebook is subdivided into the following seven sections:
- Section 1: System preparation
- Section 2: Retrieval of molecular inflammation-process related AOP network
- Section 3: Adaptation of gene node color of molecular inflammation-process related AOP network
- Section 4: Mapping of dataset:GSE109565
- Section 4.1 PCB126 concentration 1
- Section 4.2 PCB126 concentration 2
- Section 4.3 PCB126 concentration 3
- Section 4.4 Roundup
- Section 5: Mapping of dataset:E-MEXP-2599
- Section 5.1 CdCl2 exposure time 1
- Section 5.2 CdCl2 exposure time 2
- Section 5.3 CsA exposure time 1
- Section 5.4 CsA exposure time 2
- Section 5.5 Diquat dibromide exposure time 1
- Section 5.6 Diquat dibromide exposure time 2
- Section 6: Mapping of dataset:E-MEXP-3583
- Section 6.1 Ag+ exposure time 1
- Section 6.2 Ag+ exposure time 2
- Section 6.3 AgNP exposure time 1
- Section 6.4 AgNP exposure time 2
- Section 7: Metadata
Section 1: System preparation
In this section, you will import the packages and tools required for this Jupyter Notebook.
step 1: You import pandas, py4cytoscape and the style mapping functions of py4cytoscape.
import pandas as pd
import py4cytoscape as p4c
p4c.cytoscape_ping()
p4c.cytoscape_version_info()
You are connected to Cytoscape!
{'apiVersion': 'v1',
'cytoscapeVersion': '3.10.1',
'automationAPIVersion': '1.9.0',
'py4cytoscapeVersion': '1.9.0'}
from py4cytoscape import get_node_color
from py4cytoscape import set_node_color_mapping
from py4cytoscape import gen_node_color_map
from py4cytoscape import set_edge_color_default
from py4cytoscape import set_node_color_default
from py4cytoscape import set_edge_source_arrow_shape_default
from py4cytoscape import set_edge_target_arrow_shape_default
from py4cytoscape import get_arrow_shapes
from py4cytoscape import get_edge_target_arrow_shape
from py4cytoscape import set_edge_target_arrow_shape_mapping
from py4cytoscape import gen_edge_arrow_map
from py4cytoscape import select_nodes
from py4cytoscape import get_table_value
from py4cytoscape import get_network_suid
from py4cytoscape import clear_selection
from py4cytoscape import set_node_color_bypass
from py4cytoscape import set_edge_color_bypass
from py4cytoscape import set_edge_target_arrow_color_default
from py4cytoscape import set_node_size_bypass
from py4cytoscape import create_subnetwork
Section 2: Retrieval of molecular inflammation-process related AOP network
In this section, you will open your Cytoscape molecular inflammation-process related AOP network using the py4cytoscape open_session function. This opens the complete network with the previously established style settings and bypass settings.
step 2: You open the session you saved in the previous Jupyter Notebook.
p4c.open_session('Agata,S.-Part4-Complete Molecular inflammation-process related AOP network.cys')
Opening C:\Users\shaki\Downloads\Agata,S.-Part4-Complete Molecular inflammation-process related AOP network.cys...
{}
Section 3: Adaptation of gene node color of molecular inflammation-process related AOP network
In this section, you will change the node color of genes and adapt the style for easier interpretation of the upcoming results. This prepares the network for the mapping of the transcriptomics datasets. These datasets may contain genes that are not present in the built AOP network, so the gene nodes should receive a distinct color to correctly inform the user.
step 3: You can change the style with the following command.
style_name = "default"
defaults = {'NODE_SHAPE': "ELLIPSE", 'NODE_SIZE': 20, 'EDGE_TRANSPARENCY': 140, 'NODE_LABEL_POSITION': "C,C,c,0.00,0.00"}
nodeLabels = p4c.map_visual_property('node label', 'name', 'p')
edgeWidth = p4c.map_visual_property('edge width', 'weight', 'p')
arrowShapes = p4c.map_visual_property('Edge Target Arrow Shape','interaction', 'd')
p4c.create_visual_style(style_name, defaults, [nodeLabels, edgeWidth])
p4c.set_visual_style(style_name)
{'message': 'Visual Style applied.'}
Section 4: Mapping of dataset: GSE109565
In this section, you will map the transcriptomics expression data of dataset GSE109565. GSE109565 focused on alterations in the gene expression profiles of HepaRG cells following exposure to three concentrations of polychlorinated biphenyl (PCB) 126 and one concentration of Roundup.
4.1 PCB concentration 1
step 4: You first import the expression table into a new dataframe.
PCB1_DEG= pd.read_csv('PCB concentration 1-GSE109565.top.table.tsv',sep='\t')
PCB1_DEG
index | GeneID | padj | pvalue | lfcSE | stat | log2FoldChange | baseMean | Symbol | Description |
---|---|---|---|---|---|---|---|---|---|
0 | 8614 | 0.000000e+00 | 0.000000e+00 | 0.1324 | 37.857129 | 5.013970 | 629.14 | STC2 | stanniocalcin 2 |
1 | 219855 | 1.200000e-167 | 1.470000e-171 | 0.1606 | 27.921350 | 4.482891 | 1920.21 | SLC37A2 | solute carrier family 37 member 2 |
2 | 25976 | 2.290000e-140 | 4.220000e-144 | 0.1479 | 25.560261 | 3.780030 | 1739.75 | TIPARP | TCDD inducible poly(ADP-ribose) polymerase |
3 | 9429 | 4.060000e-136 | 9.960000e-140 | 0.1112 | 25.163900 | 2.799237 | 314.26 | ABCG2 | ATP binding cassette subfamily G member 2 (Jun... |
4 | 1544 | 4.620000e-118 | 1.420000e-121 | 0.3223 | 23.447062 | 7.556574 | 3656.99 | CYP1A2 | cytochrome P450 family 1 subfamily A member 2 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16311 | 6235 | NaN | NaN | 0.5455 | 1.117138 | 0.609401 | 851.99 | RPS29 | ribosomal protein S29 |
16312 | 2353 | NaN | NaN | 0.6604 | -0.916202 | -0.605087 | 226.41 | FOS | Fos proto-oncogene, AP-1 transcription factor ... |
16313 | 692086 | NaN | NaN | 0.6949 | 1.311695 | 0.911529 | 233.99 | SNORD17 | small nucleolar RNA, C/D box 17 |
16314 | 6590 | NaN | NaN | 0.5002 | 1.628616 | 0.814707 | 905.72 | SLPI | secretory leukocyte peptidase inhibitor |
16315 | 109864280 | NaN | NaN | 1.1505 | -0.347231 | -0.399505 | 2664.35 | RNA18SN2 | RNA, 18S ribosomal N2 |
16316 rows × 9 columns
step 5: You will now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the PCB1_DEG table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('PCB concentration 1-GSE109565.top.table.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
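Before relying on the mapping, it can help to check how many identifiers in the expression table actually occur in the key column of the node table. The sketch below is a minimal, self-contained illustration with made-up identifier lists; key_overlap is a hypothetical helper and not part of py4cytoscape. In the notebook, the two inputs would come from the GeneID column of PCB1_DEG and from the CTL.GeneID node column.

```python
def key_overlap(table_ids, network_ids):
    """Return (matched, table_only, network_only) as sets of string IDs.

    IDs are normalised to strings so that 8614, 8614.0 and '8614' compare
    equal. Missing values (None or NaN) are ignored.
    """
    def normalise(ids):
        out = set()
        for value in ids:
            if value is None or value != value:  # NaN is the only value != itself
                continue
            if isinstance(value, float) and value.is_integer():
                value = int(value)
            out.add(str(value).strip())
        return out

    table, network = normalise(table_ids), normalise(network_ids)
    return table & network, table - network, network - table


# Hypothetical example lists, for illustration only.
expression_ids = [8614, 219855, 25976.0, float('nan')]
node_ids = ['8614', '25976', '1544', None]

matched, table_only, network_only = key_overlap(expression_ids, node_ids)
print(len(matched), 'matched;', len(network_only), 'gene nodes without expression data')
```

Gene nodes that end up in network_only keep NaN in the expression column, as seen for some nodes in the output of the next step.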
step 6: You define the style for the mapping so that the expression values (log2FoldChange) are mapped to the gene nodes. This is done by first retrieving the log2FoldChange column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FoldChange')
Log2Foldchange_column
SUID | log2FoldChange |
---|---|
172034 | 0.008573 |
176130 | -0.994943 |
172032 | NaN |
176128 | 1.398152 |
172038 | -0.164600 |
... | ... |
176120 | -0.660842 |
172030 | -0.395790 |
176126 | 0.110734 |
172028 | 0.793714 |
176124 | -0.138918 |
2952 rows × 1 columns
step 7: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation.
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
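As computed above, the white breakpoint is the midpoint between the minimum and the maximum, which equals zero only when the range happens to be symmetric. If white should mark a log2 fold change of zero, one option is a range that is symmetric around zero. The sketch below shows this alternative; it is not what this notebook does, and symmetric_breakpoints is a hypothetical helper. The example values are the first five values shown in the output of step 6.

```python
import math


def symmetric_breakpoints(values):
    """Return [low, 0.0, high] with low == -high, ignoring missing values."""
    finite = [v for v in values if v is not None and not math.isnan(v)]
    if not finite:
        raise ValueError('no numeric expression values to map')
    limit = max(abs(v) for v in finite)
    return [-limit, 0.0, limit]


example = [0.008573, -0.994943, float('nan'), 1.398152, -0.164600]
print(symmetric_breakpoints(example))
```

The returned list could be passed as the second argument of p4c.set_node_color_mapping in place of the three variables defined above.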
step 8: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FoldChange', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
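With mapping_type='c' the mapping is continuous: values that fall between two breakpoints receive an intermediate color. The sketch below illustrates the idea with plain linear interpolation in RGB. This is an assumption made for illustration, and the exact interpolation that Cytoscape performs may differ; interpolate_color is a hypothetical helper.

```python
def interpolate_color(value, points, colors):
    """Interpolate a hex color for value between strictly increasing breakpoints."""
    def to_rgb(hex_color):
        return tuple(int(hex_color[i:i + 2], 16) for i in (1, 3, 5))

    # Values outside the breakpoint range are clamped to the end colors.
    if value <= points[0]:
        return colors[0].upper()
    if value >= points[-1]:
        return colors[-1].upper()

    segments = zip(zip(points, colors), zip(points[1:], colors[1:]))
    for (p0, c0), (p1, c1) in segments:
        if p0 <= value <= p1:
            t = (value - p0) / (p1 - p0)  # position within the segment, 0..1
            rgb = [round(a + t * (b - a)) for a, b in zip(to_rgb(c0), to_rgb(c1))]
            return '#{:02X}{:02X}{:02X}'.format(*rgb)


# Hypothetical breakpoints, with the same three colors as in step 8.
breakpoints = [-2.0, 0.0, 2.0]
palette = ['#0000FF', '#FFFFFF', '#FF0000']
for value in (-2.0, 0.0, 2.0):
    print(value, interpolate_color(value, breakpoints, palette))
```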
4.2 PCB concentration 2
step 9: You first import the expression table into a new dataframe.
PCB2_DEG= pd.read_csv('PCB concentration 2-GSE109565.top.table.tsv',sep='\t')
PCB2_DEG
index | GeneID | padj | pvalue | lfcSE | stat | log2FoldChange | baseMean | Symbol | Description |
---|---|---|---|---|---|---|---|---|---|
0 | 1544 | 3.270000e-95 | 2.020000e-99 | 0.3816 | 21.164597 | 8.076617 | 5340.49 | CYP1A2 | cytochrome P450 family 1 subfamily A member 2 |
1 | 860 | 2.070000e-51 | 2.570000e-55 | 0.1072 | 15.666336 | 1.679040 | 437.13 | RUNX2 | RUNX family transcription factor 2 |
2 | 218 | 3.080000e-45 | 5.740000e-49 | 0.1315 | 14.707909 | 1.934003 | 6887.75 | ALDH3A1 | aldehyde dehydrogenase 3 family member A1 |
3 | 220 | 1.640000e-41 | 4.070000e-45 | 0.2124 | 14.095158 | 2.993965 | 103.37 | ALDH1A3 | aldehyde dehydrogenase 1 family member A3 |
4 | 54658 | 1.180000e-37 | 3.640000e-41 | 0.1259 | 13.437516 | 1.691207 | 2218.58 | UGT1A1 | UDP glucuronosyltransferase family 1 member A1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
16455 | 26576 | NaN | 3.860000e-01 | 0.4052 | 0.867259 | 0.351455 | 8.58 | SRPK3 | SRSF protein kinase 3 |
16456 | 158960 | NaN | 7.670000e-01 | 0.4480 | 0.295747 | 0.132487 | 7.25 | ATP6AP1-DT | ATP6AP1 divergent transcript |
16457 | 100507404 | NaN | 1.850000e-01 | 0.4109 | -1.324651 | -0.544237 | 8.00 | TMLHE-AS1 | TMLHE antisense RNA 1 |
16458 | 4509 | NaN | 4.570000e-01 | 0.4443 | 0.743520 | 0.330376 | 9.29 | ATP8 | ATP synthase F0 subunit 8 |
16459 | 4576 | NaN | 4.930000e-01 | 0.4309 | 0.685110 | 0.295210 | 8.82 | TRNT | tRNA-Thr |
16460 rows × 9 columns
step 10: You will now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the PCB2_DEG table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('PCB concentration 2-GSE109565.top.table.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 11: You define the style for the mapping so that the expression values (log2FoldChange) are mapped to the gene nodes. This is done by first retrieving the log2FoldChange column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FoldChange')
Log2Foldchange_column
SUID | log2FoldChange |
---|---|
172034 | -0.064747 |
176130 | -0.577900 |
172032 | NaN |
176128 | 1.191256 |
172038 | -0.072468 |
... | ... |
176120 | -0.158248 |
172030 | -0.267251 |
176126 | 0.012222 |
172028 | 0.602433 |
176124 | -0.040633 |
2952 rows × 1 columns
step 12: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 13: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FoldChange', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
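Sections 4.1 and 4.2 repeat the same four steps: load the table, retrieve the expression column, derive three breakpoints and apply the color mapping. A helper function could bundle them. The sketch below is hypothetical: map_expression and its parameters are not part of py4cytoscape, and the cy argument is assumed to be the imported py4cytoscape module, passed in so that the function can be tried without a running Cytoscape instance. It uses only the calls and keyword arguments already shown in this notebook. The style_name default is an assumption; pass 'Sample1' to reproduce the calls above.

```python
import math


def map_expression(cy, table_file, value_column,
                   key_column='CTL.GeneID', style_name='default'):
    """Load an expression table and color gene nodes by value_column.

    cy is assumed to be the py4cytoscape module (or an object offering the
    same three functions). Returns the [minimum, midpoint, maximum] breakpoints.
    """
    # Join the expression table to the node table on the gene identifier column.
    cy.load_table_data_from_file(table_file, first_row_as_column_names=True,
                                 table='node', table_key_column=key_column)

    # Retrieve the mapped values and skip nodes without expression data.
    column = cy.get_table_columns(table='node', columns=value_column)
    values = [float(v) for v in column[value_column]
              if v is not None and not math.isnan(float(v))]
    if not values:
        raise ValueError(f'no values found in column {value_column!r}')

    low, high = min(values), max(values)
    points = [low, low + (high - low) / 2, high]

    # Blue for the minimum, white for the midpoint, red for the maximum.
    cy.set_node_color_mapping(value_column, points,
                              ['#0000FF', '#FFFFFF', '#FF0000'],
                              mapping_type='c', style_name=style_name)
    return points
```

With Cytoscape running, a call could look like map_expression(p4c, 'PCB concentration 2-GSE109565.top.table.tsv', 'log2FoldChange', style_name='Sample1').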
4.3 PCB concentration 3
step 14: You first import the expression table into a new dataframe.
PCB3_DEG= pd.read_csv('PCB concentration 3-GSE109565.top.table.tsv',sep='\t')
PCB3_DEG
index | GeneID | padj | pvalue | lfcSE | stat | log2FoldChange | baseMean | Symbol | Description |
---|---|---|---|---|---|---|---|---|---|
0 | 343172.0 | 0.0473 | 0.000009 | 0.4999 | 4.446418 | 2.222794 | 14.21 | OR2T8 | olfactory receptor family 2 subfamily T member 8 |
1 | 90427.0 | 0.0473 | 0.000004 | 0.1534 | 4.624244 | 0.709588 | 991.41 | BMF | Bcl2 modifying factor |
2 | 1543.0 | 0.0473 | 0.000010 | 1.2959 | 4.418258 | 5.725682 | 2895.26 | CYP1A1 | cytochrome P450 family 1 subfamily A member 1 |
3 | 92154.0 | 0.0473 | 0.000011 | 0.0494 | 4.394152 | 0.216957 | 1807.35 | MTSS2 | MTSS I-BAR domain containing 2 |
4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
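The last row of this table contains only NaN, and the GeneID values are displayed as floats (for example 343172.0). pandas stores an integer column as floats once it contains a missing value, which typically happens when a file ends with a row that holds only tab separators. The sketch below reproduces the situation on a small in-memory table with shortened, illustrative rows and shows one way to drop the empty row and restore integer identifiers. Whether this clean-up is needed before loading the file into Cytoscape is an assumption that should be checked against the actual file.

```python
import io

import pandas as pd

# Illustrative two-gene table whose last line holds only tab separators.
raw = 'GeneID\tpadj\tSymbol\n1543\t0.0473\tCYP1A1\n90427\t0.0473\tBMF\n\t\t\n'

table = pd.read_csv(io.StringIO(raw), sep='\t')
print(table['GeneID'].dtype)  # dtype of the identifier column before cleaning

# Drop rows without a gene identifier and store the identifiers as integers.
cleaned = table.dropna(subset=['GeneID']).astype({'GeneID': 'int64'})
print(cleaned['GeneID'].tolist())
```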
step 15: You will now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the PCB3_DEG table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('PCB concentration 3-GSE109565.top.table.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 16: You define the style for the mapping so that the expression values (log2FoldChange) are mapped to the gene nodes. This is done by first retrieving the log2FoldChange column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FoldChange')
Log2Foldchange_column
SUID | log2FoldChange |
---|---|
172034 | -0.064747 |
176130 | -0.577900 |
172032 | NaN |
176128 | 1.191256 |
172038 | -0.072468 |
... | ... |
176120 | -0.158248 |
172030 | -0.267251 |
176126 | 0.012222 |
172028 | 0.602433 |
176124 | -0.040633 |
2952 rows × 1 columns
step 17: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 18: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FoldChange', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='default')
''
p4c.save_session('PCB3network.cys')
This file has been overwritten.
{}
4.4 Roundup
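step 19: You first import the expression table into a new dataframe, keep only the genes with an adjusted p-value (padj) below 0.05 and save the filtered table as an Excel file.
Roundup_DEG = pd.read_csv('GSE109565.top.table-Roundup.tsv', sep='\t')
# Keep only significantly changed genes; rows with a missing padj are dropped as well.
Roundup_DEG = Roundup_DEG[Roundup_DEG['padj'] < 0.05]
# index=False leaves out the dataframe index, so GeneID remains the first column of the file.
Roundup_DEG.to_excel('GSE109565.top.table-Roundup.xlsx', index=False)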
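The filter above keeps only rows whose adjusted p-value is below 0.05. Because a comparison with NaN evaluates to False, genes without an adjusted p-value are removed as well. The sketch below shows this on a small table with illustrative values and writes the result without the dataframe index, so that GeneID stays the first column of the exported table. It writes tab-separated text to memory rather than an Excel file, purely so that it runs without extra dependencies.

```python
import io

import pandas as pd

# Illustrative rows only; padj is missing for the last gene.
table = pd.DataFrame({
    'GeneID': [1544, 860, 6235],
    'padj': [3.27e-95, 0.20, float('nan')],
    'log2FoldChange': [8.08, 0.15, 0.61],
})

# NaN < 0.05 is False, so rows without padj drop out together with padj >= 0.05.
significant = table[table['padj'] < 0.05]

buffer = io.StringIO()
significant.to_csv(buffer, sep='\t', index=False)  # no extra index column
print(buffer.getvalue())
```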
step 20: You will now integrate the filtered expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the Roundup_DEG table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('GSE109565.top.table-Roundup.xlsx', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 21: You define the style for the mapping so that the expression values (log2FoldChange) are mapped to the gene nodes. This is done by first retrieving the log2FoldChange column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FoldChange')
Log2Foldchange_column
SUID | log2FoldChange |
---|---|
172034 | -0.064747 |
176130 | -0.577900 |
172032 | NaN |
176128 | 1.191256 |
172038 | -0.072468 |
... | ... |
176120 | -0.158248 |
172030 | -0.267251 |
176126 | 0.012222 |
172028 | 0.602433 |
176124 | -0.040633 |
2952 rows × 1 columns
step 22: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 23: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FoldChange', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
Section 5: Mapping of dataset: E-MEXP-2599
In this section, you will map the transcriptomics expression data of dataset E-MEXP-2599. This dataset aimed to identify transcriptional alterations in renal proximal tubular cell cultures due to nephrotoxin exposure.
5.1 CdCl2 exposure time 1
step 24: You first import the expression table into a new dataframe.
Cdcl2_X12= pd.read_csv('topTable_X12_CdCl2_.5 - X12_vehicle...control_.0.TSV.tsv',sep='\t')
Cdcl2_X12
index | GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value |
---|---|---|---|---|---|---|
0 | 3162 | 10.077089 | 4.973286e+00 | 0.088921 | 0.000000 | 0.000000 |
1 | 4501 | 9.584053 | 3.788092e+00 | 0.084215 | 0.000000 | 0.000000 |
2 | 4499 | 9.121542 | 4.862877e+00 | 0.112327 | 0.000000 | 0.000000 |
3 | 4616 | 7.572723 | 3.341071e+00 | 0.081584 | 0.000000 | 0.000000 |
4 | 7378 | 9.993817 | 2.289442e+00 | 0.058827 | 0.000000 | 0.000000 |
... | ... | ... | ... | ... | ... | ... |
20406 | 5914 | 5.563178 | 5.265372e-05 | 0.110517 | 0.999600 | 0.999796 |
20407 | 181 | 5.327866 | 7.374238e-05 | 0.195193 | 0.999683 | 0.999830 |
20408 | 10840 | 4.467638 | -2.202214e-05 | 0.153921 | 0.999880 | 0.999978 |
20409 | 55071 | 5.196982 | 7.201510e-06 | 0.094173 | 0.999936 | 0.999985 |
20410 | 388963 | 6.430315 | 3.777609e-07 | 0.154951 | 0.999998 | 0.999998 |
20411 rows × 6 columns
step 25: You will now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the Cdcl2_X12 table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('topTable_X12_CdCl2_.5 - X12_vehicle...control_.0.TSV.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 26: You define the style for the mapping so that the expression values (log2FC) are mapped to the gene nodes. This is done by first retrieving the log2FC column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
SUID | log2FC |
---|---|
172034 | 0.583972 |
176130 | 0.067827 |
172032 | NaN |
176128 | 0.018703 |
172038 | -0.079165 |
... | ... |
176120 | -0.827504 |
172030 | -0.123909 |
176126 | -0.021546 |
172028 | -0.018164 |
176124 | 0.029460 |
2952 rows × 1 columns
step 27: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 28: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
5.2 CdCl2 exposure time 2
step 29: You first import the expression table into a new dataframe.
Cdcl2_X48= pd.read_csv('topTable_X48_CdCl2_.5 - X48_vehicle...control_.0.TSV.tsv',sep='\t')
Cdcl2_X48
index | GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value |
---|---|---|---|---|---|---|
0 | 83729 | 8.601250 | 1.424848 | 0.116040 | 3.116727e-11 | 3.969629e-07 |
1 | 23753 | 9.838958 | 0.832729 | 0.068662 | 3.889696e-11 | 3.969629e-07 |
2 | 64764 | 8.148485 | 0.684942 | 0.064768 | 4.352100e-10 | 2.961024e-06 |
3 | 6745 | 9.397867 | 0.577089 | 0.060004 | 2.210019e-09 | 9.234976e-06 |
4 | 79174 | 11.181143 | 0.749899 | 0.078080 | 2.262255e-09 | 9.234976e-06 |
... | ... | ... | ... | ... | ... | ... |
20406 | 10365 | 4.205712 | 0.000043 | 0.113952 | 9.996835e-01 | 9.998605e-01 |
20407 | 64864 | 6.929174 | 0.000051 | 0.156795 | 9.997261e-01 | 9.998605e-01 |
20408 | 55622 | 7.781066 | 0.000026 | 0.100295 | 9.997845e-01 | 9.998605e-01 |
20409 | 56955 | 5.269484 | 0.000045 | 0.202175 | 9.998115e-01 | 9.998605e-01 |
20410 | 30818 | 6.337445 | 0.000002 | 0.096839 | 9.999829e-01 | 9.999829e-01 |
20411 rows × 6 columns
step 30: You will now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the Cdcl2_X48 table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('topTable_X48_CdCl2_.5 - X48_vehicle...control_.0.TSV.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 31: You define the style for the mapping so that the expression values (log2FC) are mapped to the gene nodes. This is done by first retrieving the log2FC column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
SUID | log2FC |
---|---|
172034 | -0.145011 |
176130 | 0.019860 |
172032 | NaN |
176128 | -0.129695 |
172038 | 0.103577 |
... | ... |
176120 | -0.266440 |
172030 | 0.011238 |
176126 | -0.010369 |
172028 | 0.094546 |
176124 | 0.002527 |
2952 rows × 1 columns
step 32: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 33: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
5.3 CsA exposure time 1
step 34: You first import the expression table into a new dataframe.
CsA_X12= pd.read_csv('topTable_X12_CsA_.5 - X12_vehicle...control_.0.TSV.tsv',sep='\t')
CsA_X12
index | GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value |
---|---|---|---|---|---|---|
0 | 6241 | 10.251530 | -1.341259 | 0.072327 | 1.498431e-14 | 3.058447e-10 |
1 | 983 | 9.069451 | -0.903562 | 0.060879 | 9.808784e-13 | 7.853102e-09 |
2 | 1672 | 8.069463 | -1.108119 | 0.075321 | 1.154246e-12 | 7.853102e-09 |
3 | 51514 | 8.314937 | -1.034641 | 0.073624 | 2.686419e-12 | 1.370813e-08 |
4 | 7298 | 9.796690 | -0.797287 | 0.057650 | 3.603946e-12 | 1.471203e-08 |
... | ... | ... | ... | ... | ... | ... |
20407 | 1101 | 4.627745 | 0.000038 | 0.132666 | 9.997583e-01 | 9.998667e-01 |
20408 | 55501 | 6.552664 | -0.000020 | 0.084855 | 9.998047e-01 | 9.998667e-01 |
20409 | 64919 | 3.861795 | 0.000014 | 0.064401 | 9.998150e-01 | 9.998667e-01 |
20410 | 3213 | 6.052742 | 0.000021 | 0.097976 | 9.998177e-01 | 9.998667e-01 |
20411 | 91582 | 8.899855 | -0.000007 | 0.075645 | 9.999198e-01 | 9.999198e-01 |
20412 rows × 6 columns
step 35: You will now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the CsA_X12 table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('topTable_X12_CsA_.5 - X12_vehicle...control_.0.TSV.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 36: You define the style for the mapping so that the expression values (log2FC) are mapped to the gene nodes. This is done by first retrieving the log2FC column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
SUID | log2FC |
---|---|
172034 | 0.088729 |
176130 | 0.104596 |
172032 | NaN |
176128 | 0.060109 |
172038 | 0.008881 |
... | ... |
176120 | -0.416854 |
172030 | 0.187784 |
176126 | 0.084212 |
172028 | 0.003976 |
176124 | 0.089362 |
2952 rows × 1 columns
step 37: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 38: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
5.4 CsA exposure time 2
step 39: You first import the expression table into a new dataframe.
CsA_X48= pd.read_csv('topTable_X48_CsA_.5 - X48_vehicle...control_.0.TSV.tsv',sep='\t')
CsA_X48
index | GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value |
---|---|---|---|---|---|---|
0 | 6241 | 10.251530 | -1.341259 | 0.072327 | 1.498431e-14 | 3.058447e-10 |
1 | 983 | 9.069451 | -0.903562 | 0.060879 | 9.808784e-13 | 7.853102e-09 |
2 | 1672 | 8.069463 | -1.108119 | 0.075321 | 1.154246e-12 | 7.853102e-09 |
3 | 51514 | 8.314937 | -1.034641 | 0.073624 | 2.686419e-12 | 1.370813e-08 |
4 | 7298 | 9.796690 | -0.797287 | 0.057650 | 3.603946e-12 | 1.471203e-08 |
... | ... | ... | ... | ... | ... | ... |
20406 | 1101 | 4.627745 | 0.000038 | 0.132666 | 9.997583e-01 | 9.998667e-01 |
20407 | 55501 | 6.552664 | -0.000020 | 0.084855 | 9.998047e-01 | 9.998667e-01 |
20408 | 64919 | 3.861795 | 0.000014 | 0.064401 | 9.998150e-01 | 9.998667e-01 |
20409 | 3213 | 6.052742 | 0.000021 | 0.097976 | 9.998177e-01 | 9.998667e-01 |
20410 | 91582 | 8.899855 | -0.000007 | 0.075645 | 9.999198e-01 | 9.999198e-01 |
20411 rows × 6 columns
step 40: You will now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the CsA_X48 table holds the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that were added by the CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('topTable_X48_CsA_.5 - X48_vehicle...control_.0.TSV.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 41: You define the style for the mapping so that the expression values (log2FC) are mapped to the gene nodes. This is done by first retrieving the log2FC column.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
SUID | log2FC |
---|---|
172034 | 0.088729 |
176130 | 0.104596 |
172032 | NaN |
176128 | 0.060109 |
172038 | 0.008881 |
... | ... |
176120 | -0.416854 |
172030 | 0.187784 |
176126 | 0.084212 |
172028 | 0.003976 |
176124 | 0.089362 |
2952 rows × 1 columns
step 42: Next, you define the color scheme so that low and high expression values receive distinct colors. You will use the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between the minimum and maximum expression value = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 43: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
5.5 Diquat dibromide exposure time 1
step 44: You first import the expression table into a new dataframe.
Diquat_X12= pd.read_csv('topTable_X12_Diquat_30 - X12_vehicle...control_.0.TSV.tsv',sep='\t')
Diquat_X12
index | GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value |
---|---|---|---|---|---|---|
0 | 10628 | 7.232834 | -1.454574 | 0.114973 | 1.820506e-11 | 3.156962e-07 |
1 | 7296 | 11.342399 | 0.736758 | 0.059977 | 3.093393e-11 | 3.156962e-07 |
2 | 835 | 6.171604 | 0.796337 | 0.066514 | 4.898336e-11 | 3.332665e-07 |
3 | 2316 | 9.740765 | -0.804938 | 0.084547 | 2.621614e-09 | 1.337744e-05 |
4 | 1063 | 8.276373 | -0.544829 | 0.064959 | 2.118435e-08 | 8.194407e-05 |
... | ... | ... | ... | ... | ... | ... |
20406 | 26030 | 5.651535 | 0.000083 | 0.118398 | 9.994120e-01 | 9.995947e-01 |
20407 | 112268153 | 4.700970 | 0.000119 | 0.180532 | 9.994478e-01 | 9.995947e-01 |
20408 | 9406 | 9.979627 | 0.000036 | 0.096421 | 9.996866e-01 | 9.997846e-01 |
20409 | 401262 | 4.907649 | 0.000040 | 0.132109 | 9.997442e-01 | 9.997932e-01 |
20410 | 23162 | 5.949681 | -0.000010 | 0.102911 | 9.999185e-01 | 9.999185e-01 |
20411 rows × 6 columns
step 45: You will now integrate the expression table into the nodetable of the AOP network. This will be done using the function: p4c.load_table_data_from_file where you select ‘True’ for the second variable as the first row of the PCB1_DEG table has the needed column names. You also select ’node’ for table and ‘CTL.GeneID’ for table_key_column as you want the expression data to be matched to the Gene ID’s that were added by CyTargetLinker extension in the previous notebook.
p4c.load_table_data_from_file('topTable_X12_Diquat_30 - X12_vehicle...control_.0.TSV.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
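The matching on 'CTL.GeneID' only fills in nodes whose key also occurs in the expression table. The following sketch shows one way to count such matches with pandas alone. Both tables are small hypothetical stand-ins (in the notebook, the node keys could be retrieved with p4c.get_table_columns), and comparing the keys as strings is an assumption about how integer and text IDs might be reconciled.

```python
import pandas as pd

# Hypothetical stand-ins for the expression table and the node table.
expression = pd.DataFrame(
    {"GeneID": [10628, 7296, 835], "log2FC": [-1.454574, 0.736758, 0.796337]}
)
nodes = pd.DataFrame({"CTL.GeneID": ["7296", "835", None, "9999"]})

# Compare keys as strings so that integer and text IDs are treated alike.
expr_keys = set(expression["GeneID"].astype(str))
node_keys = set(nodes["CTL.GeneID"].dropna().astype(str))

matched = expr_keys & node_keys
print(f"{len(matched)} of {len(node_keys)} gene nodes have expression data")
```

Nodes without a gene ID (None in this toy table) are left out of the count, since they cannot be matched at all.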
step 46: You define the style for the mapping so that the expression values (log2 fold change) are mapped to the gene nodes. You start by retrieving the log2FC column from the node table.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
log2FC | |
---|---|
172034 | 0.088729 |
176130 | 0.104596 |
172032 | NaN |
176128 | 0.060109 |
172038 | 0.008881 |
... | ... |
176120 | -0.416854 |
172030 | 0.187784 |
176126 | 0.084212 |
172028 | 0.003976 |
176124 | 0.089362 |
2952 rows × 1 columns
step 47: Next, you define the color scheme so that low and high expression values receive distinct colors. The color scheme uses the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between minimum and maximum = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
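The three breakpoints can be reproduced on a small table to see what they look like. The sketch below uses made-up log2FC values and the same min, max and midpoint arithmetic as the lines above. It also computes a zero-centered alternative; that alternative is an assumption added for comparison and is not what this notebook applies.

```python
import pandas as pd

# Made-up values; None stands for a node without expression data.
column = pd.DataFrame({"log2FC": [-0.4, 0.1, None, 1.2]})

# Same arithmetic as in the notebook: pandas skips missing values by default.
low = column.min().values[0]
high = column.max().values[0]
mid = low + (high - low) / 2
print("range-based breakpoints:", [low, mid, high])

# Alternative (assumption): center white on 0 with symmetric limits.
limit = max(abs(low), abs(high))
symmetric = [-limit, 0.0, limit]
print("zero-centered breakpoints:", symmetric)
```

With an asymmetric range such as this one, the range-based midpoint is not 0, so a white node does not necessarily mean an unchanged gene.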
step 48: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
5.6 Diquat dibromide exposure time 2
step 49: You first import the expression table into a new dataframe.
Diquat_X48= pd.read_csv('topTable_X48_Diquat_30 - X48_vehicle...control_.0.TSV.tsv',sep='\t')
Diquat_X48
GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value | |
---|---|---|---|---|---|---|
0 | 3162 | 10.077089 | 5.104026 | 0.088921 | 3.820970e-24 | 7.798982e-20 |
1 | 90637 | 9.408076 | 5.551762 | 0.115706 | 1.329066e-22 | 1.356378e-18 |
2 | 27063 | 4.810138 | 5.208681 | 0.118688 | 7.756080e-22 | 5.276979e-18 |
3 | 1649 | 10.590937 | 2.895794 | 0.069728 | 2.305203e-21 | 1.170734e-17 |
4 | 7296 | 11.342399 | 2.441458 | 0.059977 | 3.421953e-21 | 1.170734e-17 |
... | ... | ... | ... | ... | ... | ... |
20406 | 728621 | 3.399475 | -0.000045 | 0.150793 | 9.997491e-01 | 9.999451e-01 |
20407 | 79781 | 3.955769 | 0.000014 | 0.105414 | 9.998896e-01 | 9.999465e-01 |
20408 | 653784 | 9.990381 | 0.000005 | 0.063024 | 9.999277e-01 | 9.999465e-01 |
20409 | 24141 | 4.212187 | 0.000015 | 0.208701 | 9.999402e-01 | 9.999465e-01 |
20410 | 54900 | 3.336191 | -0.000008 | 0.121110 | 9.999465e-01 | 9.999465e-01 |
20411 rows × 6 columns
step 50: You now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the expression table contains the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that the CyTargetLinker extension added in the previous notebook.
p4c.load_table_data_from_file('topTable_X48_Diquat_30 - X48_vehicle...control_.0.TSV.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 51: You define the style for the mapping so that the expression values (log2 fold change) are mapped to the gene nodes. You start by retrieving the log2FC column from the node table.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
log2FC | |
---|---|
172034 | 0.785087 |
176130 | -0.011815 |
172032 | NaN |
176128 | -0.070274 |
172038 | 0.010107 |
... | ... |
176120 | -1.021143 |
172030 | 0.066869 |
176126 | 0.004609 |
172028 | 0.118902 |
176124 | 0.027394 |
2952 rows × 1 columns
step 52: Next, you define the color scheme so that low and high expression values receive distinct colors. The color scheme uses the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between minimum and maximum = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 53: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
Section 6: Mapping of dataset: E-MEXP-3583
In this section, you will map the transcriptomics expression data of dataset E-MEXP-3583. This study determined the effect of AgNPs and Ag+ on the transcriptome of the human lung epithelial cell line A549 at two exposure times (24 and 48 hours).
6.1 Ag+ exposure time 1
step 54: You first import the expression table into a new dataframe.
Ag_X24= pd.read_csv('topTable_Ag._.1.3_24 - H2O.control_.0.0_24.tsv',sep='\t')
Ag_X24
GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value | |
---|---|---|---|---|---|---|
0 | 4490 | 9.925881 | 3.935193 | 0.206737 | 9.842270e-11 | 0.000002 |
1 | 4495 | 9.561522 | 4.109887 | 0.243653 | 3.924060e-10 | 0.000004 |
2 | 79974 | 6.381454 | 2.532636 | 0.174021 | 2.094514e-09 | 0.000011 |
3 | 4489 | 12.843381 | 2.498127 | 0.172052 | 2.150589e-09 | 0.000011 |
4 | 1638 | 6.429810 | 2.349836 | 0.170834 | 3.954792e-09 | 0.000016 |
... | ... | ... | ... | ... | ... | ... |
20513 | 391257 | 6.462645 | -0.000055 | 0.146291 | 9.996750e-01 | 0.999870 |
20514 | 10078 | 8.351476 | -0.000037 | 0.128166 | 9.997478e-01 | 0.999894 |
20515 | 84328 | 9.130185 | -0.000033 | 0.167015 | 9.998260e-01 | 0.999923 |
20516 | 253018 | 7.162925 | -0.000015 | 0.199487 | 9.999336e-01 | 0.999982 |
20517 | 440737 | 10.365751 | 0.000003 | 0.187502 | 9.999848e-01 | 0.999985 |
20518 rows × 6 columns
step 55: You now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the expression table contains the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that the CyTargetLinker extension added in the previous notebook.
p4c.load_table_data_from_file('topTable_Ag._.1.3_24 - H2O.control_.0.0_24.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 56: You define the style for the mapping so that the expression values (log2 fold change) are mapped to the gene nodes. You start by retrieving the log2FC column from the node table.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
log2FC | |
---|---|
172034 | -0.037742 |
176130 | 0.328724 |
172032 | NaN |
176128 | -0.013540 |
172038 | -0.016723 |
... | ... |
176120 | -0.105028 |
172030 | -0.107576 |
176126 | 0.068160 |
172028 | -0.239989 |
176124 | -0.063865 |
2952 rows × 1 columns
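The NaN entries in this column belong to nodes that did not receive an expression value during the import. A minimal sketch for counting them is given below. It rebuilds the first five rows of the preview above as a small DataFrame; the variable names are illustrative only.

```python
import pandas as pd

# First five rows of the preview above, with the row labels as index.
column = pd.DataFrame(
    {"log2FC": [-0.037742, 0.328724, float("nan"), -0.013540, -0.016723]},
    index=[172034, 176130, 172032, 176128, 172038],
)

# Boolean mask that is True for rows without an expression value.
without_value = column["log2FC"].isna()

print(int(without_value.sum()), "node(s) without a log2FC value")
print("row labels:", column.index[without_value].tolist())
```

These rows do not affect the breakpoints, because the pandas min and max methods skip missing values by default.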
step 57: Next, you define the color scheme so that low and high expression values receive distinct colors. The color scheme uses the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between minimum and maximum = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 58: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
6.2 Ag+ exposure time 2
step 59: You first import the expression table into a new dataframe.
Ag_X48= pd.read_csv('topTable_Ag._.1.3_48 - H2O.control_.0.0_48.tsv',sep='\t')
Ag_X48
GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value | |
---|---|---|---|---|---|---|
0 | 116496 | 9.406879 | -1.412485 | 0.170255 | 9.744846e-07 | 0.007563 |
1 | 51200 | 8.635450 | 1.278591 | 0.154984 | 1.033257e-06 | 0.007563 |
2 | 160428 | 9.374766 | -1.188524 | 0.145008 | 1.105778e-06 | 0.007563 |
3 | 80329 | 10.119216 | -1.284059 | 0.164517 | 1.834365e-06 | 0.008277 |
4 | 4133 | 7.091635 | -1.010351 | 0.130650 | 2.016940e-06 | 0.008277 |
... | ... | ... | ... | ... | ... | ... |
20513 | 10618 | 11.693438 | -0.000020 | 0.141399 | 9.998772e-01 | 0.999968 |
20514 | 645191 | 7.747050 | -0.000021 | 0.188698 | 9.999046e-01 | 0.999968 |
20515 | 3238 | 5.981553 | 0.000023 | 0.246879 | 9.999184e-01 | 0.999968 |
20516 | 389799 | 6.754615 | -0.000009 | 0.158938 | 9.999528e-01 | 0.999968 |
20517 | 6352 | 7.074925 | 0.000007 | 0.189656 | 9.999683e-01 | 0.999968 |
20518 rows × 6 columns
step 60: You now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the expression table contains the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that the CyTargetLinker extension added in the previous notebook.
p4c.load_table_data_from_file('topTable_Ag._.1.3_48 - H2O.control_.0.0_48.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 61: You define the style for the mapping so that the expression values (log2 fold change) are mapped to the gene nodes. You start by retrieving the log2FC column from the node table.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
log2FC | |
---|---|
172034 | 0.249750 |
176130 | -0.136309 |
172032 | NaN |
176128 | 0.298794 |
172038 | -0.064556 |
... | ... |
176120 | 0.314105 |
172030 | -0.180400 |
176126 | -0.138275 |
172028 | -0.208312 |
176124 | -0.030240 |
2952 rows × 1 columns
step 62: Next, you define the color scheme so that low and high expression values receive distinct colors. The color scheme uses the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between minimum and maximum = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 63: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
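To get a feeling for what a continuous mapping (mapping_type='c') does with values between the breakpoints, the sketch below interpolates colors linearly between them in pure Python. It assumes that Cytoscape blends colors linearly between neighbouring breakpoints; the function names are hypothetical and the code is an illustration, not Cytoscape's own implementation.

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' into a tuple of three integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert a tuple of three integers into '#RRGGBB'."""
    return "#" + "".join(f"{channel:02X}" for channel in rgb)


def interpolate_color(value, points, colors):
    """Piecewise-linear color for value; points must be strictly increasing."""
    # Values outside the range are clamped to the outermost colors.
    if value <= points[0]:
        return colors[0].upper()
    if value >= points[-1]:
        return colors[-1].upper()
    segments = zip(points, points[1:], colors, colors[1:])
    for left, right, color_left, color_right in segments:
        if left <= value <= right:
            fraction = (value - left) / (right - left)
            start, end = hex_to_rgb(color_left), hex_to_rgb(color_right)
            mixed = tuple(
                round(s + fraction * (e - s)) for s, e in zip(start, end)
            )
            return rgb_to_hex(mixed)


# Illustrative breakpoints; the notebook derives them from the log2FC column.
points = [-1.0, 0.0, 1.0]
colors = ["#0000FF", "#FFFFFF", "#FF0000"]

for value in (-1.0, 0.0, 0.5, 1.0):
    print(value, interpolate_color(value, points, colors))
```

Values close to the middle breakpoint therefore appear nearly white, while values near the minimum or maximum appear saturated blue or red.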
6.3 AgNP exposure time 1
step 64: You first import the expression table into a new dataframe.
Agnp_X24= pd.read_csv('topTable_AgNP_12.1_24 - H2O.control_.0.0_24.tsv',sep='\t')
Agnp_X24
GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value | |
---|---|---|---|---|---|---|
0 | 3310 | 9.434652 | 5.819499 | 0.177670 | 1.829144e-13 | 3.753039e-09 |
1 | 4490 | 9.925881 | 4.711945 | 0.206737 | 1.235097e-11 | 1.267086e-07 |
2 | 4495 | 9.561522 | 4.698081 | 0.243653 | 8.487924e-11 | 5.554945e-07 |
3 | 10112 | 9.875817 | -2.880784 | 0.152609 | 1.082941e-10 | 5.554945e-07 |
4 | 7779 | 10.760136 | 2.118909 | 0.133429 | 7.797174e-10 | 2.962740e-06 |
... | ... | ... | ... | ... | ... | ... |
20513 | 114655 | 3.169466 | -0.000187 | 0.172065 | 9.990522e-01 | 9.992470e-01 |
20514 | 130500 | 5.737074 | -0.000124 | 0.148645 | 9.992769e-01 | 9.994230e-01 |
20515 | 26056 | 9.261859 | -0.000045 | 0.164172 | 9.997598e-01 | 9.998572e-01 |
20516 | 79736 | 8.096957 | -0.000028 | 0.192082 | 9.998737e-01 | 9.999225e-01 |
20517 | 245932 | 5.079037 | 0.000004 | 0.208761 | 9.999836e-01 | 9.999836e-01 |
20518 rows × 6 columns
step 65: You now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the expression table contains the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that the CyTargetLinker extension added in the previous notebook.
p4c.load_table_data_from_file('topTable_AgNP_12.1_24 - H2O.control_.0.0_24.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 66: You define the style for the mapping so that the expression values (log2 fold change) are mapped to the gene nodes. You start by retrieving the log2FC column from the node table.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
log2FC | |
---|---|
172034 | -0.094355 |
176130 | 0.466970 |
172032 | NaN |
176128 | 0.234870 |
172038 | -0.073984 |
... | ... |
176120 | -0.414748 |
172030 | 0.074024 |
176126 | 0.066062 |
172028 | -0.490685 |
176124 | -0.323264 |
2952 rows × 1 columns
step 67: Next, you define the color scheme so that low and high expression values receive distinct colors. The color scheme uses the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between minimum and maximum = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 68: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
6.4 AgNP exposure time 2
step 69: You first import the expression table into a new dataframe.
Agnp_X48= pd.read_csv('topTable_AgNP_12.1_48 - H2O.control_.0.0_48.tsv',sep='\t')
Agnp_X48
GeneID | meanExpr | log2FC | log2FC SE | p-value | adj. p-value | |
---|---|---|---|---|---|---|
0 | 3310 | 9.434652 | 5.230100 | 0.177670 | 6.346082e-13 | 1.302089e-08 |
1 | 4494 | 13.205864 | 3.027132 | 0.139071 | 2.101787e-11 | 2.156224e-07 |
2 | 57823 | 8.314329 | 3.276183 | 0.159564 | 4.121020e-11 | 2.818503e-07 |
3 | 131909 | 6.916464 | 3.079920 | 0.170572 | 1.802186e-10 | 9.244314e-07 |
4 | 4490 | 9.925881 | 3.469312 | 0.206737 | 4.160864e-10 | 1.707452e-06 |
... | ... | ... | ... | ... | ... | ... |
20513 | 406894 | 3.200780 | -0.000104 | 0.219327 | 9.995863e-01 | 9.997392e-01 |
20514 | 6909 | 8.359756 | -0.000066 | 0.148913 | 9.996144e-01 | 9.997392e-01 |
20515 | 115207 | 8.937618 | 0.000066 | 0.160592 | 9.996417e-01 | 9.997392e-01 |
20516 | 23135 | 9.375034 | 0.000061 | 0.267180 | 9.998008e-01 | 9.998496e-01 |
20517 | 641556 | 5.486193 | 0.000013 | 0.190845 | 9.999388e-01 | 9.999388e-01 |
20518 rows × 6 columns
step 70: You now integrate the expression table into the node table of the AOP network using the function p4c.load_table_data_from_file. You set first_row_as_column_names to True because the first row of the expression table contains the column names. You also set table to 'node' and table_key_column to 'CTL.GeneID', so that the expression data are matched to the gene IDs that the CyTargetLinker extension added in the previous notebook.
p4c.load_table_data_from_file('topTable_AgNP_12.1_48 - H2O.control_.0.0_48.tsv', first_row_as_column_names=True,table='node', table_key_column='CTL.GeneID')
{'mappedTables': [171586, 171624]}
step 71: You define the style for the mapping so that the expression values (log2 fold change) are mapped to the gene nodes. You start by retrieving the log2FC column from the node table.
Log2Foldchange_column = p4c.get_table_columns(table='node', columns='log2FC')
Log2Foldchange_column
log2FC | |
---|---|
172034 | 0.090557 |
176130 | -0.276784 |
172032 | NaN |
176128 | 0.129527 |
172038 | -0.034512 |
... | ... |
176120 | 0.489602 |
172030 | -0.121748 |
176126 | 0.115153 |
172028 | -0.797962 |
176124 | -0.927628 |
2952 rows × 1 columns
step 72: Next, you define the color scheme so that low and high expression values receive distinct colors. The color scheme uses the following colors:
- Low expression value (minimum) = blue node color
- Midpoint between minimum and maximum = white node color
- High expression value (maximum) = red node color
This color scheme was also described in the official py4cytoscape documentation (1).
Blue_expression_color= Log2Foldchange_column.min().values[0]
Red_expression_color= Log2Foldchange_column.max().values[0]
White_expression_color= Blue_expression_color + (Red_expression_color -Blue_expression_color)/2
step 73: You apply this color scheme to the network.
p4c.set_node_color_mapping('log2FC', [Blue_expression_color,White_expression_color,Red_expression_color], ['#0000FF', '#FFFFFF', '#FF0000'],mapping_type='c', style_name='Sample1')
''
p4c.notebook_export_show_image()
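Sections 4 to 6 repeat the same three calls for every condition, so the steps could be wrapped in one helper. The sketch below is one possible way to do that. The functions color_breakpoints and map_condition and the class RecordingClient are hypothetical names; the client argument is expected to offer the three py4cytoscape functions used in this notebook, and the recording stand-in exists only so that the example runs without a Cytoscape session.

```python
import pandas as pd


def color_breakpoints(column):
    """Return [minimum, midpoint, maximum] of a one-column DataFrame."""
    low = column.min().values[0]
    high = column.max().values[0]
    return [low, low + (high - low) / 2, high]


def map_condition(client, tsv_path, style_name="Sample1"):
    """Load one expression table and color the nodes by log2FC."""
    client.load_table_data_from_file(
        tsv_path,
        first_row_as_column_names=True,
        table="node",
        table_key_column="CTL.GeneID",
    )
    column = client.get_table_columns(table="node", columns="log2FC")
    points = color_breakpoints(column)
    client.set_node_color_mapping(
        "log2FC",
        points,
        ["#0000FF", "#FFFFFF", "#FF0000"],
        mapping_type="c",
        style_name=style_name,
    )
    return points


class RecordingClient:
    """Stand-in for the p4c module that only records the calls it gets."""

    def __init__(self):
        self.calls = []

    def load_table_data_from_file(self, path, **options):
        self.calls.append(("load", path))

    def get_table_columns(self, table, columns):
        # Made-up values in place of the real node table.
        return pd.DataFrame({columns: [-1.0, None, 3.0]})

    def set_node_color_mapping(self, column, points, colors, **options):
        self.calls.append(("map", column, options["style_name"]))


client = RecordingClient()
points = map_condition(client, "example.tsv")
print(points)
print(client.calls)
```

With a running Cytoscape session, the py4cytoscape module itself could presumably be passed as client, for example map_condition(p4c, 'topTable_AgNP_12.1_48 - H2O.control_.0.0_48.tsv').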
Section 7: Metadata
step 74: Finally, the metadata of this Jupyter notebook are displayed, listing the package versions and the system setup for interested users. This requires the packages watermark and print_versions.
%load_ext watermark
!pip install print-versions
Requirement already satisfied: print-versions in c:\users\shaki\anaconda3\lib\site-packages (0.1.0)
%watermark
Last updated: 2025-06-03T20:22:47.796839+02:00
Python implementation: CPython
Python version : 3.12.3
IPython version : 8.25.0
Compiler : MSC v.1938 64 bit (AMD64)
OS : Windows
Release : 11
Machine : AMD64
Processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
CPU cores : 8
Architecture: 64bit
from print_versions import print_versions
print_versions(globals())
json==2.0.9
ipykernel==6.28.0
pandas==2.2.2
py4cytoscape==1.9.0
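If the watermark or print_versions packages are not available, similar information can be collected with the Python standard library. The sketch below is such an alternative, not the method used in this notebook; the function package_versions is a hypothetical helper, and packages that are not installed are reported as None.

```python
import platform
import sys
from importlib import metadata


def package_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The two packages imported at the start of this notebook.
report = package_versions(["pandas", "py4cytoscape"])

print("Python version:", sys.version.split()[0])
print("OS:", platform.system(), platform.release())
print("Machine:", platform.machine())
for name, version in report.items():
    print(f"{name}=={version}")
```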