Guangxi University Supercomputing Platform (in operation)
 

​ICOLEC:Identifying CO-Localized and co-Expressed Complexes



 
2017-11-03 09:13 admin

Identifying CO-Localized and co-Expressed Complexes by the ICOLEC method

A software suite for identifying protein complexes from protein-protein
interaction (PPI) data sets and evaluating the identified complexes using CYC2008.

Contact: JinXiong Zhang (Zhangjx@gxu.edu.cn)

Getting started
===================================================================
Unpack the downloaded archive to a directory of your choice.

Package structure and key files
===================================================================
 

.\bin\                            executable files and batch files
.\common\                         shared files
.\src\                            source code
.\bin\preparing_data.bat          generates all data files used by ICOLEC; follow the prompts
.\bin\identify_and_analyze.bat    identifies and analyzes protein complexes
.\bin\README.txt
...

Usage
===================================================================
The software suite runs in two stages:

1. Preparing data stage: run the preparing_data batch file once.
2. Identifying and evaluating stage: run the identify_and_analyze batch file repeatedly with different parameters to obtain better results.

1. Preparing data stage
///////////////////////////////////////////////////////////////////  

PPIs file format
###################################################################
After creating the directory "mydata" or "datadir", copy your own PPIs file (e.g. my_PPIs.txt)
into it. Note that my_PPIs.txt must be formatted as follows.

my_PPIs.txt                           one pair of systematic names per line, separated by a tab.
 

YKL171W YML096W
YFL017W-A YFR031C-A
...  
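
Before running preparing_data, it may be worth checking that the PPIs file really has two tab-separated names on every line. The following Python sketch is not part of ICOLEC; the function name check_ppi_lines is hypothetical, and treating blank lines as errors is an assumption, since this README does not say whether ICOLEC tolerates them.

```python
from typing import Iterable, List, Tuple


def check_ppi_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for each line that is not 'nameA<TAB>nameB'."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            # Assumption: blank lines are reported rather than silently skipped.
            problems.append((number, "empty line"))
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append(
                (number, f"expected 2 tab-separated fields, found {len(fields)}")
            )
        elif not all(field.strip() for field in fields):
            problems.append((number, "empty protein name"))
    return problems


if __name__ == "__main__":
    # The third line uses a space instead of a tab and is therefore reported.
    sample = ["YKL171W\tYML096W\n", "YFL017W-A\tYFR031C-A\n", "YKL171W YML096W\n"]
    for number, reason in check_ppi_lines(sample):
        print(f"line {number}: {reason}")
```

To check a real file, pass an open file object, e.g. check_ppi_lines(open("my_PPIs.txt", encoding="utf-8")).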


Examples for running preparing_data
###################################################################

preparing_data                        create the default directory "mydata" in the current directory
preparing_data datadir                create the given directory "datadir" in the current directory

Once the PPIs file has been copied into that directory, run one of the following batch commands to generate all data files used by ICOLEC.

preparing_data mydata my_PPIs.txt

or

preparing_data datadir my_PPIs.txt

After running preparing_data, "mydata" or "datadir" contains the following files (shown here for "datadir"):

.\datadir\my_PPIs.txt                            PPI input file
.\datadir\my_PPIs_coded_and_scored.txt           PPI input file in which proteins are coded and interactions are scored
.\datadir\my_PPIs_protein_aa.txt                 protein amino acid composition profiles; each entry is set to 0.05 when a profile is missing
.\datadir\my_PPIs_protein_ex.txt                 gene expression patterns; each entry is set to 1 when an expression pattern is missing
.\datadir\my_PPIs_protein_lo.txt                 protein localization data; each entry is set to 1 when localization data are missing
.\datadir\my_PPIs_protein_lo_for_evaluation.txt  protein localization data for evaluating the protein set in the PPI data set
.\datadir\my_PPIs_protein_sim_cc.txt             upper-triangular protein similarity matrix based on CC terms
.\datadir\my_PPIs_protein_sim_mf.txt             upper-triangular protein similarity matrix based on MF terms
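
A quick way to confirm that the preparing data stage finished is to check that all of the files listed above exist. The Python sketch below is not part of ICOLEC; it assumes that every output name is the PPIs file name without its extension plus one of the suffixes shown in the list, and the function names are hypothetical.

```python
from pathlib import Path
from typing import List

# Suffixes taken from the file list above.
SUFFIXES = [
    "_coded_and_scored.txt",
    "_protein_aa.txt",
    "_protein_ex.txt",
    "_protein_lo.txt",
    "_protein_lo_for_evaluation.txt",
    "_protein_sim_cc.txt",
    "_protein_sim_mf.txt",
]


def expected_files(data_dir: str, ppi_file: str) -> List[Path]:
    """List the input file plus the data files expected after preparing_data."""
    base = Path(data_dir)
    stem = Path(ppi_file).stem  # "my_PPIs.txt" -> "my_PPIs"
    return [base / ppi_file] + [base / f"{stem}{suffix}" for suffix in SUFFIXES]


def missing_files(data_dir: str, ppi_file: str) -> List[Path]:
    """Return the expected files that do not exist on disk."""
    return [path for path in expected_files(data_dir, ppi_file) if not path.is_file()]


if __name__ == "__main__":
    for path in missing_files("datadir", "my_PPIs.txt"):
        print(f"missing: {path}")
```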

2. Identifying and evaluating stage
///////////////////////////////////////////////////////////////////  

Optional parameters
###################################################################  

-r  r-reliable link threshold (r): 1--999; default 999 (990 for DIP, 999 for BioGrid and STRING).
-d  pcc_threshold delta (δ): 0--1.0; default 0.3.
-c  CC_similarity_threshold cc_sim (σ): 0--1.0; default 0.7 (0.63 for DIP, 0.7 for BioGrid and STRING).
-f  MF_similarity_threshold mf_sim (ω): 0--1.0; default 0.75 (0.68 for DIP, 0.7 for BioGrid, 0.75 for STRING).
-a  AA_discrepancy_threshold distant (ε): 0--1.0; default 0.2.
-m  Density_attenuation_coefficient_threshold miu (μ): >0; default 0.08 (0.4 for DIP, 0.1 for BioGrid, 0.08 for STRING).
-u  Attachment_co_expression_significant_threshold p2delta (γ): 0--1.0; default 0.01.
-e  Attachment_to_cluster_connectivity_threshold extra_link (η): 0--1.0; default 0.9 (0.7 for DIP and BioGrid, 0.9 for STRING).
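
The per-data-set values above can be kept in one table and checked against the documented ranges before a command line is assembled. The Python sketch below is not part of ICOLEC. It assumes that the range bounds are inclusive, that -r takes an integer, and that -d, -a and -u use the same default for all three data sets; the names DATASET_DEFAULTS, validate and build_args are hypothetical.

```python
from typing import Dict, List

# Recommended values per data set, copied from the parameter list above.
DATASET_DEFAULTS: Dict[str, Dict[str, float]] = {
    "DIP":     {"r": 990, "d": 0.3, "c": 0.63, "f": 0.68, "a": 0.2, "m": 0.4,  "u": 0.01, "e": 0.7},
    "BioGrid": {"r": 999, "d": 0.3, "c": 0.7,  "f": 0.7,  "a": 0.2, "m": 0.1,  "u": 0.01, "e": 0.7},
    "STRING":  {"r": 999, "d": 0.3, "c": 0.7,  "f": 0.75, "a": 0.2, "m": 0.08, "u": 0.01, "e": 0.9},
}

# Flags documented with the range 0--1.0.
UNIT_FLAGS = {"d", "c", "f", "a", "u", "e"}


def validate(params: Dict[str, float]) -> List[str]:
    """Return one message per flag that is unknown or outside its documented range."""
    errors = []
    for flag, value in params.items():
        if flag == "r":
            ok = isinstance(value, int) and 1 <= value <= 999
        elif flag == "m":
            ok = value > 0
        elif flag in UNIT_FLAGS:
            ok = 0 <= value <= 1.0  # assumption: bounds are inclusive
        else:
            errors.append(f"-{flag}: unknown flag")
            continue
        if not ok:
            errors.append(f"-{flag}: value {value} is outside the documented range")
    return errors


def build_args(data_dir: str, ppi_file: str, params: Dict[str, float]) -> List[str]:
    """Assemble an identify_and_analyze command line as a list of arguments."""
    args = ["identify_and_analyze", data_dir, ppi_file]
    for flag, value in params.items():
        args += [f"-{flag}", str(value)]
    return args


if __name__ == "__main__":
    params = DATASET_DEFAULTS["DIP"]
    assert validate(params) == []
    print(" ".join(build_args("DIP", "DIP_PPIs.txt", params)))
```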

Examples
#####################################################################

Setting all optional parameters:
identify_and_analyze mydata my_PPIs.txt  -r 900 -d 0.3 -c 0.7 -f 0.75 -a 0.2 -m 0.08 -u 0.01 -e 0.9  


Setting some optional parameters and leaving the rest at their defaults:
identify_and_analyze mydata my_PPIs.txt  -r 990 -c 0.63 -f 0.68 -m 0.4 -u 0.01 -e 0.7  

Leaving all optional parameters at their defaults:

identify_and_analyze STRING STRING_PPIs.txt      default optional parameters for the STRING data set
identify_and_analyze BioGrid BioGrid_PPIs.txt    default optional parameters for the BioGrid data set
identify_and_analyze DIP DIP_PPIs.txt            default optional parameters for the DIP data set
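
Because this stage is meant to be run many times with different parameters, it can help to generate the command lines for a small grid of values. The Python sketch below is not part of ICOLEC and only prints commands; it does not run them. The function name sweep_commands and the grid values are illustrative assumptions. This README does not say whether each run overwrites the files of the previous one, so copying the results aside between runs may be prudent.

```python
from itertools import product
from typing import Dict, Iterator, List, Sequence


def sweep_commands(
    data_dir: str, ppi_file: str, grid: Dict[str, Sequence[float]]
) -> Iterator[List[str]]:
    """Yield one identify_and_analyze argument list per combination in the grid."""
    flags = list(grid)
    for values in product(*(grid[flag] for flag in flags)):
        command = ["identify_and_analyze", data_dir, ppi_file]
        for flag, value in zip(flags, values):
            command += [f"-{flag}", str(value)]
        yield command


if __name__ == "__main__":
    # Illustrative grid: two values of -c combined with three values of -m.
    grid = {"c": [0.63, 0.7], "m": [0.08, 0.1, 0.4]}
    for command in sweep_commands("mydata", "my_PPIs.txt", grid):
        print(" ".join(command))
```

Flags that are left out of the grid keep their defaults, as in the examples above.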

Results
===================================================================
The identified complexes and their evaluation metrics are written to the following files in the subdirectory "complexes":

.\datadir\complexes\my_PPIs_complexes.txt          one identified complex per line, tab-separated
.\datadir\complexes\my_PPIs_complexes_measure.txt  one evaluation metric per line
.\datadir\complexes\my_PPIs_complexes_matched.txt  the number of perfectly matched complexes for each complex size
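
The complexes file can be loaded for further analysis with a few lines of Python. The sketch below is not part of ICOLEC; it assumes that each line holds only the tab-separated member names of one complex, with no extra identifier or score columns, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def read_complexes(lines: Iterable[str]) -> List[List[str]]:
    """Parse one complex per line; members are separated by tabs."""
    complexes = []
    for raw in lines:
        members = [name for name in raw.rstrip("\r\n").split("\t") if name]
        if members:  # skip blank lines
            complexes.append(members)
    return complexes


def size_histogram(complexes: List[List[str]]) -> Counter:
    """Count how many complexes there are of each size."""
    return Counter(len(members) for members in complexes)


if __name__ == "__main__":
    path = r".\datadir\complexes\my_PPIs_complexes.txt"
    with open(path, encoding="utf-8") as handle:
        complexes = read_complexes(handle)
    for size, count in sorted(size_histogram(complexes).items()):
        print(f"size {size}: {count} complexes")
```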

Availability
===================================================================
Software suite
Currently, only a Windows command-line version of the ICOLEC software suite is provided.

The PPIs file, the data files used by ICOLEC, and the complexes identified by ICOLEC with their evaluation metrics are provided for:
STRING
BioGrid
DIP

Additional files
///////////////////////////////////////////////////////////////////
Additional_file_1: BP term based enrichment analysis of the complexes identified by ICOLEC as well as significant complexes statistics.
Additional_file_2: CC term based enrichment analysis of the complexes identified by ICOLEC.  

Attachments:
STRING.rar
BioGrid.rar
DIP.rar
additional_file_1.zip
additional_file_2.zip
ICOLEC.rar