Guangxi University Supercomputing Platform (in operation)
 
ICOLEC: Identifying CO-Localized and co-Expressed Complexes
Posted: 2017-11-03 09:13 by admin

Identifying CO-Localized and co-Expressed Complexes with the ICOLEC method

A software suite for identifying protein complexes from protein-protein interaction (PPI)
data sets and for evaluating the identified complexes against the known complexes in CYC2008.

Web:     http://hpc.gxu.edu.cn/xsdt/ICOLEC.htm

Contact: JinXiong Zhang (Zhangjx@gxu.edu.cn)

Getting started
===================================================================
Unpack the downloaded archive into a directory of your choice.

Package structure and key files
===================================================================
 

.\bin\                           executable files and batch files
.\common\                        common (shared) files
.\bin\preparing_data.bat         generates all data files used by ICOLEC; follow the prompts
.\bin\identify_and_analyze.bat   identifies and analyzes protein complexes
.\bin\README.txt                 this file
...

Usage
===================================================================
The software suite runs in two stages:

1. Preparing data stage: run the preparing_data batch file once.
2. Identifying and evaluating stage: run the identify_and_analyze batch file as many times as needed, with different parameters, to obtain better results.

1. Preparing data stage
///////////////////////////////////////////////////////////////////  

Examples for running preparing_data:

preparing_data                  creates the default directory "yourdata" in the current directory.

or

preparing_data datadir          creates the given directory "datadir" in the current directory.

PPIs file format
####################################################################
After creating "yourdata" or "datadir", copy your PPIs file (e.g. your_PPIs.txt)
into that directory. Note that your_PPIs.txt must be formatted as follows.

your_PPIs.txt           one pair of systematic names per line, separated by a tab.

YKL171W YML096W
YFL017W-A YFR031C-A
...  
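
The format rule above can be checked before running preparing_data. The following is a minimal sketch in Python, not part of the ICOLEC suite: the function name parse_ppi_lines is hypothetical, and skipping blank lines is an assumption, since the README does not say how the batch file treats them.

```python
from typing import Iterable, List, Tuple


def parse_ppi_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (name_a, name_b) pairs; raise ValueError on a malformed line."""
    pairs = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split("\t")
        # Exactly two non-empty names separated by a single tab are expected.
        if len(fields) != 2 or not all(f.strip() for f in fields):
            raise ValueError(
                f"line {lineno}: expected two tab-separated names, got {line!r}"
            )
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


if __name__ == "__main__":
    sample = ["YKL171W\tYML096W\n", "YFL017W-A\tYFR031C-A\n"]
    for a, b in parse_ppi_lines(sample):
        print(a, b)
```

A line in which the two names are separated by spaces instead of a tab is rejected, which is a common way for such files to be damaged by copy and paste.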


Examples for running preparing_data with a PPIs file
###################################################################
With your PPIs file copied into the data directory, run the batch command with the directory name and the PPIs file name to generate all data files used by ICOLEC:

preparing_data yourdata your_PPIs.txt

or

preparing_data datadir your_PPIs.txt

After running preparing_data, the data directory (here "yourdata") contains the following files:

.\yourdata\your_PPIs.txt                             PPI input file
.\yourdata\your_PPIs_coded_and_scored.txt            PPI input file in which proteins are coded and interactions are scored
.\yourdata\your_PPIs_protein_ex.txt                  gene expression patterns file; an item is set to 1 when the expression pattern is lacking
.\yourdata\your_PPIs_protein_lo.txt                  protein localization data file; an item is set to 1 when the localization data are lacking
.\yourdata\your_PPIs_protein_lo_for_evaluation.txt   protein localization data file for evaluating the protein set in the PPI data set
.\yourdata\your_PPIs_protein_sim_cc.txt              upper-triangle protein similarity matrix based on CC (cellular component) terms
.\yourdata\your_PPIs_protein_sim_mf.txt              upper-triangle protein similarity matrix based on MF (molecular function) terms
.\yourdata\your_PPIs_protein_sim_bp.txt              upper-triangle protein similarity matrix based on BP (biological process) terms
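
Before moving on to the second stage, it can help to confirm that every file in the list above was actually generated. The sketch below is a hypothetical Python helper, not part of ICOLEC; it assumes the generated file names are the PPIs file name without its extension plus the suffixes shown in the list.

```python
from pathlib import Path
from typing import List

# Suffixes copied from the file list in this README.
SUFFIXES = [
    "_coded_and_scored.txt",
    "_protein_ex.txt",
    "_protein_lo.txt",
    "_protein_lo_for_evaluation.txt",
    "_protein_sim_cc.txt",
    "_protein_sim_mf.txt",
    "_protein_sim_bp.txt",
]


def expected_files(data_dir: str, ppi_file: str) -> List[Path]:
    """Paths that preparing_data is expected to leave in data_dir."""
    base = Path(data_dir)
    stem = Path(ppi_file).stem  # "your_PPIs.txt" -> "your_PPIs"
    return [base / ppi_file] + [base / f"{stem}{suffix}" for suffix in SUFFIXES]


def missing_files(data_dir: str, ppi_file: str) -> List[Path]:
    """Subset of the expected paths that do not exist on disk."""
    return [p for p in expected_files(data_dir, ppi_file) if not p.is_file()]


if __name__ == "__main__":
    for path in missing_files("yourdata", "your_PPIs.txt"):
        print("missing:", path)
```

An empty result suggests the preparing data stage completed; any printed path points at a file to regenerate.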

2. Identifying and evaluating stage
///////////////////////////////////////////////////////////////////  

Optional parameters
###################################################################  

-L   co_localization variable (L), 0--1, default: 1. 1 means co_localization="on", 0 means co_localization="off".
-r   r-reliable link threshold (r), 1--999, default: 999. Per data set: 990 for DIP; 999 for BioGrid and STRING; 1 for the three Y2H data sets.
-d   pcc_threshold delta (δ), 0--1.0, default: 0.3.
-c   CC_similarity_threshold cc_sim (σ), 0--1.0, default: 0.7. Per data set: 0.6 for DIP; 0.8 for Uetz; 0.7 for BioGrid, STRING, Ito, and Yu.
-f   MF_similarity_threshold mf_sim (ω), 0--1.0, default: 0.75. Per data set: 0.8 for DIP; 0.4 for Yu; 0.75 for BioGrid and STRING; 0.3 for Uetz and Ito.
-p   BP_similarity_threshold bp_sim (θ), 0--1.0, default: 0.3. Per data set: 0.1 for DIP; 0.2 for Uetz and Ito; 0.3 for BioGrid, STRING, and Yu.
-m   Density_attenuation_coefficient_threshold miu (μ), >0, default: 0.08. Per data set: 0.1 for BioGrid; 0.08 for STRING; 0.4 for DIP, Uetz, Ito, and Yu.
-u   Attachment_co_expression_significant_threshold p2delta (γ), 0--1.0, default: 0.01.
-e   Attachment_to_cluster_connectivity_threshold extra_link (η), 0--1.0, default: 0.9. Per data set: 0.7 for DIP and BioGrid; 0.9 for STRING; 0.6 for Uetz, Ito, and Yu.
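
The per data set values above are easier to reuse when collected in one table. The sketch below is a hypothetical Python lookup, not part of ICOLEC. It copies the values from the parameter list and assumes that "the three Y2H data sets" are Uetz, Ito, and Yu; parameters with no per data set value (-L, -d, -u) fall back to the general default.

```python
from typing import Dict

# General defaults, copied from the "Optional parameters" list.
DEFAULTS: Dict[str, float] = {
    "L": 1, "r": 999, "d": 0.3, "c": 0.7, "f": 0.75,
    "p": 0.3, "m": 0.08, "u": 0.01, "e": 0.9,
}

# Per data set values from the same list.
# Assumption: r = 1 applies to Uetz, Ito, and Yu ("the three Y2H data sets").
PER_DATASET: Dict[str, Dict[str, float]] = {
    "STRING":  {"r": 999, "c": 0.7, "f": 0.75, "p": 0.3, "m": 0.08, "e": 0.9},
    "BioGrid": {"r": 999, "c": 0.7, "f": 0.75, "p": 0.3, "m": 0.1,  "e": 0.7},
    "DIP":     {"r": 990, "c": 0.6, "f": 0.8,  "p": 0.1, "m": 0.4,  "e": 0.7},
    "Uetz":    {"r": 1,   "c": 0.8, "f": 0.3,  "p": 0.2, "m": 0.4,  "e": 0.6},
    "Ito":     {"r": 1,   "c": 0.7, "f": 0.3,  "p": 0.2, "m": 0.4,  "e": 0.6},
    "Yu":      {"r": 1,   "c": 0.7, "f": 0.4,  "p": 0.3, "m": 0.4,  "e": 0.6},
}


def params_for(dataset: str) -> Dict[str, float]:
    """General defaults overlaid with the values listed for one data set."""
    if dataset not in PER_DATASET:
        raise KeyError(f"no values listed for data set {dataset!r}")
    merged = dict(DEFAULTS)
    merged.update(PER_DATASET[dataset])
    return merged


if __name__ == "__main__":
    for name in PER_DATASET:
        print(name, params_for(name))
```

In this table the STRING row equals the general defaults, which matches the default values stated for each flag.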

Advanced examples
####################################################################
Set all optional parameters:

identify_and_analyze yourdata your_PPIs.txt -L 1 -r 999 -d 0.3 -c 0.7 -f 0.75 -p 0.3 -m 0.08 -u 0.01 -e 0.9

Set some optional parameters and leave the others at their defaults:

identify_and_analyze yourdata your_PPIs.txt -r 990 -c 0.6 -f 0.8 -p 0.1 -m 0.4 -e 0.7

Leave all optional parameters at their defaults:

identify_and_analyze STRING STRING_PPIs.txt       default optional parameters for the STRING data set
identify_and_analyze BioGrid BioGrid_PPIs.txt     default optional parameters for the BioGrid data set
identify_and_analyze DIP DIP_PPIs.txt             default optional parameters for the DIP data set
identify_and_analyze Uetz Uetz_PPIs.txt           default optional parameters for the Uetz data set
identify_and_analyze Ito Ito_PPIs.txt             default optional parameters for the Ito data set
identify_and_analyze Yu Yu_PPIs.txt               default optional parameters for the Yu data set
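
Because the second stage is meant to be run many times with different parameters, it may be convenient to assemble the command line programmatically and reject out-of-range values early. The sketch below is a hypothetical Python helper, not part of ICOLEC. The ranges are taken from the "Optional parameters" list; the helper only builds the argument list and does not run the batch file.

```python
from typing import List, Optional, Tuple

# Allowed ranges from the "Optional parameters" list; None means no upper bound.
RANGES: dict = {
    "L": (0, 1), "r": (1, 999), "d": (0.0, 1.0), "c": (0.0, 1.0),
    "f": (0.0, 1.0), "p": (0.0, 1.0), "m": (0.0, None),
    "u": (0.0, 1.0), "e": (0.0, 1.0),
}


def _in_range(flag: str, value: float) -> bool:
    bounds: Tuple[float, Optional[float]] = RANGES[flag]
    low, high = bounds
    if high is None:
        return value > low  # the list gives ">0" for -m, so 0 itself is excluded
    return low <= value <= high


def build_command(data_dir: str, ppi_file: str, **params: float) -> List[str]:
    """Argument list for identify_and_analyze; omitted flags keep their defaults."""
    cmd = ["identify_and_analyze", data_dir, ppi_file]
    for flag, value in params.items():
        if flag not in RANGES:
            raise ValueError(f"unknown option -{flag}")
        if not _in_range(flag, value):
            raise ValueError(f"value {value} is out of range for -{flag}")
        cmd += [f"-{flag}", str(value)]
    return cmd


if __name__ == "__main__":
    # Same options as the second example above.
    print(" ".join(build_command("yourdata", "your_PPIs.txt",
                                 r=990, c=0.6, f=0.8, p=0.1, m=0.4, e=0.7)))
```

Keeping the command as a list makes it straightforward to loop over several parameter combinations and log exactly what was run.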

Results
===================================================================
The identified complexes and their evaluation metrics are written to the following files in the subdirectory "complexes" of the data directory:

.\datadir\complexes\your_PPIs_complexes.txt           one identified complex per line, members separated by tabs
.\datadir\complexes\your_PPIs_complexes_measure.txt   one evaluation metric per line
.\datadir\complexes\your_PPIs_complexes_matched.txt   matching analysis of identified complexes against known complexes
.\datadir\complexes\your_PPIs__complexes_PM.txt       number of perfectly matched complexes for each complex size
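
The complexes file can be loaded for further analysis with a few lines of code. The sketch below is a hypothetical Python reader, not part of ICOLEC. It assumes each line holds only the member names of one complex, separated by tabs, with no identifier or score column; the README does not specify more than that. The sample grouping in the demo is made up for illustration.

```python
from collections import Counter
from typing import Iterable, List


def parse_complexes(lines: Iterable[str]) -> List[List[str]]:
    """One list of member names per non-empty line."""
    complexes = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Assumption: every tab-separated field is a protein name.
        complexes.append([name for name in line.split("\t") if name])
    return complexes


def size_histogram(complexes: List[List[str]]) -> Counter:
    """Number of complexes for each complex size."""
    return Counter(len(members) for members in complexes)


if __name__ == "__main__":
    # Made-up grouping, reusing names from the PPIs example in this README.
    sample = ["YKL171W\tYML096W\tYFL017W-A\n", "YFR031C-A\tYML096W\n"]
    parsed = parse_complexes(sample)
    for size, count in sorted(size_histogram(parsed).items()):
        print(f"size {size}: {count} complex(es)")
```

To read a real result file, pass an open file object, for example `parse_complexes(open(path, encoding="utf-8"))`, where `path` points at your_PPIs_complexes.txt.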

Availability
===================================================================
Software suite
Currently, only a Windows command-line version of the ICOLEC software suite is provided.

The PPIs file, the data files used by ICOLEC, and the complexes identified by ICOLEC with their evaluation metrics are available for:
STRING
BioGrid
DIP
Ito
Uetz
Yu

Additional files
///////////////////////////////////////////////////////////////////
Additional_file_1: BP term based enrichment analysis of the complexes identified by ICOLEC, together with statistics on the significant complexes.
Additional_file_2: CC term based enrichment analysis of six example complexes identified by ICOLEC.

Attachments:
BioGrid.rar
DIP.rar
Ito.rar
STRING.rar
Uetz.rar
Yu.rar
Additional_file_1.rar
Additional_file_2.rar
ICOLEC.rar