Methodology:

Identification of differentially expressed genes

We retrieved the raw expression data from the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo), entry GSE21422 (3), which contains 19 breast tumor samples in two tumor stages.

Relevant functional attributes in the disease condition

Microarrays measure the relative abundance of mRNA transcripts, and the proteins translated from these transcripts are likely to be differentially present in diseased tissue. However, the extent of differential protein concentration under the disease condition is difficult to estimate because of the heterogeneity of cells in the tumor sample. Therefore, to search for genes associated with breast cancer, we considered a Boolean combination of seven functional attributes, on the assumption that the causative effects are combinatorial and non-linear rather than additive. These comprise six protein functional attributes, namely tissue specificity (TS), transcription factors (TFs), post-translational modifications (PTMs), protein kinases (PKs), nuclear receptors (NRs) and secreted proteins (SPs), along with the gene attribute of methylation (METH), each compared between cancer-associated and non-cancer-associated genes.

Data integration

Weighting schema for Boolean-based probability calculation

We used the phi correlation (r_φ) as a measure of association between the functional attributes of the cancer-associated genes. It is a powerful measure of the strength of association between two categorical variables with binary values. Computationally, it is related to the chi-square (χ²) value:

$$ r_\phi = \sqrt{\frac{\chi^2}{N}} $$

where N is the total number of genes.

Scoring schema on the weighted functional attributes for ranking genes

We used the Boolean algorithm proposed by Nagaraj and Reverter (2) to rank the differentially expressed genes in breast cancer samples, with our own set of Boolean variables representing the relevant functional attributes in the disease condition. The particular combination across the seven Boolean variables (i.e. functional attributes) for a given differentially or non-differentially expressed gene was decomposed into its roots. For example, if a given gene has four known functional attributes, then 2^4 Boolean states exist, containing (2^4 − 1) roots, i.e. all possible combinations of Boolean states at the positions of the known functional attributes, excluding the state in which all values are zero. The probability of each root is simply the average of the weights associated with the known functional attributes, calculated via r_φ. These root probabilities are then used to rank the differentially and non-differentially expressed genes by summing the probability values associated with the individual roots.
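
The weighting and scoring steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`phi_coefficient`, `enumerate_roots`, `gene_score`) and the example weights are hypothetical. The sketch also assumes one reading of the text: that the probability of a root is the mean of the weights of the attributes switched on in that root.

```python
from itertools import product
from math import sqrt


def phi_coefficient(x, y):
    """Phi correlation between two equal-length binary (0/1) vectors.

    Computed from the 2x2 contingency table; its magnitude equals
    sqrt(chi2 / N), where N is the number of observations (genes).
    """
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant vector gives an undefined phi; return 0.0 by convention here.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


def enumerate_roots(known_attributes):
    """All non-zero Boolean states over the known attributes (2^k - 1 roots)."""
    k = len(known_attributes)
    roots = []
    for state in product([0, 1], repeat=k):
        if any(state):  # exclude the all-zero state
            roots.append(dict(zip(known_attributes, state)))
    return roots


def gene_score(known_attributes, weights):
    """Sum of root probabilities for one gene.

    Assumption: a root's probability is the mean weight of the attributes
    set to 1 in that root. `weights` maps attribute name -> weight
    (hypothetical values standing in for the phi-derived weights).
    """
    total = 0.0
    for root in enumerate_roots(known_attributes):
        active = [weights[name] for name, bit in root.items() if bit == 1]
        total += sum(active) / len(active)
    return total


if __name__ == "__main__":
    # Hypothetical weights for two attributes; real weights would come from r_phi.
    example_weights = {"TF": 0.2, "PK": 0.4}
    print(gene_score(["TF", "PK"], example_weights))
```

Under these assumptions, a gene with four known attributes yields 15 roots, matching the (2^4 − 1) count in the text. Genes would then be ranked by sorting on the value returned by `gene_score`.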