Identifying Architectural Decay

Architecture tends to decay, leading to defects or to architectural elements that resist maintenance. To address this problem, we construct novel models that predict the quality of an architectural element by using multiple architectural views (both structural and semantic) and architectural metrics as features for prediction. Our findings show that we can predict low architectural quality, i.e., architectural decay, with high performance, even for cases of decay that occur suddenly in an architectural module. We further report the factors that best predict architectural quality.

Prediction Model Construction

Figure 1 gives an overview of our approach for predicting architectural quality. The approach begins with three artifacts: a set of source files, a version control repository, and the architectural modules that an Architectural Model Extractor identifies from the source files. Given those three artifacts, four Metrics Extractors (Lifted File-Level Extractor, Architectural Co-Change Extractor, Architectural Smell Extractor, and Architectural Dependency Extractor) compute 19 metrics that are used as independent variables for a stepwise regression analysis. A user selects one of 6 architectural-quality metrics to predict, and the selected metric serves as the dependent variable input to the stepwise regression analysis. The result of the regression analysis is a prediction model for the selected quality metric. Each prediction model produced by our approach uses the independent variables of release k of system s to predict the selected architectural-quality metric for release k + 1 of system s.
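
To make the release k to release k + 1 setup concrete, here is a minimal sketch of a stepwise regression in Python with NumPy. It is an illustration under stated assumptions, not the implementation used to produce the results in this repository. The description above does not name the stepwise criterion, so the sketch assumes forward-only selection driven by AIC. The data is synthetic, and the metric names (metric_0 to metric_4) and function names are hypothetical.

```python
import numpy as np


def fit_ols(X, y, cols):
    """Least-squares fit of y on an intercept plus the chosen columns of X.

    Returns (coefficients, residual sum of squares).
    """
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)


def aic(rss, n, n_params):
    """Gaussian AIC up to an additive constant (assumed selection criterion)."""
    return n * np.log(rss / n) + 2 * n_params


def forward_stepwise(X, y, names):
    """Greedy forward selection: repeatedly add the metric that lowers AIC most."""
    n = len(y)
    selected = []
    _, rss = fit_ols(X, y, selected)          # intercept-only baseline
    best = aic(rss, n, 1)
    while len(selected) < X.shape[1]:
        trials = []
        for j in range(X.shape[1]):
            if j not in selected:
                _, rss = fit_ols(X, y, selected + [j])
                trials.append((aic(rss, n, len(selected) + 2), j))
        score, j = min(trials)
        if score >= best:                     # no candidate improves the model
            break
        best, selected = score, selected + [j]
    coef, _ = fit_ols(X, y, selected)
    model = {"intercept": float(coef[0])}
    model.update({names[j]: float(c) for j, c in zip(selected, coef[1:])})
    return model


# Synthetic stand-in data: rows are architectural modules, columns are
# metrics measured at release k; y_k1 is a quality metric at release k + 1.
rng = np.random.default_rng(0)
names = [f"metric_{i}" for i in range(5)]     # hypothetical metric names
X_k = rng.normal(size=(200, 5))
y_k1 = 2.0 * X_k[:, 0] - 3.0 * X_k[:, 3] + rng.normal(scale=0.1, size=200)

model = forward_stepwise(X_k, y_k1, names)
```

Here model maps each retained metric to its fitted coefficient. With real data, X_k would hold the 19 extracted metrics for each module at release k, and y_k1 the selected quality metric for the same modules at release k + 1. A full stepwise procedure may also drop previously added variables; that backward step is omitted here for brevity.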

Here are the results of the regression models for five projects: Camel, Cassandra, HBase, Hive, and OpenJPA. Each folder contains a Data.txt file, which is the input to the regression models (the values of all metrics). Each folder also contains a separate file for each independent variable, showing the results of the regression model for that independent variable.