Empirical Software Engineering

The Empirical Software Engineering research group takes an approach driven by industrial data analysis, using statistics and other analysis methods to address practical problems. Its work covers estimation techniques for software-intensive systems and software engineering, software project measurement, software quality management, and knowledge science. Its current research hotspots include:
Quantitative software process management model
Empirical methods for software process management
Reuse-based software project cost estimation
Uncertainty in software cost estimation
Software defect estimation
Knowledge management and knowledge discovery

According to these research focuses, the group is divided into the following three subgroups:

Software Cost Estimation

This research aims to extend and improve current theories, methodologies, and tools for software cost estimation, and to promote software cost estimation practice in China. First, we study the cost estimation of reuse-based software projects on the basis of abundant domestic and international historical project data and research. We analyze the distinctions and disadvantages of different estimation models, and construct systematic reuse-oriented cost estimation models with corresponding self-configuration rules to guide estimation and decision support for reuse-based projects. Second, because uncertainty is inherent in software cost estimation and is one of its main challenges, we study that uncertainty by modeling software cost and its certainty, and by estimating the probability distribution of cost from incomplete or uncertain information. In addition, we develop software cost estimation tools to validate and apply our research. Current research topics include:
Reuse-based software project cost estimation
Software cost and its certainty estimation model
Software cost estimation tools
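
As a rough illustration of what estimating a probability distribution of cost from uncertain inputs can look like, the sketch below runs a generic Monte Carlo simulation. It is not the group's model: the three-point effort figures (in person-days), the use of triangular distributions, and the function names are all hypothetical assumptions chosen for the example.

```python
import math
import random


def simulate_total_effort(components, runs=10_000, seed=0):
    """Sample total effort for a project.

    components: list of (low, mode, high) three-point estimates, one per
    work item. Each item is drawn from a triangular distribution, which is
    an assumption of this sketch, not a claim about real project data.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    totals = []
    for _ in range(runs):
        # random.triangular takes (low, high, mode), in that order
        total = sum(rng.triangular(low, high, mode)
                    for (low, mode, high) in components)
        totals.append(total)
    return totals


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list (0 <= p <= 100)."""
    rank = math.ceil(p / 100 * len(sorted_values)) - 1
    index = max(0, min(len(sorted_values) - 1, rank))
    return sorted_values[index]


if __name__ == "__main__":
    # Hypothetical work items: (optimistic, most likely, pessimistic)
    work_items = [
        (10, 20, 40),  # newly developed module
        (5, 8, 15),    # reused module needing adaptation
    ]
    totals = sorted(simulate_total_effort(work_items))
    print("median effort:", round(percentile(totals, 50), 1))
    print("90th percentile:", round(percentile(totals, 90), 1))
```

Reporting a median together with a high percentile, rather than a single point estimate, is one simple way to express how uncertain the estimate is.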

Trustworthy Process Measurement and Management

The subgroup studies how to support the development of trustworthy software through quantitative analysis and management of the software process and its work products. Its research focuses on software measurement, software process measurement, assessment and management of trustworthy software processes, and defect-based software trustworthiness assessment and management. The research is data-driven and based on statistical and machine learning approaches. The subgroup investigates the measurement, quantitative analysis, and management of the properties of trustworthy software processes. It presents a multi-level (process, project, and person) and multi-dimensional (size, quality, effort, productivity, and schedule) quantitative analysis framework, consisting of metrics and corresponding assessment methods, that supports the management and assessment of processes. In addition, because software defects are the main factor that decreases the trustworthiness of software, the subgroup studies software defect prediction, defect-related effort estimation, and the analysis of defect descriptions in order to assess and manage the trustworthiness of products. The current main research topics of the subgroup include:
Measurement model of the trustworthiness of software process
Assessment methods of the trustworthiness of software process
Software defect prediction
Automatic bug triaging
Bug fixing effort estimation

Software Repository

This rapidly growing interdisciplinary field merges software engineering, statistics, data mining, and other disciplines in order to extract useful knowledge from software repositories and advance research in empirical software engineering. We are mainly interested in constructing and mining software repositories. Knowledge management techniques, such as ontologies and semantics, are employed to construct software repositories. Knowledge discovery techniques, such as data mining and text mining, are employed to extract patterns from the data in the repositories.
Our perspective on empirical software engineering research centers on how to manage data and experience in software engineering, and how to make use of historical data and experience to gain new insights and improve the productivity of software development. The current main research topics of the subgroup include:
Software Repository
Knowledge discovery from software repositories
Mining software effort data, mailing lists, etc.

Team Members

Faculty and Staff

Prof. Mingshu Li, Prof. Qing Wang, Prof. Yongji Wang, Ye Yang,
Wen Zhang, Da Yang, Yanbin Liu

Ph.D. Students

Jing Du, Jie Hu, Jia Chen, Zhimin He,
Dandan Wang

Master's Students

Yueming Sun, Wenpei Liu, Lihua Cao, Xu Wang,
Xihao Xie, Rongbo Qi, Ran Liang

Openings for visiting students...