Segla Kpodjedo

Ph.D., Lecturer, Research Associate @ Ecole Polytechnique de Montreal

I'm in the Program Committee of the

- 4th Intl. Workshop on Emerging Trends in Software Metrics (WETSoM'13), a satellite workshop of ICSE
- ERA (Early Research Achievement) Track of the 29th IEEE Intl. Conf. of Softw. Maintenance (ICSM'13)



Please consider submitting a paper to these venues.

Research Statement

I envisage carrying out research activities on problems relevant to Search-Based Software Engineering (SBSE). I believe that, in terms of costs, performance and new insights, there are benefits in reformulating many software engineering problems as optimization problems. Consistent with my Ph.D. work, I expect my contributions to be primarily in SBSE, and in particular the analysis of software evolution to improve software quality.

Current Research Activity

My Ph.D. work is mainly on software evolution and investigates the usefulness of graph matching techniques as efficient ways of retrieving the evolution through time of software artifacts. It involves three distinct but related aspects. First, in order to match artifacts efficiently, I proposed and implemented a tabu search technique for approximate graph matching problems. Second, I modeled software artifacts, such as class diagrams, as directed graphs and applied the developed algorithm to match subsequent diagrams. Third, based on the obtained matchings, I proposed new design evolution metrics and assessed their usefulness in defect prediction models. The following paragraphs detail each of those three aspects.

Approximate Graph Matching

Many practical problems can be modeled as approximate graph matching problems in which the goal is to find a “good” matching between two objects represented as graphs. Existing literature is centered on image processing and biochemistry communities with very specific graph matching problems, benchmarks, and algorithms. As a result, researchers and practitioners from other fields, such as software engineering, may not find tools or algorithms suitable for their own specific graph matching problem.  Moreover, given that most graph instances in image processing and biochemistry literature are very small (less than 100 nodes), the proposed algorithms have scalability issues.

To address this situation, I tackled in a generic way, the approximate graph matching problem. My approach avoids as much as possible assumptions about graphs to be matched and tries to make the best out of basic graph features such as node connectivity and edge types. Consequently, my proposal is a local search technique using new node similarity measures derived from simple structural information.

The proposed technique was devised as follows. First, I observed and empirically validated that initializing a local search with a very small subset of “correct” node matches is enough to get excellent results [3]. Instead of directly trying to correctly match all nodes and edges of a given graph to the nodes and edges of another graph, one could focus on correctly matching a reduced subset of nodes. Second, in order to retrieve such subsets, I resorted to the concept of local node similarity. My approach consists in assessing, by analyzing their neighborhoods, how likely it is to have a pair of nodes included in a good matching. I investigated many ways of computing similarity values between pairs of nodes and proposed additional techniques to attach a level of confidence to any computed similarity value. My work results in a similarity enhanced tabu algorithm [5, 0] which advances the state-of-art.

Software Evolution through the Matching of High Level artifacts

Long-lived Object Oriented (OO) systems often possess a rich history of changes and design decisions that should be used to gain actionable knowledge. However, given the size and complexity of OO systems, retrieving and understanding the history of the design evolution is a difficult task which requires appropriate techniques.

In my approach, a design representation, usually the class diagram, is first recovered from the code and modeled as a directed graph; then my graph matching algorithm is applied to gain insight on its evolution through time. My key difference is that, unlike the other approaches which are solely dedicated to a specific representation, my generic graph matching algorithm can be adapted to study different artifacts - provided that they can be represented as directed graphs - and different problems (by changing the optimization criterion). I tested my approach on the evolution of class diagrams [2, 6, 7] and empirically proved it to be both efficient and scalable. Ongoing work involves other artifacts such as state and activity diagrams.

Design Evolution Metrics for Defect Prediction

Testing is the most widely adopted practice to ensure software quality. However, this activity is often a compromise between the available resources and sought software quality. In object-oriented development, testing effort should be focused on defect-prone classes or alternatively on classes deemed critical based on criteria such as their connectivity or evolution profile [10, 11]. Unfortunately, the identification of defect-prone classes is a challenging and difficult activity on which many metrics, techniques, and models have been tried with mixed success.

Following the retrieval of class diagrams’ evolution by my graph matching approach, I proposed and investigated the usefulness of elementary design evolution metrics in the identification of defective classes [1]. The metrics include the numbers of added, deleted, and modified attributes, methods, and relations. They are used to recommend a ranked list of classes likely to contain defects for a system. I evaluated the efficiency of our approach according to three criteria: presence of defects, number of defects, and defect density in the top-ranked classes. I conducted experiments with small to large systems and made comparisons against known metrics (31 complexity metrics and the Chidamber and Kemerer’s metrics). Results show that the design evolution metrics, when used in conjunction with known metrics, improve the identification of defective classes. In addition, they provide evidence that design evolution metrics make significantly better predictions of defect density than other metrics and, thus, can help in reducing the testing effort by focusing test activity on a reduced volume of code [1].

My Research Plan

My research plan is in line with the work I’m pursuing for my Ph.D. and will be centered on software evolution, software quality, and search-based software engineering. In order to broaden the range of problems I will be able to tackle, I foresee extensions to the graph matching tool I have been developing. The current version of my algorithm abides to the one-to-one constraint (one node is matched to at most one node) which is enough for most matching tasks but may fall short in cases where nodes are expected to split or merge. To address that issue, I’m considering the addition of new and more complex graph edit operations able to deal with cases where a group of nodes can be matched with another group of nodes (many-to-many matching).

I also intend to conduct experiments destined to retrieve the evolution of different software artifacts (such as sequence diagrams or state machines) and investigate how the extracted information can help maintainers and managers. Moreover, I’m interested in matching artifacts of different kinds and believe that extensions to my graph matching approach have the potential to bring significant contributions in software traceability.

Additionally, the investigation of metrics, techniques and models which can be useful in the prediction of software quality is definitely part of my agenda. For instance, I would like to pursue my work on Design Evolution Metrics [1] and contribute to the efforts in identifying “micro-architectures” with desirable or unwanted properties [4]. Along related lines, software testing is a research area I am willing to explore, given my teaching experience in the field. Problems such as automatic test generation and optimal integration test orders look particularly interesting to me.

 

Finally, I believe that my advanced knowledge of meta-heuristics and graph theory could result in fruitful collaborations with future colleagues and contribute to the outreach of my institution.

Publications

Under Review

0.       Kpodjedo, S., Galinier, P., Antoniol, G., “Using Local Similarity Measures to Efficiently Address Approximate Graph Matching”, submitted to Discrete Applied Mathematics, 37 pages, October 2010

Journal papers

1.       Kpodjedo, S., Ricca, F., Galinier, P., Antoniol, G., Gueheneuc, Y.-G., “Design Evolution Metrics for Defect Prediction in Object Oriented Systems”, Empirical Software Engineering Vol. 16, 1 pp 141-175 (2011)

2.       Kpodjedo, S., Ricca, F., Galinier, P., Antoniol, G., Gueheneuc, Y-G., “Studying Software Evolution of Large Object-Oriented Software Systems Using an ETGM Algorithm”, Journal of Software Maintenance and Evolution, http://dx.doi.org/10.1002/smr.519, 32 pages, September 2010

3.       Kpodjedo, S., Galinier, P., Antoniol, G., “On the Use of Similarity Metrics for Approximate Graph Matching”, Electronic Notes in Discrete Mathematics Vol. 36, 1 pp. 687-694, August 2010.

Peer-reviewed Conference papers

4.        Belderrar, A., Kpodjedo, S., Guéhéneuc, Y.-G., Antoniol, G., Galinier, P., “Sub-Graph Mining: Identifying Micro-Architectures in Evolving Object Oriented Softwares”, Proceedings of the 15th to appear in Eur. Conf. on Software Maintenance and Reengineering (CSMR2011), March 1-4, 2011, Oldenburg, Germany.

5.       Kpodjedo, S., Galinier, P., Antoniol, G., “Enhancing a Tabu Algorithm for Approximate Graph Matching by Using Similarity Measures”, Proceedings of the 10th Eur. Conf. on Evolutionary Computation in Combinatorial Optimization (EvoCOP2010), April 7–9 2010, Istanbul, Turkey, pp. 119-130.

6.       Kpodjedo, S., Ricca, F., Antoniol, G., Galinier, P., “Recovering the Evolution Stable Part Using an ECGM Algorithm: Is There a Tunnel in Mozilla?”, Proceedings of the 13th Eur. Conf. on Software Maintenance and Reengineering (CSMR2009), March 24-27, 2009, Fraunhofer IESE, Kaiserslautern, Germany, pp. 179-188.

7.       Kpodjedo, S., Ricca, F., Antoniol, G., Galinier, P., “Error Correcting Graph Matching Application to Software Evolution”, Proceedings of the 15th Working Conference on Reverse Engineering (WCRE2008), October 15-18, 2008, Antwerp, Belgium, pp. 289-293.

8.       Kpodjedo, S., Pourzandi, M., Pierre, S., “Reputation Based Trust Management Using TCG in Mobile Ad-Hoc Networks (RTA)”, Proceedings of the 33rd IEEE Conference on Local Computer Networks (LCN2008), October 14-17, Montreal, Quebec, Canada, pp. 518-519

Workshop papers

9.       Kpodjedo, S.,  “Approximate Graph Matching in Software Engineering”, Proceedings of the 16th  Working Conference on Reverse Engineering (WCRE2009), October 13-16, 2008, Lille, France, Doctoral Symposium,  pp. 295-298

10.   Kpodjedo, S., Ricca, F., Antoniol, G., Galinier, P., “Evolution and Search Based Metrics to Improve Defects Prediction”, Proceedings of the 1st International Symposium on Search Based Software Engineering (SSBSE2009), May 13-15, 2009, Cumberland Lodge, Windsor, UK, pp. 23 - 32.

11.   Kpodjedo, S., Ricca, F., Antoniol, G., Galinier, P., “Not all classes are created equal: toward a recommendation system for focusing testing”, Proceedings of the 1st International Workshop on Recommendation Systems for Software Engineering (RSSE2008) co-located with ACM SIGSOFT 2008 / FSE 16, November 10, 2008, Atlanta, Georgia, USA, pp. 6-10