AssertJ is currently one of the most popular Java projects on GitHub (that are Maven-buildable). It is a relatively large system, comprising on the order of tens of thousands of lines of code and over 500 classes (you’ll calculate the precise metrics throughout the course of this project).
1.1 Setup
As with the other projects, we will use GitHub Classroom for this project. Please clone your personal version of this repository here:
https://classroom.github.com/a/_4zauxdP
We have created a base repository for you, which contains the following familiar directory structure:
· analysisCode: This is where you keep the code that you use to analyse the system. As a starting point, we have included the Java, R, and Bash scripts that have been developed in the labs and in project 1. You are free (and expected) to edit and improve them to suit your needs, in any way that you wish, as long as this does not involve any additional third-party code libraries or tools.
· assertj-core: This contains a snapshot of the assertj-core repository, the full subject system that you are expected to use as the basis for your analysis. It does not contain any git data. If you wish to analyse the commit history of this repository (as we did in our lab session), you will have to clone a separate copy on your local machine and check out the specific commit we used (9d45e93). To do this, run:
    git clone https://github.com/joel-costigliola/assertj-core.git
    cd assertj-core
    git checkout 9d45e93
· dataFiles: This is where you can store any data files that are produced (e.g. CSV files with metrics, data traces, images, etc.).
· notesAndReports: This is where you are expected to store your final report, along with any other notes that you use in its production.
2 Instructions
For this project, the goal is to put all of the skills that we have learnt throughout the course into practice.
Put yourself into the position you’ll hopefully find yourself in a few months from now. You are a graduate employee, put in charge of re-engineering a large system (in this case AssertJ).
Everything in your repository will count as your submission. The main component will be the write-up, which will provide the necessary guidelines.
The detailed submission instructions (along with guidelines on how much detail is expected in the report) are provided below. Each exercise is associated with a section title you are expected to use in the report, along with a guideline for how much detail is required and to what extent it contributes to the final mark.
Five sections are to be completed:
1. Investigate the system without using any tools, to gain a broad understanding of its function, its key components, and architectural features. In doing so, adopt the relevant re-engineering patterns, such as “Read the Code in One Hour” or “Skim Documentation”.
In your write-up, list every pattern that you used, how you applied it in the context of this project, and what it enabled you to learn about the system. This latter point is especially important. Try to highlight genuinely insightful observations. Include any notes taken whilst applying the patterns in the notesAndReports directory.
Submission: “Initial Exploration”, 1.5 – 3 pages of the report.
Assessment of this section contributes to 15% of the final mark.
2. Carry out some static and dynamic analysis to calculate metrics that can give you an insight into potential weaknesses (or strengths) within the system. For this you should:
· Calculate the LOC and the Weighted Method Count for each class, and calculate the LOC (i.e. the number of nodes in the CFG) and the Cyclomatic Complexity for each method.
· For particularly large methods (if there are any), you may wish to employ slice-based metrics to gain an understanding, for example, of how cohesive the methods are.
· Analyse the repository log to identify classes that have been particularly change-prone throughout the development of the project.
· Use dynamic analysis to highlight any particularly important classes that contribute to the system when it executes.
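As a reminder of how the method-level metrics relate to the CFG, here is a minimal sketch using only the JDK. It assumes each method's CFG is available as a map from a node to its successors and has a single entry and exit, so that cyclomatic complexity is edges minus nodes plus 2. The class name ComplexitySketch, the map representation, and the choice to weight WMC by cyclomatic complexity are all illustrative assumptions; the scripts in analysisCode may represent and weight things differently.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the provided analysisCode scripts.
public class ComplexitySketch {

    /** Collects every node of the CFG, including nodes that only appear as successors. */
    static Set<String> nodes(Map<String, List<String>> cfg) {
        Set<String> all = new HashSet<>(cfg.keySet());
        cfg.values().forEach(all::addAll);
        return all;
    }

    /** Method LOC as defined in this project: the number of nodes in the CFG. */
    static int methodLoc(Map<String, List<String>> cfg) {
        return nodes(cfg).size();
    }

    /** Cyclomatic complexity E - N + 2, assuming one connected CFG per method. */
    static int cyclomaticComplexity(Map<String, List<String>> cfg) {
        int edges = cfg.values().stream().mapToInt(List::size).sum();
        return edges - nodes(cfg).size() + 2;
    }

    /** WMC under the assumption that each method is weighted by its cyclomatic complexity. */
    static int weightedMethodCount(List<Integer> methodComplexities) {
        return methodComplexities.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // CFG of a method containing a single if/else statement.
        Map<String, List<String>> cfg = Map.of(
                "entry", List.of("cond"),
                "cond", List.of("then", "else"),
                "then", List.of("exit"),
                "else", List.of("exit"),
                "exit", List.of());
        System.out.println(methodLoc(cfg));            // prints 5
        System.out.println(cyclomaticComplexity(cfg)); // prints 2
    }
}
```

A single if/else gives 5 nodes and 5 edges, hence a complexity of 2, which matches the intuition of "one decision point plus one".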
In your write-up, detail what you did and how you went about it. If you wrote your own code to carry out some form of analysis (on top of what is already in the code base), provide some details. Although you are not expected to include complete tables of results (these would be too large), you are expected to focus on specific sub-groups, e.g. the top 10 results for a given metric. Do not include visualisations or charts yet. The purpose of this part is to focus on the methods you used, and to highlight individual classes or methods that appear to be notable.
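One possible way to obtain a "top 10" table for change-proneness is sketched below, again using only the JDK. It assumes the input is the output of git log --name-only --pretty=format: (one changed file path per line, with blank lines between commits) and uses .java file paths as a proxy for classes. The class name ChangeProneness and its methods are hypothetical; renamed or moved files are counted under each path separately, which you may want to handle differently.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the provided analysisCode scripts.
public class ChangeProneness {

    /** Counts how many log lines mention each .java file (one line per commit that touched it). */
    static Map<String, Integer> countChanges(List<String> logLines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : logLines) {
            String path = line.trim();
            if (path.endsWith(".java")) {
                counts.merge(path, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns the n most frequently changed files, ties broken alphabetically by path. */
    static List<Map.Entry<String, Integer>> topN(Map<String, Integer> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Reads the git log output from standard input and prints CSV rows.
        List<String> lines = new BufferedReader(new InputStreamReader(System.in))
                .lines().collect(Collectors.toList());
        for (Map.Entry<String, Integer> e : topN(countChanges(lines), 10)) {
            System.out.println(e.getKey() + "," + e.getValue());
        }
    }
}
```

Run from inside a full clone of assertj-core (not the snapshot, which has no git data), this could be used along the lines of: git log --name-only --pretty=format: | java ChangeProneness.java, assuming Java 11 or later for single-file launching. The CSV output can be redirected into the dataFiles directory.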
Commit any scripts or code you used to the repository, and ensure that this is described in the report. If you commit code to the repository and do not mention it, then it will probably not be counted towards your project.
Submission: “Metrics and Analysis”, 4 – 7 pages, including at most 2.5 pages of tables.
Assessment of this section contributes to 25% of the final mark.
3. Use visualisation to give you the big picture:
· Use charts created in Excel or R to visualise metrics, or relationships between metrics, to point you towards key problematic areas within the system.
· Visualise the class structure.
· Use visualisation to home-in on any code duplication in the system.
Given that there are over 500 classes in the system, you’ll need to use some ingenuity to ensure that the results are readable when it comes to class diagrams or duplication plots. For example, you could think of leaving out classes that are never executed or have a metric value below a given threshold. In the class diagram you could also use metrics to visually accentuate the most relevant / important classes (and to correspondingly reduce the prominence of less important classes). In your write-up, provide any visualisations you pro