diff --git a/README.md b/README.md
index 4243a65..c074f16 100644
--- a/README.md
+++ b/README.md
@@ -1,56 +1,27 @@
# Ingham Medical Physics Coding Challenge
-## Welcome
+## Janet Cui workload
-Welcome to the Ingham/UNSW Medical Physics Coding Challenge for September/October 2021. Thank you for your interest in joining our team. Before we progress with the hiring process, we'd like to ask you to complete this challenge. This is a chance for you to showcase your skills to us, there is no correct or incorrect solution to this challenge.
+## 1. Create the dash-dev environment
+conda create -n dash-dev python=3.7 -y
-## Task
+## 2. Activate the dash-dev environment
+conda activate dash-dev
-We've included some data for approximately 200 patients treated for Head and Neck Cancer extracted from the publicly available [HNSCC dataset](https://wiki.cancerimagingarchive.net/display/Public/HNSCC). This is a comprehensive dataset with a number of different attributes typically tracked for patients receiving cancer treatment.
+## 3. Change into the "src" directory (the Python file must be run from there)
+cd ingham-medphys-coding
+cd src
-Your task is to implement a web-based dashboard/GUI which can visualise some of this data. It's up to you what you would like to display and what kind of functionality you want to provide to a user to explore the data. You may also use the programming language of your choice and any libraries/frameworks you like. See below for a few tips in case you need some help getting started.
+## 4. Install dash, dash-bootstrap-components and pandas
+pip install dash -U
+pip install dash-bootstrap-components
+pip install pandas
-> #### Important: The are many different attributes in the dataset, we do not expect your dashboard to deal with and visualise all of them. Pick out some key attributes which you think would be most interesting to display.
+## 5. Run the Python file
+python3 challenge_september_2021.py
-## Submission
+## 6. Open the URL shown in the terminal (copy and paste it into a browser such as Google Chrome)
+e.g., "Dash is running on http://127.0.0.1:8050/"
-Since we commonly use GitHub for collaborating on open source code, ideally you will fork this repository and add your code directly in there. Best to follow up with a GitHub Pull Request back into our repo so we can see your code. This would require making your submission public, if you would prefer to keep your code private, please add it to a private GitHub repository and add GitHub user `pchlap` with read permissions.
-
-In addition, it would be great if you can make your dashboard really easy to run/access. Think about how you might best package your tool and include some instructions in your submission on how to run your dashboard.
-
-> ### Please submit by 10am on Tuesday 5th October 2021.
-
-## Tips
-
-Here are a few tips to help you get started with the challenge:
-
-1. Python is commonly used in research thanks to its ease of use. [Dash](https://dash.plotly.com/) is one of many Python libraries which can be used to easily create web-based dashboards.
-
-2. See `challenge_september_2021.py` for some code on getting started using Python and Dash. Use the following commands in your Python environment to run the example:
-
-```bash
-pip install -r requirements.txt
-python challenge_september_2021.py
-```
-
-3. At diagnosis, the patient's disease is assigned a stage (I, II, III, IVA, IVB). Typically one can expect poorer overall survival the higher the stage at diagnosis. A good starting point for the dashboard could be to visualise the overall survival based on the stage. Additionally you could allow the user to filter by gender or an age bracket.
-
-4. Web applications are often deployed using Docker. Perhaps you might like to build or even deploy ([Heroku](https://www.heroku.com/) has a free plan) a Docker image which runs your dashboard.
-
-## Data
-
-You can find the clinical data for this challenge in the `data/hnscc.csv` file.
-
-This data was extracted from the HNSCC dataset: https://wiki.cancerimagingarchive.net/display/Public/HNSCC:
-- Grossberg A, Elhalawani H, Mohamed A, Mulder S, Williams B, White AL, Zafereo J, Wong AJ, Berends JE, AboHashem S, Aymard JM, Kanwar A, Perni S, Rock CD, Chamchod S, Kantor M, Browne T, Hutcheson K, Gunn GB, Frank SJ, Rosenthal DI, Garden AS, Fuller CD, M.D. Anderson Cancer Center Head and Neck Quantitative Imaging Working Group. (2020) HNSCC [ Dataset ]. The Cancer Imaging Archive. DOI: https://doi.org/10.7937/k9/tcia.2020.a8sh-7363
-
-- Grossberg A, Mohamed A, Elhalawani H, Bennett W, Smith K, Nolan T, Williams B, Chamchod S, Heukelom J, Kantor M, Browne T, Hutcheson K, Gunn G, Garden A, Morrison W, Frank S, R osenthal D, Freymann J, Fuller C. (2018) Imaging and Clinical Data Archive for Head and Neck Squamous Cell Carcinoma Patients Treated with Radiotherapy. Scientific Data 5 :180173 (2018) DOI: 10.1038/sdata.2018.173
-
-- Elhalawani, H., Mohamed, A., White, A. et al. Matched computed tomography segmentation and demographic data for oropharyngeal cancer radiomics challenges. Sci Data 4, 170077 (2017). DOI: 10.1038/sdata.2017.77
-TCIA Citation
-
-- Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: 10.1007/s10278-013-9622-7
-
-## Good luck
-
-Thanks for participating, we look forward to you submission! If you have any questions please contact: **Phillip Chlap**: [phillip.chlap@unsw.edu.au](phillip.chlap@unsw.edu.au).
+## 7. (Optional) If an error reports that the address is already in use, kill the running process in the terminal and repeat step 6 (note: the command below kills all of your running Python processes)
+pkill -9 python
diff --git a/challenge_september_2021.py b/challenge_september_2021.py
deleted file mode 100644
index 76d0d78..0000000
--- a/challenge_september_2021.py
+++ /dev/null
@@ -1,39 +0,0 @@
-# Copyright 2020 University of New South Wales, University of Sydney, Ingham Institute
-
-# Licensed under the MIT Licence;
-#
-# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
-# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
-# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
-# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
-# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
-import dash
-from dash import dcc
-from dash import html
-import plotly.express as px
-import pandas as pd
-
-app = dash.Dash(__name__)
-
-hnscc_csv = "data/hnscc.csv"
-
-df = pd.read_csv(hnscc_csv)
-
-fig = px.histogram(df, x="Site", color="Stage")
-
-app.layout = html.Div(children=[
-    html.H1(children='Ingham Medical Physics Coding Challenge Dashboard'),
-
-    html.Div(children='''
-        HNSCC Dataset: Histogram of patients by Site and Stage
-    '''),
-
-    dcc.Graph(
-        id='stage-graph',
-        figure=fig
-    )
-])
-
-if __name__ == '__main__':
-    app.run_server(debug=True)
diff --git a/requirements.txt b/requirements.txt
index 7233fd3..7bc2ea5 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,2 +1,3 @@
pandas
dash
+dash-bootstrap-components
diff --git a/src/__pycache__/styles.cpython-37.pyc b/src/__pycache__/styles.cpython-37.pyc
new file mode 100644
index 0000000..272d762
Binary files /dev/null and b/src/__pycache__/styles.cpython-37.pyc differ
diff --git a/src/__pycache__/styles.cpython-38.pyc b/src/__pycache__/styles.cpython-38.pyc
new file mode 100644
index 0000000..274034f
Binary files /dev/null and b/src/__pycache__/styles.cpython-38.pyc differ
diff --git a/src/challenge_september_2021.py b/src/challenge_september_2021.py
new file mode 100644
index 0000000..3d955c5
--- /dev/null
+++ b/src/challenge_september_2021.py
@@ -0,0 +1,371 @@
+# Copyright 2020 University of New South Wales, University of Sydney, Ingham Institute
+
+# Licensed under the MIT Licence;
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
+# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
+# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+from styles import CARD
+import dash
+import dash_bootstrap_components as dbc
+from dash import dcc
+from dash import html
+import plotly.express as px
+import pandas as pd
+from dash.dependencies import Input, Output, State
+from plotly.subplots import make_subplots
+import plotly.graph_objects as go
+from plotly.offline import plot
+
+
+app = dash.Dash(external_stylesheets=[dbc.themes.BOOTSTRAP])
+
+hnscc_csv = "../data/hnscc.csv"
+
+df = pd.read_csv(hnscc_csv)
+
+header = html.Div([
+    html.Div([
+        html.H1(children='Data Visualisation of Ingham Head and Neck Cancer dataset',
+                style = {'textAlign' : 'center', 'font-family' : 'Arial'}
+        ),
+        html.Br([]),
+    ],
+    style = {'padding-top' : '2%'}
+    )],
+    style = {'height' : '10%',
+             'background-color' : '#6092cd'},
+
+)
+
+def average_follow_up_duration_by_site(dataframe):
+    average_follow_up_grouped_by_site = dataframe.groupby('Site')["Follow up duration (day)"].mean()
+    fig = px.bar(average_follow_up_grouped_by_site, x="Follow up duration (day)", title="Average Follow Up Duration By Site", orientation='h')
+    fig.update_layout(title_x=0.5)
+
+    return dcc.Graph(
+        id='average-follow-up-duration-by-site',
+        figure=fig
+    )
+
+def survival_by_age(dataframe):
+    fig = px.scatter(dataframe, x="Age", y="Survival (months)", title="Survival (months) by Age of Patient", color="Alive or Dead")
+    fig.update_layout(title_x=0.5)
+    return dcc.Graph(
+        id='age-by-cancer-grade',
+        figure=fig)
+
+def survival_by_stage(dataframe):
+    df_stage_sensor = dataframe.groupby(['Stage', 'Overall Survival Censor']).size()
+    df_stage_sensor = df_stage_sensor.to_frame(name='occurance').reset_index()
+    occ = df_stage_sensor.loc[:,'occurance']
+    fig = make_subplots(rows=1, cols=5, specs=[[{"type": "pie"}, {"type": "pie"},{"type": "pie"}, {"type": "pie"}, {"type": "pie"}]])
+
+    fig.add_trace(go.Pie(
+        values=occ[:1],
+        labels=["death"],
+        domain=dict(x=[0, 0.2]),
+        name="Stage I",
+        title="Stage I"),
+        row=1, col=1)
+
+    fig.add_trace(go.Pie(
+        values=occ[1:3],
+        labels=["survival", "death"],
+        domain=dict(x=[0.2, 0.4]),
+        name="Stage II",
+        title="Stage II",
+        marker_colors=['lightskyblue','crimson']),
+        row=1, col=2)
+
+    fig.add_trace(go.Pie(
+        values=occ[3:5],
+        labels=["survival", "death"],
+        domain=dict(x=[0, 0.5]),
+        name="Stage III",
+        title="Stage III",
+        marker_colors=['lightskyblue','crimson']),
+        row=1, col=3)
+
+    fig.add_trace(go.Pie(
+        values=occ[5:7],
+        labels=["survival", "death"],
+        domain=dict(x=[0.5, 1.0]),
+        name="Stage IVA",
+        title="Stage IVA",
+        marker_colors=['lightskyblue','crimson']),
+        row=1, col=4)
+
+    fig.add_trace(go.Pie(
+        values=occ[7:9],
+        labels=["survival", "death"],
+        domain=dict(x=[0.5, 1.0]),
+        name="Stage IVB",
+        title="Stage IVB",
+        marker_colors=['lightskyblue','crimson']),
+        row=1, col=5)
+
+    fig.update_traces(hole=.4, hoverinfo="label+percent+name")
+    fig.update(layout_title_text='Overall survival rate based on the Stage', layout_title_x = 0.49)
+
+    fig.update_layout(legend=dict(
+        orientation="h",
+        yanchor="bottom",
+        y=1.02,
+        xanchor="right",
+        x=1
+    ))
+
+    return dcc.Graph(id='stage-sur-graph', figure=fig)
+
+def age_filter(dataframe):
+    df_sex_sur = dataframe.groupby(['Sex', 'Overall Survival Censor']).size()
+    df_sex_sur = df_sex_sur.to_frame(name='occurance').reset_index()
+
+    df_sex_stage = dataframe.groupby(['Sex', 'Stage']).size()
+    df_sex_stage = df_sex_stage.to_frame(name='occurance').reset_index()
+
+    df_sex_site = dataframe.groupby(['Sex', 'Site']).size()
+    df_sex_site = df_sex_site.to_frame(name='occurance').reset_index()
+
+    df_sex_rec = dataframe.groupby(['Sex', 'Site of recurrence (Distal/Local/ Locoregional)']).size()
+    df_sex_rec = df_sex_rec.to_frame(name='occurance').reset_index()
+
+    gender_by_survival = html.Div(className="row", children=[
+        html.Div(className="col-sm-3", children=[
+            html.P("Gender:",
+                   style={'margin-left':'40px','margin-top':'40px'}),
+            dcc.Dropdown(
+                id='gender',
+                value='Male',
+                options=[{'label': x, 'value': x} for x in df_sex_sur['Sex'].unique()],
+                clearable=False,
+                style={'margin-left':'30px','margin-top':'10px'}
+            ),
+            html.P("Diagnostic information:",
+                   style={'margin-left':'40px','margin-top':'80px'}),
+            dcc.Dropdown(
+                id='survival',
+                value='Overall Survival Censor',
+                options=[{'value': x, 'label': x}
+                         for x in ['Overall Survival Censor', 'Stage', 'Site of diagnosis', 'Site of recurrence']],
+                clearable=False,
+                style={'margin-left':'30px','margin-top':'10px'}
+            )
+        ]),
+        html.Div(className="col-lg", children=[
+            dcc.Graph(id="pie-chart")
+        ])
+    ])
+
+    @app.callback(
+        Output("pie-chart", "figure"),
+        [Input("gender", "value"),
+         Input("survival", "value")])
+
+    def generate_chart(gender, survival):
+        if survival == 'Overall Survival Censor':
+            array = df_sex_sur['occurance'].loc[(df_sex_sur['Sex'] == gender)]
+            fig = px.pie(df_sex_sur, values=array, labels=df_sex_site.index,
+                         names=["survival", "death"])
+        elif survival == 'Stage':
+            array = df_sex_stage['occurance'].loc[(df_sex_stage['Sex'] == gender)]
+            fig = px.pie(df_sex_stage, values=array, labels=df_sex_site.index,
+                         names=["Stage I", "Stage II", "Stage III", "Stage IVA", "Stage IVB"])
+        elif survival == 'Site of diagnosis':
+            array = df_sex_site['occurance'].loc[(df_sex_site['Sex'] == gender)]
+            if gender == 'Female':
+                fig = px.pie(df_sex_site, values=array, labels=df_sex_site.index,
+                             names=["Glottis", "Hypopharynx", "Nasopharynx", "Oral cavity", "Oropharynx"])
+            elif gender == 'Male':
+                fig = px.pie(df_sex_site, values=array, labels=df_sex_site.index,
+                             names=["CUP", "Glottis", "Hypopharynx", "Nasopharynx", "Oral cavity", "Oropharynx", "Sinus"])
+        elif survival == 'Site of recurrence':
+            array = df_sex_rec['occurance'].loc[(df_sex_rec['Sex'] == gender)]
+            if gender == 'Female':
+                fig = px.pie(df_sex_rec, values=array,
+                             names=["Complete response", "Distant metastasis", "Local recurrence", "Regional recurrence", "Residual tumor"])
+            elif gender == 'Male':
+                fig = px.pie(df_sex_rec, values=array, labels=df_sex_rec.index,
+                             names=["Complete response", "Distant metastasis", "Local recurrence", "Local recurrence and distant metastasis",
+                                    "Locoregional and distant metastasis", "Locoregional recurrence", "Regional and distant metastasis",
+                                    "Regional recurrence", "Regional recurrence and distant metastasis","Residual tumor"])
+
+        fig.update_layout(title=f"{survival} by {gender}", legend_traceorder="normal")
+        return fig
+
+    return gender_by_survival
+
+def causation_of_death(dataframe):
+    df_dead = dataframe.loc[dataframe['Alive or Dead'] == "Dead"]
+    df_dead = df_dead[['Cause of Death']]
+    df_dead = df_dead.groupby(['Cause of Death']).size()
+    df_dead = df_dead.to_frame(name='occurance').reset_index()
+    df_dead.at[2,'occurance'] = 9
+    df_dead = df_dead.drop([3], axis=0)
+
+    fig = px.bar(df_dead, x='Cause of Death', y='occurance', height=450)
+    fig.update(layout_title_text="Causation of Death", layout_title_x = 0.49)
+    fig.update_traces(width=0.4)
+    return dcc.Graph(id='death-causation-graph', figure=fig)
+
+
+def age_RT_distribution(dataframe):
+    fig = px.scatter(dataframe, x="Age", y="Total RT treatment time (days)",
+                     title="Age Distribution of Total RT treatment time (days)",
+                     color="Sex",
+                     log_x=True)
+    fig.update_layout(title_x=0.5)
+
+    age_RT = html.Div([
+        dcc.Graph(id='age-RT-distribution', figure=fig),
+        html.P("Age Range Slider:"),
+        dcc.RangeSlider(
+            id='range-slider',
+            min=20, max=95, step=0.1,
+            marks={20: '20', 95: '95'},
+            value=[20, 95]),
+    ])
+
+    @app.callback(
+        Output("age-RT-distribution", "figure"),
+        [Input("range-slider", "value")])
+
+    def update_bar_chart(slider_range):
+        low, high = slider_range
+        mask = (dataframe['Age'] >= low) & (dataframe['Age'] <= high)  # inclusive, so ages at the slider ends are kept
+        fig = px.scatter(
+            dataframe[mask], x="Age", y="Total RT treatment time (days)",
+            color="Sex", size='Total RT treatment time (days)',
+            title="Age Distribution of Total RT treatment time (days)",
+            hover_data=['Age'])
+        fig.update_layout(title_x=0.5)
+        return fig
+
+    return age_RT
+
+
+def Receive_Concurrent_Chemoradiotherapy(dataframe):
+    df_concurrent = dataframe.groupby(['Sex', 'Received Concurrent Chemoradiotherapy?']).size()
+    df_concurrent = df_concurrent.to_frame(name='occurance').reset_index()
+
+    occ = df_concurrent.loc[:,'occurance']
+    fig = make_subplots(rows=1, cols=2, specs=[[{"type": "pie"}, {"type": "pie"}]])
+
+    fig.add_trace(go.Pie(
+        values=occ[:2],
+        labels=["No","YES"],
+        domain=dict(x=[0, 0.2]),
+        name="Female",
+        title="Female",
+        marker_colors=['crimson','lightskyblue']),
+        row=1, col=1)
+
+    fig.add_trace(go.Pie(
+        values=occ[2:4],
+        labels=["No","YES"],
+        domain=dict(x=[0.2, 0.4]),
+        name="Male",
+        title="Male",
+        marker_colors=['crimson','lightskyblue']),
+        row=1, col=2)
+
+    fig.update_traces(hole=.4, hoverinfo="label+percent+name")
+    fig.update(layout_title_text='Received Concurrent Chemoradiotherapy ratio by Gender', layout_title_x = 0.49)
+
+    fig.update_layout(legend=dict(
+        orientation="h",
+        yanchor="bottom",
+        y=1.02,
+        xanchor="right",
+        x=1
+    ))
+
+    return dcc.Graph(id='concurrent-gender', figure=fig)
+
+def BMI_difference(dataframe):
+    dataframe['difference']=dataframe['BMI start treat (kg/m2)']-dataframe['BMI stop treat (kg/m2)']
+    diff_BMI = dataframe[['Age','Sex','difference']]
+
+    fig = px.scatter(dataframe, x="Age", y="difference",
+                     title="The difference in BMI before and after treatment (>0 means a decrease in BMI)",
+                     color="Sex",
+                     log_x=True)
+    fig.update_layout(title_x=0.5)
+
+    bmi_diff = html.Div([
+        dcc.Graph(id='bmi-difference', figure=fig),
+        html.P("Age Range Slider:"),
+        dcc.RangeSlider(
+            id='range-slider1',
+            min=20, max=95, step=0.1,
+            marks={20: '20', 95: '95'},
+            value=[20, 95]),
+    ])
+
+    @app.callback(
+        Output("bmi-difference", "figure"),
+        [Input("range-slider1", "value")])
+
+    def update_bar_chart(slider_range):
+        low, high = slider_range
+        mask = (diff_BMI['Age'] >= low) & (diff_BMI['Age'] <= high)  # inclusive, so ages at the slider ends are kept
+        fig = px.scatter(
+            diff_BMI[mask], x="Age", y="difference",
+            color="Sex",
+            hover_data=['Age'],
+            title="The difference in BMI before and after treatment (>0 means a decrease in BMI)")
+        fig.update_layout(title_x=0.5)
+        return fig
+
+    return bmi_diff
+
+app.layout = html.Div(children=[
+    header,
+    html.Div(className="container", children=[
+        html.Div(className="row", children=[
+            html.Div([
+                survival_by_stage(df)
+            ], className="col-lg", style=CARD)
+        ]),
+        html.Div(className="row", children=[
+            html.Div([
+                survival_by_age(df)
+            ], className="col-lg", style=CARD)
+        ]),
+        html.Div(className="row", children=[
+            html.Div([
+                average_follow_up_duration_by_site(df)
+            ], className="col-sm", style=CARD),
+            html.Div([
+                causation_of_death(df)
+            ], className="col-sm", style=CARD)
+        ]),
+        html.Div(className="row", children=[
+            html.Div([
+                age_filter(df)
+            ], className="col-sm", style=CARD)
+        ]),
+        html.Div(className="row", children=[
+            html.Div([
+                age_RT_distribution(df)
+            ], className="col-lg", style=CARD)
+        ]),
+        html.Div(className="row", children=[
+            html.Div([
+                Receive_Concurrent_Chemoradiotherapy(df)
+            ], className="col-lg", style=CARD)
+        ]),
+        html.Div(className="row", children=[
+            html.Div([
+                BMI_difference(df)
+            ], className="col-lg", style=CARD)
+        ])
+    ])
+])
+
+if __name__ == '__main__':
+    app.run_server(debug=True)
diff --git a/src/styles.py b/src/styles.py
new file mode 100644
index 0000000..baa28b1
--- /dev/null
+++ b/src/styles.py
@@ -0,0 +1,7 @@
+#!/usr/bin/python3
+
+CARD = {
+    'padding': '1% 1% 1% 1%',
+    'box-shadow': '0 4px 8px 4px rgba(0,0,0,0.2)',
+    'margin': '1.5% 1.5% 1.5% 1.5%'
+}
\ No newline at end of file