Conversation
```python
"""Calculate the distance between two rows."""
dist = 0.0
for i in range(len(row1) - 1):
    dist += (row1[i] - row2[i]) ** 2
```
You're missing out on the power of NumPy (or pandas) here to broadcast mathematical operations. If `row1` and `row2` are numpy arrays, then you could just have

```python
return sqrt(np.sum((row1 - row2) ** 2))
```
It's written this way to account for the difference in length between the rows: test data is submitted without a "classification" column, while the present data has one.
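Both points can be combined: broadcast the subtraction with NumPy while slicing the labeled row down to the test row's length. A minimal sketch (the function name `distance` and the argument layout, with the label as the last entry of `row1`, are assumptions for illustration):

```python
import numpy as np


def distance(row1, row2):
    """Euclidean distance between a labeled row and an unlabeled test row.

    Hypothetical sketch: row1's last entry is the "classification" label,
    so only its first len(row2) entries are compared against row2.
    """
    a = np.asarray(list(row1)[: len(row2)], dtype=float)
    b = np.asarray(list(row2), dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))
```

Slicing before the `asarray` call matters here: the label may be a string, which would make a whole-row conversion to float fail.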
src/knn.py

```python
def predict(self, test_data, tk=None):
    """Given data, categorize the data by its k nearest neighbors."""
    if tk is None:
```
src/knn.py

```python
for row in self.data.iterrows():
    distances.append((row[1][-1], self._distance(row[1], test_data)))
distances.sort(key=lambda x: x[1])
# import pdb; pdb.set_trace()
```
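Echoing the NumPy comment above, the `iterrows()` loop can be vectorized. A sketch under the assumption that `self.data` is a DataFrame whose last column is the class label (the helper name `nearest_distances` is made up for illustration):

```python
import numpy as np
import pandas as pd


def nearest_distances(data, test_row):
    """Hypothetical vectorized version of the iterrows() loop:
    return (label, distance) pairs sorted by ascending distance.

    Assumes `data` is a DataFrame whose last column holds the class
    label and `test_row` holds only feature values.
    """
    features = data.iloc[:, :-1].to_numpy(dtype=float)
    labels = data.iloc[:, -1].to_numpy()
    # Broadcast the test row against every training row at once.
    dists = np.sqrt(((features - np.asarray(test_row, dtype=float)) ** 2).sum(axis=1))
    pairs = list(zip(labels, dists))
    pairs.sort(key=lambda x: x[1])
    return pairs
```

This also sidesteps the leftover `pdb` breakpoint, which should be removed before merging in any case.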
src/knn.py

```python
if my_class:
    return my_class
else:
    self.predict(test_data, tk - 1)
```
Confused as to why this has to be recursive
It's written for the case in which the classification is a "tie" between two classes. In that case, the classify function returns None, so predict is run again with a decreased k value. This is based on my interpretation of the algorithm in the class notes, which may of course be wrong.
https://codefellows.github.io/sea-python-401d5/lectures/k_nearest_neighbors.html?highlight=nearest
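The tie-breaking scheme described above doesn't have to be recursive: since a tie only shrinks k and votes again, a loop expresses the same idea. A standalone sketch (the free function `predict` over (label, distance) pairs is an assumption; the real method works on `self.data`):

```python
from collections import Counter


def predict(distances, k):
    """Majority vote among the k nearest neighbors, retrying on ties.

    Hypothetical sketch: `distances` is a list of (label, distance)
    pairs. On a tie between the top classes the vote is inconclusive,
    so k is decreased and the vote rerun; k == 1 can never tie.
    """
    distances = sorted(distances, key=lambda x: x[1])
    while k >= 1:
        counts = Counter(label for label, _ in distances[:k]).most_common()
        if len(counts) == 1 or counts[0][1] > counts[1][1]:
            return counts[0][0]
        k -= 1  # tie between top classes: shrink the neighborhood
```

If the recursive form is kept, note that the recursive call's result must be propagated with `return self.predict(test_data, tk - 1)`; as quoted in the diff, the tie branch falls through and the caller receives None.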